Archive for July, 2011

Improving the DATA act – open govt in the USA

July 8th, 2011

Beth Noveck, NYU law professor and previously Deputy Chief Technology Officer of the US, and I have written a blog posting that discusses the DATA Act, a proposed US law to improve the transparency of US spending. I realized this may also be of interest to folks working on Linked Data, and especially linked open government data, so I thought I’d mention it here as well.


Introducing Sterno, another RDF syntax (Really?)

July 3rd, 2011

In this blog post, I want to introduce an RDF syntax called Sterno. (Oh no… not another one… right? Please, read on.) Sterno is an extension of the N-triples syntax and a subset of the Turtle syntax aimed at improving compression over N-triples while also preserving the simplicity of N-triples. But what could possibly warrant defining yet another RDF syntax?

After winning the 2009 Billion Triple Challenge, Greg and I realized that a fair amount of time in our system was spent transferring data from disk. At that time, our system read N-triples documents because their simple syntax was amenable to parallel I/O, but N-triples documents are often very verbose. Turtle, however, introduces many features that improve compression, and N-triples is a syntactic subset of Turtle. So the question arose: which features of Turtle should we use to extend the N-triples syntax in order to improve parallel I/O? The details of our investigation into the matter can be found in our paper entitled “Reducing I/O Load in Parallel RDF Systems via Data Compression,” published at the 1st Workshop on High-Performance Computing for the Semantic Web (HPCSW2011). (We also compare the use of Sterno “compression” of RDF data with LZO compression for parallel I/O. The HPCSW proceedings can be found here for those who are interested.)
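
To make the parallel I/O point a little more concrete, here is a minimal Python sketch of one common way to read a line-oriented file in parallel: cut the file into byte ranges and let each worker take exactly the lines that begin inside its range. This is an illustration under stated assumptions, not necessarily how the system described above did it; `chunk_bounds` and `read_chunk_lines` are hypothetical names invented for this example.

```python
import os

def chunk_bounds(path, workers):
    """Cut the file into `workers` contiguous byte ranges (hypothetical helper)."""
    size = os.path.getsize(path)
    return [(size * i // workers, size * (i + 1) // workers) for i in range(workers)]

def read_chunk_lines(path, start, end):
    """Yield every line whose first byte lies in the range [start, end)."""
    with open(path, "rb") as f:
        if start > 0:
            # Back up one byte and discard through the next newline, so that
            # reading resumes at the first line starting at or after `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break  # end of file
            yield line.decode("utf-8").rstrip("\n")
```

Because every line holds a complete triple, no worker needs any context from its neighbours. With a syntax that has prefix declarations, each worker would presumably also need to read the declarations at the top of the file before processing its own range.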

Admittedly, an RDF syntax designed for parallel I/O would seem to have a limited audience, but it turns out that Sterno may be of more general use. Sterno’s simplicity may be desirable for a multitude of purposes simply because it is easier to support than Turtle (that is, easier to produce and parse), particularly for use on the command line. Note that Sterno is not meant to replace or compete with any other RDF syntax; instead, it simply gives a name and definition to a useful middle ground between N-triples and Turtle.

Sterno is normatively described as an extension of the N-triples syntax. In other words, the Sterno syntax subsumes the N-triples syntax, and the Sterno syntax is defined as the N-triples syntax with the addition of the following Turtle features:

  • UTF-8 Encoding: A Sterno document is a Unicode character string encoded in UTF-8.
  • Prefix declarations and QNames: A Sterno document allows for prefix declarations and QNames, but all prefix declarations must occur at the beginning of the document before any actual triples.
  • Implicit datatypes for xsd:integer, xsd:double, xsd:decimal, and xsd:boolean. For example, "1"^^xsd:integer may simply appear in the document as 1.
  • The a keyword may be used to replace rdf:type whenever it occurs in the predicate position of a triple.
  • The empty collection () may be used to replace rdf:nil whenever it occurs in the subject or object position of a triple.
  • An anonymous blank node [] may be used, although its usefulness is severely limited in Sterno.
  • Blank node labels may be as complex as in Turtle. That is, we do not maintain the restriction in N-triples that blank node labels be only word characters. (E.g., _:blank-node is valid in Turtle and Sterno, but not in N-triples.)
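
The features above are mechanical enough that rewriting a single N-triples triple into Sterno can be sketched in a few lines. The Python below is a rough illustration, not a reference implementation. The names `to_sterno`, `shorten_iri`, and `shorten_literal` are hypothetical, and the terms are assumed to be already split apart. Only xsd:integer and xsd:boolean are abbreviated, and QName local parts are restricted to plain word characters to stay on the safe side.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"
# Conservative assumption: only abbreviate local names made of word characters.
SIMPLE_LOCAL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Lexical forms that may appear bare (decimal and double are left out here).
BARE_LITERALS = {
    "integer": re.compile(r"[+-]?[0-9]+"),
    "boolean": re.compile(r"true|false"),
}

def shorten_iri(term, prefixes):
    """Rewrite <iri> as prefix:local when a declared namespace matches."""
    if not (term.startswith("<") and term.endswith(">")):
        return term
    iri = term[1:-1]
    for prefix, namespace in prefixes.items():
        local = iri[len(namespace):]
        if iri.startswith(namespace) and SIMPLE_LOCAL.fullmatch(local):
            return f"{prefix}:{local}"
    return term

def shorten_literal(term):
    """Drop the quotes and datatype when the implicit form is equivalent."""
    m = re.fullmatch(r'"([^"\\]*)"\^\^<(.*)>', term)
    if m and m.group(2).startswith(XSD):
        pattern = BARE_LITERALS.get(m.group(2)[len(XSD):])
        if pattern and pattern.fullmatch(m.group(1)):
            return m.group(1)
    return term

def to_sterno(subject, predicate, obj, prefixes):
    """Return one Sterno line for an already-tokenized N-triples triple."""
    if predicate == f"<{RDF}type>":
        predicate = "a"
    else:
        predicate = shorten_iri(predicate, prefixes)
    terms = []
    for term in (subject, obj):
        if term == f"<{RDF}nil>":
            terms.append("()")
        else:
            terms.append(shorten_iri(shorten_literal(term), prefixes))
    return f"{terms[0]} {predicate} {terms[1]} ."
```

The matching `@prefix` lines would be written once at the top of the output, before any triples, as the second bullet requires.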

An actual grammar for the Sterno syntax can be found in the extended version of our HPCSW paper. All this may be a bit too much to keep in one's head, so here is a contrived example in N-triples, Sterno, and Turtle. (For a more realistic example, see my FOAF profile in N-triples, Sterno, and Turtle.)

N-triples:

<file:///foaf.rdf#me> <> <> .
<file:///foaf.rdf#me> <> "Andr\u00E9" .
<file:///foaf.rdf#me> <> "40"^^<> .
_:list <> <> .
_:list <> "line1\n\tline2 \"quoted string\" " .
_:list <> <> .
_:contrived <> <> .
# What a contrived triple.

Sterno:

@prefix mine: <file:///foaf.rdf#> .
@prefix rdf: <> .
@prefix foaf: <> .
mine:me a foaf:Person .
mine:me foaf:nick "André" .
mine:me foaf:age 40 .
_:list a rdf:List .
_:list rdf:first "line1\n\tline2 \"quoted string\" " .
_:list rdf:rest () .
[] a <> .
# What a contrived triple.

Turtle (with base URI <file:///foaf.rdf>):

@prefix foaf: <> .
<#me> a foaf:Person ; foaf:nick "André" ; foaf:age 40 .
@prefix rdf: <> .
( """line1
line2 "quoted string" """ ) a rdf:List .
[]a<> . # What a contrived triple.

Put roughly, the Sterno syntax maintains the simplicity of N-triples in that each line contains at most one triple and the RDF terms of a triple must be separated by whitespace. Therefore, although it is not as concise as Turtle (e.g., property lists and object lists are not adopted), it is easier to parse and generate.
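
As a rough illustration of how little machinery that requires, here is a hedged Python sketch of a line-at-a-time reader. It is not the grammar from the paper. The regular expressions are simplified approximations, `parse_sterno` is a made-up name, and comments are only recognized when they occupy a whole line.

```python
import re

# Simplified approximation of a single RDF term as it may appear in Sterno.
TERM = r'''(?:
    "(?:[^"\\]|\\.)*"(?:\^\^\S+|@[A-Za-z][A-Za-z0-9-]*)?
  | <[^>\s]*>
  | \(\) | \[\]
  | [^\s"<()\[\]]\S*
)'''
TRIPLE = re.compile(rf'\s*({TERM})\s+({TERM})\s+({TERM})\s*\.\s*$', re.VERBOSE)
PREFIX = re.compile(r'@prefix\s+(\S*):\s+<([^>]*)>\s*\.\s*$')

def parse_sterno(text):
    """Return (prefixes, triples) for a document with one statement per line."""
    prefixes, triples = {}, []
    for number, line in enumerate(text.split("\n"), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or whole-line comment
        m = PREFIX.match(stripped)
        if m:
            if triples:
                # Prefix declarations must precede all triples.
                raise ValueError(f"line {number}: prefix declaration after a triple")
            prefixes[m.group(1)] = m.group(2)
            continue
        m = TRIPLE.match(stripped)
        if m is None:
            raise ValueError(f"line {number}: expected exactly one triple")
        triples.append(m.groups())
    return prefixes, triples
```

There is no state carried from one triple line to the next apart from the prefix table, which is exactly what leaving out property lists and object lists buys.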

Feedback welcome, even encouraged.

(Why the name “Sterno”? The name “Sterno” originated as an abbreviation for Sternotherus, a genus of aquatic turtles, the most common species of which typically grows to only 7.5-14 centimeters. The name is chosen to reflect that the Sterno syntax is a small, syntactic subset of the Turtle syntax. Additionally, it is an acronym meaning “Simple, TErse Rdf… NOthing else.”)
