Two Misconceptions about the Semantic Web

November 18th, 2011

I recently presented at the Semantic Graph Database Processing BOF at SC2011, where I had the opportunity to discuss with others the need for high-performance computing in web-scale computation and the benefits of Linked Data and ontologies on the World Wide Web. One participant was adamantly opposed to the semantic web. (Outside of the presentation, he said something like “I do not believe in the semantic web” and “only the semantic web cares about the semantic web.”) As I tried to make my case with him, it became increasingly clear to me that he had a few misconceptions about the semantic web. I want to address those misconceptions here.

Before I continue, though, allow me to disclaim a bit. I am not a representative of the entire semantic web community, although I do consider myself a member of it. Additionally, I am not officially associated with the W3C. I write this blog entry simply in the capacity of a semantic web enthusiast (henceforth, semwebber), and not even as a member of the Tetherless World Constellation. I invite, nay, urge other semwebbers to contribute comments to this blog post in any capacity (agree, disagree, amend, etc.).

1. “One ontology to rule them all”

To my knowledge, nobody has ever claimed that there should be “one ontology to rule them all.” Instead, what is regularly promoted is ontology reuse and/or integration. For example, the FOAF ontology is widely used in the semantic web to describe persons; why create your own ontology when you can reuse a well-established one? Integration of ontologies allows for conciliation of perspectives, causing data that use these ontologies to become meaningfully related. Admittedly, there are some rather large, comprehensive ontologies out there, and there are some very popular and pervasive ones, too. However, there is no standard or recommendation that requires publishers of RDF data to comply with any particular ontology. You could even ignore the RDF vocabulary if you so please (yes, even rdf:type).
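
A minimal sketch of what reuse looks like in practice, assuming triples are modeled as plain Python tuples rather than with an RDF library. The FOAF and RDF namespace URIs are the real ones; the `example.org` and `example.net` URIs, the `shoeSize` property, and the `names` helper are made up for illustration. Publisher B also skips `rdf:type` entirely, which is allowed.

```python
# Sketch only: triples as (subject, predicate, object) tuples.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical publisher A: reuses FOAF for both the class and the name.
site_a = {
    ("http://example.org/a/alice", RDF_TYPE, FOAF + "Person"),
    ("http://example.org/a/alice", FOAF + "name", "Alice"),
}

# Hypothetical publisher B: reuses foaf:name, adds its own property,
# and never uses rdf:type. No standard forces it to.
site_b = {
    ("http://example.net/b/bob", FOAF + "name", "Bob"),
    ("http://example.net/b/bob", "http://example.net/b/vocab#shoeSize", "44"),
}

def names(triples):
    """Return every foaf:name value, sorted, regardless of who published it."""
    return sorted(o for (_, p, o) in triples if p == FOAF + "name")

# Because both publishers chose the same property URI, one query covers both.
merged = site_a | site_b
print(names(merged))  # ['Alice', 'Bob']
```

Neither publisher had to adopt the other's whole ontology; sharing the one term they agree on is enough for the data to line up.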

The primary purpose of an ontology (in my view) is to attach explicit semantics to your data. Just as the participant had stated (although he meant it in contrast to the semantic web), there are many ontologies. They compete in the ecosystem of the World Wide Web and evolve accordingly (or become extinct).

2. “Triples all the way down”

(First, let me say, this is not an affront to Planet RDF.)

This is a bit of a pet peeve of mine, and perhaps what I say here will offend some semwebbers (I hope not). The semantic web (in my view) is not about “triples all the way down.” What do I mean by that? Let me explain.

RDF brings primarily two things to the table when it comes to publishing and integrating data on the web: names in the form of URIs, and a simple data model that is flexible enough for (arguably) nearly any kind of data. (I would like to add a third, meaningful links, but I will avoid that for now.) So when data is published to the web, publishing it as RDF allows you: (1) to identify the things in your data across the World Wide Web, and (2) to structurally (and possibly semantically) integrate your data with other data on the World Wide Web. (I emphasize “World Wide” here to bring to attention the vast scope of publication, identification, and integration that is being achieved.) Fantastic.
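
To make those two points concrete, here is a small sketch in Python, again treating triples as plain tuples. The `rdfs:label` URI is the real one; every `example.org` URI, the `isFurry` and `memberOf` properties, and the `describe` helper are assumptions invented for this illustration.

```python
# Sketch only: URIs as global names, and set union as structural integration.
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical URIs: two different things that share the label "Beast".
DOG = "http://example.org/pets/Beast"
XMAN = "http://example.org/comics/Beast"
IS_FURRY = "http://example.org/vocab#isFurry"
MEMBER_OF = "http://example.org/vocab#memberOf"

pet_site = {
    (DOG, LABEL, "Beast"),
    (DOG, IS_FURRY, "true"),
}
comic_site = {
    (XMAN, LABEL, "Beast"),
    (XMAN, IS_FURRY, "true"),
    (XMAN, MEMBER_OF, "http://example.org/comics/X-Men"),
}

# (2) Integration: the data model is a set of triples, so merging is a union.
web = pet_site | comic_site

def describe(subject, triples):
    """Return the (predicate, object) pairs known about one subject."""
    return {(p, o) for (s, p, o) in triples if s == subject}

# (1) Identification: the label is ambiguous, the URIs are not.
subjects_named_beast = sorted(
    s for (s, p, o) in web if p == LABEL and o == "Beast"
)
print(len(subjects_named_beast))      # 2
print(len(describe(XMAN, web)))       # 3
```

After the merge, the two Beasts stay distinct because their names are URIs, not strings.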

Does this mean that everything can be efficiently (or rather, ideally) represented in RDF? No. Then why would you ever want to handle triples? You probably don’t. Let me explain.

RDF is meant to solve the problem of meaningfully publishing data (not just documents) on the World Wide Web. Beyond that, do what you want. More specifically, when you crawl and/or aggregate data from the World Wide Web, you don’t have to keep the RDF data as triples in your system. It is no longer on the global stage of the World Wide Web; rather, it is now in your system where you are king. So optimize away! Store it or process it however you like! Relational databases? Sure! Rewrite URIs as shorter terms? Whatever floats your boat! Ignore the explicit semantics and treat it like an unlabeled graph? I wouldn’t recommend it, but you’re the king! Do whatever it takes to meet your use case, and if your use case has something to do with RDF data, then fine, leave it as triples if you want. My point is, it’s not necessarily “RDF all the way down,” but it is “RDF at the top” where “top” is the place of publication, the World Wide Web. The universal naming mechanism of URIs and the generic data model enables data publishers to get data out there in a way that can be explicitly understood by machines (for example, when I say “Beast is furry,” am I talking about Mark Zuckerberg’s dog or the fictional X-Man Dr. Henry Philip “Hank” McCoy?), but as the creator of that machine, it’s up to you how to utilize those explicit semantics.

[Images: Beast, Mark Zuckerberg's dog; Beast, the fictional X-Man.] (They both look furry to me.)
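
As one illustration of "optimize away," the sketch below takes crawled triples, rewrites each URI as a short integer ID, and stores the result in a relational database using Python's standard `sqlite3` module. The table names (`term`, `fact`), the helper functions, and the `example.org` URIs are all assumptions for the sake of the example, and literals and URIs share one table purely for brevity.

```python
import sqlite3

# Hypothetical crawled triples (example.org URIs are made up).
triples = [
    ("http://example.org/pets/Beast", "http://example.org/vocab#isFurry", "true"),
    ("http://example.org/comics/Beast", "http://example.org/vocab#isFurry", "true"),
]

def load(triples):
    """Store triples relationally, with each term replaced by an integer ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE term (id INTEGER PRIMARY KEY, value TEXT UNIQUE)")
    conn.execute("CREATE TABLE fact (s INTEGER, p INTEGER, o INTEGER)")

    def term_id(value):
        # Insert the term once; later occurrences reuse the same ID.
        conn.execute("INSERT OR IGNORE INTO term (value) VALUES (?)", (value,))
        row = conn.execute("SELECT id FROM term WHERE value = ?", (value,)).fetchone()
        return row[0]

    for s, p, o in triples:
        conn.execute(
            "INSERT INTO fact VALUES (?, ?, ?)",
            (term_id(s), term_id(p), term_id(o)),
        )
    return conn

def furry_things(conn):
    """Map IDs back to URIs only at the edge, when answering a query."""
    rows = conn.execute(
        """
        SELECT subj.value
        FROM fact
        JOIN term AS subj ON subj.id = fact.s
        JOIN term AS pred ON pred.id = fact.p
        JOIN term AS obj  ON obj.id  = fact.o
        WHERE pred.value = ? AND obj.value = ?
        ORDER BY subj.value
        """,
        ("http://example.org/vocab#isFurry", "true"),
    )
    return [row[0] for row in rows]

conn = load(triples)
print(furry_things(conn))
```

Internally nothing here looks like RDF, yet the URIs survive in the `term` table, so the system can still say exactly which Beast it means.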

To be clear, though, I am promoting RDF as a way to publish structured, semantic data, as opposed to not publishing structured, semantic data at all. Other good ways to publish such data may well emerge in the future, but RDF exists today and is widely used.

So I will leave it at that. Again, I invite comments, rebuttals, accolades, disparagements, etc.

Jesse Weaver

  1. November 19th, 2011 at 09:10 | #1

    You needed an accolade, I am happy to give it to you:-)

    Not much to add. The first misconception, ‘one ontology to rule them all’, is very widespread. We did something seriously wrong many years ago to give this impression, and we will pay for this mistake until the end of our existence… :-( A related misconception is that you need complex, large, and mathematically very precise ontologies (i.e., OWL, possibly DL) to make any progress in this area. Wrong…

    I must admit I was more surprised by the second item. Not that I’d disagree with what you write but, rather, because I did not hit this question that often; it seems so obvious that applications can do what they want behind the scenes… Anyway…

    Thanks Jesse

    Ivan

  2. November 19th, 2011 at 12:56 | #2

    @Ivan Herman
    Yes, the second point is in fact obvious, but for some reason, I sometimes encounter those who think that the semweb community is trying to force everything into triples. I just wanted to make clear in this entry that the triples are for publication, not necessarily for the inner workings of a particular system.

  3. Bill Anderson
    November 20th, 2011 at 16:44 | #3

    Jesse, a quote I heard attributed to Oscar Wilde is “Anything worth saying is worth saying again, because no one is listening”. And the thing with misconceptions is that they have a way of never dying, so it is useful to repeatedly dispel misconceptions. Keep it up.

    In my own learning about the semantic web, and attempting to put it to use, I find the many, diverse ontologies a bit overwhelming. For example, a current problem I have is looking for an authoritative vocabulary or ontology about metrics and metrology to use for documentation purposes. So while it is easy to use a common dictionary to find meanings, it is not so easy (for me) to find ontologies to use. Instead of one ontology, are there a few ontologies to find a large number of common terms defined and identified with persistent URIs? I am thinking of something analogous to the top half-dozen English dictionaries. For me then, the problem is not one, but perhaps too many. What tools are available and what skills do I need to be successful?

    I like the “triples at the top” viewpoint. But underneath, in the semantics, what, exactly, is down there? “Turtles”?

    Bill

  4. November 21st, 2011 at 04:37 | #4

    On the second point (triples all the way down), any language (e.g., RDF) is an exchange format shared in the public space (e.g., the Web). To exchange information in English, you don’t need to know how English is stored in speakers’ brains.
    https://plus.google.com/114406186864069390644/posts/FyKg7FDP4ZM

  5. November 22nd, 2011 at 06:22 | #5

    Bill Anderson :
    In my own learning about the semantic web, and attempting to put it to use, I find the many, diverse ontologies a bit overwhelming. For example, a current problem I have is looking for an authoritative vocabulary or ontology about metrics and metrology to use for documentation purposes. So while it is easy to use a common dictionary to find meanings, it is not so easy (for me) to find ontologies to use. Instead of one ontology, are there a few ontologies to find a large number of common terms defined and identified with persistent URIs? I am thinking of something analogous to the top half-dozen English dictionaries. For me then, the problem is not one, but perhaps too many. What tools are available and what skills do I need to be successful?

    The “what ontology should I use” problem has been around for a long time and I don’t think it has been solved yet. There are a number of places to look, though. E.g., you could try:

    http://schemapedia.com
    http://sindice.com
    http://schemacache.com

    And yes, I think it’s definitely turtles all the way down! ;)

    Knud

  6. November 22nd, 2011 at 10:05 | #6

    I think your comment re. “not always being about triples” is really more like “not always about RDF”. The biggest problem with RDF is that positioning it as a Data Model and Syntax is awkward and problematic. The model is an EAV (Entity-Attribute-Value) based directed graph that incorporates hyperlinks in the Entity, Attribute, and Value slots (optionally). RDF is a syntax for expressing relations in directed graph form.

    Construction, Access, Integration, and Management of structured data at InterWeb scales is really what’s important. It trumps monikers such as Linked Data or The Semantic Web.

    For the Semantic Web and Linked Data communities, the bigger problem is the flawed political desire to conflate RDF and Linked Data. The fact of the matter is simply this (based on good old computer science history): you can use RDF syntax to construct Linked Data Objects published on a network. This works not because of RDF but rather because of the principles of Linked Data, which mandate specific de-reference behavior on the part of hyperlinks en route to delivering indirect access (via name indirection) to fine-grained data objects (resources) in a manner that honors the semantic fidelity of equivalence by name or value.

    The biggest problem is RDF overreach in Linked Data and Semantic Web narratives. RDF syntax wars are an unnecessary distraction. What’s so wrong with RDF being an option? A useful one at that? Why does it have to be all or nothing?

  7. November 22nd, 2011 at 12:56 | #7

    Kingsley Idehen :
    RDF is a syntax for expressing relations in directed graph form.

    I think RDF is more than just syntax. It also carries with it some very basic semantics on how RDF data should be interpreted.

    Kingsley Idehen :
    For the Semantic Web and Linked Data communities, the bigger problem is the flawed political desire to conflate RDF and Linked Data.

    I certainly do not conflate RDF and Linked Data. Linked Data can take many forms as long as they adhere to the Linked Data principles.

    Kingsley Idehen :
    The biggest problem is RDF overreach in Linked Data and Semantic Web narratives. RDF syntax wars are an unnecessary distraction. What’s so wrong with RDF being an option? A useful one at that? Why does it have to be all or nothing?

    I agree that RDF is an option, and a useful one. As I stated in the blog entry, it is conceivable that there will be other good options for publishing structured, semantic data in the future. In my opinion, though, RDF (in any of its syntaxes) is the best existing option for publishing Linked Data.

  8. November 23rd, 2011 at 04:04 | #8

    Great post, Jesse. I have recently been blogging about similar ideas on my own blog – which is intentionally in Chinese for a smaller audience (at least for now). I have come to believe that another big misconception about the semantic web is “formatism”: if you use RDF/OWL, then you are building a semantic web app, and vice versa. I cannot agree more with your “RDF at the top” thesis – and what matters most is typically not the top, but the things inside the app.
