Archive for May, 2008


May 28th, 2008

By a series of interesting coincidences in life, I have recently found myself in contact with Andrew Hugill, who is, among many other things, the Director of the Institute of Creative Technologies at De Montfort University in Leicester, UK. Andrew sent me a copy of a CD of his music called “Pataphysical Piano,” which I have truly enjoyed and recommend to those interested in new directions in music. That, however, is not the intent of this blog post (although I’m sure he won’t mind a few extra sales).

Rather, I was curious about the term “pataphysics” and was pleased to see a Wikipedia entry on the subject show up on the first page of the 45,000 or so Google results. The original definition was “the science of imaginary solutions, which symbolically attributes the properties of objects, described by their virtuality, to their lineaments,” which didn’t shed much light. However, it was later said to rest on “the truth of contradictions and exceptions.” This latter, for those of you who know me, is way too good to believe, as I believe a crucial aspect of the Semantic Web is that we will have to learn to live with the truth of contradictions and exceptions, and that this is the main argument I’ve been having with the forces of neatness, many of whom have clustered in the OWL 2 WG.

The philosopher, playwright and general polymath Alfred Jarry, who coined the term pataphysics, stated that it was “as far from metaphysics as metaphysics extends from regular reality.” I am happy to report that Googling for the term “patadata” I find only 10 hits, none of which uses it to mean “data interpreted through the truth of contradictions and exceptions,” which is “as far from metadata as metadata extends from a databased representation of reality.” So consider this term now to be coined with exactly that meaning, and I happily join the ranks of previous pataphysicists as I continue my study of the functions and properties of “patadata markup,” a long paper on which I will publish as soon as I work out a few more details.

Yours patalogically,
Jim Hendler

p.s. One of the interesting aspects of pataphysics throughout the past century has been a mix of seriousness and parody, often indistinguishably entwined. I hope to continue that tradition with this blog post and my future writings on the subject.

Author: Categories: Uncategorized Tags:

Fellowship of the (Semantic) Web: The Two Towers

May 25th, 2008

By popular request (okay, a couple of people asked for it), I have put my talk from Semantic Technologies 2008 online. Warning: it’s a PDF of about 22 MB (lots of gratuitous images to keep things fun).


Jim H.

Author: Categories: AI, Semantic Web, tetherless world, Web Science Tags:

Earthquake, Google, and more

May 22nd, 2008

Below is an email I sent to the group today.


Dear TW friends

As you have probably heard, a huge earthquake struck Sichuan, China,
on May 12. It has caused an enormous loss of life: more than 50,000
confirmed deaths, around 30,000 missing, and about 300,000 people
injured as of today. The whole nation, as well as Chinese people all
over the world, including me, is in deep sorrow over the tragedy.

In memory of the earthquake victims, at 14:28 (Beijing Time) on May
19, exactly one week after the earthquake, the Chinese public held a
moment of silence. People stood silent for three minutes while air
defense, police and fire sirens, and the horns of vehicles, vessels
and trains sounded.

Google China released a traffic curve for the three minutes [1]. At
its lowest point, traffic dropped to 10% of the normal level. At that
time, millions of people stopped working at their computers, stood up
and lowered their heads in observance. The curve clearly conveys a
message of national unity of the Chinese people in a time of calamity.
I’m proud to be a part of this people.

The Web has played an important role in the earthquake relief this
time. Messages and information are exchanged on the Web much faster
than through traditional channels, which helps the rescue work. For
example, when a girl heard that army helicopters couldn’t find a
landing site around her home town, she immediately posted a good
location on the internet, and it was replicated thousands of times
across many sites in just a few hours, until it reached the army
command. As another example, when all communication channels to the
outside world were cut off, the first message from the isolated area
came from the website [2] of the local government, which was revived
with backup power and a backup link; based on reports from the
website, it was decided to use airdrops instead of land rescue for
some areas, since otherwise help would have arrived too late.

This can still be improved. With the Semantic Web, such information
could be propagated by software agents in just seconds, instead of by
human forwarding, to the handheld device of a helicopter pilot. In
earthquake relief, every second saved in knowledge aggregation and
propagation means more hope for lives. I hope this dream of a
tetherless world can come true as early as possible.

Thank you for reading this.



Author: Categories: Uncategorized Tags:

Research challenges from TWINE

May 21st, 2008

An interview (source) by John Breslin revealed some interesting technology features behind Twine: privacy, data integration, and data storage. I have mixed feelings about the fact that no existing triple/quad store is used and that TWINE has developed its own. How do current semantic web technologies fit enterprise-level, small-group-level, and person-level applications, and which triple store solution is ready to support such applications? The eight-element tuple is designed for efficiency, but will it become a common model for other social semantic web sites? As for privacy, do semantic web technologies bring any new benefits or new challenges, or are we still using the (user, group) access control mechanisms widely used in Web 2.0? Finally, data integration would be a very interesting challenge: do we have reasonably good automatic entity disambiguation tools; how can “collective intelligence” complement the automated tools; and how can the integration results be presented to end users without causing too much surprise? In general, the deployment of TWINE is promising, and it will produce more interesting and practical challenges for the research community.

Initially Radar had their own triple store, an LGPL one from the CALO project. They found that it didn’t scale towards web-scale applications, and it didn’t have the levels of transaction control you’d need from an enterprise application. They decided to go for a SQL database (PostgreSQL) with WebDAV. However, relational databases weren’t optimised for the “shape” of data that they were putting into it, so it needed to be tweaked. They’ve had no performance issues so far, but they may move to a federated model next year.

… Twine uses an eight-element tuple store (subject-predicate-object, provenance, time stamp, confidence value, and other statistics about the triple or item itself). They can do predicate inferencing across statements, access control, etc. …
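
To make the tuple idea concrete, here is a minimal Python sketch of a statement that carries its own metadata. It is an illustration only, not Twine's actual schema: the interview names six of the eight elements, so the remaining "other statistics" are kept in an open-ended `stats` dictionary, and the class name, field names, example.org URIs, and the `trusted_statements` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Set


@dataclass
class Statement:
    # The triple itself.
    subject: str
    predicate: str
    obj: str
    # Per-statement metadata named in the interview.
    provenance: str
    timestamp: datetime
    confidence: float
    # Placeholder for the "other statistics"; the interview does not list them.
    stats: Dict[str, Any] = field(default_factory=dict)


def trusted_statements(statements: Iterable[Statement],
                       trusted_sources: Set[str],
                       min_confidence: float) -> List[Statement]:
    """Keep statements from trusted sources that meet a confidence threshold."""
    return [
        s for s in statements
        if s.provenance in trusted_sources and s.confidence >= min_confidence
    ]


now = datetime.now(timezone.utc)
store = [
    Statement("http://example.org/alice", "http://example.org/bookmarked",
              "http://example.org/page", "http://example.org/import/feed", now, 0.9),
    Statement("http://example.org/alice", "http://example.org/interestedIn",
              "http://example.org/topic/rdf", "http://example.org/inferred", now, 0.4),
]
kept = trusted_statements(store, {"http://example.org/import/feed"}, 0.5)
```

Because provenance and confidence sit next to the triple, a filter like the one above reads them directly from each statement instead of joining against separate metadata statements.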

… The key “secret sauce” is that everything in Twine is generated from an ontology. The entire site – user interface elements, sidebar, navbar, buttons, etc. – comes from an application ontology…

Q: The first one was about privacy. What if you add something and then later you decide that you want to delete it – is it really deleted or does Twine keep it around?

A: Nova answered that currently, it is not really deleted, it goes into a non-visible triple. But they will be doing that (really deleting it) soon.

Q: As one imports information from various places, what exactly is there in Twine that will prevent a person having to merge any duplicate objects?

A: Nova said there is limited duplication detection at the moment, but this will be improved in a few months. Most people submit similar bookmarks and it is reasonably straightforward to identify these, e.g. when the same item is arrived at through different paths on a website and has different URLs.
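
One common way to catch the "same item, different URLs" case is to reduce each URL to a canonical form before comparing. The sketch below is a generic illustration in Python and is not a description of how Twine does it; the normalization rules and the `TRACKING_PARAMS` list are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of query parameters that do not change which page is shown.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}
DEFAULT_PORTS = {("http", 80), ("https", 443)}


def canonical_url(url: str) -> str:
    """Reduce a URL to a canonical form suitable for duplicate comparison."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    port = parts.port
    if port is not None and (scheme, port) not in DEFAULT_PORTS:
        host = f"{host}:{port}"
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so ordering does not matter.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_PARAMS
    )
    # The fragment is discarded: it points inside a page, not to another page.
    return urlunsplit((scheme, host, path, urlencode(params), ""))


def group_duplicates(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Map each canonical URL to the submitted URLs that collapse onto it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


bookmarks = [
    "http://www.Example.org/page/?utm_source=feed&id=7#top",
    "http://example.org/page?id=7",
    "http://example.org/other",
]
duplicates = group_duplicates(bookmarks)
```

This only handles syntactic variation. Two genuinely different URLs that serve the same content would still need content-based comparison or human input.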

Q: Why does Twine use tuple storage: why is it not using a quad?

A: Nova said it’s faster in their system, so for performance reasons they decided to avoid reification.
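
For readers unfamiliar with reification, the following Python sketch shows the cost that this answer appears to refer to. Attaching a single piece of metadata (here, a source) to a statement using the standard RDF reification vocabulary takes five triples, while a wider tuple carries the same information in one row. The `http://example.org/` URIs and both function names are hypothetical; only the `rdf:` terms come from the RDF vocabulary.

```python
from typing import List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
Triple = Tuple[str, str, str]


def reify(statement_id: str, s: str, p: str, o: str, source: str) -> List[Triple]:
    """Describe one statement plus its source using RDF reification."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
        # Hypothetical predicate linking the statement to where it came from.
        (statement_id, "http://example.org/source", source),
    ]


def as_wide_tuple(s: str, p: str, o: str, source: str) -> Tuple[str, str, str, str]:
    """Carry the same information as one row with an extra column."""
    return (s, p, o, source)


args = (
    "http://example.org/alice",
    "http://example.org/bookmarked",
    "http://example.org/page",
    "http://example.org/import/feed",
)
reified = reify("http://example.org/stmt/1", *args)
wide = as_wide_tuple(*args)
```

Each further piece of metadata adds another triple in the reified form, but only another column in the tuple form, which is one plausible reading of why a wider tuple was the faster choice.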


Author: Categories: Semantic Web, Web Science Tags: