Author Archive

OWL 2 Reference Card released

October 18th, 2009

We’re pleased to announce the OWL 2 Reference Card [1]. The Card is meant to be a “cheat sheet” of OWL 2 features, printable on both sides of a single sheet of paper. It is based on the OWL 2 Quick Reference Guide [2], which is now a Proposed Recommendation [3] in the OWL 2 Web Ontology Language document set.

Background: OWL 2 [4] is an extension of OWL 1 that adds a few new features. Some of the new features are syntactic sugar (e.g., disjoint union of classes), while others offer new expressivity, including:

* keys;
* property chains;
* richer datatypes, data ranges;
* qualified cardinality restrictions;
* asymmetric, reflexive, and disjoint properties; and
* enhanced annotation capabilities.

Comments and suggestions on the Card are welcome (please send them to public-owl-comments@w3.org).

[1] http://www.w3.org/2007/OWL/refcard

[2] http://www.w3.org/2007/OWL/wiki/Quick_Reference_Guide

[3] http://www.w3.org/TR/2009/PR-owl2-quick-reference-20090922/

[4] http://www.w3.org/TR/owl2-overview/

Jie Bao

Author: Jie Bao Categories: tetherless world Tags:

I will pay delicious $100 for hierarchical tagging

June 19th, 2009

I just saw Jim’s post, “What is the Semantic Web really all about?”

I have been wondering about this problem too. What is the Semantic Web? Yesterday I asked a question, “Why few (or none?) Web 2.0 sites provide hierarchical tagging?”, on LinkedIn and got some pretty good answers:

http://www.linkedin.com/answers?viewQuestion=&questionID=496785&askerID=14212719

For your convenience, I have attached my LinkedIn post at the end of this post.

Two things in the answers drew my attention:
* Many do _not_ believe that tags, or even hierarchical tags, are semantic; to them, “semantics” means RDF or triples at least;
* Some believe that even implementing a hierarchical tagging system is not easy, for engineering or social reasons.

I think these two beliefs, among many other reasons, may explain in part why the “Semantic Web” is still far from a reality. The first is an overestimation of what “semantics” requires: triples are one way to express semantics, but whether they are _the_ way is an open question. The second is an underestimation of “Web” scale: realizing a knowledge system on the Web, even one that is conceptually “simple”, can lead to serious scalability problems, both for machines (can you answer every query in under 1s?) and for people (who must change their way of thinking).

Here is what I believe about the “semantic web” (note the lack of capitalization). First, it is not necessarily “the Semantic Web” (just as there is no “the Mobile Web”) as defined by W3C standards or the layer cake model. Semantics is a way of organizing things; RDF and OWL are some ways to express it, but other ways should be encouraged too and sometimes work better. Second, tools and services should be “web-ish”, something like a semanticized version of YouTube or Gmail; after all, “web users” are rarely bioinformaticians, and few can master a Java-based ontology editor. Third, start deployment with very, very basic semantics like trees (yeah, I know some will protest) and sameAs, but do it in a very, very efficient way. If we can’t even come up with a Web-efficient tree reasoner, how realistic is it that we can come up with a Web-efficient RDF or OWL reasoner?

Now I’m prepared to dodge tomatoes :D

by Jie Bao

===============

My original post on LinkedIn (reorganized a bit)

Why few (or none?) Web 2.0 sites provide hierarchical tagging?

Gmail labels and delicious tags are flat, which is a constant annoyance for me. I have to add many tags unnecessarily, even when they could easily be inferred. I haven’t found an alternative that allows me to organize my tags in a tree or network. Is there a technical or marketing reason?

People have been talking about the semantic web for a while and are looking for a killer app. It seems apparent that hierarchical tagging is semantic, is in high demand, and is relatively easy to do. Why is there none on popular sites?

PS 1: Let me describe some situations in which hierarchical tagging would save me a lot of time. Recently I have been reading a book by Qian Mu, a historian, and tagging my notes on delicious with the tag “qianmu”. I also want all those notes to be tagged with “history”, but I always have to add both “qianmu” and “history” by hand.

Sometimes I want more than one tag to be inferred. For example, when I add “wuxu” (the year 1898), I want the tags “qing”, “china” and “reform” to be added as well. You can see how troublesome it is to add all 4 tags each time when you have about 10 notes on “wuxu”.

In another example, I want to share my tags in both Chinese and English. If I could define two subclass relations between two tags, each in a different language, I would not always have to add both tags.
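
A minimal sketch of the kind of inference described in these examples, in Python. It is only an illustration, not anything delicious or Gmail provides: the `BROADER` table, the `expand_tags` helper, and the Chinese tag “历史” are assumptions made up for the example, while the other tag names come from the post. Each tag points to the broader tags it implies, and a note’s tags are expanded by following those links.

```python
from collections import deque

# Hypothetical "broader tag" links; the structure is an assumption.
# A pair of tags that point at each other models the two subclass
# relations between an English tag and its Chinese counterpart.
BROADER = {
    "qianmu": {"history"},
    "wuxu": {"qing", "reform"},
    "qing": {"china"},
    "history": {"历史"},
    "历史": {"history"},
}


def expand_tags(tags, broader=BROADER):
    """Return the given tags plus every tag reachable via broader links."""
    seen = set(tags)
    queue = deque(tags)
    while queue:
        tag = queue.popleft()
        for parent in broader.get(tag, ()):
            if parent not in seen:  # the seen set also makes cycles safe
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(expand_tags(["wuxu"])))
print(sorted(expand_tags(["qianmu"])))
```

With this table, adding “wuxu” alone yields “wuxu”, “qing”, “china” and “reform”, and adding “qianmu” yields “history” in both languages. The traversal visits each link at most once, so even about 1000 tags would be cheap to expand.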

Now I have about 1000 tags on delicious. I’m really, really in desperate need of a hierarchy. I’m willing to pay delicious $100 for such a service.

PS 2: Further clarification: I don’t believe I need a tagging system that always requires me to pick terms from a tree, DAG, or network. I should still be able to add tags freely. But I need some way to clean up and organize my tags from time to time. It is just like how I clean up my “download” folder: I put the files into different folders, and if a folder gets too big I make some subfolders.

Author: Jie Bao Categories: Semantic Web Tags: ,

A little, tiny semantics in action — from Google

February 20th, 2009

I just read about Google’s Canonical Link Tag. It’s a small application of the HTML “rel” attribute, the same attribute RDFa uses to express relationships. It is not a big thing, but I’m happy it comes from Google, which has seemed quite remote from semantic web technologies.

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

“Last week Google, Yahoo, and Microsoft announced support for a new link element to clean up duplicate urls on sites. The syntax is pretty simple: An ugly url such as http://www.example.com/page.html?sid=asdf314159265 can specify in the HEAD part of the document the following:

<link rel="canonical" href="http://example.com/page.html"/>

That tells search engines that the preferred location of this url (the “canonical” location, in search engine speak) is http://example.com/page.html instead of http://www.example.com/page.html?sid=asdf314159265 .”
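
To show how little machinery a consumer needs to use this hint, here is a rough Python sketch that reads the canonical location out of a page using only the standard library’s `html.parser`. The `CanonicalFinder` class, the `find_canonical` function, and the sample page are hypothetical names made up for illustration; this is not how any search engine actually implements it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link> whose rel includes 'canonical'."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Called for both <link ...> and self-closing <link ... />.
        # Simplification: this does not check that the link is inside <head>.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the document, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = """<html><head>
<title>Example</title>
<link rel="canonical" href="http://example.com/page.html"/>
</head><body></body></html>"""

print(find_canonical(page))  # http://example.com/page.html
```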
Author: Jie Bao Categories: Semantic Web Tags:

AAAI Fall Symposium on Automated Scientific Discovery

November 11th, 2008

Day One

I (Joshua Taylor) am now back in Troy after spending the weekend (Friday through Sunday) in Arlington at the AAAI Fall Symposium on Automated Scientific Discovery. Presented papers, keynote address, and slides will become available on the supplementary symposium page over the next week or so.

After opening remarks by symposium chairs Selmer Bringsjord and Andrew Shilliday, Doug Lenat gave an opening keynote address entitled Looking Both Ways. Doug has a long history in automated discovery, from his 1976 PhD thesis on the Automated Mathematician (AM) and later Eurisko, to present-day work at Cycorp. Doug explained a great deal about the techniques behind AM and Eurisko, and also talked about some of the criticisms that these systems received. He then spoke about the development of Cyc. We learned just how much Cyc has evolved: it moved from frame-based systems and description logics to a much more expressive formalism, left behind theoretically desirable global consistency in favor of more pragmatic and cognitively plausible locally consistent microtheories, and now has enough (manually encoded) knowledge that high-level machine learning and automated knowledge acquisition are possible. He stressed that one of the factors making such knowledge acquisition and learning possible is the widespread adoption of the World Wide Web, and I noted that some of his high-level diagrams included SQL and SPARQL. (While I knew about the OpenCyc project, I only just now became aware that a great deal of OpenCyc is Semantic Web friendly.)

Andrew Shilliday followed Doug’s keynote with a history of automated scientific discovery, particularly as it relates to his own research and upcoming thesis. He also described the Elisa system for assisting users in discovery in scientific and mathematical domains.

After lunch, Alexandre Linhares discussed Douglas Hofstadter’s notion of Fluid Concepts and an implementation thereof.

Siemion Fajtlowicz spoke about the development of Graffiti, a well-known system for conjecture generation within the domain of graph theory. Siemion also discussed more recent work connecting graph theory conjectures and molecular structure conjectures.

Jean-Gabriel Ganascia presented A Reconstruction of Some of Claude Bernard’s Scientific Steps, which also documents the development of Cybernard. In a manner that I am particularly fond of, Jean-Gabriel, in order to automate scientific discovery, takes as a starting point Claude Bernard, a human who both made many scientific discoveries and documented just what he did. One of the interesting aspects of this work is that it involves developing ontologies of the scientific process of discovery, as well as of the scientific concepts with which Bernard worked. The importance of ontology evolution was also stressed, for as scientific knowledge increases, scientific conceptualization must also change.

After Jean-Gabriel, Susan Epstein discussed Knowledge Representation in Automated Scientific Discovery. She discussed how concepts in automated scientific discovery are often expressed as sets and how, as a result, the conjectures that are generated are usually those that can be expressed in set-theoretic terms. For instance (and I’m choosing an example that I remember from Doug Lenat’s talk), the conjecture that perfect squares greater than 1 have at least three divisors (every perfect square n > 1 has as divisors at least 1, n, and the root of n) could be made based upon the observation that the set of perfect squares greater than 1 is a subset of the set of numbers with at least three divisors. She proposed a representation, different from sets, that uses testers and generators, which are, respectively, predicates for determining whether a candidate is an example of a concept and functions that produce examples of a concept.
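
A small sketch may make the tester/generator idea concrete. The Python below is only my illustration, not Epstein’s representation: the function names and the `counterexamples` helper are hypothetical. A tester is a predicate on candidates, a generator produces examples, and a conjecture of the form “every example of concept A is an example of concept B” can be checked empirically by running A’s generator through B’s tester.

```python
from itertools import count, islice
from math import isqrt


def is_perfect_square(n):
    """Tester: decide whether n is an example of 'perfect square'."""
    return n >= 0 and isqrt(n) ** 2 == n


def perfect_squares():
    """Generator: produce examples of 'perfect square' (1, 4, 9, ...)."""
    return (k * k for k in count(1))


def has_at_least_three_divisors(n):
    """Tester for a second concept, by brute-force divisor counting."""
    return sum(1 for d in range(1, n + 1) if n % d == 0) >= 3


def counterexamples(generator, tester, limit):
    """Return the first `limit` generated examples that fail the tester."""
    return [x for x in islice(generator(), limit) if not tester(x)]


print(counterexamples(perfect_squares, has_at_least_three_divisors, 20))  # [1]
```

Checked this way over the first 20 squares, the only counterexample is 1, whose sole divisor is itself, so the conjecture as tested holds only for perfect squares greater than 1. A finite check like this can suggest or refute a conjecture but of course does not prove it.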

Epstein’s talk concluded with an example of student discovery and conceptual refinement for the game Pong Hau K’i (or Umulkono, 우물고노, in Korea). As AI researchers, seeing the progression of formalizations that students went through in diagramming the state space of a game was quite interesting. (A colleague and I played the game on paper during the plenary session. I lost.)

Day Two

Alan Bundy started the day with a keynote called Why Ontology Evolution is Essential in Modeling Scientific Discovery. The title is a good overview, and some of the comments made about Jean-Gabriel’s work apply here too. The importance of providing for ontologies that can change and evolve along with scientific conceptualization must not be underestimated. Examples were drawn from physics and astronomy, particularly the discovery that heat and temperature are not equivalent and the precession of the perihelion of Mercury. Michael Chan’s later talk would also touch upon their work on ontology evolution and repair systems.

David Jensen spoke about Automatic Identification of Quasi-Experimental Designs for Discovering Causal Knowledge. I think that this work is important, particularly for science performed using Semantic Web technologies, where, ideally, data collection could be automated rather than planned. From his abstract: “[Quasi-Experimental Designs] are a family of methods for exploiting fortuitous situations in observational data that emulate control and randomization.” For instance, although a dataset as a whole may have significant bias or sampling issues, subsets of the data may actually suggest another experiment, and one with better experimental method. Jensen discussed an example in which two groups of researchers reached different conclusions about links between early sexual activity and juvenile delinquency. The first group of researchers used the dataset as a whole, while the second examined twins in the dataset, a subset of the observational data, but one in which it was practically guaranteed that most variables would be identical (e.g., twins in the same household, same type of family life, &c.).

The posted schedule had Konstantine Arkoudas speaking next, but he and I switched places, so I gave the next presentation, Discovery Using Heterogeneous Combined Logics. The title was a bit misleading (though it matched the extended abstract that I submitted), as I actually spoke about how specialized reasoners might be invoked on decidable subproblems of an overall goal. In particular, I showed the Dreadsbury Mansion Mystery and the typical first-order logic formalization that students at RPI will generate for it. Of course, with an arbitrary FOL formalization, there aren’t many guarantees about what an automated reasoning system can do with it. I then showed that there’s a natural translation into the description logic ALBO, which has a decision procedure. I have not yet done any work in automating this process, but neither does it seem impossible to automatically recognize such a reduction. This is all preliminary work, so I was grateful for references from Alan Bundy and Simon Colton to related work.

After lunch, Michael Chan presented more work on ontology repair. It seems that their research group has a framework in which ontology repair plans are components. Chan discussed one called Inconstancy, in addition to Where’s My Stuff? that Alan Bundy had mentioned earlier.

Selmer Bringsjord showed his continuing work on an automated discovery of Gödel’s first incompleteness result.

Day Three

Day three began with a keynote from Simon Colton called Joined-Up Reasoning for Automated Scientific Discovery: A Position Statement and Research Agenda. Simon discussed his system, HR. Doug Lenat had to leave the symposium early, which is unfortunate, as Simon’s presentation made some comparisons between AM and HR. Simon also made some connections between scientific discovery and artistic creativity. By this time, the symposium was drawing to a close, and so there was not as much time as we would have liked, but the presentation slides, and perhaps an audio recording, should be available on the supplementary symposium site relatively soon.

The symposium ended with a practical discussion of funding, publications, and possible future work and collaborations.

//JT

Author: Jie Bao Categories: AI, Uncategorized Tags:

Why Bother…

October 28th, 2008

From Talis: “Jim Hendler at the INSEMTIVE 2008 Workshop”

“that people will (and do) create metadata when there are obvious and immediate benefits in them doing so. No-one really consciously sits down to share or create metadata: they sit down to do a specific task and metadata drops out as a side-effect.”

I couldn’t agree more. I once tried to tag all my blog posts; after a few weeks, I found myself bored because there were no clear, immediate benefits in doing so. I would only tag things when I had to, for example to give my friends a list of posts on the same topic.

The only tagging system that has consistently worked for me is Gmail labeling: I organize mail related to the same task (like writing a paper) on a daily basis, because it is very useful, and immediately useful. Even so, I label only a tiny fraction of all my emails.

I have seen too many people whose desktops are full of files and who are too lazy to organize them, and I am one of them. Every year I have to spare a day or two to reorganize my hard disk and dig out the hidden treasures of my “Downloads” folder. I believe that for the semantic web to be successful, creating an ontology should be at least as easy and as useful as organizing files on a hard disk.

In fact, people are creating metadata or even ontologies every day: every email sorted, every contact on the cell phone, every folder created, every calendar item, every wiki post, … We just need to make them explicit and, most of all, to do so without bothering the user to click even one more button.

Jie Bao

Author: Jie Bao Categories: Uncategorized Tags: