Author Archive

My report on Open Government Data camp 2011

November 2nd, 2011

A few days ago I (Alvaro Graves) participated in the Open Government Data Camp 2011 in Warsaw, Poland, where people from different groups, organizations and governments met to discuss Open Data at the government level. Here are what I consider the most important issues raised in these talks.

The current state of OGD

David Eaves, an activist who advises the city of Vancouver, Canada, on Open Data issues, gave a keynote in which he described his views on the current state of the Open Data movement. First, it is striking that the success stories are no longer just a few (such as Data.gov or Data.gov.uk); there are dozens, perhaps hundreds, at national, regional and local levels. Similarly, the term Open Government Data is becoming increasingly popular, which is good because it makes it easier to stop explaining the ‘what’ and start focusing on the ‘how’.

Another interesting point is that the Open Government Data movement has already passed an inflection point: its members are no longer seen as people making demands from the outside, but are increasingly being invited to help work on these initiatives from within government. For many, this change in perspective can be confusing, and it raises concerns that Open Data will be absorbed into a bureaucratic system that makes Open Data initiatives impossible to implement. However, it is clear that for these changes to occur, the movement cannot refuse to collaborate with governments.

Local initiatives, by locals

A talk that I really liked was by Ton Zijlstra, who lives in the city of Enschede, the Netherlands, which has only 150,000 inhabitants. He wanted an Open Data initiative there, but it was difficult to convince the authorities, so he and a group of people decided to start working on their own. Inviting a handful of hackers to a bar, they created their first application, which used data from Twitter, Foursquare, and the venues of a local festival. Eventually they convinced the municipal government that the default option for local data ought to be open.

From this experience, Ton drew several important lessons. You have to create something concrete, no matter how small: something that requires little funding (the first beers at the bar were free) and is short-term (no more than a couple of weeks). It does not matter whether it is original or not; there are some great ideas out there that deserve to be copied and are very useful for the local community.

How Open Data died

Another very interesting keynote was by Chris Taggart, founder of OpenCorporates, who warned of the risks that the Open Data movement is facing today. His main concern is how little impact Open Data has had on society so far. For example, he mentioned that no one’s business depends on Open Data yet (this is not strictly true, since there are a few out there, but I have to concede they are rare examples). In general, making data available is not enough; it needs to be used, whether in applications, by data journalists, or in other ways. It is also fundamental to link different Open Data sites to each other (something quite uncommon in the movement), so that people can find more information. Finally, I liked his idea that if Open Data does not cause problems for incumbents, then it is not working.

Redefining what is public

Finally, another talk that I found interesting presented the idea of Andrew Rasiej, founder of Personal Democracy Forum, and Nigel Shadbolt, professor at the University of Southampton, to redefine “public” data as data that “is available on the Web in machine-processable formats.” That is, uploading a bunch of PDFs with scanned tables does not make that information public, because it is not easily accessible. This initiative raises the bar for what counts as public data, especially when compared to the FOIA (Freedom of Information Act), which only allows you to request information from the government. Note that this applies to all information, as Rasiej so vehemently put it.

So… what did you talk about at OGDCamp?

In my case, I presented a system for publishing Linked Data called LODSPeaKr, which can be used for the rapid publication of government data and to create applications based on Linked Data. In the near future I will be writing more about this framework, but for now you can see my presentation here.


Building a mobile app for ISWC2010

November 9th, 2010

One of the things I find annoying while attending conferences is the huge amount of paper we receive: talks, maps, and “metadata” about the conference in general. So, for ISWC2010, my solution was to create a mobile application where people attending the conference could retrieve the information they wanted.

The first question you should ask when you create a mobile app is which niche you want to cover. The mobile ecosystem contains a wide range of devices, each with different capabilities and features. This means that a developer has to choose which platforms to support and which features he or she can use.

With ISWC in mind, my impression was that most attendees would use a smartphone, in particular an iPhone or an Android device. Since I wanted to cover both platforms, I decided not to create native applications (for now).

I based my work on Sencha Touch, which is a nice library that uses CSS3, HTML5 and JavaScript. The app works fine on iPhone, iPod, iPad and Android devices, as well as in the Chrome and Safari browsers.

In this app it is possible to obtain and navigate through information about authors, papers, workshops (sadly, I could not obtain the data about workshop papers in time), scheduling, rooms and sessions. The data is obtained from a SPARQL endpoint containing the ISWC 2010 metadata. Each action triggers a SPARQL query, and the results are retrieved as a JSON object. I also obtained pictures of authors from Arnetminer.
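To give an idea of what "a SPARQL query whose results come back as JSON" looks like, here is a minimal sketch. It is written in Python for brevity, not in the JavaScript the app actually uses, and it is not the app's code. The endpoint URL, the query and the sample response are made-up placeholders; only the overall shape of the response (the standard SPARQL JSON results layout with "head" and "results") is assumed.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: the real ISWC 2010 endpoint is not given in this post.
ENDPOINT = "http://example.org/sparql"


def build_query_url(endpoint, query):
    """Build a GET URL that passes the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_results(results_json):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given row, so check before reading it.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Made-up response, used here instead of a real network call.
SAMPLE = """
{"head": {"vars": ["paper", "title"]},
 "results": {"bindings": [
   {"paper": {"type": "uri", "value": "http://example.org/paper/1"},
    "title": {"type": "literal", "value": "An example paper"}}
 ]}}
"""

url = build_query_url(ENDPOINT, "SELECT * WHERE { ?s ?p ?o }")
rows = rows_from_results(SAMPLE)
```

In a real client the URL would be fetched with an Accept header asking for SPARQL JSON results, and each row would then be rendered as a list item in the app.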


Finally, I added a Twitter feed with all the relevant hashtags (#iswc, #iswc2010, #cold2010, #seres2010, #c3lsw2010, etc.), so you can read what people are tweeting about the latest events (you shouldn’t have problems with firewalls, etc.).
I encourage you to go to http://iswc.mobi and try ISWC Mobile. Of course, comments, suggestions (and bug reports!) are always welcome.

Alvaro Graves

Categories: iswc, tetherless world, twitter

RPI Hackathon: Linking government data

December 9th, 2009

This is an invitation to participate in the RPI Hackathon 2009 for linking government data. For more detailed information check our wiki.

Part of the work done here in the Tetherless World Constellation consists of translating the government datasets available from data.gov into RDF. This effort has produced billions of triples from (at the time of writing this post) more than 130 datasets. This data can be used in multiple ways: it can be queried from a SPARQL endpoint, used in visualizations such as maps, or combined with other datasets (whether from data.gov or other sources) to look for correlations, clusters or other types of analysis.

However, we think that the data is more interesting and useful when it is linked: for example, a system can answer a specific query and also suggest other sources of information that may be relevant to the user. So while we keep translating datasets, we would also like to link them to the Linked Data cloud, and for that we are asking for your help.

On December 12th and 13th we will host a Hackathon (i.e., an event where people gather to work on a specific computational problem). This event is part of the Great American Hackathon promoted by Sunlight Labs. We will host it at the Winslow Building, RPI, in Troy, NY. It will run from 10AM to 5PM, but if you only have a few spare hours, you are also welcome! As I mentioned above, our main goal is to link the available data to the Linked Data cloud, but if you have other ideas to develop using one or more of the datasets, please join us too! The only requirement is to bring your computer and register by email to gravea3[@]rpi.edu or difrad[@]rpi.edu. Because we know big brains need energy, food and beverages will be provided. Even if you can’t attend in person, you can help us by working online.
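As a rough illustration of what "linking" could mean in practice, the sketch below matches entities from two datasets by their labels and emits owl:sameAs statements as N-Triples. This is only one naive strategy, written in Python for the sake of the example; it is not the method we use, and all the URIs and labels in it are invented for illustration.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(label):
    """Lowercase a label and collapse whitespace so trivial variants match."""
    return " ".join(label.lower().split())


def link_by_label(local, external):
    """Return N-Triples lines linking entities whose labels match exactly.

    Both arguments map an entity URI to its human-readable label.
    """
    index = {normalize(label): uri for uri, label in external.items()}
    triples = []
    for uri, label in sorted(local.items()):
        match = index.get(normalize(label))
        if match is not None:
            triples.append(f"<{uri}> <{OWL_SAME_AS}> <{match}> .")
    return triples


# Invented example data: a converted dataset and an external Linked Data source.
local_entities = {
    "http://example.org/dataset/state/36": "New  York",
    "http://example.org/dataset/state/99": "Atlantis",
}
external_entities = {
    "http://example.com/resource/New_York": "new york",
}

links = link_by_label(local_entities, external_entities)
```

Exact label matching will miss many links and can produce wrong ones (several places share a name), which is precisely why human help at a hackathon is valuable.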

Everyone is invited to participate. If you have any comments, questions, etc. please don’t hesitate to contact me at gravea3[@]rpi.edu or check the announcement in data-gov.

Alvaro Graves and the Data-gov team.


Rankings, Google and Semantic Web

November 12th, 2008

Over the last centuries, humankind has experienced an exponential increase in the information available. This is even more perceptible on the Web, where information can be reached with just a few clicks... far more than we as humans can process and assimilate.

Thus we need to discriminate among a gargantuan amount of available information to find what we are looking for (or the closest thing to it). The traditional Information Retrieval approach is based on searching for keywords. However, among several (potentially billions of) pages, it is difficult to tell which ones have the most useful information for what we are looking for. The general idea for discriminating among them is based on the concept of ranking(*): an order (whether partial or total) of some entities based on a set of criteria.

This is a good way of handling information because we don’t have the resources (time, memory, etc.) to navigate through all the available data. And that is exactly what Google does: we ask “I need to find some pages that contain keywords X, Y and Z” and Google answers “Look, according to my algorithm and the data I have, here is a list ordered starting from what I think is the most relevant page for your query”.

The Semantic Web brings similar challenges, but in this case we are not talking about pages and links, but about any entity (people, cars, webpages, ontologies) related by different predicates (people have friends, cars have parts, webpages have authors, ontologies describe other entities and so on). Thus the problem is far more complex, since there is more information available.

Also, there are other questions we can ask: Which ontology should I choose for a certain task, given dozens of possible candidates? When using that ontology, if I have a SPARQL query that returns 1e6 results, are they all equally interesting to me? If not, which ones should be shown first?

The idea of opening your data, sharing it and mashing it up makes everything more complex: it is not enough to have millions of answers; as a user I want the ones best suited to me (whatever that means).
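To make the notion of ranking concrete, here is a toy sketch in Python: it orders pages by how many times the query keywords appear in them. This is of course not Google's algorithm, only the simplest criterion I could think of, and the page names and texts are invented.

```python
def score(text, keywords):
    """Count how many times the keywords occur as whole words in the text."""
    words = text.lower().split()
    return sum(words.count(keyword.lower()) for keyword in keywords)


def rank(pages, keywords):
    """Return page names ordered from most to least relevant.

    Pages that contain none of the keywords are dropped; ties are broken
    alphabetically so that the resulting order is total.
    """
    scored = [(score(text, keywords), name) for name, text in pages.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for points, name in ordered if points > 0]


# Invented pages, for illustration only.
pages = {
    "a": "semantic web and linked data",
    "b": "web web web",
    "c": "cooking recipes",
}

ranking = rank(pages, ["web", "data"])
```

Here page "b" comes first simply because it repeats a keyword three times, which also shows how easy it is to game a naive criterion and why the choice of criteria matters so much.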
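One practical consequence: if a query returns a million results, we rarely need to sort all of them; we only need the few we are going to show first. The Python sketch below keeps the top k results according to some relevance function, using the standard library's heapq.nlargest. The "citations" field and the item URIs are hypothetical, and choosing a good relevance function is exactly the open question.

```python
import heapq


def top_k(results, k, relevance):
    """Return the k most relevant results without sorting the whole input."""
    return heapq.nlargest(k, results, key=relevance)


def fake_results(n):
    """Generate n made-up query results, one at a time."""
    for i in range(n):
        yield {"item": f"http://example.org/item/{i}", "citations": i % 1000}


# 1e6 results are streamed through; only the best three are kept in memory.
best = top_k(fake_results(1_000_000), 3, lambda row: row["citations"])
```

Because the results are consumed as a stream, this works even when the full result set would be too large to hold and sort at once.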

Alvaro Graves

(*) Linguistic thought: It is interesting for a native Spanish speaker like me that there is no translation for “ranking”. Does that mean that the concept didn’t exist in the Spanish-speaking world?

Categories: Semantic Web, Web Science