Exporting databases in the Semantic Web with SPARQL, D2R, dbview, ARC, and such
The developer track at WWW2006 last week in Edinburgh was really cool; you had to show up on time or you couldn't fit in the room! One of the coolest talks was D2R-Server - Publishing Relational Databases on the Web as SPARQL-Endpoints. I see D2R Server is released now. Cool.
Yes, storing RDF in a SQL database using 3-column tables (or 4 or 5 or 6...) is cool as far as it goes, but I'm glad we're finally seeing more work on taking existing SQL databases (whose schemas are not designed with RDF in mind) and exporting them as RDF.
TimBL wrote a design note on Relational Databases on the Semantic Web in 1998. In 2002, I wrote dbview.py, a couple hundred lines of Python that implements parts of it. Rob Crowell picked it up, and the 2005/2006 version of dbview.py now does foreign keys and backlinks.
D2R gets points for using RDF for its configuration/mapping info. The slides showed Turtle/N3. Why are the dbin brainlets in XML but not RDF? I wonder.
D2R Server has a mapping layer; dbview assumes that will be handled with rules. The choice of URIs for column names is interesting. D2R uses jdbc:mysql://127.0.0.1/wordpress#users1, but dbview is all about embedding a SQL database in HTTP space, so we use URIs like http://db.example/orders/customers/custno/1#item. In dbview, the decisions about when to use / and when to use # are made so that the result is browseable. In D2R, the default URIs don't matter as much because it's expected that they'll be mapped to a more well-known ontology/schema like foaf.
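To make the URI idea concrete, here is a rough sketch of exporting rows as triples with row URIs shaped like the example above. It is not dbview's actual code: the row_uri pattern follows the one URI quoted in this post, but the column_uri scheme, the function names, and the little sqlite table are all assumptions made up for illustration.

```python
import sqlite3
from urllib.parse import quote

BASE = "http://db.example"  # example host used in the post


def row_uri(db, table, key_col, key_val):
    """URI for one row: base/db/table/keycolumn/keyvalue#item."""
    parts = [quote(str(p), safe="") for p in (db, table, key_col, key_val)]
    return f"{BASE}/{'/'.join(parts)}#item"


def column_uri(db, table, col):
    """Hypothetical column URI (an assumption, not dbview's real scheme)."""
    return f"{BASE}/{quote(db, safe='')}/{quote(table, safe='')}#{quote(col, safe='')}"


def rows_as_triples(conn, db, table, key_col):
    """Yield (subject, predicate, value) for every non-key, non-NULL cell."""
    # Table name is interpolated directly; fine for a sketch, not for untrusted input.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    cols = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(cols, row))
        subject = row_uri(db, table, key_col, record[key_col])
        for col, value in record.items():
            if col != key_col and value is not None:
                yield (subject, column_uri(db, table, col), value)


def demo():
    """Build a one-row table in memory and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (custno INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
    return list(rows_as_triples(conn, "orders", "customers", "custno"))


if __name__ == "__main__":
    for triple in demo():
        print(triple)
```

The / versus # split here mirrors the browseability point: everything before the # is something an HTTP server could serve as a document, and the fragment names the thing described in it.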
dbview is still just a few hundred lines of Python; we haven't integrated the SPARQL parser that Yosi developed for cwm, nor integrated EricP's work on federated query.
Speaking of federated query... on Wednesday at the conference, I saw Tim Finin in the poster session. He showed me something the Swoogle folks are cooking up: you give it a SPARQL query, and it looks at the terms used in your query and suggests documents you should put in your SPARQL dataset to run your query against. I hope to hear more about that.
Somewhere in EricP's work is one of the several SPARQL-to-SQL rewriters out there... oh... I thought the HP tech report, A relational algebra for SPARQL was another one, but it seems to be by Richard Cyganiak, one of the D2R guys.
Benjamin Nowack's Feb 2006 item announced a SPARQL-to-SQL rewriter for his ARC RDF store for PHP.
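For anyone wondering what a SPARQL-to-SQL rewriter does at its very simplest, here is a toy sketch over the 3-column triples table mentioned earlier. It is not how ARC, D2R, or EricP's code works; it handles only a list of triple patterns (no parser, no OPTIONAL, no FILTER), assumes at least one variable with a plain alphanumeric name, and every name in it (bgp_to_sql, the triples table, the ex: and foaf: strings) is made up for illustration.

```python
import sqlite3


def is_var(term):
    """SPARQL-style variables are written ?name."""
    return term.startswith("?")


def bgp_to_sql(patterns):
    """Turn a list of (s, p, o) patterns into (sql, params) over triples(s, p, o).

    Each pattern becomes one alias of the triples table; a variable that
    appears more than once becomes a join condition between columns.
    """
    bindings = {}  # variable -> first column reference where it appears
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip("spo", pattern):
            ref = f"t{i}.{col}"
            if is_var(term):
                if term in bindings:
                    where.append(f"{ref} = {bindings[term]}")
                else:
                    bindings[term] = ref
            else:
                where.append(f"{ref} = ?")  # constant: bind as a SQL parameter
                params.append(term)
    select = ", ".join(f'{ref} AS "{var[1:]}"' for var, ref in bindings.items())
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


def demo():
    """Who does ex:alice know, and what are their names?"""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob", "foaf:name", "Bob"),
        ("ex:alice", "foaf:name", "Alice"),
    ])
    sql, params = bgp_to_sql([
        ("ex:alice", "foaf:knows", "?friend"),
        ("?friend", "foaf:name", "?name"),
    ])
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    print(demo())
```

Rewriting against an existing schema (the D2R and dbview case) is harder than this, since each predicate has to be mapped back to a particular table and column rather than to one generic triples table.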
Hmm... maybe it's time for a ScheduledTopicChat on SPARQL, SQL, and RDF? If you're interested, suggest a couple times that would be good for you in a comment or in mail to me and a public archive.