Will the Real URI Please Stand Up
Semantic Web: Documents vs. Elements
This is the fourth entry in our Semantic Web Series. By now, you know that we are shipping products and technologies to create a modern and pragmatic implementation of the Semantic Web. To accomplish this feat, we studied the Semantic Web vision and faithfully implemented a realistic solution, which required reinterpreting some of its fundamental concepts.
In a previous post, I discussed meta data creation. In this entry, I will discuss our implementation of another fundamental concept: the URI, or the IRI, as its internationalized generalization is called. The <alt> Semantic Web implementation diverges from the classical vision on three fundamental points related to the URI concept:
1. Only URLs matter. In our implementation, we only care about page identifiers and not those mythical urn:isbn: URI/URN/IRI concepts. Why? Because we generate meta data implicitly: whenever a Mashup is created, meta data is generated.
Furthermore, URLs are important because a request for a URL can be redirected, proxied, or otherwise intercepted. No other kind of identifier lets us retrieve the web resource.
2. HTML elements matter, not the Document. As discussed in the post about meta data, the traditional Semantic Web defines a publisher-centric view of the web. Consequently, the first items discussed as important meta data are "who is the author", "when was the document created", and other Dublin Core pieces. Who cares?
In these Web 2.0 days, the web has evolved to focus on the user. The Mashup user just cares about the data within the web page and not the dubious "creator". More concretely, as you can see in this 2003 screenshot of our Mashup tools, a user named Joe wants to repurpose a list of coffee shops into a mobile site. The user does not care about the ads surrounding the list. So Joe just wants the facts, just the facts.
So meta data must describe the HTML elements that are important to the user, and the user should therefore be able to create meta data describing those elements. Some elements might be important to some users but not to others. And that's OK because we empower users to manage their own meta data. And it's theirs because they possess it.
The Document is therefore just a shell containing data. Even though the Document is what is created, is the original context of search and computation, and is the unit of storage and transfer, it is superseded by the element in importance to the user.
3. A URL provides context to elements and therefore to meta data. Consequently, it is more than an identifier. On the web, a URL is often used to resolve relative locations into absolute locations for image and link elements.
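To make the third point concrete, here is a minimal sketch of how a page URL gives context to the elements inside it. It uses only Python's standard library and is not our product's code. The page URL, the markup, and the names LinkResolver and resolve_elements are all hypothetical and chosen purely for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkResolver(HTMLParser):
    """Collect image and link targets, resolved against the page URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resolved = []  # (tag, absolute URL) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # Only image and link elements are considered in this sketch.
        attr_name = {"img": "src", "a": "href"}.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # The page URL is the context that turns a relative
                # location into an absolute one.
                self.resolved.append((tag, urljoin(self.page_url, value)))


def resolve_elements(page_url, html):
    """Return (tag, absolute URL) pairs for the img and a elements in html."""
    parser = LinkResolver(page_url)
    parser.feed(html)
    parser.close()
    return parser.resolved


# Hypothetical page and markup, used only for illustration.
PAGE_URL = "http://example.com/guides/coffee/list.html"
SNIPPET = '<ul><li><a href="shops/joes.html"><img src="../img/cup.png"></a></li></ul>'

if __name__ == "__main__":
    for tag, url in resolve_elements(PAGE_URL, SNIPPET):
        print(tag, url)
```

The same markup resolved against a different page URL yields different absolute locations, which is why the URL has to travel with the element and with any meta data attached to it.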
So far in this series you have seen that we have made some fundamental changes to the classical Semantic Web with respect to meta data authoring and URIs. In my next blog post, I will write about our implementation of meta data.
Friday, March 28, 2008