November 27, 2007

What can Semantic web do for you?

Hello everybody! It's been a long time since I visited my own blog, and I am happy that my 941 readers are still subscribed to my feed! It was one of my newly acquired interests (or should I say crazes?) that brought me back here. It's called Semantic web. Some of you may have heard the term lately from startups that claim to be bringing Semantic web to reality. I kept reading about Semantic web until I felt enlightened! So, I plan to write a series of introductory articles so that you too can appreciate how it can take the World Wide Web to the next level.

Semantic web was proposed by the granddaddy of the WWW, Sir Tim Berners-Lee, way back in 1999. Academics have been thriving on it ever since, developing standards and writing research papers, but it took many years for business folks to assimilate his wild ideas and gather the courage to start companies. So what was so wild about it? He proposed that we transform the web so that machines can browse it and find answers to all sorts of complex questions, like "How long will Angelina and Brad stick together?". Quite crazy, ain't it? But if you take a closer look, he isn't that crazy (I am talking about Sir Tim :)

Today's web is a large collection of documents that are linked to each other. The hyperlinks carry text that tells you what the linked document is about. All he said was "Hey, let's do the same for all our data, that is, put it on the web in a machine readable form and link it together to form a Web of data". So, what can machines read and process? If you write "I am 20 years old" in your blog, only people like you and me can understand it. But if you use XML to represent it (for example, <age>20</age>), a piece of software can read it too. XML is good for representing structured data but it ain't good enough for Semantic web. Why?

Two reasons:
- XML does not clearly identify the subject and object of the fact that it is representing.
- XML on its own does not give each thing a globally unique identifier that can be used to link data from multiple websites.
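
To make the first shortcoming concrete, here is a minimal Python sketch that reads the <age>20</age> fragment from above with the standard library's XML parser. The variable names are mine, purely for illustration.

```python
import xml.etree.ElementTree as ET

# The XML fragment used as the example in this post.
fragment = "<age>20</age>"

element = ET.fromstring(fragment)
property_name = element.tag   # the tag name, "age"
value = int(element.text)     # the text content, converted to a number

# The program can read the value, but nothing in the fragment says
# WHOSE age this is, and "age" is only a local tag name that another
# website is free to use with a different meaning.
print(property_name, value)
```

The software gets a name and a number, but no subject and no identifier it could match against data published elsewhere.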

To handle these shortcomings, Sir Tim and the people who liked his idea proposed a standard called Resource Description Framework (RDF). Data about all kinds of resources, like people, places, documents, products etc., can be represented as "subject, property, object" triples, and each resource has a unique identifier (in real RDF, these identifiers are URIs). For example, the sentence "Jane is located in Paris" can be represented by the triple (Person:Jane, location, City:Paris). Person:Jane and City:Paris are the unique identifiers. This is a fact that Jane can declare on her website using RDF.
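
Here is a rough sketch of what such a triple could look like inside a program, in plain Python rather than a real RDF library. The example.org URIs and the Triple class are hypothetical stand-ins that I made up for Person:Jane, location and City:Paris.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact in "subject, property, object" form."""
    subject: str
    prop: str
    obj: str


# Hypothetical identifiers, invented for this illustration only.
BASE = "http://example.org/"
JANE = BASE + "Person/Jane"
PARIS = BASE + "City/Paris"
LOCATION = BASE + "property/location"

# "Jane is located in Paris" as a single triple.
jane_fact = Triple(subject=JANE, prop=LOCATION, obj=PARIS)

print(jane_fact.subject, jane_fact.prop, jane_fact.obj)
```

Because every part of the fact is a full identifier instead of a bare word, another website can refer to exactly the same Jane or the same Paris.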

Let's say another public service provides the mappings between cities and countries, and it has the fact that "Paris is in France" (City:Paris, country, Country:France). Then a software agent that processes both RDF triples can link them together and infer that Jane is in France. This is just a simple example of linked data. There is a lot of such public data that can be represented in RDF. DBpedia is a project that extracts such data from Wikipedia, stores it in RDF and also links it to other RDF data.
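
The linking step can be sketched in a few lines of Python. This is only a hand-written rule over plain tuples, assuming hypothetical example.org identifiers; real Semantic web software would use an RDF store with a query language or a reasoner instead.

```python
# Hypothetical identifiers, invented for this illustration only.
JANE = "http://example.org/Person/Jane"
PARIS = "http://example.org/City/Paris"
FRANCE = "http://example.org/Country/France"
LOCATION = "http://example.org/property/location"
COUNTRY = "http://example.org/property/country"

# Facts published by two independent sources.
janes_site = {(JANE, LOCATION, PARIS)}
city_service = {(PARIS, COUNTRY, FRANCE)}


def infer_countries(triples):
    """Return (resource, country) pairs implied by location + country facts."""
    # Map each city to its country.
    country_of = {s: o for (s, p, o) in triples if p == COUNTRY}
    # Anyone located in a city with a known country is in that country.
    return {
        (s, country_of[o])
        for (s, p, o) in triples
        if p == LOCATION and o in country_of
    }


# The agent merges both sources, then applies the rule.
merged = janes_site | city_service
inferred = infer_countries(merged)
print(inferred)
```

The join works only because both sources use the same identifier for Paris, which is exactly what the unique identifiers are for.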

RSS feeds are one of the major advances on the Web that came from such a common, machine readable data format. Long gone are the days when you needed to visit your favorite blogs and news sites every day to check for new content. With feeds, you are notified of all the new content in one place: your feed reader. This has been made possible because blogging providers and webmasters agreed upon a common format for syndication. That's the power of convergence that Semantic web can exploit further. Some of the semantic web software agents that you can expect in the future are:

- a travel agent that can book your vacation package on your behalf by taking your favorite destinations, preferred dates, budget etc. and automatically searching through online travel sites to pick the best deal.
- a shopping agent that you can personalize to search for products on your behalf. Once you provide your preferred brands, price range, features etc. for, say, a camera or a laptop, it will search for matching products on various online shopping sites and notify you about them every day.
- a dating agent that can help you identify your dream girl (based on matching interests, location, age etc.) no matter which social networking site she is on.

Did I make you dream? Then I have succeeded in my attempt :) Let me know your comments and opinions (I have re-enabled comments on my blog).

Next up: What can YOU do for Semantic web?