February 13, 2008
Reuters Semantic Web Web service
Let me disambiguate that title: Reuters is offering a Web service, called Calais, that will parse text and return it in a form (RDF) that can be utilized by Semantic Web applications. It uses natural language processing (from ClearForest) to find structures of meaning such as places, jobs, facts, events, etc. It apparently has its own metadata schema, but it allows users to extend it. It’s an open API, and Reuters is being quite generous in how much they’ll let you submit during this beta period. It’s English only for now, although they plan to support other languages, opening the exciting prospect of being able to find items of interest in languages you don’t understand via a unified metadata framework.
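To make the "unified metadata framework" idea a little more concrete, here is a minimal sketch of what finding items across languages could look like once text has been reduced to subject-predicate-object statements, which is the general shape of RDF. Everything in it is made up for illustration: the document IDs, the `mentionsPlace` and `language` predicates, and the helper functions are hypothetical and are not taken from Calais's actual schema or output, which I haven't seen.

```python
# Toy illustration only: these triples are invented and do not reflect
# Calais's actual schema, API, or output.
from collections import namedtuple

# An RDF-style statement: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical metadata extracted from documents written in different languages.
TRIPLES = [
    Triple("doc:en-1", "language", "en"),
    Triple("doc:en-1", "mentionsPlace", "place:Paris"),
    Triple("doc:fr-1", "language", "fr"),
    Triple("doc:fr-1", "mentionsPlace", "place:Paris"),
    Triple("doc:de-1", "language", "de"),
    Triple("doc:de-1", "mentionsPlace", "place:Berlin"),
]


def documents_mentioning(triples, place):
    """Return the sorted IDs of documents that have a mentionsPlace statement for `place`."""
    return sorted(
        {t.subject for t in triples
         if t.predicate == "mentionsPlace" and t.obj == place}
    )


def language_of(triples, doc):
    """Return the language code recorded for `doc`, or None if there is none."""
    for t in triples:
        if t.subject == doc and t.predicate == "language":
            return t.obj
    return None


if __name__ == "__main__":
    # The query is phrased in terms of the shared metadata, not the text itself,
    # so it matches documents regardless of the language they were written in.
    for doc in documents_mentioning(TRIPLES, "place:Paris"):
        print(doc, language_of(TRIPLES, doc))
```

The point of the sketch is only that the query never touches the original text: if documents in several languages are tagged with the same identifiers, one lookup finds them all.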
I’m going by the site’s FAQ. I haven’t tried it and can’t tell how well it works, how accurate it is, how comprehensive or detailed its metadata are, or how much post-processing cleanup users will want to do (which of course depends on the application). There are some points I just don’t understand, such as the claim “Calais carries your own metadata anywhere in the content universe.” But if it works within some reasonable definition of “works,” and if it gets widely adopted, Calais could make a lot more information a lot easier to find, and to process for further meaning.