...making Linux just a little more fun!

Away Mission -- SEMTECH - The Semantic Web Technology Conference

By Howard Dyckoff

Semantics is a new, interesting, and demanding area for web technologists; it promotes the creation of ontologies and provides tools to conserve meaning within the data. The use of semantics in the development of data taxonomies is now a $2 billion per year market, and is projected to grow to over $50 billion by the year 2010. A large part of this is driven by 3-letter government agencies [e.g., the NSA]; unsurprisingly, several of the attendees hailed from that sector.

IT industry analysts have estimated that 35-65% of our 'untamable' system integration costs are due to semantic issues [semantic discontinuities]; this is making the overlap of semantics and web technologies an issue of critical interest to businesses on the cutting edge.

The fusion of semantics and web technologies goes back to the late 1990s and was proposed by Tim Berners-Lee. His proposal to the W3C, "A roadmap to the Semantic Web," in 1998 inspired other contributors.

In 2004, the W3C released the Resource Description Framework (RDF) and the Web Ontology Language (OWL) as W3C Recommendations. RDF is used to represent information and to exchange knowledge on the Web; in this system, the combination of Subject/Predicate/Object constitutes an RDF "Triple". URIs are used to name the things being described and the relationships between them, and the object of a triple may be either another URI or a literal value.
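To make the triple idea a bit more concrete, here is a minimal Python sketch that treats a set of statements as plain tuples. It is not an RDF library, just an illustration; the example.org URIs and the objects() helper are made up for this sketch, and the "creator" URI is the Dublin Core property of that name.

```python
# A toy model of RDF statements as (subject, predicate, object) tuples.
# The example.org URIs are hypothetical, chosen only for illustration.
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # a URI naming the resource being described
    predicate: str  # a URI naming the property or relationship
    obj: str        # another URI, or a plain literal value


DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
ARTICLE = "http://example.org/articles/semtech"       # hypothetical
TOPIC = "http://example.org/terms/topic"              # hypothetical

graph = {
    Triple(ARTICLE, DC_CREATOR, "Howard Dyckoff"),
    Triple(ARTICLE, TOPIC, "http://example.org/topics/SemanticWeb"),
}


def objects(graph, subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


creators = objects(graph, ARTICLE, DC_CREATOR)
print(creators)
```

Because every statement has the same three-part shape, statements from different sources can simply be pooled into one set, which is much of RDF's appeal for data exchange.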

OWL is used to publish and share sets of terms called ontologies, which support advanced Web searching, software agents, and knowledge management. OWL and RDF are two of the main ingredients of today's Semantic Web. For more information, see the working definition of RDF quoted near the end of this article.

In simple terms, RDF provides a standard and unambiguous way to make statements about resources, while OWL overlays RDF and provides the ability to make inferences about data. OWL, in turn, is used to build complex rule systems. Prior art for OWL comes from DAML, the so-called DARPA Agent Markup Language.
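The "inference" part is easier to see with a small example. The sketch below is a toy, not an OWL reasoner: it applies just two RDFS-style rules (subclass relationships are transitive, and an instance of a class is also an instance of its superclasses) until nothing new turns up. The ex: names are hypothetical, loosely borrowed from the media-collection use case discussed later.

```python
# Toy forward-chaining over two RDFS-style rules (not a real OWL reasoner):
#   1. subClassOf is transitive
#   2. an instance of a class is also an instance of its superclasses
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Apply the two rules repeatedly until no new triples appear."""
    known = set(triples)
    while True:
        new = set()
        for (s, p, o) in known:
            for (s2, p2, o2) in known:
                # Only pairs of the form (s p o), (o subClassOf o2) matter.
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    new.add((s, TYPE, o2))
        if new <= known:
            return known
        known |= new


# Hypothetical facts: only the first three are stated explicitly.
facts = {
    ("ex:AnimationCel", SUBCLASS, "ex:Artwork"),
    ("ex:Artwork", SUBCLASS, "ex:MediaAsset"),
    ("ex:cel42", TYPE, "ex:AnimationCel"),
}
closure = infer(facts)
print(("ex:cel42", TYPE, "ex:MediaAsset") in closure)
```

Nobody ever stated that ex:cel42 is a media asset; the program derives it from the class hierarchy. Real OWL reasoners support a far richer set of constructs, but the principle is the same.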

Semantic Web Technology

The big idea here is to extend the HTML web with the addition of "meaningful" tags such that software can help us find things - and then relate those things to other things. The Semantic Web extends the utility of the World Wide Web through the use of standards, semantic markup languages, inference engines and related processing tools. That may well be the stuff of Web 3.0...

A Sunday evening primer on the Semantic Web, "Semantics in Perspective," was given by Dave McComb, President of Semantic Arts. His presentation drew from his 2005 tutorial, which is available from his company web site.

This is, admittedly, very dense material; after the opening keynote and a tech session, I felt the early onset of brain fatigue. But the final session I attended, on the challenges of classifying Disney's vast media collection, gave me a much better sense of why and how to apply these techniques. I also learned that semantic metadata can increase the value of content, especially media. For instance, an animation cel in a frame is worth more depending on 'who' painted it, 'when', and how many there were. [See this great slide pack for more].

That opening keynote with Ora Lassila [co-author of the RDF standard and Research Fellow at the Nokia Research Center] set the tone and context for the entire conference. He shared the stage with James Hendler [Director of the Joint Institute for Knowledge Discovery, University of Maryland], with whom he and Tim Berners-Lee had co-authored the seminal 2001 Scientific American paper "The Semantic Web", which discussed the concept of the Semantic Web vision.

They had also independently written two of the earlier, defining articles on the concept of the Semantic Web, setting the terminology and direction for many of the researchers and developers to follow. We are not, however, talking about a long history here: Ora wrote about pervasive computing in his 1999 article, "Why your microwave oven needs to surf the WWW." Jim published his DARPA article on "Tetherless DAML and the Mobile Semantic Web" in Oct. 2000.

In an informal and interweaving style, both Ora and Jim described a timeline of Semantic Technology history and also gave themselves a report card on their pronouncements. [The slides for this keynote are available here.]

The prehistory of the Semantic Web starts in the 1990s, emphasizing metadata and the second wave of AI. Much of the discussion at that time focused on the key changes in traditional AI, its failings and its rebirth with space probes like Mars Rover and Deep Space-1.

Ora -- "We learn from the history of AI that only embedded AI was successful. Standalone AI did not succeed." He noted that reasoning tools are a 'means' and not an end. "And toward any end, content is still very important."

Ora -- [referencing the 2001 Scientific American article where he wrote about robust agents to work on the "behalf" of people] "... Sem-Tech web services are the 'plumbing' for these agents, and some trials are now underway. These must be able to handle unexpected and inconsistent data... Semantic systems will help here."

Later, Ora said, "Another thing about agents... I work for a company that makes little mobile devices... the way we use mobile devices for browsing is not the best possible paradigm... [there are, e.g., problems when driving]. So here is another case where you need systems that work for you."

Jim -- "...the miracle is not in a new magic box... it is when we start linking things together." He went on to describe SPARQL [an RDF Query Language] as a doc and data exchange mechanism. With a picture of massively interlinked systems on screen, he added "... we have both broad tech and deep technology -- we will use these things we all are building but also we must link them all together..."
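For readers who haven't met SPARQL, the core idea is matching patterns with variables against a set of triples. The sketch below is a toy matcher in that spirit, not a SPARQL engine; the match() and query() helpers and the ex: identifiers are assumptions made for this illustration. The patterns correspond roughly to the SPARQL query SELECT ?who WHERE { ?paper dc:title "The Semantic Web" . ?paper dc:creator ?who }.

```python
# Toy basic-graph-pattern matcher in the spirit of a SPARQL SELECT.
# Terms starting with "?" are variables; everything else must match exactly.
def match(pattern, triple, bindings):
    """Try to extend bindings so pattern equals triple; None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(graph, patterns):
    """Return all variable bindings that satisfy every pattern."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical identifiers for the two articles mentioned in the text.
graph = [
    ("ex:sciam2001", "dc:title", "The Semantic Web"),
    ("ex:sciam2001", "dc:creator", "ex:lassila"),
    ("ex:sciam2001", "dc:creator", "ex:hendler"),
    ("ex:sciam2001", "dc:creator", "ex:berners-lee"),
    ("ex:daml2000", "dc:title", "Tetherless DAML and the Mobile Semantic Web"),
    ("ex:daml2000", "dc:creator", "ex:hendler"),
]

results = query(graph, [
    ("?paper", "dc:title", "The Semantic Web"),
    ("?paper", "dc:creator", "?who"),
])
authors = sorted(r["?who"] for r in results)
print(authors)
```

The shared variable ?paper is what does the "linking": the second pattern only matches triples about the paper found by the first.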

Ora -- "...most of what we predicted is already happening, and it's happening faster than we anticipated ... the ontologies are there but there is little linking so far... and little progress on agents..."

The closing keynote by Mills Davis was distinctly less academic. Subtitled "The Executive Guide to Billion Dollar Markets," it was a summary of a market research paper prepared by Project 10X: Semantic Wave 2006, Part 1. The presentation addressed the business value of the new semantic wave technologies and provided perspectives to help business executives seeking to use "distributed intelligence" for strategic advantage in global markets. [This conference site link provides a digest of the 55-page report and also provides a link for the full file.]

The WSDL connection

There was also much discussion about the connection between Web Services and Semantics. During the week of SemTech [the shorter name that the organizers frequently used for the conference], the World Wide Web Consortium (W3C) reinforced its commitment to Web Services. Alongside continued work on SOAP, Web Services Description Language (WSDL), and addressing specs, it chartered a new group to enhance WSDL with semantics in order to automate the composition and discovery of Web Services.

WSDL is used to describe the interface of a Web service with inputs and outputs. But WSDL alone does not provide enough information to determine automatically whether the interfaces of two arbitrary Web services are compatible. Semantic descriptions can make it easier to automatically identify services that have compatible inputs and outputs. With a description of a service for finding books and a description of a service for purchasing them, it becomes much easier to combine these into a useful composite. Similarly, machine-readable semantic descriptions facilitate the discovery of the correct Web services.
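The book example can be sketched in a few lines of Python. This is only an illustration of the idea, assuming each service's inputs and outputs have been annotated with concepts from a shared ontology; the Service class, the can_chain() and producers_of() helpers, and the ex: concept names are all hypothetical.

```python
# Toy model of semantic service matching: services are compatible when
# the concepts one produces cover the concepts the next one consumes.
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    inputs: frozenset   # ontology concepts the service consumes
    outputs: frozenset  # ontology concepts the service produces


def can_chain(first, second):
    """True if everything `second` needs is produced by `first`."""
    return second.inputs <= first.outputs


def producers_of(concept, registry):
    """Discovery: names of the services whose outputs include a concept."""
    return sorted(s.name for s in registry if concept in s.outputs)


# Hypothetical annotations for the two services in the text.
find_books = Service("FindBooks",
                     frozenset({"ex:Keyword"}),
                     frozenset({"ex:ISBN", "ex:Price"}))
buy_book = Service("BuyBook",
                   frozenset({"ex:ISBN"}),
                   frozenset({"ex:OrderReceipt"}))
registry = [find_books, buy_book]

print(can_chain(find_books, buy_book))
print(can_chain(buy_book, find_books))
print(producers_of("ex:ISBN", registry))
```

The two services might call the same field "isbn" and "bookId" in their message schemas; the comparison works because both are tied to the same concept, not because the names happen to agree.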

Semantic WSDL [or WSDL-S] was discussed in the context of "Large Scale Distributed Information Systems" by Amit Sheth of the University of Georgia. This talk focused on semantics for Web Services and SOAs, and offered analysis of the underlying architectures. He noted that today "... a lot of land work is required to link processes" across different business domains. This is because different businesses and public institutions use different vocabularies. There needs to be agreement on processes and on data and data formats. Said Sheth, "...semantic description is the way to integrate disparate services and data." He offered a semantic-aware architecture that used web taxonomies to link different domains.

Semantics Enrich Web Service Descriptions

WSDL 2.0 includes the hooks to allow the specification of semantic descriptions. The W3C has chartered the Semantic Annotations for Web Services Description Language (SAWSDL) Working Group to develop annotations that take advantage of those hooks. This group will also provide a mapping of the new properties into a form compatible with the existing RDF mapping for WSDL.
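As a rough idea of what such an annotation might look like, the sketch below reads a schema fragment in which one element carries a pointer to an ontology concept. Note the assumptions: the modelReference attribute and its namespace are taken from the SAWSDL specification as the working group eventually published it, not from anything presented at the conference, and the schema fragment and ontology URI are invented for illustration.

```python
# Sketch: collect semantic annotations from an annotated schema fragment.
# Assumption: the attribute name and namespace follow the published
# SAWSDL specification; the schema and ontology URI are made up.
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="isbn" type="xs:string"
              sawsdl:modelReference="http://example.org/ontology#ISBN"/>
  <xs:element name="quantity" type="xs:int"/>
</xs:schema>
"""


def model_references(xml_text):
    """Map each annotated element name to its list of concept URIs."""
    root = ET.fromstring(xml_text)
    attr = "{%s}modelReference" % SAWSDL_NS
    refs = {}
    for element in root.iter("{%s}element" % XSD_NS):
        if attr in element.attrib:
            # The attribute value is a whitespace-separated list of URIs.
            refs[element.get("name")] = element.get(attr).split()
    return refs


annotations = model_references(SCHEMA)
print(annotations)
```

The annotation leaves the ordinary schema untouched, so tools that know nothing about semantics can ignore it, while semantic tools can follow the URI to the concept's definition.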

Currently, more than 400 researchers and developers participate in the continuing Semantic Web Services Interest Group (SWSIG), a public forum created in 2003 to discuss the integration of the Semantic Web and Web Services. That should go some way toward putting working semantics under the covers of Web Services.

This is the working definition of RDF from the W3C's "XML in 10 Points" document:

   W3C's Resource Description Framework (RDF) is an XML text format that supports
   resource description and metadata applications, such as music playlists, photo
   collections, and bibliographies. For example, RDF might let you identify
   people in a Web photo album using information from a personal contact list; then
   your mail client could automatically start a message to those people stating that
   their photos are on the Web. Just as HTML integrated documents, images, menu
   systems, and forms applications to launch the original Web, RDF provides
   tools to integrate even more, to make the Web a little bit more into a
   Semantic Web.

The full text is available here.

And here is the web page for the W3C on the Semantic Web: http://www.w3.org/2001/sw/.

Finally, here is the Web Service Semantics - WSDL-S link: http://www.w3.org/Submission/2005/10/

Conference followups

Sem-Tech conference blog: http://semantic-conference.blogs.com/semtech06/

The business of Semantic Web Technology: http://www.dmreview.com/article_sub.cfm?articleID=1019819

Most of the session presentations and tutorials are available for download here. If you are new to the subject matter, allow yourself some time to get through it all [well, lots of time; it's a complex subject].

The 5th International Semantic Web Conference will be hosted in Athens, GA, USA [dates: November 5 (Sunday) - 9 (Thursday), 2006], and the 1st Annual Asian Semantic Web Conference (ASWC) will be held in Beijing in early September of 2006. SemTech 2007 will take place May 20-24 next year [but probably in another US city].

Howard Dyckoff is a long-term IT professional with primary experience at Fortune 100 and 200 firms. Before his IT career, he worked for Aviation Week and Space Technology magazine and before that used to edit SkyCom, a newsletter for astronomers and rocketeers. He hails from the Republic of Brooklyn [and Polytechnic Institute] and now, after several trips to Himalayan mountain tops, resides in the SF Bay Area with a large book collection and several pet rocks.

Copyright © 2006, Howard Dyckoff. Released under the Open Publication license unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 129 of Linux Gazette, August 2006
