W3C Suomen toimisto

RDF and SPARQL

Ossi Nykänen (ossi@w3.org)
W3C Finnish office (W3C Suomen toimisto)
Tampere University of Technology (TUT)

This presentation briefly describes the W3C RDF and SPARQL technologies, which define the data model and a query language for the W3C Semantic Web, respectively. A good way to learn more about RDF is to read the RDF Primer.

Abstract

At the heart of the W3C Semantic Web technologies lies the abstract Resource Description Framework (RDF) data model (figure: the Semantic Web layer cake diagram, W3C). In short, the RDF model defines a network of statements, each of which relates a subject to an object through a named predicate (see the example in Fig. 6 of the RDF Primer). Through the careful use of URI names, this system makes it possible to associate knowledge of different origins and to write machine-processable semantic descriptions (e.g. in RDF/XML). The basic idea in applications is to add RDF adapters (such as GRDDL; see the Health Care use case) to legacy systems, effectively mapping existing data onto the Semantic Web. As a simple example, consider the RSS 1.0 format for aggregating news items.
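To make the idea of a network of statements concrete, the following minimal Python sketch represents each statement as a (subject, predicate, object) tuple and a graph as a set of such tuples. The Dublin Core and FOAF property URIs are real; the `example.org` resources, the two "sources" and the helper functions are assumptions made up for illustration. The sketch ignores blank nodes and typed literals, so it is not a full RDF implementation.

```python
# Each RDF statement is a (subject, predicate, object) triple; a graph is a set of them.
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical resources, used only for illustration.
DOC = "http://example.org/docs/rdf-intro"
ALICE = "http://example.org/people/alice"

# Statements published by one source (say, a document catalogue).
catalogue = {
    (DOC, DC + "title", "An Introduction to RDF"),
    (DOC, DC + "creator", ALICE),
}

# Statements published by another source (say, a staff directory).
directory = {
    (ALICE, FOAF + "name", "Alice Example"),
}

# Without blank nodes, merging graphs is a set union; the shared URI
# for Alice is what links the two descriptions together.
merged = catalogue | directory


def objects(graph, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def creator_names(graph, document):
    """Follow dc:creator and then foaf:name to find the creators' names."""
    names = set()
    for creator in objects(graph, document, DC + "creator"):
        names |= objects(graph, creator, FOAF + "name")
    return names


print(creator_names(merged, DOC))
```

Neither source alone can answer the question "what is the name of the document's creator"; the answer only becomes available once both graphs use the same URI for the same person.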

As the abstract graph-like structure of statements alone is not enough for applications, standard vocabularies are needed. The core RDF vocabulary (see, e.g., the names in the RDF/XML Syntax) includes terms for expressing the types of objects and simple structures such as bags and lists. When RDF is enhanced with the RDF Schema vocabulary, it becomes possible to describe systems with user-defined classes and properties, which typically appear in, e.g., thesaurus applications (see, e.g., SKOS). By agreeing upon more expressive predicates (and the associated standard concepts), more complex systems, such as domain ontologies (via OWL; see also the ontology examples in the OWL Guide) and rules (via RIF), can also be captured.
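As a rough illustration of how the core vocabulary is used, the sketch below declares a user-defined class with rdf:type and rdfs:Class and encodes an ordered list with the rdf:first, rdf:rest and rdf:nil terms. The RDF and RDF Schema namespace URIs are the standard ones; the `example.org` vocabulary, the `_:b1`-style identifiers standing in for blank nodes, and the helper functions are assumptions for illustration only.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/vocab#"  # hypothetical application vocabulary

_ids = count(1)


def new_blank_node():
    """Return a fresh local identifier standing in for an RDF blank node."""
    return f"_:b{next(_ids)}"


def make_list(items):
    """Encode items as an RDF list; return (head node, set of triples)."""
    triples = set()
    head = RDF + "nil"  # the empty list
    # Build from the tail so that every cell can point to the rest of the list.
    for item in reversed(items):
        cell = new_blank_node()
        triples.add((cell, RDF + "first", item))
        triples.add((cell, RDF + "rest", head))
        head = cell
    return head, triples


def read_list(graph, head):
    """Walk the rdf:first / rdf:rest links back into a Python list."""
    items = []
    while head != RDF + "nil":
        items.append(next(o for (s, p, o) in graph
                          if s == head and p == RDF + "first"))
        head = next(o for (s, p, o) in graph
                    if s == head and p == RDF + "rest")
    return items


topics, list_triples = make_list(["RDF", "RDF Schema", "OWL"])
graph = {
    (EX + "Course", RDF + "type", RDFS + "Class"),   # a user-defined class
    (EX + "semweb101", RDF + "type", EX + "Course"),  # an instance of it
    (EX + "semweb101", EX + "topics", topics),        # an ordered list value
} | list_triples

print(read_list(graph, topics))
```

The point of the example is that types and lists are not special constructs: they are ordinary statements whose predicates happen to come from a vocabulary that applications have agreed upon.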

Since the Semantic Web may be perceived as a large database, it is natural to retrieve data with queries. The SPARQL Protocol and RDF Query Language (SPARQL) defines an SQL-like query language for querying RDF data (see an example of a restricted query). A typical SPARQL query is a text message sent to a SPARQL processor, which answers with an XML document (a table of results) or an RDF graph.
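The sketch below suggests what such a query looks like and how its core, matching triple patterns against the graph, could work in principle. The query text uses the real RSS 1.0 and RDF namespaces, but the news items are invented, and the tiny `select` function is only a stand-in for a real SPARQL processor: it does not parse the query string and supports nothing beyond a list of triple patterns with `?variables`.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# The query as it could be sent to a SPARQL processor
# (shown for reference only; this sketch does not parse it).
QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rss: <http://purl.org/rss/1.0/>
SELECT ?title ?link
WHERE {
  ?item rdf:type  rss:item .
  ?item rss:title ?title .
  ?item rss:link  ?link .
}
"""


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(graph, patterns):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Hypothetical news items, invented for illustration.
ITEM1 = "http://example.org/news/1"
ITEM2 = "http://example.org/news/2"
graph = {
    (ITEM1, RDF + "type", RSS + "item"),
    (ITEM1, RSS + "title", "First news item"),
    (ITEM1, RSS + "link", "http://example.org/news/1.html"),
    (ITEM2, RDF + "type", RSS + "item"),
    (ITEM2, RSS + "title", "Second news item"),  # no rss:link, so no match
}

# The WHERE clause of QUERY, written out by hand as triple patterns.
patterns = [
    ("?item", RDF + "type", RSS + "item"),
    ("?item", RSS + "title", "?title"),
    ("?item", RSS + "link", "?link"),
]

rows = sorted((b["?title"], b["?link"]) for b in select(graph, patterns))
print(rows)
```

Only the first item is returned, because every pattern in the WHERE clause must match; the second item has no link statement in the graph.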

By default, a standard SPARQL query only analyses the structure of the RDF graph and thus retrieves only explicitly written data. Under certain assumptions, a SPARQL processor may be equipped with a reasoning service that also enables retrieving virtual statements, i.e. statements that can be logically deduced from the RDF graph(s). Note that the current SPARQL specification only defines the process of querying data; asserting (new) statements is done via a processor-specific API (e.g. with the Jena API).
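The difference between explicit and virtual statements can be sketched with a small forward-chaining loop. The example below applies just two RDFS entailment rules (type propagation along rdfs:subClassOf, and transitivity of rdfs:subClassOf); it is an assumption-laden illustration rather than a description of how any particular SPARQL processor or reasoner works, and the `example.org` zoo vocabulary is invented.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/zoo#"  # hypothetical vocabulary


def rdfs_closure(graph):
    """Add the statements entailed by two RDFS rules until nothing new appears."""
    closure = set(graph)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # Only pairs of the form (s p o) and (o rdfs:subClassOf o2) matter.
                if p2 != SUBCLASS or s2 != o:
                    continue
                if p == RDF_TYPE:
                    new.add((s, RDF_TYPE, o2))   # instance of a subclass
                elif p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))   # subclass of a subclass
        if new <= closure:
            return closure
        closure |= new


def instances_of(graph, cls):
    """Return all resources stated to have the given type."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}


explicit = {
    (EX + "tom", RDF_TYPE, EX + "Cat"),
    (EX + "Cat", SUBCLASS, EX + "Mammal"),
    (EX + "Mammal", SUBCLASS, EX + "Animal"),
}

# Querying the explicit graph finds no animals at all ...
print(instances_of(explicit, EX + "Animal"))
# ... whereas querying the deduced closure finds tom.
print(instances_of(rdfs_closure(explicit), EX + "Animal"))
```

Nothing in the explicit graph says that tom is an animal; that statement exists only once the subclass relationships have been taken into account.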