==========================
 Querying RDF with SPARQL
==========================

Introduction
------------


General material on SPARQL can be found on the MASWS Wiki `SPARQL/ARQ page <http://sites.google.com/site/masws09/sparql-arq>`_.


Implementations
---------------

ARQ
    There are a number of SPARQL implementations around. So far, the best
    one I have found is ARQ, which is based on the `Jena toolkit
    <http://incubator.apache.org/jena>`_. It is available as the command
    :command:`arq` among the command-line tools that come with the Jena
    download.

Installing ARQ
    See these `notes on Jena <jena.html>`_ for more information on installing and using Jena.

ARQ Documentation
    Documentation can be found in the `ARQ Tutorial
    <http://jena.sourceforge.net/ARQ/documentation.html>`_,
    which is pretty good, and links to a few other resources. 

ARQ 2.8.8
---------

ARQ 2.8.8 should also be available on DICE in the directory :file:`/usr/share/java/arq/bin/`. If you want to use this version,
you will have to set a different environment variable, namely :envvar:`ARQROOT`::

    export ARQROOT=/usr/share/java/arq
    export PATH="${ARQROOT}/bin:${PATH}"

However, everything in these notes also works with the earlier version of ARQ available in :file:`/usr/share/java/jena/bin/`.
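
To confirm which copy of :command:`arq` the shell will actually pick up, a quick check along the following lines may help. This is only a sketch: it assumes a POSIX-compatible shell such as bash, and the variable name ``arq_path`` is purely illustrative. ::

    # Resolve arq through PATH; leave arq_path empty if it is not found.
    arq_path=$(command -v arq || true)

    if [ -n "$arq_path" ]; then
        echo "arq resolves to: $arq_path"
    else
        echo "arq is not on PATH; check ARQROOT (currently '${ARQROOT:-unset}')"
    fi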

Querying
--------

Here's an example of a simple SPARQL query (downloadable as :download:`example-01.rq
<../sparql/example-01.rq>`):

.. include:: ../sparql/example-01.rq
   :literal:

Most of this should be familiar to you, but the ``FROM`` clause may be new. This
says that the query should be run against the RDF data to be found at
``http://homepages.inf.ed.ac.uk/ewan/foaf.n3``. Note that the URI has to
be addressable via HTTP when the query is executed.

To run the ARQ query engine on the SPARQL query in :file:`example-01.rq`, we use a command like the following on the
command line::

    dice:~> arq --query example-01.rq 

Given my FOAF file, :command:`arq` will print the following output to the terminal::

    ---------------------------------
    | name1        | name2          |
    =================================
    | "Ewan Klein" | "Harry Halpin" |
    ---------------------------------

A query may contain more than one ``FROM`` clause, in which case
the resulting graphs are merged. This is shown in the next example (downloadable as :download:`example-02.rq
<../sparql/example-02.rq>`),
where we query both my FOAF file and Harry Halpin's.

.. include:: ../sparql/example-02.rq
   :literal:

As you can observe here, SPARQL triple patterns allow the same abbreviatory syntax
as we have already seen for N3 / Turtle.
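
To make the abbreviation concrete, the following sketch writes two queries that should be equivalent. The first uses ``;`` to reuse the subject ``?p`` for a second predicate, as in Turtle; the second spells out every triple pattern in full. The file names :file:`abbrev.rq` and :file:`expanded.rq` are hypothetical, and the query text is an illustration based on the standard FOAF properties ``foaf:name`` and ``foaf:knows``, not a copy of :file:`example-02.rq`. A POSIX-compatible shell is assumed. ::

    # Abbreviated form: ';' repeats the subject ?p for the next predicate.
    cat > abbrev.rq <<'EOF'
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name1 ?name2
    FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.n3>
    WHERE {
      ?p foaf:name ?name1 ;
         foaf:knows ?q .
      ?q foaf:name ?name2 .
    }
    EOF

    # Expanded form: each triple pattern is written out in full.
    cat > expanded.rq <<'EOF'
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name1 ?name2
    FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.n3>
    WHERE {
      ?p foaf:name ?name1 .
      ?p foaf:knows ?q .
      ?q foaf:name ?name2 .
    }
    EOF

    # Run both only if arq is installed (the FROM URI must be reachable).
    if command -v arq >/dev/null 2>&1; then
        arq --query abbrev.rq
        arq --query expanded.rq
    fi

If both queries run, comparing their output is a simple way to check that the two spellings match the same data.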

Running the query gives the following result set::

     --------------------------------------
     | name1          | name2             |
     ======================================
     | "Harry Halpin" | "Daniel Weitzner" |
     | "Harry Halpin" | "Tim Berners-Lee" |
     | "Harry Halpin" | "Dan Connolly"    |
     | "Harry Halpin" | "Ian Davis"       |
     | "Harry Halpin" | "Paolo Bouquet"   |
     | "Ewan Klein"   | "Harry Halpin"    |
     --------------------------------------

 
The next query is run against a couple of data files,
namely :download:`knows.n3 <../rdf/knows.n3>`
and :download:`cafes.n3 <../rdf/cafes.n3>`.
In this case, I want to retrieve cafes that are loved by people who either know someone or
are known by someone (recall that we are not treating ``foaf:knows`` as symmetric).
In order to match
these two alternatives, I use the ``UNION`` keyword, as shown here (downloadable as :download:`example-03.rq
<../sparql/example-03.rq>`):

.. include:: ../sparql/example-03.rq
   :literal:

Notice that I don't have a ``FROM`` clause in the SPARQL query. In order to get the
data, I can specify one or more local files on the command line using the ``--data``
option::

    arq --query=example-03.rq --data=../rdf/knows.n3 --data=../rdf/cafes.n3 

Here are the results::

     --------------------
     | cafe      | x    |
     ====================
     | :vittoria | :stu |
     | :pyard    | :amy |
     | :aroast   | :stu |
     | :ebagel   | :amy |
     | :ebagel   | :bea |
     | :ebagel   | :bea |
     --------------------

As you can see, there's a duplicate line at the bottom of the results. To eliminate
it, we can use the ``DISTINCT`` keyword, as in the following query:
 
.. include:: ../sparql/example-04.rq
   :literal:
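
To see where such a duplicate can come from, here is a small self-contained sketch that does not rely on the course data files. Everything in it is an assumption made for illustration: the file names :file:`tiny.n3` and :file:`distinct-demo.rq`, the ``http://example.org/`` namespace, and the ``:loves`` property. In this toy data ``:bea`` both knows someone and is known by someone, so she matches both branches of the ``UNION``; without ``DISTINCT`` her row would be expected to appear twice. ::

    # Toy data: :bea knows :amy, is known by :amy, and loves one cafe.
    cat > tiny.n3 <<'EOF'
    @prefix : <http://example.org/> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    :amy foaf:knows :bea .
    :bea foaf:knows :amy .
    :bea :loves :ebagel .
    EOF

    # DISTINCT removes repeated (?cafe, ?x) rows produced by the two branches.
    cat > distinct-demo.rq <<'EOF'
    PREFIX : <http://example.org/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT DISTINCT ?cafe ?x
    WHERE {
      ?x :loves ?cafe .
      { ?x foaf:knows ?y } UNION { ?y foaf:knows ?x }
    }
    EOF

    # Run the query against the local file only if arq is installed.
    if command -v arq >/dev/null 2>&1; then
        arq --query=distinct-demo.rq --data=tiny.n3
    fi

Removing the word ``DISTINCT`` from the query and running it again is a quick way to compare the two behaviours.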


Exercises
---------

#. Download the relevant queries and try running the examples given above.
#. Next, modify the queries in various ways and see what results you get.
#. Finally, run some queries against your own RDF data, both local and remote.