SPARQL
SPARQL, short for “SPARQL Protocol and RDF Query Language” and pronounced “sparkle,” is a query language that allows users to query triplestores.
SPARQL queries take the form of a string. They are directed at a SPARQL endpoint, a location on the internet that is capable of receiving and processing SPARQL queries.
It is useful to think of a SPARQL query as a set of sentences with blanks. The database will take this query and find every set of matching statements that correctly fills in those blanks. In other words, the query is looking for data that follows a pattern that you have described. What makes SPARQL powerful is the ability to create complex queries that reference many variables at a time.
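As a minimal sketch of the fill-in-the-blank idea, the query below can be read as the sentence "____ has the label ____". It assumes the data uses the standard rdfs:label property. The variable names ?thing and ?label are arbitrary choices made for illustration.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Each variable (?thing, ?label) is a blank for the database to fill in.
SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
}
LIMIT 10
```

Each row of the result is one way of filling in both blanks so that the completed statement exists in the data. LIMIT 10 simply keeps the result small while experimenting.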
SPARQL queries can be used to query named graphs, such as those created and maintained by LINCS.
To write SPARQL queries, you will need to know:
- How to construct queries
- What sorts of questions can be asked with a query
Check out the LINCS SPARQL Endpoint and run queries without leaving the LINCS site.
Construct a Query
A SPARQL query is like a recipe. There are four main ingredients:
- Prefix(es)
- Type of Query
- Query
- Modifier(s)
Prefixes
Prefixes are shorthand abbreviations for the full Internationalized Resource Identifiers (IRIs) that tell the SPARQL endpoint where to go to look for the data. Prefixes are placed at the top of your query so that you do not have to type out the full IRIs every time you want to refer to them.
In the following example, prefixes have been added for the identity namespace, the Resource Description Framework (RDF), the Resource Description Framework Schema (RDFS), and the Simple Knowledge Organization System (SKOS):
PREFIX identity: <http://id.lincsproject.ca/identity/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
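To illustrate how a declared prefix is then used, here is a short sketch that relies only on the SKOS prefix. It assumes the data contains skos:broader statements, which may not be true of every dataset.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept ?broader
WHERE {
  # skos:broader is shorthand for the full IRI
  # <http://www.w3.org/2004/02/skos/core#broader>
  ?concept skos:broader ?broader .
}
LIMIT 10
```

Writing the full IRI in angle brackets in place of skos:broader would produce the same results; the prefix only saves typing.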
Type of Query
Following your prefixes, you need to declare the type of SPARQL query. There are four types of queries: ASK, SELECT, DESCRIBE, and CONSTRUCT. Each type of SPARQL query includes the same essential components, but each serves a different purpose and will give you a different type of results.
ASK Query
ASK queries return a yes or no answer.
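As a hedged illustration, the ASK query below checks whether the triplestore holds any statement at all about the identity term "woman". The URI is the one cited on this page; whether a particular endpoint contains it is an assumption.

```sparql
# Answers true if at least one triple has this URI as its subject,
# and false otherwise.
ASK
WHERE {
  <http://id.lincsproject.ca/identity/woman> ?predicate ?object .
}
```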
SELECT Query
SELECT queries return a table listing everything that matches your query pattern.
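A small sketch of a SELECT query follows. It assumes only that the data assigns classes to things with rdf:type; the variable names are illustrative.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# List up to 20 distinct classes that things in the data belong to.
SELECT DISTINCT ?class
WHERE {
  ?thing rdf:type ?class .
}
LIMIT 20
```

The variable named after SELECT (?class) becomes the column heading of the result. ?thing is used in the pattern but is not returned.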
DESCRIBE Query
DESCRIBE queries return all known information about a particular entity.
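A minimal sketch, using the identity term URI cited on this page. Which triples come back is decided by the endpoint, so results can differ from one triplestore to another.

```sparql
# Ask the endpoint for whatever it knows about this one entity.
DESCRIBE <http://id.lincsproject.ca/identity/woman>
```

Because this form names the entity directly, it does not need a pattern to match.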
CONSTRUCT Query
CONSTRUCT queries return new triples, built from a template that is filled in with the data matching your query pattern.
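The following sketch shows one possible CONSTRUCT query. It assumes the data contains SKOS preferred labels; the choice to restate them as rdfs:label is purely illustrative.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# For every concept with a SKOS preferred label, build a new triple
# that states the same text as an rdfs:label.
CONSTRUCT {
  ?concept rdfs:label ?name .
}
WHERE {
  ?concept skos:prefLabel ?name .
}
```

The result is a set of triples rather than a table, and the query does not change the data stored in the triplestore.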
Coming soon! Example queries will be provided once the LINCS SPARQL Endpoint is live.
Query
Triples
After you have declared which type of query you are going to construct, you need to fill in the structure of the query. The query structure is composed of triples, each made up of a subject, a predicate, and an object. Each component of a triple is either a query variable or a Uniform Resource Identifier (URI); the object may also be a literal value, such as a string or a number.
A query variable is a placeholder for the value that you are searching for. Variables are indicated with a question mark followed by a word. The word you choose for a variable is arbitrary, but should be human-readable so that others can understand the query if you share it. It is important that you use a variable consistently within a query.
?name
?item
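To show why consistent naming matters, here is a sketch in which ?item appears in two triples, so both must describe the same thing. It assumes the data contains entities typed as skos:Concept that carry a skos:prefLabel.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?item ?name
WHERE {
  # ?item must be a SKOS concept ...
  ?item rdf:type skos:Concept .
  # ... and that same ?item must have a preferred label.
  ?item skos:prefLabel ?name .
}
LIMIT 10
```

If the second line used a differently spelled variable, such as ?itme, the endpoint would treat it as a separate blank and the two triples would no longer be linked.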
A URI is a unique identifier that represents a thing that exists in the LINCS triplestore. It can be a property, entity, graph, class in the ontology (or ontologies), or even a vocabulary term (type). URIs are typically shortened using a prefix, or namespace. For example, the full URI for the identity property “woman,” <http://id.lincsproject.ca/identity/woman>, can be shortened to identity:woman using the identity namespace.
WHERE Statement
Almost every query has a WHERE statement (the exception is a DESCRIBE query that names a URI directly). The WHERE statement follows the declaration of the query type and, in a SELECT query, the list of variables that will be used as headings in the table of results. It comes before the query pattern and indicates that what follows is WHERE to look for the pattern that the query must match.
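The placement described above can be sketched with an annotated query. The pattern assumes the data uses rdfs:label, and the ORDER BY and LIMIT modifiers are included only to show where modifiers go.

```sparql
# Prefix
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Query type, followed by the variables that become result headings
SELECT ?item ?name
# WHERE introduces the pattern, which is written between curly braces
WHERE {
  ?item rdfs:label ?name .
}
# Modifiers
ORDER BY ?name
LIMIT 10
```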