This document discusses querying cultural heritage data stored as graphs using SPARQL. It provides examples of retrieving single triples and sets of triples from the graph and explains how a SPARQL server can perform additional reasoning. Exercises demonstrate querying for object owners and their names, exporting query results to CSV, and counting objects made of different materials.
1. Querying Cultural Heritage Data
Dr. Barry Norton,
Development Manager, ResearchSpace*
* Funded by the Andrew W. Mellon Foundation
* Hosted by the Curatorial Directorate, British Museum
2. Statements and Patterns
For one edge in a graph:
bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum
3. Statements and Patterns
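We can declare/retrieve one (N)Triple:
<http://collection.britishmuseum.org/id/object/EOC3130> <http://erlangen-crm.org/current/P52_has_current_owner> <http://collection.britishmuseum.org/id/the-british-museum> .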
4. Statements and Patterns
Or write this in Turtle:
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix bm-obj: <http://collection.britishmuseum.org/id/object/> .
@prefix bm-id: <http://collection.britishmuseum.org/id/> .
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
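The relationship between the Turtle form and the longer N-Triples form is purely mechanical: each prefixed name is expanded using the declared prefixes. A minimal sketch of that expansion follows; the helper names `expand` and `to_ntriple` are hypothetical and only illustrate the idea, they are not part of any RDF library.

```python
# Prefixes exactly as declared in the Turtle above.
PREFIXES = {
    "crm": "http://erlangen-crm.org/current/",
    "bm-obj": "http://collection.britishmuseum.org/id/object/",
    "bm-id": "http://collection.britishmuseum.org/id/",
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'crm:P52_has_current_owner' to '<full IRI>'."""
    prefix, local = prefixed_name.split(":", 1)
    return "<" + PREFIXES[prefix] + local + ">"


def to_ntriple(subject, predicate, obj):
    """Hypothetical helper: write one statement as a single N-Triples line."""
    return " ".join(expand(term) for term in (subject, predicate, obj)) + " ."


ntriple = to_ntriple(
    "bm-obj:EOC3130",
    "crm:P52_has_current_owner",
    "bm-id:the-british-museum",
)
print(ntriple)
```

Real Turtle parsers handle much more (literals, blank nodes, escapes); this sketch covers only IRIs written as prefixed names.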
5. Statements and Patterns
And check for it in SPARQL:
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
PREFIX bm-id: <http://collection.britishmuseum.org/id/>
ASK {bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum}
true
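Conceptually, an ASK query with no variables is a membership test: is this exact triple in the graph? The sketch below models the graph as a Python set of tuples to make that concrete. It is an in-memory illustration, not how a triplestore is implemented, and the function name `ask` is an assumption.

```python
# A one-triple graph holding the statement from the Turtle above.
GRAPH = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
}


def ask(graph, subject, predicate, obj):
    """Return True if the fully specified triple is present in the graph."""
    return (subject, predicate, obj) in graph


answer = ask(
    GRAPH,
    "bm-obj:EOC3130",
    "crm:P52_has_current_owner",
    "bm-id:the-british-museum",
)
print(answer)
```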
6. Statements and Patterns
For a set of edges:
[Diagram: bm-obj:EOC3130 has three crm:P51_has_former_or_current_owner edges, one to bm-id:the-british-museum and two to unknown nodes (?)]
We can do the work on the client:
Or have the server do it by turning the triple into a triple pattern:
bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
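To see what the server does with such a pattern, here is a small sketch of triple-pattern matching over an in-memory set of triples. The two `ex:` owners are hypothetical placeholders (the real former owners are the subject of the exercise), and `match` is an illustrative helper, not a library function. The SPARQL string is shown for comparison only and is not executed.

```python
# The query one would send to the server (not executed in this sketch).
QUERY = """
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
}
"""

P51 = "crm:P51_has_former_or_current_owner"
GRAPH = {
    ("bm-obj:EOC3130", P51, "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", P51, "ex:former-owner-1"),  # hypothetical placeholder
    ("bm-obj:EOC3130", P51, "ex:former-owner-2"),  # hypothetical placeholder
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
}


def match(graph, pattern):
    """Return one {variable: value} binding per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    solutions = []
    for triple in graph:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            solutions.append(binding)
    return solutions


owners = sorted(b["?owner"] for b in match(GRAPH, ("bm-obj:EOC3130", P51, "?owner")))
print(owners)
```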
7. Exercise
Questions:
Why is the answer different?
Who are the two (other) one-time owners?
8. Solutions & Exercises
Why is the answer different?
Reasoning, which is part of the work done by the server (being a triplestore), means that if two things are related by crm:P52_has_current_owner then they're also related by crm:P51_has_former_or_current_owner (the former is a sub-property of the latter).
This is part of the work that the server (triplestore) can do for you.
Exercise: query for the (strictly) former owners
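The inference described above can be sketched in a few lines: every asserted P52 triple yields an implied P51 triple, and the strictly former owners are then the P51 owners minus the P52 owners. This is an illustration under stated assumptions: the `ex:` owners are hypothetical placeholders, only one level of sub-property is handled, and `infer` is not a real reasoner. In SPARQL 1.1 the same subtraction could be written with FILTER NOT EXISTS or MINUS; whether that matches the slides' own solution is not known from the text.

```python
P51 = "crm:P51_has_former_or_current_owner"
P52 = "crm:P52_has_current_owner"

# Sub-property relation used by the reasoner: P52 implies P51.
SUBPROPERTY_OF = {P52: P51}

# Explicitly asserted triples; the ex: owners are hypothetical placeholders.
ASSERTED = {
    ("bm-obj:EOC3130", P52, "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", P51, "ex:former-owner-1"),
    ("bm-obj:EOC3130", P51, "ex:former-owner-2"),
}


def infer(graph):
    """Add the super-property triple for every sub-property triple (one level only)."""
    inferred = set(graph)
    for s, p, o in graph:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred


def owners(graph, predicate):
    """Objects linked to bm-obj:EOC3130 by the given predicate."""
    return {o for s, p, o in graph if s == "bm-obj:EOC3130" and p == predicate}


closure = infer(ASSERTED)
all_owners = owners(closure, P51)      # includes the inferred current owner
current_owners = owners(closure, P52)
former_owners = all_owners - current_owners
print(sorted(former_owners))
```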
9. Solution 1/2
Using specific server functionality:
11. Solutions & Exercises
Who are the two (other) one-time owners?
Since people and institutions (and places) are treated in the same way as concepts, the names of the former owners are attached using skos:prefLabel.
Exercise: if you didn't already, include the names in your query results
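Adding a second triple pattern that shares the ?owner variable turns the query into a join: only owners for which both patterns match are returned. The sketch below imitates that with a dictionary lookup. The labels and the choice of which owner lacks one are assumptions made for illustration; the SPARQL in the comment omits its PREFIX declarations and is not executed.

```python
# Equivalent graph pattern (prefix declarations omitted, not executed here):
#   SELECT ?owner ?name WHERE {
#     bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
#     ?owner skos:prefLabel ?name
#   }

# Owners matched by the first triple pattern (ex: entries are hypothetical).
OWNERS = ["bm-id:the-british-museum", "ex:former-owner-1", "ex:former-owner-2"]

# skos:prefLabel values; which owner has no label is an assumption.
PREF_LABEL = {
    "ex:former-owner-1": "Example Owner One",
    "ex:former-owner-2": "Example Owner Two",
}


def owners_with_names(owners, labels):
    """Both patterns are mandatory, so owners without a label drop out."""
    return [(owner, labels[owner]) for owner in owners if owner in labels]


named = owners_with_names(OWNERS, PREF_LABEL)
print(named)
```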
12. Solutions & Exercises
If you didn't already, include the names in your query results:
Question:
Why are we back at two answers?
13. Answer
Just as we can add triples together to make a graph in RDF, so we can add triple patterns together in SPARQL to make a graph pattern.
By default all triple patterns must be matched, but we can use the OPTIONAL {} pattern to allow variation.
Exercise: Query for the owners and their names, if they exist*
* N.B. this bug in the BM data will be fixed soon
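The effect of OPTIONAL {} can be pictured as a left join: every owner is kept, and the name is simply left unbound when no label exists. A minimal sketch, with hypothetical labels and an assumed choice of which owner is missing one; the SPARQL in the comment omits its PREFIX declarations and is not executed.

```python
# Equivalent graph pattern (prefix declarations omitted, not executed here):
#   SELECT ?owner ?name WHERE {
#     bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
#     OPTIONAL { ?owner skos:prefLabel ?name }
#   }

OWNERS = ["bm-id:the-british-museum", "ex:former-owner-1", "ex:former-owner-2"]

# Hypothetical labels; the owner without a label is an assumption.
PREF_LABEL = {
    "ex:former-owner-1": "Example Owner One",
    "ex:former-owner-2": "Example Owner Two",
}


def owners_with_optional_names(owners, labels):
    """Keep every owner; the name is None when the optional pattern fails."""
    return [(owner, labels.get(owner)) for owner in owners]


rows = owners_with_optional_names(OWNERS, PREF_LABEL)
for owner, name in rows:
    print(owner, name if name is not None else "(unbound)")
```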
15. Exercise
Take a look here:
Exercise: copy and run this query
16. CSV Exercise
Type:
Observe that one can now paste the query including line breaks*
Type:
* N.B. for now you should first replace each double quote (") with a single quote (') and change the one occurrence of ecrm: to crm: (we'll fix this)
* N.B. currently the query needs to be simplified as the BBC data is not loaded; this will be available soon
17. Data Analysis
One can import this CSV file into many tools:
A spreadsheet can be a good way to carry out basic visualisations.
A scripting environment like (i)python/scipy or R can allow more analysis before visualisation, but:
  both languages also have libraries to encapsulate interaction via SPARQL (rdflib/SPARQLWrapper and SPARQL/RCurl respectively);
  one should decide whether more analysis should first be carried out using SPARQL.
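As a starting point for the scripting route, the standard library alone is enough to read a SPARQL CSV result, whose first row holds the variable names. The sketch below parses an inline sample rather than a real export; the IRIs and names in `CSV_TEXT` are hypothetical, and in practice one would open the downloaded file instead.

```python
import csv
import io

# Hypothetical sample of a SPARQL CSV result: a header row of variable
# names followed by one row per solution.
CSV_TEXT = (
    "owner,name\n"
    "http://example.org/former-owner-1,Example Owner One\n"
    "http://example.org/former-owner-2,Example Owner Two\n"
)

# DictReader maps each row to {variable name: value}.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
names = [row["name"] for row in rows]
print(names)
```

With a real export, `io.StringIO(CSV_TEXT)` would be replaced by `open(path, newline="", encoding="utf-8")`, where `path` is wherever the file was saved.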
18. Exercise
If you haven't so far, click on one of the (HotW) 100 Objects (such as number 70, Hoa Hakananai'a Easter Island Statue), having run the main query.
Choose a material and observe the query for other objects in this material.
Adapt this query to count how many BM objects are made from basalt.
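In SPARQL 1.1 this kind of question is answered with the COUNT aggregate. The logic is simply "filter to the material, then count distinct objects", which the sketch below mirrors on a handful of hypothetical (object, material) pairs. It does not reproduce the BM query or its property names, which are not given in the text.

```python
# Hypothetical (object, material) pairs standing in for query results.
OBJECT_MATERIALS = [
    ("ex:object-1", "basalt"),
    ("ex:object-2", "basalt"),
    ("ex:object-2", "wood"),
    ("ex:object-3", "gold"),
    ("ex:object-1", "basalt"),  # duplicate row, counted once below
]

# Analogous to COUNT(DISTINCT ?object) restricted to one material.
basalt_objects = {obj for obj, material in OBJECT_MATERIALS if material == "basalt"}
basalt_count = len(basalt_objects)
print(basalt_count)
```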
19. Solution & Exercise
Exercise: Now count the top ten materials and the number of objects for each
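A "top ten" query combines grouping, counting, descending order and a limit (GROUP BY, COUNT, ORDER BY DESC and LIMIT in SPARQL 1.1). The same steps can be sketched client-side with `collections.Counter`. The data below is hypothetical and the counts are deliberately distinct so the ordering is unambiguous.

```python
from collections import Counter

# Hypothetical (object, material) pairs standing in for query results.
OBJECT_MATERIALS = [
    ("ex:object-1", "basalt"),
    ("ex:object-2", "basalt"),
    ("ex:object-3", "basalt"),
    ("ex:object-4", "wood"),
    ("ex:object-5", "wood"),
    ("ex:object-6", "gold"),
]

# Group by material and count distinct objects per group.
counts = Counter(material for _, material in set(OBJECT_MATERIALS))

# Order by count, descending, and keep at most ten groups.
top_ten = counts.most_common(10)
print(top_ten)
```

Doing this in SPARQL instead keeps the heavy lifting on the server, which matters once the result set is too large to download comfortably.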
21. A Last Word
SPARQLing a native RDF database (often called a triplestore) is not the only option before defaulting to programming.
A native graph database indexes the graph in a different way, supporting traversal-oriented queries.
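To give a feel for what "traversal-oriented" means, the sketch below indexes triples by subject so that each node's outgoing edges are directly at hand, then walks the graph breadth-first from one object. This is only an illustration of the idea, not a description of any particular graph database. The `ex:` triples and the `ex:located_in` property are hypothetical.

```python
from collections import defaultdict, deque

TRIPLES = [
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
    ("bm-id:the-british-museum", "ex:located_in", "ex:london"),  # hypothetical
    ("ex:london", "ex:located_in", "ex:england"),                # hypothetical
]


def build_adjacency(triples):
    """Index outgoing edges by subject so neighbours can be looked up directly."""
    adjacency = defaultdict(list)
    for s, p, o in triples:
        adjacency[s].append((p, o))
    return adjacency


def reachable(adjacency, start):
    """Breadth-first traversal: nodes reachable from start, in visiting order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


adjacency = build_adjacency(TRIPLES)
visited = reachable(adjacency, "bm-obj:EOC3130")
print(visited)
```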