Introduction to the main concepts of Knowledge Engineering from the Semantic Web perspective, given at the APSEM 2018 days
Knowledge Engineering: Semantic web, web of data, linked data
1. Franck MICHEL - Université Côte d'Azur, CNRS, Inria, I3S, France
Knowledge Engineering:
Semantic web, web of data, linked data
ANF APSEM2018: Apprentissage et sémantique
Toulouse, 12-15 Nov. 2018
2. More data sources, more opportunities
3. To you, your data may mean this
4. To others, your data may mean that
5. Interoperability Challenges
Structural heterogeneity: uniform representation format
Semantic heterogeneity: controlled vocabularies, thesauri, ontologies
Common way to query the data
6. Agenda
The Semantic Web
Linked Data and the Web of Data
Publishing legacy data in RDF
7. The Semantic Web
8. The Semantic Web provides an environment where applications can publish and link data, define vocabularies, query data at web scale, and draw inferences. (adapted from W3C website)
[Figure labels: Publish, Link, Vocabularies, Querying, Inference]
9. Standards of the Semantic Web
Applications and Services
Trust
Identifiers: URI, IRI
Data representation: RDF abstract model + syntaxes
Vocabularies: RDFS, OWL, SKOS
Querying: SPARQL
Rules: SPIN, SWRL, SHACL
Unifying logic: First Order Logic
Proof
Security (crypto)
10. Standards of the Semantic Web (same stack as slide 9), with the label: Web of Data
11. Standards of the Semantic Web (same stack as slide 9), with the label: Reasoning
12. The Resource Description Framework
RDF is a conceptual model based on triples, i.e. any fact consists of 3 components:
(subject, predicate, object)
Source: C. Faron Zucker, O. Corby. Introduction au web de données et au web sémantique. Séminaire INRA Open Data, Dec. 2014.
13. The Resource Description Framework
websem.html is a text
websem.html has as author Fabien
websem.html has as author Olivier
websem.html has as author Catherine
websem.html has as subject Semantic Web
websem.html was written in 2011
14. The Resource Description Framework
[Figure: the same facts drawn as a graph. The node websem.html is linked by type to Text, by author to Fabien, Olivier and Catherine, by subject to SemanticWeb, and by date to 2011.]
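As a rough illustration of the triple model, the facts about websem.html can be held as plain (subject, predicate, object) tuples. This is only a sketch: the short names stand in for full URIs, and the helper `objects` is a hypothetical function introduced here, not part of any RDF library.

```python
# Each fact from the slides as a (subject, predicate, object) tuple.
# Assumption: short strings stand in for the full URIs a real graph would use.
triples = [
    ("websem.html", "type", "Text"),
    ("websem.html", "author", "Fabien"),
    ("websem.html", "author", "Olivier"),
    ("websem.html", "author", "Catherine"),
    ("websem.html", "subject", "SemanticWeb"),
    ("websem.html", "date", "2011"),
]


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate` (hypothetical helper)."""
    return [o for s, p, o in graph if s == subject and p == predicate]


authors = objects(triples, "websem.html", "author")
print(authors)
```

The same predicate can appear several times for one subject, which is why `author` yields a list rather than a single value.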
17. RDF Schemas define classes of resources, their properties, and organize their hierarchies
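One consequence of a class hierarchy is that a resource typed with a class also belongs to every superclass. The sketch below shows that idea under simplifying assumptions: the class names are invented, each class has at most one direct superclass, and the dictionary stands in for rdfs:subClassOf statements.

```python
# Hypothetical hierarchy standing in for rdfs:subClassOf statements.
# Assumption: at most one direct superclass per class, to keep the sketch short.
SUBCLASS_OF = {
    "Researcher": "Person",
    "Person": "Agent",
}


def superclasses(cls):
    """Walk up the hierarchy and collect every ancestor class, nearest first."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def inferred_types(asserted_type):
    """A resource typed with a class is also an instance of all its superclasses."""
    return [asserted_type] + superclasses(asserted_type)


print(inferred_types("Researcher"))
```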
21. OWL: The Web Ontology Language
22. OWL in one slide
Classes: def. by enumeration, intersection, union, complement, restriction, value restriction (e.g. >= 18), cardinality (e.g. 1..1), equivalence; disjoint classes
Properties: (a)symmetric, (ir)reflexive, transitive, inverse, chained; disjoint properties; cardinality (e.g. 1..1); negative property assertions on individuals
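To give a feel for what a reasoner does with some of these property characteristics, here is a minimal sketch that applies symmetric, transitive and inverse rules until nothing new is derived. It is not an OWL reasoner: the function `saturate` and all property names are hypothetical, and only these three rules are covered.

```python
def saturate(facts, symmetric=(), transitive=(), inverse_of=None):
    """Apply three property rules to a fixpoint (illustrative, not a full reasoner)."""
    inverse_of = inverse_of or {}
    closed = set(facts)
    while True:
        new = set()
        for s, p, o in closed:
            if p in symmetric:
                new.add((o, p, s))                 # p(s, o) implies p(o, s)
            if p in inverse_of:
                new.add((o, inverse_of[p], s))     # p(s, o) implies q(o, s)
            if p in transitive:
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))        # p(s, o), p(o, o2) imply p(s, o2)
        if new <= closed:                          # nothing new: fixpoint reached
            return closed
        closed |= new


# Hypothetical facts and property declarations.
facts = {
    ("a", "ancestorOf", "b"),
    ("b", "ancestorOf", "c"),
    ("x", "marriedTo", "y"),
    ("p", "hasChild", "q"),
}
inferred = saturate(
    facts,
    symmetric=("marriedTo",),
    transitive=("ancestorOf",),
    inverse_of={"hasChild": "hasParent"},
)
```

The inverse mapping is applied in one direction only; declaring `hasParent` as the inverse of `hasChild` as well would require a second dictionary entry.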
23. Closed vs. Open World Assumptions
Closed World
Everything there is to know about a thing is stated in a single, closed DB.
Facts that are not asserted are false, i.e. only asserted facts are true.
A schema may define what can be stated (a schema may be violated).
Open World
Knowledge is distributed.
Each RDF graph may state facts about a thing, irrespective of what others state.
A fact that is not asserted is not necessarily false.
Every asserted fact is true (no schema), but some facts may lead to inconsistencies.
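The difference between the two assumptions can be sketched as two ways of answering the same question over the same facts. The functions below are hypothetical and use `None` to stand for "unknown"; real systems express this differently.

```python
# A tiny fact base; names are placeholders for full URIs.
FACTS = {("websem.html", "author", "Fabien")}


def closed_world(fact):
    """Closed world: anything not asserted is taken to be false."""
    return fact in FACTS


def open_world(fact):
    """Open world: anything not asserted is unknown (None), not false."""
    return True if fact in FACTS else None


question = ("websem.html", "author", "Alice")
print(closed_world(question), open_world(question))
```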
24. Querying RDF with SPARQL
SPARQL Protocol and RDF Query Language
25. SPARQL 1.1 (W3C Recommendation, 21 Mar. 2013)
Query Language (using the Turtle syntax): CRUD operations
Query Results Formats: XML, JSON, CSV/TSV
Protocols: SPARQL Protocol, SPARQL Graph Store HTTP Protocol
Entailment Regimes
26. SPARQL: triple patterns
Turtle syntax with ? or $ to mark variables:
?x rdf:type ex:Person
Describe patterns of triples that we look for:
SELECT ?subject ?type
WHERE { ?subject rdf:type ?type }
Default pattern: conjunction of triple patterns:
SELECT ?x WHERE
{ ?x rdf:type ex:Person .
?x ex:name ?name . }
[Figure: the pattern drawn as a graph, with ?x linked to ex:Person by rdf:type and to ?name by ex:name.]
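The conjunction of triple patterns can be read as: find every assignment of the variables that makes all patterns match triples of the graph. The sketch below implements that naive reading in plain Python; `match_pattern` and `evaluate` are hypothetical helpers, the data is invented, and a real SPARQL engine works very differently internally.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was already bound to, if any.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns, graph):
    """Join the patterns one after the other, starting from the empty binding."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions


# Hypothetical graph: only ex:alice is both a person and named.
graph = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:rex", "ex:name", "Rex"),
]
query = [("?x", "rdf:type", "ex:Person"), ("?x", "ex:name", "?name")]
solutions = evaluate(query, graph)
```

Because ?x appears in both patterns, it acts as the join variable: ex:bob is dropped for lack of a name, and ex:rex for not being typed as a person.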
27. SPARQL: namespace prefixes
Declare prefixes of used vocabularies:
PREFIX mit: <http://www.mit.edu#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?student
WHERE {
?student mit:registeredAt ?x .
?x foaf:homepage <http://www.mit.edu> .
}
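Declare a base namespace for relative URIs:
BASE <http://example.org/people#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?student
WHERE { ?student foaf:knows <#Ted> . }
Here <#Ted> resolves to http://example.org/people#Ted.
[Figure: both queries drawn as graphs. In the first, ?student is linked to ?x by mit:registeredAt and ?x to http://www.mit.edu by foaf:homepage. In the second, ?student is linked to http://example.org/people#Ted by foaf:knows.]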
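Prefixed names are expanded by simple concatenation, whereas relative IRIs are resolved against the base following the RFC 3986 rules, which treat a fragment reference such as #Ted differently from a bare path segment such as Ted. The sketch below illustrates both with the Python standard library; `expand` is a hypothetical helper, and `urljoin` is used here as a stand-in for the resolution a SPARQL parser performs.

```python
from urllib.parse import urljoin

# Prefix declarations taken from the slide.
PREFIXES = {
    "mit": "http://www.mit.edu#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(prefixed_name):
    """Expand prefix:local into a full IRI by concatenation (hypothetical helper)."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


BASE = "http://example.org/people#"

# A fragment reference keeps the path of the base and replaces its fragment.
with_fragment = urljoin(BASE, "#Ted")
# A bare segment replaces the last path segment of the base instead.
bare_segment = urljoin(BASE, "Ted")

print(expand("foaf:knows"))
print(with_fragment)
print(bare_segment)
```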
28. SPARQL: language and typed literals
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?x ?f WHERE {
?x foaf:name "Steve"@en ; foaf:knows ?f .
}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?x WHERE {
?x foaf:name "Steve"@en ;
foaf:age "21"^^xsd:integer .
}
29. SPARQL: optional pattern
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
?person foaf:homepage <http://fabien.info> .
OPTIONAL { ?person foaf:name ?name . }
}
Variable ?name is potentially unbound.
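OPTIONAL behaves like a left join: every solution of the required part is kept, and it is extended only when the optional part matches. A minimal sketch of that behaviour over lists of bindings follows; `optional_join` and the sample data are hypothetical.

```python
def optional_join(solutions, optional_solutions, key):
    """Left join: keep every required solution, extend it when a match exists."""
    joined = []
    for required in solutions:
        matches = [opt for opt in optional_solutions if opt[key] == required[key]]
        if matches:
            joined.extend({**required, **opt} for opt in matches)
        else:
            joined.append(required)  # the optional variables stay unbound
    return joined


# Hypothetical solutions of the required and optional parts of the query.
people = [{"person": "ex:fabien"}, {"person": "ex:anonymous"}]
names = [{"person": "ex:fabien", "name": "Fabien"}]

rows = optional_join(people, names, "person")
for row in rows:
    print(row["person"], row.get("name"))
```

In the result, the missing key for ex:anonymous plays the role of the unbound ?name variable.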
30. SPARQL: alternative pattern
Merge the results of two graph patterns:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
?person foaf:name ?name .
{ ?person foaf:homepage <http://fabien.info> . }
UNION
{ ?person foaf:homepage <http://fabien.org> . }
}
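Merging the results of two graph patterns amounts to concatenating their solution lists, then joining with the rest of the query. The sketch below mirrors the query above with invented data; `with_homepage` and `union` are hypothetical helpers.

```python
# Hypothetical data: (person, homepage) pairs and a name lookup.
HOMEPAGES = [
    ("ex:fabien", "http://fabien.info"),
    ("ex:fab", "http://fabien.org"),
    ("ex:olivier", "http://example.org/olivier"),
]
NAMES = {"ex:fabien": "Fabien", "ex:fab": "Fab", "ex:olivier": "Olivier"}


def with_homepage(url):
    """Solutions of one branch: people whose homepage is `url`."""
    return [{"person": p} for p, h in HOMEPAGES if h == url]


def union(left, right):
    """UNION concatenates both solution lists; duplicates are kept."""
    return left + right


branches = union(with_homepage("http://fabien.info"), with_homepage("http://fabien.org"))
# Join with the ?person foaf:name ?name pattern.
results = [
    {**row, "name": NAMES[row["person"]]}
    for row in branches
    if row["person"] in NAMES
]
```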
31. SPARQL: filters
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://inria.fr/schema#>
SELECT ?person ?name
WHERE {
?person rdf:type ex:Person; ex:name ?name; ex:age ?age .
FILTER (xsd:integer(?age) >= 18)
}
Other examples:
FILTER(?name IN ("fabien", "olivier", "catherine"))
FILTER(langMatches(lang(?name),"en"))
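A filter keeps only the solutions for which its expression is true; in SPARQL, a solution for which the expression raises an error (for instance a failed cast) is dropped as well. The sketch below imitates the age filter and the IN filter on invented rows; `is_adult` is a hypothetical helper.

```python
# Hypothetical solutions; ages are strings, as untyped literals would be.
ROWS = [
    {"name": "fabien", "age": "42"},
    {"name": "olivier", "age": "17"},
    {"name": "catherine", "age": "unknown"},
]


def is_adult(row):
    """Mimic FILTER (xsd:integer(?age) >= 18): a failed cast rejects the row."""
    try:
        return int(row["age"]) >= 18
    except ValueError:
        return False


adults = [row["name"] for row in ROWS if is_adult(row)]
# Counterpart of FILTER(?name IN (...)) with a shorter list of values.
listed = [row["name"] for row in ROWS if row["name"] in ("fabien", "olivier")]
```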
32. SPARQL: additional features
Solution modifiers: ORDER BY, LIMIT, OFFSET, DISTINCT
Aggregates: GROUP BY, HAVING
Negation: NOT EXISTS, MINUS, NOT IN
WHERE { ?x a ex:Person MINUS { ?x foaf:knows ex:John } }
Nested queries
Named graphs
Property paths:
?x foaf:knows+ ?friend .
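The path foaf:knows+ matches one or more foaf:knows links, i.e. everyone reachable by following the relation. A breadth-first traversal gives the same set; the sketch below uses an invented adjacency dictionary and a hypothetical `reachable` function.

```python
from collections import deque


def reachable(edges, start):
    """Nodes reachable from `start` by following one or more edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical foaf:knows links: a knows b, b knows c.
knows = {"a": ["b"], "b": ["c"]}
friends_of_a = reachable(knows, "a")
print(sorted(friends_of_a))
```

The start node is only part of the result when a cycle leads back to it, which matches the one-or-more semantics of the + operator.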
37. Agenda
The Semantic Web
Linked Data and the Web of Data
Publishing legacy data in RDF
38. The Web of Data
a.k.a. Data Web, Web 3.0, Global Knowledge Graph
39. The Web of Data
(Same standards stack as slide 9.)
First step in the deployment of the Semantic Web
Detractors would say: the part of the Semantic Web that works
40. The Semantic Web provides an environment where applications can publish and link data, define vocabularies, query data at web scale, and draw inferences.
[Figure labels: Publish, Link, Vocabularies, Querying, Inference]
41. Linked Data principles
1. Use URIs to name things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using the standards (RDF, SPARQL)
4. Include links to other URIs, so they can discover more things
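Looking up a URI in practice usually means an HTTP GET, and one common way for a client to ask for RDF rather than HTML is an Accept header. The sketch below only builds such a request with the standard library and does not send it; the URI is a placeholder and `build_lookup` is a hypothetical helper. Whether a given server honours the header depends on that server.

```python
from urllib.request import Request


def build_lookup(uri, media_type="text/turtle"):
    """Prepare a GET request that asks for an RDF serialization (hypothetical helper)."""
    return Request(uri, headers={"Accept": media_type})


# Placeholder URI, used only for illustration.
request = build_lookup("http://example.org/resource/websem")

# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.get_method(), request.full_url, request.get_header("Accept"))
```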
42. Linked Open Data Cloud: 1200+ linked datasets
Linking Open Data cloud diagram, 2018. J.P. McCrae, A. Abele, P. Buitelaar, A. Jentzsch, V. Andryushechkin and R. Cyganiak. http://lod-cloud.net/
On the web, under open licenses
Machine-readable (RDF)
URIs to name things
Common vocabularies
Linked with each other
Queryable
Iconic but partial view of the Web of Data
LOD Atlas: 25,000+ datasets
43. Agenda
The Semantic Web
Linked Data and the Web of Data
Publishing legacy data in RDF
44-49. Publishing legacy data in RDF raises tricky questions
[Diagram built up step by step over six slides. Labels: Legacy dataset; Metadata describe Data; Catalogue, data portal.]
Questions raised:
What metadata? Where/how to publish them?
Ensure shared understanding?
Reference raw data (signals, binary)
Translate heterogeneous data into RDF?
52. Ensure shared understanding?
Need for common vocabularies with well-defined semantics
Controlled vocabulary, thesaurus, ontology
How to define/model a vocabulary?
Where to find existing vocabularies, how to reuse them?
54. Many methods for many types of data sources
XML: AstroGrid-D, SPARQL2XQuery, XSPARQL
CSV/TSV/Spreadsheets: XLWrap, Linked CSV, CSVW, RML
Relational Databases: D2RQ, R2O, Ultrawrap, Triplify, SM; R2RML: Morph-RDB, ontop, Virtuoso
Multiple formats: RML, TARQL, Apache Any23, DataLift, SPARQL-Generate
HTML: RDFa, Microformats
JSON: TARQL, JSON-LD, RML
NoSQL: xR2RML (MongoDB), ontop (MongoDB), [Mugnier et al, 2016] (key-value stores)
Web APIs: SPARQL Micro-services, Linked REST APIs
Reference: M.L. Mugnier, M.C. Rousset, and F. Ulliana. Ontology-Mediated Queries for NOSQL Databases. In Proc. AAAI. 2016.
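To make the idea of translating tabular data into RDF concrete, here is a hand-written sketch that turns CSV rows into N-Triples lines. It does not use or imitate any of the tools listed above: the base and vocabulary URIs, the column names and the function `rows_to_ntriples` are all assumptions made for the illustration, and every value is emitted as a plain string literal.

```python
import csv
import io

# Hypothetical input and namespaces, for illustration only.
CSV_TEXT = "id,name,age\n1,Alice,30\n2,Bob,25\n"
BASE = "http://example.org/person/"
VOCAB = "http://example.org/schema#"


def escape(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(text):
    """One subject per row (from the id column), one triple per other column."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row['id']}>"
        for column, value in row.items():
            if column == "id":
                continue
            lines.append(f'{subject} <{VOCAB}{column}> "{escape(value)}" .')
    return lines


ntriples = rows_to_ntriples(CSV_TEXT)
print("\n".join(ntriples))
```

Mapping languages such as R2RML or RML express the same kind of row-to-subject and column-to-predicate rules declaratively instead of in code.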
56. Vocabularies to describe datasets and dataset catalogues
Metadata vocabularies: Schema.org, DCAT, VoID, HCLS
Data portals and catalogues: CKAN, data.gov.*, Google Dataset Search
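As an illustration of what such metadata can look like, the sketch below builds a small Schema.org Dataset description as JSON-LD using only the standard library. All values (name, description, URLs, licence) are placeholders, and the choice of properties is just one plausible minimal set, not a recommended profile.

```python
import json

# Hypothetical dataset description using Schema.org terms.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example legacy dataset",
    "description": "Placeholder dataset used only to illustrate the metadata.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/turtle",
        "contentUrl": "http://example.org/data/dataset.ttl",
    },
}

# Serialize to a JSON-LD document that could be embedded in a web page.
document = json.dumps(dataset, indent=2)
print(document)
```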
57. Thank you!