Franck MICHEL - Université Côte d'Azur, CNRS, Inria, I3S, France
F. Michel (1), C. Faron-Zucker (1), S. Tercerie (2), O. Gargominy (2)
(1) Université Côte d'Azur, CNRS, Inria, I3S, France. (2) Service du Patrimoine Naturel, MNHN, CNRS, France.
Modelling Biodiversity Linked Data:
Pragmatism May Narrow Future Opportunities
SPNHC+TDWG 2018
Dunedin, New Zealand
Source: Sangya Pundir. https://fr.wikipedia.org/wiki/Fichier:FAIR_data_principles.jpg
LOD Cloud: 1184 datasets, 150B statements
Linking Open Data cloud diagram, 2018. J.P. McCrae, A. Abele, P. Buitelaar, A. Jentzsch, V. Andryushechkin and R. Cyganiak. http://lod-cloud.net/
• On the Web, under open licenses
• Machine-readable (RDF)
• URIs to name things
• Common vocabularies
• Linked with each other
• Queryable
TAXREF-LD
NCBI Taxon
TaxonConcept
GeoSpecies
Plant Ontology
ENVO
The Semantic Web is more than the Web of Data
The Semantic Web provides an environment where
applications can publish and link data, define vocabularies,
query data at web scale, and draw inferences. (adapted from W3C website)
Building blocks: Linked Data, Querying, Vocabularies, Inference
Model a thing as a class or a class instance?
Thesaurus perspective: one class for all taxa
• Used by: AGROVOC thesaurus (skos:Concept); OpenBiodiv-O, EOL trait bank (dwc:Taxon); Wikidata (wd:Q16521)
• Example: Delphinus and Delphinus delphis are both instances of Taxon, linked by parent/broader
Taxonomic Rank perspective: one class per taxonomic rank
• Used by: GeoSpecies, TaxonConcept, DBpedia, BBC WO, BioFid.de
• Example: Delphinus delphis is an instance of Species and Delphinus is an instance of Genus, linked by hasGenus/broader
Biological perspective: one class per taxon
• Used by: NCBI Org. Classification, VTO, TAXREF-LD
• Example: Delphinus delphis is a subClassOf Delphinus; Flipper appears as an individual (class instance)
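The three perspectives above can be written down as RDF-like triples. The following is a minimal sketch using plain Python tuples rather than an RDF library. All "ex:" names and the exact property names are illustrative assumptions, not the terms actually used by the cited datasets, and typing Flipper as an instance of the species class is likewise assumed for the example.

```python
# Illustrative sketch: the same two taxa under the three modelling perspectives.
# Triples are (subject, predicate, object) tuples; every "ex:" name is hypothetical.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Thesaurus perspective: one class for all taxa, each taxon is an instance.
thesaurus = {
    ("ex:Delphinus", RDF_TYPE, "ex:Taxon"),
    ("ex:Delphinus_delphis", RDF_TYPE, "ex:Taxon"),
    ("ex:Delphinus_delphis", "skos:broader", "ex:Delphinus"),
}

# Taxonomic rank perspective: one class per rank, taxa are still instances.
rank = {
    ("ex:Delphinus", RDF_TYPE, "ex:Genus"),
    ("ex:Delphinus_delphis", RDF_TYPE, "ex:Species"),
    ("ex:Delphinus_delphis", "ex:hasGenus", "ex:Delphinus"),
}

# Biological perspective: one class per taxon, organisms are the instances.
biological = {
    ("ex:Delphinus_delphis", SUBCLASS_OF, "ex:Delphinus"),
    ("ex:Flipper", RDF_TYPE, "ex:Delphinus_delphis"),  # assumed for illustration
}


def classes_of(triples):
    """Resources used as classes: objects of rdf:type, or either end of rdfs:subClassOf."""
    found = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            found.add(o)
        elif p == SUBCLASS_OF:
            found.update((s, o))
    return found


def taxon_is_class(triples, taxon):
    """True if the taxon is modelled as a class in this set of triples."""
    return taxon in classes_of(triples)
```

Only the biological perspective makes `taxon_is_class(biological, "ex:Delphinus_delphis")` true; in the other two the taxon is an individual, which is what later constrains interlinking.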
Thesaurus vs. Biological perspectives
Biological
• Formal conceptualization of domain knowledge in a machine-processable format
• Define and organize terms
• Hierarchy of concepts using "subclass of" (subsumption), "is part of" (composition), or other relations
• Concepts are classes = sets of individuals
• Classes can be described and/or defined
Thesaurus
• Representation of domain knowledge in a machine-processable format
• Define and organize terms
• Hierarchy of concepts using relations between concepts (broader, narrower, match)
• Concepts are individuals = class instances
• Individuals are described
Description vs. Definition
An individual is described by stating its properties.
A class can be defined by a set of necessary and/or sufficient membership conditions.
[Figure: Delphinus delphis shown twice. As a described individual, it carries the properties habitat: marine; parental care: none; avg. body length: 2.44 m; rank: species. As a class, it keeps avg. body length: 2.44 m and rank: species, and is linked by subClassOf to Mammals with the restrictions (habitat, marine) and (parental care, none). Flipper is attached with an "is a" link.]
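The difference between describing an individual and defining a class can be sketched in a few lines. The example below is a simplification under stated assumptions: the class name `ex:MarineMammal` and its conditions are hypothetical, the conditions are treated as both necessary and sufficient, and the check is closed-world, whereas an OWL reasoner works under the open-world assumption.

```python
# Sketch: an individual is *described* by properties; a class is *defined* by conditions.

# Description: property/value statements about one individual (labels from the slide).
flipper = {"is a": "Mammals", "habitat": "marine", "parental care": "none"}
cow = {"is a": "Mammals", "habitat": "land"}  # hypothetical counter-example

# Definition: membership conditions, treated here as necessary AND sufficient.
# "ex:MarineMammal" is a hypothetical class introduced only for this example.
CLASS_DEFINITIONS = {
    "ex:MarineMammal": {"is a": "Mammals", "habitat": "marine"},
}


def classify(individual, definitions):
    """Return the classes whose conditions are all met by the individual's description."""
    return sorted(
        name
        for name, conditions in definitions.items()
        if all(individual.get(prop) == value for prop, value in conditions.items())
    )
```

Here `classify(flipper, CLASS_DEFINITIONS)` places Flipper in the defined class without any explicit type statement, which is the kind of result a description alone cannot yield.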
The Semantic Web is more than the Web of Data
Building blocks: Linked Data, Querying, Vocabularies, Inference
Infer subsumption relationships between classes
Classify individuals:
compute instance relationships between individuals and classes
Improve query answering:
query expansion, infer new triples to improve performance
Align similar classifications
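Two of these inference tasks, inferring subsumption and expanding a query to subclasses, can be illustrated with a small sketch. It assumes taxa are modelled as classes; the data structures and function names are hypothetical, and the family Delphinidae is added only to give the hierarchy a third level.

```python
# Sketch: inferred subsumption (transitive closure) and query expansion over subclasses.

# Asserted subClassOf statements: class -> list of direct superclasses.
SUBCLASS_OF = {
    "Delphinus delphis": ["Delphinus"],
    "Delphinus": ["Delphinidae"],  # added for illustration
}

# Asserted type of each individual.
TYPES = {"Flipper": "Delphinus delphis"}


def superclasses(cls, subclass_of):
    """All direct and inferred superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(cls, types, subclass_of):
    """Individuals typed with cls or with any (inferred) subclass of cls."""
    return sorted(
        individual
        for individual, asserted in types.items()
        if asserted == cls or cls in superclasses(asserted, subclass_of)
    )
```

A query for instances of Delphinidae then returns Flipper although no such triple was asserted. With taxa modelled as individuals linked by broader, this answer would require custom traversal logic instead of standard subclass reasoning.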
Modelling LD requires tackling several questions
What is my modelling perspective? For what use?
Thesaurus? Ontology? Other?
Will I need some type of automatic reasoning eventually?
How to maximize interlinking with related datasets?
• Theoretical issue: Description Logic (DL) best practices discourage aligning classes with class instances
• Linking thesauri and ontologies is not always possible, e.g. a taxon in NCBI (a class) vs. a taxon in AGROVOC (an instance of the skos:Concept class)
• Pragmatism: adopt the majority trend to maximize interlinking
• Taxa = class instances in AGROVOC, EOL, Wikidata, OpenBiodiv-O, DBpedia, GeoSpecies, TaxonConcept
• Taxa = classes in VTO, NCBI, TAXREF-LD
Trade-off between interlinking and reasoning?
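The interlinking problem can be made concrete with a small check that flags candidate links joining a class on one side to a class instance on the other. This is a sketch only: the prefixed identifiers are hypothetical stand-ins for NCBI-style and AGROVOC-style resources, and the test for "used as a class" is a simplification.

```python
# Sketch: detect candidate alignments that would link a class to a class instance.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def used_as_class(resource, triples):
    """True if the resource appears in a subClassOf statement or as the object of rdf:type."""
    return any(
        (p == SUBCLASS_OF and resource in (s, o)) or (p == RDF_TYPE and o == resource)
        for s, p, o in triples
    )


def mismatched_links(links, source, target):
    """Return the links whose two ends are modelled differently (class vs. instance)."""
    return [
        (a, b)
        for a, b in links
        if used_as_class(a, source) != used_as_class(b, target)
    ]


# Hypothetical identifiers: taxa as classes on one side, as SKOS concepts on the other.
ncbi_like = {("ncbi:Delphinus_delphis", SUBCLASS_OF, "ncbi:Delphinus")}
agrovoc_like = {
    ("agrovoc:Delphinus", RDF_TYPE, "skos:Concept"),
    ("agrovoc:Delphinus_delphis", RDF_TYPE, "skos:Concept"),
}
links = [("ncbi:Delphinus_delphis", "agrovoc:Delphinus_delphis")]
```

With this data, `mismatched_links(links, ncbi_like, agrovoc_like)` returns the single candidate link, which is the situation where a choice between interlinking and reasoning has to be made.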
Take-home messages
The Semantic Web is not just Linked Data.
Think of what inference may solve in my context.
Choose a modelling perspective for my LD:
controlled vocabulary, thesaurus, ontology, ...
Pragmatism can be beneficial in the short term,
but may come with a price.
Citation:
Michel F., Faron-Zucker C., Tercerie S. and Gargominy O. (2018).
Modelling Biodiversity Linked Data: Pragmatism May Narrow
Future Opportunities. Biodiversity Information Science and
Standards 2: e26235. https://doi.org/10.3897/biss.2.26235
Thank you
