Towards a Linked-Data 
Visualization Wizard 
Ghislain A. Atemezing (@gatemezing)* 
Raphaël Troncy (@rtroncy)
(*) The author thanks the Semantic Web Science Association (SWSA) for the grant received to participate in ISWC 2014.
Goal and Agenda 
• Goal: Build a visualization wizard based on the RDF stack
• Motivation
  - Gap between traditional InfoVis tools and Semantic Web applications
  - Graphs are not meant to be shown to end-users
• Current situation
  - Visualizations are built on known datasets and vocabularies
  - What happens with unknown datasets and vocabularies?
• Proposal: create generic visualizations based on data analysis of the RDF graphs
• Conclusion and Perspectives
2014/10/20 #COLD2014  Riva del Garda, Italy - 2
Motivation 
• Many structured datasets are now available on the Web (3 billion triples in the DBpedia 2014 release)
• RDF is not what we show to end-users
• The InfoVis community has mature tools and studies on visualizing information
• Triples are good, but they need to be beautiful for end-users
• In the era of structured big data, we also need tools for Web-based visual analysis and reporting
Challenges 
"Don't ask what you can do for the Semantic Web; ask what the Semantic Web can do for you!" (D. Karger, MIT CSAIL)
1- How to build a bridge to fill the gap between traditional InfoVis tools and Semantic Web technologies?
2- How can the Semantic Web help in visualization?
A Journey of a Web Application Developer 
• Scenario 1:
  - Known datasets, known vocabularies, so specific SPARQL queries
  - Visualizations: dataset-specific
• Example
  - Datasets on schools in France
  - Vocabularies: geo vocab, data cube, geometry
  - Application: PerfectSchool
A Journey of a Web Application Developer 
• Scenario 2:
  - Unknown datasets, known domains, so domain-specific SPARQL queries
  - Visualizations: domain-specific
• Example
  - Endpoints of geo datasets
  - Domain: geospatial
  - Application: GeoRDFviz
A Journey of a Web Application Developer 
• Scenario 3:
  - Unknown datasets, unknown domains, so generic SPARQL queries
  - Visualizations: adapted to specific domains
• Example
  - Any endpoint
  - Multiple domains: geodata, statistics, persons, cross-domain, etc.
  - Application: ???
• Related work on configuring Semantic Web widgets by data mapping [1]
  - Application: Efficient search for Semantic News demonstrator in Cultural Heritage Dataset
  - Tool: ClioPatria
  - But the method does not apply to creating interfaces on top of arbitrary SPARQL endpoints
[1] Hildebrand, Michiel, and Jacco Van Ossenbruggen. "Configuring semantic web interfaces by data mapping." Visual Interfaces to the Social and the Semantic Web (VISSW 2009) 443 (2009): 96.
Our Proposal 
Linked Data Visualization Wizard (LDVizWiz)
Requirements of LDVizWiz (LDViz-Wise) 
• Predefined categories associated with visual elements
• Build on top of RDF standards
  - e.g. SPARQL queries; Semantic Web technologies
• Reuse existing visualization libraries
  - e.g. Google Maps, Google Charts, D3.js, etc.
• Input: datasets published as LOD
• Reuse an Information Visualization taxonomy
• Target users who do not speak RDF/SPARQL
Mapping Categories and vocabularies 
• Geographic information
  - Geo, GeoSPARQL, etc.
• Temporal information
  - Time, interval ontologies
• Event information
  - lode, event, sport, etc.
• Agent/Person
  - foaf, org
• Organization information
  - ORG vocabulary, vcard
• Statistics information
  - Data cube, SDMX model
• Knowledge information
  - Schemas, classifications using SKOS vocabulary
LDVizWiz Workflow 
Step 1: Categories detection 
• Detection of main categories in datasets
  - ASK SPARQL queries on predefined categories
  - Uses well-known vocabularies in LOV (Linked Open Vocabularies)
  - Unveils the main facets of the visualizations
  - Conditions the type of visual elements [1]
[1] B. Shneiderman. The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. IEEE, 1996.
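The detection step can be illustrated with a minimal sketch. It assumes a category is "present" when at least one predicate or class in the dataset comes from a vocabulary namespace attached to that category. The namespace table, the function names and the `run_ask` callable (standing in for whatever client sends the query to an endpoint) are hypothetical and not taken from the LDVizWiz implementation; `STRSTARTS` also requires a SPARQL 1.1 endpoint.

```python
# Hypothetical, partial mapping from categories to vocabulary namespaces.
CATEGORY_NAMESPACES = {
    "GEO": ["http://www.w3.org/2003/01/geo/wgs84_pos#",
            "http://www.opengis.net/ont/geosparql#"],
    "TIME": ["http://www.w3.org/2006/time#"],
    "PERSON": ["http://xmlns.com/foaf/0.1/"],
    "ORG": ["http://www.w3.org/ns/org#"],
    "STAT": ["http://purl.org/linked-data/cube#"],
    "SKOS": ["http://www.w3.org/2004/02/skos/core#"],
}


def build_ask_query(namespaces):
    """Build an ASK query that holds if any predicate or class in the
    dataset starts with one of the given namespaces."""
    tests = " || ".join(
        f'STRSTARTS(STR(?term), "{ns}")' for ns in namespaces
    )
    return (
        "ASK WHERE {\n"
        "  { ?s ?term ?o } UNION { ?s a ?term }\n"
        f"  FILTER({tests})\n"
        "}"
    )


def detect_categories(run_ask):
    """Return the categories whose ASK query evaluates to True.

    run_ask is any callable that takes a query string and returns a bool
    (an assumption: in practice it would wrap an HTTP call to an endpoint).
    """
    return [
        category
        for category, namespaces in CATEGORY_NAMESPACES.items()
        if run_ask(build_ask_query(namespaces))
    ]
```

Keeping the endpoint call behind a callable lets the same logic run against a real endpoint or a stub, for example `detect_categories(lambda q: "foaf" in q)`.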
Experiment: Categories Detection 
Category       Number   %
GEO DATA       97       21.84%
EVENT DATA     16       3.60%
TIME DATA      27       6.08%
SKOS DATA      2        0.45%
ORG DATA       48       10.81%
PERSON DATA    59       13.28%
STAT DATA      29       6.6%
• Results
  - 444 endpoints (*) analyzed, 278 good answers (62.61%) using ASK queries
  - Few taxonomies in SKOS, many GEO DATA
• Applications
  - Automatic detection of endpoint categories
  - More trustworthy than human tagging
  - Map the detected categories to suitable visual elements for the visualizations (e.g. timeline + maps for event data)
(*) All the endpoints were retrieved from sparqles.org
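As a quick check of the table, the shares can be recomputed from the reported counts and the 444 analyzed endpoints. This is only an arithmetic sketch: the variable names are hypothetical, and it assumes each percentage is count / 444. Under that assumption most slide values look truncated rather than rounded (21.84% versus 21.85%), and STAT DATA comes out near 6.53% rather than the 6.6% shown.

```python
# Counts as reported on the slide.
counts = {
    "GEO DATA": 97,
    "EVENT DATA": 16,
    "TIME DATA": 27,
    "SKOS DATA": 2,
    "ORG DATA": 48,
    "PERSON DATA": 59,
    "STAT DATA": 29,
}
TOTAL_ENDPOINTS = 444


def share(count, total=TOTAL_ENDPOINTS):
    """Percentage of endpoints, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {category: share(n) for category, n in counts.items()}
positive_answers = sum(counts.values())

for category, pct in shares.items():
    print(f"{category:<12} {counts[category]:>3} {pct:>6.2f}%")
print(f"Sum of category counts: {positive_answers} ({share(positive_answers)}%)")
```

The category counts add up to 278, the same figure the slide gives for good answers.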
Step 2: Properties Aggregation
• Goal: Exploit the connectors between graphs
• Connectors are used to enrich a given graph
  - e.g. owl:sameAs, rdfs:seeAlso, skos:exactMatch
• Retrieve properties from external datasets
  - So-called "enriched properties"
• Build candidate properties for visualization
  - For pop-up menus
  - For facet browsing
  - For chart display
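A minimal sketch of the aggregation idea, assuming triples are already available in memory as `(subject, predicate, object)` tuples: for every local resource linked through a connector, collect the predicates used by the linked external resource. The function name and the data layout are hypothetical; a real implementation would more likely dereference the external URIs or query their endpoints.

```python
# Connector predicates named on the slide, as full IRIs.
CONNECTORS = {
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://www.w3.org/2000/01/rdf-schema#seeAlso",
    "http://www.w3.org/2004/02/skos/core#exactMatch",
}


def enriched_properties(local_triples, external_triples):
    """Map each local subject to the predicates of external resources
    it reaches through a connector predicate."""
    # Index the external dataset: subject -> set of predicates it uses.
    external_by_subject = {}
    for subject, predicate, _obj in external_triples:
        external_by_subject.setdefault(subject, set()).add(predicate)

    result = {}
    for subject, predicate, obj in local_triples:
        if predicate in CONNECTORS and obj in external_by_subject:
            result.setdefault(subject, set()).update(external_by_subject[obj])
    return result
```

The returned predicates are the candidates that could feed pop-up menus, facets or charts.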
Step 3: Publication
• Visualization Generator
  - Recommends the visual elements based on categories
  - Transforms ASK queries into SELECT or CONSTRUCT queries as input to the visual library
• Visualization Publisher
  - Exports the description of a visualization in RDF
  - Adds metadata for the visualization (charts) and the steps used to create it
  - e.g. dcat:Dataset, prov:wasDerivedFrom, void:ExampleResource, chart vocabulary
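The two generator tasks can be sketched as follows. The category-to-visual table is an assumption (only "timeline + maps for events" is suggested elsewhere in the slides), and `ask_to_select` is a hypothetical helper that handles only queries beginning with the `ASK` keyword, with no `PREFIX` prologue.

```python
import re

# Hypothetical category-to-visual-element table. Only the EVENT entry
# (timeline + map) is suggested by the slides; the rest are assumptions.
VISUAL_ELEMENTS = {
    "GEO": ["map"],
    "TIME": ["timeline"],
    "EVENT": ["timeline", "map"],
    "STAT": ["chart"],
}


def recommend_visuals(categories):
    """Return visual elements for the detected categories, without duplicates,
    in the order they are first suggested."""
    seen = []
    for category in categories:
        for element in VISUAL_ELEMENTS.get(category, []):
            if element not in seen:
                seen.append(element)
    return seen


def ask_to_select(ask_query, variables, limit=100):
    """Turn 'ASK [WHERE] { ... }' into 'SELECT DISTINCT ?v ... WHERE { ... }'."""
    stripped = ask_query.lstrip()
    if not re.match(r"ASK\b", stripped, flags=re.IGNORECASE):
        raise ValueError("expected a query that starts with ASK")
    projection = " ".join(f"?{name}" for name in variables)
    body = stripped[3:].lstrip()  # drop the ASK keyword
    if not body.upper().startswith("WHERE"):
        body = "WHERE " + body  # WHERE is optional after ASK
    return f"SELECT DISTINCT {projection} {body}\nLIMIT {limit}"
```

For example, `ask_to_select("ASK { ?s ?p ?o }", ["s", "o"], limit=10)` yields a SELECT query over the same graph pattern, whose result rows can then be handed to a charting or mapping library.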
Current Implementation 
• Lightweight JavaScript version as a proof of concept
• http://semantics.eurecom.fr/datalift/rdfViz/apps/
Conclusion and Future Work 
• LDVizWiz: a tool to generate visualizations
  - Based on RDF standards, targets lay users for graph analysis
  - Composed of 3 main steps: category detection, property aggregation and visualization publication
• A JavaScript implementation shows the usefulness of the approach
• Future work
  - Extend categories and vocabularies for detection
  - Add more libraries for visual elements in visualizations
  - Provide templates for generating mash-ups that combine domains
  - Investigate the importance of a category within a dataset
  - Provide a user evaluation
Questions? 
http://www.slideshare.net/ghislainatemezing/cold2014-ldvizwiz
