Towards a Linked-Data
Visualization Wizard
Ghislain A. Atemezing (@gatemezing)*
Raphaël Troncy (@rtroncy)
(*) The author thanks the Semantic Web Science Association (SWSA) for the grant received to participate in ISWC 2014.

Goal and Agenda
§ Goal: Build a visualization wizard
based on the RDF stack
§ Motivation
Ø Gap between traditional InfoVis tools and
Semantic Web applications
Ø Graphs are not meant to be shown to end-users
§ Current situation
Ø Visualizations are built on known datasets and vocabularies
Ø … what happen with unknown datasets and vocabularies?
§ Proposal: create generic visualizations based on data
analysis of the RDF graphs
§ Conclusion and Perspectives
2014/10/20 #COLD2014 – Riva del Garda, Italy

Motivation
§ Many structured datasets are now available on the
Web (3 billion triples in the DBpedia 2014 release)
§ RDF is not what we show to end-users
§ InfoVis community has mature tools and studies
on visualizing information
§ Triples are good …
but they need to be “beautiful” for end-users
§ In the era of “structured big data”, we also need
tools for Web–based visual analysis and reporting

Challenges
“Don’t ask what you can do for
the Semantic Web; ask what
the Semantic Web can do for
you!” (D. Karger, MIT CSAIL)
1- How to build a bridge to fill the
gap between traditional
InfoVis tools and Semantic Web
technologies?
2- How can the Semantic Web help
in visualization?

A Journey of a Web Application Developer
§ Scenario 1:
Ø Known Datasets, Known
vocabularies à Specific
SPARQL queries
Ø Visualizations: dataset specific
§ Example
Ø Datasets on schools in France
Ø Vocabularies: geo vocab, data
cube, geometry.
Ø Application: PerfectSchool

§ Scenario 2:
Ø Unknown Datasets, Known
domains, so domain-specific
SPARQL queries
Ø Visualizations: domain specific
§ Example
Ø Endpoints of geo datasets
Ø Domain: geospatial
Ø Application: GeoRDFviz

§ Scenario 3:
Ø Unknown Datasets, Unknown
domains, so generic SPARQL
queries
Ø Visualizations: adapted to
domains specific
§ Example
Ø Any endpoints
Ø Multiple domains: geodata,
statistics, persons, cross-domains,
etc.
Ø Application: ???

Related work on configuring Semantic Web widgets by data
mapping [1]
Application: Efficient search for Semantic News demonstrator
in Cultural Heritage Dataset
Tool: ClioPatria
…but “method not apply to create
interfaces on top of arbitrary
SPARQL endpoints”
[1] Hildebrand, Michiel, and Jacco Van Ossenbruggen. "Configuring semantic web interfaces by data mapping."
Visual Interfaces to the Social and the Semantic Web (VISSW 2009) 443 (2009): 96.

Our Proposal
Linked Data
Visualization
Wizard (LDVizWiz)

Requirements of LDVizWiz (LDViz-“Wise”)
§ Predefined categories associated
to visual elements
§ Build on top of RDF standards
Ø e.g. SPARQL queries; Semantic Web technologies
§ Reuse existing Visualization libraries
Ø e.g. Google Maps, Google Charts, D3.js, etc.
§ Input: Datasets published as LOD
§ Reuse Information Visualization Taxonomy
§ Target users who are not “RDF/SPARQL speakers”

Mapping Categories and vocabularies
§ Geographic
information
Ø Geo, GeoSparql, etc.
§ Temporal information
Ø Time, interval ontologies
§ Event information
Ø lode, event, sport, etc.
§ Agent/Person
Ø foaf, org
§ Organization
information
Ø ORG vocabulary, vcard
§ Statistics information
Ø Data cube, SDMX model
§ Knowledge
information
Ø Schemas, classifications
using SKOS vocabulary
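
The category-to-vocabulary mapping above can be sketched as a simple lookup table. This is a minimal illustration, not the authors' actual configuration: the category keys reuse the labels from the detection experiment, the namespace list is partial (the sport and interval vocabularies are left out), and `categories_of` is a hypothetical helper name.

```python
# Partial, illustrative table: category -> vocabulary namespaces.
# Which namespaces LDVizWiz really checks is an assumption here.
CATEGORY_NAMESPACES = {
    "GEO": [
        "http://www.w3.org/2003/01/geo/wgs84_pos#",
        "http://www.opengis.net/ont/geosparql#",
    ],
    "TIME": ["http://www.w3.org/2006/time#"],
    "EVENT": [
        "http://linkedevents.org/ontology/",
        "http://purl.org/NET/c4dm/event.owl#",
    ],
    # The slide lists both foaf and org under Agent/Person.
    "PERSON": ["http://xmlns.com/foaf/0.1/", "http://www.w3.org/ns/org#"],
    "ORG": ["http://www.w3.org/ns/org#", "http://www.w3.org/2006/vcard/ns#"],
    "STAT": [
        "http://purl.org/linked-data/cube#",
        "http://purl.org/linked-data/sdmx#",
    ],
    "SKOS": ["http://www.w3.org/2004/02/skos/core#"],
}


def categories_of(term_iri):
    """Return the sorted categories whose namespaces contain the given term IRI."""
    return sorted(
        category
        for category, namespaces in CATEGORY_NAMESPACES.items()
        if any(term_iri.startswith(ns) for ns in namespaces)
    )


if __name__ == "__main__":
    print(categories_of("http://xmlns.com/foaf/0.1/name"))
    print(categories_of("http://www.w3.org/ns/org#Organization"))
```

A term can belong to several categories (the ORG vocabulary appears under both Agent/Person and Organization), which is why the helper returns a list rather than a single label.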

LDVizWiz Workflow

Step 1: Categories detection
§ Detection of main categories in datasets
Ø ASK SPARQL queries on predefined categories
Ø Uses well-known vocabularies in LOV
Ø Unveils the main facets of the visualizations
Ø Determines the type of visual elements [1]
Detection
[1] B. Shneiderman. The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. IEEE, 1996
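
As a rough sketch of the detection step, one ASK query per category can test whether an endpoint uses any property from that category's vocabularies. The exact queries used by LDVizWiz are not shown in the slides, so the probe properties below (`geo:lat`, `geo:long`, `foaf:name`) and the names `build_ask_query`, `detect_categories` and `run_ask` are assumptions made for illustration. `run_ask` stands for any callable that sends a query to an endpoint and returns a boolean.

```python
def build_ask_query(property_iris):
    """Build an ASK query that is true if any of the given properties is used."""
    patterns = " UNION ".join(f"{{ ?s <{iri}> ?o }}" for iri in property_iris)
    return f"ASK WHERE {{ {patterns} }}"


def detect_categories(run_ask, probes):
    """Return the categories whose ASK query is answered with true.

    run_ask: hypothetical callable taking a SPARQL string, returning bool.
    probes:  dict mapping a category label to a list of property IRIs.
    """
    return [
        category
        for category, iris in probes.items()
        if run_ask(build_ask_query(iris))
    ]


# Illustrative probes only; a real list would come from vocabularies in LOV.
PROBES = {
    "GEO": [
        "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
        "http://www.w3.org/2003/01/geo/wgs84_pos#long",
    ],
    "PERSON": ["http://xmlns.com/foaf/0.1/name"],
}

if __name__ == "__main__":
    # Stand-in for a real endpoint: pretends only geo data is present.
    fake_endpoint = lambda query: "wgs84_pos" in query
    print(build_ask_query(PROBES["GEO"]))
    print(detect_categories(fake_endpoint, PROBES))
```

Passing the endpoint call in as a function keeps the sketch independent of any particular SPARQL client library.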

Experiment: Categories Detection
Category       Number   %
GEO DATA       97       21.84%
EVENT DATA     16       3.60%
TIME DATA      27       6.08%
SKOS DATA      2        0.45%
ORG DATA       48       10.81%
PERSON DATA    59       13.28%
STAT DATA      29       6.53%
Ø 444 endpoints (*) analyzed, 278
good answers (62.61%) using
ASK queries.
Ø Few taxonomies in SKOS, many
GEO DATA
§ Applications
Ø Automatic detection of
endpoints categories
Ø More “trustable” than
human tagging
Ø Map categories
detected with “suitable”
visual elements for the
visualizations (e.g.
TimeLine + maps for
events data)
(*) All the endpoints retrieved from sparqles.org
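
The figures in this experiment can be cross-checked with a few lines of arithmetic. The sketch below only reuses the counts from the table and the 444 analyzed endpoints; the per-category counts add up to the 278 good answers, which suggests each answering endpoint is counted under one category. Shares are rounded to two decimals here, so the last digit may differ from the slide's figures.

```python
ENDPOINTS_ANALYZED = 444

# Counts per detected category, copied from the table.
DETECTED = {
    "GEO": 97,
    "EVENT": 16,
    "TIME": 27,
    "SKOS": 2,
    "ORG": 48,
    "PERSON": 59,
    "STAT": 29,
}


def share(count, total=ENDPOINTS_ANALYZED):
    """Percentage of analyzed endpoints, rounded to two decimals."""
    return round(100.0 * count / total, 2)


good_answers = sum(DETECTED.values())

if __name__ == "__main__":
    for category, count in sorted(DETECTED.items(), key=lambda kv: -kv[1]):
        print(f"{category:<8}{count:>4}{share(count):>8.2f}%")
    print(f"good answers: {good_answers} ({share(good_answers):.2f}%)")
```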

Step 2: Properties Aggregation
§ Goal: Exploit the “connectors” between graphs
§ “connectors” are used to enrich a given graph
Ø e.g. owl:sameAs, rdfs:seeAlso,
skos:exactMatch
§ Retrieve properties from external datasets
Ø So called “enriched properties”
§ Build candidate properties for visualization
Ø For pop-up menus
Ø For facet browsing
Ø For charts display
Detection Aggregation
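
A minimal sketch of the aggregation idea, under stated assumptions: a SPARQL query lists the resources reachable from a given resource through the three connectors named above, and the properties found on those linked resources are merged with the local ones to form the candidate list. How LDVizWiz actually retrieves the external properties is not described in the slides; `build_connector_query` and `candidate_properties` are hypothetical names.

```python
# The three "connectors" named on the slide, as full IRIs.
CONNECTORS = [
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://www.w3.org/2000/01/rdf-schema#seeAlso",
    "http://www.w3.org/2004/02/skos/core#exactMatch",
]


def build_connector_query(resource_iri):
    """Build a SELECT query listing resources linked through a connector."""
    values = " ".join(f"<{iri}>" for iri in CONNECTORS)
    return (
        "SELECT DISTINCT ?connector ?target WHERE { "
        f"VALUES ?connector {{ {values} }} "
        f"<{resource_iri}> ?connector ?target }}"
    )


def candidate_properties(local_properties, enriched_properties):
    """Merge property IRIs and record where each one came from.

    Local properties take precedence; properties seen only on linked
    external resources are tagged as "enriched".
    """
    merged = {prop: "local" for prop in local_properties}
    for prop in enriched_properties:
        merged.setdefault(prop, "enriched")
    return merged


if __name__ == "__main__":
    print(build_connector_query("http://example.org/resource/1"))
    print(candidate_properties(
        ["http://xmlns.com/foaf/0.1/name"],
        ["http://xmlns.com/foaf/0.1/name", "http://xmlns.com/foaf/0.1/depiction"],
    ))
```

Keeping the origin of each property makes it possible to show enriched properties differently in pop-ups, facets or charts.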

Step 3: Publication
§ Visualization Generator
Ø Recommend the visual elements based on categories
Ø Transform ASK queries to SELECT or CONSTRUCT
queries for input to visual library
§ Visualization Publisher
Ø Export the description of a visualization in RDF
Ø Add metadata for the visualization (charts) and the
steps used to create it
Ø e.g. dcat:Dataset, prov:wasDerivedFrom,
void:ExampleResource, chart vocabulary
Detection Aggregation Publication
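
The generator part of this step can be sketched as two small functions: one recommends visual elements from the detected categories, the other turns a detection ASK query into a SELECT query whose results can feed a visualization library. Only the pairing of event data with a timeline and a map comes from the slides; the other entries of `VISUAL_ELEMENTS`, the `"table"` fallback, the default `LIMIT`, and the function names are assumptions. The rewrite also assumes the query starts with `ASK` and has no `PREFIX` prologue.

```python
import re

# Category -> visual elements. Only the EVENT entry is taken from the slides.
VISUAL_ELEMENTS = {
    "GEO": ["map"],
    "TIME": ["timeline"],
    "EVENT": ["timeline", "map"],
    "STAT": ["chart"],
}


def recommend(categories):
    """Return visual elements for the categories, without duplicates, in order."""
    elements = []
    for category in categories:
        # Unknown categories fall back to a plain table (an assumption).
        for element in VISUAL_ELEMENTS.get(category, ["table"]):
            if element not in elements:
                elements.append(element)
    return elements


def ask_to_select(ask_query, limit=100):
    """Rewrite 'ASK [WHERE] { ... }' into 'SELECT * WHERE { ... } LIMIT n'."""
    rewritten, count = re.subn(
        r"^\s*ASK\b(\s+WHERE\b)?",
        "SELECT * WHERE",
        ask_query.rstrip(),
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("query does not start with ASK")
    return f"{rewritten} LIMIT {limit}"


if __name__ == "__main__":
    print(recommend(["EVENT", "GEO"]))
    print(ask_to_select("ASK WHERE { ?s ?p ?o }"))
```

A CONSTRUCT variant would follow the same pattern but needs an explicit template, so it is left out of this sketch.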

Current Implementation
§ Lightweight JavaScript version as “proof-of-concept”
§ http://semantics.eurecom.fr/datalift/rdfViz/apps/

Conclusion and Future Work
§ LDVizWiz: a tool to generate visualizations
Ø Based on RDF standards, target to lay-users for graph analysis
Ø Composed of 3 main steps: category detections, property
aggregation and visualization publication
§ A Javascript implementation shows the usefulness of
the approach
§ Future work
Ø Extend categories and vocabularies for detection
Ø Add more libraries for visual elements in visualizations
Ø Provide templates for generating “mash-ups” that combine domains
Ø Investigate the “importance” of a category within a dataset
Ø Provide a user evaluation

Questions?
http://www.slideshare.net/ghislainatemezing/cold2014-ldvizwiz
