Using the Structure of DBpedia for Exploratory Search
Speaker: Samantha Lam
Supervisor: Conor Hayes
Motivating Work
DBpedia - heterogeneous graph
Background
Network Similarity: PathSim, NetClus, RankClus
Faceted Search: Facets for refining search
specific schema, (semi) supervised
Good for search when the user is familiar with the query
...but what about complete beginners?
Requires Exploratory Search → Unsupervised
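For reference, PathSim (named in the background above) scores two nodes of the same type along a user-chosen symmetric meta-path as s(x, y) = 2·M[x][y] / (M[x][x] + M[y][y]), where M counts meta-path instances. The meta-path has to be fixed in advance, which is the schema dependence noted above. A minimal sketch, assuming a hypothetical artist-by-genre count matrix `W` and the meta-path artist-genre-artist; the data and function names are illustrative only and do not come from the talk.

```python
def commuting_matrix(w):
    """M = W * W^T: M[i][j] counts meta-path instances between rows i and j."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in w]
            for row_i in w]


def pathsim(w, i, j):
    """PathSim between rows i and j of the incidence matrix w."""
    m = commuting_matrix(w)
    denom = m[i][i] + m[j][j]
    return 2 * m[i][j] / denom if denom else 0.0


# Hypothetical counts: rows are artists, columns are genres.
W = [
    [2, 1],  # artist 0
    [4, 2],  # artist 1: same genre mix as artist 0, but more links
    [0, 3],  # artist 2: only the second genre
]

if __name__ == "__main__":
    print(pathsim(W, 0, 1))
    print(pathsim(W, 0, 2))
```

The score is symmetric and equals 1 for a node compared with itself; the normalisation by self-paths keeps highly connected nodes from dominating.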
Exploratory Search?
Given a query, how should the results be organised in a useful way,
i.e. one that aids exploratory search?
E.g. suppose you hear a song on the radio...
Solution:
Classify results according to their contexts
Why? Reduces the need for in-depth reading and guides the user
Assumption
similarity  relatedness
Research Questions
1. Can we provide an effective graph-based framework that can aid exploratory search?
2. To do this, what are DBpedia's graph structures with respect to its different datasets?
DBpedia graphs summary
Infobox properties
- emergent, crowd-sourced
- heterogeneous types
- dense
Infobox ontology, SKOS/Wiki Category, YAGO
- agreed rules
- is-A structure
- sparse, tree-like
Infobox good for → Relatedness
Ontology good for → Labelling similar items
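The two families of data summarised above can be separated by predicate. A minimal sketch, assuming triples arrive as (subject, predicate, object) string tuples: raw infobox properties live under DBpedia's `http://dbpedia.org/property/` namespace, while type, subclass and category links use the standard RDF, RDFS, Dublin Core and SKOS predicates. The function name and the choice of predicates are assumptions for illustration; the slides do not state which predicates were actually used.

```python
# Raw infobox properties: dense, heterogeneous links (candidate relatedness edges).
INFOBOX_NS = "http://dbpedia.org/property/"

# Hierarchy-style predicates: sparse, tree-like links (candidate labelling edges).
HIERARCHY_PREDICATES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://www.w3.org/2000/01/rdf-schema#subClassOf",
    "http://purl.org/dc/terms/subject",
    "http://www.w3.org/2004/02/skos/core#broader",
}


def split_triples(triples):
    """Partition triples into (relatedness, hierarchy, other) lists."""
    relatedness, hierarchy, other = [], [], []
    for s, p, o in triples:
        if p.startswith(INFOBOX_NS):
            relatedness.append((s, p, o))
        elif p in HIERARCHY_PREDICATES:
            hierarchy.append((s, p, o))
        else:
            other.append((s, p, o))
    return relatedness, hierarchy, other
```

Keeping the two edge sets apart lets one graph drive grouping and the other drive labelling, in line with the "good for" split on the slide.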
Research Q1 Proposition
General Framework: [diagram not preserved]
Sample Query & Results
Query: Lisa Hannigan
Two methods: Weighted (W) and Uniform (U); 6 clusters
Cluster 1: (W, U) instruments
Top label: (W, U) Musical instruments
Cluster 2: (W) songs; (U) album and songs
Top label: (W) Songs by artist; (U) Albums by artist
Cluster 3: (W) albums; (U) album, music genres and songs
Top label: (W) Albums by artist; (U) Music subgenres by genre
Cluster 4: (W) mixed; (U) mixed
Top label: (W) Songs by artist; (U) Missing people
Cluster 5: (W) mixed; (U) mixed
Top label: (W) Albums by artist; (U) Towns and villages in the Republic of Ireland by county
Cluster 6: (W) musicians and bands; (U) musicians and bands
Top label: (W) Place of birth missing (living people); (U) Place of birth missing (living people)
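The slides do not say how a cluster's "Top label" is chosen. One plausible reading, shown here purely as an assumption, is the category attached to the most members of the cluster. The sketch below uses hypothetical member names and category lists; `top_label` and `categories_of` are illustrative names, not part of the talk.

```python
from collections import Counter


def top_label(cluster, categories_of):
    """Return the category shared by the most members (ties broken alphabetically)."""
    counts = Counter()
    for member in cluster:
        # Count each category once per member, even if it is listed twice.
        counts.update(set(categories_of.get(member, ())))
    if not counts:
        return None
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


# Hypothetical category assignments for three cluster members.
toy_categories = {
    "album_1": ["Albums by artist", "Debut albums"],
    "album_2": ["Albums by artist"],
    "song_1": ["Songs by artist"],
}

if __name__ == "__main__":
    print(top_label(["album_1", "album_2", "song_1"], toy_categories))
```

A majority vote like this also shows why mixed clusters can end up with an unrepresentative top label: a category shared by a minority of members still wins if nothing else is shared more widely.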
Sample Query & Results
Summary:
Weighted produced 4 coherent clusters out of 6, whereas Uniform (unweighted) produced only 2.
DBpedia Ontology labelling (see paper) gave broader labels for messier clusters, e.g. the top label was MusicalWork for mixed clusters.
Categories are better for more specific clusters.
Ongoing Challenges
Evaluation
User Study:
- compare only Weighted versus Unweighted results, different labelling methods?
Comparison:
- possible to compare against other faceted methods?
- compare with plain list for recall?
Summary
Investigated graph structure of DBpedia datasets
Framework to utilise this finding in exploratory search, gave example results
Ongoing challenge: evaluation
Thanks for listening! Questions welcome!