The Research and Education Space
a pathway to bring our cultural heritage
(including the BBC archive) to life
Dr Chiara Del Vescovo
Data Architect at BBC
Vision
Web-like
Web-based
Interlinking
heterogeneous
resources
Capturing
semantic
interrelations
Reliable,
provably
cleared for
education
Linked Open Data
users / developers
A pathway
BL
BM
BFI
Tate
V&A

BBC
aggregating
platform
RES (BBC, Jisc, BUFVC)
Core Platform: Acropolis
Project RES: Technical Approach
1. The crawler fetches data via HTTP from published sources. Once retrieved, it is indexed by the full-text store and passed to the aggregation engine for evaluation.
2. The results of the aggregation engine's evaluation process are stored in the aggregate store, which contains minimal browse information and information about the similarity of entities.
3. The public face of the core platform is an extremely basic browsing interface (which presents the data in tabular form to aid application developers), and read-write RESTful APIs.
4. Applications may use the APIs to locate information about aggregated entities, and also to store annotations and activity data.
5. Each component employs standard protocols and formats. For example, we can make use of any capable quad-store as our aggregate store.
Linked data crawler: Anansi
Aggregation engine: Spindle
Full-text store
Aggregate store
Minimal browse interface & APIs: Quilt
Activity store
users / developers
Acropolis
(index!)
BL
BM
BFI
Tate
V&A

BBC
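Step 2 above, where the aggregation engine records the similarity of entities, can be illustrated with a toy sketch. This is a hypothetical miniature, not the RES codebase: it merely groups crawled entities joined by owl:sameAs links, the kind of co-reference information an aggregate store would hold. All the example URIs are made up.

```python
# Hypothetical sketch (not RES code): group crawled entities that are
# connected by owl:sameAs, as an aggregation engine might before writing
# co-reference sets to the aggregate store.

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def aggregate(triples):
    """Union-find over subjects and objects joined by owl:sameAs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for s, p, o in triples:
        if p == SAME_AS:
            union(s, o)

    # Collect each co-reference set under its representative.
    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

# Toy data: one artwork described by two hypothetical publishers and a hub.
triples = [
    ("http://example.tate.org/work/1", SAME_AS, "http://example.bm.org/obj/9"),
    ("http://example.bm.org/obj/9", SAME_AS, "http://example.dbpedia.org/resource/X"),
]
print(aggregate(triples))
```

The three URIs end up in a single co-reference set, which is all the "minimal browse information" an index needs to answer "which published resources describe this entity?".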
RES (BBC, Jisc, BUFVC)
Core Platform: Acropolis
informed by
users / developers
planned pilots
BL, BM, BFI, Tate, V&A, …, BBC
beta.acropolis.org.uk
What I do
(with my colleague Alex)
1. Devise a publishing scheme to determine URIs
2. Translate the original metadata into RDF
3. Discover and reconcile links with hubs (e.g., LoC, Geonames, DBpedia)
4. Make the existing schema explicit as a local ontology
5. Match the ontology onto well-established ontologies (e.g., DCMI, FOAF, SKOS, CIDOC-CRM)
6. Advise on how to express machine-readable licences, for both resources and metadata
7. Provide technical support to publish LOD
BL
BM
BFI
Tate
V&A

BBC
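Steps 1, 2 and 6 of the list above can be sketched in miniature. The base URI, record fields and helper function below are purely illustrative assumptions, not any partner's actual publishing scheme; only the Dublin Core term URIs are real.

```python
# Hypothetical sketch of steps 1, 2 and 6: mint a URI for a catalogue
# record, translate its fields into RDF (Dublin Core terms), and attach
# a machine-readable licence. BASE and the record shape are assumptions.

DCT = "http://purl.org/dc/terms/"
BASE = "http://example.org/things/"  # assumed URI publishing scheme

def record_to_ntriples(record):
    subject = f"<{BASE}{record['id']}>"  # step 1: deterministic URI
    lines = [
        f'{subject} <{DCT}title> "{record["title"]}" .',      # step 2: metadata as RDF
        f'{subject} <{DCT}license> <{record["license"]}> .',  # step 6: machine-readable licence
    ]
    return "\n".join(lines)

record = {
    "id": "painting-42",
    "title": "An Example Painting",
    "license": "http://creativecommons.org/licenses/by/4.0/",
}
print(record_to_ntriples(record))
```

In practice the translation would go through a proper RDF library rather than string formatting, but the shape of the work is the same: stable URIs, mapped terms, explicit licences.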
DBPedialite
British Museum
DBPedia
Europeana
• a general Data Model (EDM)
• collection holders are responsible for fitting their resources and metadata into EDM
British Library
Extreme cases
Challenges
Stakeholders go quiet!
1. Which metadata?
• Currently, resource metadata is mostly oriented towards physical proximity
i.e., indexes reflect similarity of authors' surnames, broad subject, format, media, etc.
• Heterogeneous platforms and data models
incompatibility; transformations needed
• Even when RDF is used, there is a proliferation of terms, vocabularies and formats adopted
little (if any) validation
2. Linking
• Systems that do not use RDF do not allow collection holders to express their knowledge as they wish
underspecified knowledge
• Even when RDF is used, information is often provided as literals rather than as links to URIs
ad hoc solutions, unavailable in a machine-readable format
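The literal-versus-URI problem above can be shown with a tiny sketch. The lookup table is an assumed stand-in for querying a real hub such as Geonames, and `reconcile` is a hypothetical helper, not an existing API.

```python
# Sketch of reconciliation (hypothetical): replace a string literal with
# a URI where an authority hub already has an identifier for the value.

PLACE_URIS = {  # tiny assumed stand-in for a hub lookup (e.g., Geonames)
    "London": "http://sws.geonames.org/2643743/",
}

def reconcile(value):
    """Return a hub URI for the literal if one is known, else keep the literal."""
    return PLACE_URIS.get(value, value)

print(reconcile("London"))    # becomes a link to a URI
print(reconcile("Atlantis"))  # stays a literal: underspecified knowledge
```

Anything the hub does not know stays a plain literal, which is exactly the "underspecified knowledge" that blocks machine-readable linking.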
3. Usability
• Reliability
• Lack of tools
developers have little contact with collection holders
• Licensing issues
resource licensing (not always explicit)
metadata licensing
users need to be aware of what that means
(note that in education things are slightly easier: blanket licensing etc.)
Interested?
• Get in touch!
• chiara.delvescovo@bbc.co.uk
• alex.tucker@bbc.co.uk
• Newly advertised position: Junior Data Architect
careershub.bbc.co.uk