Presented at Hypertext 2009:
A key impediment to the mainstream adoption of Adaptive Hypermedia for web applications and corporate websites is the difficulty of repurposing existing content for such delivery systems. This paper proposes a novel framework for open-corpus content preparation that makes such content usable by adaptive hypermedia systems. The proposed framework processes documents drawn from both open (i.e. web) and closed corpora, producing coherent conceptual sections of text with associated descriptive metadata. The solution bridges the gap between information resources and the information requirements of adaptive systems by adopting state-of-the-art information extraction and structural content analysis techniques. The result is the on-demand provision of tailored, atomic information objects called slices. The challenges associated with open-corpus content reusability are addressed with the aim of improving the scalability and interoperability of adaptive systems. The paper presents an initial architecture for such a framework and reviews the associated technologies.
A Framework for Content Preparation to Support Open-Corpus Adaptive Hypermedia
1. Towards a Framework for Open-Corpus Content Preparation supporting Adaptive Hypermedia Systems
Killian Levacher
2. Outline
▪ Increasing importance of adaptive systems on the Web
▪ AHS impediments to full mainstream adoption
▪ Novel content preparation framework solution
▪ Framework benefits
▪ Novel challenges introduced
▪ Roadmap ahead
5. Content Availability Impedes AHS Full Mainstream Adoption
Mainly due to low availability of suitable content in terms of volume, style, diversity, meta-data, granularity…
▪ Manually authored by small groups of users
▪ Lack of diversity and up-to-date content
▪ Pre-existing documents
▪ Authored in particular formats
6. Wealth of Information on the Web
▪ Content not directly re-usable by AHS
• Usually built for a single purpose
• Limited amount of meta-data
• Very heterogeneous (Style…)
• Different languages
• Very coarse grained
• Contains noisy information
10. Open Corpus Content Preparation
▪ Make the wealth of open-corpus information available to AHS
▪ Bridge the gap between open-corpus content and AHS-specific information requirements
▪ Fully decouple content from the core adaptive system
▪ Service that prepares open-corpus content for AHS usage
▪ Automated content preparation service
▪ Wide variety of up-to-date content
▪ Re-purposing of existing content
▪ No generic structure to comply with
11. Content Analysis Services
A Priori
▪ A Slice
• is a semantically independent piece of content extracted from a pre-existing document
• is retrieved in a chosen format
• represents an AH-specific, subjective perspective on a document
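The slice properties listed above can be illustrated with a small data structure. This is only a minimal sketch and not the framework's actual design: the `Slice` class, its field names and the two toy output formats are hypothetical, chosen to mirror the three bullets (independent content taken from a source document, delivery in a chosen format, and metadata reflecting one adaptive system's perspective).

```python
import html
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Slice:
    """Hypothetical slice record; all field names are illustrative assumptions."""
    text: str            # semantically independent piece of content
    source: str          # identifier of the pre-existing document it came from
    output_format: str   # format in which the slice should be delivered
    metadata: dict = field(default_factory=dict)  # descriptive metadata for the AHS


def render(slice_: Slice) -> str:
    """Return the slice content in its chosen format (two toy formats only)."""
    if slice_.output_format == "html":
        return f"<p>{html.escape(slice_.text)}</p>"
    if slice_.output_format == "text":
        return slice_.text
    raise ValueError(f"unsupported format: {slice_.output_format}")


example = Slice(
    text="Restart the router & wait",
    source="doc-42",                    # made-up document identifier
    output_format="html",
    metadata={"topic": "networking"},   # made-up annotation
)
```

Calling `render(example)` returns the text wrapped in a paragraph element with the ampersand escaped; a different adaptive system could request the same content with `output_format="text"` and its own metadata.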
13. Benefits of this Framework
▪ Open-corpus processing that re-purposes existing content
▪ Automated content slicing vs manual authorship
• Possible solution to content authorship scalability
▪ Removal of content format dependency for AH
• Content preparation approach solves interoperability issues
▪ Pipelined approach enables new content annotators to be plugged in seamlessly
▪ Concept of subjective slices of existing content
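The pipelined approach with pluggable annotators mentioned above could look roughly like the following. This is a hedged sketch rather than the proposed architecture: the `Pipeline` class, the `plug_in` method, the blank-line `segment` function and the `word_count` annotator are all hypothetical stand-ins for the structural segmenter and semantic annotators named in these slides.

```python
from typing import Callable, Dict, List

# An annotator takes one text section and returns metadata for it (assumed interface).
Annotator = Callable[[str], Dict[str, object]]


def segment(document: str) -> List[str]:
    """Toy structural segmenter: treats blank-line-separated blocks as sections."""
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def word_count(section: str) -> Dict[str, object]:
    """Toy annotator: records the number of whitespace-separated words."""
    return {"word_count": len(section.split())}


class Pipeline:
    """Runs every plugged-in annotator over every segmented section."""

    def __init__(self) -> None:
        self.annotators: List[Annotator] = []

    def plug_in(self, annotator: Annotator) -> "Pipeline":
        # New annotators are appended without touching existing ones.
        self.annotators.append(annotator)
        return self

    def run(self, document: str) -> List[Dict[str, object]]:
        results = []
        for section in segment(document):
            metadata: Dict[str, object] = {}
            for annotate in self.annotators:
                metadata.update(annotate(section))
            results.append({"text": section, "metadata": metadata})
        return results


pipeline = Pipeline().plug_in(word_count)
sections = pipeline.run("Adaptive systems personalise content.\n\nSlices are atomic.")
```

Because each annotator only has to satisfy the `Annotator` signature, adding a further one (for language detection, say) is a single `plug_in` call and leaves the segmenter and the other annotators unchanged.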
14. New Challenges Introduced
▪ Structural Segmenter fulfilling large domain and processing speed requirements
▪ Semantic Annotator will be a critical component
• How much semantic meta-data can we really aim for? Will it necessarily be domain dependent?
• Has a direct influence on Slice Precision
15. Roadmap Ahead
▪ Selection or composition of a framework-specific structural segmenter
▪ Initial comparative evaluation of semantic analyzers to assess the quality and volume of meta-data expected
▪ Implementation of framework within a Personalized Multi-Lingual Customer Care System
16. Summary
▪ Content provision impedes the full mainstream adoption of AHS
▪ Content provision should be loosely coupled with adaptive systems
▪ Novel open-corpus preparation framework
▪ Solution provides content scalability, interoperability, volume, diversity
▪ New challenges ahead
▪ Initial proof of concept planned within PMCC system