The document discusses the need for a Semantic Web to address information overload on the current web. It explains that the Semantic Web aims to understand the meaning behind web pages by embedding semantics through techniques like RDF and microformats. This will allow computers to better understand and filter information, leading to a smarter online experience for users where they spend less time searching and viewing irrelevant content. Approaches to building the Semantic Web include bottom-up annotation of existing web pages and top-down extraction of entities from pages using natural language processing tools. Linked Data is seen as a key enabler of the Semantic Web by establishing linkages between data.
2. Agenda: Introduction; Semantic Web (What is the Semantic Web? Why does it matter? How to semantify the Web?); Web 3.0; Linked Data.
3. Introduction: 1.3+ billion people are connected to the web. In 2006, 161 EB of information was created or replicated (1 EB = 1 billion GB). Technical information doubled every 2 years. By 2010, the volume grows about sixfold to 988 EB (approximately 1 ZB), and technical information will double every 72 hours. Devices: computers, mobile phones, intelligent devices. The Internet is broken: it is not one web, and its parts are unable to communicate.
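As a quick sanity check of the figures on this slide, the sketch below converts exabytes to gigabytes and computes the growth factor from 161 EB to 988 EB. The annualized rate assumes smooth compound growth over the four years from 2006 to 2010, which the slide does not state.

```python
EB_IN_GB = 1_000_000_000  # from the slide: 1 EB = 1 billion GB

created_2006_eb = 161     # information created/replicated in 2006
projected_2010_eb = 988   # figure given for 2010


def growth_factor(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def annual_rate(start, end, years):
    """Compound annual growth rate, assuming smooth growth (an assumption)."""
    return (end / start) ** (1 / years) - 1


factor = growth_factor(created_2006_eb, projected_2010_eb)
rate = annual_rate(created_2006_eb, projected_2010_eb, years=4)

print(f"2006 volume: {created_2006_eb * EB_IN_GB:.3e} GB")
print(f"Growth factor 2006 -> 2010: {factor:.2f}x")
print(f"Implied annual growth: {rate:.0%}")
```

The factor comes out slightly above six, which matches the slide's "six times" as a rounded figure.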
4. Information Overload: Is that really how the Web experience is supposed to feel? Key problem: how to share meaning? Filtering, not aggregating. Not more, just smarter.
5. Semantics? Related to syntax. Syntax: how you say something (letters, punctuation, grammar), e.g. HTML. Semantics: the meaning behind what you say. Example: "I Love Technology" and "I ♥ Technology" differ in syntax but carry the same meaning.
6. What's the big deal? The Internet: a standard way to communicate, but like a parrot it mimics without understanding. The Web: store and retrieve documents on the Internet, with a syntax to display the document (HTML). Search engines: find any website that we want. Life is good! Can we make it any better? How?
7. The Answer: the Semantic Web. Understand the meaning behind web pages. Web of Things vs. Web of Documents: things can be anything (people, places, pets, events, music, movies, organizations). Not only identify these things but also the relationships between them (human-like!). Embed semantics in HTML documents using microformats and RDF. It's not about the future; it's about today!
10. Why Semantic Web? Spend less time searching. Spend less time looking at things that do not matter. Spend less time explaining what we want to computers. Bottom line: improve the online experience!
12. It's all about the noise. Web 1.0: get (hear and see) noise. Web 2.0: make noise. Web 3.0: filter the noise. Web 4.0: going deaf... or SmartNoise.
13. Semantifying the Web - Approaches: Bottom Up. Annotating information in web pages with machine-readable tags. Technical challenges: representational complexity; how to create annotations (manually or automatically?); how much can be transformed?; standards issues. Business challenges: it's primitive; what is the consumer value?; how to market it? Recent wins: the Yahoo search engine is to support RDF and microformats (MF); Dapper offers an automated annotation tool.
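To make the bottom-up idea concrete, here is a minimal sketch of annotating plain contact data with machine-readable tags. It uses the hCard class names `vcard`, `fn`, `org`, and `email`; the function name `to_hcard` and the sample record are hypothetical and only serve as illustration.

```python
from html import escape


def to_hcard(name, org, email):
    """Wrap plain contact fields in hCard class names so tools can find them."""
    return (
        '<div class="vcard">'
        f'<span class="fn">{escape(name)}</span>, '
        f'<span class="org">{escape(org)}</span>, '
        f'<a class="email" href="mailto:{escape(email)}">{escape(email)}</a>'
        "</div>"
    )


# Hypothetical record; escape() protects characters such as "&".
markup = to_hcard("Jane Doe", "Example & Co", "jane@example.com")
print(markup)
```

The page looks the same to a human reader, but a crawler can now tell which text is the person's name and which is the organization.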
14. Annotation Technologies: a trade-off between simplicity and completeness. RDF: graph-based (things, attributes, relationships); precise but complex; built on triples. Microformats: use specific class names on HTML elements (the same attribute CSS uses); compact; embedded in HTML; gaining popularity because of their simplicity. Popular microformats: hCard describes personal and company contact information; hReview adds meta information to review pages; hCalendar is used to describe events. Limitations: no way to describe type hierarchies; somewhat cryptic, because the focus is to keep the annotations to a minimum. Adopters include Flickr, Eventful, and LinkedIn.
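The relationship between the two technologies on this slide can be sketched as follows: a compact hCard annotation is read from HTML and turned into RDF-style (subject, predicate, object) triples. This is a simplified illustration using only Python's standard `html.parser`; the class `HCardParser`, the function `hcard_to_triples`, the subject URI, and the sample markup are all assumptions, and only a small subset of hCard properties is handled.

```python
from html.parser import HTMLParser

HCARD_PROPS = {"fn", "org", "email"}  # small subset of hCard property names


class HCardParser(HTMLParser):
    """Collects the text of elements whose class is a known hCard property.

    Simplification: takes the first non-empty text after the opening tag
    and ignores nested markup inside a property.
    """

    def __init__(self):
        super().__init__()
        self.current = None  # property whose text we expect next
        self.props = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in HCARD_PROPS:
                self.current = cls

    def handle_data(self, data):
        if self.current and data.strip():
            self.props[self.current] = data.strip()
            self.current = None


def hcard_to_triples(html, subject):
    """Return (subject, predicate, object) triples for the parsed properties."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return [(subject, prop, value) for prop, value in sorted(parser.props.items())]


# Hypothetical markup and subject URI.
html = (
    '<div class="vcard"><span class="fn">Jane Doe</span>'
    '<span class="org">Example Corp</span></div>'
)
subject = "http://example.org/people/jane"
triples = hcard_to_triples(html, subject)
for triple in triples:
    print(triple)
```

The microformat side stays short and lives inside the page, while the triple side is more verbose but can express arbitrary things, attributes, and relationships.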
15. Semantifying the Web - Approaches: Top Down. Focused on leveraging the information in existing web pages as is. NLP tools (entity extraction): Calais and TextWise provide APIs that recognize people, companies, and places in documents. Vertical search engines: ZoomInfo, Spock, and Retrevo. Dapper, BlueOrganizer, and ClearForest recognize objects in web pages and annotate them. Yahoo! Shortcuts, Snap, and Smartlinks recognize objects in text and links. Challenges: not 100% accurate, because of ambiguities; may not scale well.
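The ambiguity challenge named on this slide can be illustrated with a toy, dictionary-based entity extractor. This is not how Calais, TextWise, or any other tool listed above works; the gazetteer, the type labels, and the function `extract_entities` are hypothetical and exist only to show why a surface form alone is not enough.

```python
import re

# Hypothetical gazetteer: surface form -> candidate entity types.
GAZETTEER = {
    "Paris": ["Place", "Person"],
    "Apple": ["Company", "Fruit"],
    "Yahoo": ["Company"],
}


def extract_entities(text):
    """Return (surface form, candidate types, is_ambiguous) for known words."""
    found = []
    for word in re.findall(r"[A-Z][a-z]+", text):
        types = GAZETTEER.get(word)
        if types:
            found.append((word, types, len(types) > 1))
    return found


result = extract_entities("Paris visited Apple and Yahoo last week.")
for word, types, ambiguous in result:
    label = "ambiguous" if ambiguous else "unambiguous"
    print(f"{word}: {types} ({label})")
```

Resolving "Paris" or "Apple" correctly requires context from the surrounding text, which is where real entity extraction tools spend their effort and where errors come from.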
20. Structured Data. RDBMS: powerful and flexible, but relies on pre-defined relationships and usage of data; too constraining and too structured; schema changes are expensive; virtually impossible to make different databases talk to each other. Linked Data: establishes linkages at the data level (RDF); bridges the gap between unstructured and structured data; does not add any semantic meaning to the information.
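A minimal sketch of linkage at the data level: two independent sources describe different things, and a shared URI is all that is needed to combine them, with no common schema agreed in advance. The URIs, predicate names, and helper functions below are hypothetical, and each subject is assumed to have at most one value per predicate.

```python
from collections import defaultdict

# Hypothetical URIs identifying a person and an organization.
JANE = "http://example.org/people/jane"
ACME = "http://example.org/orgs/acme"

# Two independent sources, each a list of (subject, predicate, object) triples.
hr_triples = [(JANE, "name", "Jane Doe"), (JANE, "worksFor", ACME)]
registry_triples = [(ACME, "name", "Acme Corp"), (ACME, "locatedIn", "Berlin")]


def build_graph(*sources):
    """Merge triples from any number of sources into subject -> {predicate: object}."""
    graph = defaultdict(dict)
    for triples in sources:
        for subject, predicate, obj in triples:
            graph[subject][predicate] = obj
    return graph


def follow(graph, start, *path):
    """Walk a chain of predicates from `start` and return the final value."""
    node = start
    for predicate in path:
        node = graph[node][predicate]
    return node


graph = build_graph(hr_triples, registry_triples)
city = follow(graph, JANE, "worksFor", "locatedIn")
print(city)
```

Note that the predicates here are plain strings: the links connect the data, but nothing in the graph says what `worksFor` means, which is the slide's point that linking alone adds no semantics.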
21. Linked Data: the medium for the Semantic Web. It does not create smart data, only enables it. Relies on clean, granular, structured data: pre-structured and pre-connected.