The document introduces OpenCalais, a free web service from Thomson Reuters that extracts metadata such as entities, facts, and events from content. It has 18,000 developers and processes more than 4 million documents per day. OpenCalais categorizes content and links it to open web data. Early adopters are using the metadata to organize content in new ways and improve search. New publishers and tools are also integrating OpenCalais capabilities.
2. Introducing OpenCalais
• A Thomson Reuters initiative to connect all the world’s
business-relevant content.
• A free service that brings new efficiencies and
productivity to publishers and content curators.
• The fastest, easiest way to categorize your content, and
tag the entities, facts and events therein.
• Progress since February 2008:
• 18,000 developers
• 20+ publishers using OpenCalais
• 50+ cool new apps and services created
• 4+ million documents per day processed
3. Free Metadata Generation
1. You feed your content into our
extraction engine
2. It categorizes the stories; finds
the people, places, companies,
facts and events, and then
returns that metadata to you
3. Along with the metadata, it
returns links to free data on the
open Web (e.g. Wikipedia, the CIA
World Factbook, IMDb)
4. You use the metadata to
streamline content ops, enhance
your content, create topic hubs
on the fly, improve search, etc.
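Step 4 above (creating topic hubs on the fly) can be sketched in a few lines. This is a minimal illustration, not the OpenCalais API: the metadata layout, the field names `type`, `name` and `relevance`, the `build_topic_hubs` helper and the sample stories are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical, simplified metadata for two stories. The field names
# ("type", "name", "relevance") are assumptions for illustration only,
# not the actual OpenCalais response format.
stories = {
    "story-1": [
        {"type": "Company", "name": "Thomson Reuters", "relevance": 0.8},
        {"type": "City", "name": "Seattle", "relevance": 0.3},
    ],
    "story-2": [
        {"type": "Company", "name": "Thomson Reuters", "relevance": 0.6},
        {"type": "Person", "name": "Jane Doe", "relevance": 0.1},
    ],
}


def build_topic_hubs(stories, min_relevance=0.2):
    """Map each (type, name) entity to the sorted list of stories that mention it."""
    hubs = defaultdict(list)
    for story_id, entities in stories.items():
        for entity in entities:
            # Skip weakly relevant entities so hubs stay focused.
            if entity["relevance"] >= min_relevance:
                hubs[(entity["type"], entity["name"])].append(story_id)
    return {topic: sorted(ids) for topic, ids in hubs.items()}


hubs = build_topic_hubs(stories)
for (entity_type, name), story_ids in sorted(hubs.items()):
    print(f"{entity_type}: {name} -> {story_ids}")
```

Each key in `hubs` could back one topic page; the relevance threshold is an arbitrary choice here and would need tuning against real metadata.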
4. Live Demo:
http://viewer.opencalais.com
1. Cut and paste a business news story into the viewer,
and hit submit.
2. View the semantic markup (hover over underlined
items to see relevance, for instance).
3. Expand the extracted entities, facts and events in
the left-hand rail.
4. Click on one of the companies in the list on the left,
to view the OpenCalais / Thomson Reuters asset on
that company in the Linked Data cloud.
5. Click the ‘SameAs’ links at the bottom to find more
data on the Linked Data cloud.
5. How Metadata Connects You to the Open Web
The Linked Data Cloud – December, 2008
7. Your Content & The OpenCalais Process
1. Unstructured text
2. Calais extracts entities, facts and events
3. Metadata returned to the user with keys
4. Keys provide access to the Calais Linked Data cloud
5. Which provides information and other Linked Data pointers
6. To a range of open and partner Linked Data assets,
including Thomson Reuters
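The key-following part of this process (a key leads to a Linked Data asset, which in turn points to further assets through 'SameAs' links) might look like the sketch below. It uses a toy in-memory dictionary in place of the real Linked Data cloud; every URI is an example.org placeholder, and the `LINKED_DATA` structure and `resolve_same_as` helper are hypothetical names introduced only for illustration.

```python
from collections import deque

# Toy in-memory stand-in for the Linked Data cloud. Every URI below is a
# made-up placeholder (example.org), not a real OpenCalais or partner address.
LINKED_DATA = {
    "http://example.org/calais/company/acme": {
        "name": "Acme Corp",
        "sameAs": [
            "http://example.org/encyclopedia/Acme_Corp",
            "http://example.org/partner/acme",
        ],
    },
    "http://example.org/encyclopedia/Acme_Corp": {
        "name": "Acme Corp (encyclopedia entry)",
        "sameAs": ["http://example.org/calais/company/acme"],
    },
    "http://example.org/partner/acme": {
        "name": "Acme Corp (partner data)",
        "sameAs": [],
    },
}


def resolve_same_as(key, graph):
    """Return every URI reachable from `key` via sameAs links, in breadth-first order."""
    seen = []
    pending = deque([key])
    while pending:
        uri = pending.popleft()
        # Ignore URIs already visited (links can be circular) or unknown to the graph.
        if uri in seen or uri not in graph:
            continue
        seen.append(uri)
        pending.extend(graph[uri]["sameAs"])
    return seen


for uri in resolve_same_as("http://example.org/calais/company/acme", LINKED_DATA):
    print(uri, "->", LINKED_DATA[uri]["name"])
```

In practice each lookup would be an HTTP request for the resource behind the key rather than a dictionary access; the visited check matters either way, because sameAs links frequently point back at each other.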
10. Early Adopters
• Aggregate & organize content in new ways.
• Automatically produce topic-based sites.
• Improve search functionality.
• Generate better content recommendations.
• Publish reviews, articles & blog posts for programmatic use on the open Web
• Content Triage
• Hyper-local news
• Contextual Ad Placement
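As one illustration of how extracted tags could drive content recommendations, the sketch below ranks articles by how many tags they share with a given article (Jaccard similarity). This is an assumed approach chosen for simplicity, not a description of what any early adopter actually does; the article ids, tag sets and the `recommend` helper are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(article_id, tags_by_article, top_n=2):
    """Return up to top_n other articles that share tags, most similar first."""
    target = tags_by_article[article_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in tags_by_article.items()
        if other != article_id
    ]
    # Drop articles with no tags in common, then sort by score (ties by id).
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


# Hypothetical tag sets, as might be derived from extracted entities.
tags_by_article = {
    "a": {"Thomson Reuters", "Linked Data", "Publishing"},
    "b": {"Thomson Reuters", "Linked Data"},
    "c": {"Publishing", "Drupal"},
    "d": {"Seattle"},
}

print(recommend("a", tags_by_article))
```

A production recommender would likely weight tags by relevance and recency; plain set overlap is used here only to show how metadata turns into a ranking.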
11. New Publishers to tap OpenCalais include
• The New Republic: The new TNR.com uses OpenPublish, an
OpenCalais-enabled Drupal-powered CMS to increase editorial productivity
& drive reader engagement.
• Al Jazeera English’s new blogging network: uses
OpenCalais for content operations & tagging; features Al Jazeera
correspondents from around the world.
• Slate Magazine’s News Dots Network: visualizes the
most recent topics in the news as a concise network of related topics.
• I *heart* Sea: a hyper-local news aggregation site that collects some
of the best blogs in Seattle, especially those serving the Capitol Hill area.
12. Media Monitoring and Intelligence Tools
• Meltwater: a rapidly growing SaaS-based provider of corporate IR
& PR services.
• Tattler (app): an open source topic monitoring tool for today's Web.
Tattler finds and aggregates content from the Web on topics users ask it to
monitor.
• Interceder: a social media monitoring tool that makes it easy to track
trending topics and search the latest content from major news Web sites,
blogs, Twitter and YouTube.
• AskJot: a tool for analyzing web pages for keywords, and displaying
them as links to search results from services around the Web.
13. New Content Experiences / Open Research
• Feedly: a Firefox plug-in that brings together user-selected inputs from Google
Reader, Twitter, RSS feeds, etc. in an easy-to-read magazine-style format.
• OpenPublish: a new CMS based on Drupal that integrates
OpenCalais from the ground up and is tailored to the needs of
today's online publishers & media providers.
• DocumentCloud: founded by reporters from The NYT and
ProPublica, and funded by the Knight Foundation, DocumentCloud will offer
public access to news reporters’ original source materials.
• MediaCloud: an open research tool from Harvard’s Berkman Center
that aggregates mainstream media and blogs to enable researchers to
identify how and where news coverage starts, what we’re missing, etc.
14. Why Thomson Reuters Cares
• Its mission is to connect all the world’s business-
relevant content to provide professionals with ‘intelligent
information.’
• The days of surviving
as a ‘walled garden’ of
content are over.
• ‘Crowdsourcing’ Q&A
creates faster, better,
stronger software.