Semantics-Empowered Understanding, Analysis and Mining of Nontraditional and Unstructured Data
WSU & AFRL Window-on-Science Seminar on Data Mining
Amit P. Sheth, LexisNexis Ohio Eminent Scholar
Director, Kno.e.sis Center, Wright State University
knoesis.org
Thanks: K. Gomadam, M. Nagarajan, C. Thomas, C. Henson, C. Ramakrishnan, P. Jain and Kno.e.sis Researchers
Data & Knowledge Ecosystem
  Situational Awareness, Decision Support, Insight, Knowledge Discovery
  Analysis (e.g., Patterns), Understanding & Perception, Data Mining, Integration, Search, Browsing
  Multimedia Data; Structured, Semistructured, Unstructured Data
  Textual Data: Scientific Literature, Web Pages, News, Blogs, Reports, Wiki, Forums, Comments, Tweets
  Experimental Data, Observational Data, Transactional Data
Some examples of R&D we have done
  Semantic Search & Ranking of Stories and Reports – connecting-the-dots applications (insider threat, financial risk analysis)
  Mining of biomedical (scientific) literature (extraction of entities and relationships) – discovering hidden public knowledge
  Semantic Integration, Analysis and Decision Support over Sensor Data
  Extracting taxonomy/domain model from Wikipedia
  Discovering Hidden Relationships (insights) in Community Created Content (Wikipedia)
  Understanding User Generated Content (on Social Networking Sites)*
    What are people talking about
    How people write
    Why people write
    With application to Artist Popularity Ranking
  Advertisement on Social Media
  Identifying Social Signals – spatio-temporal-thematic analysis of Citizen Sensor Data
  * Meena Nagarajan
[Figure: architecture layers]
  Search, Integration, Analysis, Discovery, Question Answering, Situational Awareness
  Domain Models; Patterns / Inference / Reasoning; RDB; Relationship Web
  Meta data / Semantic Annotations
  Metadata Extraction
  Multimedia Content and Web data, Text, Sensor Data, Structured and Semi-structured data
Insider threat demo (semantic search/querying, ranking, …)
Knowledge Discovery from Scientific Literature
Cartic Ramakrishnan
What Knowledge Discovery is NOT
  Search
    Keyword-in-document-out
    Keywords are fully specified features of expected outcome
    Searching for prospective mining sites
  Mining
    Know where to look
    Underspecified characteristics of what is sought are available
    Patterns
What is knowledge discovery?
  “knowledge discovery is more like sifting through a warehouse filled with small gears, levers, etc., none of which is particularly valuable by itself. After appropriate assembly, however, a Rolex watch emerges from the disparate parts.” – James Caruther
  “discovery is often described as more opportunistic search in a less well-defined space, leading to a psychological element of surprise” – James Buchanan
  Opportunistic search over an ill-defined space leading to surprising but useful emergent knowledge
Element of surprise – Swanson’s discoveries
  [Figure: Migraine and Magnesium linked through Stress?, Calcium Channel Blockers and Spreading Cortical Depression; source: PubMed]
  11 possible associations found
  Associations discovered based on keyword searches followed by manual analysis of text to establish possible relevant relationships
Knowledge Discovery over text
  Text
  Assigning interpretation to text: extraction of semantics from text
  Semantic metadata in the form of semi-structured data
  Semantic Metadata Guided Knowledge Explorations
  Semantic Metadata Guided Knowledge Discovery
  Triple-based Semantic Search, Semantic browser, Subgraph discovery
Information Extraction via Ontology assisted text mining – Relationship extraction
  [Figure: UMLS Semantic Network classes (Biologically active substance, Lipid, Disease or Syndrome) with relationships complicates, affects and causes; MeSH terms Fish Oils and Raynaud’s Disease linked to classes by instance_of, with the relationship between the two terms marked ???????; PubMed document sets of 4733, 9284 and 5 documents]
Background knowledge and Data used
  UMLS – A high level schema of the biomedical domain
    136 classes and 49 relationships
    Synonyms of all relationships – using variant lookup (tools from NLM)
    49 relationships + their synonyms = ~350 verbs
  MeSH
    22,000+ topics organized as a forest of 16 trees
    Used to query PubMed
  PubMed
    Over 16 million abstracts
    Abstracts annotated with one or more MeSH terms
Method – Parse Sentences in PubMed
  SS-Tagger (University of Tokyo)
  SS-Parser (University of Tokyo)
  Entities (MeSH terms) in sentences occur in modified forms
    “adenomatous” modifies “hyperplasia”
    “An excessive endogenous or exogenous stimulation” modifies “estrogen”
  Entities can also occur as composites of 2 or more other entities
    “adenomatous hyperplasia” and “endometrium” occur as “adenomatous hyperplasia of the endometrium”
  (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
Method – Identify entities and relationships in Parse Tree
  [Figure: parse tree of the example sentence annotated with Modifiers, Modified entities and Composite Entities. Labels shown: UMLS ID T147 (at the verb “induces”), MeSH ID D004967 (estrogen), MeSH ID D006965 (hyperplasia), MeSH ID D004717 (endometrium)]
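A minimal sketch of how the bracketed parse of the example sentence could be read into a tree and scanned for noun phrases and verbs. This is an illustration only, not the rule set used in the talk; the helper names `parse_tree`, `leaves` and `phrases` are hypothetical.

```python
def parse_tree(text):
    """Read a bracketed parse string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())      # nested constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def leaves(node):
    """Collect the words under a node, left to right."""
    _, children = node
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


def phrases(node, wanted):
    """Yield the text of every constituent labelled `wanted`, in pre-order."""
    label, children = node
    if label == wanted:
        yield " ".join(leaves(node))
    for child in children:
        if isinstance(child, tuple):
            yield from phrases(child, wanted)


PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)

tree = parse_tree(PARSE)
noun_phrases = list(phrases(tree, "NP"))  # candidate (composite) entities
verbs = list(phrases(tree, "VBZ"))        # candidate relationship words
```

The nested noun phrases show why composite entities matter: both “adenomatous hyperplasia” and “adenomatous hyperplasia of the endometrium” appear as candidates.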
Representation – Resulting RDF
  [Figure: RDF graph showing Modifiers, Modified entities and Composite Entities]
Preliminary Results
  Swanson’s discoveries – Associations between Migraine and Magnesium [Hearst99]
    stress is associated with migraines
    stress can lead to loss of magnesium
    calcium channel blockers prevent some migraines
    magnesium is a natural calcium channel blocker
    spreading cortical depression (SCD) is implicated in some migraines
    high levels of magnesium inhibit SCD
    migraine patients have high platelet aggregability
    magnesium can suppress platelet aggregability
  Data sets generated using these entities (marked red above) as boolean keyword queries against PubMed
  Bidirectional breadth-first search used to find paths in resulting RDF
Paths between Migraine and Magnesium
  Paths are considered interesting if they have one or more named relationships other than hasPart or hasModifiers in them
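A minimal sketch of path finding between two entities with a bidirectional breadth-first search, followed by the "interesting path" filter. The triples and predicate names below are hypothetical stand-ins for the extracted RDF, edges are traversed in both directions as a simplifying assumption, and the search returns a connecting path that is not guaranteed to be the shortest one.

```python
from collections import defaultdict

# Hypothetical triples for illustration only; not the talk's extracted data.
TRIPLES = [
    ("stress", "associated_with", "migraine"),
    ("stress", "leads_to_loss_of", "magnesium"),
    ("calcium_channel_blockers", "prevent", "migraine"),
    ("magnesium", "isa", "calcium_channel_blockers"),
    ("magnesium", "hasPart", "magnesium_ion"),
]
STRUCTURAL = {"hasPart", "hasModifiers"}


def build_graph(triples):
    """Adjacency list that ignores edge direction (an assumption of this sketch)."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
        graph[obj].append((pred, subj))
    return graph


def join(parents, meet):
    """Stitch the two half-paths together at the meeting node."""
    left, node = [], meet
    while parents[0][node] is not None:
        prev, pred = parents[0][node]
        left = [prev, pred] + left
        node = prev
    path, node = left + [meet], meet
    while parents[1][node] is not None:
        nxt, pred = parents[1][node]
        path += [pred, nxt]
        node = nxt
    return path


def bidirectional_bfs(graph, start, goal):
    """Return [node, predicate, node, ...] linking start and goal, or None."""
    if start == goal:
        return [start]
    parents = ({start: None}, {goal: None})  # one visited map per direction
    frontiers = [[start], [goal]]
    while frontiers[0] and frontiers[1]:
        side = 0 if len(frontiers[0]) <= len(frontiers[1]) else 1
        next_frontier = []
        for node in frontiers[side]:
            for pred, nbr in graph[node]:
                if nbr in parents[side]:
                    continue
                parents[side][nbr] = (node, pred)
                if nbr in parents[1 - side]:  # the two searches have met
                    return join(parents, nbr)
                next_frontier.append(nbr)
        frontiers[side] = next_frontier
    return None


def is_interesting(path):
    """True if at least one predicate is a named, non-structural relationship."""
    return any(pred not in STRUCTURAL for pred in path[1::2])


graph = build_graph(TRIPLES)
path = bidirectional_bfs(graph, "migraine", "magnesium")
```

Expanding the smaller frontier first keeps both searches shallow, which is the usual reason to search from both ends on a large graph.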
An example of such a path
CONCLUSION
  Rules over parse trees are able to extract structure from sentences
  Our definitions of compound and modified entities are critical for identifying both implicit and explicit relationships
  Swanson’s discovery can be automated – if recall can be improved – what hurts recall?
Unsupervised Joint Extraction of Compound Entities and Relationship
  Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth, "Unsupervised Discovery of Compound Entities for Relationship Extraction", EKAW 2008 - 16th International Conference on Knowledge Engineering and Knowledge Management: Knowledge Patterns
Joint Extraction approach
  Dependency parse – Stanford Parser (each dependency links a governor to a dependent)
    amod = adjectival modifier
    nsubjpass = nominal subject in passive voice
Algorithm
  [Figure labels: Relationship head, Subject head, Object head]
Preliminary results
Extracted Triples
Semantic Metadata Guided Knowledge Explorations and Discovery
Results
Hypothesis Driven Retrieval of Scientific Literature
  [Figure: complex query over Migraine, Magnesium, Stress, Patient and Calcium Channel Blockers, linked by affects, isa and inhibit; supporting document sets retrieved]
  Keyword query: Migraine[MH] + Magnesium[MH] (PubMed)
Applications
  Triple-based semantic search
  Semantic Browser
Knowledge Discovery = Extraction + Heuristic Aggregation
  Undiscovered Public Knowledge
Understanding, Analyzing, Mining Social Media
Meena Nagarajan, Karthik Gomadam
mumbai, india
november 26, 2008
another chapter in the war against civilization
 and
 the world saw it through the eyes of the people
 the world read it through the words of the people
PEOPLE told their stories to PEOPLE
A powerful new era in Information dissemination had taken firm ground
Making it possible for us to create a global network of citizens
Citizen Sensors – Citizens observing, processing, transmitting, reporting
Spatio-Temporal Analysis
  Identify and extract information from tweets
  Geocoder (Reverse Geo-coding): address to location database
    18 Hormusji Street, Colaba
    Vasant Vihar
  Image Metadata: latitude: 18° 54′ 59.46″ N, longitude: 72° 49′ 39.65″ E
  Structured Meta Extraction
    Nariman House
    Income Tax Office
Research Challenge #1: Spatio Temporal and Thematic analysis
  What else happened “near” this event location?
  What events occurred “before” and “after” this event?
  Any message about “causes” for this event?
Spatial Analysis…
  Which tweets originated from an address near 18.916517°N 72.827682°E?
  Which tweets originated during Nov 27th 2008, from 11PM to 12 PM?
  Giving us: tweets originated from an address near 18.916517°N, 72.827682°E during the time interval 27th Nov 2008 between 11PM to 12PM
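A minimal sketch of the combined spatial and temporal filter. The tweet records, the 1 km radius and the field names are hypothetical, and the time window is read here as the hour starting at 11 PM on 27 Nov 2008, which is an assumption about the slide's wording.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tweets_near(tweets, lat, lon, radius_km, start, end):
    """Keep tweets inside the radius whose timestamp falls in [start, end)."""
    return [
        t for t in tweets
        if start <= t["time"] < end
        and haversine_km(lat, lon, t["lat"], t["lon"]) <= radius_km
    ]


# Hypothetical geocoded tweets; coordinates and times are made up for illustration.
TWEETS = [
    {"id": "t1", "lat": 18.9165, "lon": 72.8277, "time": datetime(2008, 11, 27, 23, 15)},
    {"id": "t2", "lat": 19.0760, "lon": 72.8777, "time": datetime(2008, 11, 27, 23, 20)},
    {"id": "t3", "lat": 18.9165, "lon": 72.8277, "time": datetime(2008, 11, 27, 21, 0)},
]

hits = tweets_near(
    TWEETS, 18.916517, 72.827682, radius_km=1.0,
    start=datetime(2008, 11, 27, 23, 0), end=datetime(2008, 11, 28, 0, 0),
)
hit_ids = [t["id"] for t in hits]
```

Only the first tweet passes both tests: the second is too far away and the third is outside the time window.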
Research Challenge #2: Understanding and Analyzing Casual Text
  Casual text
    Microblogs are often written in SMS style language
    Slang, abbreviations
Understanding Casual Text
  Not the same as news articles or scientific literature
    Grammatical errors
      Implications on NL parser results
    Inconsistent writing style
      Implications on learning algorithms that generalize from corpus
Nature of Microblogs
  Additional constraint of limited context
    Max. of x chars in a microblog
    Context often provided by the discourse
  Entity identification and disambiguation
    Pre-requisite to other sophisticated information analytics
NL understanding is hard to begin with..
  Not so hard
    “commando raid appears to be nigh at Oberoi now”
    Oberoi = Oberoi Hotel, Nigh = high
  Challenging
    “new wing, live fire @ taj 2nd floor on iDesi TV stream”
    Fire on the second floor of the Taj hotel, not on iDesi TV
Research Opportunities
  NER, disambiguation in casual, informal text is a budding area of research
  Another important area of focus: combining information of varied quality from
    a corpus (statistical NLP)
    domain knowledge (tags, folksonomies, taxonomies, ontologies)
    social context (explicit and implicit communities)
Social Context surrounding content
  Social context in which a message appears is also an added valuable resource
  Post 1: “Hareemane House hostages said by eyewitnesses to be Jews. 7 Gunshots heard by reporters at Taj”
  Follow up post: that is Nariman House, not (Hareemane)
Understanding content … informal text
  I say: “Your music is wicked”
  What I really mean: “Your music is good”
[Figure: annotating informal text with domain knowledge]
  Informal Text (Social Network chatter): “Your smile rocks Lil”
  Urban Dictionary: sentiment expression “Rocks” transliterates to: cool, good
  MusicBrainz Taxonomy: Artist: Lilly Allen; Track: Smile
  Semantic Metadata: Smile is a Track; Lil transliterates to Lilly Allen; Lilly Allen is an Artist
  Other inputs shown: Structured text (biomedical literature), Multimedia Content and Web data, Web Services
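A minimal sketch of dictionary-driven annotation for the comment “Your smile rocks Lil”. The three lookup tables are tiny hypothetical stand-ins for Urban Dictionary and the MusicBrainz taxonomy, and a track is only accepted when its artist is also mentioned, which is one simple way to use context for disambiguation.

```python
# Hypothetical lookup tables standing in for the real knowledge sources.
SLANG_SENTIMENT = {"rocks": "good", "wicked": "good"}  # Urban Dictionary style
ARTIST_ALIASES = {"lil": "Lilly Allen"}                 # nickname -> artist
TRACK_ARTIST = {"smile": "Lilly Allen"}                 # track -> artist


def annotate(comment):
    """Spot artists, tracks and sentiment words in one informal comment."""
    tokens = [tok.strip(".,!?").lower() for tok in comment.split()]
    artists = [ARTIST_ALIASES[t] for t in tokens if t in ARTIST_ALIASES]
    # Accept a track only if its artist is mentioned in the same comment.
    tracks = [
        t.capitalize() for t in tokens
        if t in TRACK_ARTIST and TRACK_ARTIST[t] in artists
    ]
    sentiment = [SLANG_SENTIMENT[t] for t in tokens if t in SLANG_SENTIMENT]
    return {"artists": artists, "tracks": tracks, "sentiment": sentiment}


result = annotate("Your smile rocks Lil")
```

Without the artist check, “smile” in an unrelated comment would be tagged as a track, which is the kind of ambiguity the domain model is meant to resolve.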
Example: Pulse of a Community
  Imagine millions of such informal opinions
  Individual expressions to mass opinions
  “Popular artists” lists from MySpace comments
    Lilly Allen
    Lady Sovereign
    Amy Winehouse
    Gorillaz
    Coldplay
    Placebo
    Sting
    Kean
    Joss Stone
What Drives the Spatio-Temporal-Thematic Analysis and Casual Text Understanding
  Semantics with the help of Domain Models (ontologies, folksonomies)
Domain Knowledge: A key driver
  Places that are nearby ‘Nariman house’: spatial query
  Messages originated around this place: temporal analysis
  Messages about related events / places: thematic analysis
Research Challenge #3: But where does the Domain Knowledge come from?
  Expert and committee based ontology creation … works in some domains (e.g., biomedicine, health care, …)
  Community driven knowledge extraction
    How to create models that are “socially scalable”?
    How to organically grow and maintain this model?
Building models … seed word to hierarchy creation using WIKIPEDIA
  Query: “cognition”
Identifying relationships: Hard, harder than many hard things But NOT that Hard, When WE do it
Games with a purpose
  Get humans to give their solitaire time
  Solve real hard computational problems
  Image tagging, identifying part of an image
  Tag a tune, Squigl, Verbosity, and Matchin
  Pioneered by Luis von Ahn
OntoLablr
  Relationship Identification Game
  [Figure: candidate relationships “leads to” and “causes” between Explosion and Traffic congestion]
How do you get comprehensive situational awareness by merging “human sensing” and “machine sensing”?
Research Challenge #4: Semantic Sensor Web
Semantically Annotated O&M
<swe:component name="time">
	<swe:Time definition="urn:ogc:def:phenomenon:time" uom="urn:ogc:def:unit:date-time">
		<sa:swe rdfa:about="?time" rdfa:instanceof="time:Instant">
			<sa:sml rdfa:property="xs:date-time"/>
		</sa:swe>
	</swe:Time>
</swe:component>
<swe:component name="measured_air_temperature">
	<swe:Quantity definition="urn:ogc:def:phenomenon:temperature" uom="urn:ogc:def:unit:fahrenheit">
		<sa:swe rdfa:about="?measured_air_temperature" rdfa:instanceof="senso:TemperatureObservation">
			<sa:swe rdfa:property="weather:fahrenheit"/>
			<sa:swe rdfa:rel="senso:occurred_when" resource="?time"/>
			<sa:swe rdfa:rel="senso:observed_by" resource="senso:buckeye_sensor"/>
		</sa:swe>
	</swe:Quantity>
</swe:component>
<swe:value name="weather-data">2008-03-08T05:00:00,29.1</swe:value>
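A minimal sketch of reading the `swe:value` payload from the example, assuming the two comma-separated fields follow the order of the declared components (time first, then air temperature in degrees Fahrenheit). The function names are hypothetical and the Celsius conversion is added only for illustration.

```python
from datetime import datetime


def parse_observation(value_text):
    """Split 'timestamp,reading' into a datetime and a float."""
    timestamp_text, reading_text = value_text.strip().split(",")
    return datetime.fromisoformat(timestamp_text), float(reading_text)


def fahrenheit_to_celsius(deg_f):
    """Standard temperature conversion."""
    return (deg_f - 32.0) * 5.0 / 9.0


observed_at, temp_f = parse_observation("2008-03-08T05:00:00,29.1")
temp_c = fahrenheit_to_celsius(temp_f)
```

The semantic annotations are what tell a consumer that the first field is a `time:Instant` and the second a temperature observation; the raw value string alone carries neither fact.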

More Related Content

Similar to Semantics-Empowered Understanding, Analysis and Mining of Nontraditional and Unstructured Data (20)

Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defenseCartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan
?
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Visualization Approaches for Biomedical Omics Data: Putting It All TogetherVisualization Approaches for Biomedical Omics Data: Putting It All Together
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Nils Gehlenborg
?
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspective
James Hendler
?
Text Analytics for Semantic Computing
Text Analytics for Semantic ComputingText Analytics for Semantic Computing
Text Analytics for Semantic Computing
Meena Nagarajan
?
Navigating the Neuroscience Data Landscape
Navigating the Neuroscience Data LandscapeNavigating the Neuroscience Data Landscape
Navigating the Neuroscience Data Landscape
Neuroscience Information Framework
?
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉúNAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
ysuzuki-naist
?
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
Paul Groth
?
Knowledge Discovery And Data Mining Of Free Text Final
Knowledge Discovery And Data Mining Of Free Text FinalKnowledge Discovery And Data Mining Of Free Text Final
Knowledge Discovery And Data Mining Of Free Text Final
kdjamies
?
Brief introduction to Bioinformatics
Brief introduction to BioinformaticsBrief introduction to Bioinformatics
Brief introduction to Bioinformatics
Cynthia Alexander Rascon
?
Resesarch types
Resesarch typesResesarch types
Resesarch types
Nits Kedia
?
Resesarch types
Resesarch typesResesarch types
Resesarch types
StudsPlanet.com
?
Visualizing and Making Sense of Information
Visualizing and Making Sense of InformationVisualizing and Making Sense of Information
Visualizing and Making Sense of Information
PARC, a Xerox company
?
Semantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial ResearchSemantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial Research
Ontotext
?
Human Brain Essay.pdf
Human Brain Essay.pdfHuman Brain Essay.pdf
Human Brain Essay.pdf
Jennifer Reese
?
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
University of Washington
?
Connected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul GrothConnected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul Groth
Connected Data World
?
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...
Innovation Quotient Pvt Ltd
?
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domain
Angelo Salatino
?
Guided visual exploration of patient stratifications in cancer genomics
Guided visual exploration of patient stratifications in cancer genomicsGuided visual exploration of patient stratifications in cancer genomics
Guided visual exploration of patient stratifications in cancer genomics
Nils Gehlenborg
?
Ibm cognitive seminar march 2015 watsonsim final
Ibm cognitive seminar march 2015  watsonsim finalIbm cognitive seminar march 2015  watsonsim final
Ibm cognitive seminar march 2015 watsonsim final
diannepatricia
?
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defenseCartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan's dissertation defense
Cartic Ramakrishnan
?
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Visualization Approaches for Biomedical Omics Data: Putting It All TogetherVisualization Approaches for Biomedical Omics Data: Putting It All Together
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Nils Gehlenborg
?
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspective
James Hendler
?
Text Analytics for Semantic Computing
Text Analytics for Semantic ComputingText Analytics for Semantic Computing
Text Analytics for Semantic Computing
Meena Nagarajan
?
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉúNAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
NAIST¥Ó¥Ã¥°¥Ç©`¥¿¥·¥ó¥Ý¥¸¥¦¥à - Çéˆó Ëɱ¾ÏÈÉú
ysuzuki-naist
?
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
Paul Groth
?
Knowledge Discovery And Data Mining Of Free Text Final
Knowledge Discovery And Data Mining Of Free Text FinalKnowledge Discovery And Data Mining Of Free Text Final
Knowledge Discovery And Data Mining Of Free Text Final
kdjamies
?
Visualizing and Making Sense of Information
Visualizing and Making Sense of InformationVisualizing and Making Sense of Information
Visualizing and Making Sense of Information
PARC, a Xerox company
?
Semantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial ResearchSemantic Data Normalization For Efficient Clinical Trial Research
Semantic Data Normalization For Efficient Clinical Trial Research
Ontotext
?
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
University of Washington
?
Connected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul GrothConnected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul Groth
Connected Data World
?
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...
Innovation Quotient Pvt Ltd
?
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domain
Angelo Salatino
?
Guided visual exploration of patient stratifications in cancer genomics
Guided visual exploration of patient stratifications in cancer genomicsGuided visual exploration of patient stratifications in cancer genomics
Guided visual exploration of patient stratifications in cancer genomics
Nils Gehlenborg
?
Ibm cognitive seminar march 2015 watsonsim final
Ibm cognitive seminar march 2015  watsonsim finalIbm cognitive seminar march 2015  watsonsim final
Ibm cognitive seminar march 2015 watsonsim final
diannepatricia
?

Recently uploaded (20)

Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4
Margaret Maynard-Reid
?
UiPath Automation Developer Associate Training Series 2025 - Session 1
UiPath Automation Developer Associate Training Series 2025 - Session 1UiPath Automation Developer Associate Training Series 2025 - Session 1
UiPath Automation Developer Associate Training Series 2025 - Session 1
DianaGray10
?
Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)
nick896721
?
The Future of Repair: Transparent and Incremental by Botond De?nes
The Future of Repair: Transparent and Incremental by Botond De?nesThe Future of Repair: Transparent and Incremental by Botond De?nes
The Future of Repair: Transparent and Incremental by Botond De?nes
ScyllaDB
?
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
Safe Software
?
A Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin EngineeringA Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin Engineering
Daniel Lehner
?
BoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is DynamicBoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is Dynamic
Ortus Solutions, Corp
?
Q4 2024 Earnings and Investor Presentation
Q4 2024 Earnings and Investor PresentationQ4 2024 Earnings and Investor Presentation
Q4 2024 Earnings and Investor Presentation
Dropbox
?
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOTSMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
TanmaiArni
?
DevNexus - Building 10x Development Organizations.pdf
DevNexus - Building 10x Development Organizations.pdfDevNexus - Building 10x Development Organizations.pdf
DevNexus - Building 10x Development Organizations.pdf
Justin Reock
?
UiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilitiesUiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilities
DianaGray10
?
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Jonathan Bowen
?
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog GavraReplacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
ScyllaDB
?
Unlock AI Creativity: Image Generation with DALL¡¤E
Unlock AI Creativity: Image Generation with DALL¡¤EUnlock AI Creativity: Image Generation with DALL¡¤E
Unlock AI Creativity: Image Generation with DALL¡¤E
Expeed Software
?
What Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI AgentsWhat Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI Agents
Zilliz
?
Both Feet on the Ground - Generative Artificial Intelligence
Both Feet on the Ground - Generative Artificial IntelligenceBoth Feet on the Ground - Generative Artificial Intelligence
Both Feet on the Ground - Generative Artificial Intelligence
Pete Nieminen
?
Future-Proof Your Career with AI Options
Future-Proof Your  Career with AI OptionsFuture-Proof Your  Career with AI Options
Future-Proof Your Career with AI Options
DianaGray10
?
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramentoAIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
Alessandro Bogliolo
?
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Precisely
?
L01 Introduction to Nanoindentation - What is hardness
L01 Introduction to Nanoindentation - What is hardnessL01 Introduction to Nanoindentation - What is hardness
L01 Introduction to Nanoindentation - What is hardness
RostislavDaniel
?
Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4
Margaret Maynard-Reid
?
UiPath Automation Developer Associate Training Series 2025 - Session 1
UiPath Automation Developer Associate Training Series 2025 - Session 1UiPath Automation Developer Associate Training Series 2025 - Session 1
UiPath Automation Developer Associate Training Series 2025 - Session 1
DianaGray10
?
Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)
nick896721
?
The Future of Repair: Transparent and Incremental by Botond De?nes
The Future of Repair: Transparent and Incremental by Botond De?nesThe Future of Repair: Transparent and Incremental by Botond De?nes
The Future of Repair: Transparent and Incremental by Botond De?nes
ScyllaDB
?
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
Safe Software
?
A Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin EngineeringA Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin Engineering
Daniel Lehner
?
BoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is DynamicBoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is Dynamic
Ortus Solutions, Corp
?
Q4 2024 Earnings and Investor Presentation
Q4 2024 Earnings and Investor PresentationQ4 2024 Earnings and Investor Presentation
Q4 2024 Earnings and Investor Presentation
Dropbox
?
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOTSMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
TanmaiArni
?
DevNexus - Building 10x Development Organizations.pdf
DevNexus - Building 10x Development Organizations.pdfDevNexus - Building 10x Development Organizations.pdf
DevNexus - Building 10x Development Organizations.pdf
Justin Reock
?
UiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilitiesUiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilities
DianaGray10
?
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Formal Methods: Whence and Whither? [Martin Fr?nzle Festkolloquium, 2025]
Jonathan Bowen
?
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog GavraReplacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
Replacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra
ScyllaDB
?
Unlock AI Creativity: Image Generation with DALL¡¤E
Unlock AI Creativity: Image Generation with DALL¡¤EUnlock AI Creativity: Image Generation with DALL¡¤E
Unlock AI Creativity: Image Generation with DALL¡¤E
Expeed Software
?
What Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI AgentsWhat Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI Agents
Zilliz
?
Both Feet on the Ground - Generative Artificial Intelligence
Both Feet on the Ground - Generative Artificial IntelligenceBoth Feet on the Ground - Generative Artificial Intelligence
Both Feet on the Ground - Generative Artificial Intelligence
Pete Nieminen
?
Future-Proof Your Career with AI Options
Future-Proof Your  Career with AI OptionsFuture-Proof Your  Career with AI Options
Future-Proof Your Career with AI Options
DianaGray10
?
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramentoAIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
AIXMOOC 2.3 - Modelli di reti neurali con esperimenti di addestramento
Alessandro Bogliolo
?
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Precisely
?
L01 Introduction to Nanoindentation - What is hardness
L01 Introduction to Nanoindentation - What is hardnessL01 Introduction to Nanoindentation - What is hardness
L01 Introduction to Nanoindentation - What is hardness
RostislavDaniel
?

Semantics-Empowered Understanding, Analysis and Mining of Nontraditional and Unstructured Data

  • 1. 1
  • 2. Semantics-Empowered Understanding, Analysis and Mining of Nontraditional and Unstructured DataWSU & AFRL Window-on-Science Seminar on Data MiningAmit P. Sheth,LexisNexis Ohio Eminent ScholarDirector, Kno.e.sis center, Wright State Universityknoesis.orgThanks: K. Gomadam, M. Nagarajan, C. Thomas, C. Henson, C. Ramakrishnan, P. Jain and Kno.e.sis Researchers
  • 3. Data & Knowledge Ecosystem3Situational AwarenessDecision SupportInsightKnowledge DiscoveryAnalysis (eg Patterns)Understanding & PerceptionData MiningIntegrationSearchBrowsingMultimedia DataStructured,SemistructuredUnstructuredDataTextual Data: Scientific Literature, Web Pages, News, Blogs, Reports, Wiki, Forums, Comments, Tweets Experimental DataObservational DataTransactional Data
  • 4. Some examples of R&D we have done: semantic search and ranking of stories and reports – "connecting the dots" applications (insider threat, financial risk analysis); mining of biomedical (scientific) literature (extraction of entities and relationships) – discovering hidden public knowledge; semantic integration, analysis and decision support over sensor data; extracting a taxonomy/domain model from Wikipedia; discovering hidden relationships (insights) in community-created content (Wikipedia).
  • 5. Understanding User Generated Content (on Social Networking Sites)*: what people are talking about, how people write, why people write; with application to artist popularity ranking.
  • 7. Identifying Social Signals – spatio-temporal-thematic analysis of Citizen Sensor Data. (* Meena Nagarajan)
  • 8. Search, Integration, Analysis, Discovery, Question Answering, Situational Awareness; Domain Models; Patterns / Inference / Reasoning; RDB; Relationship Web; Metadata / Semantic Annotations; Metadata Extraction; Multimedia Content and Web Data; Text; Sensor Data; Structured and Semi-structured Data.
  • 9. Insider threat demo (semantic search/querying, ranking, …)
  • 10. Knowledge Discovery from Scientific Literature. Cartic Ramakrishnan
  • 11. What Knowledge Discovery is NOT. Search: keyword in, document out; keywords are fully specified features of the expected outcome; searching for prospective mining sites. Mining: know where to look; underspecified characteristics of what is sought are available; patterns.
  • 12. What is knowledge discovery? “Knowledge discovery is more like sifting through a warehouse filled with small gears, levers, etc., none of which is particularly valuable by itself. After appropriate assembly, however, a Rolex watch emerges from the disparate parts.” – James Caruther. “Discovery is often described as more opportunistic search in a less well-defined space, leading to a psychological element of surprise.” – James Buchanan. In short: opportunistic search over an ill-defined space leading to surprising but useful emergent knowledge.
  • 13. Element of surprise – Swanson’s discoveries. Stress? Magnesium, Migraine, Calcium Channel Blockers, Spreading Cortical Depression: 11 possible associations found in PubMed. The associations were discovered through keyword searches followed by manual analysis of the text to establish possibly relevant relationships.
  • 14. Knowledge Discovery over text. Text: assigning interpretation to text (extraction of semantics from text) yields semantic metadata in the form of semi-structured data. Semantic metadata guided knowledge exploration and semantic metadata guided knowledge discovery: triple-based semantic search, semantic browser, subgraph discovery.
  • 15. Information Extraction via ontology-assisted text mining – relationship extraction. [Figure: UMLS Semantic Network classes (Biologically Active Substance, Lipid, Disease or Syndrome) linked by the relationships complicates, affects and causes; the MeSH terms Fish Oils and Raynaud’s Disease are each attached by instance_of, with the relationship between them marked “?”; PubMed document counts shown: 4733 documents, 9284 documents, 5 documents]
  • 16. Background knowledge and data used. UMLS – a high-level schema of the biomedical domain: 136 classes and 49 relationships; synonyms of all relationships obtained using variant lookup (tools from NLM); 49 relationships + their synonyms = ~350 verbs. MeSH – 22,000+ topics organized as a forest of 16 trees; used to query PubMed. PubMed – over 16 million abstracts; abstracts are annotated with one or more MeSH terms.
  • 17. Method – Parse sentences in PubMed. SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo). Entities (MeSH terms) in sentences occur in modified forms:
  • 19. “An excessive endogenous or exogenous stimulation” modifies “estrogen”
  • 20. Entities can also occur as composites of 2 or more other entities
  • 21. “adenomatous hyperplasia” and “endometrium” occur as “adenomatous hyperplasia of the endometrium”. Parse: (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
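The bracketed parse on this slide can be read mechanically. The following is a minimal sketch in plain Python, with no NLP library assumed; `parse_tree`, `words` and `phrases` are hypothetical helper names and are not part of SS-Parser or of the authors' pipeline. It only shows how noun phrases such as the composite entity could be pulled out of the bracket notation.

```python
def parse_tree(text):
    """Parse a Penn Treebank-style bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read()


def words(node):
    """Return the leaf words under a node, in sentence order."""
    _label, children = node
    out = []
    for child in children:
        out.extend(words(child) if isinstance(child, tuple) else [child])
    return out


def phrases(node, wanted="NP"):
    """Yield the text of every constituent whose label equals `wanted` (pre-order)."""
    label, children = node
    if label == wanted:
        yield " ".join(words(node))
    for child in children:
        if isinstance(child, tuple):
            yield from phrases(child, wanted)


SENTENCE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous)) (NN stimulation)) (PP (IN by) (NP (NN estrogen)))) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia)) "
    "(PP (IN of) (NP (DT the) (NN endometrium)))))))"
)

tree = parse_tree(SENTENCE)
noun_phrases = list(phrases(tree))
```

Here `noun_phrases` includes both the composite "adenomatous hyperplasia of the endometrium" and its parts, which is the nesting the slide calls out.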
  • 22. Method – Identify entities and relationships in the parse tree. [Figure: parse tree of the example sentence with modifiers, modified entities and composite entities marked; annotations shown: UMLS ID T147, MeSH ID D004967, MeSH ID D006965, MeSH ID D004717]
  • 23. Representation – Resulting RDF. [Figure: RDF graph showing modifiers, modified entities and composite entities]
  • 24. Preliminary Results. Swanson’s discoveries – associations between Migraine and Magnesium [Hearst99]: stress is associated with migraines
  • 25. stress can lead to loss of magnesium
  • 27. magnesium is a natural calcium channel blocker
  • 28. spreading cortical depression (SCD) is implicated in some migraines
  • 29. high levels of magnesium inhibit SCD
  • 30. migraine patients have high platelet aggregability
  • 31. magnesium can suppress platelet aggregability. Data sets were generated using these entities (marked red above) as Boolean keyword queries against PubMed. Bidirectional breadth-first search was used to find paths in the resulting RDF.
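The path search named on this slide can be sketched as follows. This is a minimal illustration, assuming the extracted RDF is available as (subject, predicate, object) triples and that edges may be walked in either direction; the function name `bidirectional_bfs` and the sample predicate names are hypothetical and are not taken from the original system.

```python
from collections import deque


def bidirectional_bfs(edges, source, target):
    """Return a path with the fewest edges between source and target, or None.

    `edges` is an iterable of (subject, predicate, object) triples,
    treated as undirected for the purpose of path finding.
    """
    neighbors = {}
    for s, _p, o in edges:
        neighbors.setdefault(s, set()).add(o)
        neighbors.setdefault(o, set()).add(s)
    if source == target:
        return [source]
    if source not in neighbors or target not in neighbors:
        return None

    # The parent maps double as the visited sets of each search direction.
    parents_fwd, parents_bwd = {source: None}, {target: None}
    frontier_fwd, frontier_bwd = deque([source]), deque([target])

    def expand(frontier, parents, other_parents):
        """Expand one full BFS level; return a meeting node if the searches touch."""
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in neighbors[node]:
                if nxt not in parents:
                    parents[nxt] = node
                    if nxt in other_parents:
                        return nxt
                    frontier.append(nxt)
        return None

    while frontier_fwd and frontier_bwd:
        # Always grow the smaller frontier to keep the search balanced.
        if len(frontier_fwd) <= len(frontier_bwd):
            meet = expand(frontier_fwd, parents_fwd, parents_bwd)
        else:
            meet = expand(frontier_bwd, parents_bwd, parents_fwd)
        if meet is not None:
            path = []
            node = meet
            while node is not None:          # walk back to the source
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meet]
            while node is not None:          # walk forward to the target
                path.append(node)
                node = parents_bwd[node]
            return path
    return None


# Hypothetical triples paraphrasing the associations listed above.
EDGES = [
    ("stress", "associated_with", "migraine"),
    ("stress", "leads_to_loss_of", "magnesium"),
    ("magnesium", "inhibits", "spreading cortical depression"),
    ("spreading cortical depression", "implicated_in", "migraine"),
    ("magnesium", "suppresses", "platelet aggregability"),
    ("migraine", "has_high", "platelet aggregability"),
]

path = bidirectional_bfs(EDGES, "migraine", "magnesium")
```

Searching from both ends keeps each frontier small, which matters when the graph is built from thousands of abstracts.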
  • 32. Paths between Migraine and Magnesium. Paths are considered interesting if they contain one or more named relationships other than hasPart or hasModifiers.
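The "interesting path" rule can be written as a one-line filter. This is a sketch under the assumption that a path is a list of (subject, predicate, object) triples; the example entities and the predicate `inhibits` are made up for illustration.

```python
# Structural predicates that, on their own, do not make a path interesting.
STRUCTURAL = {"hasPart", "hasModifiers"}


def is_interesting(path_edges):
    """True if at least one edge on the path is a named (non-structural) relationship."""
    return any(pred not in STRUCTURAL for _s, pred, _o in path_edges)


# Hypothetical paths: the first uses only structural edges, the second does not.
structural_only = [
    ("severe migraine", "hasModifiers", "severe"),
    ("severe migraine", "hasPart", "migraine"),
]
with_named_edge = [
    ("high levels of magnesium", "hasPart", "magnesium"),
    ("high levels of magnesium", "inhibits", "spreading cortical depression"),
]
```

Applied to the paths returned by a graph search, this filter drops paths that only restate how a compound entity is assembled.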
  • 33. An example of such a path. CONCLUSION: Rules over parse trees are able to extract structure from sentences
  • 34. Our definitions of compound and modified entities are critical for identifying both implicit and explicit relationships
  • 35. Swanson’s discovery can be automated – if recall can be improved – what hurts recall?
  • 36. Unsupervised Joint Extraction of Compound Entities and Relationships. Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth, "Unsupervised Discovery of Compound Entities for Relationship Extraction", EKAW 2008 – 16th International Conference on Knowledge Engineering and Knowledge Management: Knowledge Patterns.
  • 37. Joint extraction approach. Dependency parse (Stanford Parser): each dependency links a governor and a dependent. amod = adjectival modifier; nsubjpass = nominal subject in passive voice.
  • 38. Algorithm. [Figure labels: relationship head, subject head, object head]
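To make the dependency labels concrete, here is a small sketch of how `amod` and `nsubjpass` edges could be combined into a compound entity and a relationship triple. It is not the authors' algorithm: the dependency triples are hand-written for the sentence "Adenomatous hyperplasia is induced by estrogen" rather than produced by a parser, and the helper names `compound` and `passive_triples` are hypothetical.

```python
def compound(head, deps):
    """Prefix a head noun with its adjectival modifiers (amod dependents)."""
    modifiers = [dep for rel, gov, dep in deps if rel == "amod" and gov == head]
    return " ".join(modifiers + [head])


def passive_triples(deps):
    """Pair each passive verb's agent (logical subject) with its passive subject (object)."""
    triples = []
    for rel, verb, patient in deps:
        if rel != "nsubjpass":
            continue
        for rel2, verb2, agent in deps:
            if rel2 == "agent" and verb2 == verb:
                triples.append((compound(agent, deps), verb, compound(patient, deps)))
    return triples


# Hand-written (relation, governor, dependent) triples; an assumption, not parser output.
DEPS = [
    ("amod", "hyperplasia", "adenomatous"),
    ("nsubjpass", "induced", "hyperplasia"),
    ("auxpass", "induced", "is"),
    ("agent", "induced", "estrogen"),
]

extracted = passive_triples(DEPS)
```

The verb plays the role of the relationship head, and the two nouns play the subject and object heads named on the slide.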
  • 41. Semantic Metadata Guided Knowledge Explorations and Discovery
  • 43. Hypothesis-driven retrieval of scientific literature. Complex query: Migraine, Magnesium, Stress, Patient and Calcium Channel Blockers connected by the relationships affects, isa and inhibit. Supporting document sets are retrieved from PubMed. Keyword query: Migraine[MH] + Magnesium[MH].
  • 45. Knowledge Discovery = Extraction + Heuristic Aggregation. Undiscovered Public Knowledge.
  • 46. Understanding, Analyzing, Mining Social Media. Meena Nagarajan, Karthik Gomadam
  • 49. another chapter in the war against civilization
  • 53. the world saw it through the eyes of the people
  • 54. the world read it through the words of the people
  • 55. PEOPLE told their stories to PEOPLE
  • 56. A powerful new era in Information dissemination had taken firm ground
  • 57. Making it possible for us to create a global network of citizens. Citizen Sensors – citizens observing, processing, transmitting, reporting
  • 58. Geocoder (reverse geocoding): address-to-location database (18 Hormusji Street, Colaba; Vasant Vihar). Image metadata: latitude 18° 54′ 59.46″ N, longitude 72° 49′ 39.65″ E. Structured metadata extraction: Nariman House, Income Tax Office. Identify and extract information from tweets. Spatio-temporal analysis.
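Image metadata usually stores coordinates in degrees, minutes and seconds, while spatial queries use decimal degrees. A minimal conversion sketch follows; the function name `dms_to_decimal` is an assumption for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Coordinates from the image metadata on the slide.
lat = dms_to_decimal(18, 54, 59.46, "N")
lon = dms_to_decimal(72, 49, 39.65, "E")
```

The results agree, to about five decimal places, with the decimal coordinates used in the spatial-analysis question on slide 60.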
  • 59. Research Challenge #1: Spatio-temporal and thematic analysis. What else happened “near” this event location? What events occurred “before” and “after” this event? Any message about “causes” for this event?
  • 60. Spatial analysis … Which tweets originated from an address near 18.916517°N 72.827682°E?
  • 61. Which tweets originated on Nov 27th 2008, from 11 PM to 12 PM?
  • 62. Giving us: Which tweets originated from an address near 18.916517°N, 72.827682°E during the time interval on 27th Nov 2008 between 11 PM and 12 PM?
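The combined spatial and temporal question can be sketched as a filter over geotagged messages. Everything below is illustrative: the tweet records, the 2 km radius and the field names are made up, distance is computed with the haversine formula, and the one-hour window starting at 11 PM is an assumption, since the end time on the slide reads "12 PM".

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def near_and_during(tweets, lat, lon, radius_km, start, end):
    """Keep tweets within radius_km of (lat, lon) and with start <= time < end."""
    return [
        t for t in tweets
        if start <= t["time"] < end
        and haversine_km(lat, lon, t["lat"], t["lon"]) <= radius_km
    ]


# Hypothetical records for illustration only.
TWEETS = [
    {"id": "a", "lat": 18.9220, "lon": 72.8332, "time": datetime(2008, 11, 27, 23, 15)},
    {"id": "b", "lat": 38.9072, "lon": -77.0369, "time": datetime(2008, 11, 27, 23, 20)},
    {"id": "c", "lat": 18.9170, "lon": 72.8280, "time": datetime(2008, 11, 27, 20, 0)},
]

hits = near_and_during(
    TWEETS, 18.916517, 72.827682, radius_km=2.0,
    start=datetime(2008, 11, 27, 23, 0), end=datetime(2008, 11, 28, 0, 0),
)
```

Only the first record passes both tests; the second is too far away and the third falls outside the time window.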
  • 63. Research Challenge #2: Understanding and Analyzing Casual Text. Casual text: microblogs are often written in SMS-style language, with slang and abbreviations.
  • 64. Understanding Casual Text. Not the same as news articles or scientific literature. Grammatical errors: implications for NL parser results. Inconsistent writing style: implications for learning algorithms that generalize from a corpus.
  • 65. Nature of Microblogs. Additional constraint of limited context: max. of x chars in a microblog; context is often provided by the discourse. Entity identification and disambiguation: a prerequisite to other sophisticated information analytics.
  • 66. NL understanding is hard to begin with. Not so hard: “commando raid appears to be nigh at Oberoi now” (Oberoi = Oberoi Hotel, nigh = high). Challenging: “new wing, live fire @ taj 2nd floor on iDesi TV stream” (fire on the second floor of the Taj hotel, not on iDesi TV).
  • 67. Research Opportunities. NER and disambiguation in casual, informal text is a budding area of research. Another important area of focus: combining information of varied quality from a corpus (statistical NLP), domain knowledge (tags, folksonomies, taxonomies, ontologies) and social context (explicit and implicit communities).
  • 68. Social context surrounding content. The social context in which a message appears is also a valuable added resource. Post 1: “Hareemane House hostages said by eyewitnesses to be Jews. 7 Gunshots heard by reporters at Taj”. Follow-up post: “that is Nariman House, not (Hareemane)”.
  • 69. Understanding content … informal text. I say: “Your music is wicked”. What I really mean: “Your music is good”.
  • 70. Informal text (social network chatter): “Your smile rocks Lil”. Urban Dictionary: the sentiment expression “rocks” transliterates to cool, good. MusicBrainz Taxonomy: Artist: Lily Allen; Track: Smile. Semantic metadata: Smile is a Track; Lil transliterates to Lily Allen; Lily Allen is an Artist. [Other figure labels: Structured text (biomedical literature); Multimedia Content and Web data; Web Services]
  • 71. Example: Pulse of a Community. Imagine millions of such informal opinions: from individual expressions to mass opinions. “Popular artists” lists from MySpace comments: Lily Allen, Lady Sovereign, Amy Winehouse, Gorillaz, Coldplay, Placebo, Sting, Keane, Joss Stone.
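Going from individual comments to a popularity list can be sketched as alias lookup plus slang-aware sentiment counting. This is a toy illustration: the `ALIASES` table stands in for a music taxonomy such as MusicBrainz, `SLANG_POLARITY` stands in for a slang dictionary, and both tables, the comments and the scoring rule are assumptions rather than the system described in the talk.

```python
import re
from collections import Counter

# Hypothetical lookup tables; real ones would come from a taxonomy and a slang lexicon.
ALIASES = {"lil": "Lily Allen", "lily": "Lily Allen", "amy": "Amy Winehouse"}
SLANG_POLARITY = {"rocks": 1, "wicked": 1, "sucks": -1}


def score_comments(comments):
    """Sum slang sentiment per artist mentioned in each comment."""
    scores = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        artists = {ALIASES[t] for t in tokens if t in ALIASES}
        polarity = sum(SLANG_POLARITY.get(t, 0) for t in tokens)
        for artist in artists:
            scores[artist] += polarity
    return scores


COMMENTS = [
    "Your smile rocks Lil",
    "lily your music is wicked",
    "amy that last track sucks",
    "Amy rocks",
]

ranking = score_comments(COMMENTS).most_common()
```

A plain alias table cannot tell which artist "Lil" refers to in general; that is the disambiguation problem the earlier slides describe, and it needs the surrounding context to resolve.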
  • 72. What drives the spatio-temporal-thematic analysis and casual text understanding? Semantics, with the help of domain models (ontologies, folksonomies).
  • 73. Domain Knowledge: a key driver. Places that are near ‘Nariman House’: spatial query. Messages that originated around this place: temporal analysis. Messages about related events / places: thematic analysis.
  • 74. Research Challenge #3: But where does the domain knowledge come from? Expert- and committee-based ontology creation … works in some domains (e.g., biomedicine, health care, …). Community-driven knowledge extraction: how to create models that are “socially scalable”? How to organically grow and maintain this model?
  • 75. Building models … seed word to hierarchy creation using Wikipedia. Query: “cognition”
  • 76. Identifying relationships: hard, harder than many hard things. But NOT that hard when WE do it.
  • 77. Games with a purpose. Get humans to give their solitaire time to solve real hard computational problems: image tagging, identifying part of an image. Examples: Tag a Tune, Squigl, Verbosity and Matchin. Pioneered by Luis von Ahn.
  • 80. How do you get comprehensive situational awareness by merging “human sensing” and “machine sensing”?
  • 81. Research Challenge #4: Semantic Sensor Web
  • 82. Semantically Annotated O&M:
    <swe:component name="time">
      <swe:Time definition="urn:ogc:def:phenomenon:time" uom="urn:ogc:def:unit:date-time">
        <sa:swe rdfa:about="?time" rdfa:instanceof="time:Instant">
          <sa:sml rdfa:property="xs:date-time"/>
        </sa:swe>
      </swe:Time>
    </swe:component>
    <swe:component name="measured_air_temperature">
      <swe:Quantity definition="urn:ogc:def:phenomenon:temperature" uom="urn:ogc:def:unit:fahrenheit">
        <sa:swe rdfa:about="?measured_air_temperature" rdfa:instanceof="senso:TemperatureObservation">
          <sa:swe rdfa:property="weather:fahrenheit"/>
          <sa:swe rdfa:rel="senso:occurred_when" resource="?time"/>
          <sa:swe rdfa:rel="senso:observed_by" resource="senso:buckeye_sensor"/>
        </sa:swe>
      </swe:Quantity>
    </swe:component>
    <swe:value name="weather-data">2008-03-08T05:00:00,29.1</swe:value>
  • 83. Semantic Sensor ML – Adding Ontological Metadata. Domain ontology: Person, Company. Spatial ontology: Coordinates, Coordinate System. Temporal ontology: Time Units, Timezone. Source: Mike Botts, "SensorML and Sensor Web Enablement," Earth System Science Center, UAB Huntsville.
  • 84. Semantic Query: Semantic Temporal Query. Model references from SML to OWL-Time ontology concepts provide the ability to perform semantic temporal queries. Supported semantic query operators include: contains: the user-specified interval falls wholly within a sensor reading interval (also called inside); within: the sensor reading interval falls wholly within the user-specified interval (inverse of contains or inside); overlaps: the user-specified interval overlaps the sensor reading interval. Example SPARQL query defining the temporal operator ‘within’.
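The three operators are simple interval comparisons, sketched below. The SPARQL query from the slide is not reproduced in the transcript, so this is only an illustration of the operator semantics as worded above; intervals are modeled as (start, end) tuples, and treating the endpoints as inclusive is an assumption.

```python
from datetime import datetime


def contains(user, reading):
    """User-specified interval falls wholly within the sensor reading interval."""
    return reading[0] <= user[0] and user[1] <= reading[1]


def within(user, reading):
    """Sensor reading interval falls wholly within the user-specified interval."""
    return user[0] <= reading[0] and reading[1] <= user[1]


def overlaps(user, reading):
    """The two intervals share at least one instant (inclusive endpoints assumed)."""
    return user[0] <= reading[1] and reading[0] <= user[1]


# Hypothetical intervals around the sample observation time 2008-03-08T05:00:00.
reading = (datetime(2008, 3, 8, 5, 0), datetime(2008, 3, 8, 6, 0))
wide_query = (datetime(2008, 3, 8, 4, 0), datetime(2008, 3, 8, 7, 0))
narrow_query = (datetime(2008, 3, 8, 5, 15), datetime(2008, 3, 8, 5, 45))
late_query = (datetime(2008, 3, 8, 5, 30), datetime(2008, 3, 8, 8, 0))
```

With these definitions, `within(u, r)` is true exactly when `contains(r, u)` is true, which matches the slide's description of `within` as the inverse of `contains`.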
  • 86. Semantic Sensor Web demo (online); Semantic Sensor Web demo (local)
  • 87. Synthetic but realistic scenario: an image taken from a raw satellite feed
  • 88. Synthetic but realistic scenario: an image taken by a camera phone with an associated label, “explosion.”
  • 89. Synthetic but realistic scenario: textual messages (such as tweets) using STT analysis
  • 90. Synthetic but realistic scenario: correlating to get
  • 91. Create better views (smart mashups)
  • 92. Extracting Social Signals: what are the important topics of discussion and concern in different parts of the world on a particular day; how different cultures or countries are reacting to the same event or situation (e.g., Mumbai attack); how a situation such as the financial crisis is evolving over a period of time in terms of key topics of discussion and issues of concern (e.g., subprime mortgages and foreclosures, followed by troubled banks and credit freeze, followed by massive government intervention and borrowing, and so on). Twitris demo.
  • 93. A few more things. Use of background knowledge. Event extraction from text: time and location extraction; such information may not be present; someone from Washington DC can tweet about Mumbai. Scalable semantic analytics: subgraph and pattern discovery; meaningful subgraphs like relevant and interesting paths; ranking paths.
  • 94. The Sum of the Parts. Spatio-temporal analysis: find out where and when; + Thematic: what and how; + Semantic extraction from text, multimedia and sensor data: tags, time, location, concepts, events; + Semantic models & background knowledge: making better sense of STT; Integration + Semantic Sensor Web: the platform; = Situational Awareness.
  • 95. Kno.e.sis as a case study of a world-class, research-based higher education environment. http://knoesis.org
  • 96. Kno.e.sis Center Labs (3rd Floor, Joshi). Amit Sheth: Semantic Science Lab, Semantic Web Lab, Service Research Lab. TK Prasad: Metadata and Languages Lab. Shaojun Wang: Statistical Machine Learning. Pascal Hitzler: Formal Semantics & Reasoning Lab. Michael Raymer: Bioinformatics Lab. Guozhu Dong: Data Mining Lab. Keke Chen: Data Intensive Analysis and Computing Lab. Kno.e.sis members – a subset.
  • 97. Exceptional students. Six of the senior PhD students: 84 papers, 43 program committees, contributed to winning NIH and NSF grants. Successfully competed with two Stanford PhDs; 1000+ citations in 2 years of his graduation. “BTW, Meena is an absolute find. If all of your other students are as talented, you are very lucky. … I’d definitely like to work with more interns of her caliber, ...” [Dr. Kevin Haas, Director of Search at Yahoo!] “It has been a few years since I visited Dayton (Wright AFB). However, it is clear that Wright State has transformed itself. Congratulations on your success with the Knoesis Center.” [Dr. Alpers Caglayan – looking to hire Kno.e.sis grads]
  • 98. Funding, collaboration, etc. UGA, Stanford, CCHMC, SAIC, HP, IBM, Yahoo!. NIH, NSF, AFRL-HE, AFRL-Sensor, HP, IBM, Microsoft, Google. 70% federal, 19% state, 11% industry. Students intern at the best industry labs & national labs. Graduates very successful.
  • 99. Interested in more background? Semantics-Empowered Social Computing; Semantic Sensor Web; Traveling the Semantic Web through Space, Theme and Time; Relationship Web: Blazing Semantic Trails between Web Resources; Text Mining, Workflow Management, Semantic Web Services, Cloud Computing with application to healthcare, biomedicine, defense/intelligence, energy. Contact/more details: amit @ knoesis.org. Special thanks: Karthik Gomadam, Meena Nagarajan, Christopher Thomas. Partial funding: NSF (Semantic Discovery: IIS: 071441, Spatio Temporal Thematic: IIS-0842129), AFRL and DAGSI (Semantic Sensor Web), Microsoft Research and IBM Research (Analysis of Social Media Content), and HP Research (Knowledge Extraction from Community-Generated Content).

Editor's Notes

  • #51: Microblogs are one of the most powerful ways of talking of CSD
  • #54: Implicit social context is created by people responding to other messages. In this example we show how the system can identify that it is Nariman and not Hareemane.
  • #59: In the scenario, what techniques and technologies are being brought together? Semantic + Social Computing + Mobile Web
  • #64: Users are shown two images along with labels. The labels are obtained from GI or a similar data source. Users add relationships. When 2 users agree, the labels are tagged with this relationship. Given multiple relationships, the system will learn using ML techniques.
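The agreement rule in note #64 can be sketched as a vote count over player submissions. The record layout, the function name `agreed_relationships` and the sample data are assumptions for illustration; the note itself only states that a relationship is accepted when 2 users agree.

```python
from collections import defaultdict


def agreed_relationships(submissions, min_agreement=2):
    """Return (label_a, label_b, relationship) triples proposed by enough distinct users.

    `submissions` is an iterable of (user, label_a, label_b, relationship) tuples.
    """
    voters = defaultdict(set)
    for user, label_a, label_b, relationship in submissions:
        # A set of users, so repeated votes by the same user count only once.
        voters[(label_a, label_b, relationship)].add(user)
    return {key for key, users in voters.items() if len(users) >= min_agreement}


# Hypothetical game rounds.
SUBMISSIONS = [
    ("u1", "wheel", "car", "part_of"),
    ("u2", "wheel", "car", "part_of"),
    ("u1", "wheel", "car", "next_to"),
    ("u3", "wheel", "car", "next_to"),
    ("u3", "wheel", "car", "next_to"),
    ("u4", "smoke", "fire", "caused_by"),
]

accepted = agreed_relationships(SUBMISSIONS)
```

The accepted triples could then serve as labeled examples for the learning step mentioned at the end of the note.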