The document describes the OGO (Orthologs and Genes Ontology) system, which provides a semantic query interface for exploring information about ortholog genes and genetic diseases. It allows users to formulate complex queries about orthologs and diseases without needing SPARQL syntax. An example query and its results are shown, finding the ortholog genes of the gene that causes prostate cancer in rats. Future plans include adding more reasoning capabilities and integrating with additional biomedical ontologies and standards.
Mikel egana itbam_2010_ogo_system
1. A semantic query interface for the OGO platform. José Antonio Miñarro-Giménez (jose.minyarro@um.es), Mikel Egaña Aranguren, Ph.D. (mikel.egana.aranguren@gmail.com), Francisco García-Sánchez, Ph.D. (frgarcia@um.es), Jesualdo Tomás Fernández-Breis, Ph.D. (jfernand@um.es). Faculty of Computer Science, University of Murcia, Spain. ITBAM (DEXA), Bilbo 2010. http://tinyurl.com/35amhn6
2. Overview: Orthologs; Information about orthologs and diseases; OGO system; A semantic query interface for the OGO system; Sample query
8. OGO ontology: imported ontologies. Gene Ontology (OBOF): molecular function, biological process and cellular component of gene products; Evidence Codes Ontology (Candidate OBOF): GO annotation evidence codes; OBO Relationship Types (Candidate OBOF): gene product participates in some (molecular function or biological process), gene product located in some cellular component; NCBI taxonomy: classification of organisms
15. Sample query: Ortholog genes of the gene that causes prostate cancer in Rattus norvegicus?
20. Sample query: Ortholog genes of the gene that causes prostate cancer in Rattus norvegicus?
@prefix ncbi: <http://um.es/ncbi.owl>.
@prefix ogo: <http://miuras.inf.um.es/ontologies/OGO.owl>.
SELECT ?Gene_0 ?Genetic_disease_1
WHERE {
  ?Gene_0 ogo:fromSpecies ncbi:NCBI_10116 .
  ?Genetic_disease_1 ogo:Name ?literal_4 .
  FILTER (regex(?literal_4,"Prostate cancer, susceptibility to")) .
  ?Genetic_disease_1 ogo:causedBy ?Gene_2 .
  ?Cluster_of_Orthologous_genes_3 ogo:hasOrthologous ?Gene_2 .
  ?Cluster_of_Orthologous_genes_3 ogo:hasOrthologous ?Gene_0 .
}
22. Query grammar
Query ::= "SELECT" ListVar (WhereClause)?
ListVar ::= Var (Var)*
WhereClause ::= "WHERE {" ConditionClause (ConditionClause)* "}"
ConditionClause ::= [VarCondition | LiteralCondition] "."
VarCondition ::= [Var | Individual] Property [Var | Individual]
LiteralCondition ::= [Var | Individual] Property [Var | Individual] "." "FILTER (regex (" Var "," Literal "))"
Var -> This term represents a variable in the query, which can be matched to any concept or individual in the ontology.
Individual -> This term represents a concept or individual identified by a URI in the ontology.
Property -> This term represents a relationship or property identified by a URI in the ontology.
Literal -> This term represents any data value defined by the user.
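As a rough illustration of what this grammar accepts, the following Python sketch checks whether a query string conforms to it. It is not part of the OGO system: the names `tokenize`, `Parser` and `conforms` are hypothetical, Individual and Property are assumed to be written as prefixed names (e.g. `ogo:causedBy`), and prefix declarations are assumed to be handled outside the grammar.

```python
import re

# One token per match: variable, prefixed name, quoted literal, keyword, punctuation.
# Assumption: Individuals and Properties appear as prefixed names such as ogo:Name.
TOKEN = re.compile(r'''
    \s*(?:
        (?P<var>\?\w+)
      | (?P<name>\w+:\w+)
      | (?P<literal>"(?:[^"\\]|\\.)*")
      | (?P<kw>SELECT|WHERE|FILTER|regex)
      | (?P<punct>[{}().,])
    )''', re.VERBOSE)


def tokenize(text):
    """Split a query into (kind, value) pairs; raise ValueError on unknown input."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent recogniser with one method per grammar production."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def accept(self, kind, value=None):
        """Consume the next token if it matches and report whether it did."""
        if self.pos < len(self.tokens):
            k, v = self.tokens[self.pos]
            if k == kind and (value is None or v == value):
                self.pos += 1
                return True
        return False

    def expect(self, kind, value=None):
        if not self.accept(kind, value):
            raise ValueError(f"expected {value or kind} at token {self.pos}")

    def term(self):
        # [Var | Individual]
        if not (self.accept("var") or self.accept("name")):
            raise ValueError(f"expected variable or individual at token {self.pos}")

    def condition(self):
        # VarCondition ".", optionally followed by the FILTER part of a LiteralCondition.
        self.term()
        self.expect("name")  # Property
        self.term()
        self.expect("punct", ".")
        if self.accept("kw", "FILTER"):
            self.expect("punct", "(")
            self.expect("kw", "regex")
            self.expect("punct", "(")
            self.expect("var")
            self.expect("punct", ",")
            self.expect("literal")
            self.expect("punct", ")")
            self.expect("punct", ")")
            self.expect("punct", ".")

    def query(self):
        self.expect("kw", "SELECT")
        self.expect("var")
        while self.accept("var"):
            pass
        if self.accept("kw", "WHERE"):
            self.expect("punct", "{")
            self.condition()
            while not self.accept("punct", "}"):
                self.condition()
        if self.pos != len(self.tokens):
            raise ValueError("trailing input after query")


def conforms(text):
    """Return True when the query string matches the grammar, False otherwise."""
    try:
        Parser(tokenize(text)).query()
    except ValueError:
        return False
    return True
```

Under these assumptions, a query whose conditions each end with "." is accepted, while a condition missing its terminating "." or a query using a construct outside the grammar (such as OPTIONAL) is rejected.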
23. Future plans: OWL reasoning for querying (OWL 2 QL?); Pellet Integrity Constraint Validator (Pellet ICV): OWL as schema language for RDF (CWA); check the gathered information; more bio-ontologies; clinical archetypes for querying (ISO 13606): exchange of ortholog/disease information in a standard biomedical research setting
24. Conclusions. Orthologs and diseases: new hypotheses. OGO provides a resource for exploiting such combined information. Semantic query interface: complex queries made easy (no SPARQL syntax). http://miuras.inf.um.es/~ogo/
25. Acknowledgements: Spanish Ministry for Science and Education (grant TSI2007-66575-C02-02); Comunidad Autónoma de la Región de Murcia (grant BIO-TEC 06/01-0005); Fundación Séneca, Servicio de Empleo y Formación (grant 07836/BPS/07)
Editor's Notes
The OGO system ... Presentation URL Creative commons attribution non commercial share alike
Orthologs are homologous sequences (they share a common ancestor) that diverged through a speciation event.
Orthologs can be used to generate hypotheses. For example, if frog alpha and chicken alpha are ortholog genes, and frog alpha is known to be involved in a certain trait (e.g. a disease), then chicken alpha is likely to be involved in or related to that trait in chicken. Information about orthologs is therefore very important in biomedical research, since it opens new research paths for human diseases with a genetic cause.
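The reasoning in this note can be sketched in a few lines of Python. This is only a toy illustration: the cluster names, gene names and the trait `disease_X` are invented for the example and are not taken from the OGO knowledge base, and `candidate_hypotheses` is a hypothetical helper.

```python
# Hypothetical toy data: names are illustrative, not taken from OGO.
ortholog_clusters = {
    "cluster_alpha": {"frog_alpha", "chicken_alpha"},
    "cluster_beta": {"frog_beta", "chicken_beta"},
}

# Known gene-to-trait associations (again invented for illustration).
known_traits = {"frog_alpha": "disease_X"}


def candidate_hypotheses(clusters, traits):
    """Return (gene, trait, source_gene) triples suggested by orthology.

    For every gene with a known trait, each other member of its ortholog
    cluster that has no known trait becomes a candidate for the same trait.
    """
    hypotheses = []
    for members in clusters.values():
        for source_gene in sorted(members):
            trait = traits.get(source_gene)
            if trait is None:
                continue
            for gene in sorted(members - {source_gene}):
                if gene not in traits:
                    hypotheses.append((gene, trait, source_gene))
    return hypotheses
```

Each returned triple is a hypothesis to be tested, not an established association: orthology only suggests that the trait may carry over.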
Unfortunately, information about orthologs and diseases is scattered and difficult to combine.
The OGO system provides a resource for accessing the combined ortholog/disease information in a precise way. The OGO system is an OWL KB in which the OGO ontology provides the schema, and the information about orthologs and diseases is stored as instances with relationships between them. The OGO ontology also guides the user in building queries. The system is accessed with keywords or SPARQL. The pipeline is executed periodically (mappings, information checking).
OGO ontology (KB schema and querying)
Imported ontologies (GO, ECO, RO) reuse existing semantics for querying, as we will see when I describe the queries. OBOF: a wealth of quality reusable semantics for the biological domain. GO: member. ECO, RO: candidates.
Not detailed. Classes as values (OBO format). Future: DL.
Pipeline
Jena allows OWL to be stored in a MySQL database and accessed with SPARQL.
The OGO system has two interfaces: a keyword-based interface (by disease or by orthologs), which is fast but not very expressive, and the semantic interface (next).
The semantic interface is more expressive than the keyword-based interface. However, as SPARQL is difficult for biologists to use, the semantic interface provides a graphical interface for creating queries, which are later translated into SPARQL. Note that this does not expose the whole expressivity of SPARQL, but a considerable part of it (see grammar).
In order to define the query, we can select concepts from the OGO ontology and add requirements, also using the OGO ontology. We can exploit the imported ontologies for querying: GO, ECO, NCBI. The defined query is translated into SPARQL and executed against the KB.
Whole process. First we select the variables we are interested in from the OGO ontology; in this case, Gene and Genetic disease (i.e., we want to retrieve genes and genetic diseases). The imported ontologies (GO, ECO, NCBI) can be exploited for querying.
Then we add requirements, also using the OGO ontology (and the imported ontologies). We can use the selected variables or new ones. We can delete or edit requirements.
We edit a requirement by using the OGO ontology (to add new variables and values) or by using the already defined variables. NCBI (imported, like GO and ECO) provides values for the requirement.
We add the finished requirement to the query.
We can add as many requirements as we want
Finally, the query is translated into SPARQL and executed against the KB
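The translation step can be pictured with a minimal Python sketch that turns selected variables and requirements into query text shaped like the sample query. This is an assumption about how such a translation might look, not the OGO implementation: `Requirement` and `to_sparql` are hypothetical names, and prefix declarations are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirement:
    """One requirement from the interface: subject, property, object.

    `regex`, when set, constrains the object variable with a FILTER clause.
    All field names here are hypothetical.
    """
    subject: str
    prop: str
    obj: str
    regex: Optional[str] = None


def to_sparql(variables, requirements):
    """Assemble query text from the selected variables and requirements."""
    lines = ["SELECT " + " ".join(variables), "WHERE {"]
    for req in requirements:
        lines.append(f"  {req.subject} {req.prop} {req.obj} .")
        if req.regex is not None:
            # Escape backslashes and double quotes so the literal stays well-formed.
            escaped = req.regex.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'  FILTER (regex({req.obj}, "{escaped}")) .')
    lines.append("}")
    return "\n".join(lines)


# Requirements mirroring the sample query shown in the slides.
sample_requirements = [
    Requirement("?Gene_0", "ogo:fromSpecies", "ncbi:NCBI_10116"),
    Requirement("?Genetic_disease_1", "ogo:Name", "?literal_4",
                regex="Prostate cancer, susceptibility to"),
    Requirement("?Genetic_disease_1", "ogo:causedBy", "?Gene_2"),
    Requirement("?Cluster_of_Orthologous_genes_3", "ogo:hasOrthologous", "?Gene_2"),
    Requirement("?Cluster_of_Orthologous_genes_3", "ogo:hasOrthologous", "?Gene_0"),
]
sample_query = to_sparql(["?Gene_0", "?Genetic_disease_1"], sample_requirements)
```

Keeping each requirement as a small record makes the delete and edit operations of the interface simple list manipulations before the text is generated.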
Results
The expressivity of the query is limited by the grammar
YOGY already does this; however, its results are redundant because they are organised by resource instead of being gene-centric, i.e. the same gene appears in different resources. The OGO ontology is used to check the consistency of the information. The interface is less expressive than full SPARQL: no OPTIONAL.