APPROACHES TO DATA INTEGRATION
AND BRINGING TOGETHER
PRESENTED BY,
SHARADHA.M
I M PHARM
DEPT. OF PHARMACEUTICS
JSS COLLEGE OF PHARMACY
MYSORE
CONTENTS:
1. Approaches To Data Integration Standards
- OMG, I3C (RIP), LSID & W3C
2. Bringing Together All Three Disciplines
APPROACHES TO DATA INTEGRATION STANDARDS
- OMG, I3C (RIP), LSID, and W3C
• As stated on the OMG (Object Management Group) website (http://www.omg.org/), a lack of data standards results in data conversions, loss of information, lack of interoperability, etc.
• Current standards are XML (Extensible Markup Language), LSID (Life Science Identifiers), and now RDF (Resource Description Framework) from the W3C (World Wide Web Consortium), which is extensible though hard to implement. Substantial work on OO (object-oriented) modeling of life science data types takes place at the OMG's LSR (Life Sciences Research) group.
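As a rough illustration of the identifier and markup standards named above, the sketch below splits an LSID into its parts and writes them out as a small XML record. It assumes the usual LSID URN layout (urn:lsid:authority:namespace:object, with an optional revision). The identifier, the XML element names, and the helper functions are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Parts of an LSID, assuming urn:lsid:authority:namespace:object[:revision]."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into its components, rejecting malformed input."""
    parts = text.split(":")
    well_formed = (
        len(parts) in (5, 6)
        and parts[0].lower() == "urn"
        and parts[1].lower() == "lsid"
        and all(parts[2:])  # no empty components
    )
    if not well_formed:
        raise ValueError(f"not an LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


def to_xml(lsid: Lsid) -> str:
    """Serialize the LSID as a flat XML record (element names are made up)."""
    record = ET.Element("record")
    ET.SubElement(record, "authority").text = lsid.authority
    ET.SubElement(record, "namespace").text = lsid.namespace
    ET.SubElement(record, "object").text = lsid.object_id
    if lsid.revision is not None:
        ET.SubElement(record, "revision").text = lsid.revision
    return ET.tostring(record, encoding="unicode")


# Hypothetical identifier, not a real registered LSID.
example = parse_lsid("urn:lsid:example.org:compounds:12345:2")
print(to_xml(example))
```

Once two systems agree on an identifier layout and a record format like this, they can exchange data without a custom conversion step, which is the point the slide makes about standards.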
• The OMG (Object Management Group) adopts and publishes interface specifications. Specifications may also be chosen from existing products in a competitive selection process. All interface specifications are freely available to both members and nonmembers.
• Implementations must be available from an OMG member. The OMG uses many approaches to object-oriented modeling of complex data types.
• The OMG has domain task force (DTF) groups that deal with these specific types.
• Working groups are formed to address specific areas of interest within the task force. Of course, whenever there is potential for reuse of existing standards, it is positively encouraged!
• The Life Sciences Research domain task force (LSR DTF) has several working groups: architecture and road map, biochemical pathways, cheminformatics, gene expression, sequence analysis, and single nucleotide polymorphisms.
• Each working group has a chairperson who champions requests for proposals (RFPs) from any interested parties.
• The working group members identify key needs and help build RFPs from a boilerplate standard document issued by the OMG. Anyone can submit a letter of intent (LOI) to respond to an RFP; however, to become a submitter, the organization must become an OMG member.
• A typical OMG standards adoption process takes 20 months. The gene expression RFP was issued on March 10, 2000, and became an available specification on November 16, 2001.
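The 20-month figure can be checked against the two gene expression dates quoted above. A minimal sketch, in which the helper name whole_months_between is an assumption made for illustration:

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Count complete calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


rfp_issued = date(2000, 3, 10)        # gene expression RFP issued
spec_available = date(2001, 11, 16)   # specification became available

elapsed_months = whole_months_between(rfp_issued, spec_available)
elapsed_days = (spec_available - rfp_issued).days
print(elapsed_months, elapsed_days)   # prints: 20 616
```

So the gene expression specification took 20 months and 6 days, in line with the typical adoption time stated on the slide.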
• The LECIS (Laboratory Equipment Control Interface Specification) standard is used by Creon as part of their Q-DIS data standard support.
• There are many open tools out there, too: the biomolecular sequence analysis standard (BSA) is at the EBI in the form of Open BSA.
• The bibliographic query service standard (BQS) is also at the EBI as Open BQS.
• The macromolecular structure standard is supported by the Protein Data Bank as the Open MM toolkit.
• The reason that LSR works is not technology but people: participation is essential for organizations, individuals, and evangelists. The OMG's constitution is both fair and equitable. Having a well-defined process that is transparent in operation and allows open sharing of information is the key to its success.
• The I3C (Interoperable Informatics Infrastructure Consortium), like its website, no longer functions. In the main, standards emerge with the backing of one or two major vendors, and the consumers follow.
• Very rarely, the consumers rally together and force change upon the vendors.
• Finally, government bodies enforce mandatory changes that we struggle to comply with (just ask any CEO about Sarbanes-Oxley).
• The authors speculate what would happen if the FDA (Food and Drug Administration) stated that all electronic submissions had to be in XML for 21 CFR Part 11 compliance (Title 21 of the Code of Federal Regulations, Part 11)!
• This is why information management and knowledge management are so important to data standards.
Fig: The request for proposals life cycle.
BRINGING TOGETHER ALL THREE DISCIPLINES
• Overcoming the three big reasons is the first milestone in bringing together information and knowledge management with data standards.
• Domain-specific knowledge is also critical, and cross-domain knowledge is even better.
• Finding the data architect who understands the process and workflow of a chemist is like mining for a rare gem among the seams of coal. These people are hard to find and harder to retain.
• As expert disciplines mature and become more accessible to younger scientists, multi-skilled employees will gradually filter upward. However, as this will take several years, the most widely used approach is to lure staff from a parallel organization into the business.
• The only downside is that new ways of thinking and innovation are now at a premium.
• As with all successful projects, a small proof-of-concept pilot that addresses key stakeholder needs is the best way of gathering momentum to achieve lasting change and progress.
• Fixing the time delay between compound submission and biology IC50 results (IC50 being the concentration at which 50% of the enzyme is inhibited) has a better defined scope than building a "science Google" for all users.
Reference:
COMPUTER APPLICATIONS IN PHARMACEUTICAL
RESEARCH AND DEVELOPMENT BY SEAN EKINS,
Page No. 177-179.
THANK YOU!
