The Challenges of Building Enterprise
Content Taxonomies and the Role of
Classification Technologies in Maintaining
Their Effectiveness
The Challenge of Unstructured Content
Key Concepts and Terms
Taxonomy, Classification and ECM Adoption
Classification Technologies for ECM
• 80% of Enterprise Data is Unstructured
• Document
• Image
• E-mail
• Report
• Other
• Billing statements
• Claims images
• Customer Correspondence
• Mortgage docs
• Contracts
• Signed BOLs
• Healthcare EOBs
• Marketing collateral
• Website content
• Voice authorizations
• Signature cards
• Credit enrollments
• Material Safety Data Sheets
• ISO 9000 docs
• Plant schematics
• Product images
• Spec sheets
• …and much more!
Ameritek ecm presentation
• Organizing the explosion of unstructured content becomes critical:
• We've got 600 GB of content from basic content services all over the enterprise.
• How can we get this content efficiently mapped into our ECM taxonomy?
• We've been managing our content without classifying it for a few years now.
• How can our users navigate amongst this existing content in a way that's intuitive for our business?
• The lawyers have to review 400,000 electronic documents for their case.
• How can we make sure they don't waste their time?
Metadata: a means of describing, locating, cataloging, and
activating content as objects in a software ecosystem (literally,
data about data).
Enterprise Catalog: a centralized and normalized metadata
model for unstructured content for the purposes of providing consistent
services across all ECM applications.
Taxonomy: a hierarchical structure of information
components, any part of which can be used to classify a
content item in relation to other items in the structure.
Classification: a coding of content items as members of a
group for the purposes of cataloging them or associating them
with a taxonomy.
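
One way to picture how these four terms relate is the minimal Python sketch below. It is an illustration only, not something taken from the presentation: the `TaxonomyNode` and `ContentItem` classes, their fields, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaxonomyNode:
    """One node in a hierarchical taxonomy (hypothetical structure)."""
    name: str
    # Exclude the parent link from repr/compare to avoid walking the tree.
    parent: Optional["TaxonomyNode"] = field(default=None, repr=False, compare=False)
    children: Dict[str, "TaxonomyNode"] = field(default_factory=dict)

    def add_child(self, name: str) -> "TaxonomyNode":
        child = TaxonomyNode(name=name, parent=self)
        self.children[name] = child
        return child

    def path(self) -> List[str]:
        """Return node names from the root down to this node."""
        parts: List[str] = []
        node: Optional[TaxonomyNode] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return list(reversed(parts))


@dataclass
class ContentItem:
    """A content item plus its metadata ("data about data")."""
    item_id: str
    metadata: Dict[str, str]
    classification: Optional[TaxonomyNode] = None


# Illustrative taxonomy; the node names are made up for this example.
root = TaxonomyNode("Enterprise Content")
finance = root.add_child("Finance")
billing = finance.add_child("Billing statements")

# Classification: code the item as a member of a taxonomy node.
item = ContentItem("doc-001", {"source": "scanner", "format": "pdf"})
item.classification = billing

print(" > ".join(item.classification.path()))
```

In this sketch an enterprise catalog would be the shared, normalized set of `metadata` keys that every ECM application agrees on; that part is left out for brevity.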
Classification Examples:
– Document Classing
– Foldering
Taxonomy Examples:
– Enterprise Content Catalog
– Industry Standard Document Taxonomies (ISO, XMI)
Methods:
– Rules-Based: Applies predetermined "if-then" rules to
classify text and properties
– Analytics-Based: Applies algorithms to interpret
classes in order to apply classification rules to them
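
The rules-based method can be sketched roughly as follows. This is a hedged illustration, not the presenter's implementation: the `RULES` table, the `classify` function, the `mime` property name, and the keyword patterns are all assumptions chosen for the example.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Each rule pairs an "if" predicate over (text, properties) with the
# document class to assign "then". All rules here are hypothetical.
Rule = Tuple[Callable[[str, Dict[str, str]], bool], str]

RULES: List[Rule] = [
    # Property-based rule: look at metadata only.
    (lambda text, props: props.get("mime") == "message/rfc822", "E-mail"),
    # Text-based rules: look for telltale phrases in the content.
    (lambda text, props: re.search(r"\bamount due\b", text, re.I) is not None,
     "Billing statement"),
    (lambda text, props: re.search(r"\bbill of lading\b", text, re.I) is not None,
     "Signed BOL"),
]


def classify(text: str, props: Dict[str, str]) -> Optional[str]:
    """Return the class of the first matching rule, or None if no rule fires."""
    for predicate, doc_class in RULES:
        if predicate(text, props):
            return doc_class
    return None


print(classify("Total amount due: $40.00", {"mime": "application/pdf"}))
print(classify("Lunch on Friday?", {"mime": "message/rfc822"}))
print(classify("Quarterly plant walkthrough notes", {}))
```

Items that match no rule come back as `None`. An analytics-based approach would, under the same interface, replace the hand-written predicates with a model that infers the class from the content.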
• Integrate with and support the ECM metadata model
• Interpret a highly-federated content ecosystem
• Go beyond search to catalog and manage content
• Build on advanced analytic technologies – rules alone are not enough
Getting Classification Right: "Garbage in = garbage out" is often used in
metadata management projects to describe the problem of building a
metadata model on inconsistent sources.
Driving Process on Taxonomies: ERP systems depend on three master
taxonomies: material, vendor and customer. These taxonomies drive
events, workflow definition and the development of transaction-centric
business process applications.
Mastering Metadata: The ability to deploy new enterprise applications
depends upon the reusability, scalability and integrity of the metadata
model.
System of Record is Required for Standardization:
– Establishes an enterprise standard that can be audited
– Forms the foundation for building demonstrable best practices
– Enforces consistency of data capture and output
• Most organizations face content taxonomy pain, especially as they standardize around ECM
– Mapping content to taxonomy during ingestion
– Reclassifying content under management
– Evolving taxonomies as new types of content emerge
– Integrating folksonomies (SharePoint) into a master taxonomy
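
As a rough illustration of the last point, folding user-supplied tags into a master taxonomy could look like the sketch below. The `TAG_TO_MASTER` table, the taxonomy paths, and the `map_tags` helper are hypothetical; a real mapping would be maintained by the taxonomy owners.

```python
from typing import Dict, List, Tuple

# Hypothetical synonym table: normalized user tag -> master taxonomy term.
TAG_TO_MASTER: Dict[str, str] = {
    "invoice": "Finance/Billing statements",
    "invoices": "Finance/Billing statements",
    "msds": "Safety/Material Safety Data Sheets",
    "contract": "Legal/Contracts",
}


def normalize(tag: str) -> str:
    """Lowercase the tag and collapse surrounding/internal whitespace."""
    return " ".join(tag.strip().lower().split())


def map_tags(tags: List[str]) -> Tuple[List[str], List[str]]:
    """Split tags into master-taxonomy terms and tags that need human review."""
    mapped: List[str] = []
    unmapped: List[str] = []
    for tag in tags:
        term = TAG_TO_MASTER.get(normalize(tag))
        if term is None:
            unmapped.append(tag)      # no known mapping: queue for review
        elif term not in mapped:
            mapped.append(term)       # avoid duplicate master terms
    return mapped, unmapped


mapped, unmapped = map_tags(["Invoices ", "MSDS", "invoice", "q3 offsite"])
print(mapped)
print(unmapped)
```

Keeping the unmapped tags rather than discarding them gives the taxonomy a way to evolve as new types of content emerge.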
Proliferating departmental solutions
– Content Management
– Collaboration (SP, Quickr, Team Rooms, Wikis)
User-based classification and high workforce
turnover
– Productivity declines as knowledge disappears
– Legal discovery is a secondary concern
Mergers and Acquisitions – need to reconcile
disparate content management practices,
repositories and processes
