The document discusses the challenges of organizing large amounts of unstructured enterprise content using content taxonomies and classification technologies. It notes that 80% of enterprise data is unstructured: documents, images, emails and more. It describes how classification technologies can help map existing content to taxonomies, let users navigate large amounts of content intuitively, and improve legal document review. Effective classification requires integrating with metadata models, interpreting diverse content sources, going beyond search to catalog and manage content, and using both rules-based and analytics-based approaches. Challenges include mapping content to taxonomies during ingestion, reclassifying existing content, evolving taxonomies over time, and integrating different classification approaches.
Ameritek ecm presentation
1. The Challenges of Building Enterprise
Content Taxonomies and the Role of
Classification Technologies in Maintaining
Their Effectiveness
2. The Challenge of Unstructured
Content
Key Concepts and Terms
Taxonomy, Classification and
ECM Adoption
Classification Technologies for
ECM
3. 80% of Enterprise Data is
Unstructured
Document
Image
E-mail
Report
Other
Billing statements
Claims images
Customer correspondence
Mortgage docs
Contracts
Signed BOLs
Healthcare EOBs
Marketing collateral
Website content
Voice authorizations
Signature cards
Credit enrollments
Material Safety Data Sheets
ISO 9000 docs
Plant schematics
Product images
Spec sheets
...and much more!
5. Organizing the explosion of unstructured
content becomes critical:
We've got 600 GB of content from basic content services all over the enterprise.
How can we get this content efficiently mapped into our ECM taxonomy?
We've been managing our content without classifying it for a few years now.
How can our users navigate amongst this existing content in a way that's intuitive for our business?
The lawyers have to review 400,000 electronic documents for their case.
How can we make sure they don't waste their time?
10. Metadata: a means of describing, locating, cataloging, and
activating content as objects in a software ecosystem (literally,
data about data).
Enterprise Catalog: a centralized and normalized metadata
model for unstructured content for the purposes of providing consistent
services across all ECM applications.
Taxonomy: a hierarchical structure of information
components, any part of which can be used to classify a
content item in relation to other items in the structure.
Classification: a coding of content items as members of a
group for the purposes of cataloging them or associating them
with a taxonomy.
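The four terms above can be related in a small sketch. It is illustrative only: the class names, the `classify` helper, the taxonomy branches and the metadata fields are hypothetical and do not come from any particular ECM product.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One node in a hierarchical taxonomy (hypothetical structure)."""
    name: str
    children: dict = field(default_factory=dict)

    def add_path(self, path):
        """Create the chain of child nodes named in `path`; return the leaf."""
        node = self
        for part in path:
            node = node.children.setdefault(part, TaxonomyNode(part))
        return node

    def has_path(self, path):
        """Return True if `path` names an existing node under this root."""
        node = self
        for part in path:
            if part not in node.children:
                return False
            node = node.children[part]
        return True


@dataclass
class ContentItem:
    """A content item described by metadata (data about data)."""
    item_id: str
    metadata: dict
    classification: tuple = ()


def classify(item, taxonomy, path):
    """Code an item as a member of a taxonomy node; reject unknown paths."""
    path = tuple(path)
    if not taxonomy.has_path(path):
        raise ValueError(f"unknown taxonomy path: {path}")
    item.classification = path
    return item


# The root stands in for the enterprise catalog; branches are assumed examples.
root = TaxonomyNode("Enterprise Catalog")
root.add_path(["Finance", "Billing statements"])
root.add_path(["Legal", "Contracts"])

doc = ContentItem("doc-001", {"author": "jdoe", "format": "pdf"})
classify(doc, root, ["Legal", "Contracts"])
print(doc.classification)
```

Rejecting paths that are not in the taxonomy is one way to keep classification tied to a single, auditable structure rather than to ad hoc labels.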
12. Classification Examples:
– Document Classing
– Foldering
Taxonomy Examples:
– Enterprise Content Catalog
– Industry Standard Document Taxonomies (ISO, XMI)
Methods:
– Rules-Based: Applies pre-determined rules for
“if then” classification of text and properties
– Analytics-Based: Applies algorithms to interpret
classes in order to apply classification rules to them
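The two methods can be contrasted in a minimal sketch. The rules, class labels and example texts below are assumptions made for illustration, and the analytics-based part uses simple word-overlap (Jaccard) similarity as a stand-in for whatever algorithms a real classification engine would apply.

```python
import re

# Hypothetical "if, then" rules: (class label, predicate over text and properties).
RULES = [
    ("Contracts",
     lambda text, props: "agreement" in text.lower() and props.get("format") == "pdf"),
    ("Billing statements",
     lambda text, props: "amount due" in text.lower()),
]

# Hypothetical example text per class, used by the analytics-based fallback.
EXAMPLES = {
    "Contracts": "this agreement is made between the parties under the following terms",
    "Billing statements": "invoice total amount due payment date account balance",
    "Spec sheets": "product dimensions weight voltage tolerance specification",
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify_by_rules(text, props):
    """Rules-based: return the label of the first rule that matches, else None."""
    for label, predicate in RULES:
        if predicate(text, props):
            return label
    return None


def classify_by_similarity(text):
    """Analytics-based stand-in: pick the class whose example shares the most words."""
    words = tokens(text)

    def jaccard(other):
        union = words | other
        return len(words & other) / len(union) if union else 0.0

    return max(EXAMPLES, key=lambda label: jaccard(tokens(EXAMPLES[label])))


def classify(text, props):
    """Try the explicit rules first; fall back to similarity when none match."""
    return classify_by_rules(text, props) or classify_by_similarity(text)
```

For example, `classify("Master Services Agreement", {"format": "pdf"})` is decided by the first rule, while a text that matches no rule, such as a product specification, is assigned by word overlap with the class examples.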
14. Integrate with and support the ECM metadata model
Interpret a highly federated content ecosystem
Go beyond search to catalog and manage content
Build on advanced analytic technologies – rules alone are not enough
15. Getting Classification Right: 'Garbage in = garbage out' is often used in
metadata management projects to describe the problem of building a
metadata model on inconsistent sources.
Driving Process on Taxonomies: ERP systems depend on three master
taxonomies: material, vendor and customer. These taxonomies drive
events, workflow definition and the development of transaction-centric
business process applications.
Mastering Metadata: The ability to deploy new enterprise applications
depends upon the re-usability, scalability and integrity of the metadata
model
System of Record is Required for Standardization:
– Establishes an enterprise standard that can be audited
– Forms the foundation for building demonstrable best practices
– Enforces consistency of data capture and output
16. Classification Examples:
– Document Classing
– Foldering
Taxonomy Examples:
– Enterprise Content Catalog
– Industry Standard Document Taxonomies (ISO, XMI)
Methods:
– Rules-Based: Applies pre-determined rules for 'if,
then' classification of text and properties
– Analytics-Based: Applies algorithms to interpret
classes in order to apply classification rules to them
17. Most organizations face content taxonomy pain –
especially as they standardize around ECM
– Mapping content to taxonomy during
ingestion
– Reclassifying content under management
– Evolving taxonomies as new types of content
emerge
– Integrating folksonomies (SharePoint) into a
master taxonomy
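Integrating a folksonomy into a master taxonomy can be pictured as a lookup from free-form user tags to taxonomy paths. This is a minimal sketch under stated assumptions: the mapping table, tag values and taxonomy paths are hypothetical, and a real integration would likely need synonym handling and human review of unmapped tags.

```python
# Hypothetical mapping from free-form user tags to master taxonomy paths.
TAG_TO_TAXONOMY = {
    "contract": ("Legal", "Contracts"),
    "contracts": ("Legal", "Contracts"),
    "msds": ("Safety", "Material Safety Data Sheets"),
    "invoice": ("Finance", "Billing statements"),
}


def normalize(tag):
    """Lowercase a tag and collapse surrounding and repeated whitespace."""
    return " ".join(tag.strip().lower().split())


def map_tags(tags):
    """Split user tags into mapped taxonomy paths and tags left for review."""
    mapped, unmapped = set(), []
    for tag in tags:
        key = normalize(tag)
        if key in TAG_TO_TAXONOMY:
            mapped.add(TAG_TO_TAXONOMY[key])
        else:
            unmapped.append(tag)
    return mapped, unmapped


mapped, unmapped = map_tags([" Contracts ", "MSDS", "q3 offsite"])
print(sorted(mapped))
print(unmapped)
```

Returning the unmapped tags separately keeps a record of user vocabulary that the master taxonomy does not yet cover, which is one input for evolving the taxonomy as new types of content emerge.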
18. Proliferating departmental solutions
– Content Management
– Collaboration (SharePoint, Quickr, Team Rooms, Wikis)
User-based classification and high workforce
turnover
– Productivity declines as knowledge disappears
– Legal discovery is a secondary concern
Mergers and Acquisitions – need to reconcile
disparate content management practices,
repositories and processes