Data Quality Overview
   Alex Meadows
     1/28/2013
Data Quality Facts
    Cost of poor data quality in US - $600 Billion
    Poor Data/Lack of visibility cited as #1 reason for
     project cost overruns
    Poor data quality costs the US Economy $3.1 Trillion a
     year
    Implementing data quality best practices boosts
     revenue by 66%
    Median Fortune 1000 company could increase
     revenue by $2.01 Billion if they improved usability of
     data by 10%

    Source: http://www.webmastat.com/blog/2012/09/07/7-facts-about-data-quality/
What is Data Quality?
   Measuring data to determine if it is fit for purpose
Fit For Purpose?
   Bad data is a myth!
   Two Questions
          What is the data used for?
          What can be measured to make sure it meets
            the need?
   Application use vs. Reporting/Analysis
Data Quality Dimensions
   Consistency                             Accuracy
   Correctness                             Objectivity
   Timeliness                              Conciseness
   Precision                               Usefulness
   Unambiguous                             Usability
   Completeness                            Relevance
   Reliability                             Amount of data


        Source: Data Quality Fundamentals, The Data Warehousing Institute
Measuring Data Quality
   Profiling: understanding metadata
          Point in time shows what data looks like now
          Automating shows trends
                  Alert to new/potential issues as they happen
                  Potentially fix issues in near real time
                   Six Sigma Principles
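
A point-in-time profile can be sketched in a few lines. This is a minimal illustration, not a full profiling tool: the function name `profile_column`, the choice of metrics, and the sample values are all assumptions made for the example. Running the same function on a schedule and storing the results is one way to get the trend view described above.

```python
from collections import Counter


def profile_column(values):
    """Return a point-in-time profile of one column (a list of raw values).

    Hypothetical helper: None and empty strings are both treated as missing.
    """
    total = len(values)
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "rows": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Made-up sample column for illustration only.
states = ["NC", "NC", "VA", None, "", "SC", "NC"]
profile = profile_column(states)
print(profile)
```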
Statistical Process Control
   Automated inspection
   Visibly shows process deviation
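
As a rough sketch of automated inspection, the snippet below derives control limits from a baseline of past measurements and flags new measurements that fall outside them. It assumes limits of the mean plus or minus three standard deviations of the baseline, which is a simplification; production control charts often estimate the spread differently. The function names and the daily null-rate figures are hypothetical.

```python
from statistics import mean, pstdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits computed from baseline measurements."""
    center = mean(baseline)
    spread = pstdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(observations, lower, upper):
    """Return (index, value) pairs for observations outside the control limits."""
    return [(i, x) for i, x in enumerate(observations) if x < lower or x > upper]


# Made-up daily null rates (percent) for one column.
baseline = [2.0, 2.2, 1.8, 2.1, 1.9, 2.0, 2.1, 1.9]
lower, center, upper = control_limits(baseline)

new_days = [2.0, 2.3, 4.5, 1.9]
flagged = out_of_control(new_days, lower, upper)
print(flagged)
```

Plotting the center line, the two limits, and each day's value gives the visual view of process deviation that the slide refers to.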
Data Profiling Analysis
   Duplication              Character Set
   Pattern matching         Reference Data Matching
   Boolean/String/Number    Value Distribution
   Date Gap                 Inter-Data Set Comparisons
   Date/time
   Day of Week
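
Three of the analyses listed above (duplication, pattern matching, and date gap) can be sketched as follows. The helper names, the phone number pattern, and the sample records are assumptions chosen for illustration; a real profiling tool would cover many more cases.

```python
import re
from collections import Counter
from datetime import date, timedelta


def find_duplicates(values):
    """Duplication: return the values that appear more than once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


def pattern_mismatches(values, pattern):
    """Pattern matching: return the values that do not fully match the regex."""
    regex = re.compile(pattern)
    return [v for v in values if not regex.fullmatch(v)]


def date_gaps(dates):
    """Date gap: return the calendar days missing between the first and last date."""
    present = set(dates)
    day, last = min(present), max(present)
    missing = []
    while day <= last:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Made-up sample data for illustration only.
customer_ids = ["C001", "C002", "C002", "C003"]
phones = ["919-555-0100", "9195550101", "919-555-0102"]
load_dates = [date(2013, 1, 1), date(2013, 1, 2), date(2013, 1, 4)]

dupes = find_duplicates(customer_ids)
bad_phones = pattern_mismatches(phones, r"\d{3}-\d{3}-\d{4}")
gaps = date_gaps(load_dates)
print(dupes, bad_phones, gaps)
```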
Master Data Management
   Create a gold standard for data
   Distribute data so that all sources are uniform
          Names
          Addresses
          Phone Numbers
          Products
   Can hook into third party sources
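
One way to picture the gold standard idea is a small sketch that standardizes phone numbers from two source systems and keeps a single value per customer. Everything here is an assumption for illustration: the source names (`crm`, `billing`), the ten-digit phone format, and the rule that the first valid value wins in source order. Real MDM tools apply much richer matching and survivorship rules.

```python
import re


def normalize_phone(raw):
    """Return a phone number as NNN-NNN-NNNN, or None if it cannot be standardized."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code of 1
    if len(digits) != 10:
        return None
    return f"{digits[0:3]}-{digits[3:6]}-{digits[6:]}"


def build_golden_records(sources):
    """Merge sources in priority order; the first valid value per customer wins."""
    golden = {}
    for source in sources:
        for customer_id, phone in source.items():
            standardized = normalize_phone(phone)
            if standardized is not None:
                golden.setdefault(customer_id, standardized)
    return golden


# Made-up source systems for illustration only.
crm = {"C001": "(919) 555-0100", "C002": "555-0101"}
billing = {"C001": "1-919-555-0100", "C002": "919.555.0101"}

golden = build_golden_records([crm, billing])
print(golden)
```

The golden values would then be distributed back to each source so that every system holds the same standardized number.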
Data Governance Program
   Central authority for data quality control
   Applies information collected from data
    profiling, MDM, etc. uniformly across the
    business
   Communication channels between business
    and IT groups
Questions?