Knowledge discovery in databases (KDD) is the automatic extraction of hidden knowledge from large volumes of data: the non-trivial process of identifying valid, potentially useful and ultimately understandable patterns in data. The document discusses the steps of KDD, which include data cleaning, integration, selection, transformation, mining, pattern evaluation and knowledge presentation. It also discusses the goals of the knowledge discovery process and how a KDD system can be used in libraries for searching, classification and acquisition by developing domain knowledge, cleaning data, choosing mining tasks and consolidating discovered knowledge.
2. Knowledge discovery in databases (KDD)
• KDD is the automatic extraction of non-obvious,
hidden knowledge from large volumes of data.
• Knowledge discovery in databases is the non-trivial
process of identifying valid, potentially useful and
ultimately understandable patterns in data.
Why do we need KDD?
• Data is an important tool for gaining a competitive
edge by providing improved, customized services.
3. Knowledge discovery in databases
LIST OF STEPS OF KDD:
1. Data cleaning: in this step, noise and inconsistent data are
removed.
2. Data integration: in this step, multiple data sources are combined.
3. Data selection: in this step, data relevant to the analysis task are
retrieved from the database.
4. Data transformation: in this step, data are transformed or
consolidated into forms appropriate for mining, for example by
performing aggregation operations.
5. Data mining: intelligent methods are applied in order to extract
data patterns.
6. Pattern evaluation: the extracted data patterns are evaluated.
7. Knowledge presentation: the discovered knowledge is presented to the user.
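The steps above can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions: the loan records, the field names (member, subjects, year), the function names and the 0.5 support threshold are all hypothetical, and counting co-occurring subject pairs stands in for whatever mining method a real system would use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two separate sources of library loan records.
SOURCE_A = [
    {"member": "m1", "subjects": "Databases; Statistics", "year": 2020},
    {"member": "m2", "subjects": "databases;statistics ", "year": 2021},
    {"member": "m3", "subjects": None, "year": 2021},  # noisy: missing value
]
SOURCE_B = [
    {"member": "m4", "subjects": "Databases; Statistics; History", "year": 2021},
    {"member": "m5", "subjects": "History", "year": 1999},
]


def clean(records):
    """Data cleaning: drop records with missing subjects, normalise the text."""
    cleaned = []
    for r in records:
        if not r["subjects"]:
            continue
        subjects = {s.strip().lower() for s in r["subjects"].split(";") if s.strip()}
        cleaned.append({**r, "subjects": subjects})
    return cleaned


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]


def select(records, min_year):
    """Data selection: keep only records relevant to the analysis task."""
    return [r for r in records if r["year"] >= min_year]


def transform(records):
    """Data transformation: reduce each record to a sorted tuple of subjects."""
    return [tuple(sorted(r["subjects"])) for r in records]


def mine(transactions):
    """Data mining: count how often each pair of subjects occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(t, 2))
    return counts


def evaluate(pair_counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support reaches the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in pair_counts.items()
        if count / n_transactions >= min_support
    }


def present(patterns):
    """Knowledge presentation: render the surviving patterns as readable text."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_kdd(min_year=2020, min_support=0.5):
    records = integrate(clean(SOURCE_A), clean(SOURCE_B))
    transactions = transform(select(records, min_year))
    patterns = evaluate(mine(transactions), len(transactions), min_support)
    return present(patterns)


if __name__ == "__main__":
    for line in run_kdd():
        print(line)
```

Each function corresponds to one step in the list, so a step can be swapped out (for example, a different mining method) without touching the others.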
5. Knowledge discovery Process
GOALS
DATA SELECTION, ACQUISITION & INTEGRATION
DATA REDUCTION AND PROJECTION
MATCHING THE GOALS
EXPLORATORY DATA ANALYSIS
DATA MINING
INTERPRETATION AND TESTING
CONSOLIDATION AND USE
6. KDD System in Library
• In libraries, online databases use KDD systems.
• KDD can be used in the searching, classification and
acquisition processes of libraries.
• Developing an understanding of the application
domain and of relevant prior knowledge.
• Cleaning and preprocessing of data.
• Choosing the data mining task of the KDD process, such as
classification.
• Consolidating discovered knowledge.
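As a rough illustration of how these points might fit together in a library setting, the sketch below classifies catalogue titles by subject. Everything in it is an assumption made for the example: the keyword table stands in for domain knowledge, the titles are invented, and simple keyword overlap is used in place of a trained classifier.

```python
import re
from collections import Counter

# Assumed domain knowledge: subject classes and keywords that indicate them.
SUBJECT_KEYWORDS = {
    "computer science": {"database", "mining", "algorithm"},
    "history": {"war", "empire", "century"},
}

# Hypothetical catalogue titles.
TITLES = [
    "Data Mining and Database Systems",
    "The Roman Empire: A History",
    "Cooking for Beginners",
]


def clean_title(title):
    """Cleaning: lower-case the title and keep only alphabetic words."""
    return re.findall(r"[a-z]+", title.lower())


def classify(title):
    """Chosen mining task: assign the subject with the most keyword matches."""
    words = set(clean_title(title))
    scores = {
        subject: len(words & keywords)
        for subject, keywords in SUBJECT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


def consolidate(titles):
    """Consolidation: summarise how many titles fall into each subject."""
    return Counter(classify(t) for t in titles)


if __name__ == "__main__":
    for title in TITLES:
        print(f"{title} -> {classify(title)}")
    print(dict(consolidate(TITLES)))
```

A summary of this kind could, for instance, inform acquisition by showing which subjects are thinly covered, although a real system would need a far richer notion of subject than keyword matching.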