introduction, data mining, why data mining, application of data mining, steps of data mining, threat of data mining, solution of data mining, role of data mining, data warehouse, oltp & olap, data warehouse, data mining tools, latest research
Data Mining and Data Warehouse
2. INTRODUCTION
DATA MINING
WHY DATA MINING
APPLICATION OF DATA MINING
STEPS OF DATA MINING
DATA MINING TECHNIQUES
THREAT OF DATA MINING
SOLUTION OF THREAT
ROLE OF DATA MINING
DATA WAREHOUSE
OLTP & OLAP
DATA MINING TOOLS
LATEST RESEARCH
3. INTRODUCTION
Data mining, the extraction of hidden predictive information
from large databases, is a powerful new technology with great
potential to help companies focus on the most important
information in their data warehouses.
4. DATA MINING
Data mining is the extraction of previously unknown, valid, and understandable
information or patterns from data in repositories or sources such as:
Databases
Text files
Social networks
Computer simulation
The information obtained should be such that it can be used by any
organization or enterprise for business decision making.
5. Why Data Mining ?
Data, data everywhere, yet:
I can't find the data I need
I can't get the data I need
I can't understand the data I found
I can't use the data I found
6. Data explosion problem
Advanced data collection tools and database technology lead to
tremendous amounts of data stored in databases.
We are drowning in data, but starving for
knowledge!
Solution: Data warehousing and Data mining
- Data warehousing and on-line analytical processing.
- Extraction of interesting knowledge using data mining.
7. APPLICATION OF DATA MINING
Data mining is primarily used today by companies with a strong
consumer focus: retail, financial, communication, and marketing
organizations.
15. STEPS OF DATA MINING
Data integration
Data selection
Data transformation
Data mining
Pattern evaluation
Knowledge presentation
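The six steps above can be sketched as a tiny in-memory pipeline. This is only an illustration: the two data sources, the field names, the "average sale per region" pattern, and the threshold are all assumptions made for the example, not part of the slides.

```python
from collections import defaultdict

# Two hypothetical sources (assumed data, not from the slides).
store_sales = [
    {"customer": "a", "region": "north", "amount": "120.0"},
    {"customer": "b", "region": "south", "amount": "40.0"},
]
web_sales = [
    {"customer": "a", "region": "north", "amount": "80.0"},
    {"customer": "c", "region": "south", "amount": "20.0"},
    {"customer": "d", "region": "north", "amount": "100.0"},
]

def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [row for source in sources for row in source]

def select(rows, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: row[f] for f in fields} for row in rows]

def transform(rows):
    """Data transformation: convert amounts from text to numbers."""
    return [{"region": r["region"], "amount": float(r["amount"])} for r in rows]

def mine(rows):
    """Data mining: here, simply the average sale amount per region."""
    totals, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        totals[r["region"]] += r["amount"]
        counts[r["region"]] += 1
    return {region: totals[region] / counts[region] for region in totals}

def evaluate(patterns, min_average):
    """Pattern evaluation: keep only patterns at or above a chosen threshold."""
    return {k: v for k, v in patterns.items() if v >= min_average}

def present(patterns):
    """Knowledge presentation: format the surviving patterns for a reader."""
    return [f"{region}: average sale {avg:.2f}" for region, avg in sorted(patterns.items())]

rows = transform(select(integrate(store_sales, web_sales), ["region", "amount"]))
report = present(evaluate(mine(rows), min_average=50.0))
print("\n".join(report))
```

In a real project each function would be replaced by a much heavier stage (for example, an ETL job for integration and a learning algorithm for mining), but the order of the stages stays the same.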
17. DATA MINING TECHNIQUES
Classification and Prediction
Example: Focused Hiring
Cluster Analysis
Example: Market Segmentation
Outlier Analysis
Example: Fraud Detection
Association Analysis
Example: Market Basket Analysis
Evolution Analysis
Example: Forecasting a stock market index using time series analysis
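As an illustration of association analysis, the market basket example can be sketched with plain support and confidence counts over item pairs. The baskets and the 0.5 support threshold below are made-up assumptions; real tools use more scalable algorithms such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (assumed data for illustration only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def pair_rules(baskets, min_support=0.5):
    """Return (lhs, rhs, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (x, y), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((x, y), (y, x)):
            # confidence: of the baskets containing lhs, how many also contain rhs
            confidence = count / item_counts[lhs]
            rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, every basket that contains butter also contains bread, so the rule "butter -> bread" has the highest confidence.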
18. Threat To Privacy From Data Mining
Companies mine data about your buying habits and the sites you visit so they
can personalize your search results when you use their search engine. This is
frightening, but on the other hand it is, in theory, a way for companies to
tailor your online experience. The problem, of course, is that even though
the data generally isn't examined by humans, it is still used by machines.
19. SOLUTION OF DATA MINING THREAT
SOLUTIONS :
Purpose Specification & Use Limitation
Openness
Security Measures like Encryption
20. ROLE OF DATA MINING IN IT
[Diagram labels: Business Intelligence; Model, Tool, Method; Behavioral Basics; Information Technology; Data; Problem; Decision]
21. DATA WAREHOUSE
Data warehousing is a technology that aggregates
structured data from one or more sources so that it can
be compared and analyzed for greater business
intelligence.
23. DATA WAREHOUSE
A data warehouse provides the enterprise with a
memory.
Data mining provides the enterprise with intelligence.
24. OLTP & OLAP
On-Line Transaction Processing (OLTP)
Short, simple, frequent queries and modifications
Each involving a small number of tuples
Example: answering queries from a web interface, recording sales at cash registers,
selling airline tickets.
On-Line Analytical Processing (OLAP)
Few but complex queries --- may run for hours.
Queries do not depend on having an absolutely up-to-date
database.
Example: an analyst at Wal-Mart looks for items with increasing sales in some
region.
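The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The `sales` table, its columns, and the data are hypothetical; a production warehouse would normally keep OLAP data in a separate system from the OLTP database, whereas here one in-memory table stands in for both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, item TEXT, "
    "month INTEGER, quantity INTEGER)"
)

# OLTP-style work: short, frequent modifications, each touching one tuple.
rows = [
    ("east", "umbrella", 1, 10),
    ("east", "umbrella", 2, 25),
    ("east", "sunscreen", 1, 30),
    ("east", "sunscreen", 2, 12),
    ("west", "umbrella", 1, 8),
    ("west", "umbrella", 2, 7),
]
with conn:  # commits on success
    conn.executemany(
        "INSERT INTO sales (region, item, month, quantity) VALUES (?, ?, ?, ?)",
        rows,
    )

# OLTP-style read: look up a single row by its key.
one_sale = conn.execute(
    "SELECT item, quantity FROM sales WHERE id = ?", (2,)
).fetchone()

# OLAP-style query: scan and aggregate the whole table to find items
# whose sales increased from month 1 to month 2 in each region.
growing = conn.execute(
    """
    SELECT region, item
    FROM sales
    GROUP BY region, item
    HAVING SUM(CASE WHEN month = 2 THEN quantity ELSE 0 END)
         > SUM(CASE WHEN month = 1 THEN quantity ELSE 0 END)
    ORDER BY region, item
    """
).fetchall()

print(one_sale)
print(growing)
```

The key lookup touches one tuple, while the aggregate query has to read every row, which is why OLAP queries on large tables can run for a long time.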
26. DATA MINING TOOLS
Microsoft SQL Server 2005
Microsoft SQL Server 2008
Oracle Data Mining
DB Miner
27. Latest Research and Reviews on Data Mining
1. Systematic discovery of mutation-specific synthetic lethals by mining
pan-cancer human primary tumor data.
2. Multi-label Learning for Predicting the Activities of Antimicrobial
Peptides.
3. Semantic correction system - A little complex but interesting. Retrieved
text often contains semantic errors, which lead to wrong results. Applying
semantic correction as a preprocessing step leads to better outcomes.
28. 4. Syntactic correction system - Much needed nowadays. Non-English
speakers make many syntactic errors. It can also be used as a
preprocessing step in many projects. Your algorithm should
automatically detect such errors and suggest correct grammar.
5. Search engine for Wikipedia - Wikipedia data is available as a dump file.
Check DBpedia for reference. Apply indexing techniques and build
a small search engine for wiki pages. Wikipedia already provides this
functionality, but you can work on better user experience and result
optimization.
6. Twitter tweets classifier - Pretty easy and interesting too. Create a
learning system for various categories such as sports, entertainment,
business, politics, and Hollywood. Train the classifier (naive Bayes,
SVM) and predict the category for incoming tweets.
29. 7. Sentiment analysis for Twitter, reviews, and conversations - There are a few
packages available in R that can help perform this job. One needs to add
a few additional features on top of them to make it more intuitive. NLTK and
the Stanford NLP tools are good open-source tools for the same.
8. Spam mail detection - Again a learning-based classification system. Train
the classifier on mail that users have already marked as spam so that it can
classify new incoming mail. If users mark new mail as spam, then
retrain (there may be a better option).
9. Sarcasm detection - This can be a very interesting one. In sentiment
analysis we identify users' sentiment regarding something; here we identify
sarcasm expressed by users. Check out the page on psu.edu - Sarcasm detection
on Twitter.
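The tweet classifier idea in item 6 can be sketched as a small multinomial naive Bayes model with add-one (Laplace) smoothing. This is a minimal sketch under stated assumptions: the `NaiveBayes` class, the whitespace tokenizer, and the four training tweets are all hypothetical, and a real project would more likely use an existing library and far more data.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter()            # label -> number of documents
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total_words = sum(self.word_counts[label].values())
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total_docs)
            for w in words:
                score += math.log(
                    (self.word_counts[label][w] + 1)
                    / (total_words + len(self.vocabulary))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training tweets (assumed data).
clf = NaiveBayes()
clf.train("the team won the match", "sports")
clf.train("great goal in the final match", "sports")
clf.train("shares fell as the market closed", "business")
clf.train("the company reported strong profit", "business")

print(clf.predict("what a goal by the team"))
print(clf.predict("market profit fell"))
```

Because `train` only updates counts, the same sketch also fits the spam detection idea in item 8: newly marked mail can be added with another `train` call instead of rebuilding the model from scratch.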