The document discusses data purification in a data warehouse and makes three key points: 1) Data in a data warehouse is integrated from different source systems. ETL processing improves its consistency and scope, but 100% purity cannot be guaranteed. 2) Purifying large volumes of data is challenging because the purification process is unpredictable. To address this, the document proposes dividing data by priority, purifying high-priority data at 100% and medium-priority data at 50%. 3) Eliminating redundant data is important because duplication is a main cause of impurity. Effective data purification requires proper knowledge, suitable tools, regular review, and adherence to priorities and schedule.
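
The priority split and the duplicate removal described above can be illustrated with a small Python sketch. It is not taken from the document. It reads "100%" and "50%" as the share of records in each priority tier that is sent for purification, which is only one possible interpretation. The record fields, the matching rule in `normalize`, and the 0% share for low-priority data are all hypothetical assumptions made for the example.

```python
import math
from collections import defaultdict

# Share of each priority tier to purify. The high and medium values follow the
# summary; the "low" value is an assumption, since the text does not state one.
PURIFY_SHARE = {"high": 1.0, "medium": 0.5, "low": 0.0}


def normalize(record):
    """Build a comparison key so trivially different copies match.

    Matching on trimmed, lower-cased name and email is a hypothetical rule.
    """
    return (record["name"].strip().lower(), record["email"].strip().lower())


def drop_duplicates(records):
    """Keep the first occurrence of each normalized key, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def select_for_purification(records):
    """Pick, per priority tier, the leading share of records to purify."""
    by_priority = defaultdict(list)
    for record in records:
        by_priority[record["priority"]].append(record)

    selected = []
    for priority, group in by_priority.items():
        share = PURIFY_SHARE.get(priority, 0.0)  # unknown tiers are skipped
        count = math.ceil(len(group) * share)
        selected.extend(group[:count])
    return selected


# Made-up sample records, for illustration only.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "priority": "high"},
    {"name": " ann lee ", "email": "ANN@example.com", "priority": "high"},
    {"name": "Bob Roy", "email": "bob@example.com", "priority": "high"},
    {"name": "Cy Diaz", "email": "cy@example.com", "priority": "medium"},
    {"name": "Di Park", "email": "di@example.com", "priority": "medium"},
    {"name": "Ed Moss", "email": "ed@example.com", "priority": "low"},
]

unique = drop_duplicates(records)
selected = select_for_purification(unique)
print(len(records), len(unique), len(selected))
```

Removing duplicates before selecting records keeps redundant copies from consuming part of a tier's purification budget. A real warehouse would likely need a richer matching rule than the exact key comparison used here.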