Difference between data mining and data warehouse

Data mining and data warehousing are both business intelligence techniques that support decision making, but they address different aspects of how a company's data is handled. A data warehouse is an environment in which a company's data is collected and stored in an aggregated, summarized form. Data mining, on the other hand, is a process that applies algorithms to extract knowledge from the data, including knowledge you did not know existed in the database.

Let's examine the difference between data mining and data warehousing with the help of the comparison chart below.

Comparison chart

Basis for comparison | Data Mining | Data Warehousing
Basic | Data mining is the process of retrieving or extracting significant data from a database or data warehouse. | A data warehouse is a repository where information from multiple sources is stored under a single schema.

Definition of data mining

Data mining is a process for discovering knowledge that you would never have expected to exist in your database. A traditional query tool can retrieve only information that is already known to be in the data. Data mining, however, offers a way to recover hidden information from the data. It extracts significant information from the database that can be used in the decision-making process.

Knowledge discovery in databases, known as KDD, reveals relationships and patterns. A relationship can hold between two or more different objects, or between attributes of the same object. A pattern is the other outcome of data mining: a regular and intelligible sequence in the information that helps in decision making.

The steps involved in KDD, or Knowledge Discovery in Databases, can be summarized as follows. First comes selection of the data set to be mined. Next comes pre-processing, which involves the removal of inconsistent data. Then comes transformation, in which the data is converted into a form appropriate for data mining. Next comes data mining itself, in which data mining algorithms are applied to the data. Finally come interpretation and evaluation, which involve extracting the relationships or patterns found in the data.
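The five KDD steps above can be sketched as a small pipeline. This is only an illustrative sketch: the sample records, the function names, the choice of counting co-occurring item pairs as the "mining algorithm", and the `min_support` threshold are all assumptions made for the example, not part of any particular tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records; a basket of None stands in for inconsistent data.
raw_records = [
    {"region": "north", "basket": ["bread", "milk"]},
    {"region": "north", "basket": ["bread", "milk", "eggs"]},
    {"region": "south", "basket": ["milk", "eggs"]},
    {"region": "north", "basket": None},
    {"region": "north", "basket": ["bread", "eggs", "milk"]},
]

def select(records, region):
    """Selection: choose the data set to be mined."""
    return [r for r in records if r["region"] == region]

def preprocess(records):
    """Pre-processing: remove inconsistent (empty or missing) records."""
    return [r for r in records if r["basket"]]

def transform(records):
    """Transformation: convert each record into a set of items."""
    return [frozenset(r["basket"]) for r in records]

def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts

def evaluate(pair_counts, n_baskets, min_support):
    """Interpretation and evaluation: keep only pairs that are frequent enough."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }

selected = select(raw_records, "north")
cleaned = preprocess(selected)
baskets = transform(cleaned)
pair_counts = mine(baskets)
patterns = evaluate(pair_counts, len(baskets), min_support=0.9)
print(patterns)  # {('bread', 'milk'): 1.0}
```

Here the relationship that survives evaluation is that bread and milk appear together in every cleaned basket, which is the kind of result a plain lookup query would not surface unless you already knew to ask for it.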

Data mining fits well with the data warehouse environment, which stores data in an aggregated and summarized form. This makes it easier to mine data held in a data warehouse.

Definition of Data Warehousing

A data warehouse is a central location where information collected from multiple sources is stored under a single unified schema. The data is first collected from various business sources, then cleaned, transformed, and stored in the data warehouse. Once data has been placed in the warehouse, it remains there for a long period of time and can be accessed whenever it is needed.
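As a rough illustration of collecting, cleaning, transforming, and storing data under a single schema, the sketch below loads two made-up sources into one table. The source names (`crm_rows`, `shop_rows`), their fields, and the `sales` table layout are assumptions invented for this example, and an in-memory SQLite database merely stands in for a real warehouse.

```python
import sqlite3

# Hypothetical extracts from two business sources with different layouts.
crm_rows = [
    {"customer": " Alice ", "amount_usd": "120.50", "date": "2023-01-05"},
    {"customer": "Bob", "amount_usd": None, "date": "2023-01-06"},  # missing amount
]
shop_rows = [
    {"client_name": "alice", "total_cents": 3000, "sold_on": "2023-02-01"},
]

def from_crm(row):
    """Clean and transform a CRM row; return None if it is unusable."""
    if row["amount_usd"] is None:
        return None
    return (row["customer"].strip().title(), float(row["amount_usd"]), row["date"], "crm")

def from_shop(row):
    """Transform a shop row into the same unified layout."""
    return (row["client_name"].strip().title(), row["total_cents"] / 100, row["sold_on"], "shop")

# The single unified schema shared by every source.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT, source TEXT)"
)

rows = [r for r in map(from_crm, crm_rows) if r is not None]
rows += [from_shop(r) for r in shop_rows]
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
warehouse.commit()

# Once loaded, the integrated data can be queried whenever it is needed.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('Alice', 150.5)]
```

Because both sources end up in the same columns with the same name formatting, one query can summarize a customer across systems that originally disagreed on field names and units.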

A data warehouse is a blend of technologies such as data modeling, data acquisition, data management, metadata management, and development tool store management. Together these technologies support functions such as data extraction, data transformation, data storage, and providing user interfaces for accessing the data.

A data warehouse is not a product or a piece of software; it is an information environment that provides an integrated view of the enterprise. It gives access to the company's current and historical data, which helps in the decision-making process. It supports decision-support transactions without affecting operational systems. It is a flexible resource for obtaining strategic information.

Key differences between data mining and data warehouse

  1. The fundamental difference is that data mining is a process of extracting significant data from a large database or data warehouse, whereas a data warehouse provides an environment where data is stored in an integrated form, which allows data mining to extract data more efficiently.

Conclusion:

Data mining can be performed only when a large, well-integrated database exists, such as a data warehouse. The data warehouse must therefore be in place before data mining begins. It must hold information in a well-integrated form so that data mining can extract knowledge efficiently.