Data Matching and Data Mining

Definition
Data matching is the process of **comparing two different sets** of raw data.

Description

 * **Data matching** enables users to eliminate discrepancies between different data sets, determine the correct value, and reduce the number of errors.
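
The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the two data sets (`crm`, `billing`), their fields, and the helper `match_records` are all hypothetical. It reports records that exist in only one set and fields whose values disagree, which is the information a user would need to decide on the correct value.

```python
# Hypothetical records: the same customers as stored in two separate systems.
crm = {
    101: {"name": "Ann Lee", "city": "Boston"},
    102: {"name": "Bob Roy", "city": "Denver"},
    103: {"name": "Cy Diaz", "city": "Austin"},
}
billing = {
    101: {"name": "Ann Lee", "city": "Boston"},
    102: {"name": "Bob Roy", "city": "Dever"},  # typo in one source
    104: {"name": "Di Wong", "city": "Miami"},
}


def match_records(left, right):
    """Compare two keyed data sets and report where they disagree."""
    report = {"only_left": [], "only_right": [], "mismatched": {}}
    for key in sorted(left.keys() | right.keys()):
        if key not in right:
            report["only_left"].append(key)
        elif key not in left:
            report["only_right"].append(key)
        else:
            # Collect every field whose value differs between the two sets.
            fields = sorted(left[key].keys() | right[key].keys())
            diffs = {
                field: (left[key].get(field), right[key].get(field))
                for field in fields
                if left[key].get(field) != right[key].get(field)
            }
            if diffs:
                report["mismatched"][key] = diffs
    return report


report = match_records(crm, billing)
print(report)
```

The sketch only finds the differences. Deciding which value is correct (for example, trusting the most recently updated source) is a separate rule that depends on the data and is not shown here.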



The overall goal of the **data mining** process is to extract information from a data set and transform it into an understandable structure for further use. It allows users to analyze data from **different perspectives**, **categorize** it, and **summarize** the relationships identified. It involves the following steps:
 * Extract, select, and load data into a data warehouse
 * Pre-process the data (data cleaning)
 * Store and manage the data in a database
 * Analyze the data using specialized software
 * Visualize the patterns found (present them in a useful format such as graphs or diagrams)
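
As a rough illustration, the five steps above can be walked through on a toy scale in Python. Everything here is an assumption made for the example: the sales rows are invented, an in-memory SQLite database stands in for the data warehouse, and a plain-text bar chart stands in for real visualization software.

```python
import sqlite3

# Step 1: extract and load (hypothetical raw point-of-sale rows).
raw_rows = [
    ("beer", "4.50"), ("diapers", "12.00"), ("beer", "4.50"),
    ("milk", ""), ("  Beer ", "4.50"), ("milk", "2.25"),
]

# Step 2: pre-process: normalize product names, drop rows with no price.
clean_rows = [
    (product.strip().lower(), float(price))
    for product, price in raw_rows
    if price.strip()
]

# Step 3: store and manage the data in a database (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, price REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", clean_rows)

# Step 4: analysis: count the units sold per product.
units = dict(
    conn.execute("SELECT product, COUNT(*) FROM sales GROUP BY product").fetchall()
)
conn.close()

# Step 5: visualization: a plain-text bar chart, best seller first.
for product, count in sorted(units.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{product:<8} {'#' * count}")
```

In practice each step is usually handled by dedicated tools. The point of the sketch is only the order of the steps and what each one hands to the next.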



**Figure 1**

What is Data Mining? (Video by Stephan Kudyba, professor at the New Jersey Institute of Technology)

media type="youtube" key="R-sGvh6tI04" width="728" height="407" align="center"

Specific examples

 * Data mining is primarily used by **retail, communication and marketing organizations**. It lets companies find relationships among "internal" factors (price, product positioning, staff skills) and "external" factors (economic indicators, competition, customer demographics). Mining enables companies to **determine the impact of these factors on sales, customer satisfaction or profit.**
 * Companies may also use it to create **targeted advertisements** and send them to specific clients based on a database of their rental history.
 * Blockbuster Entertainment mines its video rental history database to recommend rentals to individual customers.
 * American Express can suggest products to its cardholders based on analysis of their monthly expenditures.
 * **WalMart** is pioneering massive data mining to transform its supplier relationships. WalMart captures point-of-sale transactions from over 2,900 stores in 6 countries and continuously transmits this data to its massive 7.5 terabyte data warehouse. WalMart allows **more than 3,500 suppliers** to access data on their products and perform data analyses. These suppliers use this data to identify **customer buying patterns** at the store display level. They use this information to manage local store inventory and identify new merchandising opportunities. In 1995, WalMart computers processed over 1 million complex data queries.
 * One **Midwest grocery chain** used the data mining capacity of **Oracle** **software** to analyze local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they bought only a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend. The grocery chain could use this newly discovered information in **various ways to increase revenue**. For example, they could move the beer display closer to the diaper display. They could also make sure beer and diapers were sold at full price on Thursdays.
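
A pattern like the diapers-and-beer example is commonly expressed as an association rule and scored with support, confidence, and lift. The sketch below is not the grocery chain's actual analysis (the source only says Oracle software was used). The baskets are invented, and `rule_metrics` is a hypothetical helper that applies the standard definitions of the three measures.

```python
# Hypothetical shopping baskets, one set of products per transaction.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
    {"bread", "chips"},
    {"milk"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Score the rule 'antecedent -> consequent' over a list of baskets."""
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    if has_a == 0 or has_c == 0:
        raise ValueError("both items must appear in at least one basket")
    support = has_both / n            # share of all baskets with both items
    confidence = has_both / has_a     # share of antecedent baskets with consequent
    lift = confidence / (has_c / n)   # above 1: bought together more than by chance
    return support, confidence, lift


support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(f"support={support}, confidence={confidence}, lift={lift}")
```

With these made-up baskets, three of the four diaper purchases also include beer, so the rule scores well. On real data the same calculation would be run per weekday to surface a Thursday effect like the one described above.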

Resources

 * Palace, B. Data Mining: What is Data Mining? Retrieved December 5, 2014, from http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
 * http://msdn.microsoft.com/en-us/library/hh213071.aspx
 * Figure 1 retrieved from http://www.information-mining.org/data-mining/