Big Data Analysis Techniques

Definition
Big Data Analysis Techniques are also known as Data Mining: the analytic process of examining an extremely large database in order to find consistent patterns and relationships between different variables.

Description
Data Mining has one complex goal: predicting the future behaviour of data points. It consists of three stages:

1st) Data preparation, cleaning, transforming and taking the needed subset of data with needed characteristics.
2nd) Depending on the patterns of data, different applicable models are considered which can best describe the data behavior.
3rd) The desired model is applied to the data and predictions and expected estimates are generated.
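The three stages can be sketched on a toy data set. This is only an illustration under stated assumptions: the data (hours studied against exam score), the function names `prepare`, `fit_line` and `predict`, and the choice of a straight-line model are all hypothetical and do not come from the sources listed below.

```python
# Hypothetical toy data: (hours_studied, exam_score); None marks a missing value.
raw = [(1, 52), (2, 54), (None, 70), (3, 56), (4, 58), (5, None)]


def prepare(rows):
    """Stage 1: clean the data by keeping only complete records."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(rows):
    """Stage 2: pick a model; here a least-squares line y = a * x + b."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(model, x):
    """Stage 3: apply the fitted model to produce an estimate."""
    a, b = model
    return a * x + b


clean = prepare(raw)
model = fit_line(clean)
forecast = predict(model, 6)
print(clean, model, forecast)
```

With real data the second stage would normally compare several candidate models before one is chosen; the single line fitted here only stands in for that step.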

Specific examples
FIRST STAGE. One of the data mining techniques **in the first stage** is //clustering//. Roughly speaking, __clustering__ means grouping similar members of a set of objects, so a __cluster__ is a collection of objects that share a common characteristic. In the figure above, the dots are clustered on the basis of distance: dots that lie close together are grouped, and the process is called “distance-based clustering”. There is also “concept-based clustering”, in which objects are grouped on the basis of a shared concept. For example, given the objects “Shyngys”, “Daniyar”, “Ermek”, “Assiya” and “Nurbek”, we could cluster them into:
 * Shyngys, Ermek, Nurbek
 * Daniyar, Assiya

This process would be “gender-based clustering”.
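Both kinds of clustering can be sketched in a few lines. The code below is a minimal illustration, not the method used in the cited tutorial: the point coordinates, the threshold, the function names and the placeholder labels `"group 1"` and `"group 2"` are assumptions, and the label table simply copies the grouping given in the list above.

```python
from math import dist


def distance_clusters(points, threshold):
    """Distance-based clustering: a point joins every cluster that has a
    member within `threshold`; clusters linked by that point are merged."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            close = any(dist(p, q) <= threshold for q in c)
            (near if close else far).append(c)
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters


def concept_clusters(objects, concept):
    """Concept-based clustering: group objects that share the same concept value."""
    groups = {}
    for obj in objects:
        groups.setdefault(concept(obj), []).append(obj)
    return groups


# Hypothetical dots: three near (1, 1) and three near (8, 8).
dots = [(1, 1), (1.5, 1.2), (1.2, 0.8), (8, 8), (8.3, 7.9), (7.8, 8.4)]
dot_clusters = distance_clusters(dots, threshold=1.0)

# Placeholder labels that mirror the grouping shown in the text.
labels = {"Shyngys": "group 1", "Ermek": "group 1", "Nurbek": "group 1",
          "Daniyar": "group 2", "Assiya": "group 2"}
names = ["Shyngys", "Daniyar", "Ermek", "Assiya", "Nurbek"]
name_clusters = concept_clusters(names, labels.get)
print(dot_clusters)
print(name_clusters)
```

The distance-based function needs only a notion of distance between objects, while the concept-based one needs a rule that assigns each object to a concept.
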
SECOND STAGE. One of the techniques used **in the second stage** is building a neural network. This technique is modelled on the way our brain is thought to work. Our brain consists of billions of neurons connected together. As we learn to walk, to talk and to count, new networks emerge, connect and bond together. In the same way, patterns in data can be found by starting from a small neural network.
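
The smallest possible starting point is a single artificial neuron. The sketch below trains one neuron with the classic perceptron learning rule to reproduce the logical AND pattern. It is an assumption-laden illustration rather than a full neural network: the function names, the training data, the learning rate and the number of epochs are all chosen here for the example.

```python
def step(z):
    """Activation: the neuron fires (1) when its weighted input is non-negative."""
    return 1 if z >= 0 else 0


def train_neuron(samples, epochs=10, rate=1.0):
    """Perceptron rule: nudge the weights whenever the output is wrong."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = step(w[0] * x1 + w[1] * x2 + b)
            err = target - out
            w[0] += rate * err * x1
            w[1] += rate * err * x2
            b += rate * err
    return w, b


def fire(model, x1, x2):
    """Apply the trained neuron to one input pair."""
    w, b = model
    return step(w[0] * x1 + w[1] * x2 + b)


# Hypothetical training pattern: output 1 only when both inputs are 1 (logical AND).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
neuron = train_neuron(and_samples)
outputs = [fire(neuron, x1, x2) for (x1, x2), _ in and_samples]
print(outputs)
```

A single neuron can only learn patterns that a straight line can separate; finding richer patterns requires connecting many such units into layers.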

media type="youtube" key="qbhMpthdWDM" width="560" height="315"

Contributor: Assiya
Additionally, the techniques for big data analysis have progressed from single-variable to multi-variable analysis. This helps to isolate the data we want to look at from the data we are not interested in. The concept is explained further by Michael Housman, Chief Analytics Officer at Evolv (Housman, M., 18 August 2014).
media type="youtube" key="YDEUhiSVJIw" width="560" height="315"

Resources

 * What is Data Mining (Predictive Analytics, Big Data). (n.d.). Retrieved December 2, 2014, from http://www.statsoft.com/textbook/data-mining-techniques
 * Clustering - Introduction. (n.d.). Retrieved December 2, 2014, from http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/