Sandbox+Yerlan

**Analysis of data** is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
Several phases of this process can be distinguished. The phases are iterative, in that feedback from later phases may result in additional work in earlier phases.

The data necessary as inputs to the analysis are specified based on the requirements of those directing the analysis or of the customers who will use its finished product.

Data is collected from a variety of sources. It may be collected from sensors in the environment, such as traffic cameras, satellites, and recording devices. It may also be obtained through interviews, downloads from online sources, or reading documentation.

Data initially obtained must be processed or organized for analysis. For instance, this may involve placing data into rows and columns in a table format, such as within a spreadsheet or statistical software.

Once processed and organized, the data may be incomplete or contain errors. The need for data cleaning arises from problems in the way that data is entered and stored. Data cleaning is the process of preventing and correcting these errors. The appropriate type of cleaning depends on the type of data. For quantitative data, outlier detection methods can be used to remove values that were likely entered incorrectly. For textual data, spellcheckers can reduce the number of mistyped words, but it is harder to tell whether the words themselves are correct.

Once the data is cleaned, it can be analyzed. Analysts may apply a variety of techniques to begin understanding the messages contained in the data. The process of exploration may result in additional data cleaning or additional requests for data, so these activities may be iterative in nature.

Mathematical formulas or models called algorithms may be applied to the data to identify relationships among the variables, such as correlation or causation. Analysts may attempt to build models that are descriptive of the data to simplify analysis and communicate results.

A data product is a computer application that takes data inputs and generates outputs, feeding them back into the environment. It may be based on a model or algorithm. An example is an application that analyzes data about customer purchasing history and recommends other purchases the customer might enjoy.

Once the data is analyzed, it may be reported in many formats to the users of the analysis to support their requirements. The users may have feedback, which results in additional analysis. As such, much of the analytical cycle is iterative. When determining how to communicate the results, the analyst may consider data visualization techniques to help clearly and efficiently communicate the message to the audience.
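The data product example above (an application that recommends purchases from customer purchasing history) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how such an application works, so the co-purchase counting used here is one simple choice among many, and the function names and the sample orders are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_co_purchase_counts(orders):
    """Count how often each item appears in the same order as each other item."""
    counts = {}
    for order in orders:
        # set() ignores repeated items within a single order.
        for item, other in permutations(set(order), 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts


def recommend(counts, history, top_n=2):
    """Suggest the items most often bought together with items in `history`."""
    owned = set(history)
    scores = Counter()
    for item in owned:
        for other, n in counts.get(item, {}).items():
            if other not in owned:
                scores[other] += n
    # Sort by score (descending), then by name, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase histories, made up for illustration.
orders = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["stove", "fuel"],
    ["tent", "stove", "fuel"],
]
counts = build_co_purchase_counts(orders)
print(recommend(counts, ["tent"]))
```

The inputs are past orders, the output is a list of suggestions, and those suggestions feed back into the environment when the customer acts on them, which in turn produces new purchase data.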
 * **Data requirements**
 * **Data collection**
 * **Data processing**
 * **Data cleaning**
 * **Exploratory data analysis**
 * **Modeling and algorithms**
 * **Data product**
 * **Communication**
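Three of the phases listed above (data processing, data cleaning, and modeling) can be walked through on a tiny dataset. The sketch below is a minimal illustration, not a prescribed method: the records are made up, the interquartile-range rule is an assumed choice of outlier detection, and Pearson correlation is an assumed choice of relationship measure. The function names are hypothetical.

```python
import math
import statistics


def to_rows(records):
    """Data processing: turn raw string records into (hours, score) rows."""
    return [(float(r["hours"]), float(r["score"])) for r in records]


def drop_outliers(rows, column, k=1.5):
    """Data cleaning: keep rows whose value in `column` lies within the IQR fences."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in rows if low <= row[column] <= high]


def pearson(xs, ys):
    """Modeling: Pearson correlation coefficient between two columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical raw records; "640" stands in for a mistyped "64".
raw = [
    {"hours": "1", "score": "50"}, {"hours": "2", "score": "54"},
    {"hours": "3", "score": "57"}, {"hours": "4", "score": "61"},
    {"hours": "5", "score": "640"}, {"hours": "6", "score": "68"},
    {"hours": "7", "score": "71"}, {"hours": "8", "score": "75"},
]

rows = to_rows(raw)
clean = drop_outliers(rows, column=1)
r_before = pearson([h for h, _ in rows], [s for _, s in rows])
r_after = pearson([h for h, _ in clean], [s for _, s in clean])
print(f"rows kept: {len(clean)} of {len(rows)}")
print(f"correlation before cleaning: {r_before:.3f}, after: {r_after:.3f}")
```

Comparing the correlation before and after cleaning shows why the phases are iterative: a single mistyped value can hide a relationship that the modeling phase would otherwise find, which sends the analyst back to the cleaning phase.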
