Data Mining Tasks
Data mining tasks are generally divided into two categories:
- Predictive tasks, which predict the value of a particular attribute based on the values of other attributes. The attribute to be predicted is known as the target or dependent variable, and the attributes used for making the prediction are known as explanatory or independent variables.
- Descriptive tasks, which derive patterns (correlations, trends, clusters, trajectories, and anomalies) that summarize the underlying relationships in the data. These tasks are exploratory in nature and require postprocessing techniques to validate and explain the results.
The core data mining tasks are:
- Predictive Modeling
- Association Analysis
- Cluster Analysis
- Anomaly Detection
Predictive Modeling is used to build a model for the target variable as a function of the explanatory variables.
The two types of predictive modeling tasks are Regression, used for continuous target variables, and Classification, used for discrete (categorical) target variables.
Example: Predicting whether a patient has a particular disease
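The difference between the two predictive tasks can be sketched in a few lines of Python. This is a minimal illustration, not a recommended method: least-squares line fitting stands in for regression and a 1-nearest-neighbor rule stands in for classification. All data values, the feature choice (temperature, white cell count), and the names fit_line and nearest_neighbor_label are assumptions invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor_label(train, query):
    """Return the label of the training row closest to query (classification)."""
    def squared_distance(features):
        return sum((a - b) ** 2 for a, b in zip(features, query))
    closest_features, closest_label = min(train, key=lambda row: squared_distance(row[0]))
    return closest_label


# Regression: the target y is continuous (toy values, invented).
xs = [1.0, 2.0, 3.0, 4.0]   # explanatory variable
ys = [3.0, 5.0, 7.0, 9.0]   # continuous target variable
slope, intercept = fit_line(xs, ys)
print("predicted y at x=5:", slope * 5.0 + intercept)

# Classification: the target is a category (toy patient records, invented).
# Each row is ((temperature in Celsius, white cell count), label).
patients = [
    ((36.6, 6.0), "healthy"),
    ((36.8, 7.0), "healthy"),
    ((38.9, 13.0), "disease"),
    ((39.4, 15.0), "disease"),
]
print("predicted label:", nearest_neighbor_label(patients, (39.0, 12.5)))
```

In both cases the model maps explanatory variables to a target; only the type of the target changes, a number for regression and a category for classification.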
Association Analysis is used to discover patterns that describe strongly associated features in the data.
Example: Identifying products that are frequently bought together
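As a rough sketch of how "strongly associated" can be quantified, the code below counts how often pairs of products appear in the same transaction. It uses support and confidence, two measures commonly used in association analysis but not defined in the text above, so treat them as an assumption of this example. The transactions and the minimum count of 3 are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy market-basket data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)


# Count every pair of items that occurs together in a transaction.
pair_counts = Counter(
    pair for t in transactions for pair in combinations(sorted(t), 2)
)

# Keep pairs seen in at least 3 transactions (threshold chosen arbitrarily).
frequent_pairs = {pair for pair, count in pair_counts.items() if count >= 3}

print("frequent pairs:", sorted(frequent_pairs))
print("support({diapers, beer}):", support({"diapers", "beer"}))
print("confidence(diapers -> beer):", confidence({"diapers"}, {"beer"}))
```

Enumerating all pairs is only practical for small item sets; larger data usually needs a pruning strategy, which this sketch leaves out.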
Cluster Analysis is used to identify groups of closely related observations.
Example: Grouping articles by related topic
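One way to make "groups of closely related observations" concrete is k-means, shown here as a minimal sketch. K-means is just one of many clustering techniques and is chosen here only for brevity. The 2-D points are invented; in the article example each point could stand in for a numeric feature vector derived from an article, which is an assumption of this illustration. The starting centroids are passed in explicitly so the run is deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate between assigning points and updating centroids until stable."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy observations forming two visibly separate groups (invented values).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, [points[0], points[3]])
print("cluster labels:", labels)
print("centroids:", centroids)
```

Unlike classification, no labels are supplied in advance; the groups emerge only from how close the observations are to one another.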
Anomaly Detection is used to identify observations whose characteristics are significantly different from the rest of the data. A good anomaly detector must have a high detection rate and a low false alarm rate.
Example: Detecting credit card fraud
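The two quality measures for an anomaly detector can be illustrated with a small sketch. It assumes the usual definitions: detection rate is the fraction of true anomalies that are flagged, and false alarm rate is the fraction of normal observations that are flagged. The detector is a simple z-score rule, chosen only for illustration; the transaction amounts, the fraud labels, and the threshold of 2.0 are all invented.

```python
import statistics


def zscore_flags(values, threshold=2.0):
    """Flag values lying more than threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return [False] * len(values)
    return [abs(v - mean) / spread > threshold for v in values]


def rates(flags, is_anomaly):
    """Return (detection rate, false alarm rate) for a list of flags."""
    true_pos = sum(f and a for f, a in zip(flags, is_anomaly))
    false_pos = sum(f and not a for f, a in zip(flags, is_anomaly))
    n_anomalies = sum(is_anomaly)
    n_normal = len(is_anomaly) - n_anomalies
    detection_rate = true_pos / n_anomalies if n_anomalies else 0.0
    false_alarm_rate = false_pos / n_normal if n_normal else 0.0
    return detection_rate, false_alarm_rate


# Toy card transaction amounts; the 500.0 entry is labeled as fraud (invented).
amounts = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 500.0, 18.0, 26.0]
is_fraud = [False] * 7 + [True] + [False] * 2

flags = zscore_flags(amounts, threshold=2.0)
detection_rate, false_alarm_rate = rates(flags, is_fraud)
print("flagged amounts:", [a for a, f in zip(amounts, flags) if f])
print("detection rate:", detection_rate, "false alarm rate:", false_alarm_rate)
```

A single extreme value inflates the standard deviation, so with only ten observations a threshold of 3 would never be exceeded; that is why the sketch uses 2.0. Real fraud detection relies on far richer features than a single amount.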