Data Mining Tasks


A skilled construction professional specializing in MEP projects. Armed with a Master's degree in Data Science, seamlessly combines hands-on expertise in construction with a passion for Python, NLP, Deep Learning, and Data Visualization. While currently at a basic level, dedicated to enhancing data skills, envisioning a future where insights derived from data reshape the landscape of construction practices. With a forward-thinking mindset, not only building structures but also shaping the future at the intersection of construction and data.

Data mining tasks are generally divided into two categories:

Predictive tasks predict the value of a particular attribute based on the values of other attributes. The attribute to be predicted is known as the target or dependent variable, and the attributes used to make the prediction are known as explanatory or independent variables.

Descriptive tasks derive patterns (correlations, trends, clusters, trajectories, and anomalies) that summarize the underlying relationships in the data. They are exploratory in nature and typically require postprocessing techniques to validate and explain the results.

The four core data mining tasks are:

  • Predictive Modeling
  • Association Analysis
  • Cluster Analysis
  • Anomaly Detection

Predictive Modeling is used to build a model for the target variable as a function of the explanatory variables.

There are two types of predictive modeling tasks: regression, used for continuous target variables, and classification, used for discrete (categorical) target variables.

Example: Predicting whether a patient has a particular disease
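To make the regression/classification distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the helper names `fit_line` and `nearest_neighbor_label`, the choice of least squares and 1-nearest-neighbor as the models, and the made-up patient values.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (a regression model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def nearest_neighbor_label(train, query):
    """1-nearest-neighbor: return the label of the closest training row (a classifier)."""
    closest = min(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )
    return closest[1]

# Regression: the target (blood pressure) is continuous. Toy, made-up values.
ages = [30, 40, 50, 60]                # explanatory variable
pressure = [120.0, 125.0, 130.0, 135.0]  # target variable
slope, intercept = fit_line(ages, pressure)
predicted_pressure = slope * 45 + intercept

# Classification: the target (diagnosis) is categorical. Toy, made-up values.
# Each row is ((body temperature, heart rate), label).
patients = [
    ((36.8, 70), "healthy"),
    ((39.2, 110), "sick"),
    ((37.0, 65), "healthy"),
    ((38.9, 105), "sick"),
]
predicted_label = nearest_neighbor_label(patients, (39.0, 100))

print(predicted_pressure, predicted_label)
```

The regression model returns a number on a continuous scale, while the classifier returns one of a fixed set of labels. In practice the features would also need scaling before a distance-based classifier is applied.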

Association Analysis is used to discover patterns that describe strongly associated features in the data.

Example: Identifying products bought together
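One common way to quantify "strongly associated" is with support and confidence. The sketch below assumes those two measures, a hypothetical minimum count of 3, and made-up shopping baskets; the function names are illustrative, not taken from any library.

```python
from collections import Counter
from itertools import combinations

# Toy, made-up transactions: each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Count how often each pair of products appears in the same basket.
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

min_count = 3  # assumed threshold: a pair must appear in at least 3 baskets
frequent_pairs = {pair for pair, n in pair_counts.items() if n >= min_count}

rule_support = support({"diapers", "beer"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
print(sorted(frequent_pairs), rule_support, rule_confidence)
```

Counting every pair directly is fine for a handful of baskets; on real transaction data, algorithms such as Apriori prune the candidate itemsets instead of enumerating all of them.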

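Cluster Analysis is used to identify groups of closely related observations, so that observations in the same group are more similar to each other than to those in other groups.

Example: Grouping articles by related topic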
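As a rough illustration, the sketch below groups articles with a bare-bones k-means loop. The algorithm choice, the function name `k_means`, the fixed starting centroids, and the data are all assumptions: each article is represented by made-up counts of two words, (count of "goal", count of "election").

```python
from math import dist
from statistics import mean

def k_means(points, centroids, iterations=10):
    """Bare-bones k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(coord) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Toy, made-up articles as (count of "goal", count of "election").
articles = [(5, 0), (4, 1), (6, 1), (0, 5), (1, 4), (1, 6)]

# Assumed initialization: start from the first and last article.
centroids, clusters = k_means(articles, [articles[0], articles[-1]])
print(clusters)
```

No labels are supplied anywhere: the two groups (sports-like and politics-like articles) emerge only from the distances between observations, which is what distinguishes clustering from classification.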

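Anomaly Detection is used to identify observations whose characteristics are significantly different from the rest of the data. Such observations are known as anomalies or outliers. A good anomaly detector must have a high detection rate (the share of true anomalies it flags) and a low false alarm rate (the share of normal observations it wrongly flags).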

Example: Detecting credit card fraud
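The following sketch shows one simple way to flag unusual transaction amounts and then measure the detection rate and false alarm rate. The median-based score, the threshold of 3.0, the function names, and the amounts and labels are all assumptions chosen for illustration, and the score assumes the median absolute deviation is not zero.

```python
from statistics import median

def flag_anomalies(values, threshold=3.0):
    """Flag values that lie far from the median, measured in units of the
    median absolute deviation (MAD). Assumes the MAD is not zero."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    return [abs(v - center) / mad > threshold for v in values]

def evaluate(flags, labels):
    """Return (detection rate, false alarm rate) for boolean flags vs. true labels."""
    true_positives = sum(f and l for f, l in zip(flags, labels))
    false_positives = sum(f and not l for f, l in zip(flags, labels))
    n_anomalies = sum(labels)
    n_normal = len(labels) - n_anomalies
    return true_positives / n_anomalies, false_positives / n_normal

# Toy, made-up card transactions; only the 950 purchase is actually fraud.
amounts = [20, 25, 22, 30, 27, 24, 950, 26, 23, 28, 90, 21]
is_fraud = [amount == 950 for amount in amounts]

flags = flag_anomalies(amounts)
detection_rate, false_alarm_rate = evaluate(flags, is_fraud)
print(detection_rate, false_alarm_rate)
```

Here the detector catches the fraudulent purchase but also flags the legitimate purchase of 90, so the detection rate is perfect while the false alarm rate is above zero. Raising the threshold would reduce false alarms at the risk of missing real fraud.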

