18EE654 INTRODUCTION TO DATA ANALYTICS (OPEN ELECTIVE) MODULE WISE NOTES

Course Learning Objectives:

To explain introductory concepts, a brief methodological description and some descriptive statistics of data.

To explain multivariate descriptive statistics methods of data analytics and the methods used in the data preparation phase of the CRISP-DM methodology, concerning data quality issues, converting data to different scales or scale types and reducing data dimensionality.

To explain methods involving clustering and frequent pattern mining, which aims to capture the most frequent patterns.

To explain the cheat sheet and project on descriptive analytics, along with generalization, performance measures for regression and the bias–variance trade-off.

To explain the binary classification problem, performance measures for classification, methods based on probabilities and distance measures, and more advanced and state-of-the-art methods of prediction of data.

Module-1

Introductory: Introduction to Data, Big Data and Data Science, Big Data Architectures, Small Data, What is Data? A Short Taxonomy of Data Analytics, Examples of Data Use, A Project on Data Analytics.

Descriptive Statistics: Scale Types, Descriptive Univariate Analysis, Descriptive Bivariate Analysis.

Module-2

Multivariate Analysis: Multivariate Frequencies, Multivariate Data Visualization, Multivariate Statistics, Infographics and Word Clouds. 

Data Quality and Preprocessing: Data Quality, Converting to a Different Scale Type, Converting to a Different Scale, Data Transformation, Dimensionality Reduction.

Module-3

Clustering: Distance Measures, Clustering Validation, Clustering Techniques.

Frequent Pattern Mining: Frequent Itemsets, Association Rules, Behind Support and Confidence, Other Types of Pattern.

Module-4

Cheat Sheet and Project on Descriptive Analytics: Cheat Sheet of Descriptive Analytics, Project on Descriptive Analytics.

Regression: Predictive Performance Estimation, Finding the Parameters of the Model, Technique and Model Selection.

Module-5

Classification: Binary Classification, Predictive Performance Measures for Classification, Distance-based Learning Algorithms, Probabilistic Classification Algorithms.

