Data Mining

Hours: 3 0 3

Introduction to data mining and related technologies: machine learning, DBMS, OLAP. Stages and techniques of the data mining process; methods and applications of knowledge representation. Data preprocessing: data cleaning, data transformation, data reduction, discretization, and generating concept hierarchies. The Weka 3 data mining system: filters, statistics, and discretization in Weka; measures of interestingness; visualization techniques and experiments in Weka. Attribute-oriented analysis, generalization, and relevance; class comparison; statistical measures. Algorithms and association rules: motivation and terminology, item sets, generating item sets and rules, and correlation analysis. Classification: basic learning/mining tasks; inferring rudimentary rules with the 1R algorithm; decision trees and rules in Weka; statistical (Bayesian) classification; Bayesian networks; instance-based methods (nearest neighbor); linear models; training and testing in Weka; estimating classifier accuracy; combining multiple models (bagging, boosting, stacking); and the Minimum Description Length (MDL) principle. Clustering: partitioning methods: k-means and expectation maximization (EM); hierarchical methods: distance-based agglomerative and divisive clustering; Cobweb. Text mining: extracting attributes, structural approaches (parsing, soft parsing), and the Bayesian approach to classifying text. Web mining: classifying web pages and extracting knowledge from the web. Data mining software and applications.

Pre-requisites: AI231
Co-requisites: AI

Hours: XYZ, where X = Lecture, Y = Lab, Z = Credit.
All hours are per week.
3 lab hours constitute 1 credit hour.
1 credit hour implies 1 lecture of 50 minutes per academic week, over 16 weeks in total.
Pre-requisite courses are courses that must be completed before this course may be taken.
Co-requisite courses are courses that must be taken along with this course.