
DATA MINING AND DATA WAREHOUSE

 


UNIT-1:

Data Mining:

Data mining is the process of extracting useful information from large sets of data. A large amount of data is available in industry, but this data is of no use until it is converted into useful information. It is therefore necessary to analyze this data and extract the useful information from it.

It is sometimes referred to as:

• Knowledge Extraction

• Knowledge Mining

• Pattern Analysis

• Data Archaeology

Areas of Data mining:

• Financial Data Analysis:

Financial data in the banking and financial industry is generally reliable and of high quality, which facilitates systematic data analysis and data mining. Some typical cases are as follows:

• Loan payment prediction and customer credit policy analysis

• Classification and clustering of customers for targeted marketing

• Detection of money laundering and other financial crimes

• Retail Industry:

Data mining in the retail industry helps identify customer buying patterns and trends, which leads to better customer service and improved customer retention and satisfaction.

• Telecommunication Industry:

Data mining in the telecommunication industry helps identify telecommunication patterns, catch fraudulent activities, make better use of resources, and improve the quality of service.

• Biological Data Analysis:

Recent years have seen tremendous growth in fields of biology such as genomics, proteomics, functional genomics, and biomedical research. Biological data mining is a very important part of bioinformatics.

CLICK HERE FOR DATA MINING AND DW FOR BCA - UNIT-1 NOTES

UNIT-2

What is Data Warehouse?

Data warehousing provides architectures and tools for business executives to systematically organize, understand, and use their data to make strategic decisions.

The term "Data Warehouse" was first coined by William H. Inmon in 1990. According to Inmon, a data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data. This data helps analysts make informed decisions in an organization.

Characteristics of a Data Warehouse:

Subject Oriented − A data warehouse is subject oriented because it provides information around a subject rather than the organization's ongoing operations. These subjects can be product, customers, suppliers, sales, revenue, etc. A data warehouse does not focus on the ongoing operations, rather it focuses on modelling and analysis of data for decision making.

Integrated − A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration enhances the effective analysis of data.

Time Variant − The data collected in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from a historical point of view (e.g., data from the past 5–10 years).

Non-volatile − Non-volatile means that previous data is not erased when new data is added. A data warehouse is kept separate from the operational database, and therefore frequent changes in the operational database are not reflected in the data warehouse.
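The four characteristics above can be illustrated with a small sketch. It is only a toy model: the record layouts, field names, and the `integrate`, `load`, and `sales_by_product` helpers are hypothetical, and a Python list stands in for a real warehouse table.

```python
from datetime import date

def integrate(crm_rows, flat_file_lines, as_of):
    """Map two differently shaped sources onto one schema (Integrated)
    and stamp every row with the period it describes (Time Variant)."""
    rows = []
    for r in crm_rows:  # source 1: dictionaries, as if read from a relational database
        rows.append({"customer": r["cust"], "product": r["product"],
                     "amount": float(r["amount"]), "as_of": as_of})
    for line in flat_file_lines:  # source 2: lines of a comma-separated flat file
        customer, product, amount = line.split(",")
        rows.append({"customer": customer, "product": product,
                     "amount": float(amount), "as_of": as_of})
    return rows

def load(warehouse, new_rows):
    """Append only: earlier rows are never updated or erased (Non-volatile)."""
    return warehouse + new_rows

def sales_by_product(warehouse):
    """Summarise around a subject, here the product (Subject Oriented)."""
    totals = {}
    for row in warehouse:
        totals[row["product"]] = totals.get(row["product"], 0.0) + row["amount"]
    return totals

# Two monthly loads with invented data.
warehouse = []
warehouse = load(warehouse, integrate(
    [{"cust": "C1", "product": "laptop", "amount": 900}],
    ["C2,phone,400"], date(2023, 1, 31)))
warehouse = load(warehouse, integrate(
    [{"cust": "C1", "product": "phone", "amount": 450}],
    ["C3,laptop,950"], date(2023, 2, 28)))

print(sales_by_product(warehouse))
```

After the second load, the January rows are still present alongside the February rows, so an analyst can compare the two periods.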

CLICK HERE DATA MINING AND DW FOR BCA - UNIT-2 NOTES

UNIT-3

Mining Frequent Patterns and Associations

Frequent Itemset Mining :

Frequent Itemset Mining (FIM) is one of the best-known techniques for extracting knowledge from data. It is widely used in data mining applications in fields such as finance and health care.

Example 1: Important uses of FIM include customer segmentation in marketing, shopping cart analysis, customer relationship management, web usage mining, and player tracking.

Example 2: FIM in Market Basket Analysis: 

This process analyzes customer buying habits by finding associations between the different items that customers place in their “shopping baskets” as shown in the adjacent figure.

The discovery of such associations can help retailers develop marketing strategies by gaining insight into which items are frequently purchased together by customers. For instance, if customers are buying milk, how likely are they to also buy bread (and what kind of bread) on the same trip to the supermarket?
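The milk-and-bread question can be made concrete with the two standard measures used in association mining, support and confidence. The sketch below is a minimal illustration in plain Python: the baskets are invented, and the helper names (`count`, `support`, `confidence`, `frequent_pairs`) are hypothetical, not taken from any library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)

def frequent_pairs(transactions, min_support):
    """All item pairs whose support is at least `min_support`."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

print(support({"milk", "bread"}, baskets))       # 0.6
print(confidence({"milk"}, {"bread"}, baskets))  # 0.75
```

In this toy data, milk and bread appear together in 3 of 5 baskets (support 0.6), and 3 of the 4 baskets containing milk also contain bread (confidence 0.75). Counting every pair directly is only practical for small data; algorithms such as Apriori prune the search on larger datasets.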

CLICK HERE DATA MINING AND DW FOR BCA - UNIT-3 NOTES

UNIT-4

Classification and Prediction

Classification is a data mining technique that assigns categories to a collection of data in order to support more accurate analysis. It sorts data into predefined groups or classes. Because classification is a form of supervised learning, the classes are determined before the data is examined.

Example: A bank loan officer analyzes loan applications to decide whether to grant a loan and to identify credit risks. Classification consists of predicting a certain outcome based on a given input. Prediction is a method used to estimate future data based on past and current data.

In order to predict the outcomes, the algorithm processes a training set containing a set of attributes and the respective outcomes, usually called goal or prediction attribute. The algorithm tries to discover relationships between the attributes that would make it possible to predict the outcomes.
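As a rough illustration of learning from a training set, the sketch below uses a 1-nearest-neighbour rule. This is one simple classifier chosen for brevity, not a method named in these notes, and the loan data and the `predict` function are hypothetical.

```python
import math

# Hypothetical training set: attributes are (income, existing debt) in thousands;
# the second element is the outcome, i.e. the goal (prediction) attribute.
training_set = [
    ((60, 5), "low risk"),
    ((75, 10), "low risk"),
    ((80, 2), "low risk"),
    ((25, 20), "high risk"),
    ((30, 30), "high risk"),
    ((20, 15), "high risk"),
]

def predict(attributes, training):
    """Return the outcome of the training example closest to `attributes`."""
    nearest = min(training, key=lambda example: math.dist(example[0], attributes))
    return nearest[1]

print(predict((70, 8), training_set))   # low risk
print(predict((28, 25), training_set))  # high risk
```

The classes ("low risk", "high risk") are fixed in advance by the labelled examples, which is what makes this supervised learning.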

Classification and prediction have numerous applications, including fraud detection, target marketing, performance prediction, manufacturing, and medical diagnosis. 

CLICK HERE DATA MINING AND DW FOR BCA - UNIT-4 NOTES

UNIT-5

Cluster Analysis 

Clustering is the process of grouping together related objects that lie close to one another.

Clustering can be defined as "the process of organizing objects into groups whose members are similar in some way". A cluster is therefore a group of objects that are "similar" to one another and "dissimilar" to the objects in other clusters. Clustering is considered the most important unsupervised learning problem.
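One common way to form such groups is k-means, sketched below in plain Python. The notes do not name a specific algorithm, so this is only an illustrative choice; the points, the starting centroids, and the `k_means` function are assumptions made for the example.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, until nothing changes."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                # Mean of each coordinate across the cluster's points.
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(clusters)
```

No class labels are supplied: the groups emerge only from the distances between points, which is what distinguishes clustering from classification.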

CLICK HERE DATA MINING AND DW FOR BCA - UNIT-5 NOTES


DATA MINING LAB PROGRAMS USING WEKA
