DATA SCIENCE MATERIAL

 INTRODUCTION

 Data science is also known as data-driven science.

• Data science is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured.

• Data science is a concept to unify statistics, data analysis and their related methods in order to understand and analyze actual phenomena with data.

• Data science is a multidisciplinary blend of data inference, algorithm development, and technology in order to solve analytically complex problems.

• We define Data science as managing the process that transforms hypotheses and data into actionable predictions. For example: who will win an election, which products will sell together, which loans will default, or which advertisements will be clicked on.

• The Data science field employs mathematics, statistics, and computer science disciplines, and incorporates techniques like Machine Learning and Artificial Intelligence.

• The main advantage of enlisting Data science in an organization is the empowerment and facilitation of decision-making.

• For any company that wishes to enhance its business by being more data-driven, Data science is the secret sauce.

Data Science – Development of Data Product:


A “Data Product” is a technical asset that: utilizes data as input, and processes that data to return algorithmically-generated results.

The classic examples of a Data Product are:

a. Amazon’s product recommendation engine, which ingests user data and makes personalized recommendations based on that data. It suggests items for you to buy, as determined by its algorithms.

b. Gmail’s spam filter is a Data Product – an algorithm behind the scenes, which processes incoming mail and determines if a message is junk or not.
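The idea of a Data Product (data in, algorithmically generated results out) can be illustrated with a very small sketch. The example below is not how Amazon's engine works; it is a toy "bought together" recommender that simply counts how often items appear in the same order. The order history and the function names `build_co_purchase_counts` and `recommend` are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # Use a set so an item repeated within one order is counted once.
        for a, b in combinations(sorted(set(order)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    neighbours = counts.get(item, Counter())
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical order history (input data for the Data Product).
orders = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk", "cereal"],
]

counts = build_co_purchase_counts(orders)
print(recommend(counts, "bread"))  # ['butter', 'jam']
```

Here the order history is the input data, and the ranked list of suggestions is the algorithmically generated result. A real recommendation system would use far more data and more sophisticated models.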

Some more examples:

• Google’s advertisement valuation systems

• LinkedIn’s contact recommendation system

• Twitter’s trending topics

• Walmart’s consumer demand projection systems

• Banking institutions mine data to enhance fraud detection.

• Streaming services like YouTube and Netflix mine data to determine what their users are interested in, and use that data to decide what TV shows or films to produce. Data-based algorithms are also used at YouTube and Netflix to create personalized recommendations based on a user’s viewing history.

• Shipping companies like DHL and FedEx use Data science to find the best delivery routes and times, as well as the best modes of transport for their shipments.

• Popular restaurants and department stores use Data science to improve their businesses.

• Fields like medicine, health care, and insurance, along with various public and private sector organizations, are implementing Data science for analysis, decision making, and predictions.

ROLES of DATA SCIENTIST


Data science is not performed in a vacuum. It’s a collaborative effort that draws on a number of roles, skills, and tools. For a data science project to succeed, each of these roles must be filled. The roles of the data scientist can be shown in the following figure:



a. Data Engineer:
A Data Engineer is a person equipped with knowledge of hardware, databases, data processing at scale, and computer engineering, who can build data infrastructure, manage data storage and use, and implement production tools.

b. Data Scientist:
A data scientist is responsible for pulling and cleaning data, designing experiments, analyzing data, and communicating results. A data scientist should have stronger statistics and presentation skills than a data analyst or data engineer, with strong skills in Inferential Statistics, Machine Learning, Data Analysis, and Data Communication.

c. Data Science Manager:
A Data Science Manager is a person who builds a data team, manages the whole data science process, sets goals and priorities, and interacts with other groups and higher management. A Data Science Manager should have strong knowledge of software and hardware, knowledge of the roles on the team, and strong communication skills, and should know what can and can’t be achieved. A Data Science Manager can come from any of these backgrounds: Data science plus management skills, Data engineering plus management skills, or management skills plus some training in Data science.

d. Data Architect:
A Data Architect understands all the sources of data and is responsible for integrating, centralizing, and maintaining all the data. A Data Architect has strong knowledge of how the data relates to current operations and of the effects that any future process changes will have on the use of data in the organization. The role may include designing relational databases, developing strategies for data acquisition, archive recovery, and database implementation, and cleaning and maintaining the database by removing old data.

e. Data Analyst:
Data analysts need to have a good understanding of programming, statistics, machine learning, data management, and data visualization. An analyst may not have the mathematical or research background to invent new algorithms, but has a strong understanding of how to use existing tools to solve problems and get new, useful insights from data.

f. Business Analyst:
A Business Analyst understands business change needs, assesses the business impact of those changes, captures, analyzes, and documents requirements, and supports the communication and delivery of requirements with relevant stakeholders. The business analyst role is often seen as a communication bridge between IT and the business stakeholders. Business analysts must be great verbal and written communicators, tactful diplomats, problem solvers, thinkers, and analyzers, with the ability to engage with stakeholders to understand and respond to their needs in rapidly changing business environments.

g. Software Engineer:
Software engineers are also needed in a data science team, because software is the generalization of a specific aspect of a data analysis. If specific parts of a data analysis require implementing or applying a number of procedures or tools together, then we need to build a piece of software to reduce the repeated work.
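As a rough illustration of this point, suppose an analysis repeatedly cleans a list of values and then computes the same summary statistics. Instead of repeating those steps by hand for every dataset, they can be wrapped in one reusable function. The function names `clean` and `summarize` and the sample data below are hypothetical, chosen only for this sketch.

```python
from statistics import mean


def clean(values):
    """Drop missing entries (None) and convert the rest to float."""
    return [float(v) for v in values if v is not None]


def summarize(values):
    """Clean the values, then return count, mean, min, and max in one step."""
    cleaned = clean(values)
    if not cleaned:
        # Nothing left after cleaning, so there is nothing to summarize.
        return {"count": 0, "mean": None, "min": None, "max": None}
    return {
        "count": len(cleaned),
        "mean": mean(cleaned),
        "min": min(cleaned),
        "max": max(cleaned),
    }


# Hypothetical weekly sales figures with one missing value.
weekly_sales = [120, None, 95, 130]
print(summarize(weekly_sales))
```

Once the procedure is packaged this way, every member of the team can apply the same steps to new data with a single call.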

