IT enabled Business Intelligence, CRM, Database Applications
Sep-18
Introduction
Data Mining and Business Intelligence
Prof. Vibhanshu (Vibs) Abhishek
The Paul Merage School of Business
University of California, Irvine
BANA 273 Session 1
Agenda
Introduction
Instructor and TA
Course Logistics
Data Mining Examples
SQL
About the Instructor
Undergraduate degree in Computer Science & Engineering
Indian Institute of Technology, Kanpur
Masters in Statistics
University of Pennsylvania, The Wharton School
Ph.D. in Operations and Information Management
University of Pennsylvania, The Wharton School
Assistant Professor at Carnegie Mellon University
Just joined UCI
Teaching
273 Business Intelligence for Analytical Decision Making – BANA/FEMBA
173 Business Intelligence for Analytical Decision Making – UG
Digital Marketing Analytics
Managing Disruptive Technologies
Research
Effect of emerging technologies on consumers, firms and markets
Digital Marketing, Retail, Mobility
Some Consulting Projects
Sequoia Capital
Adobe
McKinsey
LEGO
Kaplan
One Championship
Rivigo
BSA Survey on Analytics
Other than work …
Startups
Photography
Travel
Anvi
Big Data Analytics: The Revolution Has Just Begun
Analytics 2012 keynote speaker, Will Hakes
6:20 to 9:20
Course Topics
Getting data from Database: Structured Query Language
RFM Analysis, Excel Pivot Tables
Basic Probability & Information Theory
Preparing Data, Testing and Validation
Data Mining Methods
Classification Using Naïve Bayes
Classification using Decision Trees
Association Rule Mining
Clustering
Who should take this class
Required core class for MSBA students
No prior technical knowledge is assumed
Aptitude for learning technical concepts
Algorithms
Software (MS Excel, Weka)
Understand business concepts
Course is recommended for students interested in understanding the techniques and applications of data mining and acquiring hands-on skills for making intelligent business decisions in data-rich organizations.
Guidelines for this course
No cell phone or laptop usage in class unless required for class
UCI Office of Academic Integrity:
Do not post class materials on non-UCI websites
Do not share individual assignments
Learn about US businesses, language and culture
Groups for class project
Motivation: Business Value of Customer Data
What can a retail store do with shopping basket data?
How to increase customer viewership through tune-in ads?
© Prof. V Abhishek, September 18
How can big data be used to reduce fuel consumption?
How data is utilized is key
Accurate, timely intelligence
Rapid deployment
Effective tactics
Relentless follow-up and assessment
© Prof. V Choudhary, September 18
Business Intelligence & Data Mining
Business Intelligence: Components
Reporting
Tooth-paste sales in September?
Cosmetics purchased last quarter?
Dashboards
Easy access to performance metrics
Ability to drill down into the data
Example of Dashboard from MicroStrategy
Data Mining
What Is Data Mining?
Data mining is a process of discovering and interpreting (unknown) patterns in data to solve a business problem.
Data mining draws on: Statistics/AI, Machine Learning/Pattern Recognition, Database systems
NY Times on 538.com:
Between his live TV appearances on election night, Mr. Silver updated his model and determined around 8 p.m., after New Hampshire went to Senator Obama, that Senator McCain had no way of winning. By the end of the night, Mr. Silver had predicted the popular vote within one percentage point, predicted 49 of 50 states’ results correctly, and predicted all of the resolved Senate races correctly.
Data Sources for Data Mining
Public and private data
Demographic data
Customer data
Transactions data
Product data
Accounting and marketing data
Web logs (click-stream data)
Application server-logs
Third party server-logs
Data integration is an important challenge
Who Uses Data Mining?
Amazon’s Personalized Web Pages
Firms with corporate offices in OC
Data Mining Methods…
Classification
Clustering
Association Rule Discovery
Decision Trees
Visualization
Collaborative filtering
Text Mining
Neural networks and Deep learning
Examples of Visualizing Data:
Word Cloud
Weekly interface activity: 4-dimensional plot uses color, location and size of bubbles
http://www.wolframalpha.com/input/?i=facebook#_=_
VC’s Facebook Friend Network
Classification: Definition
Given a collection of records (training set)
Each record contains a set of attributes; one of the attributes is the class.
Find a model for the class attribute as a function of the values of the other attributes.
Goal: previously unseen records should be assigned a class as accurately as possible.
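The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method taught in the course: the "model" is only a lookup table (majority class per attribute combination, with the overall majority class as a fallback for unseen combinations). The records and the function names train, predict and accuracy are invented for this example.

```python
from collections import Counter, defaultdict

def train(records):
    """Learn a lookup-table model from records whose last field is the class."""
    votes = defaultdict(Counter)   # attribute combination -> class counts
    overall = Counter()            # class counts over the whole training set
    for rec in records:
        attrs, label = tuple(rec[:-1]), rec[-1]
        votes[attrs][label] += 1
        overall[label] += 1
    table = {attrs: counts.most_common(1)[0][0] for attrs, counts in votes.items()}
    default = overall.most_common(1)[0][0]
    return table, default

def predict(model, attrs):
    """Assign a class to a record given only its non-class attributes."""
    table, default = model
    return table.get(tuple(attrs), default)

def accuracy(model, records):
    """Fraction of labeled records whose class the model predicts correctly."""
    correct = sum(predict(model, rec[:-1]) == rec[-1] for rec in records)
    return correct / len(records)

# Hypothetical data: (Refund, MarSt, class)
training_set = [
    ("Yes", "Single", "NO"),
    ("No", "Married", "NO"),
    ("No", "Single", "YES"),
    ("Yes", "Married", "NO"),
    ("No", "Divorced", "YES"),
    ("No", "Married", "NO"),
]
test_set = [
    ("No", "Single", "YES"),
    ("Yes", "Divorced", "NO"),   # combination not seen during training
    ("No", "Divorced", "NO"),
]

model = train(training_set)
print(predict(model, ("Yes", "Divorced")))
print(accuracy(model, test_set))
```

Accuracy is measured on the test set rather than the training set because the goal is to classify previously unseen records well.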
Classification Example
Figure: a Training Set (attribute types: categorical, categorical, continuous, class) is used to Learn Classifier, producing a Model that is applied to the Test Set.
Example of Decision Tree
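Figure: decision tree with splitting attributes Refund (Yes / No), MarSt (Married / Single, Divorced) and TaxInc (< 80K / > 80K); leaves are labeled YES or NO. Attribute types: categorical, categorical, continuous, class.
Records shown with the tree:
(11, Yes, Divorced, 115K)
(12, No, Single, 150K)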
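A decision tree of this kind is a set of nested tests, which can be written as a small Python function. The sketch assumes the tests run in the order Refund, then MarSt, then TaxInc, with a YES leaf only for non-refund, unmarried records above 80K. The extracted slide does not preserve the tree layout, so this ordering is an assumption; testing MarSt before Refund with the same leaf labels would give the same predictions. The function name classify and the handling of an income of exactly 80K (classified NO here) are also assumptions.

```python
def classify(refund, marital_status, taxable_income):
    """Walk the assumed tree from the root to a leaf and return its label."""
    if refund == "Yes":
        return "NO"
    if marital_status == "Married":
        return "NO"
    # Remaining records are Single or Divorced with no refund.
    return "YES" if taxable_income > 80_000 else "NO"

# The two records from the slide: (ID, Refund, MarSt, TaxInc)
records = [
    (11, "Yes", "Divorced", 115_000),
    (12, "No", "Single", 150_000),
]

predictions = {rec_id: classify(refund, marst, income)
               for rec_id, refund, marst, income in records}
print(predictions)
```

Under these assumptions, record 11 stops at the first test (Refund = Yes) and record 12 passes through all three tests.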
Other Applications for Classification
Direct Marketing – LL Bean
Goal: Reduce cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.
Customer Attrition/Churn at AT&T
16% in 2013
Goal: To predict whether a customer is likely to be lost to a competitor.
Who is Watson?
IBM Watson: Final Jeopardy!
0 to 0:25; 1:10 to 2:03; 3:10 to 4:27; 6:30 to 7:40
Nice energetic intro video 6:20 to 9:20
0:40 – 2:20
Data visualization
Hans Rosling Data Visualization
Obama analytics campaign
https://www.youtube.com/watch?v=6ZOfnFCMHTM
Varian Google CIST Informs 7:05 to 8:10
https://www.riverscasino.com/pittsburgh/BrainsVsAI
Structured Query Language
Database Management Systems (DBMS)
Software through which users and application programs interact with a database
DBMS manages data resources like an operating system manages hardware resources
Data Warehouse
BI / Data Mining Tools
SQL Example
Product(Maker, Model, Type)
Printer(PrinterModel, Color, Type, Price)
PC(PCModel, Speed, RAM, HD, CD, Price)
Laptop(LaptopModel, Speed, RAM, HD, Screen, Price)
Primary Key and Foreign Key
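A primary key uniquely identifies each row of a table; a foreign key is a column whose values must match a key in another table. The sketch below shows both on two of the example tables, using Python's built-in sqlite3 module. It assumes that Model is the primary key of Product and that PCModel is both the primary key of PC and a foreign key to Product.Model. The slide does not say which columns are keys, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Product (
    Maker TEXT NOT NULL,
    Model INTEGER PRIMARY KEY,      -- assumed primary key
    Type  TEXT NOT NULL
);
CREATE TABLE PC (
    PCModel INTEGER PRIMARY KEY REFERENCES Product(Model),  -- assumed foreign key
    Speed REAL, RAM INTEGER, HD INTEGER, CD TEXT, Price REAL
);
""")

# Hypothetical rows: the PC row is accepted because model 1001 exists in Product.
conn.execute("INSERT INTO Product VALUES ('A', 1001, 'PC')")
conn.execute("INSERT INTO PC VALUES (1001, 2.0, 8, 500, '8x', 900)")

# Model 9999 is not in Product, so the foreign key should reject this row.
try:
    conn.execute("INSERT INTO PC VALUES (9999, 3.0, 16, 1000, '16x', 1500)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

pc_rows = conn.execute("SELECT COUNT(*) FROM PC").fetchone()[0]
print(rejected, pc_rows)
```

The foreign key keeps the PC table from describing a model that the Product table does not list.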
The SELECT Statement
Used for queries on single or multiple tables
Clauses of the SELECT statement:
SELECT – List the columns (and expressions) that should be returned from the query
FROM – Indicate the table(s) or view(s) from which data will be obtained
WHERE – Indicate the conditions under which a row will be included in the result
Optional clauses:
GROUP BY – Indicate categorization of results
HAVING – Indicate the conditions under which a category (group) will be included
ORDER BY – Sorts the result according to specified criteria
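The clauses listed above can be seen together in one query. The sketch below runs it through Python's built-in sqlite3 module on the Product and PC tables from the SQL example. The rows are invented, and joining on PCModel = Model is an assumption about how the two tables relate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (Maker TEXT, Model INTEGER, Type TEXT);
CREATE TABLE PC (PCModel INTEGER, Speed REAL, RAM INTEGER,
                 HD INTEGER, CD TEXT, Price REAL);

-- Hypothetical sample data
INSERT INTO Product VALUES
    ('A', 1001, 'PC'), ('A', 1002, 'PC'),
    ('B', 1003, 'PC'), ('B', 1004, 'PC'),
    ('C', 1005, 'PC'), ('C', 1006, 'PC');
INSERT INTO PC VALUES
    (1001, 2.0,  8,  500, '8x',   900),
    (1002, 3.0, 16, 1000, '16x', 1500),
    (1003, 1.5,  4,  250, '8x',   500),
    (1004, 2.0,  8,  500, '8x',   800),
    (1005, 3.5, 16, 2000, '16x', 2000),
    (1006, 2.5,  8,  500, '8x',  1000);
""")

query = """
SELECT p.Maker, COUNT(*) AS NumModels, AVG(pc.Price) AS AvgPrice  -- columns and expressions
FROM Product AS p JOIN PC AS pc ON pc.PCModel = p.Model           -- tables to read
WHERE pc.RAM >= 8                                                 -- keep only these rows
GROUP BY p.Maker                                                  -- one group per maker
HAVING COUNT(*) >= 2                                              -- keep only these groups
ORDER BY AvgPrice DESC                                            -- sort the result
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

WHERE filters individual rows before grouping, while HAVING filters whole groups afterwards. Here maker B is dropped by HAVING because only one of its PCs passes the WHERE condition.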
SQL Query
SELECT