Taobao user behavior analysis
Huayu Luo/1901872
Dataset information
Source: https://tianchi.aliyun.com/dataset/dataDetail?dataId=649
This dataset records the behavior of about 1 million users who browsed the website and purchased items between November 25 and December 3, 2017.
Major columns: User ID, item ID, category ID, behavior type, timestamp
Flow chart of implementation
Upload dataset
Hive data analysis
Import data to MySQL
Predict repeat customers
Data visualization
Preprocess dataset
Enormous dataset, so work with a sample of 10,000 records
Upload to HDFS
Create a database in Hive
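The preprocessing step could look roughly like the sketch below. It is only an illustration: it assumes the file is a header-less CSV with the five columns listed under "Major columns", that the timestamp is in Unix seconds, and that dates should be read in China Standard Time (UTC+8). The function name preprocess, the column names, and the sample rows are made up for this example.

```python
import csv
import io
from datetime import datetime, timedelta, timezone
from itertools import islice

# Assumed column order, taken from the "Major columns" line.
COLUMNS = ["user_id", "item_id", "category_id", "behavior_type", "timestamp"]

# Assumption: timestamps are Unix seconds and days are counted in UTC+8.
CST = timezone(timedelta(hours=8))


def preprocess(lines, limit=10000):
    """Read at most `limit` rows and add a `date` field (YYYY-MM-DD)."""
    rows = []
    for raw in islice(csv.reader(lines), limit):
        row = dict(zip(COLUMNS, raw))
        ts = int(row["timestamp"])
        row["date"] = datetime.fromtimestamp(ts, tz=CST).strftime("%Y-%m-%d")
        rows.append(row)
    return rows


# Made-up rows standing in for the real file.
sample = io.StringIO(
    "1,101,5001,pv,1511544070\n"
    "1,102,5002,buy,1511561733\n"
    "2,101,5001,pv,1511561733\n"
)
rows = preprocess(sample, limit=2)
print(rows)
```

With a real file, the same function could be called on an open file handle instead of the in-memory sample; the cleaned rows would then be written back out before uploading to HDFS.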
Future plan
How much traffic is on the site per day? (pv)
How many people view the site per day? (uv)
What is the number of page views per person per day?
What is the conversion rate?
What is the churn rate?
What is the repurchase rate?
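Several of the questions above can be sketched in a few lines of Python before they are written as Hive queries. The definitions used here are assumptions, not taken from the slides: PV counts "pv" events per day, UV counts distinct users active per day, conversion rate is the share of viewing users who also bought, and repurchase rate is the share of buyers with two or more purchases. Churn rate is left out because the source does not define it. The behavior labels "pv" and "buy" and the sample events are illustrative.

```python
from collections import Counter, defaultdict

# Made-up events: (user_id, behavior_type, date)
events = [
    ("u1", "pv", "2017-11-25"),
    ("u1", "pv", "2017-11-25"),
    ("u1", "buy", "2017-11-25"),
    ("u2", "pv", "2017-11-25"),
    ("u2", "pv", "2017-11-26"),
    ("u1", "buy", "2017-11-26"),
    ("u3", "pv", "2017-11-26"),
    ("u3", "buy", "2017-11-26"),
]


def daily_metrics(events):
    """Return PV, UV and page views per user for each day."""
    pv = Counter()
    visitors = defaultdict(set)
    for user, behavior, date in events:
        visitors[date].add(user)
        if behavior == "pv":
            pv[date] += 1
    return {
        day: {
            "pv": pv[day],
            "uv": len(visitors[day]),
            "pv_per_user": pv[day] / len(visitors[day]),
        }
        for day in sorted(visitors)
    }


def conversion_rate(events):
    """Share of users with a page view who also made a purchase."""
    viewers = {u for u, b, _ in events if b == "pv"}
    buyers = {u for u, b, _ in events if b == "buy"}
    return len(viewers & buyers) / len(viewers) if viewers else 0.0


def repurchase_rate(events):
    """Share of buyers who purchased two or more times."""
    purchases = Counter(u for u, b, _ in events if b == "buy")
    repeat = sum(1 for n in purchases.values() if n >= 2)
    return repeat / len(purchases) if purchases else 0.0


daily = daily_metrics(events)
print(daily)
print(conversion_rate(events), repurchase_rate(events))
```

Other definitions are possible, for example counting conversion per session rather than per user, so the numbers depend on which one the analysis settles on.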