Taobao user behavior analysis

Huayu Luo/1901872

Dataset information

Source: https://tianchi.aliyun.com/dataset/dataDetail?dataId=649

This dataset covers about 1 million users who browsed the website and purchased items between November 25 and December 3, 2017.

Major columns: User ID, item ID, category ID, behavior type, timestamp
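
As a rough illustration of how the five columns could be read, the sketch below parses comma-separated rows into typed records. It assumes the file has no header row, that the columns appear in the order listed above, that the timestamp is in Unix seconds, and that days are counted in China Standard Time (UTC+8). The behavior values in the comment and the sample rows are made up for the example, not taken from the dataset.

```python
import csv
import io
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

# Assumption: timestamps are Unix seconds, days are counted in UTC+8.
CST = timezone(timedelta(hours=8))


class Behavior(NamedTuple):
    user_id: int
    item_id: int
    category_id: int
    behavior_type: str  # assumed values such as "pv", "buy", "cart", "fav"
    timestamp: int

    @property
    def day(self) -> str:
        """Calendar day of the event in the assumed time zone."""
        return datetime.fromtimestamp(self.timestamp, tz=CST).strftime("%Y-%m-%d")


def parse_rows(lines):
    """Yield one Behavior per well-formed CSV row; skip rows of the wrong width."""
    for row in csv.reader(lines):
        if len(row) != 5:
            continue
        user_id, item_id, category_id, behavior_type, ts = row
        yield Behavior(int(user_id), int(item_id), int(category_id),
                       behavior_type, int(ts))


# Hypothetical sample rows, for illustration only.
sample = io.StringIO(
    "1,1001,501,pv,1511568000\n"
    "1,1002,501,buy,1511571600\n"
    "broken,row\n"
)
records = list(parse_rows(sample))
print(len(records), records[0].day)
```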

Flow chart of implementation

Upload dataset
Hive data analysis
Import data to MySQL
Predict repeat customers

Data visualization

Preprocess dataset
Enormous dataset: reduced to 10,000 records
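
One simple way to cut the dataset down is to keep only the first 10,000 rows. The sketch below assumes that is what the preprocessing step means. The slides do not say whether the records were taken from the head of the file or sampled at random, and the helper name `take_head` is hypothetical.

```python
from itertools import islice


def take_head(lines, limit=10_000):
    """Return at most `limit` items from an iterable without reading the rest."""
    return list(islice(lines, limit))


# Stand-in for a very large file: a generator of 25,000 fake CSV lines.
fake_lines = (f"{i},0,0,pv,1511568000\n" for i in range(25_000))
subset = take_head(fake_lines)
print(len(subset))
```

Because `islice` stops after the limit is reached, the same function can be passed an open file object so that the full file is never loaded into memory.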

Upload to HDFS
Create a database in Hive

Future plan

How much traffic is on the site per day? (PV)

How many people visit the site per day? (UV)

What is the number of page views per person per day?

What is the conversion rate?

What is the churn rate?

What is the repurchase rate?
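
Several of these questions can be sketched in a few lines of Python before they are written as Hive queries. The definitions below are assumptions, since the slides do not define the metrics: PV counts "pv" events per day, UV counts distinct users with at least one "pv" event that day, conversion rate is the share of viewing users who also bought, and repurchase rate is the share of buyers with two or more purchases. Churn rate is left out because it depends on an inactivity window that is not specified here. The events are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical events: (user_id, behavior_type, day).
events = [
    (1, "pv", "2017-11-25"),
    (1, "pv", "2017-11-25"),
    (1, "buy", "2017-11-25"),
    (2, "pv", "2017-11-25"),
    (2, "pv", "2017-11-26"),
    (1, "buy", "2017-11-26"),
    (3, "pv", "2017-11-26"),
    (3, "buy", "2017-11-26"),
]


def daily_traffic(events):
    """Per day: page views (PV), unique visitors (UV), and PV per visitor."""
    pv = Counter()
    visitors = defaultdict(set)
    for user_id, behavior, day in events:
        if behavior == "pv":
            pv[day] += 1
            visitors[day].add(user_id)
    return {
        day: {
            "pv": pv[day],
            "uv": len(visitors[day]),
            "pv_per_user": pv[day] / len(visitors[day]),
        }
        for day in sorted(pv)
    }


def purchase_rates(events):
    """Return (conversion rate, repurchase rate) under the assumed definitions."""
    viewers = {user for user, behavior, _ in events if behavior == "pv"}
    buys = Counter(user for user, behavior, _ in events if behavior == "buy")
    buyers = set(buys)
    repeat_buyers = {user for user, count in buys.items() if count >= 2}
    conversion = len(buyers & viewers) / len(viewers) if viewers else 0.0
    repurchase = len(repeat_buyers) / len(buyers) if buyers else 0.0
    return conversion, repurchase


traffic = daily_traffic(events)
conversion, repurchase = purchase_rates(events)
print(traffic)
print(conversion, repurchase)
```

Each function groups or filters on a single column, so the logic should map fairly directly onto `GROUP BY` and `COUNT(DISTINCT ...)` queries in Hive.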