CASE STUDY # 1
Please answer the SQL questions. There are several ways to write these SQL statement and any form (MySQL) is acceptable as long as it generates the correct answers.
• The salary table contains salary information of all employees in various departments of ABC Company. A partial view of the table is given below.
• Write a SQL statement to find the highest salaried employee in each department.
• Write a SQL statement to calculate average salary per department.
• Write a SQL statement to find employees whose name starts with ‘J’.
• Write a SQL statement to find employees who have been in the company 5+ years.
• Write a SQL statement to find the number of employees who work in Finance department
• Write a SQL statement to list employee names and their tenure in the company.
• Write a SQL statement to list employees with the highest tenure in each department.
Table name: EE_information
Table name: EE_salary
• Employee Benefit table contains an array of structure with Benefit code and deduction amount for each employee. DNT Dental Insurance deduction amount, HB Health Insurance Deduction Amount and FSA FSA deductiom amount. Write a SQL statement to calculate total amount of benefit deductions for each employee, e.g. John Doe’s total benefit deduction should be 200(50+150).
Table Name: EE_Benefit
CASE STUDY # 2
Background: X Company’s Digital team tracks behavior of web visitors on X Company.com website. The attached data gives information about web activity of our online visitors sourced from Paid Search over a month. Please use Python (Jupyter Notebook) to run the analysis. Here are some general expectations:
• Analyze the data to identify patterns in different methods and compare cons and pros among these methods
• Identify anomalies/outlier in the data in different methods and compare cons and pros among these methods
• Try to predict which visitors are more likely to submit a lead form in different methods and compare cons and pros among these methods.
• Hint: above will deploy different machine learning algorithm and not limit to other methodologies, such as Clustering, PCA, Decision Tree, Random Forests, Voting Classifier, GBM, XG Boosting, etc.
Data Dictionary