Group Assignment: Business Analytics Case Study
Group Assignment: Business Analytics Case Study
In groups of 4, you are required to develop a Business Analytics (BA) system for Anti-VirUs. You are expected to design a high-level BA framework with a detailed dimensional model design. Additionally, you will need to implement components of your BA framework, generate insights from data and suggest ways for managerial action.
Assessment
This assignment is worth 40% of your assessment for the subject.
Guidelines for Teamwork
As you will complete this assignment in groups, please study the following resources for successful teamwork:
https://my.uq.edu.au/information-and-services/student-support/study-skills/group-work https://www.edx.org/course/working-in-teams-a-practical-guide
Case Description
Please see below: Anti-VirUs COVID Tracing Case Study (text is identical to individual assignment)
Deliverable: Submission/Report Requirements
The submission should be about 2000 words (± 5%; total length does not include appendices) in length and take the form of a report describing and justifying the design of the star schema and predictive models. Please design the report in the following structure:
1. Executive Summary: An executive summary is a short summary (max. 500 words) of the key information in the document and is written for a business person. It should contain clear, evidence- based recommendations for government and health organization officials regarding the questions raised below and whether or not investments into Anti-VirUs are justified.
2. Design of a Dimensional Model: Your group is required to design a star schema for the case study. Include a diagram of your star schema as well as justifications for the design choices that you have made.
3. Implementation of the Dimensional Model: Your group is required to implement your dimensional model to answer five out of the six analytical questions below. Please make sure to carefully read and answer the questions. Answering different questions or not answering each part of the questions will lead to your group losing points. Implementing the Dimensional model involves two steps: (1) data integration and (2) data visualization.
3.1. Data integration (SSIS or SQL):
• Use the Microsoft SQL Server Integration Services (SSIS) tool to create your fact and
dimension tables.
• Add a screenshot of your SSIS transformations to the text and briefly explain your
transformations and steps.
• Attach and submit your SSIS transformation file along with your report.
3.2. Data visualization (PBI):
• Use Microsoft Power BI to build dashboard(s) to communicate key information to
managers and inform decisions as highlighted in the “insights needed” description below.
• Add a screenshot of your dashboard(s) to the text
• Attach and submit your .pbix file along with your report.
Group Assignment: Business Analytics Case Study
4. Development of an analytical model (RapidMiner): Your group needs to apply statistical analysis to answer one out of the six questions below:
• You need to use RapidMiner to perform your analysis
• Add a screenshot of your RapidMiner process to the text
• Attach and submit your RapidMiner (*.rmp) file along with your report
5. Generate Business Insights and Strategize: Your group is required to analyse the data (as requested in deliverable 3 and 4). You need to use the digital dashboards and the analytical models you have built to generate business insights regarding each of the questions raised below. In addition to generating business insights, you need to propose action that countries and health organizations.
Deliverable: Submission/Report Appendix
6. Appendix 1 – Data Dictionary: The star schema must be documented with a data dictionary for each of your fact and dimension tables. Include an entry for each data item including its name, a brief description of its meaning, and describe how it will be sourced from the existing databases and include a description of any necessary transformations. The data dictionary needs to contain enough detail on your dimensional model design so that the model can be clearly understood.
7. Appendix 2 – SSIS transformations, PBI visualisations and RapidMiner Process: You also need to zip your SSIS, PBI and RapidMiner files and submit along with your assignment. Submissions without SSIS, PBI, and RapidMiner attachments will not be marked. Late submissions of these files will incur penalties as discussed in subject eCP.
8. Appendix 3 – SQL (optional): If you prefer to use SQL for implementing your dimensional model, include all of the SQL statements that you used including Create statements, Key creation and any indexing you deem necessary.
9. Appendix 4 – Work Breakdown: – Detail the breakdown of work of the team members for this assignment. This should be a detailed account of what each team member accomplished as part of the assignment. Submissions without work breakdown will not be marked.
Assessment Criteria of Deliverables (find rubric specific descriptions in separate file)
Aspect of review (% of grade)
Comprehensive
Proficient
Basic
Below Basic
Poor
Report
Executive Summary (5%)
Dimensional Model Design (20%)
Dimensional Model Implementation (30%)
Building Analytical Model (15%)
Generating business insights and strategizing (25%)
Appendix
Data Dictionary (5%)
SSIS (or SQL) transformations, PBI visualizations and RapidMiner Process
(mandatory submission, non-graded item)
Work Breakdown
(mandatory submission, non-graded item)
Submission
This assignment is due at 2:00 PM on 1 November 2020.
For late submissions, please refer to guidelines outlined on eCP:
http://www.courses.uq.edu.au/student_section_loader.php?section=5&profileId=93817
Group Assignment: Business Analytics Case Study
You are required to submit an electronic version of your work through Blackboard (see the Assessment Link on Blackboard and look for the link labelled Assignment Submission. You will have two opportunities to submit your documents. So, in case you make a mistake on your first submission or wish to revise your assignment, you can re-submit it once. Please ensure that you put your report through Turnitin BEFORE submitting it to Blackboard, to make sure that your assignment is plagiarism free. By submitting your work in this manner, plagiarism is checked for you and you will be able to see if you have any issues with this in your work. Use Turnitin as many times as you need to fix any similarity issues before the due date. Also, by submitting the Assignment via Blackboard you attest that the Assignment is your own work.
Please compress your SSIS, PBI and RapidMiner files as well as your Data Dictionary in a zip file. Submit the Zip file along with your report on both Blackboard assignment and Turnitin.
Academic Honesty
Academic misconduct occurs when students portray someone else’s work as their own. There are many ways in which Academic misconduct can occur. Some of these are:
• Sham Paraphrasing: Material copied verbatim from text and source acknowledged in-line but represented as paraphrased.
• Illicit Paraphrasing: Material paraphrased from text without in-line acknowledgement of source.
• Other Plagiarism: Material copied from another student’s assignment with the knowledge of the
other student.
• Verbatim Copying: Material copied verbatim from text without in-line acknowledgement of the
source.
• Recycling: Same assignment submitted more than once for different subjects.
• Ghost Writing: Assignment written by third party and represented as own work.
• Purloining: Assignment copied from another student’s assignment or other person’s papers without
that persons knowledge.
The university is committed to graduating students with “a profound respect for truth, and for the ethics of scholarship […] we want our graduates to be capable of independent thought, to be able to do their own work, and to know how to acknowledge the work of others” (Prof. Peter McPhee). As such, the university takes a dim view of students to are not able to correctly acknowledge the work of others, or who try to pass this work off as their own. All students should be aware of the following web site: https://my.uq.edu.au/information-and-services/manage-my-program/student-integrity-and- conduct/academic-integrity-and-student-conduct which provides practical advice to students about how not to be involved in academic misconduct. In particular look at the Quick Checklist section (paraphrased below).
Group Assignment: Business Analytics Case Study
Quick checklist – Avoid Plagiarism with Proper Referencing
To be certain to acknowledge sources fairly and avoid plagiarizing, review this checklist before beginning to write your essay and again after you have completed your first draft1.
1. What type of source are you using: your own independent material, common knowledge, or someone else’s independent material?
2. If you are quoting someone else’s material, is the quotation exact? Have you used quotation marks for quotations embedded in your text? Have you set off block quotes with an extra space before and after the quote, single spacing within the quote, and left indenting of all lines of the block quote? Are omissions shown with ellipses and additions with square brackets?
3. If you are paraphrasing someone else’s material, have you rewritten it in your own words and sentence structures? Does your paraphrase employ quotation marks when you resort to the author’s exact language? Have you represented the author’s meaning without distortion?
4. Have you acknowledged each use of someone else’s material?
5. Do all references contain complete and accurate information on the sources you have cited?
1 Reproduced with permission of Dr Stephen Morgan, Faculty of Economics and Commerce, UniMelb
Group Assignment: Business Analytics Case Study
Anti-VirUs COVID Tracing Case Study
Anti-VirUs (AVU) is an Australia-based world leading non-for-profit E-Health data analytics company. AVU essentially collects, prepares, analyzes, and visualizes data for various governments and health organizations (e.g., World Health Organization). During the ongoing COVID pandemic, AVU collects data from multiple sources all around the world to inform about the spread of the severe acute respiratory syndrome. COVID (short for Coronavirus disease or COVID-19) is an infectious disease caused by a newly discovered coronavirus.
AVU provides insights to understand the factors that influence the general spread of COVID and make local predictions about infection numbers. Various national governments as well as global health organizations rely on AVU’s insights to make decisions regarding the provision of additional hospital resources, the enforcement of curfews, and the containment of the spread of the pandemic.
AVU has built its global recognition by being able to integrate various data sources. They are able to collect various measures such as population statistics, health indicators, mobility metrics, and even weather measures from sources such as the WorldBank, Google, NOAA, and Wikidata. They have broad-level data for most regions as well as county/municipality-level data for Argentina, Brazil, Chile, Colombia, Czech Republic, Mexico, Netherlands, Peru, United Kingdom, and USA. In the context of this case study, you will only be working with parts of that comprehensive data sample (figure 1).
As countries are faced with a second wave of COVID infections, AVU is being approached by various decision-makers about the lessons they have learnt during the first wave of infections. At the same time, AVU is also constantly revising their internal processes trying to find ways to improve their recommendations and gain a better understanding of the COVID spread. You have been hired as a member of their analytics team to inform these decisions.
The Insights Needed for Decision-Making
The second wave of the relentless COVID pandemic is posing significant challenges to decision makers in government and health organizations alike. Although all officials have a wealth of experience and a strong “gut feel” about the pandemic, AVU is asked to provide evidence based on data to inform their decision-making. In particular, officials need insights about citizens and countries at risk, time- dependent trends, the effectiveness of governmental interventions and types of affected countries as well as predictions about perspective case numbers. Therewith, AVU wants to support data-driven decision making to reduce the spread of COVID.
Group Assignment: Business Analytics Case Study
1. Does testing drive the active case incidence?2
The active case incidence describes number of newly confirmed patients excluding the numbers of recently recovered and deceased cases. The active case incidence represents the extent to which COVID-19 is spreading or retracting. Some politicians argue that the reason for the increasing number of active cases is the result of an expansion of tests. Test this assumption by comparing the number of active case incidence with the number of globally conducted tests.
2. Is there a link between weather conditions and infection spread?2
The observable variation in active case incidence suggests that there might be systematic differences affecting the spread of COVID. Regarding these systematic differences, some state leaders hope that as summer comes and temperatures raise, the COVID cases might automatically decline. Explore whether there is any indication that seasons or temperatures can help reduce the spread of COVID.
3. How do economic conditions affect infection spread?2
Some heads of state argue that the spread of COVID is determined by the national gross domestic product (GDP) per capita. They assume that countries with a lower GDP per capita suffer from higher infection numbers. As of 1 July 2019, low-income economies are defined as those with a GDP per capita, calculated using the World Bank Atlas method, of $1,025 or less in 2018; lower middle-income economies are those with a GDP per capita between $1,026 and $3,995; upper middle-income economies are those between $3,996 and $12,375; high-income economies are those with a GDP per capita of $12,376 or more. Your analysis should inform the discussion on which types of economy are on average most affected as it relates to the GDP per capita.
4. Which index is best suited to explain the infection incidence?2
The World Bank has developed the human development index (HDI) and argues that this is a better indicator for infections than the GDP per capita. Consider both measures to identify two countries that are outliers with excessive active case incidences. Excluding these outliers from the analysis, inform the discussion on which of the two indexes is better suited to describe infection numbers and what the indexes predict.
5. Which countries seem particularly effective in containing the COVID spread?2
Different governments are trying to control the impact of the pandemic. Australia, Russia, India, and the USA have experienced different death rates during the pandemic. Considering the total number of deceased patients in these countries, do large investments into the healthcare system automatically prevent deaths. If you have to prioritize investments, please advise on whether governments should invest into physicians or nurses to prevent deaths.
6. Classification of affected countries3
In order to assist global health organizations, develop country specific guidelines to deal with the pandemic, load the country aggregated data sample (country_aggregate.csv). Identify three and four clusters of countries based on their HDI and GDP per capita, number of new tests, healthcare expenditure and out of pocket healthcare expenditure, as well as fatality rates and the dichotomized fatality rates measure (0 = low fatality rate, 1 = high fatality rate). Remember to remove outliers from your analysis and cases with missing data in the relevant variables. Justify whether you propose a three or four cluster solution and describe each cluster.
2 PBI data visualization (report deliverable number 3.2) 3 RapidMiner prediction (report deliverable number 4)
Group Assignment: Business Analytics Case Study
The Current Information Systems
The company uses OLTP databases to source epidemiology, weather, health, economy and regional information. Data for the BA system will be sourced from these existing operational systems.
Covid-19 Transactional database
The transactional system is a package based on an MySQL database running on a windows-based computer. The ER model for the system is shown in Figure 1 below.
Figure 1: Covid-19 ER Model (Definition of attributes are provided in the data dictionary in Assignment folder on LMS)
Connecting to OLTP database
Use the following information to connect to OLTP database and source the data:
Please note you can only access this database via Virtual Machine – Student Performance Desktop Use the “ODBC source” (not OLE DB) connector in SSIS to load the data.
Server: Database: Username: Password:
teach.business.uq.edu.au bism_sem_2_2020 bism_student Student_lets_learn_4000
Best of luck with your assignment!