Coursework: Survey project
UoL MSc Data Science: Data Visualisation
October 2021
Coursework by topic
2 Survey project research statements (formative)
3 Survey project specification (CW1) [30%]
5 Survey data collection (formative)
10 Survey project final report (CW2) [70%]
Topic 2
Survey project research statements
(formative)
This formative assessment activity provides an opportunity to explore potential ideas for your
coursework.
Complete the following steps:
1. read the full specification of the module coursework and ensure you understand the timeline
and required final deliverables
• Coursework 1: Survey project specification (30%)
• Coursework 2: Survey project final report (70%)
2. brainstorm three potential topics for a survey-based investigation
• these can be any topics that interest you, and that you can conceivably investigate
by collecting and analysing survey response data
• conduct research into your topic
• identify a public source of data relevant to your topic that could be used to contextualise
or otherwise enrich findings from a survey analysis
3. try to articulate your project ideas in terms of research objectives and research questions
• your research objectives should aim to explore and describe your chosen topic
• your research questions should be specific and realistic
• pay particular attention to the scope implied by the different research questions you develop
• pay particular attention to the number of participants you might need in order to
collect enough data to answer your questions (e.g. variables with lots of categories
will need more people)
• aim for one or two general research objectives per topic, and three or four very specific
research questions per topic
• how will you utilise external data alongside survey data to answer your research
questions?
4. compare your ideas to the examples provided
• are your research statements wider or narrower in scope?
• does the topic of the survey require specific participants, or is it suitable for the general
public?
• what kind of phenomena will the survey illuminate (individual human behaviour,
social behaviours, structures or organisation, preferences or opinions, psychological
concepts etc.)?
• are there any potential ethical or privacy concerns with your topic?
Don’t worry if your ideas change and evolve over the next few weeks – that is a normal part of
research. In the next topic you will choose one idea to develop in greater detail.
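The point in step 3 about participant numbers can be made concrete with a rough calculation. The sketch below assumes a planning heuristic of an average of five respondents per cell when categorical variables are cross-tabulated. The function name and the threshold are illustrative assumptions, not module requirements.

```python
from math import prod


def min_respondents(category_counts, per_cell=5):
    """Rough lower bound on respondents needed to cross-tabulate variables.

    category_counts: number of categories in each variable being crossed.
    per_cell: assumed average number of respondents wanted in each cell
              (a planning heuristic, not a statistical guarantee).
    """
    return prod(category_counts) * per_cell


# Gender (3 options) crossed with age band (5 bands): 15 cells.
print(min_respondents([3, 5]))       # 75
# Adding a 6-category "favourite game genre" variable: 90 cells.
print(min_respondents([3, 5, 6]))    # 450
```

Each extra variable multiplies the number of cells, so narrowly scoped research questions are much easier to answer with a small sample.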
2.1 Ideas
• Gaming behaviour: How many hours do people spend playing games? Gender? Age?
What types of games?
• Social network usage: How much time spent on social media? How many posts per day?
Connection with personality types?
• Health and fitness: Levels of exercise? Preferred methods of exercise? Age, gender, job
type factors?
• Attitudes towards current affairs and topical world events.
• Look at YouGov for inspiration.
Topic 3
Survey project specification (CW1)
[30%]
3.1 Assignment specification
Produce a project specification and survey for a data visualisation-led investigation into a topic
of your choice. Your project must be based on analysis of data you have collected yourself from
a survey and public data found online.
The project specification will serve as the basis for Coursework 2, which will be a report of your
findings.
• Submit your project specification as a PDF file (1500–2000 words).
• Submit a PDF copy of your survey, implemented in JISC Online Surveys.
• References and appendices do not count towards the word limit.
1. Project specification and survey design
State the following clearly and concisely.
• the topic of your investigation
• background research
– provide a short summary of the domain of research/field of enquiry
– provide and discuss references to contextualise the investigation
• research objective(s)
• motivation(s)
• research questions (aim for four very precise research questions)
• population and sample
• domain concepts (give precise definitions)
– reference academic/technical literature where appropriate
• an external source of data you will use alongside your survey data
– state how you will use this data to provide additional context or answer specific
research questions
Comment on your survey design.
• how do your survey questions map to research questions?
• how have you operationalised the concepts in your study to ensure you will collect
valid quantifiable data from your participants?
• how have you structured your survey and ordered your questions to ensure you
obtain reliable and valid data?
2. Survey implementation
Implement a survey using JISC Online Surveys. See the module information section of the
VLE for instructions on how to access this service for free as a UoL student. Do not sign
up for an account via the website; you must obtain a link from UoL for a free account.
Think about how survey questions map to research questions. As a guide, one research
question will typically map to one to three survey questions.
You should keep your survey as short as possible. Aim for approximately ten survey questions
(in addition to basic demographic questions). The exact number of questions is not
the primary issue – the important thing is that your survey should collect the data necessary
to answer your specific research questions, and no more.
The word limit for the final report is 3500–4500 words. This is the hard limit on scope.
Your questionnaire should include:
• a clear statement about the topic of the questionnaire
– but should not explicitly reveal the research questions or bias the responses
• clear, relevant, neutral, inclusively-worded questions focusing on the recent past
• closed questions wherever possible; if open questions are essential, you must have a
strategy for coding answers quantitatively (explain this in your project specification)
You are highly encouraged to pilot your survey prior to submission.
Download a PDF copy of your survey to submit it along with your project specification –
click “Preview survey” and select the Export survey as PDF option in the survey preview
settings menu. This will serve as a ‘hard copy’ for assessment and feedback purposes.
After you receive feedback on your project specification and survey, make any final changes
to your survey before making it live and circulating it to your population. You may wish
to post the URL to your survey to the VLE forum to gather data from your peers, but you
are also free to circulate your survey in any way you deem appropriate to your research
topic.
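If an open question is unavoidable, one possible coding strategy is to map free-text answers onto a fixed set of categories with a keyword codebook. The sketch below is a minimal illustration only. The codebook, category names and example answers are all hypothetical, and a real study would need to refine the codebook against pilot responses.

```python
# Hypothetical codebook: category -> keyword fragments that signal it.
CODEBOOK = {
    "cardio": ["run", "jog", "cycl", "swim"],
    "strength": ["weights", "gym", "lifting"],
    "mind_body": ["yoga", "pilates"],
}


def code_answer(text, codebook=CODEBOOK):
    """Return the sorted list of categories whose keywords occur in text.

    Matching is by simple substring, so fragments must be chosen with care
    (e.g. "run" would also match "brunch"). Answers that match no keyword
    are coded as 'other' so they can be reviewed by hand rather than
    silently dropped.
    """
    lowered = text.lower()
    matches = sorted(
        category
        for category, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    )
    return matches or ["other"]


answers = ["I go running most mornings", "Yoga and the gym", "Gardening"]
coded = [code_answer(a) for a in answers]
print(coded)
```

Because one answer can fall into several categories, the coded data is best treated as a set of yes/no variables, one per category.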
3.2 Example introduction
I am studying for an MSc in Data Science at the University of London, and as part of my coursework,
I am collecting data regarding
The survey is designed to…
This survey will be open until
Your identity will remain anonymous. All data will be destroyed after the project has been completed.
The whole survey is likely to take around five minutes to complete. Please take your time to read
the questions carefully, and answer as truthfully as possible. Should you have any questions,
please feel free to contact us. Thank you very much for your time.
3.3 Rubric
1. Project specification and survey design [22 marks]
(a) Topic overview
• [0] not stated or vague topic
• [1] topic briefly discussed
• [2] clear and concise discussion
(b) Background research
• [0] no background research
• [1] brief discussion of related wider issues
• [2] clear and concise discussion of related research or news stories
(c) Domain concepts
• [0] not discussed
• [1] key terms/concepts briefly defined
• [2] comprehensive set of domain terms/concepts discussed, citing previous research
where appropriate
(d) Sample and population
• [0] not discussed
• [1] sample and population stated
• [2] wider discussion including sampling methods and potential bias
(e) Motivation and objectives
• [0] no motivation or objectives
• [1] brief discussion of motivations and objectives
• [2] clear and concise rationale motivating the study and potential impact
(f) Research question(s)
• [0] not stated explicitly
• [1] attempt to construct research questions, but some inconsistency or lack of
focus
• [2] basic set of coherent research questions
• [3] discussion of scope, which is appropriate for the duration and context of this
assignment
• [4] clear and well-designed set of research questions with potential to generate
novel insight within the scope of the study
• [5] research questions informed by previous research or in conjunction with a
theoretical framework
(g) Research questions map to survey questions
• [0] not discussed
• [1] some inconsistencies in the relationship between research and survey questions
• [2] research and survey questions are well aligned and appropriate in scope
(h) Operational definitions
• [0] not discussed
• [1] brief discussion of how concepts and behaviours are quantified, but some
poor decisions
• [2] clearly justified set of measurement criteria, likely to produce high quality
and usable data in a real-world setting, and relevant for answering research questions
• [3] well designed measurements of complex concepts (e.g. multi-question operationalisation)
and/or use of sophisticated measurement scales from published
research
(i) Survey structure and question flow
• [0] not discussed
• [1] brief justification of survey structure
• [2] clear and concise discussion of survey design in relation to question flow and
potential bias
2. Survey implementation [8 marks]
Brief comments should be provided where appropriate to help students improve their surveys.
(a) Survey introduction
• [0] not provided
• [1] basic description of topic
• [2] clear description, bias free
(b) Question wording
• [0] some unclear, confusing or open questions
• [1] mostly clear, good inclusive language, but some minor clarity issues
• [2] all clear, simple and specific, good inclusive language
(c) Question relevance and focus
• [0] poor relevance, too few or too many questions
• [1] mostly good but some irrelevant or unfocused questions
• [2] all relevant, focus on recent past, relate to appropriately scoped research
questions
(d) Question flow
• [0] confusing order, or obvious bias
• [1] reasonable order, some potential bias
• [2] logical order avoiding bias
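The multi-question operationalisation mentioned in the rubric under operational definitions can be sketched as a composite score: several Likert items are averaged, with negatively worded items reverse-scored first. The item names, the 1–5 scale and the choice of a mean are assumptions made for illustration. A published scale would specify its own scoring rules.

```python
from statistics import mean

SCALE_MAX = 5  # assumed 1-5 Likert scale


def composite_score(responses, reversed_items=()):
    """Average a respondent's Likert items into one score.

    responses: dict mapping item name -> answer on a 1..SCALE_MAX scale.
    reversed_items: names of negatively worded items, which are flipped
                    (1 <-> 5, 2 <-> 4) before averaging.
    """
    adjusted = [
        (SCALE_MAX + 1 - value) if item in reversed_items else value
        for item, value in responses.items()
    ]
    return mean(adjusted)


# Hypothetical three-item "enjoyment of exercise" scale.
respondent = {"enjoy_q1": 4, "enjoy_q2": 5, "dread_q3": 2}
print(composite_score(respondent, reversed_items={"dread_q3"}))
```

Combining several items in this way tends to give a more stable measure of a concept than any single question.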
Topic 5
Survey data collection (formative)
Once you have received feedback on Coursework 1 you should implement any improvements to
your project specification and survey design.
When you are confident that your survey will collect the data that you need to investigate your
research questions (this is crucially important!) publish your survey and start collecting data.
We recommend leaving your survey open for between one and two weeks, but you are free to exercise
your judgement based on your individual circumstances.
Topic 10
Survey project final report (CW2)
[70%]
10.1 Assignment specification
Produce a final report, in the form of a Jupyter Notebook, of the data visualisation-led investigation
you started in Coursework 1. Your project must be based on analysis of data you have collected
yourself from a survey and public data found online.
If your project has changed from your initial Project Specification, that is fine. Discuss any
changes and the reasons behind them in the evaluation section of your report.
• You must write your report as a Jupyter notebook using inline markdown (see provided
template).
• You must submit all work in a single ZIP file.
• Your ZIP file must include:
– your notebook (ipynb)
– all survey data (csv, xlsx, ods files etc., data can be anonymised)
– a copy of the public data used in your analysis
– any supplementary scripts
– a copy of your final survey as a PDF file.
• The maximum word limit is 4500 words (suggested range 3500–4500 words).
• Include any supplementary information not essential to the main body of the report as
appendices. References and appendices do not count towards the word limit.
• No marks will be directly awarded for material submitted in appendices.
• No marks will be awarded for analysis discussion submitted as comments in code cells.
• See provided template notebook for how to count the number of words in your notebook.
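The template notebook remains the reference method for counting words. Purely as an illustration of the idea, the sketch below counts whitespace-separated words in the markdown cells of a notebook's JSON. The function name is hypothetical, and the template may count differently (for example in how it treats headings or appendices).

```python
import json


def count_markdown_words(notebook):
    """Count whitespace-separated words in the markdown cells of a notebook.

    notebook: the parsed JSON of an .ipynb file (a dict with a "cells" list).
    Code cells are ignored.
    """
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        total += len(source.split())
    return total


# Tiny in-memory example; a real notebook would be loaded from its .ipynb file.
example = json.loads("""
{"cells": [
  {"cell_type": "markdown", "source": ["# Introduction\\n", "Two words"]},
  {"cell_type": "code", "source": "print('ignored')"}
]}
""")
print(count_markdown_words(example))  # 4
```

Note that this simple count treats markdown symbols such as `#` as words, so it slightly overestimates the prose length.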
10.2 Report guidelines
Reports should include discussion of the following points.
1. Research topic and background [15%]
• introduction
– overview of topic
– relevant news or research articles
– research objectives and motivation
– overview of key findings
• research question(s)
– population and sampling method
– explicitly stated research question(s)
– scope (should be appropriate for the assignment)
• domain concepts
– clearly define important terms and concepts in the study
2. Data sources [5%]
• briefly explain your survey methodology
• briefly explain how you use external data in this project
– where/how did you find it?
– how/why was the data initially collected?
– are there any ethical or legal issues?
• critically evaluate your data: is it trustworthy and valid for your purposes?
3. Data overview and pre-processing [10%]
• data types and pre-processing
– brief description of variables and data types
– describe and justify data cleaning and pre-processing (i.e. tidy data)
– handling of missing or erroneous data
• data summary statistics
– number of survey responses
– number of observations in external data
– summary of demographics and key variables
– use of tables or easily understandable quantities in prose
4. Analysis [50%]
• visualise individual variables
• visualise relationships between variables
• aim for high quality explanatory visualisations that describe or tell a story about the
behaviour or phenomena under investigation
• marks will be awarded for (see rubric for more detail):
– appropriate plots for variable data types
– presentation quality
– visual communication
– methodical data visualisation process
5. Conclusion and evaluation [10%]
• summarise key findings
• future directions
• evaluate your process and visualisations
• things to improve and/or pointers to future research
6. Code [10%]
• all Python code should be submitted in your notebook (.ipynb file)
• supplementary scripts (.py files) that are called from within your notebook can be
used and must also be submitted
• code should be legible, with brief comments
• re-using and adapting code you find in documentation or elsewhere online is OK, but
sources must be attributed correctly (web link and date accessed)
• re-using and adapting code covered during the module is encouraged
• make sure all code runs correctly prior to submission
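The handling of missing or erroneous data listed under data overview and pre-processing can be illustrated with the standard library alone. The sketch below assumes a hypothetical survey export with columns `age` and `hours_per_week`. The column names, the sample rows and the plausibility range are invented for illustration.

```python
import csv
import io

# Hypothetical survey export; a real one would be read from a CSV file.
RAW = """age,hours_per_week
25,10
,4
31,abc
42,200
19,7
"""

MAX_HOURS = 168  # hours in a week; larger values are treated as erroneous


def clean_rows(text):
    """Split rows into usable records and rejected rows with a reason."""
    kept, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            age = int(row["age"])
            hours = float(row["hours_per_week"])
        except ValueError:
            rejected.append((row, "missing or non-numeric value"))
            continue
        if not 0 <= hours <= MAX_HOURS:
            rejected.append((row, "hours outside plausible range"))
            continue
        kept.append({"age": age, "hours_per_week": hours})
    return kept, rejected


kept, rejected = clean_rows(RAW)
print(f"{len(kept)} usable responses, {len(rejected)} rejected")
for row, reason in rejected:
    print(dict(row), "->", reason)
```

Keeping the rejected rows together with a reason, rather than deleting them silently, makes it easier to describe and justify the cleaning decisions in the report.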
10.3 Rubric
1. Research topic and background [15 marks]
(a) Introduction / topic overview
• [0] no introduction
• [1] topic defined and brief overview of the analysis undertaken
• [2] clear and concise overview of the report, summarising its structure and key
findings
(b) Background and context
• [0] no background
• [1] brief discussion of related wider issues
• [2] clear and concise discussion of related research, news stories or other critical
insights
(c) Motivation and objectives
• [0] no motivation or objectives
• [1] key motivations and objectives briefly discussed
• [2] clear and concise rationale motivating the study and potential impact
(d) Sample and population
• [0] not discussed
• [1] sample and population stated
• [2] wider discussion including sampling methods and potential bias
(e) Research question(s)
• [0] not stated explicitly
• [1] attempt to construct research questions, but some inconsistency or lack of
focus
• [2] basic set of coherent research questions
• [3] discussion of scope, which is appropriate for the duration and context of this
assignment
• [4] clear and well-designed set of research questions with potential to generate
novel insight within the scope of the study
• [5] research questions informed by previous research or in conjunction with a
theoretical framework
(f) Domain concepts
• [0] not discussed
• [1] key terms/concepts briefly defined
• [2] comprehensive set of domain terms/concepts discussed, citing previous research
where appropriate
2. Data sources [5 marks]
(a) Survey data
• [0] not discussed
• [1] basic discussion of how survey was designed in relation to research questions
• [2] rigorous survey methodology, justification of key decisions and discussion
of bias
(b) External data
• [0] not discussed
• [1] states the source of the dataset and how it was found
• [2] discussion of trustworthiness and validity (e.g. how the data was initially
collected)
• [3] thorough critical assessment of the data, discussing potential issues or bias
3. Data overview and pre-processing [10 marks]
(a) Data types and pre-processing
• [0] not discussed
• [1] description of key variables and data types
• [2] discretionary mark
• [3] discussion of data cleaning and tidy data
• [4] discretionary mark
• [5] reflective discussion of response data and appropriate handling of missing,
‘other’, or erroneous data
(b) Data summary statistics
• [0] no tables or summary stats used
• [1] number of responses stated
• [2] basic tables or inline descriptive statistics for key variables describing statistical
quantities in simple language
• [3] appropriate use of cross-tabulation and sorting
• [4] discretionary mark
• [5] highly effective communicative tables showing more advanced data processing
such as grouping, aggregating, filtering or normalisation
4. Analysis [50 marks]
(a) Appropriate plots for variable data types
• [0] many highly inappropriate visualisations, e.g. using line graphs for non-sequential
data, pie charts with many categories, or pointless use of 3D
• [2] some inappropriate visualisations for certain variables
• [4] discretionary
• [6] appropriate univariate visualisations of all data types (nominal, ordinal and
numerical)
• [8] discretionary
• [10] appropriate multivariate visualisations across a range of different data type
combinations
(b) Presentation quality
• [0] poor quality and lack of attention to detail
• [2] inconsistent titles and/or axes labelling, screenshot/low resolution images,
unreadable labels or occluding visual elements
• [4] discretionary
• [6] consistent titles/figure captions and labels, attention to spacing and meaningful
use of colour
• [8] discretionary
• [10] professional level presentation quality, immaculate plots with no extraneous
or obscuring details
(c) Visual communication
• [0] many meaningless or pointless plots
• [5] some effective simple plots, but also some confusing or misleading visualisations
• [10] consistent high quality univariate plots, with some effective multivariate
plots
• [15] good range of univariate and multivariate plots, each able to effectively
communicate a strong message
• [20] discretionary
• [25] consistent highly efficient visual communication requiring little or no explanation
(d) Methodical data visualisation process
• [0] disorganised approach, no clear method to the analysis and no coherent story
presented
• [1] some attempt to explore the data methodically and to construct a basic narrative
• [2] discretionary
• [3] clear evidence of direction in exploratory analysis and findings presented
clearly and logically related to stated research questions
• [4] discretionary
• [5] analysis is well planned, executed and presented, leading to interesting insights
that are conveyed within a clear and logical narrative
5. Conclusion and evaluation [10 marks]
(a) Conclusion
• [0] no conclusion
• [1] superficial conclusion listing key findings
• [2] discretionary
• [3] discussion of findings in relation to research questions
• [4] discretionary
• [5] clear and concise discussion of main findings in relation to research questions,
scope and possible impact
(b) Evaluation
• [0] no evaluation
• [1] superficial discussion of problems
• [2] discretionary
• [3] insightful discussion of problems, solutions and what could have been improved
• [4] discretionary
• [5] insightful and honest reflection on the aims, process and execution of the
study, and pointers to possible future directions of research
6. Code [10 marks]
(a) Code
• [0] missing code
• [2] Python code for each visualisation
• [4] discretionary
• [6] code is well commented and makes idiomatic use of Python data science libraries
(i.e. using the APIs correctly results in fewer lines of code, which is
generally better)
• [8] discretionary
• [10] code is well commented and logically structured with minimal copy-and-paste
code, ensuring that the process of analysis is transparent and reproducible
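The cross-tabulation and normalisation mentioned under data summary statistics can be sketched as a single reusable function rather than copy-and-paste code. The example data and variable names below are hypothetical. In practice a data science library such as pandas provides an equivalent (`pandas.crosstab`) in far fewer lines.

```python
from collections import Counter

# Hypothetical (age_band, preferred_exercise) pairs from a survey.
responses = [
    ("18-29", "running"), ("18-29", "gym"), ("18-29", "gym"),
    ("30-49", "running"), ("30-49", "yoga"),
    ("50+", "yoga"), ("50+", "yoga"), ("50+", "running"), ("50+", "gym"),
]


def crosstab_row_percent(pairs):
    """Cross-tabulate (row, column) pairs and normalise each row to percentages."""
    counts = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    columns = sorted({col for _, col in pairs})
    table = {
        row: {col: 100 * counts[(row, col)] / total for col in columns}
        for row, total in sorted(row_totals.items())
    }
    return columns, table


columns, table = crosstab_row_percent(responses)
print(f"{'':8}" + "".join(f"{c:>10}" for c in columns))
for row, cells in table.items():
    print(f"{row:8}" + "".join(f"{cells[c]:>9.0f}%" for c in columns))
```

Normalising by row makes groups of different sizes comparable, which raw counts alone do not.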