MODULE 3: VISUALIZATION TECHNIQUES
Introduction
In this module, you will learn how to analyze time series using visualization tools. In particular, you will learn how to interpret line charts, scatter plots, histograms and bar plots, and how to create them using our data software. The different concepts will be presented using the climate data from the previous module.
Learning to create charts is a process that requires practice. You should therefore have your software open as you go through the worked examples. In many places in the module, you will be asked to stop and experiment with what you have learned. It is highly recommended that you practice these skills as you review the material.
Learning Outcomes
Students will be able to do the following:
• Describe the trending and short-term behaviour of a time series using line charts.
• Interpret a histogram by characterizing its mode, its degree of uniformity and its degree of asymmetry.
• Characterize the comovement between time series using scatter plots.
• Interpret bar charts to compare a small number of observations at a given point or at different points in time.
• Create line charts, histograms, scatter plots and bar plots using the data software.
Key Terms
• Bar plot: This is a chart used to compare the values of a small number of observations. The x-axis represents the observations and the y-axis represents their values, which are shown by the height of the bars. For example, a bar plot can be used to illustrate the population by province. In this case, each bar is a province and its height represents the population of the province.
• Comovement: This term characterizes the observed relationship between two series. A positive comovement means that the two series are moving in the same direction on average: they go up and down together. A negative comovement means that the two series move in opposite directions on average: when one series goes up, the other goes down. A comovement only characterizes observed relationships. The existence of a comovement does not imply that an actual relationship exists.
• Histogram: This is a particular bar chart used to illustrate the distribution of a series. The range of the series (its minimum value to its maximum value) is divided into intervals and the height of each bar corresponds to the number of observations included in that interval. For example, the histogram of personal income is used to illustrate the income distribution. If we divide the range of income into intervals of 10 thousand, the height of the first bar would be the number of individuals with income between 0 and 10 thousand, the height of the second would be the number of individuals with income between 10 and 20 thousand, and so on.
• Line chart: This is a chart showing the evolution of a series through time. The x-axis represents time, the y-axis represents the value of the observations and the points are connected by lines.
• Relationship: This term characterizes the link between two variables. A relationship exists if an underlying mechanism links the two variables. The relationship is direct if one variable is causing the other and it is indirect if one variable is causing the other through a third variable. The identification of a relationship requires advanced statistical techniques and/or a strong theoretical background. In this course, we are not equipped to identify relationships.
• Scatter plot: This is a chart used to illustrate comovements between two series. The x-axis represents the values of one series and the y-axis represents the values of another series. In general, the points are not connected by lines. However, if we want information about the evolution of the series over time, the points are connected and the date is added on top of each observation.
• Short term fluctuations: This is the behaviour of a series over shorter periods of time. By shorter, we mean shorter than the period spanned by our dataset. The types of short term fluctuations will be covered in detail in the next module. For now, we define them as the movement of a series over a few months, a few quarters or a few years.
• Trending behaviour: This is the behaviour of a series on average over a long period of time. By long period, we mean the period spanned by our dataset. A positive trend means that the series is increasing on average over the period and a negative trend means that the series is decreasing on average over the period.
• Volatility: This term characterizes the intensity of the short term fluctuations. If two series have the same measurement units and the same scale, the one that fluctuates with higher ups and lower downs is more volatile than the other. We cannot compare the volatility of two series with different measurement units or different scales.
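To make a few of these definitions concrete, the following is a minimal sketch in Python. Python is used for illustration only and is not the data software used in this course. The function names (histogram_counts, comovement, volatility) are hypothetical, and two choices are assumptions rather than part of the definitions above: the intervals start at multiples of the chosen width, and volatility is measured as the standard deviation of the changes from one period to the next.

```python
from statistics import mean, pstdev


def histogram_counts(values, width):
    """Count the observations falling in each interval of the given width.

    Assumption: intervals start at multiples of `width`, so an observation v
    is counted in the interval [k * width, (k + 1) * width).
    Returns a dict mapping each interval's lower bound to its count.
    """
    counts = {}
    for v in values:
        lower = int(v // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def comovement(x, y):
    """Describe the observed comovement between two paired series.

    Uses the sign of the sum of cross-deviations from the means
    (the numerator of the covariance). This describes the data only;
    it says nothing about whether an actual relationship exists.
    """
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "none"


def volatility(series):
    """Assumed measure: standard deviation of period-to-period changes.

    Only meaningful for comparing series with the same units and scale.
    """
    changes = [b - a for a, b in zip(series, series[1:])]
    return pstdev(changes)


if __name__ == "__main__":
    # Made-up incomes in thousands, grouped into intervals of 10 thousand.
    incomes = [4, 8, 12, 15, 19, 23, 37]
    print(histogram_counts(incomes, 10))

    # Two made-up series that tend to go up together.
    print(comovement([1, 2, 3, 4], [2, 4, 5, 9]))

    # Two made-up series on the same scale; the second has larger swings.
    print(volatility([10, 11, 10, 11, 10]), volatility([10, 15, 10, 15, 10]))
```

In the income example, each key of the returned dictionary is the lower bound of an interval and each value is the height of the corresponding bar in the histogram.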
Lessons
1. Visualizing Time Series
2. Creating Charts With the Data Software
3. Creating Charts Exercises
Activities and Assignments
• Quiz 1 is due this week. See your Course Schedule for due dates.
Data Files
You may require the following files to complete this module:
Climate_module2.rda
Climate_module2.xlsx
Climate_module3.xlsx
ClimateEX_module2.rda
ClimateEX_module2.xlsx
ClimateEX_module3.xlsx
Rcolor.pdf