LAB 3: DATA VISUALIZATION

Skills: functions, file I/O, data visualization

Background

One of the first steps in working with any data set is to get a global overview, which usually means visualizing the data. For one-dimensional data, a simple plot of the dependent variable (say, amplitude) versus the independent variable (say, time) does the trick. Multi-dimensional data sets are more of a challenge. In this exercise, we will look at how best to visualize the relation between three related quantities. The data file PeakSepDataSet.dat contains 8114 lines of 5 columns, described as follows:

Column  Description                 Unit
------  --------------------------  -------------
1       Reference number            -
2       Peak separation             km s⁻¹
3       Flux shell parameter        dimensionless
4       System viewing inclination  degrees
5       (ignore)

We wish to visualize the relation between peak separation (Δ), shell parameter (_S_), and viewing inclination (_I_).
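As a starting point, the three columns of interest could be read with numpy.loadtxt. The sketch below assumes the file is whitespace-delimited with no header line, which the description above does not confirm. The three sample rows are made-up placeholders so that the snippet runs without the real file, and load_columns is a hypothetical helper name.

```python
import io

import numpy as np

# Made-up rows standing in for PeakSepDataSet.dat (ASSUMED to be
# whitespace-delimited with no header line). Pass the real file path
# to load_columns() instead of `sample`.
sample = io.StringIO(
    "1 120.5 1.10 35.0 0\n"
    "2 310.0 1.45 72.5 0\n"
    "3  45.2 0.95 10.0 0\n"
)


def load_columns(source):
    """Return peak separation, shell parameter, and inclination arrays."""
    # numpy counts columns from 0, so table columns 2-4 are usecols 1-3.
    delta, shell, incl = np.loadtxt(source, usecols=(1, 2, 3), unpack=True)
    return delta, shell, incl


delta, shell, incl = load_columns(sample)
print(delta.shape, shell.min(), incl.max())
```

With the real file, `load_columns("PeakSepDataSet.dat")` should return three arrays of length 8114 if the format assumption holds.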

Part 1

Use the mplot3d library (tutorial) to plot the data as a 3D scatter plot. Useful axis ranges are 0 to 500 km/s for Δ, 0.9 to 2 for _S_ and 0 to 90 for _I_ (note that these choices do exclude some of the data). You can change the default viewing location via the ax.view_init(elev=,azim=) command. Work with these commands to provide the most informative representation of this data set. Can you describe any general trends? INCLUDE THE 3D SCATTER PLOT AT THE VIEWING LOCATION YOU THINK IS BEST IN YOUR SUBMITTED REPOSITORY.
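A minimal sketch of the 3D scatter plot follows. It uses uniformly random stand-in data rather than the real file, so it shows the mechanics only and no trend should be read from it. The choice of which quantity goes on which axis, the marker size, the view angles, and the output file name are all assumptions to experiment with.

```python
import matplotlib

matplotlib.use("Agg")  # write to file without a display; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in data (NOT the lab data), drawn inside the suggested ranges.
rng = np.random.default_rng(0)
n = 500
shell = rng.uniform(0.9, 2.0, n)    # S, dimensionless
incl = rng.uniform(0.0, 90.0, n)    # I, degrees
delta = rng.uniform(0.0, 500.0, n)  # peak separation, km/s

fig = plt.figure(figsize=(7, 6))
ax = fig.add_subplot(projection="3d")
ax.scatter(shell, incl, delta, s=4)

# Axis ranges suggested in the text.
ax.set_xlim(0.9, 2.0)
ax.set_ylim(0, 90)
ax.set_zlim(0, 500)
ax.set_xlabel("Shell parameter S")
ax.set_ylabel("Inclination I (degrees)")
ax.set_zlabel("Peak separation (km/s)")

# Elevation and azimuth of the camera, in degrees; try several values.
ax.view_init(elev=25, azim=-60)
fig.savefig("scatter3d.png", dpi=150)
```

Removing the `matplotlib.use("Agg")` line and calling `plt.show()` instead gives an interactive window, where dragging the plot is a quick way to find good `elev` and `azim` values.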

Part 2

If we plot only two variables in a 2D plot, then we are losing the information contained in the third variable. To get this back, we can vary the darkness of each plotted symbol according to the third variable; in matplotlib, this is done with the c and cmap arguments to the scatter function. Make this plot so that white corresponds to an inclination of 0 degrees and black to 90 degrees, and add a colourbar (the colorbar function) to indicate what the symbol colours mean. If you can figure out how to make the plot with seaborn instead of matplotlib, that’s fine too.
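One way this could look in matplotlib is sketched below, again with random stand-in data. It assumes _S_ on the horizontal axis and Δ on the vertical axis, which the text does not prescribe. The "Greys" colormap runs from white at the low end to black at the high end, and vmin and vmax pin those ends to 0 and 90 degrees.

```python
import matplotlib

matplotlib.use("Agg")  # write to file without a display; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in data (NOT the lab data).
rng = np.random.default_rng(1)
n = 500
shell = rng.uniform(0.9, 2.0, n)
delta = rng.uniform(0.0, 500.0, n)
incl = rng.uniform(0.0, 90.0, n)

fig, ax = plt.subplots(figsize=(7, 5))
# c supplies the third variable; cmap and vmin/vmax map it to a shade.
sc = ax.scatter(shell, delta, c=incl, cmap="Greys", vmin=0, vmax=90, s=8)
cbar = fig.colorbar(sc, ax=ax)
cbar.set_label("Inclination I (degrees)")

ax.set_xlim(0.9, 2.0)
ax.set_ylim(0, 500)
ax.set_xlabel("Shell parameter S")
ax.set_ylabel("Peak separation (km/s)")
fig.savefig("scatter2d_grey.png", dpi=150)
```

Points with inclinations near 0 are drawn nearly white and will be hard to see on a white background; a thin marker edge or a light grey axes background are two possible remedies.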

Part 3

We’re not restricted to black and white: remake the plot using a colour scheme for the point colours. (matplotlib and seaborn both have many different colour schemes to choose from, and this tutorial provides lots of information, including how to choose schemes that work for people who are colorblind.) Don’t forget to include a colourbar in your plot! INCLUDE THE 2D COLOUR PLOT WITH A COLOUR SCHEME THAT YOU THINK WORKS AND THE INCLINATION LEGEND IN YOUR SUBMITTED REPOSITORY.
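To help choose a scheme, candidate colormaps can be compared side by side. The sketch below uses random stand-in data and three of matplotlib's built-in perceptually uniform colormaps; the particular shortlist is only a suggestion, and the axis assignment (_S_ horizontal, Δ vertical) is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # write to file without a display; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in data (NOT the lab data).
rng = np.random.default_rng(2)
n = 500
shell = rng.uniform(0.9, 2.0, n)
delta = rng.uniform(0.0, 500.0, n)
incl = rng.uniform(0.0, 90.0, n)

# A suggested shortlist; any registered matplotlib colormap name works here.
cmaps = ["viridis", "cividis", "plasma"]

fig, axes = plt.subplots(1, 3, figsize=(14, 4), sharey=True)
for ax, name in zip(axes, cmaps):
    sc = ax.scatter(shell, delta, c=incl, cmap=name, vmin=0, vmax=90, s=8)
    ax.set_title(name)
    ax.set_xlabel("Shell parameter S")
    # One colourbar per panel, since each panel uses a different scheme.
    fig.colorbar(sc, ax=ax, label="Inclination I (degrees)")
axes[0].set_ylabel("Peak separation (km/s)")
fig.savefig("colormap_comparison.png", dpi=150)
```

Of these, cividis was designed with colour-vision deficiency in mind, which makes it a reasonable first candidate if accessibility is the priority.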

Part 4

Our 2D colour plot, while an improvement, still has limitations. The most significant arises from overlapping points obscuring features of the data. One way to handle this is to plot averages over regions of the _S_-Δ plane instead of individual points. To accomplish this:

(a) Introduce grids in _S_ and Δ. If the grid spacings are _s_ and δ, we have

$$S_i = S_0 + (i - 1)\,s \quad \text{for } i = 1, \dots, N_s$$

$$\Delta_j = \Delta_0 + (j - 1)\,\delta \quad \text{for } j = 1, \dots, N_d$$

Your grid should span 0 to 2 in _Ns_ steps for _S_ and 0 to 500 km/s in _Nd_ steps for Δ. The intersections of these grids now define cells or boxes in the _S_-Δ plane.

(b) For each (_S, Δ, I_) data point, assign it to one of the cells (or boxes) defined by the grid above. For each box, keep track of the values of _I_ assigned to it.

(c) When all data points have been processed, determine for each box the number of assigned points, and the average and standard deviation of the _I_ values assigned to each box.

(d) Use the pyplot.imshow function to produce three plots in the _S_-Δ plane: the number of points in each box, the average _I_ in each box, and the standard deviation of _I_ in each box. (Hint: make sure you understand how the origin argument to this function works.) Make sure your plots have suitable titles and axis labels.
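Steps (a) to (d) can be sketched with numpy and matplotlib as follows. This is one possible approach under stated assumptions, not the required one: the cell counts NS and ND are arbitrary choices, points outside the grid are dropped, the standard deviation is the population form (dividing by N), the data are random stand-ins, and bin_inclinations is a hypothetical helper name.

```python
import matplotlib

matplotlib.use("Agg")  # write to file without a display; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

S_MIN, S_MAX, NS = 0.0, 2.0, 20    # NS is an ASSUMED number of cells in S
D_MIN, D_MAX, ND = 0.0, 500.0, 25  # ND is an ASSUMED number of cells in Delta


def bin_inclinations(shell, delta, incl):
    """Return per-cell count, mean, and population std of inclination.

    Each array has shape (ND, NS): rows follow Delta, columns follow S.
    """
    # Steps (a) and (b): 0-based cell indices (the text counts from 1).
    i = np.floor((shell - S_MIN) / ((S_MAX - S_MIN) / NS)).astype(int)
    j = np.floor((delta - D_MIN) / ((D_MAX - D_MIN) / ND)).astype(int)
    inside = (i >= 0) & (i < NS) & (j >= 0) & (j < ND)
    i, j, vals = i[inside], j[inside], incl[inside]

    # Step (c): accumulate N, sum(I), and sum(I^2) in each cell.
    count = np.zeros((ND, NS))
    total = np.zeros((ND, NS))
    total_sq = np.zeros((ND, NS))
    np.add.at(count, (j, i), 1)
    np.add.at(total, (j, i), vals)
    np.add.at(total_sq, (j, i), vals**2)

    # Empty cells give 0/0, which is deliberately left as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count
        var = total_sq / count - mean**2
        std = np.sqrt(np.clip(var, 0.0, None))
    return count, mean, std


# Random stand-in data (NOT the lab data).
rng = np.random.default_rng(3)
shell = rng.uniform(0.9, 2.0, 5000)
delta = rng.uniform(0.0, 500.0, 5000)
incl = rng.uniform(0.0, 90.0, 5000)
count, mean, std = bin_inclinations(shell, delta, incl)

# Step (d): one imshow panel per quantity.
panels = [
    (count, "Number of points per cell", "N"),
    (mean, "Mean inclination per cell", "degrees"),
    (std, "Std. dev. of inclination per cell", "degrees"),
]
fig, axes = plt.subplots(1, 3, figsize=(15, 4.5))
for ax, (grid, title, label) in zip(axes, panels):
    # origin="lower" puts row 0 (smallest Delta) at the bottom of the image;
    # extent labels the axes in data units instead of cell indices.
    im = ax.imshow(grid, origin="lower", aspect="auto",
                   extent=(S_MIN, S_MAX, D_MIN, D_MAX))
    ax.set_title(title)
    ax.set_xlabel("Shell parameter S")
    ax.set_ylabel("Peak separation (km/s)")
    fig.colorbar(im, ax=ax, label=label)
fig.savefig("cell_plots.png", dpi=150)
```

With the default origin="upper", row 0 would be drawn at the top and the Δ axis would run downward, which is the pitfall the hint in (d) points at. If scipy is available, scipy.stats.binned_statistic_2d offers an alternative to the hand-written accumulation.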

INCLUDE YOUR CELL-BASED PLOT OF NUMBERS, AVERAGE INCLINATION, AND STANDARD DEVIATION OVER THE _S-Δ_ GRID IN YOUR SUBMITTED REPOSITORY.

THE PYTHON CODE USED TO MAKE ALL THREE PLOTS SHOULD ALSO BE INCLUDED IN YOUR SUBMITTED REPOSITORY AS LAB3.PY.