Pandas_Visualizations

Visual Analytics – Pandas Visualizations
Pandas has built-in plotting methods that are built on top of matplotlib. Here are some examples of what you can do with them.

Install necessary libraries

#!pip install numpy
#!pip install pandas
#!pip install matplotlib

Import necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

Create data

df1 = pd.DataFrame(np.random.randn(50, 4), columns=['A', 'B', 'C', 'D'])
df1.head()

          A         B         C         D
0  0.011407 -2.573139  0.629844  0.863108
1  1.000630  0.093029  1.749768  0.904575
2  0.510490  1.356406  1.458783  0.225151
3 -0.395484  0.294955 -0.902467 -0.747303
4 -0.542393 -0.740049 -0.022740  0.376725
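
The values above will differ on every run because np.random.randn is not seeded. A minimal sketch of a reproducible variant follows; the seed 42 and the names df_seeded and df_again are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Assumption: 42 is an arbitrary seed chosen only for illustration.
rng = np.random.default_rng(42)
df_seeded = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

# A second generator created with the same seed yields the same values.
rng_again = np.random.default_rng(42)
df_again = pd.DataFrame(rng_again.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

print(df_seeded.equals(df_again))
```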

df2 = df1 + 10
df2.head()

           A          B          C          D
0  10.011407   7.426861  10.629844  10.863108
1  11.000630  10.093029  11.749768  10.904575
2  10.510490  11.356406  11.458783  10.225151
3   9.604516  10.294955   9.097533   9.252697
4   9.457607   9.259951   9.977260  10.376725

Matplotlib has several style sheets that can be used to alter the appearance of a plot. Just import matplotlib and use plt.style.use() prior to drawing a plot.

plt.style.use('bmh')
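
To see which style names are valid on a given installation, and to try a style without changing the global setting, something like the following may help. It is a small sketch: the variable names and the toy data are illustrative, and the exact list of styles depends on the installed matplotlib version.

```python
import matplotlib.pyplot as plt

# Names of the style sheets that ship with the installed matplotlib.
available_styles = plt.style.available
print(available_styles[:5])

# plt.style.context applies a style only inside the with-block,
# leaving the globally selected style untouched afterwards.
with plt.style.context('ggplot'):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])

plt.close(fig)
```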

Plot Types
There are several plot types built into pandas:

df.plot.area
df.plot.barh
df.plot.density
df.plot.hist
df.plot.line
df.plot.scatter
df.plot.bar
df.plot.box
df.plot.hexbin
df.plot.kde
df.plot.pie

These can also be called using the kind argument of plot, e.g. for a histogram, df.plot(kind='hist'). To make other plots with this syntax, set kind to one of the names in the list above (e.g. 'box', 'barh', etc.).
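
As a quick check that the two spellings are interchangeable, the sketch below draws the same histogram both ways on two side-by-side axes. The frame demo, its seed, and the axes names are assumptions made for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

fig, (ax_accessor, ax_kind) = plt.subplots(1, 2, figsize=(8, 3))

# Accessor form: the plot type is a method on .plot
demo['A'].plot.hist(ax=ax_accessor)

# kind form: the plot type is passed as a string
demo['A'].plot(kind='hist', ax=ax_kind)

# Both calls draw the same number of bars (pandas defaults to 10 bins).
print(len(ax_accessor.patches), len(ax_kind.patches))
plt.close(fig)
```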

Histogram
Use df['col_name'].hist() to plot a histogram of count values.

df1['A'].plot.hist()

df1['A'].plot.hist(rwidth=0.9)  # set the relative width of each bar

df1['A'].plot.hist(edgecolor='black')  # set the edge color to black

plt.style.use('dark_background')  # change style

df1['B'].hist()

plt.style.use('ggplot')

df1['A'].plot.hist(bins=50)
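
The bar heights in these histograms are counts per bin. As a rough illustration of what the bins argument controls, the same counts can be computed without plotting via np.histogram; the frame demo and its seed are assumptions for this sketch.

```python
import numpy as np
import pandas as pd

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

# counts[i] is the number of values falling between edges[i] and edges[i + 1].
counts, edges = np.histogram(demo['A'], bins=10)

print(counts)        # one count per bin
print(counts.sum())  # every row lands in exactly one bin
```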

df2.plot.area(alpha=0.4)

df2.loc[0:10].plot.bar()

Stacked Bar Plots

df2.loc[0:10].plot.bar(stacked=True)
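
Both bar plots select their rows with df2.loc[0:10]. Because .loc slices by label and includes the end label, this yields 11 rows rather than 10. The short sketch below shows the difference from positional slicing with .iloc; the frame demo is an assumption for the example.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the default integer index 0..19.
demo = pd.DataFrame(np.arange(80).reshape(20, 4), columns=['A', 'B', 'C', 'D'])

by_label = demo.loc[0:10]      # label-based: includes the row labeled 10
by_position = demo.iloc[0:10]  # position-based: stops before position 10

print(len(by_label), len(by_position))
```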

Line Plots

df1.plot.line(y='B', figsize=(12, 3), lw=1)

Scatter Plots

df1.plot.scatter(x='A', y='B')

Color based on another column
You can color data points to represent a third dimension by using the 'c' argument. Here we color the points based on the 'C' column.

df1.plot.scatter(x='A', y='B', c='C', cmap='coolwarm')
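
To confirm that the colors really are driven by the 'C' column, one can inspect the values attached to the scatter collection that matplotlib creates. This is a sketch under assumptions: demo, its seed, and the inspection via ax.collections[0] are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(2)
demo = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

ax = demo.plot.scatter(x='A', y='B', c='C', cmap='coolwarm')

# The scatter points are the first collection on the axes; the array
# attached to it holds the values that the colormap translates to colors.
color_values = np.asarray(ax.collections[0].get_array())
print(len(color_values))
plt.close('all')
```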

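Size based on another column
You can size data points to represent a third dimension by using the 's' argument. Here the size of each point is based on the 'C' column. Because 'C' contains negative values, some of the computed sizes are negative, which is why matplotlib emits the RuntimeWarning shown after the code.

with np.errstate(divide='ignore', invalid='ignore', over='ignore', under='ignore'):
    df1.plot.scatter(x='A', y='B', s=df1['C']*200)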

C:\Users\roman\anaconda3\lib\site-packages\matplotlib\collections.py:922: RuntimeWarning: invalid value encountered in sqrt
scale = np.sqrt(self._sizes) * dpi / 72.0 * self._factor
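
One way to avoid this warning is to make sure every marker size is non-negative before plotting. The sketch below uses the absolute value of 'C', which is a swapped-in choice rather than the notebook's own approach; it means the size encodes magnitude only and the sign of 'C' is lost. The names demo and sizes are assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(3)
demo = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

# Marker sizes must be non-negative, so scale the magnitude of 'C'.
sizes = demo['C'].abs() * 200

ax = demo.plot.scatter(x='A', y='B', s=sizes)
plt.close('all')
```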

df2.plot.box()

Hexagonal Bin Plot
This is useful for bivariate data.

df3 = pd.DataFrame(np.random.randn(2000, 2), columns=['A', 'B'])
df3.plot.hexbin(x='A', y='B', gridsize=25, cmap='Oranges')
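
Each hexagon is shaded by the number of points that fall inside it, and gridsize sets how many hexagons span the x direction. As a rough sketch, the per-hexagon counts can be read back from the plotted collection; demo3, its seed, and the inspection via ax.collections[0] are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(4)
demo3 = pd.DataFrame(rng.standard_normal((2000, 2)), columns=['A', 'B'])

ax = demo3.plot.hexbin(x='A', y='B', gridsize=25, cmap='Oranges')

# The hexagons are stored as one collection; its array holds the
# number of points counted in each hexagon.
hex_counts = np.asarray(ax.collections[0].get_array())
total_points = int(round(float(np.nansum(hex_counts))))
print(total_points)
plt.close('all')
```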

Kernel Density Estimation plot (KDE)

df2['B'].plot.kde()

df1.plot.density()
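
The smoothness of a KDE curve depends on its bandwidth, which pandas exposes through the bw_method argument. Note that pandas relies on SciPy for KDE plots, so SciPy must be installed. The sketch below overlays a narrow and a wide bandwidth; the values 0.2 and 1.0, the frame demo, and its seed are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data; the seed is an arbitrary choice.
rng = np.random.default_rng(5)
demo = pd.DataFrame(rng.standard_normal((50, 4)), columns=['A', 'B', 'C', 'D'])

fig, ax = plt.subplots()

# A small bandwidth follows the sample closely (wigglier curve);
# a large bandwidth smooths over more of the detail.
demo['B'].plot.kde(bw_method=0.2, ax=ax, label='bw_method=0.2')
demo['B'].plot.kde(bw_method=1.0, ax=ax, label='bw_method=1.0')
ax.legend()

n_curves = len(ax.lines)
plt.close(fig)
```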
