COMP2420/COMP6420 – Introduction to Data Management, Analysis and Security
Live Coding Lecture – Independent Sample T Tests

# Important Imports
# MAKE SURE YOU RUN THIS CELL FIRST

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# hypothesis testing imports
from scipy import stats

# ignore warnings
import warnings
warnings.filterwarnings('ignore')

dataset = sns.load_dataset('iris')
dataset.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

dataset.species.unique()

array(['setosa', 'versicolor', 'virginica'], dtype=object)

setosa = dataset[dataset.species == 'setosa']
versicolor = dataset[dataset.species == 'versicolor']

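plt.figure(figsize=(12,8))
# sns.distplot is deprecated; histplot with kde=True draws the same histogram plus density curve
ax1 = sns.histplot(x=setosa.sepal_length, kde=True, stat='density', color='b')
ax2 = sns.histplot(x=versicolor.sepal_length, kde=True, stat='density', color='orange')
# dashed vertical lines mark the mean sepal length of each species
plt.axvline(np.mean(setosa.sepal_length), color='b', linestyle='dashed', linewidth=5)
plt.axvline(np.mean(versicolor.sepal_length), color='orange', linestyle='dashed', linewidth=5);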

Null Hypothesis: The mean sepal lengths of the two populations are equal. (The two samples could have come from the same species.)
Alternate Hypothesis: The mean sepal lengths of the two populations are not equal. (The two samples did not come from the same species.)
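
Written formally, the same two-sided test can be stated as below. The symbols for the population mean sepal lengths are notation introduced here for illustration; they do not appear in the original notebook.

```latex
H_0 : \mu_{\text{setosa}} = \mu_{\text{versicolor}}
\qquad \text{versus} \qquad
H_1 : \mu_{\text{setosa}} \neq \mu_{\text{versicolor}}
```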

N = 20  # sample size per group; not defined in the original notebook, 20 is an assumed value
setosa_sample = np.random.choice(setosa.sepal_length, N)
versicolor_sample = np.random.choice(versicolor.sepal_length, N)

t_value, p_value = stats.ttest_ind(versicolor_sample, setosa_sample)
print(f"T Value obtained is {t_value}.")
print(f"P Value obtained is {p_value}.")

T Value obtained is 6.12675097082924.
P Value obtained is 3.814930721300677e-07.
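
One way to turn the p-value into a decision is to compare it with a significance level chosen before running the test. The sketch below assumes alpha = 0.05, a common convention that the notebook itself does not state, and reuses the p-value printed above.

```python
# Significance level: 0.05 is an assumed, conventional choice (not stated in the notebook)
alpha = 0.05

# p-value copied from the output above
p_value = 3.814930721300677e-07

# Reject the null hypothesis when the p-value falls below alpha
reject_null = p_value < alpha

if reject_null:
    print("Reject the null hypothesis: the mean sepal lengths differ.")
else:
    print("Fail to reject the null hypothesis.")
```

Because the samples are drawn at random, rerunning the notebook gives a different p-value each time, so the comparison should be repeated on the fresh value.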

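A t-score that is large in absolute value means the difference between the group means is large relative to the variability within the groups, which is evidence that the groups are different.
A t-score close to zero means the difference is small relative to that variability, so the groups are similar.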
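
To see where the t-score comes from, it may help to compute it by hand. The sketch below uses the pooled-variance (Student's) formula, which is what `stats.ttest_ind` applies by default (`equal_var=True`), and checks the result against SciPy. The helper `pooled_t_statistic` and the synthetic data are hypothetical, made up for illustration; they are not the iris samples.

```python
import numpy as np
from scipy import stats


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    # Pooled variance: per-group sample variances weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    # Standard error of the difference between the two sample means
    standard_error = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (a.mean() - b.mean()) / standard_error


# Synthetic, illustrative data with arbitrarily chosen means and spreads
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.9, scale=0.5, size=20)
group_b = rng.normal(loc=5.0, scale=0.35, size=20)

manual_t = pooled_t_statistic(group_a, group_b)
scipy_t, scipy_p = stats.ttest_ind(group_a, group_b)

print(f"Manual t: {manual_t}")
print(f"SciPy t:  {scipy_t}")
```

The numerator is the difference in sample means and the denominator is its standard error, so the t-score grows when the means move apart or when the within-group spread shrinks.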
