COSC 2673/2793 | Machine Learning
Week 1 Lab Exercises: **Introduction to Python**
We will be using the Python programming language for all lab exercises & assignments in this course. Python is a great general-purpose programming language on its own, but with the help of a few popular libraries (numpy, matplotlib, scikit-learn) it becomes a powerful environment for machine learning.
We expect most students to be familiar with Python. If you are not, you will need to get yourself up to speed. However, don’t stress too much. This course doesn’t focus on programming, so you won’t need to learn much Python to complete the labs and assignments.
The following lab exercise provides a quick crash course on the basics of both the Python programming language and the numpy library, so that you can get yourself up to speed.
Note:
This tutorial is NOT designed to give you a comprehensive understanding of Python and numpy. It is aimed at refreshing your Python skills and giving you a start. If you would like to learn Python from scratch, a LinkedIn Learning course is provided in the Canvas week 0 module.
Before starting the lab
• You need to install Anaconda on your local machine or log in to the AWS classroom setup for the course (instructions in the Canvas week 0 module).
• You need to be familiar with starting an IPython/Jupyter notebook (instructions in the Canvas week 0 module).
Check Python version
In this course we will be using Python 3. You can double-check your Python version at the command line, after activating your environment, by running the shell command python --version. In the notebook environment you can run shell commands by adding a ! in front of the command. Run the cell below to get the Python version for your notebook.
In [ ]:
!python --version
The Python version should be 3.5 or above. Please check your installation if this condition is not met.
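The same check can also be done from inside Python itself. The following is a minimal sketch using the standard sys module; the variable name meets_requirement is only illustrative, and the 3.5 threshold is the one stated above.

```python
import sys

# Full version string of the interpreter running this notebook
print(sys.version)

# sys.version_info behaves like a tuple (major, minor, micro, ...),
# so it can be compared directly against the minimum required version
meets_requirement = sys.version_info >= (3, 5)
print("Python 3.5 or above:", meets_requirement)
```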
Basic datatypes and operations
Like most programming languages, python has several data types and supports many mathematical and logical operations.
• Text type: str
• Numeric types: int, float, complex
• Sequence types: list, tuple, range
• Mapping type: dict
• Set types: set, frozenset
• Boolean type: bool
• Binary types: bytes, bytearray, memoryview
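As a quick illustration of the types listed above, the sketch below creates one example value of each type and prints its type name. The example values are arbitrary and chosen only for illustration.

```python
# One example value for each built-in type listed above
examples = [
    "hello",                                        # str
    3, 2.5, 1 + 2j,                                 # int, float, complex
    [1, 2], (1, 2), range(3),                       # list, tuple, range
    {"a": 1},                                       # dict
    {1, 2}, frozenset({1, 2}),                      # set, frozenset
    True,                                           # bool
    b"abc", bytearray(b"abc"), memoryview(b"abc"),  # bytes, bytearray, memoryview
]

# Collect and print the name of each value's type
type_names = []
for value in examples:
    type_names.append(type(value).__name__)
    print(repr(value), "->", type(value).__name__)
```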
Numeric Types
Unlike C++ (and many other programming languages), we do not have to declare the data type of a variable in Python. It is inferred automatically, and the data type of a variable can be printed as follows:
In [ ]:
x = 3
print(type(x))
We can also perform common mathematical operations on these numeric data types
In [ ]:
print(x) # Prints "3"
print(x + 1) # Addition; prints "4"
print(x - 1) # Subtraction; prints "2"
print(x * 2) # Multiplication; prints "6"
print(x ** 2) # Exponentiation; prints "9"
x += 1
print(x) # Prints "4"
x *= 2
print(x) # Prints "8"
The same can be done with floats:
In [ ]:
y = 2.5
print(type(y))
print(y, y + 1, y * 2, y ** 2)
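Division is worth a separate look, because Python 3 distinguishes between true division and floor division. The following short sketch (with arbitrary example values) shows the difference, along with what happens when an int and a float are mixed.

```python
a = 7
b = 2

true_div = a / b     # true division always returns a float
floor_div = a // b   # floor division of two ints returns an int
remainder = a % b    # remainder of the division
mixed = a + 2.5      # combining an int with a float gives a float

print(true_div, floor_div, remainder, mixed)
print(type(true_div), type(floor_div), type(mixed))
```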
Logical operators
Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols
In [ ]:
t = True
f = False
print(type(t)) # Prints "<class 'bool'>"
print(t and f) # Logical AND; prints "False"
print(t or f) # Logical OR; prints "True"
print(not t) # Logical NOT; prints "False"
print(t != f) # Logical XOR; prints "True"
String support in python
In [ ]:
hello = 'hello' # String literals can use single quotes
world = "world" # or double quotes; it does not matter.
print(hello)
print(len(hello))
hw = hello + ' ' + world # String concatenation
print(hw)
hw12 = '%s %s %d' % (hello, world, 12) # sprintf style string formatting
print(hw12)
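Besides the sprintf style shown above, Python offers two other common ways to format strings: the str.format method and f-strings. The sketch below assumes Python 3.6 or later, since f-strings are not available in 3.5; the variable names are illustrative.

```python
hello = 'hello'
world = 'world'

# str.format: the {} placeholders are filled in order
hw12_format = '{} {} {}'.format(hello, world, 12)

# f-string (Python 3.6+): expressions are written directly inside the braces
hw12_fstring = f'{hello} {world} {12}'

print(hw12_format)
print(hw12_fstring)
```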
Some useful methods of the string class for data pre-processing and cleaning:
In [ ]:
s = "hello"
print(s.capitalize()) # Capitalize a string; prints "Hello"
print(s.upper()) # Convert a string to uppercase; prints "HELLO"
print(s.replace('l', '(ell)')) # Replace all instances of one substring with another;
# prints "he(ell)(ell)o"
print('  world '.strip()) # Strip leading and trailing whitespace; prints "world"
**More detailed descriptions on python strings are provided in** [W3Schools Python Tutorial](https://www.w3schools.com/python/python_strings.asp)
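When cleaning text data, the string methods above are often combined with split and join. The following is a small sketch on a made-up input string (the value of raw is hypothetical, not taken from any course dataset).

```python
raw = '  Machine,Learning , LAB  '   # hypothetical messy input

# Split on commas, then strip whitespace and lowercase each piece
fields = []
for part in raw.split(','):
    fields.append(part.strip().lower())
print(fields)

# Join the cleaned pieces back into a single string
cleaned = ' '.join(fields)
print(cleaned)
```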
Containers
Python includes several built-in data types to store collections of data: lists, dictionaries, sets, and tuples
Lists
Lists are like arrays, and they are very flexible:
• Can be sliced & resized
• Can contain elements of different types.
In [ ]:
list1 = [1, 2, 3, 4] # Define and initialize a list
print(list1[0], list1[2]) # Prints "1 3"
list1[2] = "three" # Change an element to a different data type
print(list1[0], list1[2]) # Prints "1 three"
print(list1[-1]) # Negative indices count from the end of the list; prints "4"
list1.append(5) # Add a new element to the end of the list
print(list1) # Prints "[1, 2, 'three', 4, 5]"
# Print the length of the list
print("Length of list is: ", len(list1))
We can also access several elements of a list at once using slicing.
In [ ]:
list2 = list(range(10)) # range(10) yields the integers 0 to 9; list() collects them into a list
print(list2) # Prints "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
print(list2[2:4]) # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(list2[2:]) # Get a slice from index 2 to the end; prints "[2, 3, 4, 5, 6, 7, 8, 9]"
print(list2[:2]) # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(list2[:-1]) # Slice indices can be negative; prints "[0, 1, 2, 3, 4, 5, 6, 7, 8]"
print(list2[1:7:2]) # Get every other element from index 1 to 6; prints "[1, 3, 5]"
print(list2[::-1]) # Reverse the list order
list2[2:4] = [8, 9] # Assign a new sublist to a slice
print(list2) # Prints "[0, 1, 8, 9, 4, 5, 6, 7, 8, 9]"
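One detail that is easy to miss: slicing a Python list produces a new list, so changing the slice does not change the original. The sketch below illustrates this with arbitrary example values; keep it in mind when comparing against numpy arrays, where slices behave differently.

```python
original = list(range(5))
piece = original[1:4]     # slicing builds a new list: [1, 2, 3]

piece[0] = 'changed'      # modify the slice only
print(piece)
print(original)           # the original list is untouched
```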
Loop over list elements
In [ ]:
temperatures = [30.2, 35.1, 36.6, 33.1]
for temperature in temperatures:
    print(temperature + 1)
If you want access to the index of each element within the body of a loop, use the built-in enumerate function:
In [ ]:
temperatures = [30.2, 35.1, 36.6, 33.1]
for inx, temperature in enumerate(temperatures):
    print('Temperature of patient {} is: {:.2f} degrees'.format(inx + 1, temperature))
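Two small variations on the loop above may be useful. enumerate accepts a start argument, which avoids adding 1 to the index by hand, and the built-in zip function walks over two lists in parallel. The patient names below are made up for illustration.

```python
temperatures = [30.2, 35.1, 36.6, 33.1]
names = ['Ann', 'Ben', 'Cat', 'Dan']   # hypothetical patient names

# enumerate(..., start=1) begins counting at 1 instead of 0
numbered = list(enumerate(temperatures, start=1))
print(numbered)

# zip pairs up elements from the two lists position by position
for name, temperature in zip(names, temperatures):
    print('{}: {:.1f} degrees'.format(name, temperature))
```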
List comprehensions
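List comprehensions can be used to build a new list from the elements of an existing one. Consider the following code, which computes square numbers with a loop: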
In [ ]:
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
    squares.append(x ** 2)
print(squares)
In Python there is an easier and cleaner way of doing this, using a list comprehension:
In [ ]:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares)
Conditions can also be added to list comprehensions.
In [ ]:
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)
**More details on lists can be found at** [W3Schools Python Tutorial](https://www.w3schools.com/python/python_lists.asp)
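A condition placed at the end of a comprehension filters elements, as in the even_squares example. A conditional expression placed at the front instead transforms every element. The following sketch shows the second form; the variable name labels is illustrative.

```python
nums = [0, 1, 2, 3, 4]

# The if/else at the front chooses a value for each element;
# no elements are dropped, so the result has the same length as nums
labels = ['even' if x % 2 == 0 else 'odd' for x in nums]
print(labels)
```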
Dictionaries
A dictionary stores key-value pairs. For example, you can use a dictionary to hold the information for a given student.
In [ ]:
d = {'first name': 'John', 'last name': 'Doe', 'age': 27, 'ML marks': 85} # Create a new dictionary with some data
print(d['first name']) # Get an entry from a dictionary; prints "John"
print('age' in d) # Check if a dictionary has a given key; prints "True"
d['AA marks'] = 68 # Set an entry in a dictionary
print(d)
print(d.get('APT marks', 'N/A')) # Get an element with a default; prints "N/A"
print(d.get('ML marks', 'N/A')) # Get an element with a default; prints "85"
del d['age'] # Remove an element from a dictionary
print(d) # "age" is no longer a key
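The reason get is useful becomes clear when a key is missing: square-bracket access raises a KeyError, while get quietly returns None (or the default you supply). A minimal sketch, reusing a shortened version of the student dictionary:

```python
d = {'first name': 'John', 'last name': 'Doe', 'ML marks': 85}

# Square brackets raise KeyError when the key does not exist
try:
    print(d['APT marks'])
except KeyError as err:
    print('No such key:', err)

# get returns None when the key is missing and no default is given
missing = d.get('APT marks')
print(missing)

# keys() and values() give access to each side of the pairs
keys = list(d.keys())
values = list(d.values())
print(keys, values)
```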
Loop over Dictionaries
In [ ]:
d = {'first name': 'John', 'last name': 'Doe', 'age': 27, 'ML marks': 85}
for key in d:
    value = d[key]
    print('Key `{}` has value `{}`'.format(key, value))
If you want access to keys and their corresponding values, use the items method:
In [ ]:
d = {'first name': 'John', 'last name': 'Doe', 'age': 27, 'ML marks': 85}
for key, value in d.items():
    print('Key `{}` has value `{}`'.format(key, value))
Similar to lists, dictionary comprehensions allow you to easily construct dictionaries.
In [ ]:
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)
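Dictionary comprehensions combine naturally with the items method when you want to filter an existing dictionary. The sketch below uses made-up marks and an arbitrary threshold of 70, purely for illustration.

```python
marks = {'ML': 85, 'AA': 68, 'APT': 72}   # hypothetical marks

# Keep only the entries whose value is at least 70
high_marks = {course: mark for course, mark in marks.items() if mark >= 70}
print(high_marks)
```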
Sets
Sets are useful in representing unordered data. Since sets are unordered, you cannot make assumptions about the order in which you visit the elements of the set. It may not be in the order you put elements in.
In [ ]:
colors = {'red', 'green', 'red', 'blue'} # The duplicate 'red' is stored only once
for idx, color in enumerate(colors):
    print('#%d: %s' % (idx + 1, color))
Similar to lists, set comprehensions allow you to easily construct sets.
In [ ]:
from math import sqrt
nums = {int(sqrt(x)) for x in range(30)}
print(nums)
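Sets also support the usual mathematical set operations, and converting a list to a set is a common way to remove duplicates. A brief sketch with arbitrary example values:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b     # elements in either set
common = a & b    # elements in both sets
only_a = a - b    # elements in a but not in b
print(union, common, only_a)

# Removing duplicates from a list; sorted() gives a predictable order
unique_sorted = sorted(set(['red', 'green', 'red', 'blue']))
print(unique_sorted)
```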
Tuples
A tuple is an (immutable) ordered list of values. A tuple is in many ways similar to a list; one of the most important differences is that tuples can be used as keys in dictionaries and as elements of sets, while lists cannot.
In [ ]:
d = {(x, y): x*y for x in range(10) for y in range(5)} # Create a dictionary (like a 2D array) with tuple keys
t = (2, 4) # Create a tuple
print(type(t)) # Prints "<class 'tuple'>"
print(d[t]) # Prints "8"
print(d[(1, 1)]) # Prints "1"
**More details on tuples can be found at** [W3Schools Python Tutorial](https://www.w3schools.com/python/python_tuples.asp)
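The two properties mentioned above, immutability and usability as dictionary keys, can be checked directly. The following sketch shows tuple unpacking and catches the errors raised when a tuple is modified or a list is used as a key; the variable names are illustrative.

```python
point = (2, 4)
x, y = point                 # unpack the tuple into two variables
print(x, y)

# Tuples are immutable: item assignment raises TypeError
tuple_error = None
try:
    point[0] = 10
except TypeError as err:
    tuple_error = err
    print('Cannot modify a tuple:', err)

# Lists are not hashable, so they cannot be dictionary keys
list_key_error = None
try:
    lookup = {[2, 4]: 8}
except TypeError as err:
    list_key_error = err
    print('Cannot use a list as a key:', err)
```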
Functions
Python functions are defined with the def keyword. Unlike in C++, there is no need to declare the return type.
Write a function that returns whether the input number is a whole square of another integer.
In [ ]:
def is_whole_square(x):
    output = int(round(x ** (1. / 2))) ** 2 == x
    return output
Print the whole squares between 0 and 99:
In [ ]:
whole_squares = [x for x in range(100) if is_whole_square(x)]
print(whole_squares)
We can also define functions that have default parameters:
In [ ]:
def is_whole_pwr(x, pwr=2):
    output = int(round(x ** (1. / pwr))) ** pwr == x
    return output
In [ ]:
whole_pwr2 = [x for x in range(100) if is_whole_pwr(x)]
print(whole_pwr2)
In [ ]:
whole_cube = [x for x in range(100) if is_whole_pwr(x, pwr=3)]
print(whole_cube)
**More details on functions can be found at** [W3Schools Python Tutorial](https://www.w3schools.com/python/python_functions.asp)
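The functions above rely on floating-point roots, which work well for small inputs but can lose precision for very large integers. As an alternative sketch (not part of the original lab, and assuming Python 3.8 or later for math.isqrt), the check can be done with integer arithmetic only. The function name is_whole_square_exact is hypothetical.

```python
import math

def is_whole_square_exact(x):
    """Return True if the non-negative integer x is a whole square."""
    root = math.isqrt(x)      # integer square root, no floating point involved
    return root * root == x

small_squares = [x for x in range(100) if is_whole_square_exact(x)]
print(small_squares)

# A very large whole square, and the number right after it
big = (10 ** 20 + 1) ** 2
print(is_whole_square_exact(big), is_whole_square_exact(big + 1))
```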
Numpy
Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.
A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. Let’s start with one-dimensional arrays. A numpy 1D array can be created from a Python list, and its elements can be accessed using square brackets.
Defining 1d-arrays
In [ ]:
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6, 7])
print(x, x.shape, x.dtype)
# can also change the data type
y = x.astype(np.float64) # np.float was removed in NumPy 1.24; use np.float64 or float
print(y, y.shape, y.dtype)
# access elements with indexing and slicing
print(x[1])
print(x[3:5])
We can also initialize multidimensional arrays. Multidimensional arrays can have one index per axis.
In [ ]:
b = np.array([[1,2,3,4],[5,6,7,8], [9,10,11,12]]) # 2-dimensional array
print(b.shape)
print(b[0, 0], b[0, 1], b[1, 0])
# access elements with slicing
print(b[:-1, :1])
# we can use `:` to denote whole column or row
print(b[:-1, :])
Some more important attributes of an ndarray object are:
• ndarray.ndim: the number of axes (dimensions) of the array.
• ndarray.shape: the dimensions of the array.
• ndarray.size: the total number of elements of the array.
• ndarray.dtype: an object describing the type of the elements in the array.
• ndarray.itemsize: the size in bytes of each element of the array.
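The attributes listed above can be inspected on any array. The sketch below reuses the 3 by 4 array from the previous cell and sets the dtype explicitly to np.int64 (an assumption made here so that the reported element size does not depend on the platform default).

```python
import numpy as np

# dtype is set explicitly so the output is the same on every platform
b = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]], dtype=np.int64)

print("ndim:", b.ndim)          # number of axes
print("shape:", b.shape)        # size along each axis
print("size:", b.size)          # total number of elements
print("dtype:", b.dtype)        # type of the elements
print("itemsize:", b.itemsize)  # bytes per element
```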
Numpy also provides functions to create special types of arrays:
In [ ]:
a = np.zeros((2,2)) # Create an array of all zeros
print("a =", a, '\n')
b = np.ones((1,2)) # Create an array of all ones
print("b =", b, '\n')
c = np.eye(2) # Create a 2×2 identity matrix
print("c =", c, '\n')
d = np.random.random((2,2)) # Create an array filled with random values
print("d =", d, '\n')
Basic array operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.
In [ ]:
a = np.array([20, 30, 40, 50])
b = np.arange(4)
print(b)
c = a - b
print(c)
d = b**2
print(d)
e = 10 * np.sin(a)
print(e)
f = a < 35
print(f)
We can also do matrix operations on multidimensional arrays
In [ ]:
A = np.array([[1, 1],[0, 1]])
B = np.array([2, 3])
C = A.dot(B)
print(C)
Universal functions
NumPy provides familiar mathematical functions such as sin, cos, and exp. Within NumPy, these functions operate elementwise on an array, producing an array as output.
In [ ]:
B = np.random.rand(3,2)
print(B)
print("")
print(np.exp(B))
print("")
print(np.sqrt(B))
print("")
C = np.random.rand(3,2)
print(np.add(B, C))
Numpy provides many useful functions for performing computations on arrays; one of the most useful is sum
In [ ]:
x = np.array([[1,2],[3,4]])
print(np.sum(x)) # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0)) # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1)) # Compute sum of each row; prints "[3 7]"
print(np.mean(x)) # Compute mean of all elements; prints "2.5"
**You can find the full list of mathematical functions provided by numpy in the documentation** [Mathematical functions](https://numpy.org/doc/stable/reference/routines.math.html)
Iterating
Iterating over multidimensional arrays is done with respect to the first axis:
In [ ]:
x = np.random.randint(0,100,size=(3,5))
for row in x:
    print(row)
If you want to go over each element of a multidimensional array you can use:
In [ ]:
for element in x.flat:
    print(element)
Shape manipulation
An array has a shape given by the number of elements along each axis. The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array:
In [ ]:
x = np.random.randint(0,100,size=(3,4))
print("x = ", x)
print("Flattened x = ", x.ravel()) # returns the array, flattened
print("Reshaped x = ", x.reshape(6,2))
print("Transpose of x = ", x.T)
Copies
Simple assignments make no copy of objects or their data.
In [ ]:
a = np.random.randint(0,100,size=(3,4))
print(a, '\n')
b = a # a and b are two names for the same ndarray object
print(b is a, '\n')
# if a is changed it will change b as well
a[2,2] = 225
print(a, '\n')
print(b, '\n')
Shallow copy
Different array objects can share the same data. The view method creates a new array object that looks at the same data.
In [ ]:
a = np.random.randint(0,100,size=(3,4))
print(a, '\n')
b = a.view() # a and b share the same data
print(b is a, '\n') # but they are not the same object
# still if a is changed it will change b as well
a[2,2] = 225
print(a, '\n')
print(b, '\n')
Deep copy
The copy method makes a complete copy of the array and its data.
In [ ]:
a = np.random.randint(0,100,size=(3,4))
print(a, '\n')
b = a.copy() # a and b are different objects
print(b is a, '\n')
# Now if a is changed it will NOT change b
a[2,2] = 225
print(a, '\n')
print(b, '\n')
**A more detailed Numpy tutorial is at** [NumPy quickstart](https://numpy.org/devdocs/user/quickstart.html) and [NumPy: the absolute basics for beginners](https://numpy.org/devdocs/user/absolute_beginners.html)
Plotting
For plotting graphs and images in Python we can use the matplotlib library. Below is a simple example that uses matplotlib.pyplot
In [ ]:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(-5,5,.1)
y2 = x**2
y3 = x**3
plt.plot(x,y2)
plt.plot(x,y3)
plt.xlabel('x value')
plt.ylabel('y value')
plt.title('Plot functions')
plt.legend(['$y = x^2$','$y = x^3$'])
plt.grid(True)
plt.show()
**More details on matplotlib can be found at** [Matplotlib Pyplot tutorial](https://matplotlib.org/tutorials/introductory/pyplot.html)