Week 1 Notes
Note
Keep an eye weekly pages as they might be updated throughout the week.
Week 1 Overview
Completing Assignment 1 will require that you start to build a fundamental understanding of concepts that are a bit more complex than what you have learned so far. If you haven’t already, be sure to read the overview for Assignment 1 and watched the lectures posted below.
Quick Links:
Lecture Materials Live Class Recordings
Lecture Materials
Lectures for Week 1
Exceptions and Files Recursion
Some Notes on Testing Modules
Exceptions and Files
Videos
Exceptions and Files Lecture Part 1 Exceptions and Files Lecture Part 2
Recursion Videos
Recursion Lecture
Notes on Recursion
Note
This is a loosely adapted version of the recursion lecture. I say quite a bit more in the actual lecture, so I encourage you to watch. However,
consider this your companion to the video and reference. In this lecture we are going to talk about recursion.
Let’s start with an example. Take the onion, as you might imagine every onion develops differently. They are different sizes and they have different amounts of outer and inner layers. So each time we peel a new onion, the work effort to remove all of the layers down to the core will be different. If we think of the action required to remove one outer layer of an onion as a single operation, then one onion might require 10 peel operations. Another might require 15. And another might only need 5.
Contents
Week 1 Overview Lecture Materials
Exceptions and Files Recursion
Some Notes on Testing Modules
Live Class Recordings
FDP ot tnirP
So given a basket of onions to peel, as you might imagine, if we were to peel each onion by hand, that would be a lot of work! But, perhaps we can automate the work to save us some time, effort, and the inevitable onion tears…
So let’s imagine that the onion is a Python object that we can write some code to peel.
Note
Don’t worry too much about the Onion object. You can take a look at by expanding the following section, but for now, just assume that the
Onion object has a random number of layers, can remove layers one at a time, and can report wether or not it still has layers.
The countlayers() function looks like a pretty good way to go about peeling the layers off of our onion object, right? In fact, it is! Although the onion is not the perfect metaphor for recursive properties, it provides us with a constrained conceptual model that we can build upon.
Okay, so let’s talk about what we mean when we say recursive.
In programming languages, a recursive function call means that you call a function from within that same function. So you can see what that looks like in this pseudo code here:
Take a minute and think about why the pseudo code here is problematic.
Imagine what would happen if we were to run the main function right now?
We would actually receive a RecursionError exception. Effectively what we did was put the program into an in nite loop! That’s bad. So when we write recursive functions, we need to take care that our code contains a terminal condition, something to ensure that at some point the recursive call will end.
Let’s take a look at the onion peeler again, this time we will make some modi cations to apply recursive principles rather than nested loops.
def countlayers():
onions = Onion(), Onion(), Onion() layers = 0
for onion in onions:
while onion.is_layer():
onion.removelayer()
layers += 1
print(layers)
def recurse():
recurse() # <- Recursive function call
def main():
recurse() # <- Normal function call
def peel(onion, layer_count) -> int: onion.removelayer()
if onion.is_layer():
return peel(onion, layer_count + 1) # recursive call
else:
return layer_count
def countlayers_recursively():
onions = Onion(), Onion(), Onion() layers = 0
for onion in onions:
layers += peel(onion, 0)
print(layers)
So the code we have written here, is probably not the best case for demonstrating the value of recursion, but I think by approaching it this way, internalizing the goal of using recursive logic is a little easier.
Fortunately, the onion is not a very complex structure, so the advantages of recursion are less apparent here. So let’s think of another metaphor that is complex.
How about a tree?
Slightly more complex, right? So let’s say you want to count the leaves on a tree…think about how you might do that without recursion. Each branch has the possibility to contain in nite branches with in nite leaves! Of course, a real tree is constrained by physical and environmental conditions, so there are some nite bounds, but a computational tree, given enough memory, is limitless.
Let’s take a more practical look using nested lists.
Note
sum functions adapted from Alex Thornton
def simple_sum(num_list: [int]) -> int: total = 0
for n in num_list: total += n
return total
print(simple_sum([1,2,3,4,5,6]))
def list_sum(num_list: [[int]]) -> int: total = 0
for l in num_list: for n in l:
total += n return total
print(list_sum([[1],[2,3],[4,5,6]]))
def complex_sum(num_list: [int or [int]]) -> int:
total = 0
for obj in num_list:
if type(obj) == list:
for n in obj: total += n
else:
total += obj # if not a list, must be integer
return total print(complex_sum([1,2,[3,4],[5],6]))
Running each of these functions right now, is probably unnecessary. You can assume that each will return a sum of the integers passed as function parameter. But, feel free to take a minute to run each if you would like.
Rather, pay attention to the increasing complexity of each function as we increase the complexity of our list parameter. Notice how each parameter creates additional rules to take into account, and how each function requires additional logic to handle the complexity.
By the time we get to complex_sum, we are working with two nested levels and a single integer. But, like the tree with in nite branches and leaves, what happens when we need to add a third level of nested lists? How about a fourth level?
This is where recursion becomes truly invaluable. We can replace all three of the sum functions here and any additional functions required to support deeper nesting, with a single recursive function.
Pretty cool, right?
Okay, one nal thought. Guess what else is like a tree with branches or an in nitely nestable list (toggle to reveal the answer)?
Some Notes on Testing
Although We will be diving into testing more formally around the mid-point of the quarter, I wanted to give you some notes on how to think about testing your code. We can discuss the topics here in a bit more detail after quiz in our next class.
Notes
Now that you have assignment 0 wrapped up, let’s take a minute to re ect on the experience.
Reviewing the discussions many of you have had over the past week on Zulip, it looks like you are starting to recognize the complexity that goes into writing even a small program.
Interpreting and implementing program requirements is an essential part of the programming process that will become more familiar to you as you work through this course and many others. However, even with practice, you will nd that it is important to develop strategies to reduce the uncertainty that accompanies the application of requirements.
In a way, the validity checker for assignments 0 and 1 “tests” your program for you. It offered some sense of con dence that your program was functioning according to the assignment requirements. For the remaining assignments in this class, you will be responsible for checking the validity of your programs. So in this lecture, I want to talk about one of the ways that programmers go about accomplishing this goal.
In large programming projects where teams of programmers work together to write software, the practice of writing tests is quite common. In fact, there are programming paradigms such as Agile, Cleanroom, Spiral, and Waterfall that have integrated testing directly into their methodologies. We won’t be learning about paradigms and models in this class, but it’s important to recognize how pervasive testing is throughout the software development industry. As you might imagine, this pervasiveness exists because the process of writing tests against your code, works. Tests can signi cantly reduce the development time, code complexity, and bug tracking.
So how do we start?
Well, the rst thing we do is create a hypothesis about how our program and the functions within it, are expected to behave. For example, one test might be to input a value and observe the output. Does the output align with the expectations set in the program requirements? If so, then the test can be said to ‘pass.’ If it doesn’t, then obviously the test fails, but how do we determine why? This was a source of confusion for many of you while using the assignment 1 validity checker as the validity checker could only report a failure when the program did not return an expected output. A better test might have examined the input and output at the level of functions rather than the program.
Testing individual functions, however, does require some planning. If the functions in your program perform many operations, it can be dif cult to con gure a test that can be relied upon. So one of the bene ts of thinking about tests before or in tandem with your program design is that the responsibilities of your functions will inevitably be scoped to a manageable size.
So how do we test at the level of functions?
Let’s take, for example, a program that manages an email list. We’ll assume that one of the requirements for this program is to support removing certain email addresses from the mail list.
We’ll start by adapting this small function written by Alex Thornton for ICS 32. The ‘remove_from’ function accepts two parameters: a list and a value. It returns the same list, except without the passed value, if it exists.
At least, that’s what we hope it does. Rather than assume it does and continue to build our email list management program, we should test it rst:
def remove_from(the_list: list, value) -> list: new_list = []
for element in the_list:
if element != value:
return new_list
new_list.append(element)
>>> emails = [“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
>>> remove_from(emails, “user2@example.com”)
[“user1@example.com”, “user3@example.com”, “user4@example.com”]
>>> emails
[“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
>>> remove_from(emails, “user4@example.com”)
[“user1@example.com”, “user2@example.com”, “user3@example.com”]
>>> emails
[“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
def recursive_sum(nested_list) -> int: total = 0
for obj in nested_list: if type(obj) == list:
total += recursive_sum(obj) # <- recursive call else:
total += obj # if not a list, must be integer return total
print(recursive_sum([1,2,3,4,5,6]))
print(recursive_sum([[1],[2,3],[4,5,6]]))
print(recursive_sum([1,2,[3,4],[5],6]))
So far so good. We start with a test list, then pass some test conditions to the function to see if it returns the results we expect. Here, after each call to remove_from we receive a new list with the value we passed to it missing. But have we tested every possible condition? What happens if a passed value does not exist?
Is that what we want to happen? What if our program needs to con rm that a removal actually occurred? Well, we could compare emails to the results, or we could give that responsibility to the remove_from function:
So now we have a function that will raise an exception if the passed value does not exist, providing us with a convention (exceptions) that we can use to handle this new condition.
Can you think of any others? What happens if the email list has multiple emails with the same address? Currently, all instances of value will be removed from the new list, but what if we only wanted to remove duplicates? By writing tests against our functions, we expose conditions that we might not otherwise think to consider.
Now, having to write create these tests cases each time we want to test a program can quickly become unwieldy. Fortunately, Python gives us a way to automate much of the work!
The assert statement is a convenient shortcut for inserting debugging tests into your program. Here we test a normal case, or case that represents typical behavior for the remove_from function:
And here we test an error case, or case that generates a result we don’t expect:
If both assertions pass, then we should not expect to see anything from the shell when running the program. However, if an assertion fails, the debugger will notify you of the failure:
We will get into the more formal test driven approach to software development in a few more weeks, once you have had a chance to experience working with larger, more complex programs. But for now, focus on thinking about the how the code you write should behave, and then write some assertion statements to con rm that it behaves as you expected. I think you will nd that new edge conditions will arise that you might not have otherwise considered. And remember, testing is not about quantity! A program with 100 tests is not necessarily better than one with 10. Your goal should be to identify different conditions worth testing, not variations on the same test.
Modules
In upcoming assignments, you will be required to start organizing your code a bit more than you have for a0 and a1. One way to organize code is to divide it into multiple les, or modules. In this lecture, we’ll be focusing on how you should go about ful lling this requirement.
The code that we write to create a program in Python is stored in plain text les. Aside from some common conventions such as le extensions (.py), syntax, and indentation, Python les are no different then any other plain text le that you might write. So far, you have probably written most of your Python programs in a single le with a .py extension. Generally speaking, this is a good practice. Having all of your code in a single le simpli es things quite a bit. When all of your code (and tests!) is in one location you don’t have worry about connecting multiple les. However, as your programs grow in complexity and size, you will discover that managing all of your code in a single le quickly turns into a time consuming challenge. So let’s turn to the lecture to learn how we can improve the readability and reuse of our code with modules.
>>> emails = [“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
>>> remove_from(emails, “user5@example.com”)
[“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
def remove_from(the_list: list, value) -> list: new_list = []
for element in the_list:
if element != value:
new_list.append(element) else:
return new_list
raise ValueError(‘value not found in list’)
>>> emails = [“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”]
>>> remove_from(emails, “user5@example.com”)
Traceback (most recent call last):
File “maillist.py”, line 39, in
remove_from(emails, “user5@example.com”)
File “maillist.py”, line 33, in remove_from
raise ValueError(‘value not found in list’)
ValueError: value not found in list
assert remove_from([“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”], “user4@example.com”) == [“user1@example.com”, “user2@example.com”, “user3@example.com”]
try:
remove_from([“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”],
“user5@example.com”)
assert False, “ValueError should have been thrown” except ValueError:
pass
>>> assert remove_from([“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”], “user4@example.com”) == [“user1@example.com”, “user2@example.com”, “user3@example.com”]
Traceback (most recent call last):
File “maillist.py”, line 38, in
assert remove_from([“user1@example.com”, “user2@example.com”, “user3@example.com”, “user4@example.com”], “user4@example.com”) == [“user1@example.com”, “user2@example.com”, “user2@example.com”]
AssertionError
Videos
Modules Lecture
Working with Modules
In Python, a module is nothing more than a le with the .py extension. However, unlike the programs you have written so far, a module does not produce output when executed directly from the shell. Let’s take a look at the following example:
If we were to run this program, basicmath.py, on the shell we would not receive any output from the Python interpreter. We would, however, be able to call the modules functions directly:
If we attempt to call these functions before we load basicmath.py the interpreter will return a Traceback that tells us that ‘add’ (or ‘subtract’) is not de ned. So what is the difference? Well, by executing the basicmath.py module rst we have loaded it into our program, which in this case is the Python shell. The Python programs that we write in .py les, operate in much the same way. Let’s write a small program that makes use of the basicmath.py module:
Notice that the rst line of code is an import statement followed by the name of our basicmath python module. We don’t need to specify the .py extension as it is implied that we are importing Python les. So, in the same way that you “imported” the Path module into your program for a1, we are able to import python les that we write.
Scope
Another important convention to consider in this code is how the add and subtract functions are accessed. You will see that when the variables are passed to the basicmath functions, the name of the module must rst be referenced. Python, as well as most programming languages, operate under a concept called scope. Scope refers to the availability of classes (we will cover these in more detail later in the quarter), functions, and variables at a given point of execution in a program. In the example above, notice how the variable result is used in the add and subtract functions of the basicmath module as well as the mathrunner program. We can reuse variable names in this way due to Python’s scoping rules. result is locally scoped to the add and subtract functions, meaning that it only exists in the moment that each of those functions is being called. Now, if we wanted to make use of result outside of the add and subtract functions, we could modify the basicmath program like so:
# basicmath.py
def add(a: int, b: int) -> int: result = int(a) + int(b)
return result
def subtract(a: int, b: int) -> int: result = int(a) – int(b)
return result
>>> add(2, 5)
7
>>> subtract(9, 3)
6
# mathrunner.py
import basicmath
a = input(“Enter your first value: “)
b = input(“Enter your second value: “)
result = basicmath.add(a,b)
print(result)
result = basicmath.subtract(a,b)
print(result)
# basicmath.py
result = 0
def add(a: int, b: int) -> int: result = int(a) + int(b)
return result
def subtract(a: int, b: int) -> int: result = int(a) – int(b)
return result
def get_last_result() -> int: return result
# mathrunner.py
import basicmath
a = input(“Enter your first value: “)
b = input(“Enter your second value: “)
result = basicmath.add(a,b)
print(result)
result = basicmath.subtract(a,b)
print(result)
print(basicmath.get_last_result())
Which will output:
Wait, why is the value of result still zero? Well, even though the variable result is scoped outside of both functions, or global, Python still gives precedence to the local instance of result. So the add and subtract functions create a local instance of result while get_last_result, in absence of any instantiationofalocalresultinstance,deferstotheglobalinstanceofresult.Tousethegloballyscopedvariable,weneedtotellPythonthatisour intention:
# basicmath.py
result = 0
def add(a: int, b: int) -> int: global result
result = int(a) + int(b) return result
def subtract(a: int, b: int) -> int: global result
result = int(a) – int(b) return result
def get_last_result() -> int: return result
Which will output:
There we go! So by setting the scope of the result variable inside each function to global we change the scope from local to global, allowing us to store a value in a shared variable. So scope can be thought of as consisting at two levels: global and local. The Python interpreter assumes a local rst stance when interpreting program code and when a local instance is unavailable it assumes global.
Important
Scope refers to the availability of a particular object in a program. That availability is interepreted locally rst and then globally.
Definition Access
When writing modules that contain many functions performing many different types of operations you will nd that you need some of those functions to perform operations within your module, but you might not necessarily want those functions to be called outside of your module. In programming terms, we describe these two types of functions as private and public. While some programming languages do provide modi ers for declaring this intention at the point of compilation, Python does not. Rather, all functions in Python are public by default. There is no way to prevent another program from calling functions in your module that should not be called!
To get around this feature, and the absence of such modi ers is considered a feature in Python, programmers have adopted some formal conventions for communicating intent. When writing functions that you don’t intend to be used outside your module, they should be prepended with a single underscore (_) character. This convention is used for functions, constants, and classes in Python:
We will be looking more closely at your use of naming conventions like private and public access as we move forward with assignments this quarter. Not only do we want to you to continue to adopt formal Python programming conventions, but thinking about access will help you to improve the structure of your modules. So start considering what operations in your program need to be conducted outside the scope of your module and more importantly, what code should not be accessed.
Namespaces
As your programs grow in size and complexity the importance of understanding scope will become increasingly relevant. Without the ability to scope the objects that we create, every Python programmer would have to create unique object names! Imagine if object naming behaved like Internet domain names, each one having to be recorded in a central registry to avoid duplicates. Through scope, and a convention called namespaces, most programming languages can avoid this unnecessary complication. A namespace is a dictionary-like collection of symbolic names tied to the object in which they are currently de ned.
Namespaces are not a perfect solution, but they serve well to help programmers identify and differentiate the modules and programs that they write. It is important to avoid using known namespaces, particularly those that are used by Python’s built-in objects. For example, you should avoid creating an object named int or tuple because they already exist in every instance of a Python program. Imagine if we had named our basicmath module math, which is part of the standard library! To put it more concretely, what if we also had an add function in our mathrunner.py program? Namespaces allow us to differentiate between like named objects:
Enter your first value: 5
Enter your second value: 2
7
3
3
# Specifying private intent
def _myprivatefunction():
_myprivateconstant = “525600” # minutes in a year class _myprivateclass():
Enter your first value: 5
Enter your second value: 2
7
3
0
The two add functions being printed will produce distinctly different results, namespaces allow us to differentiate between them in the code we write. Python also provides us with a naming convention for the modules that we import. Notice how the import statement and subsequent use of namespace differ in this revised version of mathrunner.py. This can be useful when the modules you are importing make use of a longer namespace that you might not want to type every time it is used.
Python makes use of four types of namespaces, listed in order of precedence:
Local
Inside a class method or function
Enclosed
Inside an enclosing function in instances when one function is nested within another
Global
Outside all functions or class methods
Built-In
The built-in namespace consists of all of the Python objects that are available to your program at runtime. Try running dir(__builtins__) on the IDLE shell to see the full list of built-ins. Many of the names should look familiar to you.
I have included the following tip originally written by Alex Thornton, who also teaches this class at UCI. I think it’s a great explanation of how modules make use of the __name__ variable in Python and wanted to share it with you.
Alex Thornton’s Explanation of Executable Modules
When you load a module in IDLE and execute it (by pressing F5), the code in that module is executed. If it generates any observable result, like printing output or asking the user for input, you’ll see that in the interpreter. Otherwise, you’ll see a standard >>> interpreter prompt, and all of the module’s de nitions will now be available — so, for example, if the module de nes a function, you could now call it.
As we’ve seen, modules in Python have names. We can check the name of the currently-executing module at any time by accessing the global variable __name__, which is present in every module.
In general, the names of modules are indicated by their lenames; a module written in a le boo.py has the name boo. But there’s one special case that we haven’t talked about: When you execute a module in Python (i.e., by pressing F5 in IDLE), it’s given the special name __main__ while it runs. (Anything you de ne in the Python interpreter will be considered part of the __main__ module, as well.)
This little fact can be a useful way of differentiating between whether a module has been executed (i.e., is it the “entry point” of a program?) or whether it’s been imported. In general, importing a module shouldn’t suddenly cause things to happen — output to be generated, input to be read, and so on — but should, instead, simply make de nitions available that weren’t previously. Even executable modules, the ones we expect to be able to execute as programs, should behave differently when imported than they do when executed. s To facilitate this distinction, we can simply check the module’s name by accessing the __name__ variable. If its value is __main__, the module has been executed; if not, the module has been imported. So, in an executable module, we typically write the code that causes things to happen when the module is executed in an if __name__ == ‘__main__’: block, so that it will only happen if the module has been executed. Meanwhile, if the module is imported, its de nitions will become available to another module, but there will otherwise be no effect.
Live Class Recordings
Note
Quiz recordings and results will be posted within 24 hours of recording.
Monday Live Quiz and Discussion Wednesday Live Discussion
By Mark S. Baldwin © Copyright 2021.
# mathrunner.py
import basicmath as m def add(a, b):
print(a + b)
a = input(“Enter your first value: “)
b = input(“Enter your second value: “)
print(m.add(a,b))
print(add(a,b))