Functions, text files, reading data files with Python

Intro and Objectives

We’ll do a few different things this session:

  • We’ll be creating and using functions in Python.

  • We’ll create and use list comprehensions and various string functions.

  • Next, we’ll learn a bit about the structure of text files and tools for working with them.

  • We’ll end our crash course in the basics of Python programming by building a simulation model of the famous Monty Hall 3-Door Problem.

  • Finally, I’ve included several more advanced, OPTIONAL activities about reading CSV files, JSON files and even Excel files with Python. Within these topics I also touch on things like using PyCharm or Spyder for debugging, the conda package management program, and a few other more advanced things.

Readings

  • WToP - pages 41-75

Downloads

Fundamental Activities

We’ll finish our intro to basic Python programming:

  • Creating and using functions

  • List comprehensions

  • Basic string work

Let’s start by learning the very basics of how to create Python functions, test and debug them, and document them.
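As a small illustration of those three steps (create, test, document), here is a minimal sketch. The function name mean and the sample values are made up for this example and are not taken from the course notebooks.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    Raises ValueError if values is empty.
    """
    if len(values) == 0:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


# Quick tests: a failed assert stops the program with an AssertionError.
assert mean([2, 4, 6]) == 4
assert mean([1.5]) == 1.5

# The docstring is the text that help(mean) displays; print its first line.
print(mean.__doc__.splitlines()[0])
```

Simple assert statements like these are only one way to test; they are handy in a notebook because a wrong result fails loudly right where the function is defined.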

Now we are going to explore an interesting and useful feature of Python known as list comprehensions. We’ll also dig a little bit deeper into string formatting for creating nice output.
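To give a feel for the idea, the sketch below builds the same list twice, once with an ordinary loop and once with a list comprehension, then formats the results with the format method. The temperature data and variable names are invented for illustration.

```python
temps_f = [32, 50, 68, 212]

# Ordinary loop: start with an empty list and append one item at a time.
temps_c_loop = []
for f in temps_f:
    temps_c_loop.append((f - 32) * 5 / 9)

# List comprehension: the same transformation in a single expression.
temps_c = [(f - 32) * 5 / 9 for f in temps_f]

# A comprehension can also filter with a trailing if clause.
warm = [f for f in temps_f if f > 60]

# String formatting: right-align the integer in 5 columns and show
# the float with one decimal place in 6 columns.
lines = ["{:>5d} F = {:6.1f} C".format(f, c) for f, c in zip(temps_f, temps_c)]
print("\n".join(lines))
```

The loop and the comprehension produce identical lists; the comprehension is usually preferred when the body is a single, readable expression.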

The string functions chapter in Whirlwind Tour of Python covers useful string functions and string formatting in Python. The second half of the chapter covers regular expressions, which we’ll do in a future session. By the way, there is yet another way to create formatted strings beyond the “old style” % operator and the format method. This newish approach is known as f-strings. A nice comparison of the three approaches can be found at this blog post.
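For a quick side-by-side look, here is the same sentence built with each of the three approaches. The variable names and values are made up for this sketch.

```python
item = "widget"
price = 4.5

# 1. "Old style" % operator
old_style = "%s costs $%.2f" % (item, price)

# 2. The str.format method
format_method = "{} costs ${:.2f}".format(item, price)

# 3. f-string (Python 3.6 and later): expressions go directly inside the braces
f_string = f"{item} costs ${price:.2f}"

# All three produce exactly the same text.
assert old_style == format_method == f_string
print(f_string)
```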

Finally, let’s use our newfound knowledge of the basics of Python to do something kind of fun. I’m sure many of you know about the Monty Hall 3-Door Problem, made famous on Let’s Make a Deal! and in a Parade Magazine column that set off a firestorm of controversy. Follow the link to play a few rounds of the game so that you understand the rules. It’s well known that the optimal strategy in the classic version of this puzzle is to always switch. It’s a counterintuitive result and is always great fun to discuss (we play every year in my MIS 4460/5460 class).

We are going to build a Monty Hall 3-Door Problem simulator in Python. Our goal is to create a program that lets us compare the strategies of always switching, always staying with the first door chosen, or randomly switching or staying based on the flip of a fair coin (I’ll leave this last strategy to the proverbial “reader” to do). This will force us to think through how to design multiple functions that can be repeatedly used by a main calling program to run the simulations and compile the results. I’ve included a skeleton Jupyter notebook with some of the code structure already created. Now let’s work through filling in the gaps and getting this simulation program working.
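One possible shape for such a program is sketched below. This is not the skeleton notebook’s code: the function names play_game and simulate, their parameters, and the default number of games are all assumptions made for illustration. It covers only the always-switch and always-stay strategies; the coin-flip strategy is still left to the reader.

```python
import random


def play_game(switch, rng):
    """Play one round and return True if the player wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)

    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])

    if switch:
        # Exactly one door is neither the first pick nor the opened door.
        final_pick = [d for d in doors if d != first_pick and d != opened][0]
    else:
        final_pick = first_pick
    return final_pick == car


def simulate(switch, num_games=10_000, seed=None):
    """Return the fraction of games won when always switching (or staying)."""
    rng = random.Random(seed)
    wins = sum(play_game(switch, rng) for _ in range(num_games))
    return wins / num_games


if __name__ == "__main__":
    # Main calling program: run each strategy and report its win rate.
    for label, switch in [("always switch", True), ("always stay", False)]:
        print(f"{label:>13}: {simulate(switch, seed=42):.3f}")
```

With enough games, the win rates should settle near 2/3 for switching and 1/3 for staying. Passing a random.Random instance into play_game, rather than using the module-level functions, is a design choice that makes runs repeatable when a seed is given.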

For the remainder of the materials in this section, as well as the upcoming session on data wrangling with regular expressions, we will be working with all kinds of text-based files. Please take a look at the IntroToTextFiles.pdf document in the Downloads_ReadingFiles.zip file available above. Explore some of the links in that document so that you have a basic understanding of text file formats, text encodings, and text editors.
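To make the idea of a text encoding concrete, here is a minimal sketch showing that the same four characters become different bytes under different encodings, and what happens when bytes are decoded with the wrong one. The sample text and the temporary file name sample.txt are invented for this example.

```python
import os
import tempfile

text = "café"

# The same text maps to different byte sequences under different encodings.
utf8_bytes = text.encode("utf-8")      # the accented letter takes 2 bytes
latin1_bytes = text.encode("latin-1")  # the accented letter takes 1 byte
print(len(text), len(utf8_bytes), len(latin1_bytes))

# Decoding bytes with the wrong encoding produces garbled text.
wrong = utf8_bytes.decode("latin-1")
print(wrong)

# When reading or writing text files, state the encoding explicitly
# instead of relying on the platform default.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text + "\n")
with open(path, encoding="utf-8") as f:
    round_trip = f.read().strip()
print(round_trip == text)
```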

OPTIONAL Advanced Activities

Feel free to pick and choose from the following based on your interests, needs, and time. None of them are absolutely essential for this class, though of course, they may be quite useful to your final project or to your professional work.

All of the files associated with these activities are available in the Downloads_ReadingFiles.zip file.

We’ll learn more about text files, using Python to read CSV, JSON and even Excel files. We’ll also start to see more advanced programs and will use Spyder for program development and debugging.
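As a small preview, the sketch below reads CSV and JSON data using only the csv and json modules from the standard library. To keep it self-contained, the data lives in strings rather than in the files from the zip archive; the column names and values are made up. Reading Excel files typically needs a third-party package, so it is not shown here.

```python
import csv
import io
import json

csv_text = "name,score\nAda,90\nGrace,85\n"
json_text = '{"course": "intro", "students": ["Ada", "Grace"]}'

# csv.DictReader turns each data row into a dict keyed by the header row.
# io.StringIO wraps the string so it can be read like an open file.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every CSV value arrives as a string, so convert numbers explicitly.
scores = {row["name"]: int(row["score"]) for row in rows}
print(scores)

# json.loads parses a JSON string into Python dicts, lists, strings, numbers.
data = json.loads(json_text)
print(data["students"])
```

With a real file, the same calls would work on a file object, for example the one returned by open with an explicit encoding, and json.load would replace json.loads.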

In the first two screencasts we’ll use Jupyter notebooks. Then I’ll introduce using Spyder for these same file reading problems. I’ll focus on creating a new Project in Spyder and using the built-in debugging tools.

In addition to covering reading Excel data files into Python data structures, these next two screencasts also cover:

Explore (OPTIONAL)