This lesson is being piloted (Beta version)

Getting arguments from the command-line

Overview

Teaching: 30 min
Exercises: 40 min
Questions
  • Why are command-line arguments useful?

  • How can I add command-line arguments to my programs?

Objectives
  • Gain experience reading and modifying code written by someone else.

  • Implement command-line arguments with argparse.

Before proceeding

Make sure you have followed the setup instructions to download and unzip the intermediate_python_data.zip file.

Command-line Arguments Make Your Programs Easier to Use and Reuse

Have you ever written a short program, and then a few months later discovered that you that you can’t remember exactly how to use it? While good comments and docstrings help, command-line arguments with usage messages can also help.

For many single-use programming tasks and small programs, hard-coding values into the program can be sufficient. However, to make programs reusable it is helpful to supply arguments to the program from the command-line.

For example, command-line arguments that specify the input data file to use will make a data analysis program much easier to reuse compared to the same program with a hard-coded input file.

In the Functions episode, we explored using function arguments to customise the behaviour of a function. Function arguments can supply data, specify options, and generally allow us to modify the function behaviour within the limits allowed by the function design. Good function arguments allow our functions to be reused in different situations or programs.

Command-line arguments to a program are exactly the same. They just operate for the program as a whole rather than a single function. The goals are the same: specify the allowable variations to behaviour for a program.

In the exercises that follow, pay attention to how arguments to the primary function directly influence the command-line arguments.

Introducing Word Count

Have a look at the file wordcount.py. It contains three functions. The first, read_file, returns the entire contents of a file as a single string. The second, word_count returns a dictionary containing the count of words found in a string. The third, print_counts displays the results. The function signatures are:

def read_file(filename):
    """Returns the contents of a text file as a single string, with newlines
    converted to spaces.
    """

def word_count(text, characters_to_ignore=",.?", case_sensitive=False):
    """Returns an ordered dictionary containing the sorted count of words in
    a string, with the word as dictionary key.
    """

def print_counts(counts, min_count=2):
    """Prints the word counts. Only words with a count greater than or equal to
    `min_count` are displayed.
    """

Don’t worry about how these functions work. The important thing for this episode is knowing what the inputs and outputs are.

Run wordcount.py

Try running the wordcount.py file from the command-line:

$ python3 wordcount.py

What happens?

Solution

Nothing. The file defines the functions, but does not call them.

Running word_count in the simplest possible way

We need a new program to use the three provided functions. Our program will read from a file, count the words, and display the results. The data inputs we have to work with are:

  1. The input file name.
  2. The punctuation characters to ignore in the file.
  3. Whether the comparisons are case-sensitive.
  4. The minimum count threshold to display.

Items 2, 3, and 4 all have default values in the functions, so let’s just use those for now. All that is left is the input file.

The simplest possible program to call word_count

Having looked at the possible inputs, the simplest thing is to hard-code the file name, relying on default function arguments for everything else. The “sample-text.txt” file contains some text designed to test the word count functions, so we will use that.

Here is a program fragment that is missing some essential parts. Your challenge is to complete the program and run it.

Save your program as “wordcount1.py”.

from wordcount import read_file, word_count, print_counts

data = read_file("")      # add the filename
counts = word_count(...)  # add appropriate function arguments
print_counts(...)         # add appropriate function arguments

The expected output is:

line: 7
this: 4
is: 2
a: 2
does: 2

Solution

from wordcount import read_file, word_count, print_counts

data = read_file("sample-text.txt")
counts = word_count(data)
print_counts(counts)

The solution is available here.

Introducing argparse

Being able to run the code at all is great progress. However, wordcount1.py is quite limited. The file name is hard-coded, and all the other parameters just use the default values. Without modifying the program, we can’t run it on different files, or display all words regardless of their frequency. And if our text has different punctuation characters, we can’t handle them.

It would be great if we could pass all these things in when running our program.

It would be even greater if our program could also display a helpful message when we supply the wrong arguments, or just when we ask for help.

argparse is a module in the Python Standard Library that does all of this for us. We just need to tell it what our arguments are and it handles the rest. It is not my goal to reproduce the argparse documentation here, but to summarise, argparse provides a consistent approach to:

To illustrate what argparse code looks like, here is an example from the official argparse tutorial. The example specifies a single positional argument and then prints the supplied text.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("echo")
args = parser.parse_args()
print(args.echo)

Calling the program without any arguments prints an automatically generated error message:

$ python3 prog.py
usage: prog.py [-h] echo
prog.py: error: the following arguments are required: echo

Calling the program with the -h (or --help) flags prints an automatically generated help message:

$ python3 prog.py --help
usage: prog.py [-h] echo

positional arguments:
  echo

optional arguments:
  -h, --help  show this help message and exit

And calling the program with any other values echoes the supplied text:

$ python3 prog.py foo
foo

Designing our User Interface

Even for simple user interfaces, it is helpful to think about what we want first.

Here, we would like to specify the following:

Write wordcount2.py

Your challenge is to implement the arguments in our design by completing wordcount2.py. This program has the basic elements in place, however it contains some missing sections and a couple of errors. To fix it, you will need to pay attention to the error messages that come from running wordcount2.py. You may also need to refer to the argparse documentation.

The following test cases show the test command line and the expected output for some combinations of inputs. Make sure you try other combinations as well to see if the outputs look OK.

No arguments

$ python3 wordcount2.py
usage: wordcount2.py [-h] [-p PUNCTUATION] [-c] [-m MIN_COUNT] file
wordcount2.py: error: the following arguments are required: file

Help

$ python3 wordcount2.py --help
usage: wordcount2.py [-h] [-p PUNCTUATION] [-c] [-m MIN_COUNT] file

positional arguments:
  file                  The input text file

optional arguments:
  -h, --help            show this help message and exit
  -p PUNCTUATION, --punctuation PUNCTUATION
                        Punctuation to ignore when counting words.
  -c, --case-sensitive  Force a case-sensitive count. By default, case is
                        ignored.
  -m MIN_COUNT, --min-count MIN_COUNT
                        The minimum word count threshold for display.

File argument

$ python3 wordcount2.py sample-text.txt 
line: 7
this: 4
a: 3
is: 2
why: 2
does: 2
or: 2
two;;: 2

Semi-colon in punctuation

$ python3 wordcount2.py sample-text.txt --punctuation ",.?;"
line: 7
this: 4
a: 3
two: 3
is: 2
why: 2
does: 2
or: 2

Case-sensitive comparisons

Note that the count for some words has changed. Try setting the -min-count flag to 1 to see the full set of words.

$ python3 wordcount2.py sample-text.txt --case-sensitive
line: 6
this: 3
is: 2
a: 2
does: 2
or: 2
two;;: 2

Solution

wordcount2_solution.py contains a working version of the program.

Key Points

  • Using command-line arguments can make your programs easier to use and reuse.