Tutorial Part II: Numpy and Matplotlib

Python is a general-purpose language; libraries provide specialized functionality. The central library for most scientific applications of Python is numpy. It was developed to replace two earlier array packages, Numeric and numarray; you might still find references to these in old tutorials on the web.

The numpy library provides classes and functions for working with arrays of numbers. The central class is the ndarray. The “nd” part stands for “N-dimensional”, because ndarrays are designed from the start to handle any number of dimensions from zero up to more than you will ever need. Usually one works with 1, 2, 3, or occasionally 4 dimensions.
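As a minimal sketch of what "any number of dimensions" means in practice, the following builds arrays with zero to three dimensions and inspects them. The variable names are made up for illustration.

```python
import numpy as np

# ndim is the number of dimensions (axes); shape gives the length of each one.
scalar = np.array(3.5)        # zero dimensions: a single value
profile = np.arange(5)        # one dimension, 5 points
grid = np.zeros((3, 4))       # two dimensions, 3 rows by 4 columns
cube = np.ones((2, 3, 4))     # three dimensions

for name, arr in [("scalar", scalar), ("profile", profile),
                  ("grid", grid), ("cube", cube)]:
    print(name, arr.ndim, arr.shape)
```

The same attributes, functions, and indexing rules apply regardless of the number of dimensions.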

For those coming from Matlab, it is important to remember that Matlab was built around linear algebra matrices; support for numbers of dimensions other than 2, and for number types other than floating point, was tacked on later. In contrast, numpy was designed from the start for N-dimensional arrays of any number type supported by the C language—and for arrays of strings, or of objects of any type. Consequently, there are many differences in practice between Matlab matrices, or their extensions to dimensionality other than 2, and numpy ndarrays. With a little practice, I think you will see that ndarrays are better-suited to most scientific programming than are Matlab matrices.
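To illustrate the range of element types, here is a short sketch that creates arrays of integers, booleans, strings, and arbitrary Python objects. The values are arbitrary examples.

```python
import numpy as np

counts = np.array([1, 2, 3], dtype=np.int16)       # 16-bit integers
flags = np.array([True, False, True])              # booleans
names = np.array(["CTD", "ADCP", "XBT"])           # fixed-width Unicode strings
mixed = np.array([1, "two", None], dtype=object)   # arbitrary Python objects

# Every array records its element type in the dtype attribute.
for arr in (counts, flags, names, mixed):
    print(arr.dtype, arr)

# Integer arrays can have any number of dimensions, just like floating point.
int_cube = np.zeros((2, 3, 4), dtype=np.int32)
```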

Note

Numpy includes a matrix subclass of the ndarray base class. Please ignore it. It is not commonly used, it provides only small advantages and only under highly restricted circumstances, and elsewhere it can cause hard-to-diagnose problems.
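Ignoring the matrix subclass does not mean giving up linear algebra. As a small sketch with plain ndarrays: the `*` operator multiplies element by element, and the `@` operator performs the matrix product.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

elementwise = a * b    # multiplies matching elements
matprod = a @ b        # linear-algebra matrix product

print(elementwise)     # [[0 2]
                       #  [3 0]]
print(matprod)         # [[2 1]
                       #  [4 3]]
```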

One more point for those coming from Matlab: although there are many similarities to Python/Numpy/Matplotlib, such as functions with the same name and similar arguments, there are also basic differences in language design. It is often possible to translate Matlab code into Python with only small adjustments, but usually this does not result in good Python code, so it should be considered only a first step, perhaps an early part of the learning process.

Ndarray basics

We start by illustrating some characteristics of the Numpy ndarray, the basic multidimensional array data type.
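As a brief preview, and not a substitute for the section itself, this sketch shows indexing, slicing, a reduction along one axis, and broadcasting. The array contents are arbitrary.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # values 0..11 in 3 rows and 4 columns

first_row = a[0]                  # indexing is zero-based: [0 1 2 3]
last_col = a[:, -1]               # negative indices count from the end: [3 7 11]

col_means = a.mean(axis=0)        # mean over rows, one value per column
anomalies = a - col_means         # broadcasting: shape (4,) against (3, 4)

# A basic slice is a view, not a copy; writing to it changes the parent array.
b = a.copy()
corner = b[:2, :2]
corner[:] = -1
print(b[0, 0])                    # -1
```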

Simple line plots

Plotting with Matplotlib is illustrated here with a few very simple examples.
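For orientation, a minimal line plot might look like the sketch below. It assumes the object-oriented `pyplot.subplots` interface; the data and labels are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 200)

# One figure containing one set of axes.
fig, ax = plt.subplots()
ax.plot(t, np.sin(t), label="sin(t)")
ax.plot(t, np.cos(t), linestyle="--", label="cos(t)")

ax.set_xlabel("t (radians)")
ax.set_ylabel("amplitude")
ax.set_title("Two simple line plots")
ax.legend()
```

In a script, calling `plt.show()` afterwards opens a window with the figure, and `fig.savefig("lines.png")` writes it to a file.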

Masked arrays

Oceanographic data sets are usually messy, with glitches and gaps; masked arrays are designed to help us track and manage missing or invalid data.
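As a small sketch of the idea, suppose a temperature record uses -999.0 as a missing-value flag and also contains a NaN. Both the data and the flag value are hypothetical. Masking the bad points lets statistics skip them automatically.

```python
import numpy as np

# Hypothetical record: -999.0 marks a gap, nan marks a glitch.
temp = np.array([12.1, 12.3, -999.0, 12.2, np.nan, 12.6])

bad = np.isnan(temp) | (temp == -999.0)
clean = np.ma.masked_array(temp, mask=bad)

print(clean.count())   # number of valid points: 4
print(clean.mean())    # mean of the valid points only

# Convert back to a plain ndarray, with nan in place of masked values.
filled = clean.filled(np.nan)
```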

Dates and times

Here is some work in progress on techniques for handling dates and times.
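One possible starting point, which may differ from the techniques in that section, is numpy's own `datetime64` and `timedelta64` types. This sketch builds a 6-hourly time axis and converts it to elapsed days; the start date is arbitrary.

```python
import numpy as np

start = np.datetime64("2020-01-01T00:00")

# Four time stamps at 6-hour intervals.
times = start + np.arange(4) * np.timedelta64(6, "h")

# Dividing a time difference by a one-day timedelta gives floating-point days.
elapsed_days = (times - start) / np.timedelta64(1, "D")

print(times)
print(elapsed_days)
```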

Fitting polynomials

In response to a question about polynomial fitting in xarray, here is an example showing old and new numpy functions, and how to use them with xarray.
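As a rough sketch of the difference between the two numpy interfaces, the following fits the same quadratic with the older `np.polyfit` and with the newer `numpy.polynomial.Polynomial.fit`. The data are synthetic and noise-free, and the xarray part is not shown here.

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.linspace(0, 4, 21)
y = 1.0 + 2.0 * x + 3.0 * x**2    # synthetic data: an exact quadratic

# Old interface: coefficients are ordered from the highest power down.
old_coefs = np.polyfit(x, y, 2)

# New interface: the fit is done in a scaled domain; convert() maps the
# result back to the original x, with coefficients from the lowest power up.
new_fit = Polynomial.fit(x, y, 2)
new_coefs = new_fit.convert().coef

print(old_coefs)    # approximately [3, 2, 1]
print(new_coefs)    # approximately [1, 2, 3]

# The fitted Polynomial object can be called directly to evaluate it.
y_fit = new_fit(x)
```

The opposite coefficient ordering is the detail most likely to cause trouble when moving code from one interface to the other.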

to be continued…