Data Analysis With Python

Note

This web site was developed for a course that I no longer teach. I think it may still be useful, though, so I try to keep it up to date, and I occasionally add material.

The purpose of the Data Analysis course is to provide an introduction to a variety of concepts and techniques used in interpreting atmospheric and ocean measurements and numerical model output, with an emphasis on time series. In most cases, applying these techniques requires computing, hence at least minimal programming. This raises the question of which computer language and other software to use.

For this course we have chosen Python as the language. It is widely used in almost all applications of computing (e.g., Facebook and Google are major users of Python), and it is gaining prominence in many areas of science, including biology (especially neuroscience and bioinformatics) and the earth sciences (with atmospheric science being perhaps the most rapid adopter). Do you want to verify the LIGO gravitational wave discovery? It’s all here in Python.

If you will need to do some programming as a student and/or as a professional, a moderate amount of time spent learning good tools and practices at an early stage will pay off for years to come. By working in Python with both simulated and real data sets, we will try to develop computing skills along with an understanding of basic statistics and data analysis techniques. Along the way we will introduce tools such as distributed version control systems (Mercurial and Git) and Sphinx, the Python-based web-site generator used to make this site.