Introduction to Python for the practicing neuroscientist

To be frank: this notebook is rather boring. In this class, we will use the software package Python. The best way to learn new software (and probably most things) is when motivated by a particular problem. Would you read assembly instructions for furniture you do not plan to own? Probably not. In other notebooks we will pursue specific questions driven by neuronal data, and use our desire to understand these data to motivate the development and application of computational methods. But not in this notebook. Here, we focus on basic coding techniques and principles in Python in the abstract, without motivation. You - poor learner - must trust that these ideas and techniques will eventually be useful. We begin by dipping our toe into the Python pool, and learning the basic strokes; the fun and interesting parts in the “real world” of neuronal data happen later.

Let us delay no further. In the following examples, you are asked to execute code in Python. If your Python experience is limited, you should actually do this, not just read the text below. If you intend to ignore this advice - and not execute the code in Python - then instead walk to the local coffee shop, get a double espresso, and return to attempt these examples.

This notebook follows in spirit and sometimes in detail notebook 2 of MATLAB for Neuroscientists, an excellent reference for learning to use MATLAB in neuroscience with many additional examples. If you have not used Python before, there are many excellent resources online (e.g., the Python Data Science Handbook).

Starting Python

There are two ways to interact with this notebook. First, you could run it locally on your own computer using Jupyter. This is an excellent choice, because you’ll be able to read, edit, and excute the Python code directly and you can save any changes you make or notes that you want to record. The second way is to open this notebook in your browser and execute the examples directly in your browser, without installing additional software on your computer. In any case, we encourage you to execute each line of code in this file!

Note

Throughout this notebook, we assume that you are running Python 3. Most of the functions used here are the same in Python 2 and 3. One noteable exception however is division. If you are using Python 2, you will find that the division operator / actually computes the floor of the division if both operands are integers (i.e., no decimal points). For example, in Python 2, 4/3 equals 1. While, in Python 3, 4/3 equals 1.333.

We encourage you to use Python 3 for the sake of compatibility with this notebook, as well as for compatibility with future releases of Python.

On-ramp: analysis of neural data in Python

We begin this notebook with an “on-ramp” to analysis in Python. The purpose of this on-ramp is to introduce you immediately to some aspects of Python. You may not understand all aspects of the Python language here, but that’s not the point. Instead, the purpose of this on-ramp is to illustrate what can be done. Our advice is to simply run the code below and see what happens…

Q: Try to read the code above. Can you see how it loads data, extracts useful information to print, then selects an interval of data to plot?

A: If you’ve never used Python before, that’s an especially difficult question. Please continue on to learn more!

Example 1: Python is a calculator

Execute the following commands in Python:

Q: What does Python return? Does it make sense?

Example 2. Python can compute complicated quantities.

Enter the following command in Python:

Q: Does this answer make sense?

Q: Can you use parentheses to change the answer?

Example 3. Python has useful built in functions.

A function is a program that operates on arguments. Standard math functions and variables (and other useful things) can be accessed from the math and numpy modules. To use the math and numpy modules, we must first import both:

Note

In this style, we indicate which module, or namespace, contains the function we want to call: x = np.arange(10) or plt.plot(x, y).

You will often begin your data analysis with import statements, to load the functionality you need. We can now call functions from math using numpy.*. For example,

Above, sin is the sine function. It operates on the argument 2*pi. Notice that, once we have imported the numpy module, Python knows the value of pi. Here’s another example function that operates on arguments:

Q: What is math.atan?

A: To answer this, try using Python Help. To start the Python Help, simply put a ? at the end of math.atan and then run this code block.

Code
math.atan?

You should see a description of the function pop up at the bottom of the window.

Note

Python Help is extremely useful, but may not work in a web browser. You can always look there when you have questions about a function, or search the internet for help, i.e., google it.

Example 4. We can use Python to define lists of numbers.

In Python, there are several different data structures that are designed to store more than one element. Here we will focus on the array data structure, but if you are curious to know how and when to use other structures, there is a good explanation here. Let’s define an array:

Example 5. We can manipulate arrays by scalars.

A scalar is a single number. Consider,

Q: What do you find?

A: Notice that the scalar operates on each element of the array.

Example 6. We can manipulate arrays with arrays.

Let’s create an array and multiply it by itself,

Q: What does this return?

A: We see that the operator * performs element-by-element multiplication of the values in array a.

Example 7. More examples of manipulating arrays with arrays.

Q: What operation does np.multiply() perform?

Example 8. We can probe the variables we’ve defined in Python.

To see a list of the variables you’ve defined, type who or whos in a code block by themselves. Notice whos provides more information.

Code
who
Interactive namespace is empty.
Code
whos
Interactive namespace is empty.
Note

The functions who and whos can be extremely useful, but may not work in a web browser.

Example 9. Determine the dimensions of an array

To examine the dimensions of an array, we can ask for the shape,

We find that the shape of a is (1,4) or 1 row and 4 columns. Notice we have two options to execute the shape function:

  • In a.shape we return the attribute shape of the variable a.

  • In np.shape(a) we apply the function shape from numpy to the variable a.

The result is equivalent.

Example 10. Sometimes we need to reset the workspace

By doing so, we get rid of all the variables. To do so, type %reset and enter y

Code
%reset

Q. What command could we use to confirm there are no variables in the workspace?

A. Consider who.

Code
who
Note

The %reset command is an example of a magic. Magics are commands that start with the % symbol and use a language other than Python. They are only available in the notebook environment. In fact, the set of magics that is available is specific to the notebook kernel. This means that if you have a Jupyter notebook running a Ruby kernel the magics will be different.

Example 11. We can define matrices in Python.

A matrix is an array with more than one dimensio. Consider the following:

This creates a matrix with two rows and three columns. Consider,

Q: Can you see the two rows and three columns?

We can manipulate matrices like we manipulate vectors.

Example 12. Indexing matrices and vectors.

Matrices and vectors are arrays of numbers, and sometimes we want to access individual elements or small subsets of these lists. That’s easy to do in Python. Consider,

Python indexes from 0 (like C, C++, Java, and unlike MATLAB and Fortran which start at 1). To access the 2nd element of a or b, type a[1] / b[1]. We’ll be a bit fancier with our printing now to distinguish variables. Calling str(a) converts the variable a to a string that can be printed easily. Adding two strings just concatenates them: "hi" + " bye" = "hi bye".

Q. Do the results make sense? How would you access the 4th element of each vector?

We can combine a and b to form a matrix with a as the first row and b as the second. Note that we apply the function array() to the list [a,b], which it converts to a matrix.

To learn the size (or shape) of c we use shape():

The shape of c is [2 5]. It has two rows and five columns. To access the individual element in the 1st row and 4th column of c, type c[0,3]

We access matrices using ‘row, column’ notation. So c[0,3] means print the element in row 0, column 3 of c.

Q. How would you print all rows in the 2nd column of c?

Example 13: We can find subsets of elements in matrices and vectors.

Often we are interested in only some of the elements of a matrix or vector. For example, we might want to look at the data from a single experimental trial which is stored in a particular row of a matrix. Alternatively, we might want to find out when the values in a time series cross a given boundary. Doing this is simple in Python.

Slicing

Slicing means that we want to look at a specific portion of a vector or matrix, for example, the first row of a matrix. We will continue with the matrix c from the previous example. The notation ‘:’ means ‘all indices’. To access all columns in the entire first row of c, type c[0,:]. To access the 2nd thru 4th columns of the first row of c, type c[0,1:4].

The notation 1:4 means all integers from 1 up to, but not including 4, which in this case gives columns 1, 2, and 3.

Leaving out the number before the colon tells Python to start at index 0. Leaving out the number after the colon tells Python to continue all the way to the end.

We can also tell Python how to step through the indices. To access only the even columns of c, we can use the following:

This code tells Python to start at 0, continue to the end, and step by 2. The result should be the values in row 0, columns 0, 2, and 4 of c. We could write this explicitly as c[0,0:5:2].

#### Selecting elements that satisfy a condition Sometimes we’re interested in locating particular values within a matrix or vector. As an example, let’s first define a vector.

Q. Calculate the shape of a. What is the maximum value of a? Hint: Use the max() function.

Now let’s find all values in a that exceed 10.

This is called logical indexing, let’s look at what a>10 returns:

When we index a using this array lgIdx we get back only the entries in a corresponding to True, as above:

Sometimes we want to know the actual indices in a where a > 10. We can get them using the nonzero() array method, which returns the index of all entries that were True, or non-zero.

The command nonzero() can be used as both a function and a method. A method is called by adding it after the object it is meant to operate on with a period in between ( lgIdx.nonzero() ). A function is called with the argument explicitly provided inside the parentheses ( nonzero(lgIdx) ). Basically, a function and a method do the same thing, but a function needs to be given an argument, while a method assumes that the argument is the object that the method is attached to. Note that if we use nonzero() as a function, we need to tell it to look in NumPy for the definition (i.e. add `` at the beginning of the function call).

Now we have another way to select the desired elements of a:

We can use these two types of indexing to change subsets of the values of a.

Q: How does a change in the first and second print statements?

We can perform these same logical operations for a matrix,

Notice that the last line collapses the True entries to an array, ordered by row and then by column. If you’ve used MATLAB, this is the opposite of what it does!

Example 14: Plotting data in Python.

It’s not easy to look at lists of numbers and gain an intuitive feeling for their behavior, especially when the lists are long. In these cases, it’s better to visualize the lists of numbers by plotting them. Consider

Q. Looking at the values in ‘y’ printed above, can you tell what’s happending?

A. Not really … let’s visualize y vs x instead.

matplotlib

To visualize y versus x let’s plot it. To do so, let’s first import some basic plotting routines from matplotlib, which provides a nice 2D plotting library. We’ll also tell Python to show matplotlib graphics inline, in this notebook.

Let’s start by plotting a simple example for x and y,

Q. Does the plot above make sense for the variables x and y?

Now, let’s go back to the definitions of x and y that we started this example with and plot y versus x.

The plot of x versus y should look a bit jagged, and not smooth like a sinusoid. To make the curve smoother, let’s redefine x as,

Q. Compare this definition of x to the definition above. How do these two definitions differ?

Q. What is the size of x? Does this make sense?

Now let’s replot the sine function.

Q. Does this plot make sense, given your knowledge of x, y, and trigonometry?

Example 15: What if we want to compare several functions?

Continuing the example in the previous section, let’s define a second vector

and plot it:

We’d now like to compare the two variables y and z. To do this, let’s plot both vectors on the same figure, label the axes, and provide a legend,

Notice that we’ve included a third input to the function plot. Here the third input tells Python to draw the curve in a particular color: 'r' for red. There are many options we can use to plot; to see more, check out the documentation for plot.

We can also label the axes, give the figure a title, and provide a legend,

To futher edit this plot, you might decide - for example - that the font size for the labels is too small. We can change the default with:

Example 16: We can make random numbers in Python.

To generate a single Gaussian random number in Python, use the function in the NumPy random module.

Let’s generate a vector of 1000 Gaussian random numbers:

… and look at a histogram of the vector:

Q. Does this histogram make sense? Is it what you expect for a distribution of Gaussian random variables?

See Python Help (hist?) to learn about the function hist().

Example 17: Repeating commands over and over and over . . .

Sometimes we’ll want to repeat the same command over and over again. For example, what if we want to plot sin(x + k*pi/4) where k varies from 1 to 5 in steps of 1; how do we do it? Consider the following:

That’s horrible code! All I did was cut and paste the same thing four times. As a general rule, if you’re repeatedly cutting and pasting in code, what you’re doing is inefficient and typically error prone. There’s a much more elegant way to do this, and it involves making a for loop. Consider:

Now let’s declare a for loop where k successively takes the values 1, then 2, then 3, …, up to 5. Note, any code we want to execute as part of the loop must be indented one level. The first line of code that is not indented, in this case show() below, executes after the for loop completes

The small section of code above replaces all the cutting-and-pasting. Instead of cutting and pasting, we update the definition of y with different values of k and plot it within this for-loop.

Q. Spend some time studying this for-loop. Does it make sense?

Note

Important: Python uses indentation to define for loops.

Example 18: Defining a new function.

We’ve spent some time in this notebook writing and executing code. Sometimes we’ll need to write our own Python functions. Let’s do that now.

Our function will do something very simple: it will take as input a vector and return as output the vector elements squared plus an additive constant.

If have a vector, v, and a constant, b, we would like to call:

vsq = my_square_function(v, b)

This won’t work! We first need to define my_square_function. Let’s do so now,

The function begins with the keyword def followed by the function name and the inputs in parentheses. Notice that this first line ends with a colon :. All of the function components that follow this first line should be indented one level. This is just like the for loop we applied earlier; the operations performed by the for loop were indented one leve.

When defining the function, the code the function executes should be indented one level.

The text inside triple quotes provides an optional documentation string that describes our function. While optional, including a ‘doc string’ is an important part of making your code understandable and reuseable.

The keyword return exits the function, and in this case returns the expression x * x + c. Note that a return statement with no arguments returns None, indicating the absence of a value.

With the function defined, let’s now call it. To do so we first define the inputs, and then run the function, as follows:

Q. Try to make a function, my_power, so that y = power(x,n) evaluates \(y = x^n\), (in Python you can use x**n to take the power)

Example 19: Load data into Python

For our last example let’s load a data file on the web in the .csv format into Python.

To do so, let’s first import the pandas module,

Now, let’s load a data file using the function read_csv,

The variable df that holds the loaded data is a Python DataFrame. We can think of it as a simple table that holds our data.

Let’s print it,

We see that the columns in the dataframe consist of two variables: d and t. Our collaborator who provided the data tells us that these correspond to the voltage recording (d) and a time axis (t) for her data.

Let’s define variables to hold the data corresponding to each key,

Note

Here we convert the data in each column to a numpy array, because we’d (probably) like numpy to function on these values.

Now, let’s plot the LFP data versus the time axis,

Example 20: Compute statistics and compare

Contributed by @mateouma

Let’s do some statistics. First, our standard imports.

With numpy, we can find the mean and standard deviation of our LFP data.

Now, let’s use numpy to randomly generate numbers according to a normal distribution with the same mean and standard deviation as the LFP data. The syntax is np.random.normal(mean, sd, size), where mean, sd, and size are variables or numbers.

We can use a histogram to compare the distribution of the data with a normal distribution with the same mean and standard deviation.

As we can see, our data doesn’t look normally distributed, but in practice we should use a statistical test to make this assessment.