Evaluate your evaluation methods! A key to meaningful inference.

1. Introduction

What is Inference?

Inference is a fundamental concept in both everyday life and scientific investigation. It is the process of drawing conclusions based on evidence and reasoning. Inference allows us to make predictions, understand relationships, and gain insights from data and observations.

Why is Inference Important?

Decision-Making: Inference helps us make informed decisions. Whether you’re deciding what to wear based on the weather forecast or determining the best strategy for your experiment, inference plays a crucial role in evaluating options and outcomes.
Understanding Brain Function: Inference helps neuroscientists draw conclusions about brain activity from experimental data, bridging the gap between observed neural signals and underlying brain processes.
Clinical Applications: Inference is essential for diagnosing and treating neurological disorders. By analyzing patient data, clinicians can infer the underlying causes of symptoms and tailor treatments accordingly.

Everyday Examples of Inference

Weather Predictions: Meteorologists use data from satellites, weather stations, and historical patterns to infer future weather conditions. This helps us prepare for what’s coming, whether it’s bringing an umbrella or planning for a sunny day.
Medical Diagnoses: Doctors use symptoms, medical history, and test results to infer the most likely cause of a patient’s condition. This process is critical in providing accurate diagnoses and effective treatments.
Electrophysiology: Recording electrical activity from neurons allows researchers to infer the roles of specific neurons or networks in processing information.
Behavioral Studies: By observing behavior in response to stimuli, neuroscientists infer the neural mechanisms underlying perception, decision-making, and learning.

So, what is Inference?

Here, we focus on statistical inference - using data from a sample to make inferences about a population. We will learn to apply statistical inference in this Unit.

Inference Skills To Develop

Ask Questions: Cultivate curiosity by asking questions about the world around you. Why did something happen? What might influence this happening? Asking questions leads to deeper understanding and better inference skills.
Gather Evidence: Collect relevant information and data. The more evidence you have, the stronger your inferences will be. Evaluate the quality and reliability of your sources. We can arrive at better conclusions through better data collection.
Think Critically: Analyze the evidence and consider multiple perspectives. Avoid jumping to conclusions without thorough examination. Critical thinking helps in making sound inferences.

What’s next?

In this Unit, we will practice making inferences from noisy data.
To do so, we will use data from a specific example. We will build models from these data to make inferences about a population (i.e., statistical inference).
We will think critically about our inference results, ask questions about the interpretation (i.e., check our model), and update our model to improve the inferences we make from the data.

2. The data: swim lessons versus drownings.

You receive data from a community organization interested in understanding the impact of swimming lessons on swim safety. The data consist of the following information from N = 299 communities:

swim_lessons - the number of swim lessons in a community,
drownings - the number of drownings (per 100,000 individuals) in a community,
x, y - the geographic location (latitude and longitude) of the community.

We’re interested in inferring the relationship (if any) between swim_lessons and drownings.

What relationship do you expect to observe between swim_lessons and drownings?

A very reasonable expectation is that:

Expectation: Drownings decrease as the number of swim lessons increases.

That’s a fine statement.

However, we’d like to go beyond this qualitative expectation and put some numbers to it.

For example, does adding 1000 swim lessons to a community (e.g., by providing free programs, by promotion through advertising) reduce the number of drownings by 1, by 10, by 100?

In what follows, we’ll infer this expected relationship from the data.

But before we begin our inference procedure, let’s start by visualizing the data.

Look at the data. Do these data match the intuitive expectation that the number of swim lessons influences the number of drownings?

We’re provided one additional piece of information: the location (x,y) of each community with reported results.

Let’s also plot these geographic locations.

What do you observe in the geographic locations of the communities?

NOTE: Data visualization is often essential to statistical inference:

We visualize the data to draw conclusions (inferences) about a larger population based on the sample of collected data.

Here we use data visualization to gain some intuition for the data.

However, our focus is statistical inference, as we discuss next.

3. Make an inference: How do the number of swim lessons relate to drownings?

Now, let’s perform statistical inference.

To start, we’ll perform the following three steps:

Choose a model for the data.
Use the data to infer model parameters.
Summarize conclusions.

Our initial statistical inference approach takes the form (McCullagh & Nelder, 1989; page 391):

Initial statistical inference procedure. We start here, but will modify this procedure later.

We’ve already collected the data. Our next step is to choose a model for these data.

Step 1: Choose a model for the data.

Choosing a model requires we apply our prior knowledge as scientists.

In this case, let’s apply our intuition to express our model in words.

Initial Model: I expect more swim lessons reduce the number of drownings.

That’s a fine model. To perform statistical inference, let’s express our model as an equation:

drownings = m swim_lessons + b

Compare our model to the equation for a line: \(y = mx + b\). Match each variable to the correct interpretation.

\(y\) →
\(x\) →
\(m\) →
\(b\) →

Here, our initial model is a line.

That’s a very simple model of the data.
We do not expect a line will capture all of the complexity in our data.

However, a line (or linear relationship) often provides a useful first step.

And, a line is relatively simple to compute and interpret.

Step 2. Use the data to infer model parameters.

Having chosen our model – a line – our next step is to infer the parameters in our model.

There are two parameters in the model:

m, the slope, and
b, the intercept.

We’re primarily interested in the slope (m). This parameter indicates the relationship between swim_lessons and drownings.

What sign (positive or negative) do you expect for the slope m?

I expect the slope is positive because I expect drownings will increase with swim_lessons.
I expect the slope is negative because I expect drownings will decrease with swim_lessons.

We expect \(m<0\).

We expect a negative relationship between drownings and swim_lessons.
We expect more swim lessons will reduce the number of drownings.

Let’s infer \(m\) from the data.
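The source does not show how the slope and its standard error are computed. A minimal sketch, assuming ordinary least squares is the fitting method, might look like the following. The function name `fit_line` and the toy numbers are hypothetical; they are not the community data.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + b.

    Returns (m, b, standard_error_of_m).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return m, b, standard_error

# Toy data (hypothetical values, for illustration only).
swim_lessons = [100, 200, 300, 400, 500]
drownings = [1.1, 1.9, 3.2, 3.8, 5.0]
m, b, standard_error = fit_line(swim_lessons, drownings)
print(f"m = {m:.4f}, b = {b:.4f}, standard_error = {standard_error:.4f}")
```

Comparing the size of `standard_error` to the size of `m` is the precision check used in the interpretation that follows.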

Meaning:

The slope estimate m represents the estimated change in the number of drownings when the number of swim_lessons increases by 1.
The standard_error represents the variability of the slope estimate.

Interpretation:

If the number of swim_lessons increases by 1, we estimate the number of drownings to increase by 0.0014, on average.
The standard error (0.0002) is small compared to the slope estimate, so the variability in the slope estimate is small; i.e., the estimate is precise.

Remember our model is a line.

A line is easy to visualize.

Since we’ve now inferred the model parameters, let’s visualize the inferred line by plotting it with the data.

Looking at this plot, what is the slope of the inferred line?

Step 3. Summarize conclusions.

We have chosen the model (Step 1) and inferred model parameters (Step 2).

Our last step is to summarize the conclusions of our statistical inference.

Let’s start by considering an explicit numerical conclusion from our model:

How does the number of drownings change if we increase the number of swim lessons by 1000?

We find an increase of 0.0014 drownings for a one-unit increase in the number of swim_lessons. How does the number of drownings change when the number of swim_lessons increases by 1000?

Answer:

Our model of the data is a line:

drownings = m swim_lessons + b

We inferred the model parameter m from the data and found:

drownings = 0.0014 swim_lessons + b

Consider what happens if the number of swim lessons increases by 1. According to our fit model, the number of drownings increases by:

\(0.0014 * 1\) swim lesson \(= 0.0014\)

That’s a small increase, in both the number of swim lessons and number of drownings.

Alternately, consider what happens if the number of swim lessons increases by 1000. Plugging into our model, we find the number of drownings increases by:

\(0.0014 * 1000\) swim lessons \(= 1.4\)

That’s perhaps a more interpretable result; a community making a policy decision to promote more swim lessons (e.g., through subsidies, through advertisements) might expect about 1 more drowning (per 100,000 individuals).

In this way, we can always interpret the results of a model. It’s often useful to plug in values, and see what happens.
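Plugging in values is easy to automate. The sketch below uses a hypothetical helper, `predicted_change`, and the slope estimate 0.0014 reported above. Because the model is a line, the intercept b cancels when we compare two values of swim_lessons, so only the slope is needed.

```python
def predicted_change(slope, delta):
    """Change in the model outcome when a predictor changes by `delta`."""
    return slope * delta

m = 0.0014  # slope estimate reported above for the initial model
for extra_lessons in (1, 1000):
    change = predicted_change(m, extra_lessons)
    print(f"{extra_lessons} more swim lessons -> {change:+.4f} drownings")
```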

Summary

We’ve applied a statistical inference approach and found a compelling result:

The number of drownings increases with more swim lessons.

The slope estimate m in our model looks convincing:

The standard error is small (0.0002) compared to the slope (0.0014); i.e., the estimate is precise.

These results are certainly passable for peer-reviewed publication and perhaps enough to motivate a new public policy:

To prevent drownings, discourage swim lessons.

What’s our next step?

4. Something’s not right … evaluate your evaluation method.

To look for a relationship between swim_lessons and drownings, our initial statistical inference approach required that we choose a model of the data.

We chose a line to model the data.

This simple model resulted in a precise (i.e., low variability) estimate of the relationship between swim_lessons and drownings.

But, was this a good model choice?

An important step in any statistical inference is to check our model.

Let’s update our initial statistical inference approach to include this fundamental step (McCullagh & Nelder, 1989; page 391):

Our updated statistical inference approach becomes:

Choose a model for the data.
Use the data to infer model parameters.
Check and refine the model.
Summarize conclusions.

Notice that we’ve included a new step: Check and refine the model.

In checking the model, we look for discrepancies between the data and model.

These discrepancies help us determine whether the model is suitable or inadequate.

If we determine the model is inadequate, then we return to Step 1 (red arrow), consider an alternative model for the data, and repeat the inference and model checking steps.

If we determine the model is suitable, then we continue to Step 4 and summarize our conclusions.

This statistical inference loop profoundly changes the process of analysis (McCullagh & Nelder, 1989; page 392).

In what follows, we’ll illustrate this updated statistical inference loop by checking (Step 3) the model we developed for swim_lessons and drownings.

Residual Analysis

Many approaches exist to check a model (McCullagh & Nelder, 1989; Chapter 12).

Here, we implement a common and powerful approach: residual analysis.

Residuals are the differences between the observed data and model predicted values.

In our case, the residuals are the differences between the actual values for drownings and the estimated drownings from our model (the line we fit to the data).

Remember our plot of the estimated model (solid red line) and the data (blue dots):

The residuals are the distance from each blue dot (the data) to the red line (the estimated model).

In the plot above, each orange line indicates a residual.

If the model is a good fit to the data, then we expect a patternless set of residuals (i.e., the residuals are randomly scattered around zero).

Alternatively, if we find patterns in the residuals, we might wonder about our model choice, and consider strategies to update and improve our model.
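As a minimal sketch of this definition, the residuals can be computed as observed minus predicted values. The function name `residuals` and the toy numbers are hypothetical; the slope and intercept are the least-squares fit of this toy data, not of the community data.

```python
def residuals(x, y, m, b):
    """Observed minus model-predicted values for the line y = m*x + b."""
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]

# Toy data and its least-squares line (illustrative values only).
swim_lessons = [100, 200, 300, 400, 500]
drownings = [1.1, 1.9, 3.2, 3.8, 5.0]
m, b = 0.0097, 0.09

r = residuals(swim_lessons, drownings, m, b)
for xi, ri in zip(swim_lessons, r):
    print(f"swim_lessons = {xi}: residual = {ri:+.3f}")
```

A positive residual means the model underestimates the observed value; a negative residual means it overestimates.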

NOTE: Other approaches exist to check our model. These include goodness-of-fit metrics (like R-squared), comparison with alternative models (using techniques like AIC), and cross-validation to assess how well the model generalizes to new data. There’s unfortunately no one best strategy to check a model. Here we focus on residual analysis because the approach is intuitive and provides a good gauge of how our model is doing.

Let’s plot the residuals for our original model:

Do you see any patterns in this plot of the residuals?

A: (Short answer)

No. At first glance, these residuals look mostly patternless.

Our visualization of the residuals reveals no systematic patterns.

Instead, the residuals appear to fluctuate around 0.

That means our model is ok, right?

NO!

This visualization of the residuals is not very informative.

The horizontal axis (the variable index) is arbitrary.

To observe systematic patterns in the residuals, let’s plot the residuals versus the predictor in our model: the number of swim_lessons.

Now, what patterns do you see in the residuals, when plotted versus the model variable swim_lessons?

A: (Short Answer)

Inspection of residuals versus swim_lessons reveals two clear patterns:

As swim_lessons increases, the residuals systematically decrease.
As swim_lessons increases, the variability of the residuals tends to increase.

We conclude from this model check that something isn’t right.

Our model doesn’t adequately describe features in the data.
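A plot is the primary tool here, but the same two patterns (a shift in the average residual, a change in spread) can also be summarized numerically. The sketch below is one hypothetical way to do so: sort the residuals by the predictor, split them into bins, and report the mean and standard deviation in each bin. The function name, the number of bins, and the residual values are all assumptions made for illustration.

```python
import statistics

def binned_residual_summary(x, r, n_bins=3):
    """Mean and standard deviation of residuals within bins of sorted x.

    Returns a list of (x_low, x_high, mean_residual, sd_residual) tuples.
    """
    pairs = sorted(zip(x, r))
    size = len(pairs) // n_bins
    summary = []
    for k in range(n_bins):
        # The last bin absorbs any leftover points.
        stop = (k + 1) * size if k < n_bins - 1 else len(pairs)
        chunk = pairs[k * size:stop]
        values = [ri for _, ri in chunk]
        summary.append((chunk[0][0], chunk[-1][0],
                        statistics.mean(values), statistics.pstdev(values)))
    return summary

# Hypothetical residuals whose spread grows with the predictor.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
r = [0.1, -0.1, 0.0, 0.5, -0.5, 0.0, 2.0, -2.0, 0.0]
for low, high, mean_r, sd_r in binned_residual_summary(x, r):
    print(f"x in [{low}, {high}]: mean = {mean_r:+.2f}, sd = {sd_r:.2f}")
```

For a well-behaved model we would expect the bin means to stay near zero and the bin standard deviations to stay roughly equal across bins.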

We’ve plotted the residuals versus the predictor swim_lessons. We could also plot residuals versus the outcome variable drownings. Doing so, what do you see?

A: (Extension / Challenge Question)

Conclusions

Our analysis of the residuals suggests an important result:

Our initial model fit (a line) is no good.

This is a very useful result.

Our initial statistical inference produced a counter-intuitive result:

As swim lessons increase, so do drownings.

But this initial inference is based on a model - the line - that’s not a good representation of the data.

Clear trends exist in the residuals.

Checking the model is our first step to improving the model, and thereby improving the statistical inferences we make from the data.

NOTE: In this case, our model check (Residual Analysis) failed; the residuals were not randomly scattered around 0.

Because the model check failed, we can dismiss the counter-intuitive relationship identified in our initial model (i.e., that more swim_lessons increase the number of drownings) and continue model development.

However, we emphasize that further model development should be pursued regardless of the relationship our initial model identified. Even if the initial statistical inference proposed an intuitive result (i.e., that more swim_lessons decrease the number of drownings), we would still continue model development; when the model check fails, we’re not satisfied with the initial model, no matter the intuitive appeal of the result.

5. Iterate the statistical inference loop: improve your scientific conclusions by improving your inference model.

Remember our statistical inference loop:

In Mini 3, we proposed our initial model:

Initial model: drownings = m swim_lessons + b

In Mini 4, we checked our model by visualizing the residuals (i.e., the difference between our initial model predictions and the data).

We found systematic discrepancies (i.e., patterns) in the residuals and concluded:

Our initial model fit (a line) is no good.

With this knowledge, our task now is to refine the model.

Many strategies exist to refine our model. To get started, consider this question: What is missing from the model?

A: (Short Answer)

Many things could be missing, including the age of individuals in the community, or the time of year in which the data were collected, or the socioeconomic status of each community. The goal of a model is not to capture every possible cause of drownings. Instead, the goal of the model is to develop a simple, but useful, description of the population, given the small sample of data we get to observe. In the words of George Box: “All models are wrong, but some are useful.”
There could be issues with the sampling of data. Maybe the data were collected incorrectly. That’s possible, but not the issue here.

To update our model, consider these data we examined in previous Minis:

The geographic location of each community.

Let’s again consider the initial model residuals versus the geographic location of each community:

We observe that these residuals vary with geographic location.

More specifically, we notice spatial patterns in the residuals:

positive residuals near the coast,
negative residuals away from the coast.

These observations suggest an important model refinement:

Update the model to include distance to the ocean as another predictor.

There are infinitely many features we could include in the model. Why include distance to the ocean?

A: (Multiple Choice)

(Correct) The feature makes intuitive sense: we might expect more drownings at locations near to the ocean, and fewer drownings at locations far from the ocean.
(Correct) We have data for this predictor (i.e., we can determine the distance from each community to the ocean).

With this idea for our new model, let’s implement each step of our statistical inference loop:

Choose a model for the data.
Use the data to infer model parameters.
Check and refine the model.
Summarize conclusions.

Step 1: Choose a model for the data.

Our refined model of drownings includes two predictors:

The number of swim lessons (swim_lessons).
The distance from the ocean (distance_from_ocean), measured as the geodesic distance in kilometers (as the crow flies) from Myrtle Beach.
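The source does not show how distance_from_ocean was computed. One rough sketch, assuming the location is given as latitude and longitude in degrees, is the haversine great-circle distance. This treats the Earth as a sphere, so it only approximates a true geodesic distance. The coordinates used for Myrtle Beach and the function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

# Approximate coordinates of Myrtle Beach, SC (an assumption, in degrees).
MYRTLE_BEACH = (33.69, -78.89)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in kilometers between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_from_ocean(lat, lon, reference=MYRTLE_BEACH):
    """Distance (km) from a community to the reference coastal point."""
    return great_circle_km(lat, lon, reference[0], reference[1])

# A hypothetical community one degree of latitude north of the reference.
print(f"{distance_from_ocean(34.69, -78.89):.1f} km")
```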

Our refined model becomes:

Refined model: drownings = \(m_1\) swim_lessons + \(m_2\) distance_from_ocean + b

Remember that our initial model - a line in Mini 3 - contained one outcome (drownings) and one predictor (swim_lessons).

Our refined model contains one outcome (drownings) and two predictors (swim_lessons and distance_from_ocean).

Therefore, our refined model is a 2-dimensional plane in the 3-dimensional space of the data.

To visualize our 3-dimensional data, let’s load these new data and plot them.

Look at the three-dimensional plot. Can you sketch (or imagine) a plane that slices through these points?

A: It’s difficult to see, but maybe.

Step 2. Use the data to infer model parameters.

With these data and our refined model, let’s continue our statistical inference loop and infer model parameters.

To do so, we’ll fit a plane to the data.

The idea is exactly the same as fitting a line to the data in our original model.

By fitting a plane to the data, we’ll infer three model parameters:

\(m_1\), the slope corresponding to the predictor swim_lessons.
\(m_2\), the slope corresponding to the predictor distance_from_ocean.
\(b\), the intercept.
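As with the line, the source does not show the fitting code. A minimal sketch of fitting a plane, again assuming ordinary least squares, is given here. It centers the variables and solves the resulting 2x2 normal equations for the two slopes. The function name `fit_plane` and the toy numbers are hypothetical.

```python
import math

def fit_plane(x1, x2, y):
    """Least-squares fit of y = m1*x1 + m2*x2 + b.

    Returns ((m1, se1), (m2, se2), b).
    """
    n = len(y)
    a1, a2, ay = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered variables.
    c1 = [v - a1 for v in x1]
    c2 = [v - a2 for v in x2]
    cy = [v - ay for v in y]
    # Sums of squares and cross-products.
    s11 = sum(u * u for u in c1)
    s22 = sum(u * u for u in c2)
    s12 = sum(u * v for u, v in zip(c1, c2))
    s1y = sum(u * v for u, v in zip(c1, cy))
    s2y = sum(u * v for u, v in zip(c2, cy))
    # Solve the 2x2 normal equations for the two slopes.
    det = s11 * s22 - s12 * s12
    m1 = (s22 * s1y - s12 * s2y) / det
    m2 = (s11 * s2y - s12 * s1y) / det
    b = ay - m1 * a1 - m2 * a2
    # Residual variance with n - 3 degrees of freedom (three parameters).
    rss = sum((yi - (m1 * u + m2 * v + b)) ** 2
              for u, v, yi in zip(x1, x2, y))
    sigma2 = rss / (n - 3)
    se1 = math.sqrt(sigma2 * s22 / det)
    se2 = math.sqrt(sigma2 * s11 / det)
    return (m1, se1), (m2, se2), b

# Toy data (hypothetical values, for illustration only).
swim_lessons = [100, 200, 300, 400, 500, 600]
distance_from_ocean = [10, 50, 20, 80, 40, 90]
drownings = [9.5, 7.2, 8.8, 5.1, 7.6, 4.4]
(m1, se1), (m2, se2), b = fit_plane(swim_lessons, distance_from_ocean, drownings)
print(f"m_1 = {m1:.4f} (standard_error {se1:.4f})")
print(f"m_2 = {m2:.4f} (standard_error {se2:.4f})")
```

The fit fails (division by zero) if the two predictors are perfectly collinear, because the plane is then not uniquely determined.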

We’re primarily interested in \(m_1\), the slope corresponding to the predictor swim_lessons.

This parameter indicates the relationship between swim_lessons and drownings.

However, we now include the additional predictor distance_from_ocean and also infer its slope (\(m_2\)) from the data.

We’re not necessarily interested in the relationship between drownings and distance_from_ocean. Motivated by our scientific knowledge (e.g., that drownings may depend on a community’s distance to the ocean) and our residual analysis in the initial model, we include this new predictor.

What sign (positive or negative) do you expect for the slope \(m_1\)?

A: (Multiple Choice)

(Incorrect) I expect the slope is positive because I expect drownings will increase with swim_lessons.
(Correct) I expect the slope is negative because I expect drownings will decrease with swim_lessons.

What sign (positive or negative) do you expect for the slope \(m_2\)?

A: (Multiple Choice)

(Incorrect) I expect the slope is positive because I expect drownings will increase with distance_from_ocean.
(Correct) I expect the slope is negative because I expect drownings will decrease with distance_from_ocean.

We expect \(m_1<0\) and \(m_2<0\).

We expect a negative relationship between drownings and swim_lessons, i.e., we expect increased swim lessons will decrease the number of drownings.
We expect a negative relationship between drownings and distance_from_ocean, i.e., we expect increased distance from the ocean will decrease the number of drownings.

Let’s infer \(m_1\) and \(m_2\) from the data.

Meaning:

The slope estimate m_1 represents the change in the number of drownings for a one-unit change in the number of swim_lessons.
The slope estimate m_2 represents the change in the number of drownings for a one-unit change in the distance_from_ocean.
For each parameter, the standard_error represents the variability of the slope estimate.

Interpretation of inferred value for m_1:

The sign of m_1 is negative.
If the number of swim_lessons increases by 1, we estimate the number of drownings decreases by 0.0003, on average.
The standard error (0.0003) is big compared to the slope estimate, so the variability in the slope estimate is large; i.e., the estimate is unreliable.

Interpretation of inferred value for m_2:

The sign of m_2 is negative.
If the distance_from_ocean increases by 1 kilometer, we estimate the number of drownings decreases by 0.06, on average.
The standard error (0.0087) is small compared to the slope estimate, so the variability in the slope estimate is small; i.e., the estimate is precise.

Our refined model is a 2-dimensional plane in the 3-dimensional space of variables (swim_lessons, distance_from_ocean, and drownings).

Since we’ve now inferred the model parameters from the data, we can visualize the inferred plane.

In other words, let’s plot the model with the data.

Step 3. Check and refine the model.

Let’s now check the refined model.

To do so, let’s plot the new model errors (i.e., the residuals) versus a predictor in our model (here, the number of swim_lessons).

Wait, I thought this was a 2-dimensional model of 3-dimensional data. Why are the residuals only 1-dimensional?

A: (Multiple Choice)

(Correct) The residuals are the difference between the observed number of drownings (drownings) and the number of drownings predicted by the model. Both of these quantities are 1-dimensional.
(Incorrect) The data are 3-dimensional and the model is 2-dimensional so 3-2 = 1.

What patterns do you see in the residuals, when plotted versus the model variable swim_lessons?

A: (Short Answer)

Inspection of residuals versus swim_lessons reveals one pattern:

As swim_lessons increases, the variability of the residuals tends to increase.

Remember that, in our initial model, we observed a systematic trend in the residuals.

Original Model: As swim_lessons increase, the average residual value decreases.

In our refined model, visual inspection reveals no systematic trend in the residuals:

Refined Model: As swim_lessons increase, the average residual value remains approximately constant.

This is a nice improvement compared to our original model.

Including the additional variable distance_from_ocean has reduced a systematic trend in the residuals.

Let’s also plot the residuals of the new model versus the geographic location.

Do you observe any patterns in the residuals that depend on location?

A: (Multiple Choice)

(Incorrect) Yes, the residuals appear to vary with geographic location. Nearer to the ocean, the residuals tend to be more positive (darker blue colors).
(Correct) No, the residuals do not appear to vary with geographic location.

Step 4. Summarize conclusions.

Having chosen the model (Step 1), inferred model parameters (Step 2), and checked our model (Step 3), our last step is to summarize the conclusions of our statistical inference.

To do so, let’s consider explicit numerical conclusions from our model:

How does the number of drownings change if we increase the number of swim lessons by 1000 or distance from the ocean by 100 kilometers?

We found \(m_1=-0.0003\) with standard error \(0.0003\). How does the number of drownings change when the number of swim_lessons increases by 1000?

Our model of the data is a plane:

drownings = \(m_1\) swim_lessons + \(m_2\) distance_from_ocean + b

We inferred the model parameter m_1 from the data and found:

drownings = -0.0003 swim_lessons + …

If the number of swim lessons increases by 1000, then the number of drownings changes by:

drownings = -0.0003 * 1000 = -0.3

So, by this calculation, increasing the number of swim lessons by 1000 reduces the number of drownings by 0.3.

However, we have no confidence in this result!

We found a standard error of 0.0003 for the estimate of \(m_1\). This standard error is as large as the estimate \(m_1\) itself.

We therefore would not be surprised if repeated experiments found \(m_1\) values between approximately:

\(m_1 \pm 2 * \mathrm{standard \, error} = -0.0003 \pm 2*0.0003 = [-0.0009, 0.0003]\)

So, due to the large standard error, we cannot confidently conclude whether \(m_1\) is positive or negative.

In other words, we cannot conclude whether increasing swim lessons increases or decreases the number of drownings.

We found \(m_2=-0.06\) with standard error \(0.0087\). How does the number of drownings change when the distance_from_ocean increases by 100 kilometers?

Our model of the data is a plane:

drownings = \(m_1\) swim_lessons + \(m_2\) distance_from_ocean + b

We inferred the model parameter m_2 from the data and found:

drownings = … -0.06 distance_from_ocean + …

If the distance from the ocean increases by 100 kilometers, then the number of drownings changes by:

drownings = -0.06 * 100 = -6

So, by this calculation, increasing the distance to the ocean by 100 kilometers reduces the number of drownings by 6.

In this case, we are confident in this result!

We found a standard error of 0.0087 for the estimate of \(m_2\). This standard error is much smaller than the estimate of \(m_2\) itself.

We therefore would not be surprised if repeated experiments found \(m_2\) values between approximately:

\(m_2 \pm 2 * \mathrm{standard \, error} = -0.06 \pm 2*0.0087 = [-0.077, -0.043]\)

So, due to the small standard error, we can confidently conclude that \(m_2\) is negative.

In other words, we conclude that increasing distance from the ocean decreases the number of drownings.
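The two interval calculations above can be sketched in a few lines. The helper names `approximate_interval` and `sign_is_clear` are hypothetical; the estimates and standard errors are the values reported above. The rule "estimate plus or minus two standard errors" is the rough guide used in this section, not an exact confidence interval.

```python
def approximate_interval(estimate, standard_error, width=2.0):
    """Return (low, high) = estimate -/+ width * standard_error."""
    return (estimate - width * standard_error,
            estimate + width * standard_error)

def sign_is_clear(interval):
    """True when the interval excludes zero, i.e. the sign is determined."""
    low, high = interval
    return low > 0 or high < 0

# Estimates and standard errors reported above.
m1_interval = approximate_interval(-0.0003, 0.0003)
m2_interval = approximate_interval(-0.06, 0.0087)
for name, (low, high) in (("m_1", m1_interval), ("m_2", m2_interval)):
    clear = sign_is_clear((low, high))
    print(f"{name}: [{low:.4f}, {high:.4f}], sign clear: {clear}")
```

The interval for \(m_1\) contains zero, so its sign is undetermined; the interval for \(m_2\) lies entirely below zero.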

Summary

Our residual analysis suggests an important result: Our refined model (a plane) improves the model fit.

We’ve made progress improving our model!

Our initial model produced a significant - but nonsensical - result:

Initial model conclusion: As swim lessons increase, so does the number of drownings.

However, performing our model checks, residual analysis revealed a poor model fit.

The poor model fit motivated a cycle through our statistical inference loop.

We refined the model by adding another predictor - the distance to the ocean.

The refined model changed our results dramatically:

We improved our model fit (by reducing a systematic trend in the residuals).

However, we still observe that the variability of the residuals tends to increase with the number of swim lessons.

We found a (weak) relationship between increasing swim lessons and decreasing drownings.

While this result makes intuitive sense, our evidence for this relationship is not convincing (the variability in the estimate is large).

We found a (strong) relationship between increasing distance from the ocean and decreasing drownings.

This result also makes intuitive sense and our evidence for this relationship is strong (the variability in the estimate is small).

We show in the next Minis that continued model refinement in our statistical inference loop can further strengthen our results. If you’re interested, check out the other Minis associated with this Unit.

To conclude, we applied our statistical inference loop to make inferences from noisy data.

We started by choosing a model and inferring the model parameters (i.e., the slopes).

But that is not enough!

We must also check our model, in our case by plotting the residuals.
