## Abstract

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of *Explorations in Statistics* explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can account for different initial values of the response. But this creates a problem: percent change is really just a ratio, and a ratio is infamous for its ability to mislead. This means we may fail to find a group difference that does exist, or we may find a group difference that does not exist. What kind of an approach to science is that? In contrast, analysis of covariance is versatile: it can accommodate an analysis of the relationship between absolute change and initial value when percent change is useless.

- absolute change
- analysis of covariance
- ordinary least-squares regression
- percent change
- symmetrized percent change

This tenth paper in *Explorations in Statistics* (see Refs. 5–13) explores the analysis of a potential change in some physiological response. As researchers, we often express the absolute change in the physiological thing we care about as a percent change so we can account for different initial values of the response. But this creates a problem: percent change is really just a ratio. And, as we saw in our last exploration (13), a ratio can wreak havoc and mislead us. Before we explore the vagaries of analyzing the change in the physiological thing we care about, we need to review the software we will use to investigate change.

### R: Basic Operations

The first paper in this series (5) summarized R (22) and outlined its installation. For this exploration there are three additional steps: download Advances_Statistics_Code_Change.R^{1} to your Advances folder, confirm that you installed the beeswarm, boot, coin, and MASS packages in our previous explorations (8, 9, 12), and install the extra package car.^{2}

To install car, open R and then click Packages | Install package(s) . . . .^{3} Select a CRAN mirror close to your location and then click OK. Select car and then click OK. When you have installed car, you will see

`package car successfully unpacked and MD5 sums checked`

in the R Console.

#### To run R commands.

If you use a Mac, highlight the commands you want to submit and then press ⌘ + ↵ (command key + enter). If you use a PC, highlight the commands you want to submit, right-click, and then click Run line or selection. Or, highlight the commands you want to submit and then press Ctrl + R.

### Metrics of Change: an Overview

Suppose we want to estimate the impact of a novel drug on the physiological thing we care about. How would we do this? In each of our *n* subjects, we would measure the physiological thing we care about first before and then after the drug. If the initial value (the value before the drug) is *y*_{i}, and if the final value (the value after the drug) is *y*_{f}, then the absolute change Δ*y* in the physiological thing we care about is just

$$\Delta y = y_f - y_i .$$

The absolute change Δ*y* is the simplest metric with which to assess change. But it can be imperfect: the absolute change Δ*y* may depend, in part, on the initial value *y*_{i}. On the one hand, if *y*_{i} varies substantially between groups–suppose *y*_{i} represents the initial diameter of arteries and arterioles–then simply because of physical constraints, larger values of Δ*y* are likely to be associated with larger values of *y*_{i}. On the other hand, if *y*_{i} varies within a group–suppose now that *y*_{i} represents the initial blood pressure of healthy controls–then for mathematical reasons alone, if *y*_{i} is smaller, then *y*_{f} can increase more: the lower you start, the higher you can climb. If *y*_{i} is bigger, then *y*_{f} can decrease more: the higher you start, the farther you can fall.

The traditional solution to this dependency of absolute change Δ*y* on initial value *y*_{i} is to standardize Δ*y* to *y*_{i}:

$$\frac{\Delta y}{y_i} = \frac{y_f - y_i}{y_i} .$$

Often, this ratio^{4} is rescaled to percent change, %Δ, by multiplying by 100:

$$\%\Delta = 100 \times \frac{\Delta y}{y_i} = 100 \times \frac{y_f - y_i}{y_i} \qquad (1)$$

Of course, this metric of change has its own quirks: if the initial value happens to be 0, then the percent change is undefined. And, as *y*_{i} gets smaller and smaller, %Δ gets bigger and bigger, approaching ∞ or −∞ depending on whether *y*_{f} increases or decreases from *y*_{i}.

There is another quirk: for a given pair of initial and final values, the magnitude of the percent change depends on the direction of the comparison, that is, on which value is the reference value. Suppose the initial value *y*_{i} = 1 and the final value *y*_{f} = 2. If we compare *y*_{f} to *y*_{i}, then the %Δ is

$$100 \times \frac{2 - 1}{1} = 100\%$$

and we conclude that the final value is 100% greater than the initial value. In contrast, if we compare *y*_{i} to *y*_{f}, then the %Δ is

$$100 \times \frac{1 - 2}{2} = -50\%$$

and we conclude that the initial value is 50% less than the final value. Of course, it is not at all clear why we would compare the initial value to the final value–after all, the initial value precedes the final value–but the fact that the magnitude of the percent change depends on the direction of the comparison creates some logical dissonance.
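The arithmetic behind these two comparisons can be checked directly in R. The following is a minimal sketch; the function name `pct_change` is our own label for *Eq. 1* and does not come from Advances_Statistics_Code_Change.R.

```r
# Percent change of a final value relative to an initial value (Eq. 1).
# pct_change is a hypothetical helper name, not part of the article's script.
pct_change <- function(y_i, y_f) {
  100 * (y_f - y_i) / y_i
}

pct_change(1, 2)   # final value compared with initial value: 100
pct_change(2, 1)   # initial value compared with final value: -50

# As the initial value shrinks toward 0, percent change grows without bound.
pct_change(c(1, 0.1, 0.01), 2)   # 100, 1900, 19900
```

The last command illustrates the earlier quirk: for the same final value, a smaller and smaller initial value produces a larger and larger percent change.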

We can resolve that dissonance if we compute the symmetrized %Δ (1–3):

$$\text{symmetrized } \%\Delta = 100 \times \frac{y_f - y_i}{y_f + y_i} \qquad (2)$$

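Symmetrized %Δ is better behaved mathematically than is %Δ. If *y*_{i} happens to be 0, then the symmetrized %Δ is 100%. If *y*_{i} differs from 0 and *y*_{f} = 0, then the symmetrized %Δ is −100%. As with %Δ, if *y*_{f} = *y*_{i}, then the symmetrized %Δ is 0%. And a pleasing property (24) is that, for a given pair of initial and final values, the magnitude of the symmetrized percent change is unaffected by the direction of the comparison. If we compare *y*_{f} to *y*_{i}, we get

$$100 \times \frac{y_f - y_i}{y_f + y_i}$$

and if we compare *y*_{i} to *y*_{f}, we get

$$100 \times \frac{y_i - y_f}{y_i + y_f} ,$$

which differs from the first result only in sign.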

These two metrics of relative change purport to account for differences in the initial value. But just as with other ratios (13), if there is no relationship between Δ*y* and *y*_{i}, then the mere calculation of %Δ and symmetrized %Δ creates a relationship (Fig. 1). If there is a relationship between Δ*y* and *y*_{i}, then the calculation of %Δ and symmetrized %Δ exaggerates the strength of that relationship.
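A short simulation can make both points concrete: the behavior of symmetrized %Δ, and the relationship that a ratio creates on its own. This is a sketch under stated assumptions: the helper name `sym_pct_change`, the range of initial values, and the distribution of the absolute change are all hypothetical choices of ours, and the symmetrized metric is written as the difference divided by the sum, which matches the properties described above.

```r
# Symmetrized percent change: difference divided by sum, times 100.
# sym_pct_change is a hypothetical helper name.
sym_pct_change <- function(y_i, y_f) {
  100 * (y_f - y_i) / (y_f + y_i)
}

sym_pct_change(0, 5)    # initial value of 0:  100
sym_pct_change(5, 0)    # final value of 0:   -100
sym_pct_change(1, 2)    # about  33.3
sym_pct_change(2, 1)    # about -33.3: same magnitude, opposite sign

# Hypothetical simulation: the absolute change is generated independently
# of the initial value, yet both ratios end up correlated with it.
set.seed(1)
n   <- 10000
y_i <- runif(n, min = 50, max = 150)   # assumed range of initial values
dy  <- rnorm(n, mean = -20, sd = 5)    # assumed change, unrelated to y_i
y_f <- y_i + dy

cor(y_i, dy)                           # close to 0 by construction
cor(y_i, 100 * dy / y_i)               # clearly positive
cor(y_i, sym_pct_change(y_i, y_f))     # clearly positive
```

Because the average change is negative, dividing by a larger initial value pulls the ratio toward 0, which is what produces the positive correlations in the last two commands.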

### The Examples

Suppose we develop our initial thought experiment in which we wanted to estimate the impact of some drug on the physiological thing we care about. First, we randomly assign 10 sheep to each of 2 groups: a control group and a treated group. Next, imagine we suspect that the impact of the drug will be associated with the initial value of the physiological thing we care about: the bigger the initial value, the bigger the drug-induced decrease. And so, in each member of the treated group, we somehow elevate the physiological thing we care about before we make our initial measurement and then administer the drug. We define our null hypothesis to be that the thing we care about will decrease the same amount in the control and treated groups, and we establish a critical significance level of α = 0.05 (14). Table 1 lists the observations from this simulated experiment.

On average, the thing we care about decreased 96 units in the treated group and 86 units in the control group (Table 1, Δ*y* column). If we assess our null hypothesis using an exact permutation method (12), we reject the null hypothesis (*P* < 0.001) and conclude that the absolute change Δ*y* differs between the two groups. This confirms our suspicion that the impact of the drug depends on the initial value of the thing we care about. But then we remember that absolute change can be an imperfect metric. And so we wonder: what if we assess our null hypothesis using the standardized metrics %Δ and symmetrized %Δ?

The physiological thing we care about decreased 90% (Table 1, %Δ column) and 82% (Table 1, s%Δ column) in each of the two groups. For both metrics we fail to reject the null hypothesis (*P* = 0.23), and we conclude that the percent change %Δ and the symmetrized percent change s%Δ are similar in the two groups. The commands in *lines 438–444* of Advances_Statistics_Code_Change.R return these results. Your results will differ.
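To see what an exact permutation test does with any one of these metrics, we can enumerate every possible split of the observations into two groups. The sketch below uses only base R; the function name `exact_perm_test` and the toy data are hypothetical, and the article's own script may compute the test differently.

```r
# Exact two-sided permutation test for a difference in group means.
# exact_perm_test is a hypothetical helper; group must be coded 0 or 1.
exact_perm_test <- function(x, group) {
  stopifnot(length(x) == length(group), all(group %in% c(0, 1)))
  n1 <- sum(group == 1)
  n0 <- length(x) - n1
  observed <- mean(x[group == 1]) - mean(x[group == 0])
  # Sum of x over every possible choice of n1 observations as group 1
  sums1 <- combn(x, n1, FUN = sum)
  diffs <- sums1 / n1 - (sum(x) - sums1) / n0
  # Proportion of splits at least as extreme as the one we observed
  # (the small tolerance guards against floating-point ties)
  mean(abs(diffs) >= abs(observed) - 1e-12)
}

# Toy data (hypothetical): 3 observations per group, 20 possible splits.
x     <- c(1, 2, 3, 10, 11, 12)
group <- c(0, 0, 0, 1, 1, 1)
p_value <- exact_perm_test(x, group)
p_value   # 2 of the 20 splits are at least as extreme, so P = 0.1
```

The same function can be applied to Δ*y*, %Δ, or s%Δ by passing the corresponding column as `x`. With 10 observations per group there are 184,756 splits, which base R can still enumerate.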

In our last exploration (13), we used analysis of covariance (4, 15–17, 21, 23) to compare the relationship between a numerator and a denominator in two groups.^{5} We can use the same approach here, but first, we need to see how we generated the observations in Table 1.

Suppose the first-order model

$$\Delta Y = \beta_0 + \beta_1 (Y_i + \xi) + \varepsilon \qquad (3)$$

defines the true relationship between the absolute change Δ*Y* and the initial value *Y*_{i}. In this statistical model, ξ is the random error associated with the measurement of *Y*_{i} and is distributed normally with a mean of 0 and a standard deviation σ_{ξ} = 1. We generated measured values of *Y*_{i} by adding random measurement error to each known initial value. Next, β_{0} represents the magnitude of Δ*Y* when the measured initial value, *Y*_{i} + ξ, is 0; β_{1} represents the true slope of the relationship between Δ*Y* and *Y*_{i} + ξ; and ε represents random error in Δ*Y* at each measured initial value. The random error ε is also distributed normally with a mean of 0 and a standard deviation σ_{ε} = 1. The command in *line 436* of Advances_Statistics_Code_Change.R returns the observed values of Δ*y* and the measured values of *y*_{i} listed in Table 1. Your values will differ.
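The data-generating steps just described can be sketched in a few lines of R. We assume here that the model has the form the text describes: the absolute change equals β_{0} plus β_{1} times the measured initial value plus ε. The function name `simulate_group`, the known initial values, and the coefficient values are hypothetical and are not taken from Advances_Statistics_Code_Change.R.

```r
# Simulate one group from the model described in the text:
#   measured initial value = known initial value + xi
#   absolute change        = beta0 + beta1 * (measured initial value) + epsilon
# simulate_group is a hypothetical helper name.
simulate_group <- function(known_yi, beta0, beta1, sd_xi = 1, sd_eps = 1) {
  n  <- length(known_yi)
  yi <- known_yi + rnorm(n, mean = 0, sd = sd_xi)
  dy <- beta0 + beta1 * yi + rnorm(n, mean = 0, sd = sd_eps)
  data.frame(yi = yi, dy = dy)
}

set.seed(10)
# Hypothetical values, not the coefficients used in the article's script.
control <- simulate_group(known_yi = seq(90, 110, length.out = 10),
                          beta0 = 0, beta1 = -0.9)
treated <- simulate_group(known_yi = seq(100, 120, length.out = 10),
                          beta0 = 0, beta1 = -0.9)
dat <- rbind(cbind(control, group = 0), cbind(treated, group = 1))
dat$group <- factor(dat$group)
head(dat)
```

Because the random errors change on every run without a fixed seed, each simulated data set differs, which is why the text repeats that your values will differ.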

It turns out that we defined the coefficients β_{0} and β_{1} as

in *lines 156–159* of Advances_Statistics_Code_Change.R. By using these values in *Eq. 3*, we generated the observed values of Δ*y* and the measured values of *y*_{i} for the control (*group 0*) and treated (*group 1*) groups; see Table 1.

If we use analysis of covariance (4, 15–17, 21, 23) to estimate the true relationship between the absolute change Δ*Y* and the measured initial value *Y*_{i} + ξ in our two groups, then we obtain

where Δ*ŷ* is the predicted value (see Ref. 11) of the absolute change in the thing we care about and *y*_{i} is the initial value of the thing we care about. We also learn that the estimates of the slope of the relationship between the absolute change and the initial value are similar in the two groups (*P* = 0.94); see Fig. 2. The commands in *lines 370–372* of Advances_Statistics_Code_Change.R execute the analysis of covariance, and *lines 441–444* return these values. Your values will differ.
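One way to fit an analysis of covariance of this kind is with `lm()` in base R, using an interaction between the initial value and the group. The sketch below is not the code in Advances_Statistics_Code_Change.R; the simulated data and the variable names are hypothetical.

```r
# Hypothetical data: two groups that share one slope (-0.9) through the origin.
set.seed(2)
yi    <- c(runif(10, 90, 110), runif(10, 100, 120))
group <- factor(rep(c(0, 1), each = 10))
dy    <- -0.9 * yi + rnorm(20, mean = 0, sd = 1)
dat   <- data.frame(yi = yi, dy = dy, group = group)

# Analysis of covariance with a separate intercept and slope for each group.
# The yi:group1 coefficient estimates the difference in slope
# (group 1 minus group 0).
fit <- lm(dy ~ yi * group, data = dat)
summary(fit)$coefficients

# P value for the null hypothesis that the two slopes are equal.
slope_p <- summary(fit)$coefficients["yi:group1", "Pr(>|t|)"]
slope_p
```

In this parameterization, the `yi` row is the estimated slope in *group 0* and the `group1` row is the difference in *y*-intercept between the groups.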

In the denouement of our thought experiment we now have four statistics: the absolute change Δ*y*, the percent change %Δ, the symmetrized percent change s%Δ, and the slope of the relationship between the absolute change Δ*y* and the initial value *y*_{i}. If we dispense with absolute change because it can be impacted by initial value, then we want to ask, is our conclusion about the control and treated groups consistent regardless of the statistic we compute? The answer is yes: the control and treated groups demonstrated similar changes regardless of whether we focused on percent change, symmetrized percent change, or the slope of the relationship between the absolute change and initial value.

So far, so good. The standardized metrics percent change and symmetrized percent change lead us to the same scientific conclusion as the regression technique analysis of covariance. Perhaps we should have expected this: percent change (*Eq. 1*) and symmetrized percent change (*Eq. 2*) are ratios, and we defined the relationship between the numerator Δ*Y* and the denominator *Y*_{i} to be a straight line through the origin (see Ref. 13).^{6}

But what happens if the relationship between the numerator Δ*Y* and the denominator *Y*_{i} is a straight line that intersects someplace other than the origin? Suppose we defined the coefficients β_{0} and β_{1} as

in *lines 161–164* of Advances_Statistics_Code_Change.R. By using these values in *Eq. 3*, we generate the observed values of Δ*y* and the measured values of *y*_{i} for the control (*group 0*) and treated (*group 1*) groups listed in Table 2.

In this situation, the physiological thing we care about decreases 74% and 76% (Table 2, %Δ column) and 59% and 61% (Table 2, s%Δ column) in the two groups. For each of these standardized metrics we reject the null hypothesis (*P* < 0.001) and conclude that %Δ and symmetrized %Δ differ in a small but statistically convincing manner in the two groups. The commands in *lines 448–454* of Advances_Statistics_Code_Change.R return these results. Your results will differ.

In contrast, analysis of covariance estimates the true relationship between the absolute change Δ*Y* and the measured initial value *Y*_{i} + ξ in our two groups as

and reveals that the slope of the relationship between the absolute change and the initial value is similar in the two groups (*P* = 0.61; see Fig. 2). The commands in *lines 370–372* of Advances_Statistics_Code_Change.R execute the analysis of covariance, and *lines 451–454* return these values. Your values will differ.

Again we have trouble (see Ref. 13). If we analyze percent change and symmetrized percent change, then we conclude that the physiological thing we care about decreased more in the treated group than it did in the control group. But if we analyze the relationship between absolute change and initial value using analysis of covariance, then we conclude that the relationship, that is, the slope Δ*Y*/*Y*_{i}, is similar in the two groups (see Fig. 2).

We make sense of these conflicting conclusions as we did when we explored ratios: we recall that percent change (*Eq. 1*) and symmetrized percent change (*Eq. 2*) are useful only when the relationship between the numerator and the denominator is a straight line through the origin. Here, for each of these standardized metrics, it is not.
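A short derivation shows why the origin matters. If we set the random errors aside and write the straight-line relationship as Δ*Y* = β_{0} + β_{1}*Y*_{i}, then percent change becomes

$$\%\Delta = 100 \times \frac{\beta_0 + \beta_1 Y_i}{Y_i} = 100 \times \left( \beta_1 + \frac{\beta_0}{Y_i} \right)$$

and symmetrized percent change becomes

$$\text{symmetrized } \%\Delta = 100 \times \frac{\Delta Y}{2 Y_i + \Delta Y} = 100 \times \frac{\beta_0 + \beta_1 Y_i}{\beta_0 + (2 + \beta_1)\, Y_i} .$$

When β_{0} = 0, these reduce to the constants 100 × β_{1} and 100 × β_{1}/(2 + β_{1}): neither metric depends on the initial value. When β_{0} differs from 0, both metrics vary with *Y*_{i}, so two groups that share the same β_{0} and β_{1} but start from different initial values will show different percent changes.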

Suppose we define the coefficients β_{0} and β_{1} in *Eq. 3* as

The physiological thing we care about decreases less in the treated group regardless of whether we consider percent change (−87% vs. −91% in the control group) or symmetrized percent change (−78% vs. −83% in the control group) (see Fig. 3). If we use analysis of covariance, we learn that

These slopes differ convincingly (*P* < 0.001).

Last, suppose we define the coefficients β_{0} and β_{1} in *Eq. 3* as

The physiological thing we care about decreases less in the treated group regardless of whether we consider percent change (−72% vs. −74% in the control group) or symmetrized percent change (−57% vs. −59% in the control group); see Fig. 3. If we use analysis of covariance, we learn that

Once again, these slopes differ convincingly (*P* < 0.001).

In each of these situations, the result of our single simulation is typical. We can satisfy ourselves of this assertion if we bootstrap the *P* values from 99 additional simulations (see Ref. 8). When the slope of the true relationship between absolute change and initial value is identical in the two groups, percent change and symmetrized percent change lead us to the same scientific conclusion as analysis of covariance only if the relationship between absolute change and initial value goes through the origin (Table 3). When the slope of the true relationship between absolute change and initial value differs in the two groups, percent change and symmetrized percent change lead us to the same scientific conclusion as analysis of covariance only if the relationship between absolute change and initial value goes through the origin for at least one of the two groups (Table 4).
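We can sketch the first of these patterns with a small repeated simulation. Everything below is a hypothetical stand-in for the article's procedure: the coefficients, the initial values, and the number of simulations are our own choices, and we compare percent change between groups with a Welch *t* test (base R `t.test()`) rather than with a permutation method. In both scenarios the two groups share the same true line; only the *y*-intercept of that line changes.

```r
# One simulated experiment in which both groups follow the same true line.
# Returns P values for percent change and for the difference in slopes.
# one_experiment and all parameter values are hypothetical.
one_experiment <- function(beta0, beta1, n = 10) {
  # Measured initial values; the second group starts from elevated values
  yi    <- c(seq(95, 105, length.out = n), seq(145, 155, length.out = n)) +
           rnorm(2 * n)
  group <- factor(rep(c(0, 1), each = n))
  dy    <- beta0 + beta1 * yi + rnorm(2 * n)
  pct   <- 100 * dy / yi
  c(pct   = t.test(pct ~ group)$p.value,   # Welch t test, not permutation
    slope = summary(lm(dy ~ yi * group))$coefficients["yi:group1", "Pr(>|t|)"])
}

set.seed(3)
through_origin <- replicate(100, one_experiment(beta0 = 0,   beta1 = -0.9))
off_origin     <- replicate(100, one_experiment(beta0 = -20, beta1 = -0.5))

# Proportion of simulated experiments with P < 0.05 for each analysis
rowMeans(through_origin < 0.05)
rowMeans(off_origin < 0.05)
```

When the line goes through the origin, both analyses should reject the null hypothesis only occasionally. When it does not, percent change should signal a group difference in nearly every simulation even though the slopes are identical by construction.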

Residual plots confirm that each of our analysis of covariance models is appropriate (not shown; see Ref. 11).

### Practical Considerations

*Equation 3* defines a statistical model for the relationship between the absolute change and the initial value of the physiological thing we care about. If instead of the absolute change, we want to think about the final value *Y*_{f} of the thing we care about, we can define an analogous model for the relationship between the final value *Y*_{f} and the initial value *Y*_{i}:

In this model, the slope of the relationship between the final value and the initial value equals 1 + β_{1}.
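Because the final value is the initial value plus the absolute change, ordinary least-squares estimates obey the same identity. The following sketch uses hypothetical simulated data to confirm that the estimated slope for the final value equals 1 plus the estimated slope for the absolute change.

```r
set.seed(4)
yi <- runif(20, 90, 110)            # hypothetical initial values
dy <- -20 - 0.5 * yi + rnorm(20)    # hypothetical absolute changes
yf <- yi + dy                       # final value = initial value + change

slope_change <- coef(lm(dy ~ yi))["yi"]
slope_final  <- coef(lm(yf ~ yi))["yi"]

all.equal(unname(slope_final), unname(1 + slope_change))   # TRUE
```

The estimated *y*-intercept is the same in both fits, so the two models carry the same information about the groups.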

### Summary

Just as a ratio is seductively simple, so too are the standardized metrics percent change and symmetrized percent change. If the relationship between the absolute change Δ*y* and initial value *y*_{i} is not a straight line through the origin, then percent change and symmetrized percent change will lead us astray. They are, after all, just ratios. And, as we just discovered, this means we may fail to find a group difference that does exist, or we may find a group difference that does not exist.

In contrast, as this exploration has demonstrated, analysis of covariance, a regression technique that others (19, 20, 25, 26) have advocated for the analysis of absolute change, is versatile and provides more detailed information about possible group differences in the physiological thing we care about.

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## ACKNOWLEDGMENTS

We began our collaboration in June 2009 at the Reading for the Advanced Placement Statistics exam in Louisville, KY. We connected over a former housemate, friend, and colleague: Jerry Waldvogel, an energetic, well-known biologist and science educator who had been at Clemson University since 1989. In one of life's cruel twists of fate, Jerry had died unexpectedly just 10 days before the Reading. Jerry would be quite amused–and quietly pleased–to know he had any role whatsoever with this paper.

We thank Gerald DiBona (University of Iowa College of Medicine, Iowa City, IA), John Ludbrook (Department of Surgery, The University of Melbourne, Melbourne, Victoria, Australia), and Matthew Strand (National Jewish Health, Denver, CO) for their helpful comments and suggestions.

## Footnotes

^{1} This file is available through the Supplemental Material link for this article at the *Advances in Physiology Education* website.

^{2} The car package accompanies *An R Companion to Applied Regression* (18).

^{3} The notation click *A* | *B* means click *A*, then click *B*.

^{4} If the final value *y*_{f} is expressed as a fraction of the initial value *y*_{i}, that is, *y*_{f}/*y*_{i}, this quantity is equivalent statistically to the absolute change Δ*y* standardized to the initial value *y*_{i}:

$$\frac{\Delta y}{y_i} = \frac{y_f - y_i}{y_i} = \frac{y_f}{y_i} - 1 .$$

^{5} For two or more groups, analysis of covariance estimates, and compares statistically, components of the relationship between a numerator and a denominator. Typically these components include the *y*-intercept and the slope of the relationship between the numerator and the denominator.

^{6} The latter observation holds also for symmetrized percent change (*Eq. 2*).

- Copyright © 2015 The American Physiological Society