Whether as educator or scientist, I always have difficulty in explaining statistics in a simple way to students or colleagues involved in biological studies. Biologists examine their results extremely critically and carefully choose the appropriate analytic methods depending on their scientific objectives. However, no such close attention is usually given to the choice of relevant statistical tools and to assessment of how statistical methods may influence the validity of the results. We thus encourage interdisciplinary science and we support collaboration between biologists and statisticians from the very beginning of a project.

This Illumination describes a simple way to explain the limitations of statistics to scientists and students, with the aim of avoiding the publication of misleading conclusions.

Let’s consider a biological phenomenon to be observed. We will compare this biological event to a geometrical figure (for instance, a parallelepiped). Because biological processes are unknown and cannot be observed easily, we must first postulate that the geometrical figure (here, the parallelepiped) cannot be observed directly by anybody.

Statistical tools help one to analyze experimental data that themselves describe the studied biological event. With this in mind, the different experimental designs or the different statistical tools (1, 3) available to scientists (*t*-test, ANOVA, regression, etc.) can be compared with different lights that illuminate the geometrical figure. Just as biologists can observe the biological processes they are studying only indirectly, the observers cannot see the geometrical figure that they would like to describe, but only its shadows.

The first point is that each experimental design or statistical test gives a specific description of the studied biological event, and this description may differ from that provided by another experiment or another test. It is indeed well known that a researcher comparing a treated group with a control group may conclude that the treatment effect is or is not significant depending on the experiment or on the statistical test that is used (3). In our comparison, the shadows of the parallelepiped likewise differ depending on the position of the lights. In other words, each shadow reveals only a specific part of the parallelepiped, depending on the position of the light. Moreover, the shadow resembles the shape of the original figure more or less closely. For instance, the shadow may be larger or smaller than the original figure when the light is located beside or above the figure, respectively. Similarly, in biology, the difference between two groups of subjects (a control and an experimental group) may be amplified or reduced relative to reality depending on the experimental design or the statistical method. In the first case, the probability of a false-positive result is too high, and in the second case the probability of a false-negative result is too high (the statistical test does not allow the researcher to detect any significant difference; Fig. 1). This simple analogy emphasizes that scientists may use the wrong statistical techniques to analyze their data (1), just like a physician who uses the wrong drug to treat a patient's illness. In that case, the scientist will overestimate or underestimate the treatment effect under study, just as the shadow may amplify or reduce the shape of the original figure (Fig. 1).
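As a concrete illustration of how the same measurements can lead to different conclusions, the following minimal sketch computes an unpaired and a paired *t* statistic on one small data set. The numbers are hypothetical and chosen only for illustration, the function names are ours, and the critical values are the standard two-sided 5% values of Student's *t* for 8 and 4 degrees of freedom.

```python
import math
import statistics

# Hypothetical measurements, invented for illustration only:
# the same five subjects measured without and with a treatment.
control = [10.0, 12.0, 14.0, 16.0, 18.0]
treated = [11.0, 13.5, 15.0, 17.5, 19.0]


def unpaired_t(a, b):
    """Student's two-sample t statistic with pooled variance (df = na + nb - 2)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(b) - statistics.mean(a)) / standard_error


def paired_t(a, b):
    """Paired t statistic computed on within-subject differences (df = n - 1)."""
    diffs = [y - x for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Two-sided 5% critical values of Student's t distribution.
T_CRIT_DF8 = 2.306  # unpaired test, 5 + 5 - 2 = 8 degrees of freedom
T_CRIT_DF4 = 2.776  # paired test, 5 - 1 = 4 degrees of freedom

t_unpaired = unpaired_t(control, treated)
t_paired = paired_t(control, treated)

print(f"unpaired t = {t_unpaired:.2f}, significant: {abs(t_unpaired) > T_CRIT_DF8}")
print(f"paired t   = {t_paired:.2f}, significant: {abs(t_paired) > T_CRIT_DF4}")
```

With these invented numbers, the unpaired test does not reach significance because the large differences between subjects dominate the small treatment effect, whereas the paired test does. Which test is legitimate depends on how the experiment was actually designed: the paired analysis is valid only if each treated value truly belongs to the same subject as its control value.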

Even if the researcher uses the appropriate experimental design and the appropriate statistical method, s/he may be unable to detect any significant difference between a control group and an experimental group. This may be due, for instance, to a low number of subjects per group. Sample size is indeed a key parameter that determines the power of the experimental design (2). For instance, it can easily be calculated from the *t*-test formula that the number of subjects required per group increases when the difference we wish to detect is smaller than or similar to the intragroup variability arising from technical and/or biological sources (Fig. 2).
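The relation between sample size, the difference to detect, and intragroup variability can be sketched numerically. The example below uses the common normal approximation for a two-sided comparison of two group means, n = 2 (z_{1-α/2} + z_{1-β})² (σ/Δ)² per group. This is an assumption on our part and not necessarily the calculation behind Fig. 2; the function name and the default values (α = 0.05, power = 0.80) are likewise illustrative, and the exact *t*-based calculation gives slightly larger numbers for small groups.

```python
import math
from statistics import NormalDist


def subjects_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate number of subjects needed per group to detect `difference`
    between two group means when the intragroup standard deviation is `sd`.

    Uses the normal approximation to the two-sample t-test (two-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * (z_alpha + z_beta) ** 2 * (sd / difference) ** 2
    return math.ceil(n)


# The difference is expressed in units of the intragroup standard deviation.
for difference in (2.0, 1.0, 0.5):
    n = subjects_per_group(difference, sd=1.0)
    print(f"difference = {difference} SD -> about {n} subjects per group")
```

Under these assumptions, a difference twice as large as the intragroup standard deviation needs about 4 subjects per group, a difference equal to it about 16, and a difference half as large about 63. Halving the difference to detect roughly quadruples the required sample size.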

We can illustrate this problem by using lights of different powers (Fig. 3). Even when the light casts a shadow similar in shape to the original parallelepiped, this shadow may be black, gray, or even almost white if the light is of low power. In the last case, it would be impossible to discern the shape of the parallelepiped.

Similarly, scientists may be unable to describe the biological phenomenon they are looking at because of an insufficient number of samples, despite an appropriate experimental design and appropriate statistical methods. This situation may be compared with a physician who prescribes the right drug but at the wrong dose.

We hope that this Illumination will constitute a helpful description of statistics. The goal is to reduce the number of published scientific studies that use statistical methodology incorrectly, as observed, for instance, in molecular biology (1).

© 2004 American Physiological Society