The Journal of Nutrition Vol. 128 No. 3 March 1998,
pp. 661-663
Department of Health Studies and Gerontology, University of Waterloo, Waterloo, Ontario, N2L 3G1 Canada
Animal models make an important contribution to understanding the mechanisms whereby nutrition supports optimal human development. This is because studies conducted in humans are often susceptible to bias or confounding, given that ethical considerations preclude the inclusion of appropriate control measures. In view of the ease with which they breed and their relatively short gestation period, animals such as pigs or rodents (e.g., rats and mice) are often the species of choice in studies of this type. However, one important difference between these species and humans is that the animals are normally multiparous, i.e., a single pregnancy results in the birth of more than one offspring, termed collectively a litter. Therefore, for the conclusions drawn from such studies to be considered valid, it is important that factors relating to litters be taken into account both in the design of the study and in the statistical analysis of the data. Although this has become the norm in fields such as toxicology, where the teratogenic potential of drugs or other environmental agents is an important consideration (Tilson 1992), it appears to be an issue that often is overlooked in developmental nutritional studies. The purpose of this article therefore is to present the relevant arguments with the intent of increasing awareness of this topic in the community of nutritional researchers.
Generally the type of research question addressed by these studies is the relationship between particular levels of a dietary nutrient (ranging from deficiency to excess) during the prenatal and preweaning period and various developmental outcomes, usually measured longitudinally from birth until weaning or beyond. Typically, in such studies, although the outcome is measured in the offspring, the treatment is implemented through random assignment of the pregnant and lactating dam to diets of different composition. In using this approach, however, it is important to be aware that animals within a litter will very likely resemble each other to a greater degree than will animals from different litters, i.e., there is a potential for correlation among observations within a litter. Examples of factors that contribute to these similarities within a litter are those related either to genetic inheritance or to maternal environment, such as intrauterine environment and/or factors related to maternal performance during lactation. Thus, strictly speaking, the design of such experiments is hierarchical, with dams nested within treatments and pups nested within litters. Notwithstanding, it is not uncommon for published research in this area to fail to take the effects of litter into account with respect to the design and execution of such studies. Findings are often based on analyses conducted on individual pups from a very small number of litters, in some cases no more than two! Moreover, another issue that arises in the context of the litter is that maternal variables may confound the interpretation of the treatment effect, such that effects observed in the pups may in fact be due to the effects of the treatment on the health of the dam. There is also the concern in longitudinal studies of this type that differential mortality and/or inappropriate sampling procedures with respect to litter might lead to biased results. Exhaustive discussion of these issues is available from various sources in the literature (e.g., Wainwright and Ward 1997, Zorrilla 1997), and the following is intended merely to provide a summary overview, dealing first with issues of design and then of analysis.
Design.
As in all experimental studies, random assignment of pregnant animals to dietary treatment groups is a prerequisite for eliminating potential sources of bias at the outset of the study. Nonetheless, the possibility remains that the beneficial effects of randomization might be compromised due to differential mortality or "drop-out" of subjects during the course of the experiment. As an example of this, consider a study on the effects of deficiency of a specific nutrient on birth weight in two strains of mice. In this study, animals of strain A are significantly heavier at birth than those of strain B, suggesting that strain B is more sensitive to the growth-retarding effects of the treatment. However, there is also a significant reduction in litter size in the treated dams of strain A. This would suggest that the pups weighed at birth represent only the healthiest offspring of this strain, whereas those from strain B represent the full range of the effects of treatment. Thus, by taking the effects on litter size into account, one is led to the opposite interpretation, namely that it is in fact strain A that is more sensitive to the in utero nutritional deficiency. Moreover, it is not possible in this instance to determine whether the effects are direct effects of the treatment on the developing offspring or indirect effects mediated through effects of the treatment on maternal physiology. To address this, it becomes necessary to include further measurements of appropriate endpoints in both the dams and their offspring. Similar concerns apply when the treatment interval includes not only gestation but also lactation, where the dam suckles her own pups. Sometimes, in the case of studies of postnatal nutritional interventions, it is possible to circumvent this by feeding the offspring directly, as in the feeding of artificial milk substitutes to rat pups through indwelling gastrostomy tubes or in the hand rearing by bottle feeding of piglets.
But the variation related to the litter of origin of the pups still cannot be ignored in the design of such studies. As an example, consider a study in bottle-fed piglets where the control and experimental groups each consist of a separate litter with 10 piglets in each litter. In such a case, it is clear that the effects of treatment are completely inseparable from the effects of litter. Thus, to control for litter effects in these types of artificial rearing studies, it is imperative that at the outset of the study animals from any one litter be assigned randomly to each of the treatment groups. A similar concern obtains in developmental studies using litters where the effects of the treatment are assessed longitudinally at various time points. Very often the descriptions of the sampling procedures used in these studies are not explicit with respect to how animals are sampled from litters, and it is in fact possible that in some cases entire litters may have provided the sample at each of the time points, such that the effects of age may be biased by the effects of litter. The best approach in such instances is to equate litter size at birth, usually with equal representation of males and females, and then to select randomly one animal of each gender from each litter at each time point. One concern in using this procedure is that litter size will be introduced as a potential source of bias because it in turn will decrease over time. This can be prevented by keeping spare litters of "replacement pups" of the same age to equalize litter size across time, as long as these pups are suitably identified so that they cannot be used to generate data for the study.
Statistical analysis.
Consideration of the probability of "type 1" and "type 2" errors is integral to the validity of conclusions based on the statistical analysis of the data. The probability of type 1 error is set by the experimenter as the
α level and is the probability of falsely rejecting the null hypothesis, i.e., claiming an effect when there is none. Type 2 error (β) is the probability of failing to identify a true effect, and is related inversely to the power of the experiment. Power (1 - β) in turn is defined as the probability of identifying correctly an effect of a specified size, usually expressed in terms of "effect size", i.e., the absolute difference between the groups, divided by the standard deviation of the measure. The decision about the magnitude of effect sought depends on the judgement of the experimenter, based on an understanding of "real-world" context. Power is related to the effect size and the sample size, such that the larger the difference between the groups, the smaller the variance, and the larger the sample size, the greater the power. Power is also related to the α level, in that, for a given effect size and sample size, as the α level becomes more stringent, e.g., changes from 0.05 to 0.01, this change is accompanied by an increase in the probability of a Type 2 error, and hence a decrease in power.
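The relationships among effect size, sample size, the α level and power described above can be illustrated with a rough calculation. The sketch below is not taken from the article: it assumes a two-sided comparison of two equal-sized groups, uses a normal approximation in place of the t distribution (which somewhat overstates power at small sample sizes), and counts the sample size in litters. The function name `approx_power` and its parameters are illustrative only.

```python
from statistics import NormalDist


def approx_power(effect_size: float, litters_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power (1 - beta) of a two-sided, two-group comparison.

    effect_size is the absolute group difference divided by the standard
    deviation of the measure. Normal approximation; an assumption, not the
    article's method.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha level
    # Expected value of the test statistic for two equal groups of size n.
    shift = effect_size * (litters_per_group / 2) ** 0.5
    # Probability that the statistic falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (5, 10, 20):
        for alpha in (0.05, 0.01):
            print(f"litters/group={n:2d}  alpha={alpha:.2f}  "
                  f"power={approx_power(1.0, n, alpha):.3f}")
```

Run as a script, this prints a small grid in which power rises with the number of litters per group and falls when the α level is tightened from 0.05 to 0.01, for a fixed effect size of one standard deviation.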
[...] α level and hence increase the probability of a Type 1 error such that one may falsely claim statistical significance for the effects shown (Holson and Pearce 1992). Biostatisticians advise various ways of dealing with this. The most conservative approach is that of including litter as a random nested factor in the analysis, which, besides controlling for litter effects, also allows one to assess their significance. Alternatively, other multilevel analytical approaches can be used that first assess the degree of correlation in the data and then adjust the analysis accordingly (Lefkopoulou et al. 1989). Another approach is to use only one score per litter, either a litter mean score based on the average of some subset of pups that are tested in the litter or the score of a single animal per litter. Because it is rarely practical to obtain outcome measures on all offspring in a litter, this two-stage sampling procedure is fairly common, where first a random sample of pregnant dams is assigned to treatment and then a second random sample of one or more treated pups is assessed with respect to outcome. This approach is not without its problems, however, specifically a reduction in power related to the increased variability of measures based on a single animal. This can be offset to some extent by averaging across more than one score per litter and thereby decreasing the influence of outliers. But, given that measurements per pup are usually labor and/or resource intensive and that increasing the number of pups measured in a litter does not increase the sample size (if based correctly on the number of litters), the question then becomes how to maximize power for the same amount of effort expended in testing the pups. Again, simulations comparing effects on power of averaging across pups within a litter with that of testing the same number of pups, but each from different litters, indicate that one obtains greater power with the latter approach, which uses a larger number of litters (Holson and Pearce 1992).
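The trade-off between testing more pups per litter and testing more litters can be made concrete with a variance-components calculation. The sketch below does not reproduce the simulations of Holson and Pearce; it assumes a simple model in which each score is the sum of a random litter effect (variance `var_between`) and an independent pup-level deviation (variance `var_within`). All function and parameter names are hypothetical.

```python
def variance_of_group_mean(n_litters: int, pups_per_litter: int,
                           var_between: float, var_within: float) -> float:
    """Variance of a treatment-group mean when litter is the unit of analysis.

    Each litter contributes the mean of its sampled pups, so the litter mean
    has variance var_between + var_within / pups_per_litter, and the group
    mean averages n_litters such litter means.
    """
    return var_between / n_litters + var_within / (n_litters * pups_per_litter)


def naive_variance_of_group_mean(n_litters: int, pups_per_litter: int,
                                 var_between: float, var_within: float) -> float:
    """Variance one would (wrongly) assume if every pup were independent."""
    return (var_between + var_within) / (n_litters * pups_per_litter)


if __name__ == "__main__":
    vb, vw = 1.0, 1.0  # illustrative values giving a within-litter correlation of 0.5
    same_effort = [(10, 2), (20, 1)]  # both designs test 20 pups per group
    for litters, pups in same_effort:
        true_var = variance_of_group_mean(litters, pups, vb, vw)
        naive_var = naive_variance_of_group_mean(litters, pups, vb, vw)
        print(f"{litters} litters x {pups} pups: true={true_var:.3f} naive={naive_var:.3f}")
```

Under these assumed variances, spreading the same 20 pups over 20 litters gives a smaller variance of the group mean than testing 2 pups from each of 10 litters, which is the direction of the result reported in the text. The naive figure also shows how treating correlated littermates as independent understates the true variance, which is one way the Type 1 error rate becomes inflated.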
[...] α level, where a certain proportion of these outcomes may be significantly different by chance (also known as "experiment-wise" error). There are various ways of controlling for this statistically. The approach usually taken is that of using more conservative procedures to compare the groups, such as Tukey or Bonferroni t tests, that adjust the α level for the number of tests being conducted. The drawback associated with this is a concomitant decrease in power. A more powerful alternative approach is that of designating a discrete set of questions (no more than the degrees of freedom) based on the experimental hypothesis being tested, and addressing only these questions using preplanned comparisons (Cook and Farewell 1996). In some cases, repeated measures can be useful. For example, in the longitudinal design described above, where one animal from each litter in a group is observed at each time point, the data would be analyzed as repeated measures on litter, and the treatment by time interaction would provide an indication of differences in the response over time between the treatments. Further post hoc tests then could be used to identify the source of this interaction. When multiple outcome measures are included in a study, some researchers advocate the use of multivariate procedures to control experiment-wise error, although this remains controversial (Huberty and Morris 1989). A final point that needs to be considered with respect to the power of testing multiple outcomes is that, because of differences in the variability of various measures, a sample size (based on litters) that is appropriate for one measure may yield very low power for a measure with higher variability. In such cases, an effect that is statistically nonsignificant might be interpreted as "no effect," rather than as an indication of low power, which is of particular concern when issues related to the safety of the treatment are at stake.
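The growth of experiment-wise error with the number of tests, and the effect of a Bonferroni adjustment of the α level, can be sketched numerically. The calculation below assumes that the tests are independent, which real outcome measures from the same animals often are not, so it is an illustration rather than an exact statement. The function names are hypothetical.

```python
def experimentwise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection across n_tests
    independent tests, each run at the given alpha level."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha level after a Bonferroni adjustment."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for k in (1, 5, 10, 20):
        unadjusted = experimentwise_error(alpha, k)
        adjusted = experimentwise_error(bonferroni_alpha(alpha, k), k)
        print(f"tests={k:2d}  unadjusted={unadjusted:.3f}  after Bonferroni={adjusted:.3f}")
```

The adjusted per-test level keeps the experiment-wise error at or below the nominal α, but each individual comparison is then made at a more stringent level, which is the loss of power noted in the text.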
FOOTNOTES
Manuscript received 15 September 1997. Initial reviews completed 23 September 1997. Revision accepted 11 November 1997.