Journal of Nutrition


© 2002 The American Society for Nutritional Sciences J. Nutr. 132:1039-1046, 2002


Nutritional Models

Characterization of Diet-Dependent Metabolic Serotypes: Primary Validation of Male and Female Serotypes in Independent Cohorts of Rats1 ,2

Honglian Shi*, Karen E. Vigneau-Callahan†, Alexander I. Shestopalov*, Paul E. Milbury**, Wayne R. Matson† and Bruce S. Kristal*,‡3

* Dementia Research Service, Burke Medical Research Institute, White Plains, NY 10605; † ESA, Incorporated, Chelmsford, MA 01824; ** Antioxidants Research Laboratory, Jean Mayer U.S. Department of Agriculture Human Nutrition Research Center on Aging at Tufts University, Boston, MA 02111; and ‡ Departments of Biochemistry and Neuroscience, Cornell University Medical College, New York, NY 10021

3To whom correspondence should be addressed. E-mail: Bkristal@burke.org.



    ABSTRACT
 
Our research seeks to identify serum profiles, or serotypes, that reflect substantial changes in food intake in both male and female rats. This report validates previously defined subsets of redox-active low-molecular-weight metabolites using independent cohorts of ad libitum consumption (AL) and energy or dietary restricted (DR) 6-mo-old male and female rats. In the male study, both hierarchical cluster analysis (HCA) and principal component analysis (PCA) distinguished the dietary groups of origin in the second male cohort with >85% accuracy using 56 analytically and biologically valid metabolites. Further analysis revealed that 29 metabolites (nine previously unidentified metabolites + 20 chosen from the 56 metabolites) enabled HCA to distinguish dietary groups at 100% efficacy. In the female study, the 63 previously identified serum metabolites were sufficiently robust to enable classification of the dietary intake of two female cohorts (cohorts 2 and 3) that were independent of the cohort on which these markers were initially identified (cohort 1). Classification accuracy was 94 and 100% using HCA and PCA, respectively, in the female cohort 2. HCA and PCA revealed that the 63-metabolite profile distinguished AL and DR samples at 91 and 100% accuracy in the female cohort 3, establishing the 63-metabolite dataset as our baseline profile. These studies used independent cohorts to validate and potentially improve upon previously defined metabolic serotypes in male and female rats and set the stage for pattern recognition–based approaches to establish metabolome-based categorical separations.


KEY WORDS: • dietary restriction • HPLC • serum metabolite • multivariate • biomarker • rats


    INTRODUCTION
 
Dietary restriction (DR),4 undernutrition without malnutrition, has been an important area of research interest for >70 y; substantial evidence documents its beneficial effects in animals. DR extends longevity in essentially all animals in which it has been tested, including multiple mammalian species (rat, mouse, guinea pig), and DR is the most potent and reproducible known means of extending life span and reducing morbidity in higher animals (1–11). DR alters many basic physiologic processes, including metabolism, hormonal balance and the generation of, detoxification of and resistance to reactive oxygen species (12). These data are consistent with multiple epidemiologic studies in humans showing the detrimental effects of obesity on health (13). The beneficial effects of DR on animal health reveal that the physiologic effect of DR in mammals is large, and encourage us to hypothesize that a serum profile that can distinguish long-term patterns of energy intake exists and can be identified using the DR rat as a model system.

To establish serotypes that accurately reflect substantial changes in food and/or energy intake such as that which occurs in animals subject to DR (1–10,14–17), we are investigating low-molecular-weight molecules that are chemically redox active using HPLC coupled with coulometric electrochemical array detectors. Initial studies showed that HPLC separations coupled with coulometric array detectors (18–27) could detect ~1200 compounds in rat sera, of which ~300 were analytically reliable and ~240 (female) or ~290 (male) were biologically reliable in young rats [(14) and companion paper (28)]. Among them, 101 (female) and 112 (male) metabolites differed between AL and DR rats by automated analysis. As described in the companion paper (28), attention to analytical issues reduced the number of metabolites under study to 63 in female rats and 52 in males. Metabolite subsets enabled both HCA and PCA to group the initial cohorts of both male and female rats with 100% accuracy.

These previously obtained data demonstrate feasibility, that is, that quantitative analysis of selected sera metabolites can yield sufficient information by which to classify the dietary intake of a group of rats and set the stage for validation of these metabolic serotypes in independent cohorts. In this report, we utilize independent cohorts of male and female rats to address the next stage of this problem, i.e., whether specific metabolites exist that robustly (e.g., across different cohorts of rats, and eventually across species) enable determination of dietary group of origin.


    MATERIALS AND METHODS
 
Animal husbandry.

Details of the rats and husbandry conditions used in this study were reported previously (14). Briefly, male and female Fischer 344 x Brown Norway F1 rats were obtained monthly from the National Institute on Aging colony at Harlan (Indianapolis, IN). The basic animal husbandry for the animals used followed NIA/NIH guidelines as implemented by Harlan. Rats were fed NIH-31 (AL rats only) or vitamin/mineral–fortified NIH-31 (DR rats only); see (14) for detailed diet compositions. All rats were housed individually, and DR feeding regimens (40% less food than eaten by AL rats) were implemented at 6 wk of age. Rats were imported from Harlan at 5 mo of age and killed ~1 mo later. As noted in the previous report (14), we found that rats in our colony that consumed food AL ate slightly less food than at Harlan, and restriction was therefore slightly modified. Specifically, once adjusted, food restriction was imposed at a level of ~35% less than ad libitum consumption (14). Sera were collected after killing by decapitation to avoid known differential effects of anesthesia on variables of interest. Collected blood was allowed to clot on ice for 30 min before centrifugation (1000 x g, 10 min). Female cohorts 1 and 2 had 8 AL and 8 DR rats; female cohort 3 had 6 AL and 5 DR rats; male cohort 1 had 5 AL and 8 DR rats; male cohort 2 had 7 AL and 8 DR rats. All animal experiments were performed under institutionally approved protocols and complied with the Guide for the Care and Use of Laboratory Animals (29).

HPLC methodology.

HPLC separations and coulometric array detection were conducted essentially as described previously, using an ESA CoulArray system (ESA, Chelmsford, MA) (14,18,25,26).

Statistical analysis.

Data were analyzed using the programs CEAS 504 (ESA), Statview 5.0.1 (SAS Institute, Cary, NC) and Pirouette 2.7/3.0 (Infometrix, Woodinville, WA). Descriptions of the applications of the techniques of hierarchical cluster analysis (HCA) and principal component analysis (PCA) are presented in the companion paper (28); mathematical formulas are available in the Pirouette manual. Other aspects of the data analysis are described in the text.


    RESULTS AND DISCUSSION
 
To establish serotypes that reflect substantial changes in food intake, we are investigating redox-active serum metabolites that differentiate dietary groups utilizing both male and female Fischer 344 x Brown Norway F1 rats. Primary studies showed that there exist subsets of metabolites that can distinguish AL and DR samples of male and female rats, respectively [see accompanying paper (28)]. In this report, we present data showing that these metabolites, previously identified as informative in one cohort, can also be used to distinguish dietary groups in independent cohorts. The first part of the Results and Discussion section shows that the markers of female rats are sufficient to allow discrimination of the groups by HCA and PCA. The second portion shows that the initially identified male serotype is also capable of distinguishing the groups with HCA and PCA. Subsequent analyses identified additional metabolites that could be used to establish a more powerful data set for future studies.

    Validation of female serotype. The companion paper (28) identified 63 variables that were biologically (across cohorts) and analytically robust and that were sufficient to discern dietary group in cohort 1 using classification algorithms. The previous study (on cohort 1) was carried out using two basic scaling techniques, range scaling and autoscaling, and three grouping algorithms, single, complete and centroid. Each of the six combinations of preprocessing and grouping algorithms yielded 100% accuracy with cohort 1. As noted in that report, however, autoscale preprocessing involves mean-centering and variance-scaling the data, and is a generally robust technique. In contrast, in range scale preprocessing, which is commonly used for graphing, the highest value in the dataset is assigned a value of 1 and the lowest a value of 0. Remaining values are determined by proportion. This method is highly sensitive to outliers and has limited utility in subsequent cohorts because the overall range in a subsequent cohort may change. For this reason, we will not pursue range scale preprocessing further, and will focus on autoscale-based analysis. Each of the three grouping methods (single, centroid, and complete) correctly identified the dietary group of origin of 100% of the cohort 1 samples, but the complete grouping algorithm gave the best separation in cohort 1. Therefore, our initial validation test (in cohort 2) will use the complete grouping algorithm.
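The contrast between the two preprocessing methods can be illustrated with a minimal sketch. It is not the Pirouette implementation: the function names and the example concentrations are hypothetical, and the use of the sample standard deviation (n - 1 denominator) for autoscaling is an assumption.

```python
from statistics import mean, stdev

def autoscale(values):
    """Mean-center, then scale to unit sample standard deviation (assumed n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def range_scale(values):
    """Assign 0 to the lowest value and 1 to the highest; the rest by proportion."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical concentrations of one metabolite; the last value is an outlier.
peak = [2.0, 4.0, 6.0, 8.0, 40.0]

ranged = range_scale(peak)   # the four ordinary values are squeezed near 0
auto = autoscale(peak)       # mean 0 and unit standard deviation by construction
```

In this toy case the single outlier fixes the upper end of the range, so the four ordinary values occupy less than one sixth of the 0 to 1 scale, which is the outlier sensitivity described above.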

    HCA with the 63 previously identified metabolites distinguished AL and DR rats with 94% accuracy. HCA was used to determine whether the metabolites previously shown to identify the dietary group of origin of a given rat in one cohort retained this ability in an independent cohort. Using autoscale preprocessing and the complete grouping algorithm, HCA was 94% accurate at distinguishing the diet group of the 16 rats in cohort 2. Analyses using single and centroid grouping algorithms are also shown for comparison (Fig. 1).
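As a rough illustration of how the single and complete grouping algorithms differ, the following sketch performs agglomerative clustering on a few points. It is a simplified stand-in, not the Pirouette procedure: Euclidean distance is assumed, the centroid method is omitted, merging stops at two clusters rather than building a full dendrogram, and the six two-metabolite "rats" are invented values.

```python
from math import dist

def agglomerate(points, linkage="complete", k=2):
    """Merge the closest pair of clusters until k clusters remain.

    complete linkage: cluster distance = farthest pair of members
    single linkage:   cluster distance = nearest pair of members
    """
    pick = max if linkage == "complete" else min
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        return pick(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        x, y = min(pairs, key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Hypothetical autoscaled values of two metabolites: rats 0-2 AL, rats 3-5 DR.
rats = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
        (-1.0, -1.0), (-1.1, -0.8), (-0.9, -1.2)]

complete_groups = agglomerate(rats, "complete")
single_groups = agglomerate(rats, "single")
```

With groups this well separated both linkages recover the same two clusters; the methods diverge only when the groups overlap or contain stragglers.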



 
Figure 1. Hierarchical cluster analysis (HCA) distinguishes ad libitum consumption (AL) and dietary restricted (DR) serotypes in female cohort 2. Dendrogram of analysis of the sera from 16 6-mo-old AL and DR female Fischer 344 x Brown Norway F1 rats based on manually confirmed quantitation of 63 metabolites previously identified as potential markers. Three independent analyses were conducted as described in the text. Relative similarity within the total study population increases as one moves from right (0.0) to left (1.0, biochemical identity) on the horizontal axis. The most distinct groups are linked at "0" similarity by Pirouette. Heavy horizontal line added for emphasis. Shaded samples were misclassified. Accuracy is defined as {16 - [(number of DR in AL set) + (number of AL in DR set)]}/16 and is expressed as a percent.
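The accuracy definition in the legend can be written as a small function. This is only a sketch: the function name is hypothetical, and the misclassification counts passed in below are assumptions chosen to be consistent with the percentages reported in the text, not counts read from the dendrograms.

```python
def classification_accuracy(n_total, dr_in_al, al_in_dr):
    """Percent of samples placed in the cluster matching their dietary group."""
    return 100.0 * (n_total - (dr_in_al + al_in_dr)) / n_total

# One misplaced rat out of 16 (assumed) gives 93.75%, i.e. the reported 94%.
female_cohort_2 = classification_accuracy(16, 1, 0)

# One misplaced rat out of 11 (assumed) gives about 90.9%, i.e. 91%.
female_cohort_3 = classification_accuracy(11, 1, 0)
```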

 
    PCA of the DR serotype. To further evaluate the ability of the variables being studied to encode sufficient information to identify their group of origin, we examined this dataset using PCA (also called eigenvector analysis). The data presented in panel A of Figure 2 show that there were no outliers in the dataset from cohort 2. The two-dimensional projection of a plot of the first three principal components (Fig. 2B) shows that a planar separation of the two dietary groups existed. Figure 2C shows that the principal components identified were, as expected, weaker than in the initial study [>30% in Principal Component 1 vs. >50% initially (28)], but nearly 60% of the variability was still captured within the first two principal components. We further addressed the ability of PCA to capture the analyte by analyte variability present by looking at modeling power. Modeling power analysis revealed that 10 components captured nearly 60% of the variation of nearly 60% of the analytes (Mean modeling power ± SD = 59 ± 15%; Median modeling power 59%).
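The "percentage of variability captured" by a principal component is its eigenvalue divided by the total variance. A minimal sketch, assuming autoscaled data and using power iteration in place of whatever eigen-solver Pirouette employs, is shown here; the function names and the five-rat, three-metabolite table are hypothetical.

```python
from statistics import mean, stdev

def autoscale_columns(rows):
    """Autoscale each column (metabolite): subtract its mean, divide by its sample SD."""
    columns = list(zip(*rows))
    scaled = [[(v - mean(c)) / stdev(c) for v in c] for c in columns]
    return [list(r) for r in zip(*scaled)]

def covariance(rows):
    """Covariance matrix of mean-centered rows (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix, found by power iteration."""
    p = len(matrix)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv))

# Hypothetical table: rows are rats, columns are metabolites.
# Metabolites 1 and 2 vary together; metabolite 3 is only weakly related.
rats = [[1, 2, 5], [2, 4, 3], [3, 6, 6], [4, 8, 2], [5, 10, 4]]

cov = covariance(autoscale_columns(rats))
total_variance = sum(cov[i][i] for i in range(len(cov)))   # equals 3 after autoscaling
pc1_fraction = leading_eigenvalue(cov) / total_variance    # share captured by component 1
```

Because two of the three invented metabolites are perfectly correlated, the first component captures roughly 70% of the variance here; weaker correlation among metabolites spreads the variance over more components.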



 
Figure 2. Principal component analysis (PCA) studies reveal that most variability present in the dataset of female cohort 2 can be captured using the previously defined 63 metabolites. PCA of the sera from 16 6-mo-old ad libitum consumption (AL) and dietary restricted (DR) female Fischer 344 x Brown Norway F1 rats based on manually confirmed quantitation of 63 metabolites previously identified as potential markers. Autoscale preprocessing. Total number of components used was set at 10. (A) Multivariate outlier analysis supports validity of the test dataset. Each point represents an individual rat. Outliers would be to the right of the dark vertical bar or above the dark horizontal bar. Analysis was conducted using the PCA package in Pirouette. (B) The graphic shown is a two-dimensional projection of three principal component axes. The three axes are marked Factor 1, Factor 2 and Factor 3 and refer to Principal Components 1, 2 and 3, respectively. Rotation of the axes was carried out to highlight group separation and should not be used to quantitate degree of separation. Each mark (solid squares, DR; open circles, AL) represents an individual rat. Double dotted line added to emphasize split between groups, regions occupied by AL or DR areas as noted. (C) Component analysis. Percent refers to the percentage of variability captured by the components. Thin line, specific component; thick line, cumulative. Only first four components shown for clarity. (D) Modeling power, or the ability of the components to accurately describe each metabolite in the dataset, is described as (1 - [metabolite residual variance/total metabolite variance]). Therefore, as the components accurately capture the variation present in a given metabolite, the second term approaches zero and modeling power approaches 1. Thus, metabolite variation across the different rats in a cohort is better captured as modeling power approaches 1, and more weakly captured as numbers approach 0. Data are presented in order of increasing modeling power.
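The modeling power expression in panel D can be sketched as follows. The function name and example values are hypothetical, and taking the residual variance as the mean squared difference between observed and PCA-reconstructed values is an assumption; Pirouette's exact degrees-of-freedom handling may differ.

```python
from statistics import pvariance

def modeling_power(observed, reconstructed):
    """1 - (residual variance / total variance) for one metabolite across rats."""
    residuals = [o - r for o, r in zip(observed, reconstructed)]
    residual_variance = sum(e * e for e in residuals) / len(residuals)
    return 1.0 - residual_variance / pvariance(observed)

# Hypothetical autoscaled levels of one metabolite in four rats, and the
# values rebuilt from a limited number of principal components.
observed = [-1.5, -0.5, 0.5, 1.5]
rebuilt = [-1.2, -0.6, 0.4, 1.4]

power = modeling_power(observed, rebuilt)          # close to 1: well captured
perfect = modeling_power(observed, observed)       # exactly 1
mean_only = modeling_power(observed, [0.0] * 4)    # 0: components capture nothing
```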

 
    Eliminating metabolites based on analytical issues did not weaken the dataset. The data presented in Figures 1 and 2 reveal that it was possible to use all 63 metabolites to distinguish dietary groups using predefined algorithms in an independent cohort. We then sought to further improve our dataset.

As described in the companion paper (28), our initial strategy was chosen to reduce Type II statistical errors (believing that the serum concentration of a given metabolite does not differ between AL and DR when in reality it does) at the expense of making an increased number of Type I statistical errors (believing that the serum concentration of a given analyte does differ between AL and DR when in reality it does not). We expect that many of the metabolites initially identified represent statistical noise (artifacts). Because removing noninformative data can help classification algorithms work more efficiently, we again pared the dataset by keeping only those metabolites that reached a significance of P ≤ 0.2. Overall, t test analysis determined that 37 of the 63 metabolites tested differed between AL and DR rats at P ≤ 0.2. Note that only 1 in 25 metabolites would be expected to be significant in both P ≤ 0.2 tests by chance alone, suggesting that the majority of metabolites in this 37 metabolite dataset displayed a real statistical difference (i.e., P < 0.05) between AL and DR rats.
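The "1 in 25" figure follows from multiplying the two thresholds, which can be checked with exact fractions. The sketch assumes that the two tests (one per cohort) are independent and that a metabolite with no true difference passes each with probability 0.2; the variable names are illustrative.

```python
from fractions import Fraction

p_threshold = Fraction(1, 5)       # P <= 0.2 in each of the two cohorts
p_both = p_threshold ** 2          # chance of passing both tests with no true effect

# Expected number of the 63 metabolites passing both tests by chance alone.
expected_by_chance = 63 * p_both
```

Under these assumptions about 2.5 of the 63 metabolites would pass both screens by chance, compared with the 37 that actually did.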

To determine whether the dataset remaining (37 metabolites) retained the properties of interest after this winnowing, we repeated the multivariate analyses described above on the same cohort used above. HCA showed that discrimination between dietary groups was retained in the smaller dataset, and, indeed, slightly improved (compare "single" in Figs. 1 and 3, and see below). In addition, empirical analysis suggested that the misclassification seen in the complete analysis resulted from a single metabolite because upon the removal of this analyte, 100% accuracy was obtained (Fig. 3, Panel B). PCA showed that there were no outliers in the 36-metabolite dataset (Fig. 4A). The two-dimensional projection of a plot of the first three principal components shows that a planar separation of the two dietary groups existed (Fig. 4B). Figure 4C shows that the primary principal component was better able to capture analyte variability after we reduced the metabolite number (~45% in Principal Component 1 vs. <30% initially). Modeling power was also enhanced (Mean modeling power ± SD = 70 ± 12%; Median modeling power 72%). These data suggest that we can clean the dataset of biologically or statistically "noisy" metabolites without losing any of the discriminating power of the metabolome.



 
Figure 3. Hierarchical cluster analysis (HCA) using a smaller dataset distinguishes AL and DR serotypes in female cohort 2. Dendrogram of analysis of the sera from 16 6-mo-old ad libitum consumption (AL) and dietary restricted (DR) female Fischer 344 x Brown Norway F1 rats based on manually confirmed analysis of serum metabolites identified as potential markers (Panel A, 37 metabolites; Panel B, 36 metabolites). Three independent analyses were conducted as described in the text. Details as described in the legend to Figure 1.

 


 
Figure 4. Principal component analysis (PCA) studies reveal that most variability present in the dataset can be captured using 36 metabolites. Details as in the legend to Figure 2.

 
The discriminating power of the 63- and 37-peak datasets was then compared directly by determining the distance from the origin (extreme right-hand side of the dendrogram) to the first split of both the AL and DR sides of the dendrogram. Complete analysis of the 63-peak dataset (Fig. 1) yielded a separation of 0.38 (0.244 + 0.136 = 0.380) with 94% accuracy. Complete analysis of the 37-peak dataset (Fig. 3A) yielded a separation of 0.55 (0.330 + 0.218 = 0.548) with 94% accuracy. Complete analysis of the 36-peak dataset (Fig. 3B) yielded a separation of 0.54 (0.198 + 0.337 = 0.535) with 100% accuracy. These data suggest that cleaning the dataset of variables that appear noisy within that dataset may indeed contribute to an apparent increase in separation between the groups. Tests of the effect of removing these variables on the ability to discern diet in independent cohorts will be examined below.
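The separation measure used above is simply the sum of the two origin-to-first-split distances. A short sketch with the branch distances quoted in the text reproduces the totals; the function name is hypothetical, and decimal arithmetic is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

def separation(first_branch, second_branch):
    """Sum of the distances from the dendrogram origin to the first split on each side."""
    return Decimal(first_branch) + Decimal(second_branch)

sep_63 = separation("0.244", "0.136")   # 63-peak dataset, complete grouping
sep_37 = separation("0.330", "0.218")   # 37-peak dataset, complete grouping
sep_36 = separation("0.198", "0.337")   # 36-peak dataset, complete grouping
```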

In addition, the ability of the 26 metabolites removed to distinguish AL and DR groups was tested. As shown in Figure 5, clusters built on these 26 metabolites had only 60–63% accuracy in distinguishing AL from DR (note that 50% is the floor for accuracy given the nature of the analysis). We also further pursued these analyses using PCA. PCA was also unable to distinguish AL and DR (data not shown). The inability of PCA to distinguish AL and DR can be appreciated by examining the series of 2-D plots shown in Figure 6, which show the two-dimensional representations of components 1–5 against each other. Together these data indicate that the metabolites removed from this analysis did not contribute significant signal to the discrimination of cohort 2 using the 63-metabolite dataset.



 
Figure 5. The 26 metabolites removed from the 63-metabolite profile failed to distinguish dietary group in the female cohort 2. Dendrogram of analysis of the sera from 16 6-mo-old ad libitum consumption (AL) and dietary restricted (DR) female Fischer 344 x Brown Norway F1 rats based on manually confirmed analysis of 26 serum metabolites removed from the 63-metabolite profile as described in the text. Analyses were conducted using the single, centroid and complete grouping algorithms following autoscale preprocessing. The sample designated AL12 is excluded from the determination of the overall accuracy in the centroid and single analyses because it is not grouped with the major groups. Other details as described in the legend to Figure 1.

 


 
Figure 6. Two-dimensional plots of components 1–5 using the 26 eliminated metabolites failed to distinguish ad libitum consumption (AL) and dietary restricted (DR) samples. Principal component analysis (PCA) of the sera from 16 6-mo-old AL and DR female Fischer 344 x Brown Norway F1 rats based on manually confirmed analysis of 26 serum metabolites removed from the 63-metabolite profile as described in the text. The graphic shown is a series of two-dimensional plots of the first five component axes (denoted Component 1–5). Each mark (solid squares, DR; open circles, AL) represents an individual rat.

 
    Cohort 3 studies: "eliminated" metabolites can, however, contribute to separation mediated by the 37-metabolite set in other cohorts. The hypothesis that the 37-metabolite subset of the 63-metabolite profile contained equal discriminating capacity across cohorts was then tested directly. As shown in Figure 7A, the 63-metabolite profile was able to distinguish dietary group using the previously defined approach in a third independent cohort (cohort 3) with 91% accuracy using HCA (Autoscale, complete). In contrast to the efficacy of HCA observed with the 63-metabolite dataset, HCA (Autoscale, complete) was unable to distinguish AL and DR rats using the 37-metabolite dataset (63% accuracy, Fig. 7B). These data suggest two points: 1) at least one peak among the 26 cut during the previously described analysis does contribute to distinguishing AL and DR sera under some conditions; we therefore elected to continue using 63 peaks for the immediate future. 2) HCA, at least that attainable using the complete grouping method, lacks sufficient versatility/robustness to accomplish our aims.



 
Figure 7. Hierarchical cluster analysis (HCA) using complete analysis of the female cohort 3 reveals a loss of information in the 26 removed metabolites. Analysis of a third independent cohort using either the 63- (panel A) or 37-metabolite (panel B) profile. Dendrogram of analysis of the sera from 11 6-mo-old ad libitum consumption (AL) and dietary restricted (DR) female Fischer 344 x Brown Norway F1 rats based on manually confirmed analysis of the 63 (panel A) or 37 (panel B) serum metabolites. Analyses were conducted using autoscale preprocessing followed by complete grouping. Details as described in the legend to Figure 1.

 
In contrast to HCA, the previously defined PCA-based approach readily distinguished the two dietary groups in cohort 3 (Fig. 8). These data suggest the possibility that HCA-based approaches are too weak overall to perform the necessary grouping. Another possibility was that the problem lay in the grouping algorithm used for our HCA validation test (complete). As shown in Figure 9, single- and centroid-based analyses were sufficient to give 100% accuracy in those samples grouped, but were unable to assign groups to 18% (2 of 11) of the samples. Incremental analysis was able to group the samples with 91% accuracy. Thus, alternative grouping algorithms were better than the complete algorithm, but HCA still did not reach the accuracy and power offered by PCA. These exploratory approaches are consistent with the interpretation that PCA-based analyses are more powerful in addressing the grouping problem with our datasets. We are currently extending this approach to n-dimensional analysis based on pattern recognition–driven categorization algorithms and supervised expert systems.



 
Figure 8. Principal component analysis (PCA) studies reveal that component-based analyses can offer 100% discrimination between ad libitum consumption (AL) and dietary restricted (DR) sera using either the 37- or 63-metabolite subset. PCA of the sera from 11 6-mo-old AL and DR female Fischer 344 x Brown Norway F1 rats based on manually confirmed quantitation of 63 or 37 metabolites previously identified as potential markers. Autoscale preprocessing. Details as in the legend to Figure 2.

 


 
Figure 9. Hierarchical cluster analysis (HCA) showing that different clustering algorithms more effectively distinguish ad libitum consumption (AL) and dietary restricted (DR) sera using the 37-metabolite profile. Samples, rats and legend as in Figure 7, panel B, but grouping algorithms as shown.

 
Validation of male serotype

A 52-metabolite data set was best in identifying the dietary groups in the first male cohort, but we analyzed all 76 metabolites in the second male cohort to avoid losing potentially valuable variables.5 After preliminary automated analysis of the sera from male cohort 2, each of the 76 peaks (metabolites) in the HPLC chromatograms of the sera from this cohort was inspected manually. Although the initial study had addressed analytical and biological validation including cohort-cohort instability [companion paper (28)], the closer examination now conducted determined that 20 of the 76 peaks were not sufficiently analytically and biologically robust to be included in the study. The levels of these peaks displayed inconsistencies in intragroup DR and AL samples. Of the remaining 56 metabolites, 42 were among the 52 defined previously and 14 were not included in that set of 52, which had best distinguished the sera in cohort 1. The levels of the remaining 56 metabolites were confirmed manually.

    HCA and PCA analyses with the 56-metabolite dataset distinguished AL and DR samples in male cohort 2 with >80% accuracy. PCA and HCA analyses were carried out to determine whether the 56-metabolite data set retained the ability to identify dietary group of origin. Incremental analysis with Autoscale preprocessing was chosen as the test case. The results show that HCA with the 56-metabolite data set separated the dietary groups with 87% accuracy, and the remaining three grouping algorithms (shown for comparison) also separated at >80% accuracy (Fig. 10). With the 56-metabolite data set, PCA showed that there were no outliers in the 15 samples analyzed (Fig. 11A). Despite relatively weak modeling power (see below), the first three components were able to distinguish AL and DR sera with 87% efficacy (Fig. 11B). The first 2–3 mathematical factors (components) captured 50% of the total variability present, and 6–7 components captured 80% of the variability present (Fig. 11C). Modeling power analysis revealed that even 10 components captured only about one third of the variation present in the dataset (Mean modeling power ± SD = 33 ± 18%; Median modeling power 34%, Fig. 11D). A two-dimensional multiplot showed that there were no two components that could efficiently separate the groups (Fig. 12). Excluding metabolite #671 (which was found to have a very high X-residual value) did not improve the separations (HCA and PCA analyses, data not shown).



 
Figure 10. Hierarchical cluster analysis (HCA) distinguishes ad libitum consumption (AL) and dietary restricted (DR) serotypes in the male second cohort. Dendrogram of analysis of the sera from 15 6-mo-old AL and DR male Fischer 344 x Brown Norway F1 rats based on manually confirmed analysis of 56 serum metabolites identified as potential markers. Four independent analyses were conducted as described in the text. Details as in the legend to Figure 1.

 


 
Figure 11. Principal component analysis (PCA) with the 56 metabolites in male cohort 2. PCA of the sera from 15 6-mo-old ad libitum consumption (AL) and dietary restricted (DR) male Fischer 344 x Brown Norway F1 rats based on manually confirmed quantitation of 56 metabolites previously identified as potential markers. Autoscale preprocessing. Details as in the legend to Figure 2.

 


 
Figure 12. Two-dimensional multiplot of principal components fails to readily distinguish ad libitum consumption (AL) and dietary restricted (DR) sera. Principal component analysis (PCA) with 56 serum metabolites identified as potential markers in the sera from 15 6-mo-old AL and DR male rats. Autoscale was used as preprocessing algorithm. Dot, AL samples; triangle, DR samples.

 
    HCA and PCA analyses with a 20-metabolite dataset. Having validated the test set and conditions using the independent cohort 2, we then sought to optimize the dataset so as to enable more effective classification. To do this, we identified a subset of metabolites that had P < 0.2 (20 of 56, Table 1). Using this 20-metabolite dataset, HCA distinguished the samples at 66.7, 86.7, 80 and 80% efficacy with the single, centroid, complete and incremental grouping algorithms, respectively (Fig. 13). Thus, this shorter dataset had minimal effect on the efficacy of HCA.


 
TABLE 1 Two tailed, unpaired t test on the 56 metabolites confirmed by manual inspection in the male cohort 2

 


 
Figure 13. Hierarchical cluster analysis (HCA) using 20 "robust" metabolites. Dendrograms of HCA analyses with 20 serum metabolites screened by different grouping algorithms. Four independent analyses were conducted. Details as in the legend to Figure 1.

 
In contrast to HCA, separation based on PCA could distinguish dietary groups of origin with 100% accuracy with the 20-metabolite data set (Fig. 14). Analysis again showed that there were no outliers in the 15 samples analyzed (Fig. 14A). Figure 14B shows that 7 AL and 8 DR rats were distinguished by 3 components in PCA at 100% accuracy. Two mathematical factors (components) captured >50% of the total variability present (compared with 2–3 components in the 56-metabolite data set), and 5–6 components captured 80% of the variability present, compared with 6–7 components in the 56-metabolite data set (Fig. 14C). Modeling power analysis revealed substantial improvement (mean modeling power ± SD = 72 ± 16%; median modeling power 75%, Fig. 14D). Figure 15 shows a two-dimensional multiplot of five components. Principal component 1 could be used to isolate AL samples from DR samples at >90% accuracy in conjunction with components 2, 3, 4 or 5. This separation differed from that obtained using the 56-metabolite dataset (Fig. 12), in which DR and AL samples could not be distinguished using any two components. The above results indicate that a subgroup of metabolites in the 76-metabolite data set identified in the primary study retained sufficient information to distinguish dietary group of origin by classification algorithms.
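The statement that a given number of components "captured" a share of the variability refers to the cumulative fraction of variance explained. A minimal sketch of that bookkeeping follows; the variance fractions are hypothetical values chosen only to resemble the pattern described above and are not taken from the study.

```python
from itertools import accumulate

def components_needed(variance_ratios, threshold):
    """Smallest number of leading components whose cumulative explained
    variance reaches the threshold; None if the listed components never do."""
    for k, total in enumerate(accumulate(variance_ratios), start=1):
        if total >= threshold:
            return k
    return None

# Hypothetical explained-variance fractions, NOT values from the study.
ratios = [0.34, 0.19, 0.11, 0.08, 0.06, 0.05, 0.04, 0.03]

half = components_needed(ratios, 0.5)   # components needed to pass 50%
most = components_needed(ratios, 0.8)   # components needed to reach 80%
```

Fewer components for the same cumulative share means the variability is concentrated in fewer factors, which is the sense in which the 20-metabolite data set is described as more compact than the 56-metabolite one.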



 
Figure 14. Principal component analysis (PCA) using 20 "robust" metabolites improves ad libitum consumption (AL) and dietary restricted (DR) separations. Autoscale was used as preprocessing algorithm. Details as in legend to Figure 2.

 


 
Figure 15. Two-dimensional multiplot of principal components based on 20 "robust" metabolites readily distinguishes ad libitum consumption (AL) and dietary restricted (DR) sera. PCA with 20 serum metabolites identified as potential markers in the sera from 15 6-mo-old AL and DR male rats. Autoscale was used as preprocessing algorithm. Dot, AL samples; triangle, DR samples.

 
    Identification of additional metabolites with utility in distinguishing dietary groups. To avoid missing valuable metabolites, we examined analytically valid metabolites other than the 76 in the male cohort 2. Two hundred ninety-seven metabolites had been identified as analytically valid in rat serum using an HPLC system in our previous study (14). We reevaluated the other 221 metabolites and identified 9 that were able to help differentiate the dietary groups in the male cohort 2. These 9 metabolites had P ≤ 0.2.
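A screen of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: it assumes an equal-variance (pooled) unpaired t test, it assumes group sizes of 7 AL and 8 DR animals (13 degrees of freedom), and it replaces an explicit P value with the tabulated two-tailed critical value of t for P = 0.2 at 13 degrees of freedom (1.350). The metabolite names and concentrations are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t for P = 0.2 at 13 degrees of freedom (standard t table).
T_CRIT_DF13_P02 = 1.350

def pooled_t(a, b):
    """Student's unpaired t statistic with pooled variance (assumed test form)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

def screen(al, dr, t_crit=T_CRIT_DF13_P02):
    """Names of metabolites whose |t| meets the critical value, i.e. P <= 0.2."""
    return [name for name in al if abs(pooled_t(al[name], dr[name])) >= t_crit]

# Hypothetical concentrations for two made-up metabolites (7 AL, 8 DR animals).
al = {"m1": [10.0, 11.0, 12.0, 10.0, 11.0, 12.0, 11.0],
      "m2": [5.0, 6.0, 7.0, 5.0, 6.0, 7.0, 6.0]}
dr = {"m1": [14.0, 15.0, 16.0, 14.0, 15.0, 16.0, 15.0, 15.0],
      "m2": [6.0, 5.0, 7.0, 6.0, 5.0, 7.0, 6.0, 6.0]}

kept = screen(al, dr)  # m1 differs between diets, m2 does not
```

The critical value must be changed if the group sizes differ from those assumed here, because it depends on the degrees of freedom.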

In studies utilizing HCA, the inclusion of these 9 metabolites improved the ability of the 20-metabolite dataset to discern diet group of origin. The combined 29-metabolite dataset enabled HCA to distinguish the dietary group of origin with 100% accuracy using the incremental grouping algorithm (data not shown). PCA with these 29 metabolites separated the cohort at 100% accuracy (data not shown).

Back-testing analyses were then carried out to examine whether these 9 additional metabolites affected the separations previously obtained in the other cohorts. We quantified the 9 metabolites in the samples of the first male cohort and carried out analyses with a data set combining the 9 metabolites and the original data set. When the 9 metabolites were added to the subset of 52 metabolites in the first cohort, HCA grouped samples at 100% accuracy with the incremental grouping algorithm, as it had with the original 52-metabolite subset. However, the 9 metabolites affected the grouping accuracy of the 52 metabolites with the single, centroid and complete grouping algorithms. X-residuals from PCA based on the 61 (9 + 52) metabolites were less than those based on the original 52 metabolites (i.e., metabolite variability was better explained by the same number of components). The PCA based on the 61 (9 + 52) metabolites distinguished the two dietary groups of the first cohort at 100% accuracy, as did the original 52 metabolites. This indicates that use of the 9 additional metabolites did not weaken the power of the data set to discriminate, provided that more powerful classification algorithms were used. Because more powerful classification algorithms will be utilized in our next studies, we will continue to evaluate these 9 new metabolites there.

PCA vs. HCA

As noted at several points in this and the companion paper (28), PCA consistently outperformed HCA in enabling us to distinguish categories. At first consideration, this appears surprising because HCA is specifically designed to distinguish classes, whereas PCA is designed to simplify datasets by collapsing variables into synthetic mathematical factors termed components. One possibility, which we currently consider the most likely, is that the data reflected in the primary principal components had an increased signal:noise ratio relative to that observed in the raw data. Alternatively stated, this implies that higher-numbered (i.e., later) principal components would consist primarily of noise. Another possibility is that the third "representational" dimension offered by rotation gives us additional viewing flexibility that increases separation. Two points argue against this being the major reason why PCA seems more effective. The first is that HCA works in multidimensional space, and therefore should not be hindered by the limitation of our visualization. The second is that two-dimensional visualizations of the PCA data also appeared better than the HCA data (e.g., compare Figs. 13 and 15). In addition, the possibility that the increased efficacy offered by PCA is "artifactual" would suggest that supervised cluster-based analysis (e.g., KNN or K-nearest neighbor) would be as efficacious as supervised PCA-based analysis (e.g., SIMCA, Soft Independent Modeling of Class Analogy) in distinguishing classes. Our preliminary evidence suggests that this is not true (unpublished data). Thus, overall, our evidence suggests that PCA-based data simplification is eliminating an aspect of biological noise in the metabolome that we have not yet recognized. We expect to continue to examine this issue as we move to larger datasets.
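The signal:noise argument can be illustrated with a minimal sketch in which two hypothetical metabolites vary with diet and a third varies without regard to diet. After autoscaling, the first principal component is estimated by power iteration on the covariance matrix. Power iteration is used here only to keep the example dependency-free and is an assumption of this sketch, not the algorithm of the software used in the study; all values are made up.

```python
from math import sqrt
from statistics import mean, stdev

def autoscale(rows):
    """Mean-center each column and scale it to unit sample SD."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]

def first_component(rows, iterations=200):
    """Leading loading vector of the covariance matrix, by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] + [0.0] * (p - 1)  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def scores(rows, loading):
    """Project each sample onto the loading vector."""
    return [sum(x * l for x, l in zip(r, loading)) for r in rows]

# Hypothetical data: columns 1 and 2 track diet, column 3 does not.
samples = [[1.0, 2.0, 5.0], [1.2, 1.9, 5.3], [0.9, 2.2, 4.8], [1.1, 2.1, 5.1],  # AL
           [3.0, 0.5, 5.2], [3.2, 0.4, 4.9], [2.9, 0.7, 5.0], [3.1, 0.6, 5.1]]  # DR

scaled = autoscale(samples)
loading = first_component(scaled)
pc1 = scores(scaled, loading)  # first four entries are AL, last four are DR
```

In this toy case the diet-independent column receives only a small loading on the first component, so the AL and DR scores fall on opposite sides of zero. A distance computed on all three autoscaled columns, as in HCA, weights the uninformative column equally with the other two.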

In conclusion, the major finding of this paper is that previously identified components of the serum metabolome robustly retain sufficient information to distinguish AL and DR female and male rats in independent cohorts. We utilized additional cohorts to further improve the serotypes. For example, after this validation experiment, data from the second male cohort were used to further improve the profile by eliminating noninformative metabolites and identifying additional potentially informative metabolites. Although the 63-metabolite female serotype does retain some statistically noisy metabolites, paring the dataset to 37 appears to discard potentially useful information, and we therefore choose to retain these analytes in our current serotype model at this time. Our study also demonstrated the utility, and arguably the necessity, of applying progressively more powerful grouping algorithms to establish categorical separations. As seen to a lesser extent in the female dataset, PCA is significantly more powerful than HCA in utilizing the data set to distinguish dietary groups of origin. In the analyses currently in progress, we are developing expert system–based approaches to further aid in categorical separations.


    ACKNOWLEDGMENTS
 
We thank Thomas Vogl and Walter Willett for their helpful discussions and contributions to the overall experimental design, and John Blass for comments on the manuscript.


    FOOTNOTES
 
1 Presented in part in oral presentation form at Experimental Biology 2001, March 31-April 4, Orlando, FL [Kristal, B. S., Vigneau-Callahan, K.E., Shi, H., Matson, W. & Milbury, P.E. (2001) Diet-dependent metabolic serotypes. FASEB J. 15: A65 (abs.)].

2 Supported by National Institutes of Health National Institute on Aging R01-AG15354 (B.S.K.), ESA, Incorporated and the Winifred Masterson Burke Relief Foundation.

4 Abbreviations used: AL, ad libitum; DR, dietary restricted; HCA, hierarchical cluster analysis; PCA, principal component analysis.

5 For reasons discussed above, we will use only autoscale preprocessed data during the validation portions of this study. Our initial validation test in HCA will use the incremental grouping algorithm, which gave the best separations in the proof of principle study (28). Data using the single, centroid, and complete grouping algorithms on cohort 2 will be presented for comparison.

Manuscript received 9 June 2001. Initial review completed 25 July 2001. Revision accepted 11 February 2002.


    LITERATURE CITED
 

1. Kristal, B. S. & Yu, B. P. (1994) Aging and its modulation by dietary restriction. In: Modulation of Aging Processes by Dietary Restriction (Yu, B. P., ed.), pp. 1-36. CRC Press, Boca Raton, FL.

2. Weindruch, R. & Walford, R. (1988) The Retardation of Aging and Disease by Dietary Restriction. Charles C. Thomas, St. Louis, MO.

3. McCay, C. M. (1935) Cellulose in the diets of rats and mice. J. Nutr. :435-447.

4. Maeda, H., Gleister, C. A., Masoro, E. J., Murata, I., McMahan, C. A. & Yu, B. P. (1985) Nutritional influences on aging of Fischer 344 rats: II Pathology. J. Gerontol. 40:671-688.

5. Carlson, A. J. & Hoelzel, F. (1946) Apparent prolongation of the life span of rats by intermittent fasting. J. Nutr. 31:363-375.

6. Goodrick, C. L., Ingram, D. K., Reynolds, M. A., Freeman, J. R. & Cider, N. (1990) Effects of intermittent feeding upon body weight and lifespan in inbred mice: interaction of genotype and age. Mech. Age. Dev. :55.

7. Ross, M. H. & Bras, G. (1970) Food preference and length of life. Science (Washington, DC) 190:165-167.

8. Weindruch, R. H. & Walford, R. L. (2000) Dietary restriction in mice beginning at one year of age: Effects on life-span and spontaneous cancer incidence. Science (Washington, DC) 215:1415-1418.

9. Nolen, G. A. (1972) Effect of various restricted dietary regimens on the growth, health, and longevity of albino rats. J. Nutr. 102:1477-1494.

10. Tannenbaum, A. (1945) The dependence of tumour formation on the composition of the calorie-restricted diet as well as on the degree of restriction. Cancer Res. 5:616-625.

11. Stucklikova, E., Juricova-Horakova, E. M. & Deyl, Z. (1975) New aspects of the dietary effect of life prolongation in rodents: what is the role of obesity in aging. Exp. Gerontol. 10:141-144.

12. Yu, B. P. (1996) Aging and oxidative stress: modulation by dietary restriction. Free Radic. Biol. Med. 21:651-668.

13. Willett, W. C., Dietz, W. H. & Colditz, G. A. (1999) Guidelines for healthy weight. N. Engl. J. Med. 341:427-434.

14. Vigneau-Callahan, K. E., Shestopalov, A. I., Milbury, P. E., Matson, W. R. & Kristal, B. S. (2001) Characterization of Diet-Dependent Metabolic Serotypes: Analytical and Biological Variability Issues in Rats. J. Nutr. 131:924S-932S.

15. Iwasaki, K., Gleister, C. A., Masoro, E. J., McMahan, C. A. & Yu, B. P. (1988) The influence of dietary protein source on longevity and age-related disease processes of Fischer rats. J. Gerontol. 43:B5-B12.

16. Iwasaki, K., Gleister, C. A., Masoro, E. J., McMahan, C. A., Seo, E.-J. & Yu, B. P. (1988) Influence of the restriction of individual dietary components on longevity and age-related disease of Fischer rats: the fat component and the mineral component. J. Gerontol. 43:B13-B21.

17. McCay, C. M., Crowell, M. F. & Maynard, L. A. (1935) The effect of retarded growth upon the length of lifespan and upon the ultimate body size. J. Nutr. 10:63-79.

18. Kristal, B. S., Vigneau-Callahan, K. E. & Matson, W. R. (1998) Simultaneous analysis of the majority of low-molecular-weight, redox-active compounds from mitochondria. Anal. Biochem. 263:18-25.

19. Beal, M. F., Matson, W. R., Swartz, K. J., Gamache, P. H. & Bird, E. D. (1990) Kynurenine pathway measurements in Huntington’s disease striatum; evidence for reduced formation of kynurenic acid. J. Neurochem. 55:1327-1339.

20. Matson, W. R., Gamache, P. H., Beal, M. F. & Bird, E. D. (1987) EC array sensor concepts and data. Life Sci. 41:905-908.

21. Matson, W. R., Bouckoms, A., Svendson, C., Beal, M. F. & Bird, E. D. (1990) Generating and controlling multiparameter databases for biochemical correlates of disorders. In: Basic, Clinical and Therapeutic Aspects of Alzheimer’s and Parkinson’s Diseases, vol. II, pp. 513-516. Plenum, New York, NY.

22. LeWitt, P. A., Galloway, M. P., Matson, W. R., Milbury, P. M., McDermott, M. & Srivastava, D. K. (1992) Markers of dopamine metabolism in Parkinson’s disease: The Parkinson’s Study Group. Neurology 42:2111-2117.

23. Ogawa, T., Matson, W. R., Beal, M. F., Myers, R. H., Bird, E. D., Milbury, P. E. & Saso, S. (1992) Kynurenine pathway abnormalities in Parkinson’s disease. Neurology 42:1702-1706.

24. Beal, M. F., Matson, W. R., Storey, E., Milbury, P. E., Ryan, E. A., Ogawa, T. & Bird, E. D. (1992) Kynurenic acid concentrations are reduced in Huntington’s disease cerebral cortex. J. Neurol. Sci. 108:80-87.

25. Matson, W. R., Langials, P., Volicer, L., Gamache, P. H., Bird, E. D. & Mark, K. A. (1984) n-Electrode three dimensional liquid chromatography with electrochemical detection for determination of neurotransmitters. Clin. Chem. 30:1477-1488.

26. Milbury, P. E. (1997) CEAS generation of large multiparameter databases for determining categorical process involvement of biomolecules. In: Coulometric Array Detectors for HPLC, pp. 125-141. VSP International Science Publication, Zeist, The Netherlands.

27. Acworth, I. N. & Gamache, P. H. (1996) The coulometric electrode array for use in HPLC analysis, part 1: theory. Am. Lab. 5:33-37.

28. Shi, H., Vigneau-Callahan, K. E., Shestopalov, A. I., Milbury, P. E., Matson, W. R. & Kristal, B. S. (2002) Characterization of diet-dependent metabolic serotypes: II. Proof of principle in female and male rats. J. Nutr. 132:1031-1038.

29. National Research Council (1985) Guide to the Care and Use of Laboratory Animals. Publication no. 85–23 (rev.). National Institutes of Health, Bethesda, MD.




Copyright © 2002 by American Society for Nutrition