Computational protocol: Principal Component Analysis of the Effects of Environmental Enrichment and (-)-epigallocatechin-3-gallate on Age-Associated Learning Deficits in a Mouse Model of Down Syndrome

Protocol publication

[…] Two questions were addressed: the global differences over time and the progression of learning across sessions. The first question was tested by univariate analysis of the differences between experimental groups for three learning-related parameters (latency to reach the platform, Gallagher index, and percentage of time spent in the periphery). Data were expressed as mean ± S.E.M. and analyzed using one-way ANOVA or repeated-measures ANOVA. The second question was evaluated by estimating the linear effect of the time-by-group interaction using a general linear mixed model for each behavioral parameter. We associated random-effects terms with the animal factor in order to model the within-subject correlation that arises from the repeated nature of the data. In addition, the variable “latency” was right-censored, since mice were allowed to swim for a maximum of 60 s (Vock et al., ). Estimation of the coefficients and their associated p-values was based on maximum log-likelihood methods using the R library censReg (Henningsen, ). We used the plot of the model residuals vs. the fitted values to check model assumptions. Post-hoc comparisons were addressed with multiple comparisons for parametric models, using the multtest R package and the glht function (Hand and Taylor, ; Dickhaus, ). Non-treated WT and Ts65Dn mice were considered the reference groups for the comparisons. To control the false discovery rate (FDR) arising from multiple post-hoc comparisons, the Benjamini-Hochberg method was used (Benjamini and Hochberg, ). This procedure was implemented both for the ANOVA and for the linear mixed model in the R package multtest (Pollard et al., ). [...] The “learning” process is composed of many variables; the influence on performance may be great for some, whereas for others it may be so small that they can be ignored. For example, one might start with ten original variables but end with only two or three meaningful axes. This is known as reducing the dimensionality of a data set.
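The Benjamini-Hochberg adjustment mentioned above for the post-hoc comparisons can be illustrated with a minimal sketch. This is plain Python, not the multtest code used in the protocol; the function name `benjamini_hochberg` and the raw p-values are hypothetical and serve only to show the step-up rule.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four post-hoc comparisons (not study data).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)

# A comparison is declared significant when its adjusted p-value is <= 0.05.
significant = [p <= 0.05 for p in adj]
```

Each raw p-value is multiplied by the number of comparisons and divided by its rank. Taking the running minimum from the largest rank downward then ensures that a smaller raw p-value never ends up with a larger adjusted value.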
PCA is the most commonly used technique for identifying the linear combinations of variables in a high-dimensional space that best represent the variance present in the data. This is achieved by considering each variable to be an axis in a high-dimensional space. Individuals, or groups of individuals, can be represented as points in this space. PCA identifies a linear combination of the original variables, called a principal component, that accounts for the largest amount of the experimental variability. Once this first principal component is set, PCA finds successive orthogonal principal components, each explaining the maximum amount of the remaining variance subject to the orthogonality constraint. Finally, the original data and the original variables can be projected into the new space defined by the principal components. In our analysis we were mainly interested in the variation among experimental groups as well as the variation of a given group across the learning sessions. To find the variables best representing these two types of between-group variation (within and between learning sessions), we used the group medians of each variable on each acquisition day. A supervised analysis using group means instead of variables measured on individuals is known as discriminant analysis (cf. Greenacre, ). Such methods are suitable for the analysis of behavioral data having several conditions with a number of replicates per condition. For reasons of robustness to outliers, however, we prefer here to use the medians instead of the means. The PCA was performed on 40 observations (eight experimental groups on five learning sessions, where the four trials of each learning session were averaged) corresponding to median group performances of seven variables on each acquisition day.
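The PCA on group medians described above can be sketched as follows. This is a NumPy illustration under stated assumptions, not the FactoMineR call used in the protocol: the data are random placeholders, the equal group size of ten animals is hypothetical, and unit-variance scaling with the population standard deviation is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical placeholder data: 8 groups x 5 sessions x 10 animals x 7 variables.
n_groups, n_sessions, n_animals, n_vars = 8, 5, 10, 7
individuals = rng.normal(size=(n_groups, n_sessions, n_animals, n_vars))

# Median over animals gives one row per group-session pair (40 x 7).
medians = np.median(individuals, axis=2).reshape(-1, n_vars)

# Scale each variable to zero mean and unit variance (Z-scores).
center = medians.mean(axis=0)
scale = medians.std(axis=0)
Z = (medians - center) / scale

# PCA via singular value decomposition of the scaled matrix.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * s                       # coordinates of the 40 group medians
loadings = Vt.T                      # one column per principal component
explained = s**2 / np.sum(s**2)      # share of total variance per component
```

The columns of `loadings` are the orthogonal linear combinations of the original variables, ordered by the share of variance they explain. The rows of `scores` place each group-session median in the new coordinate system.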
Separately, a similar analysis was done for the three reversal sessions. The variables of interest were latency to target, percentage of time spent in the target quadrant, percentage of time spent in the periphery, Whishaw index, Gallagher index, distance traveled, and speed. To allow for the combination of original variables measured in different units, all variables were scaled to unit variance before the analysis (the default Z-score scaling was used). Since the PCA was performed on group medians (grouped data), points identified in the PCA space correspond to groups of individuals. To identify points corresponding to the individuals themselves, we used the technique of “adding supplementary points.” Given a single measurement corresponding to a point in the space of the original variables, we can identify the new coordinates of this point in the space defined by the principal components. Note that such points do not change the coordinate system, as they are added after the PCA is performed. Adding all 86 individuals, each appearing five times, as supplementary points, we identified the coordinates of each individual. The R package FactoMineR (Lê et al., ) was used for the PCA, as it allowed for a straightforward inclusion of supplementary observations. Density plots were obtained using the stat_density_2d function from the ggplot2 R package (Wickham, ) with the parameters n = 100, h = 5, and bins = 6. […]
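The supplementary-points step described above can be sketched as follows. This is a NumPy illustration, not FactoMineR's implementation: the helper names `fit_pca` and `project` are hypothetical, and the arrays hold random placeholder values rather than study data. The point it shows is that the axes are fitted on the active rows (group medians) only, and individuals are then projected using the same centering, scaling, and loadings.

```python
import numpy as np


def fit_pca(active):
    """Fit PCA on Z-scored active rows; return center, scale and loadings."""
    center = active.mean(axis=0)
    scale = active.std(axis=0)
    _, _, Vt = np.linalg.svd((active - center) / scale, full_matrices=False)
    return center, scale, Vt.T


def project(points, center, scale, loadings):
    """Map points into component space using the active rows' scaling."""
    return ((points - center) / scale) @ loadings


rng = np.random.default_rng(1)
active = rng.normal(size=(40, 7))           # placeholder group medians
supplementary = rng.normal(size=(430, 7))   # 86 animals x 5 sessions, placeholder

# Axes are defined by the active rows only.
center, scale, loadings = fit_pca(active)
active_scores = project(active, center, scale, loadings)

# Supplementary rows receive coordinates but do not influence the axes.
supp_scores = project(supplementary, center, scale, loadings)
```

Because `fit_pca` never sees the supplementary rows, adding or removing individuals leaves the principal components unchanged. This is the property the text relies on.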

Pipeline specifications

Software tools: multtest, FactoMineR, ggplot2
Applications: Miscellaneous, Gene expression microarray analysis
Organisms: Mus musculus
Diseases: Alzheimer Disease