Bioinformatics (Oxford, England)

Combining multiple microarrays in the presence of controlling variables.

PMID 16705015


Microarray technology enables the expression levels of thousands of genes to be monitored simultaneously. As experiments grow in scale, it becomes common to use the same type of microarray at different laboratories or hospitals. It is therefore important to analyze such microarray data together and derive a combined conclusion after accounting for the differences among them. One of the main objectives of a microarray experiment is to identify genes that are differentially expressed among the experimental groups. The analysis of variance (ANOVA) model has been commonly used to detect differentially expressed genes after accounting for the sources of variation commonly observed in microarray experiments. We extended the usual ANOVA model to account for additional variability resulting from many confounding variables, such as the effect of different hospitals. The proposed model is a two-stage ANOVA model. The first stage adjusts for the effects of no interest. The second stage detects differentially expressed genes among the experimental groups using the residuals obtained from the first stage. Based on these residuals, we propose a permutation test to detect the differentially expressed genes. The proposed model is illustrated using data from 133 microarrays collected at three different hospitals. The proposed approach is more flexible than the meta-analysis approach and accommodates individual covariates more easily. A set of programs written in R will be sent electronically upon request.
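
The two-stage idea can be illustrated with a minimal sketch for a single gene. This is not the authors' R code, and the abstract does not specify the exact model terms, test statistic, or permutation scheme, so everything below is an assumption made for illustration: stage 1 only subtracts each hospital's mean (a one-factor adjustment), stage 2 uses a one-way ANOVA F statistic on the residuals, and the permutation test shuffles group labels freely across all samples. The function names (center_by, f_statistic, two_stage_permutation_test) and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def center_by(values, labels):
    """Stage 1 (assumed form): subtract each label's mean, e.g. the hospital mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, lab in zip(values, labels):
        sums[lab] += v
        counts[lab] += 1
    return [v - sums[lab] / counts[lab] for v, lab in zip(values, labels)]


def f_statistic(values, groups):
    """One-way ANOVA F statistic of `values` across the levels of `groups`."""
    n = len(values)
    grand_mean = sum(values) / n
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    k = len(by_group)
    between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in by_group.values()
    )
    within = sum(
        (v - sum(vs) / len(vs)) ** 2 for vs in by_group.values() for v in vs
    )
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def two_stage_permutation_test(expr, hospital, group, n_perm=2000, seed=0):
    """Return (observed F on stage-1 residuals, permutation p-value)."""
    rng = random.Random(seed)
    residuals = center_by(expr, hospital)      # stage 1: remove hospital effect
    observed = f_statistic(residuals, group)   # stage 2: test group effect
    shuffled = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                  # unrestricted label shuffle (assumption)
        if f_statistic(residuals, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data for one gene: 3 hospitals, 2 experimental groups, 30 arrays.
    noise = random.Random(1)
    hospital = [i % 3 for i in range(30)]
    group = [(i // 3) % 2 for i in range(30)]
    expr = [5.0 * h + 2.0 * g + noise.gauss(0, 0.5) for h, g in zip(hospital, group)]
    f_obs, p_value = two_stage_permutation_test(expr, hospital, group)
    print(f"F = {f_obs:.2f}, permutation p = {p_value:.4f}")
```

In a real analysis this would be repeated for every gene, the first stage would likely include further nuisance terms and covariates, and the resulting p-values would need a multiple-testing adjustment; none of those details are given in the abstract.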