BMC Bioinformatics

Alignment of time course gene expression data and the classification of developmentally driven genes with hidden Markov models.

PMID 26084333


We consider data from a time course microarray experiment conducted on grapevines over the development cycle of the grape berries at two vineyards in South Australia. Although the underlying biological process of berry development is the same at both vineyards, the timing of development differs because of local conditions. We aim to align the data from the two vineyards to enable an integrated analysis of gene expression, and to use the alignment of the expression profiles to classify likely developmental function. We present a novel alignment method based on hidden Markov models (HMMs) and use it to align the motivating grapevine data. We show that the method is robust to subsets of profiles that are not suitable for alignment, investigate alignment diagnostics under the model, and demonstrate the classification of developmentally driven genes. This classification both validates that the alignment we obtain is meaningful and provides new evidence that can be used to identify the role of genes with unknown function. Using our alignment methodology, we find at least 1279 grapevine probe sets with no current annotated function that are likely to be controlled in a developmental manner.