BMC bioinformatics

Correlated fragile site expression allows the identification of candidate fragile genes involved in immunity and associated with carcinogenesis.

PMID 16981993


Common fragile sites (cfs) are specific regions in the human genome that are particularly prone to genomic instability under conditions of replicative stress. Several investigations support the view that common fragile sites play a role in carcinogenesis. We discuss a genome-wide approach based on graph theory and the Gene Ontology vocabulary for the functional characterization of common fragile sites and for the identification of genes that contribute to tumour cell biology.

Common fragile sites were assembled into a network based on a simple measure of correlation among their patterns of expression. By applying robust measurements that capture the non-triviality of the network in quantitative terms, we identified several topological features that clearly indicate a departure from the Erdős-Rényi random graph model. The most important outcome was the presence of an unexpectedly large connected component far below the percolation threshold. Most of the best-characterized common fragile sites belonged to this connected component. By filtering this connected component with Gene Ontology, we detected statistically significant shared functional features. Common fragile sites were found to be enriched for genes associated with the immune response and with mechanisms involved in tumour progression, such as extracellular space remodeling and angiogenesis. Moreover, we showed how the internal organization of the graph into communities, and even into very simple subgraphs, can be a starting point for the identification of new factors of instability at common fragile sites.

We developed a computational method that addresses the fundamental issue of studying the functional content of common fragile sites. Our analysis integrated two different approaches. First, data on common fragile site expression were analyzed in a complex networks framework. Second, the outcomes of the statistical description of the network served as sources for the functional annotation of genes at common fragile sites by means of the Gene Ontology vocabulary. Our results support the hypothesis that fragile sites serve a function; we propose that fragility is linked to a coordinated regulation of fragile gene expression.
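
The network step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only mentions "a simple measure of correlation", so the use of Pearson correlation, the threshold value, the toy expression profiles and all function names below are assumptions made for illustration. The sketch links two sites when their expression profiles across individuals correlate above a threshold, extracts the connected components, and reports the mean degree. In an Erdős-Rényi random graph a giant component is expected only when the mean degree exceeds 1, so a large component at a mean degree below 1 would point to non-random structure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold):
    """Link two sites when their correlation is at least `threshold` (assumed rule)."""
    adjacency = {site: set() for site in profiles}
    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Return the connected components as sets of sites, largest first."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def mean_degree(adjacency):
    """Average number of links per site; 1 is the Erdős-Rényi percolation threshold."""
    return sum(len(neighbours) for neighbours in adjacency.values()) / len(adjacency)


if __name__ == "__main__":
    # Hypothetical toy data: 1 means the site was expressed in that individual.
    toy_profiles = {
        "site_a": [1, 0, 1, 0],
        "site_b": [1, 0, 1, 0],
        "site_c": [0, 1, 0, 1],
        "site_d": [1, 1, 0, 0],
    }
    network = build_network(toy_profiles, threshold=0.5)
    components = connected_components(network)
    print("largest component:", sorted(components[0]))
    print("mean degree:", mean_degree(network))
```

On real data, the size of the largest component would be compared with the sizes obtained from random graphs having the same number of sites and links; the toy profiles here are far too small for such a comparison to be meaningful.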