Neoplasia (New York, N.Y.)

A gene expression classifier of node-positive colorectal cancer.

PMID 19794966


We used digital long serial analysis of gene expression to discover gene expression differences between node-negative and node-positive colorectal tumors and developed a multigene classifier able to discriminate between these two tumor types. We prepared long serial analysis of gene expression libraries from one node-negative and one node-positive colorectal tumor, sequenced them to a depth of 26,060 unique tags, and identified 262 tags that were significantly differentially expressed between the two tumors (P < 2 × 10⁻⁶). We confirmed the tag-to-gene assignments and differential expression of 31 genes by quantitative real-time polymerase chain reaction; 12 of these genes were elevated in the node-positive tumor. We analyzed the expression levels of these 12 upregulated genes in a validation panel of 23 additional tumors and developed an optimized seven-gene logistic regression classifier. The classifier discriminated between node-negative and node-positive tumors with 86% sensitivity and 80% specificity, and receiver operating characteristic analysis gave an area under the curve of 0.86. Experimental manipulation of the function of one classification gene, fibronectin (FN1), caused profound effects on invasion and migration of colorectal cancer cells in vitro. These results suggest that the development of node-positive colorectal cancer occurs in part through elevated epithelial FN1 expression, and they point to novel strategies for the diagnosis and treatment of advanced disease.
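
As a rough illustration of the performance measures reported above (sensitivity, specificity, and area under the receiver operating characteristic curve), the following Python sketch computes them from a classifier's predicted probabilities. The labels and scores are hypothetical toy values, not data from this study, and the function names and the 0.5 decision threshold are assumptions made for illustration only.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) at a given probability threshold.

    labels: 1 = node-positive, 0 = node-negative (hypothetical coding)
    scores: predicted probability of being node-positive
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: four node-positive and four node-negative tumors.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes discrimination across all thresholds, which is why the two kinds of figure are reported separately.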