The EMBO Journal

Structural analysis of the X-linked gene encoding human glucose 6-phosphate dehydrogenase.

PMID 2428611


We report the isolation and analysis of human genomic DNA clones spanning about 100 kb of the X chromosome and comprising the entire gene coding for the enzyme glucose 6-phosphate dehydrogenase (G6PD). The G6PD gene is 18 kb long and consists of 13 exons: the protein-coding region is divided into 12 segments ranging in size from 12 to 236 bp, and an intron is present in the 5' untranslated region. Mature G6PD mRNA has a single polyadenylation site in HeLa cells. The major 5' end of mature G6PD mRNA in several cell lines is located 177 bp upstream of the translation initiation codon; longer mRNA molecules extending further in the 5' direction could be identified by S1 mapping and by comparing genomic and cDNA sequences. The DNA sequence around the major mRNA start is very GC-rich. Among putative transcriptional regulatory sequences, a non-canonical TATA box and nine CCGCCC elements are present, but no CAAT element could be identified. The genomic DNA we have isolated includes another ubiquitously transcribed region, provisionally named the GdX gene. Although the function of GdX is as yet unknown, we have established that this gene is located about 40 kb downstream of G6PD and is transcribed in the same direction. A comparative analysis of the promoter regions of G6PD and 10 other housekeeping enzyme genes has confirmed the presence of a number of common features. In particular, in the eight cases in which a 'TATA' box is present, a conserved sequence of 25 bp is seen immediately downstream.
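
The promoter features described above (GC richness and CCGCCC elements) are the kind of thing a simple sequence scan can tally. The following is a minimal sketch only: the function names and the toy sequence are hypothetical and are not taken from the paper, and scanning both strands is an assumption, since the abstract does not say how the nine elements were counted.

```python
import re


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_both_strands(seq: str, motif: str = "CCGCCC"):
    """List (position, strand) for each motif occurrence, overlaps included.

    Positions are 0-based offsets on the given (plus) strand. A '-' hit means
    the reverse complement of the motif was found at that offset.
    Scanning both strands is an assumption made for this sketch.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        # A zero-width lookahead lets re.finditer report overlapping matches.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


# Toy sequence for illustration only; NOT the G6PD promoter sequence.
TOY_SEQUENCE = "GGGCGGTTCCGCCC"

if __name__ == "__main__":
    print(f"GC fraction: {gc_fraction(TOY_SEQUENCE):.2f}")
    for position, strand in find_motif_both_strands(TOY_SEQUENCE):
        print(f"CCGCCC element at offset {position} on strand {strand}")
```

Applied to a real promoter sequence, the same scan would give the element count and local GC content, but the result depends on the window chosen around the mRNA start site, which the abstract does not specify.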