Molecular docking is widely used to predict novel lead compounds for drug discovery. Success depends on the quality of the docking scoring function, among other factors. An imperfect scoring function can mislead by predicting incorrect ligand geometries or by selecting nonbinding molecules over true ligands. These false-positive hits may be considered "decoys". Although these decoys are frustrating, they potentially provide important tests for a docking algorithm; the more subtle the decoy, the more rigorous the test. Indeed, decoy databases have been used to improve protein structure prediction algorithms and protein-protein docking algorithms. Here, we describe 20 geometric decoys in five enzymes and 166 "hit list" decoys (i.e., molecules predicted to bind by our docking program that were tested and found not to do so) for beta-lactamase and two cavity sites in lysozyme. Especially in the cavity sites, which are very simple, these decoys highlight particular weaknesses in our scoring function. We also consider the performance of five other widely used docking scoring functions against our geometric and hit list decoys. Intriguingly, whereas many of these other scoring functions performed better on the geometric decoys, they typically performed worse on the hit list decoys, often highly ranking molecules that seemed to poorly complement the model sites. Several of these "hits" from the other scoring functions were tested experimentally and found, in fact, to be decoys. Collectively, these decoys provide a tool for the development and improvement of molecular docking scoring functions. Such improvements may, in turn, be rapidly tested experimentally against these and related experimental systems, which are well-behaved in assays and for structure determination.