Abstract

Background
Tumor metastases pose the greatest threat to a patient's survival; understanding the biology of disseminated cancer cells is therefore critical for developing effective therapies.

Methods
Microarrays and immunohistochemistry (IHC) were used to analyze primary breast tumors, regional (lymph node) metastases, and distant metastases in order to identify biological features associated with distant metastases.

Results
When compared with each other, primary tumors and regional metastases showed statistically indistinguishable gene expression patterns. Supervised analyses comparing patients with distant metastases versus primary tumors or regional metastases showed that the distant metastases were distinct, distinguished by the lack of expression of fibroblast/mesenchymal genes and by the high expression of a 13-gene profile (the 'vascular endothelial growth factor (VEGF) profile') that included VEGF, ANGPTL4, ADM and the monocarboxylic acid transporter SLC16A3. At least 8 of these 13 genes contained HIF1α binding sites, many are known to be HIF1α-regulated, and expression of the VEGF profile correlated with HIF1α IHC positivity. The VEGF profile also showed prognostic significance when tested on sets of patients with breast cancer, lung cancer, and glioblastoma, and it was an independent predictor of outcome in primary breast cancers when tested in models that contained other prognostic gene expression profiles and clinical variables.

Conclusion
These data identify a compact in vivo hypoxia signature that tends to be present in distant metastasis samples and that portends a poor outcome in multiple tumor types. This signature suggests that the response to hypoxia includes the ability to promote new blood and lymphatic vessel formation, and that the dual targeting of multiple cell types and pathways will be needed to prevent metastatic spread.