Abstract

Background
Bronchopulmonary dysplasia (BPD) is a common complication of preterm birth. A wide variety of models that use clinical parameters at an early postnatal age to predict BPD have been developed, but few have undergone extensive quantitative validation. The objective of this study is to review and validate clinical prediction models for BPD.

Methods
We searched the main electronic databases and abstracts from annual meetings. The STROBE instrument was used to assess methodological quality. External validation of the retrieved models was performed using an individual patient dataset of 3229 patients at risk for BPD. Receiver operating characteristic curves were used to assess the discrimination of each model by calculating the area under the curve (AUC). Calibration was assessed for the best discriminating models by visually comparing predicted and observed BPD probabilities.

Results
We identified 26 clinical prediction models for BPD. Although methodological quality, as judged with the STROBE instrument, ranged from moderate to excellent, only four models had been externally validated and none reported calibration of the predictive value. For the 19 prediction models whose variables could be matched to our dataset, the AUCs ranged from 0.50 to 0.76 for the outcome BPD. Only two of the five best discriminating models showed good calibration.

Conclusions
External validation demonstrates that, except for two promising models, most existing clinical prediction models are poor to moderate predictors of BPD. Additional variables are required to improve predictive accuracy and to identify preterm infants for future intervention studies aimed at reducing the risk of BPD. Any such improved model should then be externally validated using a proper impact analysis before clinical implementation.
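The two validation measures named in the Methods, discrimination (AUC) and calibration (predicted versus observed BPD probabilities), can be illustrated with a minimal sketch. This is not the code used in the study. The function names `auc` and `calibration_table`, the equal-sized grouping by predicted risk, and the toy data are assumptions made purely for illustration. The study assessed calibration visually, so the table below only approximates what such a plot would display.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the Mann-Whitney statistic.

    This equals the probability that a randomly chosen infant with the
    outcome (1) receives a higher predicted risk than a randomly chosen
    infant without it (0). Ties count as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_table(
    y_true: Sequence[int], y_prob: Sequence[float], n_groups: int = 5
) -> List[Tuple[float, float, int]]:
    """Compare predicted and observed risk within groups of similar risk.

    Patients are sorted by predicted probability and split into
    `n_groups` groups of (nearly) equal size. Each row holds the mean
    predicted probability, the observed outcome rate, and the group size.
    A well-calibrated model has the first two values close in every row.
    """
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:  # can happen when n_groups exceeds the sample size
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows


# Hypothetical toy data (NOT from the study): 1 = BPD, 0 = no BPD.
toy_outcome = [0, 0, 1, 0, 1, 1, 0, 1]
toy_risk = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("AUC:", auc(toy_outcome, toy_risk))
for mean_pred, observed, size in calibration_table(toy_outcome, toy_risk, n_groups=2):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={size}")
```

An AUC of 0.50 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why the reported range of 0.50 to 0.76 reads as poor to moderate. A model can discriminate well and still be poorly calibrated, so the two checks are complementary.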