Discussion
Principal findings
Our IPPIC birth weight model, developed with data that are readily available at the antenatal booking, showed excellent prediction of birth weight, on average, and promising performance in four different populations included in the individual participant data meta-analysis. We used a robust Delphi process to prioritise candidate predictors, ensuring that we included clinically meaningful variables. We also used best practice prognostic model methods to develop and validate our birth weight prediction model. Our model predicted the birth weight of a baby at various potential gestational ages of delivery, based on maternal weight, height, age, parity, smoking status, ethnic group, history of chronic hypertension, history of diabetes, assisted conception, and previous history of pre-eclampsia, stillbirth, and babies born small for gestational age. We validated the model with an internal-external cross validation approach.
The model, when tested in cohorts in different countries, explained about 50% of individual level variability, and had good calibration performance in high and low risk populations. Prediction errors were smallest in individuals at the lower end of the range of predicted birth weights, which is important in informing clinical decisions in pregnancies at high risk of fetal growth restriction and related complications.
Strengths and limitations of this study
Our individual participant data meta-analysis simultaneously developed and validated a prediction model for birth weight. We developed our model with data from the harmonised IPPIC individual participant data, from cohorts from different countries,22 23 which provided us with a larger sample size than is achievable with just one study. This approach allowed us to develop a more comprehensive prediction model, applicable in different populations and settings included in these individual participant data. We evaluated clinically relevant predictors that are routinely available at the antenatal booking in both high and low resource settings, allowing the model to be easily applied in high income as well as in low income countries where perinatal mortality rates are highest.46 Although our prediction model showed promising performance after three cycles of internal-external cross validation in women from the UK, Norway, and Australia, multiple external validations with data specifically from low income settings are needed to fully evaluate whether the model can be transferred to these settings. These external validations will help verify the model's robustness and suitability for use in other countries and subgroups, strengthening its practical use in clinical practice.
Our model can be used to generate predictions of birth weight conditional on any clinically relevant gestational age at delivery. Integration of the model as part of routine growth charts has the potential to inform antenatal counselling and empower women to contribute towards shared decision making with clinicians about the frequency of monitoring in pregnancy and discussions on timing of birth, where concerns about the growth of the fetus exist. Further external validation of the model in different populations and settings, however, is required before implementation in clinical practice. Our prediction of birth weight was on the continuous scale, and therefore our model is not limited by arbitrary cut-off values used to define small or large for gestational age. This approach allows clinicians to calculate predicted birth centiles based on any fetal growth standard of their choice, such as GROW, INTERGROWTH 21st, and WHO.28–30
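The conversion from a predicted birth weight to a centile can be illustrated with a minimal sketch. It assumes, purely for illustration, that the chosen growth standard is summarised by a mean and standard deviation of birth weight at the gestational age of interest and that weights are approximately normally distributed around that mean. The function name and the reference values in the usage note are hypothetical; published standards such as GROW, INTERGROWTH 21st, and WHO define their own centile methods, which should be used in practice.

```python
from math import erf, sqrt


def centile(weight_g, ref_mean_g, ref_sd_g):
    """Centile (0-100) of a birth weight under a normal approximation.

    ref_mean_g and ref_sd_g are the reference mean and standard deviation
    at the assumed gestational age; they are inputs supplied by the user,
    not values taken from any published standard.
    """
    z = (weight_g - ref_mean_g) / ref_sd_g
    # Standard normal cumulative distribution function, scaled to 0-100
    return 100 * 0.5 * (1 + erf(z / sqrt(2)))
```

For example, with a hypothetical reference mean of 3400 g and standard deviation of 450 g, a predicted weight of 3400 g falls on the 50th centile and a predicted weight of 2950 g (one standard deviation below the mean) falls at about the 16th centile.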
We used a systematic approach to develop and validate our birth weight prediction model, by first identifying and prioritising candidate predictors with a Delphi process and then using multiple imputation to deal with missing data for both predictors and outcome to avoid the loss of useful information.47 48 We used rigorous statistical methods to develop the prediction model and evaluate the predictive performance, with individual participant data from multiple cohorts to assess any potential heterogeneity in performance for the cohorts.
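The general idea of internal-external cross validation can be sketched as follows: in each cycle, one cohort is held out, the model is refitted in the remaining cohorts, and performance is assessed in the held-out cohort. The sketch is not the analysis code used in this study. It assumes a single predictor and ordinary least squares, ignores multiple imputation, and uses synthetic toy data; the names `fit_line`, `iecv`, and `always_train` are hypothetical. The `always_train` argument mimics a design in which one cohort is retained for model development in every cycle.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def iecv(cohorts, always_train=()):
    """Internal-external cross validation over named cohorts.

    cohorts: dict mapping cohort name -> list of (predictor, outcome) pairs.
    Cohorts listed in always_train are never held out.
    Returns a dict mapping each held-out cohort to its mean prediction
    error, defined here as observed minus predicted.
    """
    results = {}
    for held_out in cohorts:
        if held_out in always_train:
            continue
        # Develop the model in every cohort except the held-out one
        train = [row for name, rows in cohorts.items()
                 if name != held_out for row in rows]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        # Validate in the held-out cohort
        test = cohorts[held_out]
        results[held_out] = mean(y - (a + b * x) for x, y in test)
    return results


def toy_weight(weeks):
    """Synthetic outcome on an exact straight line (illustrative numbers only)."""
    return -4000 + 185 * weeks


toy = {
    "A": [(w, toy_weight(w)) for w in (37, 38, 39, 40)],
    "B": [(w, toy_weight(w)) for w in (36, 38, 40, 41)],
    "C": [(w, toy_weight(w)) for w in (37, 39, 41)],
    "D": [(w, toy_weight(w)) for w in (36, 37, 38, 39, 40, 41)],
}
errors = iecv(toy, always_train=("D",))
```

With four cohorts and one always retained for development, the loop runs three cycles. Because the toy outcome lies exactly on a line, the mean error in each held-out cohort is essentially zero; with real data, the spread of these cohort-specific estimates is what indicates heterogeneity in performance.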
Our study had some limitations. Although mean birth weight was similar in all cohorts, the NICHD cohort had a higher standard deviation in birth weight, with greater variability than the other cohorts. This heterogeneity could be a result of variation in personal characteristics within the population, potentially affecting the generalisability of the model. Considering that the NICHD cohort was retained in all internal-external cross validation cycles, exploring the model's performance within this heterogeneous context is important. External validation of the model in other datasets that represent different regions and populations will help confirm the model's generalisability, enhancing its practical applicability.
The average calibration performance of the model was good for all cohorts but varied in individual cohorts, with some underprediction in the smaller cohorts, mainly in those with the highest birth weight. Although overall calibration of the model was good, miscalibration for individual observations was found, particularly at the higher end of the range of predicted birth weights. This miscalibration produced a wide range of observed birth weights for a particular predicted birth weight in all cycles of the internal-external cross validation. This range was much narrower in the clinically important range of lower predicted birth weights, however, where pregnancies have a higher risk of growth restriction and require intervention. Our model explained 47% of the variability in birth weight in the dataset, ranging from 56% (NICHD, 2018 cohort) to 33% (Allen et al, 2017 cohort). These differences in R² estimates are partly a result of chance, but could also be because of differences in predictor effects in the various populations. Future research might explore whether some variables, such as maternal weight or height, interact with country location and should be modelled differently for each location to improve the variance explained.
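The performance measures discussed above can be made concrete with a minimal sketch that computes calibration-in-the-large, the calibration slope, and R² from paired observed and predicted birth weights. It is an illustration under stated assumptions, not the study's analysis code: the function name `performance` is hypothetical, calibration-in-the-large is taken as mean observed minus mean predicted (so a negative value indicates overprediction), the calibration slope is the least squares slope of observed on predicted values, and R² is computed as one minus the ratio of the residual to the total sum of squares.

```python
from statistics import mean


def performance(observed, predicted):
    """Return calibration-in-the-large, calibration slope, and R-squared."""
    mo, mp = mean(observed), mean(predicted)
    # Calibration-in-the-large: negative values indicate overprediction
    citl = mo - mp
    # Calibration slope: regression of observed on predicted (ideal value 1)
    spp = sum((p - mp) ** 2 for p in predicted)
    spo = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    slope = spo / spp
    # R-squared: proportion of outcome variability explained by predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    return {"citl": citl, "slope": slope, "r2": r2}
```

In an internal-external cross validation, these measures would be calculated separately in each held-out cohort, which is how cohort-level differences in R² such as those reported above become visible.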
Comparison with existing evidence
Most published models predict the risk of a baby born small for gestational age rather than birth weight.16 Dichotomisation of birth weight limits the power and usefulness of a prediction model. The use of specific cut-off values can also result in both overdiagnosis and underdiagnosis of fetal growth abnormalities, depending on the criteria used.49 These models were also usually poorly reported: only about a third were internally validated (10/28, 36%), and only two (7%) were externally validated, both showing limited predictive performance.50 51 Calibration measures were rarely reported in these studies, with only four (14%) reporting these performance measures. A prediction formula, rule, or score that would allow independent external validation was reported for only 16 (57%) of these models. Other published birth weight prediction models were developed for use in specific populations and have not undergone external validation to determine their generalisability to new and different populations.13 So far, no individual test is sufficiently predictive of birth weight or small for gestational age to warrant recommendation for routine clinical use.52
We reported the development and validation of our prediction model in line with current guidelines on the transparent reporting of multivariable prediction models developed or validated with clustered data.21 Our model showed good calibration performance on internal-external cross validation, with only slight overprediction of birth weight, by 9.7 g on average. Our model also required the user to enter the assumed gestational age at delivery. Although the actual date of delivery is not known when making predictions, the option to enter various possible gestational ages for delivery allows the user to produce a plot of predictions of birth weight for various time points.
The Royal College of Obstetricians and Gynaecologists in the UK recommends, at the antenatal booking, assessing for risk factors for fetuses that are small for gestational age, to identify those who might need increased surveillance.53 The American College of Obstetricians and Gynecologists recommends screening for unspecified medical and obstetric risk factors, but does not recommend use of uterine artery Doppler or biochemical markers, citing lack of evidence on improvement of outcomes.54 The Society of Obstetricians and Gynaecologists of Canada calls for clinical risk factor based screening,55 whereas the Royal Australian and New Zealand College of Obstetricians and Gynaecologists suggests risk assessment through a combination of biomarkers, Doppler ultrasound, and major maternal clinical risk factors.56 The choice of risk factors and their combination to predict risk of small for gestational age or fetal growth restriction in any of these guidelines was not based on formal prediction modelling.
Relevance for clinical practice and research
The prediction of birth weight is an important aspect of antenatal care because it can provide valuable information to healthcare providers and expectant mothers about the growth and development of the fetus, and supports cost effective use of limited fetal monitoring resources. Accurate predictions of birth weight can also help identify infants who might have an increased risk of adverse outcomes, such as preterm birth or stillbirth, and allow for early interventions to improve outcomes. The development of accurate birth weight prediction models has been challenging, however, because individual studies often have limited sample sizes and variable definitions of the birth weight outcome and predictors, and the models developed have not been externally validated.11 14 49 57 Our individual participant data meta-analysis combined data from multiple studies to develop a mathematical model, providing a more robust estimate of the association between included predictors and birth weight. Use of multiple datasets in the IPPIC data repository allowed us to carry out extensive validation of the model for different geographical regions, health systems, settings, and in populations of women with different baseline risks.
Only clinical characteristic predictors were included in the model, making it potentially applicable to both low and high resource settings. The predictors included are easy to measure and routinely available in clinical practice. Incorporating the model into practice will be simple because no additional measures are required to calculate the birth weight for potential gestational ages of delivery. Because the model includes factors that influence fetal growth and perinatal risk, its predictive ability is particularly useful for early identification of risk of abnormal growth at the antenatal booking. Thus the model can alert healthcare providers to take appropriate actions and provide necessary care in monitoring high risk pregnancies.
Our work was in direct response to calls from the National Institute for Health and Care Excellence and the Royal College of Obstetricians and Gynaecologists for predictive tests or strategies to identify women at risk of delivering a small baby, particularly growth restricted infants with complications,53 58 and the priorities of the UK Department of Health to reduce the incidence of stillbirths and neonatal deaths. Further research is needed to evaluate the ease of implementation of our birth weight model into routine clinical practice and to determine any barriers and facilitators of its use. This research should include assessment of the acceptability of the prediction model as a screening tool for pregnant women and their families, as well as healthcare providers.
The effect of using our birth weight model in clinical practice might require evaluation in cluster randomised trials to assess whether its use improves perinatal outcomes, or evaluation in an implementation study to show that it can be integrated into routine care at a population level. These studies could evaluate the effect on perinatal mortality of using the model to inform interventions (such as close monitoring or planned delivery) compared with routine care. Although these trials would be challenging to conduct because of the sample size required to show an effect on perinatal mortality, proxies for perinatal mortality, such as morbidity, could be used to achieve sufficient power.59
Conclusions
We have developed a simple prediction model incorporating routinely available clinical predictors to predict birth weight at various potential gestational ages at delivery. The model explained about 50% of the variability in birth weight and showed good calibration. Its use could help identify pregnancies at increased risk of adverse outcomes to allow planning of appropriate management or early intervention to improve perinatal outcomes. Further multiple external validations in different settings and populations will help confirm the generalisability of the model.