Mortality prediction using Acute Physiology and Chronic Health Evaluation II and Acute Physiology and Chronic Health Evaluation IV scoring systems: Is there a difference?
Source of Support: None, Conflict of Interest: None. DOI: 10.4103/ijccm.IJCCM_422_17
Background: Mortality prediction in the Intensive Care Unit (ICU) setting is complex, and several scoring systems are used for this purpose. The Acute Physiology and Chronic Health Evaluation (APACHE) II has been the most widely used scoring system, although the more recent APACHE IV is considered an updated and more advanced prediction model. However, these two systems may not give similar mortality predictions.
Objectives: The aim of this study is to compare the mortality prediction ability of the APACHE II and APACHE IV scoring systems among patients admitted to a tertiary care ICU.
Methods: In this prospective longitudinal observational study, APACHE II and APACHE IV scores of ICU patients were computed using an online calculator. The outcome of the ICU admission for each patient was recorded as discharged or deceased. The data were analyzed to compare the discrimination and calibration of the mortality predictions of the two scores.
Results: Among the 1670 patients whose data were analyzed, the area under the receiver operating characteristic curve of the APACHE II score was 0.906 (95% confidence interval [CI]: 0.890–0.992), and that of the APACHE IV score was 0.881 (95% CI: 0.862–0.890). The mean predicted mortality rate of the study population was 44.8% ± 26.7% by the APACHE II scoring system and 29.1% ± 28.5% by the APACHE IV scoring system. The observed mortality rate was 22.4%.
Conclusions: The APACHE II and IV scoring systems have comparable discrimination ability, but the calibration of APACHE IV seems to be better than that of APACHE II. There is a need to recalibrate the scales with weights derived from the Indian population.
Keywords: Acute Physiology and Chronic Health Evaluation II, Acute Physiology and Chronic Health Evaluation IV, mortality prediction, mortality rate, scoring systems
The outcomes of patients admitted to Intensive Care Units (ICUs) depend on numerous factors, including age, sex, the type and severity of the underlying illness, and various clinical and laboratory parameters. Several scoring systems have been developed over the years to assess and describe the severity of illness and to predict the mortality rate of patients admitted to the ICU. One of the earliest validated mortality prediction tools is the Acute Physiology and Chronic Health Evaluation (APACHE) II system, which was described in 1985 as a revision of the original APACHE system of 1981. The tool is applied within 24 h of a patient's admission to the ICU: various parameters, including patient demographics, clinical features, and laboratory features, are entered, and an APACHE score ranging between 0 and 71 is computed. Although APACHE scores may not be useful for determining the outcomes of individual patients, they can serve as a good guide for prognostication in a cohort of patients admitted to the ICU. Moreover, comparison of the actual mortality rate in an ICU with the predicted mortality rate (PMR) can be used to indicate the performance of an ICU and to compare outcomes across different ICUs. In 1991, APACHE II was modified to the APACHE III score, which was more elaborate, with 20 variables, and used several additional parameters to achieve better predictive ability. In 2006, the APACHE scoring system was further refined (APACHE IV) by adding more predictor variables and revising the statistical modeling technique. The APACHE IV system was standardized and benchmarked for use in ICUs in the United States, as it had good predictive validity and calibration. It is well known that the predictive accuracy of these scoring systems is dynamic and that the systems need to be constantly revised and re-examined to account for improvements in care delivery. If applied inappropriately, the models may lead to wrong predictions and inappropriate use of time and resources.
The APACHE II and IV scoring systems are widely used all over the world, even though both were originally developed among ICU patients in the United States and have not been validated across all races and health-care systems. Despite the development and deployment of APACHE IV, most major clinical studies and several ICUs still use the APACHE II score as their severity scoring system, for both patient prognostication and quality monitoring. It is unclear whether APACHE II and APACHE IV will yield concordant results in our patient population. Several studies in other countries comparing APACHE II and APACHE IV have provided conflicting results. They were all underpowered, evaluated a narrow spectrum of patients, and/or had several other limitations.
In summary, there is a dearth of literature comparing the performance of APACHE II and APACHE IV in a general ICU among a wide variety of patients in the Indian setting. In this context, we sought to compare the mortality prediction ability of APACHE II and APACHE IV scoring systems among patients admitted to a multidisciplinary tertiary care ICU in India.
This prospective longitudinal follow-up study was conducted in a tertiary care private hospital in Chennai, India. All critically ill patients admitted to the multidisciplinary ICU between April 2014 and December 2015 were included in this study. The data of all the included patients were extracted from their hospital records after the first 24 h of admission. An online APACHE calculator was used to calculate the APACHE II and IV scores from the worst value of each variable recorded in the first 24 h of ICU admission. The patients were followed up until the outcome (death or discharge), which was documented. Patients who were discharged against medical advice or at their own request were excluded from the analysis, as the outcome could not be ascertained in these patients. For both scoring systems, the observed mortality rate (OMR) was compared with the PMR. The standardized mortality ratios (SMRs) were also determined from the calculated PMR and OMR.
The data analysis was performed using SPSS Statistical Package version 17 (SPSS, IBM, Chicago, USA). Descriptive statistics on the characteristics of the patients were computed. The receiver operating characteristic (ROC) curves of APACHE II and APACHE IV for predicting mortality were plotted. The areas under the ROC curve (AUC) of the two scoring systems were compared. Further, using the sets of sensitivities and specificities derived from the ROC curves, the cutoff values of the scores that provided the optimal combination of sensitivity and specificity were identified. The Kruskal–Wallis test was performed to compare the mortality rates predicted by the APACHE II and APACHE IV scores.
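The ROC analysis described above can be illustrated with a small sketch. It is not the SPSS procedure used in the study: it assumes the rank-based definition of the AUC (the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor) and assumes Youden's index (sensitivity + specificity - 1) as the criterion for the "optimal" cutoff, which the text does not specify. The function names and the toy cohort are hypothetical and are not study data.

```python
from typing import List, Tuple


def auc_rank(scores: List[float], died: List[int]) -> float:
    """AUC as the fraction of (non-survivor, survivor) pairs in which the
    non-survivor has the higher score; ties count as one half."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], died: List[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index,
    predicting death whenever score >= cutoff (assumed criterion)."""
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d == 1 and s >= c)
        tn = sum(1 for s, d in zip(scores, died) if d == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Made-up toy cohort (NOT study data): severity score and outcome (1 = died).
toy_scores = [8, 12, 15, 18, 21, 24, 27, 30, 33, 36]
toy_died = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

print("AUC:", auc_rank(toy_scores, toy_died))
print("cutoff, sensitivity, specificity:", youden_cutoff(toy_scores, toy_died))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a statistical package would normally be used for a cohort of this size and would also report confidence intervals.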
In the study period, a total of 1859 patients were admitted to the multidisciplinary ICU. A total of 189 (10%) were excluded from the study because their outcomes were not available, owing to discharge at request to other facilities or discharge against medical advice. Data of the 1670 patients who were admitted to the ICU and followed up to outcome were analyzed. The mean age of these patients was 55 years, with a predominance of males (66%). The majority of them (66%) were direct admissions from the emergency room, and 81% were admitted for medical causes. About 70% of the patients were ventilated during their ICU stay. These characteristics of the study population are depicted in [Table 1].
These patients were admitted in the ICU for a wide variety of reasons including respiratory, septic, cardiovascular, neurological, renal, trauma-related, and hepato-pancreatic issues which are depicted in [Table 2].
[Figure 1] depicts the ROC curves of the APACHE II and APACHE IV scores, with the scores as continuous variables and death or discharge as the dependent variable. The two curves are very similar and largely overlap, with the APACHE II score having a slightly greater AUC than the APACHE IV score.
The AUC of the APACHE II score is 0.906 (95% confidence interval [CI]: 0.890–0.992) and the AUC of the APACHE IV score is 0.881 (95% CI: 0.862–0.890). [Table 3] compares the predictive validity of the two scoring systems.
The mean PMR of the study population as given by the APACHE II scoring system was 44.8% ± 26.7%, and the mean PMR given by the APACHE IV scoring system was 29.1% ± 28.5%. The OMR was 22.4%. Thus, the SMR was 0.50 for the APACHE II score and 0.77 for the APACHE IV score. The Kruskal–Wallis test for the difference between the PMRs given by the two scoring systems showed P < 0.001.
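The relationship between the OMR, PMR, and SMR can be checked with simple arithmetic. The sketch below assumes the conventional definition SMR = OMR / PMR, which the text does not spell out, and uses the rounded cohort-level percentages reported above; the function name is illustrative.

```python
def standardized_mortality_ratio(observed_pct: float, predicted_pct: float) -> float:
    """SMR under the assumed definition: observed / predicted mortality."""
    if predicted_pct <= 0:
        raise ValueError("predicted mortality must be positive")
    return observed_pct / predicted_pct


# Rounded cohort-level figures as reported in the text (percentages).
OMR = 22.4
PMR = {"APACHE II": 44.8, "APACHE IV": 29.1}

for system, pmr in PMR.items():
    smr = standardized_mortality_ratio(OMR, pmr)
    print(f"{system}: SMR = {smr:.2f}")
```

An SMR below 1 means that fewer deaths were observed than the model predicted, so both systems overestimated mortality in this cohort, APACHE II more so than APACHE IV.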
This prospective longitudinal follow-up study of 1670 patients admitted to a multidisciplinary ICU in a private tertiary care hospital in Chennai showed that the APACHE II scoring system has an AUC comparable to that of the APACHE IV score in predicting mortality in the ICU. Furthermore, the APACHE II score gave a greater PMR (44.8%) than the APACHE IV score (29.1%). For the same level of sensitivity, the APACHE II score had greater specificity than the APACHE IV score.
Both the APACHE II and APACHE IV scores are robust scoring systems for predicting mortality in the ICU setting. However, the application of both systems is strongly context specific. The predictive validity of the systems depends on the demographics of the patients, the disease condition and its severity, and the infrastructure and facilities of the ICU. There are several reasons why the performance of the APACHE scoring systems differs between settings. Mortality prediction models, at their best, estimate the mortality risk only for the population in which the model was developed. Both the APACHE II and IV scoring models were developed for the North American population. Therefore, their validity in the Indian setting is likely to be different. Moreover, the relative performance of these two scoring systems is also likely to be different in the Indian setting. The other important consideration in the comparison of the APACHE II and APACHE IV scores in this study is the method by which the scores were computed. The scores were computed using internet-based software into which the variables were entered manually. Previous studies have shown that such manual charting of data yields different scores from automated charting from health management information systems. Previous studies have also shown that APACHE IV performed better than APACHE II in conditions such as acute lung injury and neurological damage. However, for conditions such as pancreatitis and sepsis, APACHE II performed better than APACHE IV. We chose to compare these scores in patients with a wide range of diagnoses admitted to a medical-surgical ICU. This could have led to the disparity in the outcomes projected by these two scoring systems. The calibration of these scoring systems is also dynamic and dependent on the population to which the scoring is applied. The weightage given to the different variables in the APACHE II and APACHE IV scores is also dependent on the population characteristics.
Whereas the APACHE II weights were assigned by a panel of experts, the APACHE IV weights were derived by multiple logistic regression analysis. Therefore, the weightage of the APACHE IV scoring system is more susceptible to change in different contexts, and more likely to depend on the population characteristics, than that of the APACHE II scoring system. This again could lead to differences in outcome projections between the two scores.
Important questions that face the intensivist in the ICU setting are as follows:
What is the meaning of the different PMRs given by the two scoring systems?
As the discrimination ability of the APACHE II and IV scoring systems is almost identical, does this mean that both predict mortality accurately?
If so, why are the PMRs given by the two scores different?
As the two scoring systems give different PMRs and SMRs, which one should an ICU use routinely for assessing its performance?
For all the scales that are used to predict mortality in an ICU setting, there are two important characteristics, namely model calibration and model discrimination. Calibration refers to the level of agreement between the estimated probability of mortality and the observed probability. A model with good calibration predicts a mortality rate close to the one actually observed. Model discrimination is the ability of the model to distinguish between those who will die and those who will survive the ICU admission. This is indicated by the AUC of the ROC curve of the mortality prediction scores. In this study, the PMR given by APACHE IV was closer to the OMR than that given by APACHE II, whereas the AUC of APACHE II was slightly higher. Thus, the lack of concordance of the PMR between the two scales indicates that calibration is better for the APACHE IV scoring system, whereas discrimination is marginally better for the APACHE II system. From this finding, it is not possible to conclude which of the two scoring systems should be used routinely in an ICU setting. As suggested by a previous review, these scoring systems are not competitive. They are complementary to each other and should be used in combination. For a given ICU setting, the factors that largely drive the decision on which scoring system to use would be the ease of its application, the ability to standardize the measurements, the comfort level of the intensivists using the scale, and the level of calibration that is possible to achieve in that setting.
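The distinction between discrimination and calibration can be made concrete with a toy example. In the sketch below, two hypothetical models rank the same patients identically, so their AUCs are equal, but one assigns systematically higher probabilities, so its mean predicted mortality lies further from the observed rate. All values are made up for illustration and are not study data; the halving of the risks is an arbitrary choice, and the function names are hypothetical.

```python
from typing import List


def auc(pred: List[float], died: List[int]) -> float:
    """Rank-based AUC: fraction of (non-survivor, survivor) pairs in which
    the non-survivor has the higher predicted risk; ties count one half."""
    pos = [p for p, d in zip(pred, died) if d == 1]
    neg = [p for p, d in zip(pred, died) if d == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_pct(values: List[float]) -> float:
    """Mean of a list of probabilities (or 0/1 outcomes), as a percentage."""
    return 100.0 * sum(values) / len(values)


# Made-up outcomes (1 = died) and predicted risks for eight patients.
died = [0, 0, 0, 0, 1, 0, 1, 1]
risk_high = [0.10, 0.20, 0.30, 0.50, 0.60, 0.80, 0.90, 0.95]
# Same ranking of patients, but every probability halved.
risk_low = [p * 0.5 for p in risk_high]

print(f"Observed mortality: {mean_pct(died):.1f}%")
for name, risk in [("high-risk model", risk_high), ("low-risk model", risk_low)]:
    print(f"{name}: AUC = {auc(risk, died):.3f}, mean predicted = {mean_pct(risk):.1f}%")
```

Because the AUC depends only on how patients are ordered, any rescaling that preserves the order leaves discrimination unchanged while shifting the mean predicted mortality. This is why two scores can have nearly identical AUCs and still give different PMRs and SMRs.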
The strength of this study is that it examines the comparability of the APACHE II and APACHE IV scores in a large sample size with diverse admission diagnoses in a multidisciplinary ICU. This is also one of the few studies comparing these scores in the Indian setting. The limitation of the study is that the APACHE II and IV scores were computed using online software. The weightage and calibration were based on the scoring system available from a North American patient population. The weightage was not obtained specifically for the population under study.
Both the APACHE II and APACHE IV scores have an almost equal discrimination ability but have different levels of calibration with the APACHE IV having a PMR closer to the OMR compared to APACHE II. Therefore, there is a need to recalibrate the scoring systems to different populations in different settings to make more meaning out of the PMRs.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
[Table 1], [Table 2], [Table 3]