Predicting amputation using machine learning: A systematic review


Amputation is an irreversible, last-line treatment indicated for several medical problems including trauma, peripheral vascular disease, diabetes, and cancer [1]. Delaying amputation in favor of limb-sparing treatment may increase the risk of morbidity and mortality [2]. On the other hand, because amputation is life-altering, patients can experience a variety of complications following amputation [3, 5], such as psychological morbidity [3], phantom limb pain [4], and changes in self-esteem [3]. Patient quality of life is often severely decreased due to unique challenges related to mobility, social isolation, reduced energy, pain, sleep, and emotional disturbance [6]. Given the substantial burden that can follow amputation, it is important for patients and providers to be aware of the likelihood of this outcome as early as possible, both to come to terms with it when it is unavoidable and to prevent undue morbidity and mortality through early amputation [7]. Determining the likelihood of amputation can also help patients understand the importance of prophylactic changes that may help them avoid amputation.

Despite existing tools such as the Mangled Extremity Severity Score, accurately predicting amputation as an outcome remains difficult in many cases [8]. Correctly identifying the need for amputation throughout a patient’s disease course can improve outcomes, such as fewer postoperative complications (e.g., decreased length of stay in hospital, fewer local ipsilateral limb complications while in hospital, and fewer unplanned revisions) [9]. Earlier identification of the need for amputation would also allow more time to implement preoperative rehabilitation programs, which could further improve postoperative outcomes [10]. There is also evidence to suggest that earlier identification can lead to more patients using prosthetics and to fewer ipsilateral leg complications that can hinder prosthetic use and worsen rehabilitation outcomes [11, 12]. Lastly, earlier prediction of amputation can aid multidisciplinary teams in providing emotional and psychological support well before the patient may receive surgery, thereby improving patient perception of the treatment decision [13]. Early prediction of amputation would ultimately allow patients to feel more involved in the decision-making process, which a systematic review found to lead to a better patient treatment experience [14].

Artificial intelligence (AI) is defined as a “machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments” [15]. These objectives are accomplished by having the AI “learn”, from datasets, the relationships that exist within the data. For instance, AI could review a dataset containing patient factors (e.g., genetics, environment, patient vitals) and clinical outcomes, learn the relationships that exist, and use this information to predict future outcomes in similar patients [16]. In a medical context, AI has been proposed for use in conjunction with electronic medical records (EMR) to help make medical predictions [17–19]. Machine learning (ML) is a subset of AI that uses prediction models and algorithms to analyze and draw inferences from patterns in data in order to learn or adapt. Machine learning is currently being used in a variety of ways ancillary to amputation, most of which have focused on patient outcomes after amputation [20–22]. There remains a gap in the literature about how ML has been applied to patient populations that may require amputation. This systematic review synthesized the literature to assess the status of ML with respect to prediction of amputation as an outcome.


This systematic review was written in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) Checklist (S1 Checklist) and the R-AMSTAR (Revised Assessment of Multiple Systematic Reviews) guidelines for reporting systematic reviews. This study was registered in PROSPERO (registration number CRD42022375853).

Search strategy

A systematic review of the literature was completed using the subject heading “amputation” and the additional subject headings “machine learning”, “artificial intelligence”, and “deep learning”. Numerous search terms were also used, including “amputat*”, “AI”, “computer* assist* diagnos*”, “computer vision”, “neural network*”, “supervised learn*”, “unsupervised learn*”, “natural language process*”, “segmentat*”, and “reinforcement learn*” (S1 File). The search terms “predict*” and “risk” were not included, in order to broaden the search. The references of included articles were checked manually for citation chaining. All literature (interventional, observational, and otherwise) was eligible for inclusion during the initial screening. Duplicates were removed, and the remaining records were screened by title and abstract. Those that met the inclusion criteria progressed to the full-text screening stage for more in-depth screening.
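As an illustration of how the listed terms might be combined, the following minimal sketch builds a Boolean query and shows how a truncated term such as “amputat*” matches word variants. The AND/OR structure, the quoting style, and the function names (`or_group`, `build_query`, `truncation_to_regex`, `matches`) are assumptions made for this example; the actual database syntax is the one given in S1 File.

```python
import re

# Subject headings and free-text terms as listed in the search strategy above.
AMPUTATION_TERMS = ["amputation", "amputat*"]
ML_TERMS = [
    "machine learning", "artificial intelligence", "deep learning", "AI",
    "computer* assist* diagnos*", "computer vision", "neural network*",
    "supervised learn*", "unsupervised learn*", "natural language process*",
    "segmentat*", "reinforcement learn*",
]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(concept_a, concept_b):
    """Combine two concept groups with AND (assumed structure, not from the paper)."""
    return or_group(concept_a) + " AND " + or_group(concept_b)


def truncation_to_regex(term):
    """Translate a truncated term into a regex.

    Assumption: '*' stands for zero or more word characters.
    """
    parts = [re.escape(p) for p in term.split("*")]
    return re.compile(r"\b" + r"\w*".join(parts) + r"\b", re.IGNORECASE)


def matches(term, text):
    """Return True if the (possibly truncated) term occurs in the text."""
    return truncation_to_regex(term).search(text) is not None


if __name__ == "__main__":
    print(build_query(AMPUTATION_TERMS, ML_TERMS))
    print(matches("amputat*", "risk of amputations"))  # truncation matches the plural
```

Leaving “predict*” and “risk” out of the query means that no third AND group narrows the result set, which is consistent with the stated aim of broadening the search.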

Two reviewers (P.Y., Y.D.) completed the title and abstract screening for eligible studies independently and in duplicate. A full-text review was subsequently conducted. Data were extracted independently and in duplicate, and discrepancies at each stage were resolved through review with a third author (E.M.). Risk of bias of each study was assessed using the PROBAST Risk of Bias for Predictive Models assessment tool and given a low, high, or unclear designation as outlined [24]. The authors considered Newcastle (in the PROSPERO protocol); however, PROBAST was ultimately favored given its superior applicability for assessing risk of bias in machine learning models.
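The duplicate-screening workflow described above can be sketched as follows. This is only an illustration: the record identifiers, the dictionary representation of decisions, and the `adjudicate` callback standing in for the third reviewer are all hypothetical.

```python
def screen_in_duplicate(reviewer_a, reviewer_b, adjudicate):
    """Merge two independent include/exclude decisions per record.

    reviewer_a, reviewer_b: dicts mapping record id -> True (include) or False.
    adjudicate: callable acting as the third reviewer for discrepancies.
    Returns (final decisions, list of record ids that needed adjudication).
    """
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("both reviewers must screen the same records")
    final, conflicts = {}, []
    for record in reviewer_a:
        a, b = reviewer_a[record], reviewer_b[record]
        if a == b:
            final[record] = a  # agreement: keep the shared decision
        else:
            conflicts.append(record)  # discrepancy: defer to the third reviewer
            final[record] = adjudicate(record)
    return final, conflicts


if __name__ == "__main__":
    # Hypothetical decisions for three hypothetical records.
    first = {"rec-1": True, "rec-2": False, "rec-3": True}
    second = {"rec-1": True, "rec-2": True, "rec-3": False}
    decisions, disputed = screen_in_duplicate(first, second, lambda rec: rec == "rec-2")
    print(decisions, disputed)
```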


The search yielded 3572 articles; after duplicates were removed, 1376 articles remained and underwent title and abstract screening. Thirty articles moved through to full-text review, with 15 of these meeting the criteria for inclusion in this systematic review (Fig 1). The included studies developed and validated ML models from a total of 2,261,790 patients. Extensive heterogeneity across study objectives, ML models, data set features, subgroup analyses, and performance metrics precluded a meta-analysis of the findings. The performance metric in the majority of included articles [25–36] was the area under the receiver operating characteristic curve (AUC), which is standard for evaluating applications of ML in medical contexts [37, 38]. However, three studies used other performance metrics, including F-score (Fβ) [31], out-of-bag error rate [39], or only accuracy [40]. For simplicity of reporting, the included studies were categorized by amputation etiology. Most studies reported on patients who received amputation due to diabetes [25–29, 35, 36, 39–41], followed by trauma [30–32], and “other” [33, 34]. All included studies were derivation studies that included a form of validation. Twelve of the included studies performed only internal validation, while the remaining three [33–35] included external validation as well.
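To make the performance metrics named above concrete, here is a minimal sketch of AUC, accuracy, and Fβ computed from scratch. The labels and scores are made-up toy values, not data from any included study, and the function names are chosen for this example. AUC is computed in its rank form: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(labels, preds):
    """Return (tp, fp, fn, tn) for binary labels and binary predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(labels, preds):
    tp, fp, fn, tn = confusion(labels, preds)
    return (tp + tn) / (tp + fp + fn + tn)


def f_beta(labels, preds, beta=1.0):
    """F-beta score: beta > 1 weights recall more heavily than precision."""
    tp, fp, fn, _ = confusion(labels, preds)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Toy example: 1 = amputation, 0 = no amputation (synthetic values).
    y = [1, 1, 1, 0, 0, 0, 0, 0]
    s = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
    yhat = [1 if v >= 0.5 else 0 for v in s]  # the 0.5 threshold is an assumption
    print(auc(y, s), accuracy(y, yhat), f_beta(y, yhat, beta=2.0))
```

In this toy example the positives outrank the negatives in 13 of 15 pairs, so the AUC is 13/15, while accuracy at the 0.5 threshold is 0.75. Unlike accuracy and Fβ, AUC does not depend on the choice of threshold, which is one reason it is a convenient common metric across studies.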


Table 1 shows the studies that applied ML-based prediction models for patients with diabetes. The variables found to be important features varied between models. However, some of these variables, including increased age, Wagner score, C-reactive protein, and history of amputation, among others, appeared in multiple models [26–29, 36, 39–41]. The models within these studies ranged in performance from sub-optimal to excellent (AUC: 0.6–0.94). Random Forest [25, 28, 29, 35, 36], Gradient Boosted [27, 28], and Logistic Regression [25, 29, 35, 39] models were each used in multiple studies. Only two of the studies compared their model against a non-ML prediction model [27, 40]. In those studies, the ML-based prediction model had better performance than the non-ML-based prediction model. Two of the studies [29, 36] produced online tools aimed at helping clinicians stratify the risk of amputation based on their modelling. Only five of the included studies [27, 28, 32, 33, 35] could be classified as low risk of bias according to the PROBAST Risk of Bias assessment tool. Within these studies, a history of amputation, age, and diabetic complications such as peripheral vascular disease or kidney complications were features that appeared useful in more than one of the models for prediction of amputation. One study [39] was rated as having an unclear risk of bias due to the use of out-of-bag error rate as the sole performance metric as well as unclear methodology in the derivation and validation of its model.
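Random forests are ensembles of trees, each trained on a bootstrap sample of the data, and the out-of-bag error rate listed among the performance metrics is estimated from the cases each tree did not see during training. The sketch below illustrates only that idea, under loud assumptions: one-feature threshold rules (“stumps”) stand in for full decision trees, the data are synthetic, and `fit_stump` and `oob_error` are names invented for this example rather than any study's method.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-feature rule 'predict 1 when x > threshold' by minimizing training errors."""
    values = sorted(set(xs))
    candidates = (
        [values[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(values, values[1:])]
        + [values[-1] + 1.0]
    )
    best_thr, best_err = None, None
    for thr in candidates:
        err = sum((x > thr) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return best_thr


def oob_error(xs, ys, n_models=50, seed=0):
    """Out-of-bag error of a bagged ensemble of stumps (stand-in for a random forest)."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [[0, 0] for _ in range(n)]  # votes[i][c]: out-of-bag votes for class c
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        in_bag = set(idx)
        thr = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # only models that never saw case i may vote on it
                votes[i][int(xs[i] > thr)] += 1
    scored = [(i, v) for i, v in enumerate(votes) if sum(v) > 0]
    if not scored:
        raise ValueError("no case was ever out of bag; use more models")
    wrong = sum((1 if v[1] > v[0] else 0) != ys[i] for i, v in scored)
    return wrong / len(scored)


if __name__ == "__main__":
    # Synthetic one-feature data: larger values belong to class 1.
    xs = [i / 20 for i in range(20)]
    ys = [0] * 10 + [1] * 10
    print(oob_error(xs, ys))
```

Because every case is scored only by models that did not train on it, the out-of-bag error behaves like a built-in internal validation estimate; it does not replace validation on an external cohort.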


Table 2 shows the studies that used an ML-based prediction model for patients who had suffered physical trauma. All the studies for this population showed strong to excellent performance (AUC: 0.88–0.95). Each of these studies used a different base ML model. Two studies [31, 32] looked at lower extremity injury with concurrent vascular injury and shared the same predictor variable of arterial injury. Bevevino et al. [30] compared their model to a non-ML-based model, with theirs resulting in better performance. Perkins et al.’s [32] model was rated “Unclear” in the applicability section of the PROBAST score, as they tested for the chance of revascularization and limb viability and did not directly test for amputation as an outcome. In addition, the population in both the derivation and validation of their model was 100% military personnel; therefore, their model may not be generalizable to other populations [32]. Perkins et al. [32] compared their results with those determined with the Mangled Extremity Severity Score (MESS), a clinical decision-making tool created in 1990 and validated in 2001 [2, 43]. Perkins et al. [32] demonstrated that their Supervised Bayesian Network model showed better performance in predicting the revascularization of limbs.


Table 3 shows the studies that used ML-based prediction models for all other pathologies. Two studies [33, 34] were included, of which Cox et al. [33] was classified as low PROBAST risk of bias. Both studies [33, 34] used the random forest ML model, and both showed strong performance (AUC: 0.81–1.0). Martínez-Jiménez et al. [34] demonstrated the applicability of their ML model in a prospective cohort of 22 burn patients, correctly identifying all patients who would later go on to require amputation by the surgeon’s independent decision. This study was also the only one of the 15 included studies that analyzed imaging, using thermograms to assess and delineate the patients [34].


Amputation is a life-altering, but often necessary procedure resulting from consequences associated with conditions such as diabetes or limb trauma. Proper early identification of the need to amputate can help mitigate negative outcomes associated with amputation [2], and provide patients with the appropriate time to prepare for the potential physical and emotional or psychological complications that can follow the intervention [2, 3, 5, 6]. Earlier work has used AI and ML to make medical predictions, including predicting outcomes following amputation. Researchers have also been attempting to create ML models to predict factors associated with the outcome of amputation. It is difficult to understand the potential use of ML in predicting amputation as an outcome, as there has been no published review of these studies until now. This systematic review aimed to synthesize the available literature using ML to predict amputation. Results demonstrated the potential for ML to predict amputation as an outcome across multiple different target populations. Most of the studies in this review were able to produce predictive models with good performance, with some demonstrating improved sensitivity and specificity compared to non-ML prediction models or clinical decision-making tools. In addition, Martínez-Jiménez et al. showed comparability to clinical decision-making in a prospective setting, a requirement for the future implementation of ML in medicine [34]. Collectively, the results of this review showcase the viability of ML modeling in creating predictions for amputation. These models could be used to accurately forecast the clinical course of a patient and inform clinicians on personalized treatment plans including interventional or prophylactic changes.

Despite the promising nature of the results, there are several limitations that should be considered. The first of these arises from the review process itself. Only full-text, peer-reviewed articles published in English were included in this study. This likely resulted in an overrepresentation of research from primarily English-speaking countries. Furthermore, there are limitations to the studies themselves. The results demonstrate heterogeneity between the features that were important to predict amputation. This could be due to the variance in the data fields between the datasets, the discrepancies between each modeling technique, the intrinsic reliability of the models themselves, or any combination of these factors. Many of the datasets that were used for derivation were pre-existing databases, which restricted the variables that could be collected and analyzed between models. In addition, many of the studies discussed the restrictions of the database fields, arguing that the limited granularity of variables such as surgery outcomes and the severity of disease or injuries can be a constraint in many of the datasets [31, 33, 35]. Inconsistent recording of database variables can therefore alter the impact these features have between models, or determine whether they feature in a model at all. Taken together, these variations limit the confidence that can be placed in any trends or correlations that may be observed in important features across models and studies. Furthermore, although the individual models reported favorable performance, a summative conclusion cannot be synthesized from their amalgamation. The non-uniformity in outcomes, such as the window of consideration for the outcomes, limits the ability to compare studies [30, 31]. In addition, the applicability to the study population was variable across studies, with some studies deriving their models from a cohort sharing a specific trait that would limit generalizability to other patient populations [28, 32, 33], and others limited by having no external validation [27, 28, 32, 34]. Lastly, a large number of the studies had a high risk of bias owing to small sample sizes [25, 26, 30, 31, 40, 41], resulting in the need for further validation both internally and externally.

Given the current work done with ML and amputations, the results show the potential for ML to be clinically impactful. Although some authors provided online tools produced from their models [29, 36], the overall reliable application of the current models studied is limited. Increasing the breadth of data collected and standardizing the outcome measures would help to mitigate the heterogeneity seen across variables considered between models. Ultimately, despite the evidence that these models can be developed to accurately predict outcomes, more studies with a low risk of bias must be produced for these models to build credibility. These models then need to be taken into clinical settings to study their validity and utility in each cohort. Lastly, future research should investigate the outcomes of changes in management in cohorts applying ML-based risk stratification. The results of interventions such as increased surveillance and education in patients who are classified as higher risk for amputation should be clarified to understand the true extent of the impact that predicting amputation early will have.

In conclusion, this systematic review shows that multiple ML models with various target populations have been successfully derived, and that these models have the potential to be superior to traditional modeling techniques and comparable to prospective clinical judgment. Despite existing clinical decision-making tools, accurately predicting amputation as an outcome is a clinical question that has yet to be conclusively answered. There is notable interest in the applications of AI in this area, with the body of research growing particularly in the last decade. Despite the promise, several limitations are stalling the growth of these modeling technologies in a clinical context, including heterogeneity between database variables and therefore model features, and bias or lack of applicability in the derivation and validation of the models themselves. Although clinical decision-making tools based on these models are starting to be created, future research is needed that includes more robust databases designed to validate ML models against external cohorts in order to confidently apply this technology in clinical settings.


  1. Kalbaugh CA, Strassle PD, Paul NJ, McGinigle KL, Kibbe MR, Marston WA. Trends in surgical indications for major lower limb amputation in the USA from 2000 to 2016. Eur J Vasc Endovasc Surg. 2020;60(1):88–96. pmid:32312664
  2. Johansen K, Daines M, Howey T, Helfet D, Hansen S. Objective criteria accurately predict amputation following lower extremity trauma. J Trauma. 1990;30(5):568–73. pmid:2342140
  3. Sarroca N, Valero J, Deus J, Casanova J, Luesma MJ, Lahoz M. Quality of life, body image and self-esteem in patients with unilateral transtibial amputations. Sci Rep. 2021;11(1):12559. pmid:34131211
  4. Jensen TS, Krebs B, Nielsen J, Rasmussen P. Immediate and long-term phantom limb pain in amputees: Incidence, clinical characteristics and relationship to pre-amputation limb pain. Pain. 1985;21(3):267–78. pmid:3991231
  5. Sahu A, Sagar R, Sarkar S, Sagar S. Psychological effects of amputation: A review of studies from India. Ind Psychiatry J. 2016;25(1):4–10. pmid:28163401
  6. Pell JP, Donnan PT, Fowkes FGR, Ruckley CV. Quality of life following lower limb amputation for peripheral arterial disease. Eur J Vasc Surg. 1993;7(4):448–51. pmid:8359304
  7. Butler DJ, Turkal NW, Seidl JJ. Amputation: preoperative psychological preparation. J Am Board Fam Pract. 1992;5(1):69–73. pmid:1561924
  8. Eskridge SL, Hill OT, Clouser MC, Galarneau MR. Association of specific lower extremity injuries with delayed amputation. Mil Med. 2019;184(5–6):e323–9. pmid:30371883
  9. Bondurant FJ, Cotler HB, Buckle R, Miller-crotchett P, Browner BD. The medical and economic impact of severely injured lower extremities. J Trauma. 1988;28(8):1270–3. pmid:3137367
  10. Dekker R, Hristova YV, Hijmans JM, Geertzen JHB. Pre-operative rehabilitation for dysvascular lower-limb amputee patients: A focus group study involving medical professionals. PLoS One. 2018;13(10):e0204726. pmid:30321178
  11. Budinski S. Predictive factors for successful prosthetic rehabilitation after vascular transtibial amputation. 2021. Available from: pmid:35734483
  12. Williams ZF MD, Bools LM MD, Adams A BS, Clancy TV MD, Hope WW MD. Early versus delayed amputation in the setting of severe lower extremity trauma. Am Surg. 2015;81(6):564–8. pmid:26031267
  13. Jo SH, Kang SH, Seo WS, Koo BH, Kim HG, Yun SH. Psychiatric understanding and treatment of patients with amputations. Yeungnam Univ J Med. 2021;38(3):194–201. pmid:33971697
  14. Schober TL, Abrahamsen C. Patient perspectives on major lower limb amputation–A qualitative systematic review. Int J Orthop Trauma Nurs. 2022;46:100958. pmid:35930959
  15. Pena-Lopez I. Artificial intelligence in society. OECD. 2019.
  16. Phillips SP, Spithoff S, Simpson A. Artificial intelligence and predictive algorithms in medicine: Promise and problems. Can Fam Physician. 2022;68(8):570–2. pmid:35961724
  17. Yu C, Helwig EJ. The role of AI technology in prediction, diagnosis and treatment of colorectal cancer. Artif Intell Rev. 2022;55(1):323–43. pmid:34248245
  18. Sankaran R, Kumar A, Parasuram H. Role of artificial intelligence and machine learning in the prediction of the pain: a scoping systematic review. Proc Inst Mech Eng H. 2022;236(10):1478–91. pmid:36148916
  19. Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Hosp J. 2019;6(2):94–8. pmid:31363513
  20. Amanpreet K. Machine learning-based novel approach to classify the shoulder motion of upper limb amputees. Biocybern Biomed En. 2019;39(3):857–67.
  21. Griffiths B, Diment L, Granat MH. A machine learning classification model for monitoring the daily physical behaviour of lower-limb amputees. Sensors. 2021;21(22):7458. pmid:34833534
  22. Juneau P, Baddour N, Burger H, Bavec A, Lemaire ED. Amputee fall risk classification using machine learning and smartphone sensor data from 2-minute and 6-minute walk tests. Sensors. 2022;22(5):1749. pmid:35270892
  23. Covidence systematic review software [Internet]. Covidence. 2023. Available from:
  24. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170(1):51–8. pmid:30596875
  25. Du C, Li Y, Xie P, Zhang X, Deng B, Wang G, et al. The amputation and mortality of inpatients with diabetic foot ulceration in the COVID‐19 pandemic and postpandemic era: A machine learning study. Int Wound J. 2022;19(6):1289–97. pmid:34818691
  26. Lin C, Yuan Y, Ji L, Yang X, Yin G, Lin S. The amputation and survival of patients with diabetic foot based on establishment of prediction model. Saudi J Biol Sci. 2020;27(3):853–8. pmid:32127762
  27. Ravaut M, Sadeghi H, Leung KK, Volkovs M, Kornas K, Harish V, et al. Predicting adverse outcomes due to diabetes complications with machine learning using administrative health data. Digit Med. 2021;4(1):24. pmid:33580109
  28. Yang L, Gabriel N, Hernandez I, Winterstein AG, Guo J. Using machine learning to identify diabetes patients with canagliflozin prescriptions at high-risk of lower extremity amputation using real-world data. Pharmacoepidemiol Drug Saf. 2021;30(5):644–51. pmid:33606340
  29. Wang S, Wang J, Zhu MX, Tan Q. Machine learning for the prediction of minor amputation in University of Texas grade 3 diabetic foot ulcers. PLoS One. 2022;17(12):e0278445. pmid:36472981
  30. Bevevino AJ, Dickens JF, Potter BK, Dworak T, Gordon W, Forsberg JA. A model to predict limb salvage in severe combat-related open calcaneus fractures. Clin Orthop Relat Res. 2014;472(10):3002–9. pmid:24249536
  31. Bolourani S, Thompson D, Siskind S, Kalyon BD, Patel VM, Mussa FF. Cleaning up the mess: can machine learning be used to predict lower extremity amputation after trauma-associated arterial injury? J Am Coll Surg. 2021;232(1). pmid:33022402
  32. Perkins ZB, Yet B, Sharrock A, Rickard R, Marsh W, Rasmussen TE, et al. Predicting the outcome of limb revascularization in patients with lower-extremity arterial trauma: development and external validation of a supervised machine-learning algorithm to support surgical decisions. Ann Surg. 2020;272(4). pmid:32657917
  33. Cox M, Reid N, Panagides JC, Di Capua J, DeCarlo C, Dua A, et al. Interpretable machine learning for the prediction of amputation risk following lower extremity infrainguinal endovascular interventions for peripheral arterial disease. Cardiovasc Intervent Radiol. 2022;45(5):633–40. pmid:35322303
  34. Martínez-Jiménez MA, Ramirez-GarciaLuna JL, Kolosovas-Machuca ES, Drager J, González FJ. Development and validation of an algorithm to predict the treatment modality of burn wounds using thermographic scans: Prospective cohort study. PLoS One. 2018;13(11):e0206477. pmid:30427892
  35. Schäfer Z, Mathisen A, Svendsen K, Engberg S, Rolighed Thomsen T, Kirketerp-Møller K. Toward machine-learning-based decision support in diabetes care: a risk stratification study on diabetic foot ulcer and amputation. Front Med. 2021;7:601602. pmid:33681236
  36. Stefanopoulos S, Qiu Q, Ren G, Ahmed A, Osman M, Brunicardi FC, et al. A machine learning model for prediction of amputation in diabetics. J Diabetes Sci Technol. 2022;19322968221142900. pmid:36476059
  37. Rahman MM, Davis DN. Addressing the class imbalance problem in medical datasets. IJMLC. 2013;224–8.
  38. Ling CX, Huang J, Zhang H. AUC: a better measure than accuracy in comparing learning algorithms. Canadian conference on AI 2003. p. 329–341
  39. Austin AM, Ramkumar N, Gladders B, Barnes JA, Eid MA, Moore KO, et al. Using a cohort study of diabetes and peripheral artery disease to compare logistic regression and machine learning via random forest modeling. BMC Med Res Methodol. 2022;22(1):300. pmid:36418976
  40. Kasbekar PU, Goel P, Jadhav SP. A Decision Tree Analysis of Diabetic Foot Amputation Risk in Indian Patients. Front Endocrinol. 2017;8. pmid:28261156
  41. Xie P, Li Y, Deng B, Du C, Rui S, Deng W, et al. An explainable machine learning model for predicting in‐hospital amputation rate of patients with diabetic foot ulcer. Int Wound J. 2022;19(4):910–8. pmid:34520110
  42. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLoS Med. 2021;18(3):e1003583. pmid:33780438
  43. Togawa S, Yamami N, Nakayama H, Mano Y, Ikegami K, Ozeki S. The validity of the mangled extremity severity score in the assessment of upper limb injuries. J Bone Joint Surg Br. 2005;87-B(11):1516–9. pmid:16260670
