Amputation is an irreversible, last-line treatment indicated for a multitude of medical problems. Delaying amputation in favor of limb-sparing treatment may lead to increased risk of morbidity and mortality. This systematic review aims to synthesize the literature on how machine learning (ML) is being applied to predict amputation as an outcome. OVID Embase, OVID Medline, ACM Digital Library, Scopus, Web of Science, and IEEE Xplore were searched from inception to March 5, 2023. 1376 studies were screened; 15 articles were included. In the diabetic population, models ranged from sub-optimal to excellent performance (AUC: 0.6–0.94). In trauma patients, models had strong to excellent performance (AUC: 0.88–0.95). In patients who received amputation secondary to other etiologies (e.g., burns and peripheral vascular disease), models had similar performance (AUC: 0.81–1.0). Many studies were found to have a high PROBAST risk of bias, most often due to small sample sizes. In conclusion, multiple ML models have been successfully developed that have the potential to be superior to traditional modeling techniques and prospective clinical judgment in predicting amputation. Further research is needed to overcome the limitations of current studies and to bring applicability to a clinical setting.
Citation: Yao PF, Diao YD, McMullen EP, Manka M, Murphy J, Lin C (2023) Predicting amputation using machine learning: A systematic review. PLoS ONE 18(11):
Editor: Noman Naseer, Air University, PAKISTAN
Received: August 19, 2023; Accepted: October 17, 2023; Published: November 7, 2023
Copyright: © 2023 Yao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Amputation is an irreversible, last-line treatment indicated for several medical problems including trauma, peripheral vascular disease, diabetes, and cancer. Delaying amputation in favor of limb-sparing treatment may lead to increased risk of morbidity and mortality. On the other hand, due to the life-altering course of amputation, patients can experience a variety of complications following amputation, such as various psychological morbidities, phantom limb pain, and changes to patient self-esteem [3, 5]. Patient quality of life is often severely decreased due to unique challenges related to mobility, social isolation, reduced energy, pain, sleep and emotional disturbance. Given the substantial burden that can follow amputation, it is important for patients and providers to be aware of the likelihood of this outcome as early as possible to accept this inevitability, and to prevent undue morbidity and mortality through early amputation. Determining the likelihood of amputation can help patients understand the importance of prophylactic changes that may help the patient avoid amputation.
Despite existing tools such as the Mangled Extremity Severity Score, accurately predicting amputation as an outcome is still a troublesome dilemma in many cases. Correctly identifying the need for amputation throughout a patient’s disease course can improve outcomes, such as fewer postoperative complications (e.g., decreased length of stay in hospital, fewer local ipsilateral limb complications while in hospital and fewer instances of unplanned revisions). Earlier identification of the need for amputation would also allow for a longer period of time to implement preoperative rehabilitation programs which could further improve postoperative outcomes. There is also evidence to suggest that earlier identification can lead to a larger number of patients using prosthetics, and fewer ipsilateral leg complications that can worsen prosthetic use as well as worsen rehabilitation outcomes [11, 12]. Lastly, earlier prediction of amputation can aid multidisciplinary teams in providing emotional and psychological support well before the patient may receive surgery, thereby improving patient perception of the treatment decision. Early prediction of amputation would ultimately allow patients to feel more involved with their decision-making process, which, in a systematic review, was found to lead to a better patient treatment experience.
Artificial intelligence (AI) is defined as a “machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments”. These objectives are accomplished by having the AI “learn”, from datasets, the relationships that exist within the data. For instance, AI could review a dataset containing patient factors (e.g., genetics, environment, patient vitals) and clinical outcomes, learn the relationships that exist, and use this information to predict future outcomes in similar patients. In a medical context, AI has been proposed for use in conjunction with electronic medical records (EMR) to help make medical predictions [17–19]. Machine learning (ML) is a subset of AI that uses prediction models and algorithms to analyze and draw inferences from patterns of data to learn or adapt. Machine learning is currently being used in a variety of ways ancillary to amputation, most of which have focused on patient outcomes after amputation [20–22]. There remains a gap in the literature about how ML has been applied to patient populations that may require amputation. This systematic review synthesized the literature to assess the status of ML with respect to prediction of amputation as an outcome.
This systematic review was written in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) Checklist (S1 Checklist) and the R-AMSTAR (Revised Assessment of Multiple Systematic Reviews) guidelines for reporting systematic reviews. This study was registered in PROSPERO (registration number CRD42022375853).
English peer-reviewed articles that developed multivariable models for predicting amputation in humans were included. No restriction on patient age was made.
Publications were excluded if there was no mention of predicting amputation risk, or if AI was not a part of the methodology. Examples of this include AI models predicting only wound healing or outcomes rather than amputation. Abstracts from conferences were also excluded as they lacked depth and data to adequately contribute to the systematic review.
A systematic search of OVID Embase, OVID Medline, ACM Digital Library, Scopus, Web of Science, and IEEE Xplore was conducted from inception to November 12th, 2022, and updated on March 5th, 2023, with assistance from a medical librarian (S1 File). Studies resulting from this search were imported into Covidence, a systematic review software.
The literature search was completed using the subject heading “amputation” and the additional subject headings “machine learning”, “artificial intelligence”, and “deep learning”. Numerous search terms were also used, including “amputat*”, “AI”, “computer* assist* diagnos*”, “computer vision”, “neural network*”, “supervised learn*”, “unsupervised learn*”, “natural language process*”, “segmentat*”, and “reinforcement learn*” (S1 File). The search terms “predict*” and “risk” were deliberately omitted in order to broaden the search. The references of included articles were checked manually for citation chaining. All literature (interventional, observational, and otherwise) was eligible for inclusion during the initial screening. The records from the search were screened based on their titles and abstracts. Duplicates were removed, and those that met the inclusion criteria progressed to the full-text screening stage for more in-depth screening.
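As a rough illustration of how the truncation operator (`*`) in search terms such as “amputat*” broadens matching, the following Python sketch translates a few of the listed terms into regular expressions. The helper names `term_to_regex` and `matching_terms` and the example titles are hypothetical, and each database (OVID, Scopus, and so on) has its own truncation syntax, so this only approximates the behaviour.

```python
import re

# A subset of the truncated search terms listed in the search strategy.
SEARCH_TERMS = [
    "amputat*",
    "neural network*",
    "supervised learn*",
    "computer* assist* diagnos*",
]

def term_to_regex(term):
    """Turn a truncated search term into a case-insensitive regex.

    Assumption: each '*' stands for zero or more word characters, and a
    term must start at a word boundary. Real databases may differ.
    """
    parts = [re.escape(part) for part in term.split("*")]
    return re.compile(r"\b" + r"\w*".join(parts), re.IGNORECASE)

def matching_terms(text):
    """Return the search terms that match anywhere in the given text."""
    return [t for t in SEARCH_TERMS if term_to_regex(t).search(text)]

# Hypothetical record title, for illustration only.
hits = matching_terms("Predicting amputations with neural networks")
print(hits)
```

Under these assumptions, “amputat*” matches “amputation”, “amputations”, and “amputated”, whereas “supervised learn*” does not match “unsupervised learning” because the term must begin at a word boundary; this is why both forms appear as separate search terms.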
Two reviewers (P.Y., Y.D.) completed the title and abstract screening for eligible studies independently and in duplicate. A full-text review was subsequently conducted. Data was extracted independently and in duplicate, and discrepancies at each stage were resolved through review with a third author (E.M.). The risk of bias of each study was assessed using the PROBAST Risk of Bias for Predictive Models assessment tool and given a low, high, or unclear designation as outlined. The authors considered the Newcastle-Ottawa Scale (listed in the PROSPERO protocol); however, PROBAST was ultimately favored given its superior applicability to assessing risk of bias in machine learning models.
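The duplicate screening workflow described above can be sketched in a few lines of Python. This is a simplified illustration only: the function names, study identifiers, and decisions are hypothetical, and the sketch reduces "review with a third author" to simply adopting the third reviewer's decision.

```python
def find_discrepancies(reviewer_a, reviewer_b):
    """Return the study IDs on which two independent reviewers disagree."""
    return sorted(sid for sid in reviewer_a if reviewer_a[sid] != reviewer_b[sid])

def reconcile(reviewer_a, reviewer_b, third_reviewer):
    """Keep agreed decisions; use the third reviewer's decision on conflicts."""
    final = {}
    for sid, decision in reviewer_a.items():
        if decision == reviewer_b[sid]:
            final[sid] = decision
        else:
            final[sid] = third_reviewer[sid]
    return final

# Hypothetical screening decisions (True = include), for illustration only.
a = {"study_1": True, "study_2": False, "study_3": True}
b = {"study_1": True, "study_2": True, "study_3": True}
c = {"study_2": False}  # the third reviewer only rules on conflicts

conflicts = find_discrepancies(a, b)
final_decisions = reconcile(a, b, c)
print(conflicts, final_decisions)
```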
Data charting and result reporting
Data was extracted from the 15 included articles into a data extraction table created a priori. The following pre-selected variables were extracted from each included article: author(s), year of publication, country of dataset origin, study design, level of evidence, primary aim(s), secondary aim(s), ML model(s) used, derivation/validation test used, reference test used, comparison to reference, secondary reference test used (if applicable), comparison to the secondary reference test, clinical applicability of the ML model, dataset, study inclusion criteria, study exclusion criteria, underlying pathology, anatomical part being studied, number of patients in dataset, the number of cases in data set, sex (Female, Male [%]), age (range), how the model was trained, features in the model, predictors of amputation, performance metric used, study limitations, conclusion(s), any notes made by the authors of this systematic review, and any conflicts of interest.
The search yielded 3572 articles; after duplicates were removed, 1376 articles remained and underwent title and abstract screening. Thirty articles moved through to full-text review, with 15 of these meeting the criteria for inclusion in this systematic review (Fig 1). The included studies developed and validated ML models from a total of 2,261,790 patients. Extensive heterogeneity across study objectives, ML models, dataset features, subgroup analyses, and performance metrics precluded a meta-analysis. The performance metric in the majority of included articles [25–36] was the area under the receiver operating characteristic curve (AUC), which is standard for evaluating ML in medical contexts [37, 38]. However, three studies used other performance metrics, including F-score (Fβ), out-of-bag error rate, or accuracy alone. For simplicity of reporting, the included studies were categorized by amputation etiology. Most studies reported on patients who received amputation due to diabetes [25–29, 35, 36, 39–41], followed by trauma [30–32], and “other” etiologies [33, 34]. All included studies were derivation studies that included a form of validation. Twelve of the included studies performed only internal validation, while the remaining three [33–35] included external validation as well.
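For readers less familiar with the AUC, it can be read as the probability that a model assigns a higher predicted risk to a randomly chosen patient who was amputated than to a randomly chosen patient who was not. The sketch below computes it directly from that definition in Python; the function name `roc_auc` and the six example patients are hypothetical and are not taken from any included study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    labels: 1 for the outcome (e.g., amputation), 0 otherwise.
    scores: predicted risks; ties between a positive and a negative count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for six patients (1 = amputation).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example, eight of the nine amputated/non-amputated pairs are ranked correctly, giving an AUC of 8/9. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the ranges reported in this review should be read.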
Table 1 shows those studies that applied ML-based prediction models for patients with diabetes. The variables that were found to be important features in models varied. However, some of these variables, including increased age, Wagner scores, C-reactive protein and history of amputation among others, appeared in multiple models [26–29, 36, 39–41]. The models within these studies ranged in performance from sub-optimal to excellent (AUC: 0.6–0.94). Random Forest [25, 28, 29, 35, 36], Gradient Boosted [27, 28], and Logistic Regression [25, 29, 35, 39] models were used in multiple studies as the modeling technique. Only two of the studies had comparison reference tests to a non-ML prediction model [27, 40]. In those studies, the ML-based prediction model had better performance than the non-ML-based prediction model. Two of the studies [29, 36] produced online tools aimed at helping clinicians stratify the risk of amputation based on their modeling. Only five of the included studies [27, 28, 32, 33, 35] could be classified as low risk of bias according to the PROBAST Risk of Bias assessment tool. Within these studies, a history of amputation, age, and diabetic complications such as peripheral vascular disease or kidney complications were features that appeared useful in more than one of the models for prediction of amputation. One study was rated as having an unclear risk of bias due to the use of out-of-bag error rate as the sole performance metric as well as having unclear methodology in derivation and validation of their model.
Table 2 shows the studies that used an ML-based prediction model for patients who had suffered physical trauma. All the studies for this population showed strong to excellent performance (AUC: 0.88–0.95). Each of these studies used a different base ML model. Two studies [31, 32] looked at lower extremity injury with concurrent vascular injury and shared the same predictor variable of arterial injury. Bevevino et al. compared their model to a non-ML-based model, with theirs resulting in better performance. Perkins et al.’s model was rated “Unclear” in the applicability section of the PROBAST score as they tested for the chance of revascularization and limb viability and did not directly test for amputation as an outcome. In addition, the population in both the derivation and validation of their model was 100% military personnel; therefore, their model may not be generalizable to other populations. Perkins et al. compared their results with those determined with the Mangled Extremity Severity Score (MESS), a clinical decision-making tool created in 1990 and validated in 2001 [2, 43]. Perkins et al. demonstrated that their Supervised Bayesian Network model showed better performance in predicting the revascularization of limbs.
Table 3 shows the studies that used ML-based prediction models for all other pathologies. Two studies [33, 34] were included, of which Cox et al. was classified as low PROBAST risk of bias. Both studies [33, 34] used the random forest ML model, and both had strong performance (AUC: 0.81–1.0). Martínez-Jiménez et al. demonstrated the applicability of their ML model in a cohort of 22 prospective burn patients, correctly identifying all patients that would later go on to require amputation by the surgeon’s independent decision. Uniquely, this study was the only one of the 15 included studies that analyzed imaging, using thermograms to assess and delineate the patients.
Amputation is a life-altering but often necessary procedure resulting from consequences associated with conditions such as diabetes or limb trauma. Proper early identification of the need to amputate can help mitigate negative outcomes associated with amputation, and provide patients with the appropriate time to prepare for the potential physical and emotional or psychological complications that can follow the intervention [2, 3, 5, 6]. Earlier work has used AI and ML to make medical predictions, including predicting outcomes following amputation. Researchers have also been attempting to create ML models to predict factors associated with the outcome of amputation. It is difficult to understand the potential use of ML in predicting amputation as an outcome, as there has been no published review of these studies until now. This systematic review aimed to synthesize the available literature using ML to predict amputation. Results demonstrated the potential for ML to predict amputation as an outcome across multiple different target populations. Most of the studies in this review were able to produce predictive models with good performance, with some demonstrating improved sensitivity and specificity compared to non-ML prediction models or clinical decision-making tools. In addition, Martínez-Jiménez et al. showed comparability to clinical decision-making in a prospective setting, a requirement for the future implementation of ML in medicine. Collectively, the results of this review showcase the viability of ML modeling in creating predictions for amputation. These models could be used to accurately forecast the clinical course of a patient and inform clinicians on personalized treatment plans including interventional or prophylactic changes.
Despite the promising nature of the results, there are several limitations that should be considered. The first of these arises from the review process itself. Only full-text, peer-reviewed articles published in English were included in this study. This likely resulted in an overrepresentation of research from primarily English-speaking countries. Furthermore, there are limitations to the studies themselves. The results demonstrate heterogeneity between the features that were important to predict amputation. This could be due to the variance in the data fields between the datasets, the discrepancies between each modeling technique, the intrinsic reliability of the models themselves, or any combination of these factors. Many of the datasets that were used for derivation were pre-existing databases, therefore restricting the variables that could be collected and analyzed between models. In addition, many of the studies discussed the database fields’ restrictions, arguing that the granularity within variables such as surgery outcomes and the severity of disease or injuries can be limiting in many of the datasets [31, 33, 35]. The inconsistency between database variable recording can therefore alter the impact these features could have between models or if they were to feature in a model at all. Taken together, these variations limit the confidence that can be placed in any trends or correlations that may be observed in important features across models and studies. Furthermore, despite the independent models demonstrating positive numbers, one cannot synthesize a summative conclusion from their amalgamation. The non-uniformity in outcomes such as the window of consideration for the outcomes limits the ability to compare [30, 31]. 
In addition, applicability to the study population was variable across studies, with some studies deriving their models from a cohort sharing a specific trait that would limit generalizability to other patient populations [28, 32, 33], and others limited by having no external validation [27, 28, 32, 34]. Lastly, a large number of the studies had a high risk of bias owing to small sample sizes [25, 26, 30, 31, 40, 41], which creates the need for further validation both internally and externally.
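To make concrete why small samples weaken confidence in a reported AUC, the sketch below uses the classical Hanley and McNeil approximation for the standard error of an AUC. This is an illustrative technique chosen here, not a method used by this review or by the included studies, and the AUC value and sample sizes are hypothetical.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil approximation).

    n_pos: number of patients with the outcome (e.g., amputation).
    n_neg: number of patients without the outcome.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

def approx_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation 95% interval, clipped to the [0, 1] range."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

# Hypothetical cohorts reporting the same AUC of 0.90.
small = approx_ci(0.90, n_pos=10, n_neg=10)
large = approx_ci(0.90, n_pos=1000, n_neg=1000)
print(small, large)
```

Under this approximation, the same point estimate carries a much wider interval in the small cohort than in the large one, which is one reason a high AUC from a small derivation set calls for further validation.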
Given the current work done with ML and amputations, the results show the potential for ML to be clinically impactful. Although some authors provided online tools produced from their models [29, 36], the overall reliable application of the current models studied is limited. Increasing the breadth of data collected and standardizing the outcome measures would help to mitigate the heterogeneity seen across variables considered between models. Ultimately, despite the evidence that these models can be developed to accurately predict outcomes, more studies with a low risk of bias must be produced for these models to build credibility. The models then need to be taken into clinical settings to study their validity and utility in each cohort. Lastly, future research should investigate the outcomes of changes in management in cohorts applying ML-based risk stratification. The results of interventions such as increased surveillance and education in patients who are classified as higher risk for amputation should be clarified to understand the true extent of the impact that predicting amputation early will have.
In conclusion, this systematic review shows that multiple ML models with various target populations have been successfully derived that have the potential to be superior to traditional modeling techniques and comparable to prospective clinical judgment. Despite existing clinical decision-making tools, being able to accurately predict amputation as an outcome is a clinical question that has yet to be conclusively answered. There is notable interest in the applications of AI in this area, a body of research that has grown particularly in the last decade. Despite the promise, there are several limitations stalling the growth of these modeling technologies in a clinical context, including heterogeneity between database variables and therefore model features, and bias or lack of applicability in the derivation and validation of the models themselves. Although clinical decision-making tools based on these models are starting to be created, future research is needed that includes more robust databases designed to validate ML models against external cohorts in order to confidently apply this technology in clinical settings.
Kalbaugh CA, Strassle PD, Paul NJ, McGinigle KL, Kibbe MR, Marston WA. Trends in surgical indications for major lower limb amputation in the USA from 2000 to 2016. Eur J Vasc Endovasc Surg. 2020;60(1):88–96. pmid:32312664
Johansen K, Daines M, Howey T, Helfet D, Hansen S. Objective criteria accurately predict amputation following lower extremity trauma. J Trauma. 1990;30(5):568–73. pmid:2342140
Sarroca N, Valero J, Deus J, Casanova J, Luesma MJ, Lahoz M. Quality of life, body image and self-esteem in patients with unilateral transtibial amputations. Sci Rep. 2021;11(1):12559. pmid:34131211
Jensen TS, Krebs B, Nielsen J, Rasmussen P. Immediate and long-term phantom limb pain in amputees: Incidence, clinical characteristics and relationship to pre-amputation limb pain. Pain. 1985;21(3):267–78. pmid:3991231
Sahu A, Sagar R, Sarkar S, Sagar S. Psychological effects of amputation: A review of studies from India. Ind Psychiatry J. 2016;25(1):4–10. pmid:28163401
Pell JP, Donnan PT, Fowkes FGR, Ruckley CV. Quality of life following lower limb amputation for peripheral arterial disease. Eur J Vasc Surg. 1993;7(4):448–51. pmid:8359304
Butler DJ, Turkal NW, Seidl JJ. Amputation: preoperative psychological preparation. J Am Board Fam Pract. 1992;5(1):69–73. pmid:1561924
Eskridge SL, Hill OT, Clouser MC, Galarneau MR. Association of specific lower extremity injuries with delayed amputation. Mil Med. 2019;184(5–6):e323–9. pmid:30371883
Bondurant FJ, Cotler HB, Buckle R, Miller-Crotchett P, Browner BD. The medical and economic impact of severely injured lower extremities. J Trauma. 1988;28(8):1270–3. pmid:3137367
Dekker R, Hristova YV, Hijmans JM, Geertzen JHB. Pre-operative rehabilitation for dysvascular lower-limb amputee patients: A focus group study involving medical professionals. PLoS One. 2018;13(10):e0204726. pmid:30321178
Budinski S. Predictive factors for successful prosthetic rehabilitation after vascular transtibial amputation. 2021. Available from: https://hrcak.srce.hr/clanak/399166 pmid:35734483
Williams ZF, Bools LM, Adams A, Clancy TV, Hope WW. Early versus delayed amputation in the setting of severe lower extremity trauma. Am Surg. 2015;81(6):564–8. pmid:26031267
Jo SH, Kang SH, Seo WS, Koo BH, Kim HG, Yun SH. Psychiatric understanding and treatment of patients with amputations. Yeungnam Univ J Med. 2021;38(3):194–201. pmid:33971697
Schober TL, Abrahamsen C. Patient perspectives on major lower limb amputation–A qualitative systematic review. Int J Orthop Trauma Nurs. 2022;46:100958. pmid:35930959
Pena-Lopez I. Artificial intelligence in society. OECD. 2019.
Phillips SP, Spithoff S, Simpson A. Artificial intelligence and predictive algorithms in medicine: Promise and problems. Can Fam Physician. 2022;68(8):570–2. pmid:35961724
Yu C, Helwig EJ. The role of AI technology in prediction, diagnosis and treatment of colorectal cancer. Artif Intell Rev. 2022;55(1):323–43. pmid:34248245
Sankaran R, Kumar A, Parasuram H. Role of artificial intelligence and machine learning in the prediction of the pain: a scoping systematic review. Proc Inst Mech Eng H. 2022;236(10):1478–91. pmid:36148916
Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Hosp J. 2019;6(2):94–8. pmid:31363513
Amanpreet K. Machine learning-based novel approach to classify the shoulder motion of upper limb amputees. Biocybern Biomed En. 2019;39(3):857–67.
Griffiths B, Diment L, Granat MH. A machine learning classification model for monitoring the daily physical behaviour of lower-limb amputees. Sensors. 2021;21(22):7458. pmid:34833534
Juneau P, Baddour N, Burger H, Bavec A, Lemaire ED. Amputee fall risk classification using machine learning and smartphone sensor data from 2-minute and 6-minute walk tests. Sensors. 2022;22(5):1749. pmid:35270892
Covidence systematic review software [Internet]. Covidence. 2023. Available from: www.covidence.org
Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170(1):51–8. pmid:30596875
Du C, Li Y, Xie P, Zhang X, Deng B, Wang G, et al. The amputation and mortality of inpatients with diabetic foot ulceration in the COVID-19 pandemic and postpandemic era: A machine learning study. Int Wound J. 2022;19(6):1289–97. pmid:34818691
Lin C, Yuan Y, Ji L, Yang X, Yin G, Lin S. The amputation and survival of patients with diabetic foot based on establishment of prediction model. Saudi J Biol Sci. 2020;27(3):853–8. pmid:32127762
Ravaut M, Sadeghi H, Leung KK, Volkovs M, Kornas K, Harish V, et al. Predicting adverse outcomes due to diabetes complications with machine learning using administrative health data. Digit Med. 2021;4(1):24. pmid:33580109
Yang L, Gabriel N, Hernandez I, Winterstein AG, Guo J. Using machine learning to identify diabetes patients with canagliflozin prescriptions at high-risk of lower extremity amputation using real-world data. Pharmacoepidemiol Drug Saf. 2021;30(5):644–51. pmid:33606340
Wang S, Wang J, Zhu MX, Tan Q. Machine learning for the prediction of minor amputation in University of Texas grade 3 diabetic foot ulcers. PLoS One. 2022;17(12):e0278445. pmid:36472981
Bevevino AJ, Dickens JF, Potter BK, Dworak T, Gordon W, Forsberg JA. A model to predict limb salvage in severe combat-related open calcaneus fractures. Clin Orthop Relat Res. 2014;472(10):3002–9. pmid:24249536
Bolourani S, Thompson D, Siskind S, Kalyon BD, Patel VM, Mussa FF. Cleaning up the mess: can machine learning be used to predict lower extremity amputation after trauma-associated arterial injury? J Am Coll Surg. 2021;232(1). pmid:33022402
Perkins ZB, Yet B, Sharrock A, Rickard R, Marsh W, Rasmussen TE, et al. Predicting the outcome of limb revascularization in patients with lower-extremity arterial trauma: development and external validation of a supervised machine-learning algorithm to support surgical decisions. Ann Surg. 2020;272(4). pmid:32657917
Cox M, Reid N, Panagides JC, Di Capua J, DeCarlo C, Dua A, et al. Interpretable machine learning for the prediction of amputation risk following lower extremity infrainguinal endovascular interventions for peripheral arterial disease. Cardiovasc Intervent Radiol. 2022;45(5):633–40. pmid:35322303
Martínez-Jiménez MA, Ramirez-GarciaLuna JL, Kolosovas-Machuca ES, Drager J, González FJ. Development and validation of an algorithm to predict the treatment modality of burn wounds using thermographic scans: Prospective cohort study. PLoS One. 2018;13(11):e0206477. pmid:30427892
Schäfer Z, Mathisen A, Svendsen K, Engberg S, Rolighed Thomsen T, Kirketerp-Møller K. Toward machine-learning-based decision support in diabetes care: a risk stratification study on diabetic foot ulcer and amputation. Front Med. 2021;7:601602. pmid:33681236
Stefanopoulos S, Qiu Q, Ren G, Ahmed A, Osman M, Brunicardi FC, et al. A machine learning model for prediction of amputation in diabetics. J Diabetes Sci Technol. 2022;19322968221142900. pmid:36476059
Rahman MM, Davis DN. Addressing the class imbalance problem in medical datasets. IJMLC. 2013;224–8.
Ling CX, Huang J, Zhang H. AUC: a better measure than accuracy in comparing learning algorithms. Canadian conference on AI 2003. p. 329–341
Austin AM, Ramkumar N, Gladders B, Barnes JA, Eid MA, Moore KO, et al. Using a cohort study of diabetes and peripheral artery disease to compare logistic regression and machine learning via random forest modeling. BMC Med Res Methodol. 2022;22(1):300. pmid:36418976
Kasbekar PU, Goel P, Jadhav SP. A Decision Tree Analysis of Diabetic Foot Amputation Risk in Indian Patients. Front Endocrinol. 2017;8. pmid:28261156
Xie P, Li Y, Deng B, Du C, Rui S, Deng W, et al. An explainable machine learning model for predicting in‐hospital amputation rate of patients with diabetic foot ulcer. Int Wound J. 2022;19(4):910–8. pmid:34520110
Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLoS Med. 2021;18(3):e1003583. pmid:33780438
Togawa S, Yamami N, Nakayama H, Mano Y, Ikegami K, Ozeki S. The validity of the mangled extremity severity score in the assessment of upper limb injuries. J Bone Joint Surg Br. 2005;87-B(11):1516–9. pmid:16260670