Published April 4, 2023 | Supplemental Material + Published
Journal Article Open

Human visual explanations mitigate bias in AI-based assessment of surgeon skills

Abstract

Artificial intelligence (AI) systems can now reliably assess surgeon skills through videos of intraoperative surgical activity. With such systems informing future high-stakes decisions such as whether to credential surgeons and grant them the privilege to operate on patients, it is critical that they treat all surgeons fairly. However, it remains an open question whether surgical AI systems exhibit bias against surgeon sub-cohorts, and, if so, whether such bias can be mitigated. Here, we examine and mitigate the bias exhibited by a family of surgical AI systems, SAIS, deployed on videos of robotic surgeries from three geographically diverse hospitals (USA and EU). We show that SAIS exhibits an underskilling bias, erroneously downgrading surgical performance, and an overskilling bias, erroneously upgrading surgical performance, at different rates across surgeon sub-cohorts. To mitigate such bias, we leverage a strategy, TWIX, which teaches an AI system to provide a visual explanation for its skill assessment that otherwise would have been provided by human experts. We show that whereas baseline strategies inconsistently mitigate algorithmic bias, TWIX can effectively mitigate the underskilling and overskilling bias while simultaneously improving the performance of these AI systems across hospitals. We discovered that these findings carry over to the training environment where we assess medical students' skills today. Our study is a critical prerequisite to the eventual implementation of AI-augmented global surgeon credentialing programs, ensuring that all surgeons are treated fairly.

Additional Information

© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Research reported in this publication was supported by the National Cancer Institute under Award No. R01CA251579-01A1.

Contributions. D.K. contributed to the conception of the study and the study design, developed the deep learning models, and wrote the manuscript. J.L. collected the data from the training environment. D.K., J.L., T.F.H., and M.O. provided annotations for the video samples. D.A.D. and Q.-D.T. provided feedback on the manuscript. C.W. collected data from St. Antonius Hospital and B.J.M. collected data from Houston Methodist Hospital and provided feedback on the manuscript. A.J.H. and A.A. provided supervision and contributed to edits of the manuscript.

Data availability. The videos of live surgical procedures from the University of Southern California, St. Antonius Hospital, and Houston Methodist Hospital are not publicly available. However, the videos and the corresponding annotations of the suturing activities performed by medical students in the training environment are available upon reasonable request from the authors.

Code availability. All models were developed using Python and standard deep learning libraries such as PyTorch [61]. The code for the underlying model (SAIS) can be accessed at https://github.com/danikiyasseh/SAIS and that for TWIX can be accessed at https://github.com/danikiyasseh/TWIX.

Competing interests. The authors declare no competing non-financial interests but the following competing financial interests: D.K. is a paid consultant of Flatiron Health and an employee of Vicarious Surgical, C.W. is a paid consultant of Intuitive Surgical, A.A. is an employee of Nvidia, and A.J.H. is a consultant of Intuitive Surgical.

Attached Files

Published - 41746_2023_Article_766.pdf

Supplemental Material - 41746_2023_766_MOESM1_ESM.pdf

Files (1.9 MB)

md5:82026a029cfea8ce479092aabb71246f (400.8 kB)
md5:e30b8dff81ac11364a0b83bc41355d77 (1.5 MB)

Additional details

Created: August 22, 2023
Modified: October 18, 2023