Published August 2022 | public
Conference Paper

A Semi-automatic Indexing Pipeline for Medical Document Retrieval in Resource-constrained Settings

Abstract

Medical document indexing can benefit from both automation and human feedback. This research develops a semi-automatic indexing pipeline (SIP) for medical document retrieval in resource-constrained settings. The SIP includes an affordable and efficient automated process for preparing and indexing continuing medical education documents and a human feedback loop to validate recommended terms. It leverages pre-trained named-entity recognition (NER) models to identify appropriate terms from the Medical Subject Headings (MeSH) vocabulary and higher-level subject terms from the Unified Medical Language System (UMLS). The SIP achieved a precision of 59%, a recall of 64%, and an F1 score of 61% based on the expert evaluation of 124 distinct medical documents. The combination of automation with a human expert feedback loop demonstrates a model strategy for an affordable and practical approach to document indexing in resource-limited yet critical services. The SIP may be extended to other environments and information sources to improve the efficiency and accuracy of information retrieval.
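
As a reading aid for the reported metrics, the following minimal sketch shows how precision, recall, and F1 are conventionally computed when recommended index terms are compared against expert-validated terms. It is an illustration under stated assumptions, not the authors' evaluation code: the function names and the example terms are hypothetical, and the paper may aggregate scores across its 124 documents differently.

```python
def harmonic_f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(recommended, validated):
    """Compare recommended terms with expert-validated terms for one document.

    Both arguments are iterables of term strings (hypothetical inputs).
    """
    recommended, validated = set(recommended), set(validated)
    true_positives = len(recommended & validated)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall, harmonic_f1(precision, recall)


# Hypothetical terms for a single document, for illustration only.
recommended_terms = {"Hypertension", "Diabetes Mellitus", "Asthma"}
validated_terms = {"Hypertension", "Asthma", "Stroke", "Sepsis"}
p, r, f1 = precision_recall_f1(recommended_terms, validated_terms)

# Consistency check on the figures reported in the abstract.
reported_f1 = harmonic_f1(0.59, 0.64)
```

Under the standard definition, a precision of 0.59 and a recall of 0.64 give a harmonic mean of roughly 0.614, which is consistent with the reported F1 score of 61%.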

Additional Information

© 2022, the Author(s). This material is brought to you by the Americas Conference on Information Systems (AMCIS) at AIS Electronic Library (AISeL). It has been accepted for inclusion in AMCIS 2022 Proceedings by an authorized administrator of AIS Electronic Library (AISeL).

Additional details

Created: August 20, 2023
Modified: October 24, 2023