Scientific article, 10 July 2025
Generative pretrained transformer models can function as highly reliable second screeners of titles and abstracts in systematic reviews: A proof of concept and common guidelines
Independent human double screening of titles and abstracts is a critical step in ensuring the quality of systematic reviews and the meta-analyses they contain. However, double screening is a resource-demanding procedure that slows the review process. To alleviate this issue, we evaluated the use of OpenAI's generative pretrained transformer (GPT) application programming interface (API) models as an alternative to human second screeners of titles and abstracts. We did so by developing a new benchmark scheme for interpreting the performance of automated screening tools against typical human screening performance in high-quality systematic reviews and by conducting three large-scale experiments on three psychological systematic reviews with different levels of complexity. Across all experiments, we show that the GPT API models can perform on par with, and in some cases even better than, typical human screening in terms of detecting relevant studies, while also showing high exclusion performance. In addition, we introduce multiprompt screening, that is, writing one concise prompt per inclusion/exclusion criterion in a review, and show that it can be a valuable tool for supporting screening in highly complex review settings. To support future implementation, we develop a reproducible workflow and a set of tentative guidelines for when and when not to use GPT API models as independent second screeners of titles and abstracts. Moreover, we present the R package AIscreenR to standardize the suggested application. Our ultimate aim is to make GPT API models acceptable as independent second screeners within high-quality systematic reviews, such as those published in Psychological Bulletin. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
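The idea of multiprompt screening and of comparing an automated screener against human decisions can be illustrated with a minimal Python sketch. It is not the authors' workflow and does not use the AIscreenR package or the OpenAI API. The criteria, the prompts, the `keyword_stub` function that stands in for a model call, and the rule "include only if every criterion is met" are all assumptions made for illustration, and the two rates computed at the end are one plausible reading of "detecting relevant studies" and "exclusion performance".

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical inclusion criteria, one concise prompt per criterion.
CRITERIA_PROMPTS: Dict[str, str] = {
    "population": "Does the study involve adult participants? Answer yes or no.",
    "design": "Is the study a randomized controlled trial? Answer yes or no.",
}


def keyword_stub(prompt: str, abstract: str) -> bool:
    """Stand-in for a model call (assumption): answers by keyword matching."""
    text = abstract.lower()
    if "adult" in prompt.lower():
        return "adults" in text
    return "randomized" in text


def multiprompt_screen(
    abstract: str,
    prompts: Dict[str, str],
    ask: Callable[[str, str], bool],
) -> Dict[str, object]:
    """Ask one question per criterion and combine the answers."""
    answers = {name: ask(prompt, abstract) for name, prompt in prompts.items()}
    # Assumed combination rule: include only if every criterion is met.
    return {"include": all(answers.values()), "answers": answers}


def recall_and_specificity(
    predicted: List[bool], reference: List[bool]
) -> Tuple[float, float]:
    """Share of relevant records kept and share of irrelevant records excluded."""
    relevant = sum(reference)
    irrelevant = len(reference) - relevant
    kept = sum(p and r for p, r in zip(predicted, reference))
    dropped = sum((not p) and (not r) for p, r in zip(predicted, reference))
    recall = kept / relevant if relevant else float("nan")
    specificity = dropped / irrelevant if irrelevant else float("nan")
    return recall, specificity


if __name__ == "__main__":
    abstracts = [
        "A randomized trial of a stress intervention for adults.",
        "A cross-sectional survey of adults.",
        "A randomized trial of a reading program for children.",
    ]
    human = [True, False, True]  # made-up reference decisions
    machine = [
        bool(multiprompt_screen(a, CRITERIA_PROMPTS, keyword_stub)["include"])
        for a in abstracts
    ]
    print(machine, recall_and_specificity(machine, human))
```

In practice, the `ask` argument would be replaced by a function that sends each prompt and the title and abstract to a language model and parses the reply. Keeping the per-criterion answers alongside the final decision makes it possible to see which criterion drove an exclusion.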
About this publication
Published in
Psychological Methods