Burak Kocak, Tugba Akinci D'Antonoli, Nathaniel Mercaldo, Angel Alberich-Bayarri, Bettina Baessler, Ilaria Ambrosini, Anna E Andreychenko, Spyridon Bakas, Regina G H Beets-Tan, Keno Bressem, Irene Buvat, Roberto Cannella, Luca Alessandro Cappellini, Armando Ugo Cavallo, Leonid L Chepelev, Linda Chi Hang Chu, Aydin Demircioglu, Nandita M deSouza, Matthias Dietzel, Salvatore Claudio Fanni, Andrey Fedorov, Laure S Fournier, Valentina Giannini, Rossano Girometti, Kevin B W Groot Lipman, Georgios Kalarakis, Brendan S Kelly, Michail E Klontzas, Dow-Mu Koh, Elmar Kotter, Ho Yun Lee, Mario Maas, Luis Marti-Bonmati, Henning Müller, Nancy Obuchowski, Fanny Orlhac, Nikolaos Papanikolaou, Ekaterina Petrash, Elisabeth Pfaehler, Daniel Pinto Dos Santos, Andrea Ponsiglione, Sebastià Sabater, Francesco Sardanelli, Philipp Seeböck, Nanna M Sijtsema, Arnaldo Stanzione, Alberto Traverso, Lorenzo Ugga, Martin Vallières, Lisanne V van Dijk, Joost J M van Griethuysen, Robbert W van Hamersvelt, Peter van Ooijen, Federica Vernuccio, Alan Wang, Stuart Williams, Jan Witowski, Zhongyi Zhang, Alex Zwanenburg, Renato Cuocolo
Insights Imaging . 2024 Jan 17;15(1):8. doi: 10.1186/s13244-023-01572-w.
Purpose: To propose a new quality scoring tool, METhodological RadiomICs Score (METRICS), to assess and improve research quality of radiomics studies.
Methods: We conducted an online modified Delphi study with a group of international experts. It was performed in three consecutive stages: Stage#1, item preparation; Stage#2, panel discussion among EuSoMII Auditing Group members to identify the items to be voted; and Stage#3, four rounds of the modified Delphi exercise by panelists to determine the items eligible for the METRICS and their weights. The consensus threshold was 75%. Based on the median ranks derived from expert panel opinion and their rank-sum based conversion to importance scores, the category and item weights were calculated.
Result: In total, 59 panelists from 19 countries participated in selection and ranking of the items and categories. Final METRICS tool included 30 items within 9 categories. According to their weights, the categories were in descending order of importance: study design, imaging data, image processing and feature extraction, metrics and comparison, testing, feature processing, preparation for modeling, segmentation, and open science. A web application and a repository were developed to streamline the calculation of the METRICS score and to collect feedback from the radiomics community.
Conclusion: In this work, we developed a scoring tool for assessing the methodological quality of the radiomics research, with a large international panel and a modified Delphi protocol. With its conditional format to cover methodological variations, it provides a well-constructed framework for the key methodological concepts to assess the quality of radiomic research papers.
Critical relevance statement: A quality assessment tool, METhodological RadiomICs Score (METRICS), is made available by a large group of international domain experts, with transparent methodology, aiming at evaluating and improving research quality in radiomics and machine learning.