This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Faculty of Physics, University of Belgrade, Belgrade, Serbia
Clinic for Radiation Oncology, Institute of Oncology and Radiology of Serbia, Belgrade, Serbia
Faculty of Medicine, University of Belgrade, Belgrade, Serbia
The aim of this study was to investigate the application of classical machine learning algorithms to the detection of rectal tumors on computed tomography (CT) images of patients following neoadjuvant chemoradiotherapy. The study included 138 patients, of whom 136 had a confirmed tumor, while two subjects with a healthy rectum were included to balance the dataset. We extracted the CT slices in which a portion of the rectum was visualized, resulting in a dataset of 3566 images. Because rectal volume varies, we used statistical distribution functions of tissue density along the contour and within the interior of the segmented region as input features. The most relevant features were selected using Mutual Information (MI), a feature selection method, and Principal Component Analysis (PCA), a dimensionality reduction method. The selected features were then used to train and test seven machine learning algorithms: Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Classification and Regression Trees (CART), Naive Bayes (NB), k-Nearest Neighbors (KNN), and Random Forest (RF). We trained all models on the same feature sets and evaluated their performance using accuracy, sensitivity, and specificity, the latter two being crucial in medical diagnostics. The highest accuracy (~80%) was achieved by RF models trained on six and seven features selected by MI. The highest sensitivity (~85%) was obtained with RF and NB models using seven features. The highest specificity (~80%) was observed for the LDA model with three features and the RF model with seven features.
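For reference, the three reported metrics can be computed from per-slice labels as in the minimal sketch below. It assumes binary labels with 1 denoting a slice that contains tumor and 0 a tumor-free slice; the function name `binary_metrics`, the `BinaryMetrics` container, and the example labels are illustrative and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Assumption: 1 = tumor present, 0 = tumor absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Sensitivity: fraction of tumor slices that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of tumor-free slices correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return BinaryMetrics(accuracy, sensitivity, specificity)


# Illustrative labels only (not data from the study).
example_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
example_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
m = binary_metrics(example_true, example_pred)
print(f"accuracy={m.accuracy:.2f} "
      f"sensitivity={m.sensitivity:.2f} "
      f"specificity={m.specificity:.2f}")
# prints: accuracy=0.80 sensitivity=0.80 specificity=0.80
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are unequal in size, since accuracy alone can be dominated by the larger class.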
MM is supported by the Horizon Europe STEPUPIORS Project (HORIZON-WIDERA-2021-ACCESS-03, European Commission, Agreement No. 101079217) and the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Agreement No. 451-03-136/2025-03/200043). ED utilized computational resources provided by the National Data Centre of the Republic of Serbia.