Predictive Factors of Advanced Colonic Adenomas and Cancer Using Data Mining

Atieh Sadat Fatemian, Neda Abdolvand, Hamideh Salimzadeh, Alireza Delavari



Colorectal cancer is the third most common cancer in Iran. This study aimed to identify factors associated with the prevalence of advanced colonic neoplasms in a high-risk population.


Participants were 474 first-degree relatives of patients with colon cancer who underwent screening colonoscopy at the Digestive Disease Research Institute, Shariati Hospital, affiliated with Tehran University of Medical Sciences. The features examined were age, sex, body mass index, aspirin use, smoking, and the type of family relationship to the patient with cancer. The patient’s age at the time of cancer diagnosis and the number and sex of family members with colon cancer were also assessed. Data were analyzed with data mining methods, namely K-Medoid clustering and the C4.5 decision tree algorithm.
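As an illustration of the clustering step only, the following is a minimal sketch of an alternating K-Medoid procedure on mixed numeric and categorical features. It is not the authors' implementation: the records are made-up toy values rather than study data, the Gower-style distance and the choice of the first k records as initial medoids are assumptions, and the names `mixed_distance`, `feature_ranges`, and `k_medoids` are hypothetical.

```python
# Toy, made-up records (NOT study data): (age, BMI, sex, smoker)
RECORDS = [
    (42, 24.1, "F", False),
    (45, 26.3, "F", False),
    (47, 25.0, "M", False),
    (63, 29.8, "M", True),
    (66, 31.2, "M", True),
    (68, 30.5, "F", True),
]


def feature_ranges(records):
    """Return max - min for numeric columns and None for categorical ones."""
    ranges = []
    for col in zip(*records):
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in col
        )
        ranges.append(((max(col) - min(col)) or 1.0) if numeric else None)
    return ranges


def mixed_distance(a, b, ranges):
    """Gower-style distance (an assumption): scaled gap for numbers, 0/1 mismatch otherwise."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y) / r
    return total / len(a)


def k_medoids(records, k, max_iter=100):
    """Alternate between assigning points to the nearest medoid and re-picking medoids."""
    ranges = feature_ranges(records)
    n = len(records)
    dist = [[mixed_distance(records[i], records[j], ranges) for j in range(n)]
            for i in range(n)]
    medoids = list(range(k))  # simplification: first k records as initial medoids

    def assign(meds):
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:  # converged
            break
        medoids = new_medoids
    return medoids, assign(medoids)


medoid_idx, cluster_labels = k_medoids(RECORDS, k=2)
print(medoid_idx, cluster_labels)
```

On this toy data the first three and the last three records end up in separate clusters. Unlike k-means, each cluster centre is an actual record, which is why the method works with any pairwise distance, including one that mixes numeric and categorical features.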


The results showed that female sex of the patient with colon cancer and the patient's young age (< 60 years) at the time of cancer diagnosis were important predictive factors for the prevalence of advanced colorectal neoplasms among family members.
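The abstract does not state how the decision tree ranked these factors. For orientation, C4.5 conventionally selects splits by gain ratio (information gain divided by split information). The sketch below computes that quantity for categorical features; the data are made-up toy values with neutral names, not study results, and the function names `entropy` and `gain_ratio` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of a categorical feature: information gain / split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after splitting on the feature
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Made-up toy data (NOT study data)
outcome = ["yes", "yes", "no", "no"]
feature_a = ["x", "x", "y", "y"]  # separates the outcome perfectly
feature_b = ["p", "q", "p", "q"]  # carries no information about the outcome

ratio_a = gain_ratio(feature_a, outcome)
ratio_b = gain_ratio(feature_b, outcome)
print(ratio_a, ratio_b)
```

A feature with a higher gain ratio is placed nearer the root of the tree, which is one sense in which a tree-based analysis can call a factor "important".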


Data mining methods proved applicable for identifying predictive factors of advanced colorectal neoplasms within each cluster and decision tree.


Colorectal cancer - data mining - clustering - decision tree - CRISP methodology



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.