Prediction rule for estimating advanced colorectal neoplasm risk in average-risk populations in southern Jiangsu Province
Introduction
Colorectal neoplasm (CN) is one of the most common malignant tumors in China, and its incidence and mortality have shown a clear upward trend over the past 20 years (1,2). CN is currently thought to develop mainly from adenomatous polyps (AP), and 70-75% of newly diagnosed CN patients belong to the asymptomatic average-risk population (AARP) aged over 50 years (3). The AARP comprises people without CN-related or warning symptoms, without a personal or family history of CN or AP, and without hereditary CN or a history of inflammatory bowel disease (3). Colonoscopy screening of the AARP can detect early-stage CN and precancerous lesions and allow timely treatment, so the incidence and mortality of CN in this population would be expected to fall (4). However, compared with other screening methods such as the fecal occult blood test, colonoscopy is more costly and is invasive, so many scholars have questioned whether it should be the preferred screening method for the AARP (5). In addition, the AARP aged over 50 years is estimated to exceed 100 million people, so current health resources and economic conditions cannot support colonoscopy screening of the entire population (6).
Studies have shown that the AARP can be regarded as equivalent to a low-risk population (3,7). Therefore, if easily obtained information such as age and gender could accurately predict disease risk in the AARP (risk stratification) and direct screening colonoscopy toward those at higher risk, screening efficiency would improve greatly, screening costs would fall and limited health resources would be saved. In this study, we investigated the optimal predictors of advanced CN in the Han-nationality AARP of southern Jiangsu Province, aiming to establish a risk scoring system based on the occurrence of advanced CN in this population and to evaluate its effectiveness as a CN screening tool.
Materials and methods
Subjects
Members of the AARP who were permanently registered in southern Jiangsu Province underwent colonoscopy during routine health examinations at the Affiliated Yixing Hospital of Jiangsu University from Jul 2011 to Dec 2012. Inclusion criteria: scheduled for colonoscopy; aged ≥40 years; Han nationality; asymptomatic, or having only non-specific colorectal symptoms (mild abdominal pain, intermittent diarrhea or constipation). Exclusion criteria: (I) a first- or second-degree relative with a history of CN; (II) a first-degree relative aged under 60 years with a history of AP or a familial hereditary syndrome (including familial adenomatous polyposis, hereditary non-polyposis CN, Turcot syndrome, Oldfield syndrome, etc.); (III) a personal history of CN or polyp disease, inflammatory bowel disease, or tumors of other organs, etc.; (IV) iron deficiency anemia, a positive fecal occult blood test, hematochezia, significant weight loss, tenesmus or other such symptoms; (V) colonoscopy within the previous five years; or (VI) a history of colorectal surgery. This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the Affiliated Hospital of Jiangsu University. Written informed consent was obtained from all participants.
Methods
A cross-sectional design was used. Before colonoscopy, each subject completed a questionnaire covering demographic characteristics, past medical history, surgical history, medication history, smoking history, alcohol drinking history, tea drinking history, physical activity, dietary habits, defecation frequency, etc. Colonoscopy was performed by 2 experienced gastrointestinal endoscopy experts (equipment from Japan OLYMPUS Medical Co., Ltd.; host models CV-260SL and CLV-260SL; electronic colonoscope model Q260AL), who were also responsible for recording the colonoscopy findings. Advanced CN comprised advanced adenoma and invasive carcinoma. Advanced adenomas comprised adenomas with a diameter ≥1 cm, villous adenomas (villous component of at least 25%) and tubular adenomas with severe atypical hyperplasia. Invasive carcinoma was defined as a tumor whose malignant cells had invaded beyond the muscularis mucosae. Severe atypical hyperplasia included intramucosal carcinoma and carcinoma in situ. Only subjects with good bowel preparation who completed the whole colonoscopy (the colonoscope reached the ileocecal valve) were included in the study and statistically analyzed.
Statistical analysis
The database was built in EpiData 3.02 using double data entry. SPSS 19.0 statistical software (SPSS Inc., Chicago, IL, USA) was used for data analysis. All tests were two-sided. The significance level for the multivariate analysis was 0.05.
Establishment and evaluation of the advanced CN risk prediction model and its scoring system: the colonoscopy result (with or without advanced tumors) was the dependent variable, and the demographic characteristics and other variables were the independent variables in the single factor and multivariate analyses. Depending on the nature of each independent variable, the between-group t-test, Mann-Whitney U test or χ2 test was used for the single factor analysis. Independent variables that met the P-value threshold in the single factor analysis (6) were then entered into a multivariate logistic regression analysis to establish the risk prediction model for advanced CN. To facilitate clinical application, the continuous variables in this model were converted into categorical variables and the multivariate logistic regression analysis was repeated. Scores were then assigned according to the β value of each variable in the new logistic regression model (6,8) to establish the risk scoring system for advanced CN. Calibration (prediction consistency) was evaluated with the Hosmer-Lemeshow goodness of fit test (6). The discriminative ability of the risk prediction model and of its scoring system was evaluated with the area under the receiver operating characteristic (ROC) curve (6,8,9). Accuracy was evaluated with the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, positive likelihood ratio and negative likelihood ratio (6).
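The step of assigning scores from β values can be illustrated with a minimal sketch. The coefficient values below are hypothetical placeholders, not the fitted values from Table 2, and the rule used here (divide each β by the smallest β and round to the nearest integer) is one common convention assumed for illustration; the source does not spell out the exact rounding rule. Only the 2.5-point cutoff is taken from this study.

```python
import math

# Hypothetical beta coefficients (NOT the values fitted in this study).
HYPOTHETICAL_BETAS = {
    "age_50_59": 0.70,
    "age_60_plus": 1.40,
    "male": 0.75,
    "coronary_heart_disease": 1.10,
    "frequent_egg_intake": 0.72,
    "infrequent_defecation": 0.80,
}


def points_from_betas(betas):
    """Scale each beta by the smallest beta and round half up to an integer."""
    smallest = min(abs(b) for b in betas.values())
    return {name: math.floor(b / smallest + 0.5) for name, b in betas.items()}


def risk_score(present_factors, points):
    """Sum the points of the risk factors present in one subject."""
    return sum(points[name] for name in present_factors)


def is_high_risk(score, cutoff=2.5):
    """Classify a subject as high risk when the score exceeds the cutoff."""
    return score > cutoff


points = points_from_betas(HYPOTHETICAL_BETAS)
example_score = risk_score(["male", "age_60_plus"], points)
```

With these placeholder coefficients, a man aged 60 or over scores above the 2.5-point cutoff and would be referred for colonoscopy; the actual point values depend on the fitted model.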
Internal validation of the advanced CN risk scoring system: the non-parametric Bootstrap method was used (6,10). Specifically, a sample of the same size as the original data was drawn from it with replacement (a Bootstrap sample), and ROC curve analysis of the scoring system was performed on this sample to give an estimate of the area under the ROC curve. This sampling was repeated 1,000 times to give 1,000 estimates of the area under the ROC curve. The point estimate of the area under the ROC curve of the scoring system and its 95% confidence interval (95% CI) were then calculated according to normal distribution theory.
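The Bootstrap procedure described above can be sketched as follows. This is an illustrative reimplementation in plain Python, not the software routine used in the study; the function names, the random seed and the toy data are assumptions made for the example, and resamples containing only one outcome class are skipped because the area under the ROC curve is undefined for them.

```python
import random
import statistics


def auc(scores, labels):
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """Resample with replacement n_boot times; return mean AUC and a normal-theory 95% CI."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # need both outcome classes in the resample
            estimates.append(auc([scores[i] for i in idx], ys))
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    return mean, (mean - 1.96 * sd, mean + 1.96 * sd)


# Toy data (hypothetical): 40 subjects without and 10 with advanced CN.
toy_labels = [0] * 40 + [1] * 10
toy_scores = [0, 1, 1, 2, 2, 3, 3, 4] * 5 + [2, 3, 3, 4, 4, 5, 5, 6, 6, 7]
mean_auc, (ci_low, ci_high) = bootstrap_auc(toy_scores, toy_labels)
```

Because the interval is built from the standard deviation of the resampled estimates, its upper limit is not constrained to stay below 1; a percentile interval would be an alternative design.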
Comparison with other similar scoring systems: previously reported similar scoring systems included those established by Cai et al. (6), Betés et al. (11) and Lin et al. (12). The areas under the ROC curves and the 95% CIs of these three scoring systems were calculated in the study population. The U test was used to compare the area under the ROC curve of the scoring system established in this study with that of each of the other three scoring systems.
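A comparison of two areas under the ROC curve can be sketched as a z test on their difference. The sketch below uses the Hanley and McNeil standard error (9); whether the study used exactly this form is an assumption. The correlation term `r` defaults to 0, which treats the two areas as independent; since all scoring systems were applied to the same subjects, a paired analysis would set `r` above 0. The comparison value of 0.70 is a hypothetical placeholder, as the areas of the other systems are not given in the text.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_aucs(auc1, auc2, n_pos, n_neg, r=0.0):
    """Two-sided z test for the difference between two AUCs.

    r is the assumed correlation between the two areas (0 = independent).
    """
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 48 subjects with and 857 without advanced CN, as in this study.
se_this_study = hanley_mcneil_se(0.75, n_pos=48, n_neg=857)
# 0.70 is a hypothetical AUC for a comparison scoring system.
z_stat, p_value = compare_aucs(0.75, 0.70, n_pos=48, n_neg=857)
```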
Results
Subject characteristics
A total of 985 eligible subjects were enrolled, of whom 905 (91.9%) completed the whole colonoscopy and were included in the subsequent statistical analysis. Of these 905 subjects, 393 (43.4%) were male, the mean age was 56.6±10.1 years, and 48 (5.3%) had advanced tumors. AP was found in 100 cases (11.5%), including 2 cases of malignant transformation (2%).
Establishment and evaluation of risk prediction model and its scoring system of advanced CN
The single factor analysis showed that age, gender, educational level, hypertension, coronary heart disease, smoking, alcohol drinking, tea drinking, green vegetable intake, egg intake and defecation frequency were potential risk predictors of advanced CN (Table 1). The multivariate analysis showed that age, gender, coronary heart disease, egg intake and defecation frequency were independent risk predictors of advanced CN (Table 1). The Hosmer-Lemeshow goodness of fit test indicated that the logistic regression model fitted well (P=0.174). The area under the ROC curve (95% CI) was 0.76 (0.70-0.82).
To facilitate clinical application, the continuous variable “age” in the above model was converted into a categorical variable and the logistic regression analysis was repeated (Table 2). The Hosmer-Lemeshow goodness of fit test indicated that the new logistic regression model fitted well (P=0.205). The area under the ROC curve (95% CI) was 0.76 (0.70-0.82) (Table 2). As the score increased, the proportion of subjects with advanced CN tended to increase (Table 3). ROC curve analysis of the scoring system showed that the area under the ROC curve (95% CI) was 0.75 (0.69-0.82).
Based on the ROC curve of the scoring system, 2.5 points was set as the cutoff value, dividing the subjects into a low-risk population for advanced CN (411 cases, 45.4%) and a high-risk population (494 cases, 54.6%) (Table 3). The proportion of advanced CN patients in the high-risk population (>2 points) was 9.1% (45/494), significantly higher than that in the low-risk population (0-2 points) (0.7%, 3/411) (Table 3). At this cutoff, the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, positive likelihood ratio and negative likelihood ratio of the scoring system as a screening tool were 93.8%, 47.6%, 50.1%, 9.1%, 99.3%, 1.79 and 0.13, respectively. Thus 93.8% (45/48) of the advanced CN cases fell in the high-risk population. In the whole population, the low-risk population and the high-risk population, the numbers of colonoscopies needed to detect 1 case of advanced CN were 19, 137 and 11, respectively (Table 3).
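The accuracy measures at the 2.5-point cutoff follow directly from the counts given above: 45 of the 494 high-risk subjects and 3 of the 411 low-risk subjects had advanced CN, leaving 449 false positives and 408 true negatives. The sketch below recomputes them; the function names are illustrative, and rounding the number of colonoscopies up to a whole number is an assumption about the convention used.

```python
import math


def screening_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 screening table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def colonoscopies_per_case(n_screened, n_cases):
    """Colonoscopies needed to detect one case, rounded up (assumed convention)."""
    return math.ceil(n_screened / n_cases)


# Counts derived from the figures reported in this section.
metrics = screening_metrics(tp=45, fp=494 - 45, fn=3, tn=411 - 3)
needed_all = colonoscopies_per_case(905, 48)
needed_low = colonoscopies_per_case(411, 3)
needed_high = colonoscopies_per_case(494, 45)
```

Rounded to the precision used in the text, these reproduce the reported sensitivity, specificity, accuracy, predictive values, likelihood ratios and the colonoscopy numbers of 19, 137 and 11.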
Internal population verification of risk scoring system of advanced CN
Validation with the non-parametric Bootstrap method showed that the mean area under the ROC curve (95% CI) of the scoring system was 0.75 (0.70-0.82), similar to the ROC curve result in the derivation sample.
Comparison with other similar scoring systems
Among the 4 scoring systems, the one established in this study had the best discriminative ability. It was significantly better than the scoring system established by Cai et al. (6) (P=0.036), but did not differ significantly from the scoring systems established by Betés et al. (11) and Lin et al. (12) (P>0.05) (Table 4).
Discussion
In this study, we sought the optimal predictors of advanced CN in the Han-nationality AARP of southern Jiangsu Province in order to establish a risk scoring system based on the occurrence of advanced CN in this population and to evaluate its effectiveness as a CN screening tool. The resulting risk scoring system for advanced CN consisted of five variables, namely age, gender, coronary heart disease, egg intake and defecation frequency. It showed good calibration and discriminative ability, with high sensitivity and negative predictive value, and could be used for the initial screening of advanced CN in the AARP. Subjects classified as low risk by the scoring system could undergo only regular follow-up or fecal occult blood testing, while those classified as high risk would be recommended for full colonoscopy to identify potential colorectal lesions. This study could help to build CN screening strategies tailored to southern Jiangsu Province, thereby improving the efficiency of CN screening in the region, reducing screening costs and saving health resources.
Several lines of evidence suggest that the risk scoring system for advanced CN established in this study is effective, accurate and reliable, and could be used as a CN screening tool for the AARP in southern Jiangsu Province. It has a particular advantage in detecting early CN and precancerous lesions. Firstly, in both the derivation and the internal validation, the scoring system showed good discriminative ability and could distinguish the high-risk and low-risk populations. Secondly, the sensitivity of the scoring system reached 93.8%, so the majority of CN cases and pre-malignant lesions fell in the high-risk population and the missed-diagnosis rate was low, and the negative predictive value was high (99.3%), so the system would be especially suitable as an early CN screening tool in population screening. Thirdly, compared with the previously reported similar scoring systems, this scoring system had higher discriminative ability in the population of southern Jiangsu Province and would be more suitable for CN screening in this region. Fourthly, in the whole population, the low-risk population and the high-risk population, the numbers of colonoscopies needed to detect 1 case of advanced CN were 19, 137 and 11, respectively (Table 3). A risk stratification screening strategy based on this study would detect 93.8% of advanced CN cases (45/48) while avoiding 45.4% of colonoscopies (411/905), and therefore would substantially improve population screening efficiency. Fifthly, older age, male gender, a history of coronary heart disease, frequent egg intake and infrequent defecation (once every 2 or more days) were independent risk factors for advanced CN, which is consistent with results reported in the literature (13-21).
Numerous studies have confirmed that the risk of CN increases significantly with age (13,14), so foreign screening guidelines recommend that CN screening of the AARP should start at 50 years of age (15). A number of studies in different ethnic groups have confirmed that the risk of advanced CN is significantly higher in males than in females (16-18). CN often coexists with coronary heart disease, so the two diseases are presumed to share common risk factors (19). A few studies have shown that frequent egg intake may increase the risk of a variety of malignancies, including CN (20). Observational studies have shown that constipation might increase the risk of CN (21).
This study had the following main shortcomings: (I) the sample size was relatively small (48 of the 905 eligible subjects had advanced CN), so all the subjects had to be used for model derivation, and the results still need to be verified in larger external populations; (II) many of the potential risk factors in this cross-sectional study were self-reported from the subjects' memories, which might introduce recall bias. Despite these limitations, our findings indicate that the scoring system established in this study was accurate, effective and credible in this population.
In summary, the risk scoring system for advanced CN established in this study had good calibration and discriminative ability, with high sensitivity and negative predictive value. It may also allow CN or pre-malignant lesions to be detected earlier. The results could help to establish CN screening strategies tailored to southern Jiangsu Province. Applying this scoring system to the initial CN screening of the AARP in southern Jiangsu Province could be expected to improve CN screening effectiveness, reduce screening costs and save health resources.
Acknowledgements
This study was supported by Jiangsu Provincial Wuxi Science and Technology Bureau Project (No. CSZ00N1248).
Disclosure: The authors declare no conflict of interest.
References
1. Zhang J, Dhakal IB, Zhao Z, et al. Trends in mortality from cancers of the breast, colon, prostate, esophagus, and stomach in East Asia: role of nutrition transition. Eur J Cancer Prev 2012;21:480-9.
2. Chen HM, Weng YR, Jiang B, et al. Epidemiological study of colorectal adenoma and cancer in symptomatic patients in China between 1990 and 2009. J Dig Dis 2011;12:371-8.
3. Nelson RS, Thorson AG. Colorectal cancer screening. Curr Oncol Rep 2009;11:482-9.
4. Schoenfeld P, Cash B, Flood A, et al. Colonoscopic screening of average-risk women for colorectal neoplasia. N Engl J Med 2005;352:2061-8.
5. Sung J. Does fecal occult blood test have a place for colorectal cancer screening in China in 2006? Am J Gastroenterol 2006;101:213-5.
6. Cai QC, Yu ED, Xiao Y, et al. Derivation and validation of a prediction rule for estimating advanced colorectal neoplasm risk in average-risk Chinese. Am J Epidemiol 2012;175:584-93.
7. Lieberman D. Screening for colorectal cancer in average-risk populations. Am J Med 2006;119:728-35.
8. Moons KG, Harrell FE, Steyerberg EW. Should scoring rules be based on odds ratios or regression coefficients? J Clin Epidemiol 2002;55:1054-5.
9. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
10. Steyerberg EW, Bleeker SE, Moll HA, et al. Internal and external validation of predictive models: a simulation study of bias and precision in small samples. J Clin Epidemiol 2003;56:441-7.
11. Betés M, Muñoz-Navas MA, Duque JM, et al. Use of colonoscopy as a primary screening test for colorectal cancer in average risk people. Am J Gastroenterol 2003;98:2648-54.
12. Lin OS, Kozarek RA, Schembre DB, et al. Risk stratification for colon neoplasia: screening strategies using colonoscopy and computerized tomographic colonography. Gastroenterology 2006;131:1011-9.
13. Kolligs FT, Crispin A, Munte A, et al. Risk of advanced colorectal neoplasia according to age and gender. PLoS One 2011;6:e20076.
14. Thoma MN, Castro F, Golawala M, et al. Detection of colorectal neoplasia by colonoscopy in average-risk patients age 40-49 versus 50-59 years. Dig Dis Sci 2011;56:1503-8.
15. Rex DK, Johnson DA, Anderson JC, et al. American College of Gastroenterology guidelines for colorectal cancer screening 2009. Am J Gastroenterol 2009;104:739-50.
16. Hassan C, Pooler BD, Kim DH, et al. Computed tomographic colonography for colorectal cancer screening: risk factors for the detection of advanced neoplasia. Cancer 2013;119:2549-54.
17. Corley DA, Jensen CD, Marks AR, et al. Variation of adenoma prevalence by age, sex, race, and colon location in a large population: implications for screening and quality programs. Clin Gastroenterol Hepatol 2013;11:172-80.
18. Nguyen SP, Bent S, Chen YH, et al. Gender as a risk factor for advanced neoplasia and colorectal cancer: a systematic review and meta-analysis. Clin Gastroenterol Hepatol 2009;7:676-81.
19. Chan AO, Lam KF, Tong T, et al. Coexistence between colorectal cancer/adenoma and coronary artery disease: results from 1382 patients. Aliment Pharmacol Ther 2006;24:535-9.
20. Aune D, De Stefani E, Ronco AL, et al. Egg consumption and the risk of cancer: a multisite case-control study in Uruguay. Asian Pac J Cancer Prev 2009;10:869-76.
21. Power AM, Talley NJ, Ford AC. Association between constipation and colorectal cancer: systematic review and meta-analysis of observational studies. Am J Gastroenterol 2013;108:894-903.