TY - JOUR
T1 - Extended endocrine therapy in breast cancer: A basket of length-constraint feature selection metaheuristics to balance Type I against Type II errors
AU - Liu, Hua-Ping
AU - Zhang, Jian V
AU - Wang, Dongwen
AU - Albrecht, Andreas
AU - Steinhöfel, Kathleen
AU - Lai, Hung-Ming
N1 - Funding Information:
Jian V. Zhang is supported by the Shenzhen Basic Research Program (JCYJ20170413165233512) and the Shenzhen Key Laboratory of Metabolic Health. Dongwen Wang is supported by the Shenzhen High-level Hospital Construction Fund and the Sanming Project of Medicine in Shenzhen (No. SZSM202111003). The authors thank Mr. Yuan-Yi Chang for preliminary work toward preparing a Kaplan-Meier plot.
Publisher Copyright:
© 2022
PY - 2022/7
Y1 - 2022/7
AB - Extended endocrine therapy beyond 5 years is a major concern for ER+ breast cancer survivors. However, routinely used genomic tests designed for early recurrence risk may be unsuitable for predicting distant recurrence within 10 years in the extended treatment context. These tests were initially designed for high sensitivity, with Type I errors much higher than Type II errors. With lower positive predictive values (PPVs), these tests can produce many false positives: patients who might not need further treatment and could be spared its adverse effects on quality of life. As an alternative, we propose a top-down approach to these issues. We assembled 149 targeted genes from four genomic tests for 381 ER-positive, node-negative patients who were either metastasis-free beyond 10 years (n=202) or had metastasis within 10 years (n=179). Using a basket of SVM-wrapped length-constraint feature selection (LCFS) metaheuristics, we discovered four genomic SVMs that traded off Type I against Type II errors. Two independent cohorts were used to validate disease outcome predictions. A 36-gene SVM balanced sensitivity and PPV at good levels: 74% vs 76% on 10-fold cross-validation (n=347) and 75% vs 71% on a test set (n=34). Neither Oncotype DX RS (cutoff=18, 31, 60.97) nor PAM50 ROR-S (cutoff=29, 53, 61.18) could do so. In the independent cohorts, the 36-gene SVM predicted disease-free survival (DFS; n=136, HR=2.59; 95% CI, 1.4-4.8) and disease-specific survival (DSS; n=127, HR=4.06; 95% CI, 1.63-10.11) better than RS (DFS, HR=2.15; DSS, HR=3.86) and ROR-S (DFS, HR=2.29; DSS, HR=2.76). The case study demonstrates how we identified a genomic test that balances Type I against Type II errors for risk stratification. The top-down approach centered on the LCFS-metaheuristics basket is a generic methodology for clinical decision-making and quality of life using targeted profiling data where the number of dimensions (p) is smaller than the number of samples (n).
UR - http://www.scopus.com/inward/record.url?scp=85133103319&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2022.104112
DO - 10.1016/j.jbi.2022.104112
M3 - Article
C2 - 35680073
SN - 1532-0464
VL - 131
JO - JOURNAL OF BIOMEDICAL INFORMATICS
JF - JOURNAL OF BIOMEDICAL INFORMATICS
M1 - 104112
ER -