Genomics xxx (xxxx) xxx–xxx
linear predictor η (eta), which represents the product of the covariate vector x and the parameter vector β. Patients with η > 0 were classified as high-risk and those with η < 0 as low-risk. This study complied with the reporting recommendations for tumour marker prognostic studies (REMARK; Supplementary Table S1) [29,30]. Pathway analysis was performed using the ToppGene Suite for gene list enrichment analysis and candidate gene prioritization, which detects functional enrichment of input gene lists based on gene Puromycin.
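The risk-group rule described above can be illustrated with a minimal sketch. The covariate values and coefficients below are hypothetical and are not taken from the study; the handling of η = 0 is also an assumption, since the text only defines the groups for η > 0 and η < 0.

```python
import numpy as np

def risk_groups(X, beta):
    """Compute the linear predictor eta = X @ beta and assign risk groups.

    X    : (n_patients, n_covariates) covariate matrix
    beta : (n_covariates,) parameter vector from a fitted model
    """
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    # eta > 0 -> high risk, eta < 0 -> low risk.
    # eta == 0 is labelled "undefined" here (assumption, not from the source).
    labels = np.where(eta > 0, "high", np.where(eta < 0, "low", "undefined"))
    return eta, labels

# Hypothetical example values, for illustration only
X = np.array([[1.0, 0.5],
              [-0.2, 0.1],
              [0.0, 0.0]])
beta = np.array([0.8, -0.4])
eta, labels = risk_groups(X, beta)
```

In practice the coefficients would come from the fitted survival model, and the covariates would need the same scaling or centring that was used when the model was fitted.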
2.6. Comparison with other signatures
The commercially available Oncotype Dx assay analyses 21 genes, of which 16 are cancer-related and five serve as reference genes for normalization. A multivariable model was fitted to the gene expression microarray data using the 16 cancer-related genes. The 12-gene signature developed by Habermann and colleagues comprises a set of 12 genes that predict clinical outcome based on genomic instability. A multivariable model was fitted using the 11 of these 12 genes that were included on the Illumina HumanHT-12 gene expression array (NXF1, JAKMIP2, DNALI1, TBC1D9, MYB, CDKN2A, MMP17, RERG, CCL18, AURKA, FOXA1). C-indices and AUC(t) functions were obtained as described above.
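To make the comparison metric concrete, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is only one of several C-index estimators and may differ from the one used in this study; the function name and the example values are hypothetical.

```python
def harrell_c(time, event, score):
    """Harrell's C-index for right-censored data.

    time  : follow-up times
    event : 1 if the event was observed, 0 if censored
    score : risk score (e.g. a linear predictor); higher means higher risk
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if patient i had an observed event
            # strictly before patient j's follow-up time.
            if event[i] and time[i] < time[j]:
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5  # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical example values, for illustration only
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_good = harrell_c(times, events, [3.0, 2.0, 1.0, 0.5])
c_bad = harrell_c(times, events, [0.5, 1.0, 2.0, 3.0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the signatures can be compared by the C-index of their respective linear predictors.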
3.1. Genome stability-dependent segregation of copy number and gene expression patterns
The G2I algorithm stratified the cohort into tumours with stable (n = 94; defined as G2I-1 and G2I-2) and unstable genomes (n = 42; G2I-3) based on copy number alterations (CNAs) (Fig. 1A). Tumours identified as G2I-1 exhibited a stable genome, while G2I-3 tumours clearly showed severely altered genomes. G2I-2 tumours were more stable compared to G2I-3 tumours but displayed gains on 1q, 8q, 16p, and 17q as well as losses on 6q, 8p, 11q, and 16q. Hierarchical clustering of the CNAs showed tumours with unstable genomes forming two distinct clusters separate from tumours with stable genomes (Fig. 1B). Principal component analysis (PCA) confirmed the separation seen in the clustering analysis, as tumours with stable genomes formed a dense cluster while genomically unstable tumours showed greater variability (Fig. 1C).
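The kind of PCA projection described above can be sketched as follows. The data are simulated purely for illustration (a low-variance "stable" group and a high-variance "unstable" group) and are not the study's copy number data; the function name, group sizes, and feature count are assumptions.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Project samples (rows) onto the leading principal components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)          # centre each feature
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)        # variance fraction per PC
    return scores, explained[:n_components]

# Simulated data (hypothetical): rows are tumours, columns are genomic segments
rng = np.random.default_rng(0)
stable = rng.normal(0.0, 0.1, size=(20, 50))     # little copy number variation
unstable = rng.normal(0.0, 1.0, size=(10, 50))   # strong copy number variation
scores, explained = pca_scores(np.vstack([stable, unstable]))
```

Plotting the two columns of `scores` would show the simulated stable samples packed tightly together and the unstable samples spread out, which is the qualitative pattern the text reports for Fig. 1C.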
Next, we applied hierarchical clustering to the matched gene expression data for the cohort and detected two distinct clusters containing primarily tumours with either stable or unstable genomes (Fig. 1D). Again, the PCA plot showed a clear stratification of the cohort based on genome stability, but genomically stable tumours formed a less dense cluster when using gene expression data compared to copy number data (Fig. 1E). In addition, genome stability status was significantly associated with DNA-related clinical parameters such as DNA index, S-phase fraction, and ploidy (Table 1). An association was also found between genome stability and GGI as well as survival-related parameters, i.e. risk groups identified by the linear predictor, survival status, and follow-up time. Stratification of clinical features based on the risk groups highlighted the same statistically significant association with GGI, survival-related (survival status, follow-up time) and DNA-related parameters (DNA index, S-phase fraction, ploidy; Supplementary Table S2). Additionally, the lymph node ratio, molecular subtypes, ER, PR and HER2 status were significantly associated with the risk groups based on the linear predictor.
3.2. Differential expression of 335 transcripts associated with genomic instability
In the next step, we aimed to identify genes responsible for the stratification of the cohort using logistic regression. A total of 335