Studies from the UK and the US have attempted to identify BC recurrence using administrative data [6,14–16,18,32]. When Chubak et al. evaluated radiotherapy as an indicator of recurrence, only 40–50% of these patients had recurrence according to the gold standard. For procedural codes to count as indicators, we required a co-diagnosis of BC, metastasis or BC recurrence, and reached a PPV of 93% (Table 4).
It has been suggested that metastasis codes are not well suited for identification of recurrent cancer. Nevertheless, the diagnosis codes for metastasis and recurrence reached a SEN of 87% in our study (Table 4). Two previous studies have reported PPVs of 87% and 94% when using codes for metastasis and second primary cancer [6,16]. In our study, a metastasis diagnosis code was disregarded as an indicator of recurrence if a second primary cancer was diagnosed before the metastasis, as we could not determine whether the metastasis originated from the BC or from the second primary cancer. This may have caused the high PPV of 100% (Table 4) when evaluating diagnosis codes alone.
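The exclusion rule for metastasis codes described above can be sketched as follows. This is a minimal illustration and not the study's actual implementation: the function name and arguments are hypothetical, and treating a second primary cancer diagnosed on the same day as the metastasis as "not before" is an assumption, since the text only specifies "before".

```python
from datetime import date
from typing import Optional


def metastasis_counts_as_indicator(
    metastasis_date: date,
    second_primary_date: Optional[date],
) -> bool:
    """Decide whether a metastasis diagnosis code may indicate BC recurrence.

    Hypothetical helper: the code is disregarded when a second primary
    cancer was diagnosed before the metastasis, because the origin of the
    metastasis is then ambiguous.
    """
    if second_primary_date is None:
        # No second primary cancer on record: keep the metastasis code.
        return True
    # Assumption: a same-day diagnosis does not count as "before".
    return not (second_primary_date < metastasis_date)
```

For example, a metastasis code dated 1 June 2012 would be kept for a patient with no second primary cancer, but disregarded if a second primary cancer had been diagnosed on 1 March 2012.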
Hasset et al. included diagnosis codes of secondary malignant neoplasms and chemotherapy codes as indicators of colorectal, breast and prostate cancer recurrence. For BC, the algorithm reached a SEN of 80%, but a PPV of only 30%. For chemotherapy alone, the PPV was as low as 11%. Chemotherapy appears to be a source of many false positives. Therefore, we disregarded it as an indicator of recurrence in the present study. In Warren et al., both the inclusion criterion and the gold standard of recurrence were death from BC. Hence, women successfully treated for recurrence were not included in the study. The study is also limited by including only patients with recurrence according to the gold standard, which precludes assessment of the SPE and PPV of the algorithm. These results underline the importance of having a representative gold standard population and considering both SPE and SEN when evaluating the performance of an algorithm. Broad criteria for indicators of recurrence would yield a high SEN, but overly broad criteria would generate false positives and introduce bias.
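The dependence of these measures on the composition of the gold standard population can be made concrete with a short sketch. The function and the counts below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies. The second call mimics a population restricted to gold-standard recurrences, where no true negatives or false positives can occur.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the measure is undefined."""
    return numerator / denominator if denominator > 0 else None


def validation_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Compute SEN, SPE and PPV from a 2x2 table against a gold standard."""
    return {
        "SEN": _ratio(tp, tp + fn),  # share of true recurrences that were flagged
        "SPE": _ratio(tn, tn + fp),  # share of non-recurrences that were not flagged
        "PPV": _ratio(tp, tp + fp),  # share of flagged patients with true recurrence
    }


# Hypothetical counts: broad indicator criteria in a representative population.
full_population = validation_measures(tp=80, fp=120, fn=20, tn=780)

# Hypothetical counts: only patients with gold-standard recurrence included.
recurrence_only = validation_measures(tp=80, fp=0, fn=20, tn=0)
```

With these invented counts, the representative population gives a SEN of 0.80 alongside a PPV of 0.40, whereas the recurrence-only population leaves the SPE undefined and makes the PPV trivially 1.0, so neither measure is informative there.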
Only two studies have assessed the accuracy of determining a date of recurrence through an algorithm [16,32]. In Hasset et al., recurrence dates produced by their algorithm fell within 30 days of the gold standard date in 20–36% of the cases. Our algorithm identified the recurrence date more accurately: 66% of the dates fell within 30 days of the gold standard recurrence date (Table A.2 in Appendix). Our determination of the recurrence date differed from the gold standard by a median of 17 days, which was much less than the median difference of 40 days reported by Lamont et al. The inclusion of pathology codes as a recurrence indicator in our study may explain why our algorithm provided more accurate estimates of the recurrence date.
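The two date-accuracy measures discussed above, the proportion of algorithm dates within 30 days of the gold standard date and the median absolute difference in days, could be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the 30-day window is assumed to be inclusive, and the example dates are invented.

```python
from datetime import date
from statistics import median
from typing import List, Tuple


def date_agreement(
    pairs: List[Tuple[date, date]],
    window_days: int = 30,
) -> Tuple[float, float]:
    """Summarise agreement between algorithm and gold standard dates.

    Each pair is (algorithm_date, gold_standard_date). Returns the share of
    pairs whose absolute difference is at most window_days (assumed
    inclusive) and the median absolute difference in days.
    """
    diffs = [abs((algorithm - gold).days) for algorithm, gold in pairs]
    share_within = sum(d <= window_days for d in diffs) / len(diffs)
    return share_within, median(diffs)


# Invented example dates, not study data.
example_pairs = [
    (date(2010, 1, 1), date(2010, 1, 1)),   # 0 days apart
    (date(2010, 3, 11), date(2010, 3, 1)),  # 10 days apart
    (date(2011, 2, 15), date(2011, 1, 1)),  # 45 days apart
    (date(2012, 5, 1), date(2012, 5, 31)),  # 30 days apart
]
share_within_30, median_diff = date_agreement(example_pairs)
```

In this invented example, three of the four pairs fall within the window and the median absolute difference is 20 days.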
We have developed an alternative algorithm to identify breast cancer recurrence using Danish routine healthcare data. The algorithm showed very high sensitivity and specificity. Furthermore, the algorithm defined the recurrence date fairly accurately. Thus, the algorithm could be an important instrument for future research in the field of breast cancer recurrence.
The patient pathway for cancer recurrence research project (Study 1, id 119) has been approved and is registered in the Record of Processing Activities at the Research Unit of General Practice in Aarhus in accordance with the provisions of the General Data Protection Regulation (GDPR).
This type of study did not require approval from the Committee on Health Research Ethics in the Central Denmark Region as the study did not involve human biological material.