A comprehensive European Colorectal Cancer Cohort dataset

March 12, 2026

Petr Holub 1 2, Outi Törnwall 3, Eva Garcia Alvarez 4, Rumyana Proynova 5, Florian Stampe 5, Saher Maqsood 5, Maxmilian Ataian 6, Irene Schlünder 3, Olli Carpén 7, Gerrit Meijer 8, Rudolf Nenutil 9, Dalibor Valík 10 11, Barbara Parodi 12, Annemieke Hiemstra 8, Mariska Bierkens 8, Esmeralda Castanos-Vélez 13, Frank Ückert 6, Diogo Alexandre 5, Ondřej Vojtíšek 9, Anna-Liisa Bader 3, Birgit Simell 3, Caitlin Ahern 3, Vitomir Horvat 3, Erik Steinfelder 3, Matteo Gnocchi 14, Marco Moscatelli 14, Alexander Fürbaß 3, Jiří Horák 3, Francesca Frexia 15, Cecilia Mascia 15, Alessandro Sulis 15, Giovanni Delussu 15, Mauro Del Rio 15, Vittorio Meloni 15, Luca Pireddu 15, Simone Leo 15, Marco Enrico Piras 15, Martin Kačenga 16, Stéphanie Gofflot 17, Sebastian Mate 18, Hans-Ulrich Prokosch 18 19, Paolo Romano 12, Daniela Pistillo 20, Michael Hoffmeister 21, Alexander Brobeil 22, Amila Kugic 23, Berthold Huppertz 24, Valentina Paleari 20, Heimo Müller 25, Robert Reihs 25, Timo Gemoll 26, Yannick Bantel 27, Tobias Sjöblom 28, Kyriacos Kyriacou 29, Simona di Martino 30, Gennaro Ciliberto 31, Ann-Kristin Kock-Schoppenhauer 32, Martina Oberländer 33, Jens K Habermann 3 32, Gabriele Husmann 34, Per-Henrik D Edqvist 28, Inti Zlobec 35, Martin D Berger 36, Lars Boeckmann 26 37, Fabienne George 38, Tom Southerington 39, Daniel P Brucker 40, Laurence Faugeras 38, Joanna Vella 41, Alex Felice 41, Malcolm Pace 41, Chiara Fallerini 42, Alessandra Renieri 42, Andreas Hadjisavvas 43, Karine Sargsyan 25 44 45, Maria A Loizidou 29, Tatiana Besse-Hammer 46, Franziska Vogl 23, Jan-Eric Litton 47, Michael Hummel 13, Kurt Zatloukal 48, Marialuisa Lavitrano 49

Abstract

Colorectal cancer (CRC) is a leading cause of cancer-related deaths worldwide. The Biobanking and Biomolecular Resources European Research Infrastructure (BBMRI-ERIC) established the CRC-Cohort, a dataset with European coverage contributed by 26 biobanks from 12 countries. This retrospective, multi-center cohort comprises structured and curated clinical data that support research on biomarkers for early detection, prognosis, and treatment. A phenotypic/clinical data model was defined, and individual-level data from 10,780 CRC patients were collected in the central data deposition service that BBMRI-ERIC hosts. The participating biobanks hold additional data, which can be accessed on request and used to derive further datasets. This mechanism has already been used to extend the collected data with scans of histopathological slides, supporting research on artificial intelligence in digital pathology, and with whole genome sequencing data, piloting a use case of the upcoming European Health Data Space (EHDS). Here we present the methodology, the quality assurance mechanisms, and the implementation of the FAIR and FAIR-Health principles applied to build the CRC-Cohort.