Bioinformatics

Prognostic biomarkers for prostate cancer

Prostate cancer is the most prevalent cancer and the third most common cause of cancer-related death in European men [1]. The clinical behavior of localized prostate cancer is highly variable: some men have aggressive cancer that leads to death, but many others have indolent cancers that are cured by initial therapy or can be safely observed. Patients often undergo unnecessary surgery because clinical and histopathological risk factors, as well as existing biomarkers and their associated classification models, lack discriminatory accuracy. Hence, there is a high clinical demand for biomarkers for the early prognosis of prostate cancer.

To address this, we strive for a better understanding of the molecular dysregulation in prostate cancer. We conducted whole-transcriptome variation studies to detect gene signatures of prognostic value, comprising protein-coding and non-protein-coding genes, in fresh-frozen radical prostatectomy samples and confirmed them in routine clinical material, namely formalin-fixed and paraffin-embedded (FFPE) radical prostatectomy or biopsy samples.

We assessed the transcriptional landscape of more than two hundred tissue specimens from prostate cancer patients with long-term clinical follow-up. For an unbiased assessment of transcriptional changes, we used analytical methods such as custom expression microarrays and transcriptome-wide next-generation sequencing. We applied survival models to the expression values of each gene and combined the evidence from different types of samples in a statistical meta-analysis. We then combined all selected genes into a gene expression prognostic score per patient. The combined score showed a strong prognostic effect and correlated with time to death from the disease. We confirmed the prognostic score in an independent test cohort of representative sample size and showed that the score also correlates with time to biochemical recurrence.
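As an illustration only, the following Python sketch shows one common way to combine per-gene survival evidence from two sample types and to turn the selected genes into a per-patient score. It is not the implementation used in the study. The inverse-variance (fixed-effect) pooling, the Wald test, the 0.05 threshold, the weighted-sum score, the gene names and all numbers are assumptions made for the example. The per-gene log hazard ratios and standard errors are taken as given, for instance from a Cox model fitted separately in each cohort.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooling of (log hazard ratio, standard error) pairs."""
    weights = [1.0 / (se ** 2) for _, se in estimates]
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def wald_p_value(beta, se):
    """Two-sided p-value for beta != 0 under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


def prognostic_score(expression, gene_weights):
    """Weighted sum of a patient's standardized expression values over the selected genes."""
    return sum(weight * expression[gene] for gene, weight in gene_weights.items())


# Hypothetical per-gene estimates from two sample types (e.g. fresh frozen and FFPE).
per_gene = {
    "GENE_A": [(0.60, 0.20), (0.40, 0.20)],
    "GENE_B": [(0.05, 0.30), (-0.10, 0.30)],
}

# Keep genes whose pooled effect passes the (assumed) significance threshold
# and use the pooled log hazard ratio as the gene's weight in the score.
weights = {}
for gene, estimates in per_gene.items():
    beta, se = pool_fixed_effect(estimates)
    if wald_p_value(beta, se) < 0.05:
        weights[gene] = beta

# Hypothetical standardized expression values of one patient.
patient = {"GENE_A": 1.2, "GENE_B": -0.3}
score = prognostic_score(patient, weights)
```

In this toy example only the gene with consistent evidence across both sample types enters the score. A real analysis would additionally correct for multiple testing across all genes.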

We developed a transcriptome-based score that predicts aggressive types of prostate cancer in cohorts of prostate cancer patients treated by radical prostatectomy. We further confirmed the score in an independent cohort of tissue specimens. The score is suitable to support treatment stratification and clinical decision-making for patients diagnosed with prostate cancer. We are currently confirming the score in tissue specimens that represent routine clinical material.

References
[1] https://ecis.jrc.ec.europa.eu

PIONEER - Prostate Cancer DIagnOsis and TreatmeNt Enhancement through the Power of Big Data in EuRope

Since 2018, we have been an active member of the PIONEER consortium, a European Network of Excellence for Big Data in Prostate Cancer, consisting of 32 partners across 9 countries. PIONEER’s goal is to ensure optimal care for all European men living with prostate cancer by unlocking the potential of Big Data and Big Data analytics. A key objective of PIONEER is to standardise and integrate existing ‘big data’ from quality multidisciplinary data sources into a single innovative open access data platform in order to accelerate prostate cancer research. Within PIONEER, we contribute our strengths and expertise in the data harmonisation of transcriptome-wide expression studies and in statistical data analyses to identify and confirm biomarkers.

PIONEER is funded through the IMI2 Joint Undertaking under grant agreement No. 777492 and is part of the Big Data for Better Outcomes Programme (BD4BO). IMI2 receives support from the European Union’s Horizon 2020 research and innovation programme and the European Federation of Pharmaceutical Industries and Associations (EFPIA). The above text represents only the views of Fraunhofer IZI.

RNA biomarker discovery

The Bioinformatics Unit is a member of RIBOLUTION (an integrated platform for the identification and validation of innovative RNA-based biological markers for personalized medicine), a research association supported by the Fraunhofer-Zukunftsstiftung (Fraunhofer Future Foundation). We detect and establish RNA-based biological markers that are suitable as reliable indicators of a disease or its course. In this context, we are responsible for the storage, computer-aided processing and statistical analysis of the molecular-biological high-throughput data obtained by state-of-the-art measurement methods. The processes we implement cover the entire data life cycle in biological marker discovery, from data creation through primary and secondary analysis to medical knowledge generation. All software solutions have been implemented in accordance with quality management standards. Access to a high-performance computing cluster ensures that the computationally intensive analyses arising from the quantity and variety of the data can be carried out efficiently.

Computational RNA biology

It has been known for a number of years that RNA molecules do not merely convey the hereditary information of the DNA into amino acid sequences but also perform extensive regulatory functions themselves. Non-protein-coding RNAs (ncRNAs) are roughly subdivided into two groups: short ncRNAs, with a sequence length of less than 200 nt, and the novel long ncRNAs, with a sequence length of more than 200 nt. The gene regulatory mechanisms of short ncRNAs, such as miRNAs and snoRNAs, are generally well understood, whereas functions have been described for only a few individual long ncRNAs. Studies on individual long ncRNAs have shown that they control central cellular processes such as transcription and translation. Furthermore, they are involved in sub-cellular localization, in the organization of cellular spatial structures and in the control of epigenetic modifications. We and others were able to show that long ncRNAs are specifically regulated in various tissues and in disease-associated signaling pathways. Novel therapies based on long ncRNAs could therefore act more specifically and cause fewer side effects than traditional approaches. Using methods from computational RNA biology and systems biology, such as the prediction, modelling and classification of RNA secondary structure motifs, as well as evolutionary and transcription studies, we investigate which gene regulatory mechanisms long ncRNAs identified as biomarkers use to control cellular processes, and to what extent these long ncRNAs are suitable as therapeutic targets.
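To give a flavor of what RNA secondary structure prediction involves, the following Python sketch implements the textbook Nussinov base-pair maximization by dynamic programming. It is a deliberately simple stand-in and not one of the methods used by the unit. Practical structure prediction relies on thermodynamic energy models rather than pair counting, and the minimum hairpin loop length of 3 and the allowed pairs (Watson-Crick plus G-U wobble) are assumptions of the example.

```python
# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov recursion).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# A short hairpin: three stacked pairs closing a three-base loop.
hairpin_pairs = max_base_pairs("GGGAAAUCC")
```

The recursion considers, for every subsequence, whether its last base is unpaired or pairs with an earlier base, which splits the problem into two independent smaller subsequences.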

Optimization of the processing and analysis of sequencing data for routine clinical applications

Next-generation sequencing technologies produce genome- or transcriptome-wide data within days. These data are usually processed and analyzed by invoking a variety of bioinformatics software tools in sequential order. While the time required for data generation is continuously being reduced by improved sequencing methods, comparable optimizations have barely been achieved for data analysis. This is a disadvantage for routine clinical applications because waiting times until therapy decisions are unnecessarily long. Our objective is to optimize the analysis of high-throughput sequencing data such that it can be applied in routine clinical settings. Our in-house analysis pipeline is designed to ensure the availability, integrity, confidentiality and authenticity of the data at all times.

Selected completed projects

Development of custom expression microarrays for an efficient and cost-effective analysis of the tumor-associated expression pattern of long non-coding RNAs. With the aid of the custom expression microarrays, we could show that a multitude of long non-coding RNAs are significantly regulated in mammary carcinoma and glioblastoma and are therefore suitable as biomarkers.

  • Hackermüller J, Reiche K, Otto C, Hösler N, Blumert C, Brocke-Heidrich K, Böhlig L, Nitsche A, Kasack K, Ahnert P, Krupp W, Engeland K, Stadler PF, Horn F. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs. Genome Biol. 2014 Mar 4;15(3):R48.
  • Reiche K, Kasack K, Schreiber S, Lüders T, Due EU, Naume B, Riis M, Kristensen VN, Horn F, Børresen-Dale AL, Hackermüller J, Baumbusch LO. Long non-coding RNAs differentially expressed between normal versus primary breast tumor tissues disclose converse changes to breast cancer-related protein-coding genes. PLoS One. 2014 Sep 29;9(9):e106076.
  • Arnold C, Externbrink F, Hackermüller J, Reiche K. CEMDesigner: Design of custom expression microarrays in the post-ENCODE Era. Journal of Biotechnology. 2014 Nov 10;189:154-6. DOI dx.doi.org/10.1016/j.jbiotec.2014.09.012.

The analysis of transcriptome-wide expression studies showed that non-coding RNAs are not only specifically expressed but are also, to a larger extent than protein-coding genes, specifically regulated by disease-relevant signaling pathways.

  • Hackermüller J, Reiche K, Otto C, Hösler N, Blumert C, Brocke-Heidrich K, Böhlig L, Nitsche A, Kasack K, Ahnert P, Krupp W, Engeland K, Stadler PF, Horn F. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs. Genome Biol. 2014 Mar 4;15(3):R48.

We developed an algorithm (TileShuffle) for the efficient analysis of transcriptome-wide expression data measured by means of tiling arrays. Using a permutation approach, we were able to estimate the background signal more precisely with regard to probe-specific artefacts than other methods. We thus achieve greater sensitivity at the same specificity.

  • Otto C, Reiche K, Hackermüller J. Detection of differentially expressed segments in tiling array data. Bioinformatics. 2012 Jun 1;28(11):1471-9.
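The general idea of a permutation-based background estimate can be sketched in a few lines of Python. This is a generic illustration and not the TileShuffle algorithm itself. It slides a fixed-width window over the probe signals, builds a null distribution of window means by shuffling the probes, and assigns each window an empirical p-value. The window width, the number of permutations, the unrestricted shuffling (which ignores the probe-specific artefacts mentioned above) and the toy signal are all assumptions of the example.

```python
import bisect
import random


def window_means(values, width):
    """Mean signal of every window of `width` consecutive probes."""
    return [sum(values[i:i + width]) / width for i in range(len(values) - width + 1)]


def permutation_p_values(signal, width, n_perm=200, seed=0):
    """Empirical p-value per window against a shuffled-probe background."""
    rng = random.Random(seed)
    observed = window_means(signal, width)
    shuffled = list(signal)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys the genomic order of the probes
        null.extend(window_means(shuffled, width))
    null.sort()
    total = len(null)
    p_values = []
    for obs in observed:
        # Number of background windows at least as high as the observed one.
        n_greater_equal = total - bisect.bisect_left(null, obs)
        p_values.append((n_greater_equal + 1) / (total + 1))
    return p_values


# Toy signal: 40 probes of low background with one expressed segment at probes 20-24.
signal = [0.1 * ((i * 7) % 5) for i in range(40)]
for i in range(20, 25):
    signal[i] = 5.0

p_values = permutation_p_values(signal, width=5)
```

Windows that cover the expressed segment receive the smallest p-values because shuffled probes rarely place all high signals next to each other. A method that accounts for probe-specific effects would restrict which probes may be exchanged with one another instead of shuffling freely.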