A high-resolution map of the human small non-coding transcriptome.

Tobias Fehlmann, Christina Backes, Julia Alles, Ulrike Fischer, Martin Hart, Fabian Kern, Hilde Langseth, Trine Rounge, Sinan Ugur Umu, Mustafa Kahraman, Thomas Laufer, Jan Haas, Cord Staehler, Nicole Ludwig, Matthias Hübenthal, Benjamin Meder, Andre Franke, Hans-Peter Lenhof, Eckart Meese, Andreas Keller
While the amount of small non-coding RNA sequencing data is continuously increasing, it is still unclear to what extent small RNAs are represented in the human genome. In this study we analyzed 303 billion sequencing reads from nearly 25,000 data sets to answer this question. We determined that 0.8% of the human genome is reliably covered by 874,123 regions with an average length of 31 nt. On the basis of these regions, we found that among the known small non-coding RNA classes, microRNAs (miRNAs) were the most prevalent. In subsequent steps, we characterized variations of miRNAs and performed a staged validation of 11,877 candidate miRNAs. Of these, many were actually expressed and significantly dysregulated in lung cancer. Selected candidates were finally validated by northern blots. Although isolated additional miRNAs could still be present in the human genome, the set presented here likely contains the largest fraction of human miRNAs.
Contact: andreas.keller@ccb.uni-saarland.de
Supplementary data are available at Bioinformatics online.
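The coverage figure in the abstract can be sanity-checked with simple arithmetic. The sketch below is only a rough check, not the authors' method: the region count and mean length are taken from the abstract, while the genome size of about 3.1 billion nucleotides is an assumption that does not appear in the source, and the regions are assumed not to overlap.

```python
# Back-of-the-envelope check of the genome coverage quoted in the abstract.
N_REGIONS = 874_123      # number of regions, from the abstract
MEAN_LENGTH_NT = 31      # average region length in nt, from the abstract (rounded)
GENOME_SIZE_NT = 3.1e9   # ASSUMPTION: approximate human genome size, not from the source


def covered_fraction(n_regions, mean_length_nt, genome_size_nt):
    """Estimate the covered fraction of the genome, assuming non-overlapping regions."""
    return n_regions * mean_length_nt / genome_size_nt


covered_nt = N_REGIONS * MEAN_LENGTH_NT
fraction = covered_fraction(N_REGIONS, MEAN_LENGTH_NT, GENOME_SIZE_NT)

print(f"covered bases:    {covered_nt:,} nt")
print(f"covered fraction: {fraction:.2%}")
```

With these inputs the estimate comes out slightly under 0.9%, in the same range as the reported 0.8%. The remaining gap is of a size that rounding of the mean length and the assumed genome size could plausibly explain.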