001     164502
005     20230103103140.0
020 _ _ |a 978-3-030-20241-5 (print)
020 _ _ |a 978-3-030-20242-2 (electronic)
024 7 _ |a 10.1007/978-3-030-20242-2_14
|2 doi
024 7 _ |a 0302-9743
|2 ISSN
024 7 _ |a 1611-3349
|2 ISSN
024 7 _ |a altmetric:61330992
|2 altmetric
037 _ _ |a DZNE-2022-01054
100 1 _ |a Cai, Zhipeng
|b 0
|e Editor
111 2 _ |a International Symposium on Bioinformatics Research and Applications
245 _ _ |a Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles
260 _ _ |a Cham
|c 2019
|b Springer International Publishing
295 1 0 |a Bioinformatics Research and Applications / Cai, Zhipeng (Editor) ; Cham : Springer International Publishing, 2019, Chapter 14 ; ISSN: 0302-9743=1611-3349 ; ISBN: 978-3-030-20241-5=978-3-030-20242-2 ; doi:10.1007/978-3-030-20242-2
300 _ _ |a 159 - 170
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1654068792_17981
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
490 0 _ |a Lecture Notes in Computer Science
|v 11490
520 _ _ |a The lack of well-structured annotations in a growing amount of RNA expression data complicates data interoperability and reusability. Commonly used text mining methods extract annotations from existing unstructured data descriptions and often provide inaccurate output that requires manual curation. Automatic data-based augmentation (generation of annotations on the basis of expression data) can considerably improve annotation quality but has not been well studied. We formulate automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The average prediction accuracy is 98% (DL) for tissue groups, 96.5% (DL) for tissues, and 77% (DL) for sex. The “one dataset out” average accuracy for tissue group prediction is 83% (DL) and 59% (RF). On average, DL provides better results than RF and considerably improves classification performance for ‘unseen’ datasets.
536 _ _ |a 899 - ohne Topic (POF4-899)
|0 G:(DE-HGF)POF4-899
|c POF4-899
|f POF IV
|x 0
588 _ _ |a Dataset connected to CrossRef Book Series, Journals: pub.dzne.de
700 1 _ |a Skums, Pavel
|b 1
|e Editor
700 1 _ |a Li, Min
|0 0000-0002-0188-1394
|b 2
|e Editor
700 1 _ |a Fiosina, Jelena
|b 3
700 1 _ |a Fiosins, Maksims
|0 P:(DE-2719)2811935
|b 4
|u dzne
700 1 _ |a Bonn, Stefan
|0 P:(DE-2719)2810547
|b 5
|e Last author
|u dzne
773 _ _ |a 10.1007/978-3-030-20242-2_14
909 C O |o oai:pub.dzne.de:164502
|p VDB
910 1 _ |a Deutsches Zentrum für Neurodegenerative Erkrankungen
|0 I:(DE-588)1065079516
|k DZNE
|b 4
|6 P:(DE-2719)2811935
910 1 _ |a Deutsches Zentrum für Neurodegenerative Erkrankungen
|0 I:(DE-588)1065079516
|k DZNE
|b 5
|6 P:(DE-2719)2810547
913 1 _ |a DE-HGF
|b Programmungebundene Forschung
|l ohne Programm
|1 G:(DE-HGF)POF4-890
|0 G:(DE-HGF)POF4-899
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-800
|4 G:(DE-HGF)POF
|v ohne Topic
|x 0
914 1 _ |y 2019
915 _ _ |a Nationallizenz
|0 StatID:(DE-HGF)0420
|2 StatID
|d 2020-08-25
|w ger
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2020-08-25
920 1 _ |0 I:(DE-2719)1210002
|k AG Heutink 1
|l Genome Biology of Neurodegenerative Diseases
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a contb
980 _ _ |a I:(DE-2719)1210002
980 _ _ |a UNRESTRICTED


Marc 21