000145612 001__ 145612
000145612 005__ 20200925153820.0
000145612 037__ $$aDZNE-2020-00942
000145612 041__ $$aEnglish
000145612 1001_ $$0P:(DE-2719)2811935$$aFiosins, Maksims$$b0$$udzne
000145612 1112_ $$a15th International Symposium on Bioinformatics Research and Applications (ISBRA)$$cBarcelona$$d2019-06-03 - 2019-06-06$$wSpain
000145612 245__ $$aDeep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles
000145612 260__ $$c2019
000145612 3367_ $$033$$2EndNote$$aConference Paper
000145612 3367_ $$2DataCite$$aOther
000145612 3367_ $$2BibTeX$$aINPROCEEDINGS
000145612 3367_ $$2DRIVER$$aconferenceObject
000145612 3367_ $$2ORCID$$aLECTURE_SPEECH
000145612 3367_ $$0PUB:(DE-HGF)6$$2PUB:(DE-HGF)$$aConference Presentation$$bconf$$mconf$$s1597395849_15935$$xOther
000145612 520__ $$aThe lack of well-structured annotations in a growing amount of RNA expression data complicates data interoperability and reusability. Commonly used text mining methods extract annotations from existing unstructured data descriptions and often provide inaccurate output that requires manual curation. Automatic data-based augmentation (generation of annotations on the basis of expression data) can considerably improve annotation quality but has not been well studied. We formulate the automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The average prediction accuracy with DL is 98% for tissue groups, 96.5% for tissues, and 77% for sex. The “one dataset out” average accuracy for tissue group prediction is 83% (DL) and 59% (RF). On average, DL outperforms RF and considerably improves classification performance for ‘unseen’ datasets.
000145612 536__ $$0G:(DE-HGF)POF3-342$$a342 - Disease Mechanisms and Model Systems (POF3-342)$$cPOF3-342$$fPOF III$$x0
000145612 7001_ $$0P:(DE-2719)2810547$$aBonn, Stefan$$b1$$eLast author$$udzne
000145612 8564_ $$uhttps://link.springer.com/chapter/10.1007/978-3-030-20242-2_14
000145612 909CO $$ooai:pub.dzne.de:145612$$pVDB
000145612 9101_ $$0I:(DE-588)1065079516$$6P:(DE-2719)2811935$$aDeutsches Zentrum für Neurodegenerative Erkrankungen$$b0$$kDZNE
000145612 9101_ $$0I:(DE-588)1065079516$$6P:(DE-2719)2810547$$aDeutsches Zentrum für Neurodegenerative Erkrankungen$$b1$$kDZNE
000145612 9131_ $$0G:(DE-HGF)POF3-342$$1G:(DE-HGF)POF3-340$$2G:(DE-HGF)POF3-300$$aDE-HGF$$bForschungsbereich Gesundheit$$lErkrankungen des Nervensystems$$vDisease Mechanisms and Model Systems$$x0
000145612 9141_ $$y2019
000145612 9201_ $$0I:(DE-2719)1410003$$kAG Bonn 1$$lComputational analysis of biological networks$$x0
000145612 980__ $$aconf
000145612 980__ $$aVDB
000145612 980__ $$aI:(DE-2719)1410003
000145612 980__ $$aUNRESTRICTED