001 | 281367 | ||
005 | 20251008102406.0 | ||
024 | 7 | _ | |a 10.1109/SIU66497.2025.11112382 |2 doi |
037 | _ | _ | |a DZNE-2025-01114 |
041 | _ | _ | |a Turkish |
100 | 1 | _ | |a Şen, Mehmet Umut |b 0 |
111 | 2 | _ | |a 33rd Signal Processing and Communications Applications Conference |g SIU 2025 |c Sile |d 2025-06-25 - 2025-06-28 |w Istanbul |
245 | _ | _ | |a Transcription of Ottoman Documents using Transformer Based Models | Osmanlıca Dokümanların Dönüştürücü Tabanlı Modeller ile Transkripsiyonu |
260 | _ | _ | |c 2025 |b IEEE |
295 | 1 | 0 | |a 2025 33rd Signal Processing and Communications Applications Conference (SIU) : [Proceedings] - IEEE, 2025. - ISBN 979-8-3315-6655-5 - doi:10.1109/SIU66497.2025.11112382 |
300 | _ | _ | |a 1 - 4 |
336 | 7 | _ | |a CONFERENCE_PAPER |2 ORCID |
336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
336 | 7 | _ | |a conferenceObject |2 DRIVER |
336 | 7 | _ | |a Output Types/Conference Paper |2 DataCite |
336 | 7 | _ | |a Contribution to a conference proceedings |b contrib |m contrib |0 PUB:(DE-HGF)8 |s 1759836423_17319 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a Contribution to a book |0 PUB:(DE-HGF)7 |2 PUB:(DE-HGF) |m contb |
520 | _ | _ | |a Although large numbers of Ottoman documents are now easy to access, the Arabic-Persian-based Ottoman script remains a barrier for readers who want to use them, which creates a need for automatic transcription systems. Some deep learning-based commercial and academic models exist for Ottoman transcription, but no study has yet explored models based on transformer architectures. This paper introduces an Ottoman transcription system built on TrOCR, a transformer-based model. Instead of the two-step approach common in the literature, the model performs optical character recognition and transcription into Turkish in a single step. In addition, the decoder responsible for language modeling was initialized with a BERT-based model trained on Turkish data, which achieved results comparable to the original model. During testing, this model produced outputs more quickly owing to improved tokenization performance. |
536 | _ | _ | |a 351 - Brain Function (POF4-351) |0 G:(DE-HGF)POF4-351 |c POF4-351 |f POF IV |x 0 |
588 | _ | _ | |a Dataset connected to CrossRef Conference |
700 | 1 | _ | |a Bilecen, Ali |0 P:(DE-2719)9003244 |b 1 |u dzne |
700 | 1 | _ | |a Bilgin Taşdemir, Esma Fatıma |b 2 |
700 | 1 | _ | |a Yanıkoğlu, Berrin |b 3 |
773 | _ | _ | |a 10.1109/SIU66497.2025.11112382 |
856 | 4 | _ | |u https://pub.dzne.de/record/281367/files/DZNE-2025-01114.pdf |y Restricted |
856 | 4 | _ | |u https://pub.dzne.de/record/281367/files/DZNE-2025-01114.pdf?subformat=pdfa |x pdfa |y Restricted |
909 | C | O | |o oai:pub.dzne.de:281367 |p VDB |
910 | 1 | _ | |a Deutsches Zentrum für Neurodegenerative Erkrankungen |0 I:(DE-588)1065079516 |k DZNE |b 1 |6 P:(DE-2719)9003244 |
913 | 1 | _ | |a DE-HGF |b Gesundheit |l Neurodegenerative Diseases |1 G:(DE-HGF)POF4-350 |0 G:(DE-HGF)POF4-351 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-300 |4 G:(DE-HGF)POF |v Brain Function |x 0 |
914 | 1 | _ | |y 2025 |
920 | 1 | _ | |0 I:(DE-2719)1013041 |k AG Gokce |l Spatial Dynamics of Neurodegeneration |x 0 |
980 | _ | _ | |a contrib |
980 | _ | _ | |a VDB |
980 | _ | _ | |a contb |
980 | _ | _ | |a I:(DE-2719)1013041 |
980 | _ | _ | |a UNRESTRICTED |