| 001 | 281367 | ||
| 005 | 20251111170050.0 | ||
| 024 | 7 | _ | |a 10.1109/SIU66497.2025.11112382 |2 doi |
| 037 | _ | _ | |a DZNE-2025-01114 |
| 041 | _ | _ | |a Turkish |
| 100 | 1 | _ | |a Şen, Mehmet Umut |b 0 |
| 111 | 2 | _ | |a 33rd Signal Processing and Communications Applications Conference |g SIU 2025 |c Sile |d 2025-06-25 - 2025-06-28 |w Istanbul |
| 245 | _ | _ | |a Transcription of Ottoman Documents using Transformer Based Models | Osmanlıca Dokümanların Dönüştürücü Tabanlı Modeller ile Transkripsiyonu |
| 260 | _ | _ | |c 2025 |b IEEE |
| 295 | 1 | 0 | |a 2025 33rd Signal Processing and Communications Applications Conference (SIU) : [Proceedings] - IEEE, 2025. - ISBN 979-8-3315-6655-5 - doi:10.1109/SIU66497.2025.11112382 |
| 300 | _ | _ | |a 1 - 4 |
| 336 | 7 | _ | |a CONFERENCE_PAPER |2 ORCID |
| 336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
| 336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
| 336 | 7 | _ | |a conferenceObject |2 DRIVER |
| 336 | 7 | _ | |a Output Types/Conference Paper |2 DataCite |
| 336 | 7 | _ | |a Contribution to a conference proceedings |b contrib |m contrib |0 PUB:(DE-HGF)8 |s 1762876809_13039 |2 PUB:(DE-HGF) |
| 336 | 7 | _ | |a Contribution to a book |0 PUB:(DE-HGF)7 |2 PUB:(DE-HGF) |m contb |
| 520 | _ | _ | |a Although access to a large number of Ottoman documents has become easier, the Arabic-Persian-based Ottoman script remains a barrier for users who want to work with these documents. Automatic transcription systems are needed to address this challenge. While some deep learning-based commercial and academic models exist for Ottoman transcription, no studies have yet explored models based on transformer architectures. This paper introduces an Ottoman transcription system built on TrOCR, a transformer-based model. Instead of the two-step approach common in the literature, the model performs both optical character recognition and transcription into Turkish in a single step. Additionally, the decoder responsible for language modeling was initialized with a BERT-based model trained on Turkish data, achieving results comparable to the original model. During testing, this model produced outputs more quickly due to improved tokenization performance. |
| 536 | _ | _ | |a 351 - Brain Function (POF4-351) |0 G:(DE-HGF)POF4-351 |c POF4-351 |f POF IV |x 0 |
| 588 | _ | _ | |a Dataset connected to CrossRef Conference |
| 700 | 1 | _ | |a Bilecen, Ali |0 P:(DE-2719)9003244 |b 1 |u dzne |
| 700 | 1 | _ | |a Bilgin Taşdemir, Esma Fatıma |b 2 |
| 700 | 1 | _ | |a Yanıkoğlu, Berrin |b 3 |
| 773 | _ | _ | |a 10.1109/SIU66497.2025.11112382 |
| 856 | 4 | _ | |u https://pub.dzne.de/record/281367/files/DZNE-2025-01114_Restricted.pdf |
| 856 | 4 | _ | |u https://pub.dzne.de/record/281367/files/DZNE-2025-01114_Restricted.pdf?subformat=pdfa |x pdfa |
| 909 | C | O | |p VDB |o oai:pub.dzne.de:281367 |
| 910 | 1 | _ | |a Deutsches Zentrum für Neurodegenerative Erkrankungen |0 I:(DE-588)1065079516 |k DZNE |b 1 |6 P:(DE-2719)9003244 |
| 913 | 1 | _ | |a DE-HGF |b Gesundheit |l Neurodegenerative Diseases |1 G:(DE-HGF)POF4-350 |0 G:(DE-HGF)POF4-351 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-300 |4 G:(DE-HGF)POF |v Brain Function |x 0 |
| 914 | 1 | _ | |y 2025 |
| 920 | 1 | _ | |0 I:(DE-2719)1013041 |k AG Gokce |l Spatial Dynamics of Neurodegeneration |x 0 |
| 980 | _ | _ | |a contrib |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a contb |
| 980 | _ | _ | |a I:(DE-2719)1013041 |
| 980 | _ | _ | |a UNRESTRICTED |