Transcription of Ottoman Documents using Transformer Based Models | Osmanlıca Dokümanların Dönüştürücü Tabanlı Modeller ile Transkripsiyonu

Şen, Mehmet Umut; Bilecen, Ali; Bilgin Taşdemir, Esma Fatıma; Yanıkoğlu, Berrin

doi:10.1109/SIU66497.2025.11112382

Marc 21

001			281367
005			20251111170050.0
024	7	_	\|a 10.1109/SIU66497.2025.11112382 \|2 doi
037	_	_	\|a DZNE-2025-01114
041	_	_	\|a Turkish
100	1	_	\|a Şen, Mehmet Umut \|b 0
111	2	_	\|a 33rd Signal Processing and Communications Applications Conference \|g SIU 2025 \|c Sile \|d 2025-06-25 - 2025-06-28 \|w Istanbul
245	_	_	\|a Transcription of Ottoman Documents using Transformer Based Models \| Osmanlıca Dokümanların Dönüştürücü Tabanlı Modeller ile Transkripsiyonu
260	_	_	\|c 2025 \|b IEEE
295	1	0	\|a 2025 33rd Signal Processing and Communications Applications Conference (SIU) : [Proceedings] - IEEE, 2025. - ISBN 979-8-3315-6655-5 - doi:10.1109/SIU66497.2025.11112382
300	_	_	\|a 1 - 4
336	7	_	\|a CONFERENCE_PAPER \|2 ORCID
336	7	_	\|a Conference Paper \|0 33 \|2 EndNote
336	7	_	\|a INPROCEEDINGS \|2 BibTeX
336	7	_	\|a conferenceObject \|2 DRIVER
336	7	_	\|a Output Types/Conference Paper \|2 DataCite
336	7	_	\|a Contribution to a conference proceedings \|b contrib \|m contrib \|0 PUB:(DE-HGF)8 \|s 1762876809_13039 \|2 PUB:(DE-HGF)
336	7	_	\|a Contribution to a book \|0 PUB:(DE-HGF)7 \|2 PUB:(DE-HGF) \|m contb
520	_	_	\|a Although access to a large number of Ottoman documents has become easier today, the Arabic-Persian-based Ottoman script remains a barrier for interested users in utilizing these documents. To address this challenge, there is a need for automatic transcription systems. While some deep learning-based commercial and academic models exist for Ottoman transcription, no studies have yet explored models based on transformer architectures. This paper introduces an Ottoman transcription system developed using TrOCR, a transformer-based model. Instead of the commonly used two-step approach in the literature, a model was designed to perform both optical character recognition and transcription into Turkish in one step. Additionally, the decoder responsible for language modeling was initialized with a BERT-based model trained on Turkish data, achieving results comparable to the original model. During testing, this model produced outputs more quickly due to improved tokenization performance.
536	_	_	\|a 351 - Brain Function (POF4-351) \|0 G:(DE-HGF)POF4-351 \|c POF4-351 \|f POF IV \|x 0
588	_	_	\|a Dataset connected to CrossRef Conference
700	1	_	\|a Bilecen, Ali \|0 P:(DE-2719)9003244 \|b 1 \|u dzne
700	1	_	\|a Bilgin Taşdemir, Esma Fatıma \|b 2
700	1	_	\|a Yanıkoğlu, Berrin \|b 3
773	_	_	\|a 10.1109/SIU66497.2025.11112382
856	4	_	\|u https://pub.dzne.de/record/281367/files/DZNE-2025-01114_Restricted.pdf
856	4	_	\|u https://pub.dzne.de/record/281367/files/DZNE-2025-01114_Restricted.pdf?subformat=pdfa \|x pdfa
909	C	O	\|p VDB \|o oai:pub.dzne.de:281367
910	1	_	\|a Deutsches Zentrum für Neurodegenerative Erkrankungen \|0 I:(DE-588)1065079516 \|k DZNE \|b 1 \|6 P:(DE-2719)9003244
913	1	_	\|a DE-HGF \|b Gesundheit \|l Neurodegenerative Diseases \|1 G:(DE-HGF)POF4-350 \|0 G:(DE-HGF)POF4-351 \|3 G:(DE-HGF)POF4 \|2 G:(DE-HGF)POF4-300 \|4 G:(DE-HGF)POF \|v Brain Function \|x 0
914	1	_	\|y 2025
920	1	_	\|0 I:(DE-2719)1013041 \|k AG Gokce \|l Spatial Dynamics of Neurodegeneration \|x 0
980	_	_	\|a contrib
980	_	_	\|a VDB
980	_	_	\|a contb
980	_	_	\|a I:(DE-2719)1013041
980	_	_	\|a UNRESTRICTED
