Transcription of Ottoman Documents using Transformer Based Models | Osmanlıca Dokümanların Dönüştürücü Tabanlı Modeller ile Transkripsiyonu

Şen, Mehmet Umut; Bilecen, Ali; Bilgin Taşdemir, Esma Fatıma; Yanıkoğlu, Berrin

doi:10.1109/SIU66497.2025.11112382

Contribution to a conference proceedings/Contribution to a book

DZNE-2025-01114

Şen, M. U.; Bilecen, A. (DZNE); Bilgin Taşdemir, E. F.; Yanıkoğlu, B.

2025
IEEE

2025 33rd Signal Processing and Communications Applications Conference (SIU) : [Proceedings] - IEEE, 2025. - ISBN 979-8-3315-6655-5 - doi:10.1109/SIU66497.2025.11112382
33rd Signal Processing and Communications Applications Conference (SIU 2025), Şile, Istanbul, 25 Jun 2025 - 28 Jun 2025. IEEE, pp. 1-4 (2025).

Please use a persistent id in citations: doi:10.1109/SIU66497.2025.11112382

Abstract: Although a large number of Ottoman documents are now easy to access, the Arabic-Persian-based Ottoman script remains a barrier for users who want to work with them. Automatic transcription systems are needed to address this. While some deep learning-based commercial and academic models exist for Ottoman transcription, no study has yet explored models based on transformer architectures. This paper introduces an Ottoman transcription system built on TrOCR, a transformer-based model. Instead of the two-step approach common in the literature, the model performs optical character recognition and transcription into Turkish in a single step. In addition, the decoder, which is responsible for language modeling, was initialized with a BERT-based model trained on Turkish data; this variant achieved results comparable to the original model. At test time it also produced outputs more quickly, which the authors attribute to improved tokenization performance.
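
The abstract links the faster test-time output to tokenization. One plausible reading, which is an assumption here and not something the record states, is that a tokenizer trained on Turkish splits the target text into fewer tokens, so an autoregressive decoder needs fewer generation steps. The sketch below illustrates only that general effect with a toy greedy longest-match tokenizer. The two vocabularies, the sample text, and the function names are hypothetical and are not taken from the paper or from TrOCR.

```python
from typing import List, Set


def greedy_tokenize(word: str, vocab: Set[str]) -> List[str]:
    """Greedy longest-match segmentation with a single-character fallback."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            # Accept a vocabulary hit, or fall back to one character.
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


def count_decoder_steps(text: str, vocab: Set[str]) -> int:
    """Assume one autoregressive decoder step per output token."""
    return sum(len(greedy_tokenize(w, vocab)) for w in text.split())


# Hypothetical toy vocabularies (illustration only, not from the paper).
GENERIC_VOCAB = {"os", "man", "bel", "ge"}      # poor fit for Turkish
TURKISH_VOCAB = {"osmanlı", "belge", "leri"}    # Turkish-aware subwords

text = "osmanlı belgeleri"
generic_steps = count_decoder_steps(text, GENERIC_VOCAB)
turkish_steps = count_decoder_steps(text, TURKISH_VOCAB)

print("generic vocabulary:", generic_steps, "decoder steps")
print("Turkish vocabulary:", turkish_steps, "decoder steps")
```

With these toy vocabularies the same text needs 10 steps under the generic vocabulary and 3 under the Turkish one. The actual speed-up in the paper depends on the real tokenizers and model, which this record does not detail.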

Contributing Institute(s):

Spatial Dynamics of Neurodegeneration (AG Gokce)

Research Program(s):

351 - Brain Function (POF4-351)

Appears in the scientific report 2025

Record created 2025-09-22, last modified 2025-11-11
