The Role of Phonological Errors in Evaluation Metrics

Ayşegül Çağlı; Vakkas Karakurt; Kürşat Edabalı Yıldırım; Fatih Soygazi; Yılmaz Kılıçaslan

doi:10.53070/bbd.1350547

Research Article


Year 2023, Volume: IDAP-2023 : International Artificial Intelligence and Data Processing Symposium Issue: IDAP-2023, 44 - 51, 18.10.2023


Abstract

In recent years, research in Natural Language Processing (NLP) has surged, particularly in
text summarization and machine translation. Evaluation metrics such as ROUGE and BLEU, which rely on
n-gram overlap, are widely used to assess the quality of generated text. However, these metrics often
struggle on data sourced from the internet, such as social media posts, because phonological errors are
prevalent there. This study identifies the sources and frequency of phonological errors and addresses the
question of whether evaluation should take them into account. Data was collected from Twitter, a platform
known for phonological errors, and studied alongside the existing literature on the subject. The article
proposes enhancing existing metrics by integrating edit distance algorithms such as Levenshtein or
Damerau-Levenshtein. By accounting for phonological errors in evaluation, this approach aims to improve
accuracy and reliability in NLP and machine translation. The ultimate goal is to contribute to more
sensitive and reliable evaluation metrics in these fields.
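
To make the proposal concrete, the following is a minimal sketch of how an edit distance could be folded into a unigram overlap score. It is an illustration under stated assumptions, not the authors' implementation: the function names (`levenshtein`, `similarity`, `fuzzy_rouge1_recall`), the length-normalized similarity, the greedy matching, the whitespace tokenization, and the `threshold` value of 0.8 are all hypothetical choices made for this example.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Length-normalized similarity in [0, 1] (an assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def fuzzy_rouge1_recall(candidate: str, reference: str,
                        threshold: float = 0.8) -> float:
    """Unigram recall where near-identical tokens also count as matches.

    Assumptions: lowercased whitespace tokens, greedy one-to-one matching,
    and a hypothetical similarity threshold. threshold=1.0 reduces this to
    plain exact-match unigram recall.
    """
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    available = Counter(candidate.lower().split())
    matched = 0
    for ref_tok in ref_tokens:
        if available[ref_tok] > 0:  # exact match takes priority
            available[ref_tok] -= 1
            matched += 1
            continue
        # Otherwise look for the most similar unused candidate token.
        best_tok, best_sim = None, 0.0
        for tok, count in available.items():
            if count > 0:
                sim = similarity(ref_tok, tok)
                if sim > best_sim:
                    best_tok, best_sim = tok, sim
        if best_tok is not None and best_sim >= threshold:
            available[best_tok] -= 1
            matched += 1
    return matched / len(ref_tokens)


reference = "the weather is beautiful today"
candidate = "the wether is beautifull today"
exact = fuzzy_rouge1_recall(candidate, reference, threshold=1.0)
fuzzy = fuzzy_rouge1_recall(candidate, reference, threshold=0.8)
print(exact, fuzzy)
```

In this toy pair, exact matching credits only three of the five reference tokens, while the fuzzy variant also credits the two misspelled ones. Swapping in Damerau-Levenshtein would additionally treat an adjacent transposition as a single edit; the threshold would need tuning on real data.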

Keywords

Natural Language Processing, Phonological Errors, ROUGE, Machine Translation, Evaluation Metrics, Edit Distance Metrics

References

Uzdu Yıldız, F., & Çetin, B. (2020). Errors in written expressions of learners of Turkish as a foreign language: A systematic review. Journal of Language and Linguistic Studies, 16(2), 612-625. Doi: 10.17263/jlls.759261
Sağlam, B. & Özek, F. (2023). Levenshtein Uzaklık Algoritmasına Göre Azerbaycan, Türkiye ve Türkmen Türkçeleri Arasındaki Fonetik Uzaklık. Asya Studies-Academic Social Studies / Akademik Sosyal Araştırmalar, 7(Special Issue / Özel Sayı 3), 45-64.
Çalış, T. (2017). Sözdizimsel Aktarıma Dayalı Makale Çevirisi. Yüksek Lisans Tezi, Trakya Üniversitesi.
Stanley, Theban & Hacioglu, Kadri. (2011). Statistical Machine Translation Framework for Modeling Phonological Errors in Computer Assisted Pronunciation Training System.
L. Yujian and L. Bo, (2007) "A Normalized Levenshtein Distance Metric," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1091-1095, doi: 10.1109/TPAMI.2007.1078.
Santoso, Puji, et al. (2019) “Damerau levenshtein distance for indonesian spelling correction,” J. Inform 13.2: 11.
Youness Chaabi, Fadoua Ataa Allah, (2022), “Amazigh spell checker using Damerau-Levenshtein algorithm and N-gram,” Journal of King Saud University - Computer and Information Sciences, Volume 34, Issue 8, Part B, Pages 6116-6124, ISSN 1319-1578.
Schluter, Natalie. (2017). The limits of automatic summarisation according to ROUGE. 41-45. 10.18653/v1/E17-2007.
Liu, Feifan & Liu, Yang. (2008). Correlation between ROUGE and Human Evaluation of Extractive Meeting Summaries. 201-204. 10.3115/1557690.1557747.
Baykara, B., Güngör, T. (2023). Morphosyntactic Evaluation for Text Summarization in Morphologically Rich Languages: A Case Study for Turkish. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham.
Chin-Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics
Yvette Graham. 2015. Re-evaluating Automatic Summarization with BLEU and 192 Shades of ROUGE. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 128–137, Lisbon, Portugal. Association for Computational Linguistics.

Fonolojik Hataların Değerlendirme Metriklerindeki Rolü


Abstract

In recent years, Natural Language Processing (NLP) has seen an intense increase in research,
particularly in text summarization and machine translation. Evaluation metrics such as ROUGE and BLEU
are widely used to assess text quality with n-gram-based approaches. However, these metrics struggle,
especially when applied to data obtained from social media platforms, because of the prevalence of
phonological errors. This study focuses on determining the sources and frequency of phonological errors
and answers the question of whether these errors should be taken into account. To this end, data was
collected from Twitter, a platform where phonological errors are common, and examined. The existing
literature was also reviewed. The article proposes improving existing metrics by integrating edit
distance algorithms such as Levenshtein and Damerau-Levenshtein into them. By including phonological
errors in evaluations, it aims to increase accuracy and reliability in NLP and machine translation. The
ultimate aim of this study is to contribute to more sensitive and reliable evaluation metrics in these
fields.

Keywords

Natural Language Processing, Phonological Errors, ROUGE, Machine Translation, Evaluation Metrics, Edit Distance Metrics



Details

Primary Language	English
Subjects	Natural Language Processing
Journal Section	PAPERS
Authors	Ayşegül Çağlı (ORCID 0009-0000-0237-4661), Vakkas Karakurt (ORCID 0009-0006-1489-3833), Kürşat Edabalı Yıldırım (ORCID 0009-0006-6069-4691), Fatih Soygazi (ORCID 0000-0001-8426-2283), Yılmaz Kılıçaslan (ORCID 0000-0002-5020-6547)
Publication Date	October 18, 2023
Submission Date	August 26, 2023
Acceptance Date	August 26, 2023
Published in Issue	Year 2023 Volume: IDAP-2023 : International Artificial Intelligence and Data Processing Symposium Issue: IDAP-2023

Cite

APA	Çağlı, A., Karakurt, V., Yıldırım, K. E., Soygazi, F., et al. (2023). The Role of Phonological Errors in Evaluation Metrics. Computer Science, IDAP-2023 : International Artificial Intelligence and Data Processing Symposium(IDAP-2023), 44-51. https://doi.org/10.53070/bbd.1350547