Drozhashchikh, N. V., Efimova, E. V. (2025) Lemmatization of low-resource languages in diachronic linguistics: Problems and solutions. Izvestia of the Herzen State Pedagogical University of Russia, no. 217, pp. 302-311. (In Russian)
1. Bech, K., Eide, K. (2014) The ISWOC corpus. Department of Literature, Area Studies and European Languages, University of Oslo. [Online]. Available at: http://iswoc.github.io (accessed 23.05.2025). (In English)
2. CLTK. (2023) Old English Lemmas: oe.lemmas. [Online]. Available at: https://github.com/cltk/ang_models_cltk/blob/master/data/oe.lemmas (accessed 23.05.2025). (In English)
3. Efimova, E. V. (2025) OE_Lemmatization. [Online]. Available at: https://github.com/webnora/OE_Lemmatization (accessed 23.05.2025). (In English)
4. Johnson, K. P., Burns, P., Stewart, J., Cook, T. (2014-2021) CLTK: The Classical Language Toolkit. [Online]. Available at: https://github.com/cltk/cltk (accessed 05.03.2025). (In English)
5. The ISWOC Project. (2021) The ISWOC Treebank. [Online]. Available at: https://dev.syntacticus.org/iswoc.html#downloads (accessed 23.05.2025). (In English)
6. UniMorph Project. (2022) Universal Morphology (UniMorph). [Online]. Available at: https://github.com/unimorph (accessed 23.05.2025). (In English)
7. Wikimedia Foundation. (2025) Index of /enwiktionary/. [Online]. Available at: https://dumps.wikimedia.org/enwiktionary/ (accessed 23.05.2025). (In English)
8. Category: Old English lemmas. (2022). Wiktionary. [Online]. Available at: https://en.wiktionary.org/wiki/Category:Old_English_lemmas (accessed 23.05.2025). (In English)
9. Bosworth, J. (2014) An Anglo-Saxon dictionary online. [Online]. Available at: https://bosworthtoller.com/ (accessed 23.05.2025). (In English)
10. Healey, A., diPaolo, A., Holland, J. et al. (2009) The dictionary of Old English corpus in electronic form, TEI-P5 conformant version [CD-ROM]. Toronto: Dictionary of Old English Project, University of Toronto. (In English)
11. Kay, C., Alexander, M., Dallachy, F. et al. (eds.) (2025) The historical thesaurus of English. 2nd ed., version 5.0. Glasgow: University of Glasgow Publ. [Online]. Available at: https://ht.ac.uk/ (accessed 23.05.2025). (In English)
12. Roberts, J., Kay, C., Grundy, L. (2017) A Thesaurus of Old English. Glasgow: University of Glasgow Publ. [Online]. Available at: http://oldenglishthesaurus.arts.gla.ac.uk/ (accessed 23.05.2025). (In English)
13. Watkins, C. (1985) The American heritage dictionary of Indo-European roots. Boston: Houghton Mifflin Publ., 149 p. (In English)
14. Wiktionary. (2025) [Online]. Available at: https://en.wiktionary.org/wiki/ (accessed 23.05.2025). (In English)
15. Bajčetić, L., Declerck, T. (2022) Using Wiktionary to create specialized lexical resources and datasets. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022) June 20-25, 2022. Marseille: European Language Resources Association Publ., pp. 3457-3460. (In English)
16. Dereza, O. (2016) Building a dictionary-based lemmatizer for Old Irish. In: Actes de la conférence conjointe JEP-TALN-RECITAL, vol. 6: Celtic Language Technology Workshop. Paris: AFCP - ATALA Publ., pp. 12-17. (In English)
17. Eckhoff, H., Berdičevskis, A. (2016) Automatic parsing as an efficient pre-annotation tool for historical texts. In: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH) December 11-17 2016. Osaka: The COLING 2016 Organizing Committee Publ., pp. 62-70. (In English)
18. Eger, S., vor der Brück, T., Mehler, A. (2015) Lexicon-assisted tagging and lemmatization in Latin: A comparison of six taggers and two lemmatization models. In: Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities July 30, 2015. Beijing: Association for Computational Linguistics Publ., pp. 105-113. https://doi.org/10.18653/v1/W15-3716 (In English)
19. Fernández, L. G. (2020) Sources and steps of corpus lemmatization. Old English anomalous verbs. Revista Española de Lingüística Aplicada - Spanish Journal of Applied Linguistics, vol. 33, no. 2, pp. 416-442. https://doi.org/10.1075/resla.18024.gar (In English)
20. Gogoi, A., Baruah, N. (2022) A lemmatizer for low-resource languages: WSD and its role in the Assamese language. Transactions on Asian and Low-Resource Language Information Processing, vol. 21, no. 4, article 74. https://doi.org/10.1145/3502157 (In English)
21. Hämäläinen, M., Partanen, N., Alnajjar, K. (2021) Lemmatization of historical old literary Finnish texts in modern orthography. In: Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 28 juin au 2 juillet 2021 [Proceedings of the 28th Conference on Natural Language Processing (TALN), June 28 to July 2, 2021]. Lille: ATALA Publ., pp. 189-198. https://doi.org/10.48550/arXiv.2107.03266 (In English)
22. Hathout, N., Sajous, F., Calderone, B. (2014) Acquisition and enrichment of morphological and morphosemantic knowledge from the French Wiktionary. In: Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing August 24 2014. Dublin: Association for Computational Linguistics and Dublin City University Publ., pp. 65-74. https://doi.org/10.3115/v1/W14-5809 (In English)
23. Hogg, R. M. (ed.). (1992) The Cambridge History of the English Language. Vol. 1: The Beginning to 1066. Cambridge: Cambridge University Press, 613 p. (In English)
24. Kanerva, J., Ginter, F., Salakoski, T. (2021) Universal lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks. Natural Language Engineering, vol. 27, no. 5, pp. 545-574. https://doi.org/10.1017/S1351324920000224 (In English)
25. Karimov, R., Samkova, M., Nikitina, S., Akinin, A. (2017) Using a hybrid algorithm for lemmatization of a diachronic corpus. In: Proceedings of the Workshop on Computational Linguistics and Language Science, 26 April 2016 CLLS 2016. Vol. 1886, pp. 1-8. [Online]. Available at: https://ceur-ws.org/Vol-1886/ (accessed 05.03.2025). (In English)
26. Liebeck, M., Conrad, S. (2015) IWNLP: Inverse Wiktionary for natural language processing. In: Proceedings of the 53rd Annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, vol. 2: Short Papers. Beijing: Association for Computational Linguistics Publ., pp. 414-418. https://doi.org/10.3115/v1/P15-2068 (In English)
27. Magueresse, A., Carles, V., Heetderks, E. (2020) Low-resource languages: A Review of past work and future challenges. [Online]. Available at: https://arxiv.org/pdf/2006.07264 (accessed 05.03.2025). (In English)
28. Manjavacas, E., Kádár, Á., Kestemont, M. (2019) Improving lemmatization of non-standard languages with joint learning. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Human language technologies, vol. 1 (Long and Short Papers). Minneapolis: Association for Computational Linguistics Publ., pp. 1493-1503. https://doi.org/10.18653/v1/N19-1153 (In English)
29. Martín Arista, J. (2025) The Computational Study of Old English. Encyclopedia, vol. 5, no. 3, article 137. https://doi.org/10.3390/encyclopedia5030137 (In English)
30. Mausam, Soderland, S., Etzioni, O. et al. (2009) Compiling a massive, multilingual dictionary via probabilistic inference. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP, vol. 1. Suntec: Association for Computational Linguistics Publ., pp. 262-270. (In English)
31. Miletić, A., Siewert, J. (2023) Lemmatization experiments on two low-resourced languages: Low Saxon and Occitan. In: Tenth workshop on NLP for similar languages, varieties and dialects (VarDial 2023) May 5, 2023. Dubrovnik: Association for Computational Linguistics Publ., pp. 163-173. https://doi.org/10.18653/v1/2023.vardial-1.17 (In English)
32. Navarro, E., Sajous, F., Gaume, B. et al. (2009) Wiktionary for Natural language processing: Methodology and limitations. In: Proceedings of the 2009 workshop on the people's web meets NLP: Collaboratively constructed semantic resources (People's Web). Suntec: Association for Computational Linguistics Publ., pp. 19-27. (In English)
33. Nigatu, H. H., Tonja, A. L., Rosman, B. et al. (2024) The Zeno's paradox of "low-resource" languages. In: Proceedings of the 2024 conference on Empirical methods in natural language processing. Miami: Association for Computational Linguistics Publ., pp. 17753-17774. https://doi.org/10.18653/v1/2024.emnlp-main.983 (In English)
34. Percillier, M., Trips, C. (2020) Lemmatising verbs in Middle English corpora: The benefit of enriching the Penn-Helsinki parsed corpus of Middle English 2 (PPCME2), the parsed corpus of Middle English Poetry (PCMEP), and a parsed linguistic atlas of early Middle English (PLAEME). In: Proceedings of the 12th language resources and evaluation conference (LREC 2020) 11-16 May, 2020. Marseille: European Language Resources Association Publ., pp. 7170-7178. (In English)
35. Sáenz, M. T., Rodríguez, D. M. (2018) A semiautomatic lemmatisation procedure for treebanks. Old English strong and weak verbs. In: Proceedings of the 16th international workshop on treebanks and linguistic theories (TLT16) January 23-24, 2018. Prague: [s. n.], pp. 88-94. (In English)
36. Saunack, K., Saurav, K., Bhattacharyya, P. (2021) How low is too low? A monolingual take on lemmatisation in Indian languages. In: Proceedings of the 2021 Conference of the North American chapter of the association for computational linguistics: Human language technologies. [Online]. Available at: https://doi.org/10.18653/v1/2021.naacl-main.322 (accessed 05.03.2025). (In English)
37. Swaelens, C., Singh, P., De Vos, I., Lefever, E. (2024) Lemmatisation of medieval Greek: Against the limits of transformers' capabilities? In: Proceedings of the 2024 joint international conference on computational linguistics, language resources and evaluation (LREC-COLING 2024) 20-25 May, 2024. Torino: ELRA Publ., pp. 10293-10302. (In English)
38. Torre, A. R. (2022) Automatic lemmatization of Old English class III strong verbs (L-Y) with ALOEV3. Journal of English Studies, vol. 20, pp. 237-266. https://doi.org/10.18172/jes.5324 (In English)
39. Wiktionary. (2025) Wikipedia. [Online]. Available at: https://en.wikipedia.org/wiki/Wiktionary (accessed 14.08.2025). (In English)
40. Wiktionary: Old English entry guidelines. (2025) Wiktionary. [Online]. Available at: https://en.wiktionary.org/wiki/Wiktionary%3AOld_English_entry_guidelines (accessed 21.08.2025). (In English)
41. Zesch, T., Gurevych, I. (2009) Wisdom of crowds versus wisdom of linguists - measuring the semantic relatedness of words. Journal of Natural Language Engineering, vol. 16, no. 1, pp. 25-59. https://doi.org/10.1017/S1351324909990167 (In English)