I work in the area of language technology.
Recurring key aspects of my research are:
- Methodology and fundamentals: Development of fundamental linguistic resources such as a part-of-speech tagset for Albanian.
I have an interest in natural language processing (NLP). So far, I have been active in the following areas: tokenization, computational morphology, part-of-speech tagging, sentiment analysis, semantic similarity, and implicit emotion recognition.
Kabashi, Besim, and Thomas Proisl. 2018. “Albanian Part-of-Speech Tagging: Gold Standard and Evaluation.” In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), edited by Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, 2593–9. Miyazaki: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2018/pdf/89.pdf. [bib, pdf]
Kabashi, Besim, and Thomas Proisl. 2016. “A Proposal for a Part-of-Speech Tagset for the Albanian Language.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), edited by Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, and Stelios Piperidis, 4305–10. Portorož: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2016/pdf/1066_Paper.pdf. [bib, pdf]
Lexicography / Valency / Cooccurrence phenomena
I am a member of the Interdisciplinary Centre for Research on Lexicography, Valency and Collocation.
Kabashi, Besim. 2019. “Collecting Collocations for the Albanian Language.” In Proceedings of the Sixth Biennial Conference on Electronic Lexicography: Electronic Lexicography in the 21st Century (eLex 2019), Sintra, Portugal, October 1–3, 2019, edited by Iztok Kosem, Tanara Zingano Kuhn, et al., 478–89. Brno, Czech Republic: Lexical Computing, s.r.o. https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_27.pdf. [bib]
Proisl, Thomas, Philipp Heinrich, Stefan Evert, and Besim Kabashi. 2017. “Translation Inference Across Dictionaries via a Combination of Graph-Based Methods and Co-Occurrence Statistics.” In Proceedings of the LDK 2017 Workshops: 1st Workshop on the OntoLex Model (OntoLex-2017), Shared Task on Translation Inference Across Dictionaries & Challenges for Wordnets, edited by John P. McCrae, Francis Bond, Paul Buitelaar, Philipp Cimiano, Thierry Declerck, Jorge Gracia, Ilan Kernerman, Elena Montiel-Ponsoda, Noam Ordan, and Maciej Piasecki, 94–102. Galway: CEUR-WS.org. http://ceur-ws.org/Vol-1899/TIAD17_paper_1.pdf. [bib, pdf]
Kabashi, Besim. 2007. “Pronominal Clitics and Valency in Albanian: A Computational Linguistics Perspective and Modelling Within the LAG-Framework.” Edited by Thomas Herbst and Katrin Götz-Votteler. Trends in Linguistics. Studies and Monographs 187: 339–52. https://doi.org/10.1515/9783110198775.4.339. [bib, pdf]
I am also a member of the Interdisciplinary Centre for Digital Humanities.