Publications
BERGEN: A Benchmarking Library for Retrieval-Augmented Generation
David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina, Stéphane Clinchant
Findings of EMNLP 2024
[code]
Retrieval-augmented generation in multilingual settings
Nadezhda Chirkova, David Rau, Hervé Déjean, Thibault Formal, Stéphane Clinchant, Vassilina Nikoulina
Knowledgeable LLMs Workshop at ACL 2024
[code]
Zero-shot cross-lingual transfer in instruction tuning of large language models
Nadezhda Chirkova, Vassilina Nikoulina
INLG 2024
[code]
Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks
Nadezhda Chirkova, Vassilina Nikoulina
NAACL 2024
Should you marginalize over possible tokenizations?
Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman
ACL 2023
[code]
CodeBPE: investigating subtokenization options for large language model pretraining on source code
Nadezhda Chirkova, Sergey Troshin
ICLR 2023; an earlier version appeared at the DL4Code Workshop at ICLR 2022 (spotlight)
Probing pretrained models of source code
Sergey Troshin, Nadezhda Chirkova
BlackboxNLP Workshop at EMNLP 2022
[code]
Parameter-efficient finetuning of Transformers for source code
Shamil Ayupov, Nadezhda Chirkova
Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2022
On the periodic behavior of neural network training with batch normalization and weight decay
Ekaterina Lobacheva*, Maxim Kodryan*, Nadezhda Chirkova, Andrey Malinin, Dmitry Vetrov
NeurIPS 2021
[code]
Empirical study of Transformers for source code
Nadezhda Chirkova, Sergey Troshin
ESEC/FSE 2021
[code] [video]
A simple approach for handling out-of-vocabulary identifiers in deep learning for source code
Nadezhda Chirkova*, Sergey Troshin*
NAACL 2021
[code] [video]
On the embeddings of variables in recurrent neural networks for source code
Nadezhda Chirkova
NAACL 2021
[code]
On power laws in deep ensembles
Ekaterina Lobacheva, Nadezhda Chirkova, Maxim Kodryan, Dmitry Vetrov
NeurIPS 2020 (spotlight)
[code] [video]
Structured sparsification of gated recurrent neural networks
Ekaterina Lobacheva*, Nadezhda Chirkova*, Alexander Markovich, Dmitry Vetrov
AAAI 2020
[code]
Bayesian compression for natural language processing
Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
EMNLP 2018
[code]
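* equal contribution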