Maria Lomeli

About me

I am a research engineer at Fundamental AI Research, Meta. Previously, I was a senior research scientist at Babylon Health, UK. Before that, I was a research associate working with Zoubin Ghahramani in the Machine Learning group, CBL, University of Cambridge, and a member of Trinity Hall college. I did my PhD at the Gatsby Unit, UCL, where my supervisor was Yee Whye Teh.

Journal publications

Petroni, F., Broscheit, S., Piktus, A., Lewis, P., Izacard, G., Hosseini, L., Dwivedi-Yu, J., Lomeli, M., Schick, T., Mazaré, P.E., Joulin, A., Grave, E., Riedel, S., ''Improving Wikipedia verifiability with AI'', Nature Machine Intelligence, 2023, Vol 5, pp 1142–1148, NMI.

Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., Dwivedi-Yu, J., Joulin, A., Riedel, S., Grave, E., ''Atlas: few-shot learning with retrieval-augmented language models'' Journal of Machine Learning Research, 2023, Vol 24, pp 1-43, JMLR .

Mialon, G., Dessi, R., Lomeli, M., Nalmpantis, C., Pasunuru, R., Raileanu, R., Rosiere, B., Schick, T., Dwivedi-Yu, J., Celikyilmaz, A., LeCun, Y., Scialom, T., ''Augmented language models: a survey'', Transactions on Machine Learning Research, 2023, Vol 6, TMLR.

Valera, I., Pradier, M., Lomeli, M. and Ghahramani, Z., ''General Latent Feature Model for Heterogeneous Datasets'', Journal of Machine Learning Research, 2020, Vol 21 JMLR.

Lomeli, M., Rowland, M., Gretton, A. and Ghahramani, Z., ''Antithetic and Monte Carlo kernel estimators for partial rankings'', Statistics and Computing, 2019, Vol 29, pp 1127–1147, StCo.

Lomeli, M., Favaro, S., Teh, Y.W., ''A marginal sampler for $\sigma$-Stable Poisson-Kingman mixture models'', Journal of Computational and Graphical Statistics, 2017, Vol 26, pp 44-53, JCGS.

Favaro, S., Lomeli, M., Nipoti, B., Teh, Y.W., ''Stick-breaking representations of $\sigma$-stable Poisson-Kingman models'', Electronic Journal of Statistics, 2014, Vol 8, pp 1063-1085, EJS.

Favaro, S., Lomeli, M., Teh, Y.W., ''On a class of $\sigma$-stable Poisson-Kingman models and an effective marginalized sampler'', Statistics and Computing, 2014, Vol 25, pp 67-78, StCo.

Proceedings

Mekala, D., Weston, J., Lanchantin, J., Raileanu, R., Lomeli, M., Shang, J., Dwivedi-Yu, J., 2024, ''Toolverifier: Generalization to New Tools via Self-Verification'' EMNLP Findings

Dwivedi-Yu, J., Schick, T., Jiang, Z., Lomeli, M., Lewis, P., Izacard, G., Grave, E., Riedel, S., Petroni, F., 2024, ''EditEval: an instruction-based benchmark for text improvements'' CoNLL

Lin, V. X., Chen, X., Chen, M., Shi, W., Lomeli, M., James, R., Rodriguez, P., Kahn, J., Szilvasy, G., Lewis, M., Zettlemoyer, L., Yih, S., 2024, ''RA-DIT: Retrieval-Augmented Dual Instruction Tuning'' ICLR

Shi, W., Min, S., Lomeli, M., Chou, Z., Li, M., Lin, V., Smith, N. A., Zettlemoyer, L., Yih, S., Lewis, M., 2024, ''In-Context Pretraining: Language Modeling Beyond Document Boundaries'' ICLR (accepted as spotlight)

Schick, T., Dwivedi-Yu, J., Dessi, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., Scialom, T., 2023, ''Toolformer: language models can teach themselves to use tools'' Neural Information Processing Systems NeurIPS (accepted as an oral presentation)

Harman, M., Ahlgren, J., Berezin, M., Dulskyte, E., Dvortsova, I., George, J., Gucevska, N., Meijer, E., Spahr-Summers, J., Bojarczuk, K., Sapora, S. and Lomeli, M., 2021, ''Testing Web Enabled Simulation at Scale Using Metamorphic Testing'', International Conference on Software Engineering ICSE

Lomeli, M., Favaro, S., Teh, Y.W., 2015, ''A hybrid sampler for Poisson-Kingman mixture models'', Neural Information Processing Systems NeurIPS

Sejdinovic, D., Strathmann, H., Lomeli Garcia, M., Andrieu, C., Gretton, A., 2014, ''Kernel Adaptive Metropolis-Hastings'', International Conference on Machine Learning ICML

Workshops

Gautam, D., Lomeli, M., Gourgoulias, K., Thompson, D., Johri, S., 2019, ''Masking schemes for universal marginalisers'', Advances in Approximate Bayesian Inference symposium

Bloem-Reddy, B., Mathieu, E., Foster, A., Rainforth, T., Ge, H., Lomeli, M., Ghahramani, Z., Teh, Y.W., 2017, ''Sampling and inference for discrete random probability measures in probabilistic programs'', Advances in Approximate Bayesian Inference workshop, NeurIPS

Thesis

General Bayesian inference schemes in infinite mixture models
PhD thesis, University College London
UCL repository: Doctoral dissertation link

Preprints

Mazare, P., Szilvasy, G., Lomeli, M., Massa, F., Murray, N., Jegou, H., and Douze, M., 2025, ''Inference-time sparse attention with asymmetric indexing'' arXiv

Lupidi, A., Gemmell, C., Cancedda, N., Dwivedi-Yu, J., Weston, J., Foerster, J., Raileanu, R. and Lomeli, M., 2024, ''Source2Synth: synthetic data generation and curation grounded in real data sources'' arXiv

Singh, A.K., Kocyigit, M. Y., Poulton, A., Esiobu, D., Lomeli, M., Szilvasy, G. and Hupkes, D., 2024, ''Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?'' arXiv

Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazare, P.E., Lomeli, M., Hosseini, L., Jegou, H., 2024, ''The faiss library'' arXiv

Talks

February, 2024. Talk at the RIIA 6.0 summer school, Quito, Ecuador (virtual)

July, 2023. Talk at the Microsoft Research Montreal and Mila seminar, Montreal, Canada (virtual)

June, 2023. Talk at the Gatsby unit anniversary symposium, UCL, London

June, 2023. Talk at the virtual seminar series, Sinc institute, Santa Fe, Argentina (virtual)

March, 2023. Talk at the NLP parallel session, Latin American Meeting in Artificial Intelligence, Khipu, Montevideo, Uruguay

October, 2021. Joint keynote talk with Mark Harman at the ESEM 2021 conference (virtual)

September, 2019. Talk at the ''Recent developments on kernel methods'', UCL, London

July, 2019. Talk at the Gatsby unit anniversary symposium, UCL, London

June, 2019. Talk at ''Congreso Bayesiano de América Latina'', Lima, Peru

April, 2019. Talk at ''Achieving impact in healthcare: from mathematics to clinical support systems and devices'' workshop, Newton Institute, Cambridge, UK

March 20, 2019. Talk at the Statistics seminar series, Queen Mary University, London

June 11, 2018. Talk at Parallelising Monte Carlo Algorithms workshop, School of Mathematics, University of Bristol, Bristol

March 15, 2018. Talk at the CamAIML event, Microsoft Research Cambridge, Cambridge, UK

February 16, 2018. Talk at the University of Glasgow, Statistics seminar, Glasgow, UK

February 2, 2018. Talk at UCL, CSML Lunchtime seminar, London, UK

February 1, 2018. Talk at Amazon Cambridge research series seminar, Cambridge, UK

August 30, 2017. Talk at the 2017 SMC workshop, Uppsala, Sweden

June 7, 2017. Talk at the ''Congreso Bayesiano de América Latina'', Guanajuato, Mexico

June 14, 2016. Talk at the ''Bayes Legacy'' session, 13th ISBA World meeting in Sardinia, Italy

June 2, 2016. Talk at the Machine Learning group, CBL, University of Cambridge

May 5, 2016. Talk at Machine Learning reading group, CBL, University of Cambridge

July 16, 2015. Talk at CBL, University of Cambridge

June 22, 2015. Talk at the 10th Bayesian Nonparametrics conference

June 15, 2015. Talk at the 9th Bayesian Inference for Stochastic Processes conference

January 26, 2014. Talk at the Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México

October 24, 2014. Talk at the Computational Statistics seminar, University of Oxford

September 24, 2014. Talk at CBL, University of Cambridge

March 3, 2014. Talk at the workshop Advances in Scalable Bayesian Computation, available online here

Teaching

Teaching assistant, Part II Statistical modelling course, Statslab, University of Cambridge (Lent, 2018)

Coding lab demonstrator, APTS, Statistical computing module for Statistics PhD students, University of Cambridge (December, 2017)

Coding lab demonstrator, MLSALT1 graduate course, University of Cambridge (Michaelmas, 2017)

Coding lab demonstrator, 3F8 undergraduate course, University of Cambridge (Lent, 2017)

Teaching assistant, Statistical Data Mining and Machine Learning MSc in Applied Statistics course, University of Oxford (Hilary term 2014 and 2015)

Coding lab demonstrator, Kernel methods module, Introduction to machine learning graduate course, University College London (2013)

Teaching assistant, Probabilistic and Unsupervised Learning graduate course, University College London (Autumn, 2012)

Lecturer, Stochastic Processes, undergraduate course, Instituto Tecnológico Autónomo de México (Summer, 2011)

Reviewing

2023 Action Editor, ACL RR

2025 Reviewer, ACL RR

2019, 2021, 2022 Uncertainty in Artificial Intelligence conference

2018, 2020 Bayesian Analysis

2018 Journal of Machine Learning Research

2017 Biometrika

2017 Scandinavian Journal of Statistics

2016 Computational Statistics and Data Analysis

2016 Statistics and Computing

2016, 2017, 2019, 2024, 2025 International Conference on Machine Learning

2013, 2014, 2015, 2017, 2018, 2019, 2023, 2025 Neural Information Processing Systems

2014, 2015, 2020, 2021, 2022 AISTATS

Miscellaneous

I was one of the organisers of our CSML Lunch Talk Series.