Simula@BI: Large language models under the hood

What you always wanted to know about *GPT

  • Starts: 12:00, 30 May 2023
  • Ends: 13:00, 30 May 2023
  • Location: BI - campus Oslo, room: C2-005
  • Contact: Siri Johnsen (Siri.johnsen@bi.no)

Simula@BI invites Associate Professor Andrey Kutuzov to give a talk about machine learning.

In the last few years, a radical increase in the scale of deep neural language models (in both the size of the training data and the size of the models themselves) has led to impressive achievements in various natural language processing tasks.

"Celebrity" models like ChatGPT, LLaMa, BLOOM or PaLM are already sometimes described as "approaching artificial intelligence", although the reality can differ from over-hyped media coverage.

In this talk, Professor Andrey Kutuzov will describe the foundations of the technology behind large-scale language models. The two most important components behind their success are 1) state-of-the-art deep learning architectures (in particular, the Transformer) and 2) the availability of tremendous amounts of textual data used to train such models. The interaction of these two components poses intricate theoretical and practical questions, which are also linked to the unequal distribution of computing resources and to biases in training data.

Can we actually reach AI by simply training ever larger language models?


