Faculty and research

Department of Data Science and Analytics

The Department of Data Science and Analytics is the ninth department at BI Norwegian Business School and was established in 2020. Our faculty have a background in statistics, machine learning, statistical learning, and/or artificial intelligence.

About the DataScience@BI Research Events

The data science research events consist of talks on fundamental and applied research in statistics, machine learning, and artificial intelligence. When an event is visible in the list above, please click to register. Registration opens about one week before the event. The table below gives an overview of upcoming seminars this semester.

If you would like to suggest a speaker, contact Assistant Professor Adam Lee.

Practical questions: contact Siri Johnsen for information.

Join our e-mail list to get invitations to our events about 1 week before the event.

Previous events

2025

13 May: DataScience@BI seminar with Matteo Barigozzi
8 April: DataScience@BI seminar with Associate Professor Jad Beyhum from KU Leuven
18 March: DataScience@BI seminar with Bjarni Einarsson, Economist, Central Bank of Iceland

2024

3 December: DataScience@BI seminar with Daniel Heydecker
19 November: DataScience@BI seminar invites Gabriel Lewis
5 November: DataScience@BI invites postdoctoral researcher Amrei Luise Stammann, University of Bayreuth
22 October: DataScience@BI invites Assistant Professor Yiru Wang, Department of Economics at the University of Pittsburgh
15 October: DataScience@BI invites Assistant Professor Jesper Riis-Vestergaard Sørensen at the University of Copenhagen, Department of Economics
8 October: DataScience@BI invites Doctoral Research Fellow Ingrid Dæhlen, University of Oslo
Tuesday 28 May: Oslo Big Data Day 2024

Tuesday 23 April: "Do high frequency text data help forecast crude oil prices? MF-VAR vs. MIDAS" (Machine learning) with Luigi Gifuni
Monday 22 April: "Innovation Powered Narrative Inference" (Econometrics) with Geert Mesters
Tuesday 9 April: "On the Existence and Information of Orthogonal Moments" (Econometrics) with Juan Carlos Escanciano
Tuesday 19 March: "Signature methods for stochastic portfolio theory" (Data Science/Machine learning) with Christa Cuchiero
Tuesday 12 March: "How to Bet on Winners" (Econometrics) with André B.M. Souza
Tuesday 27 February: "Optimal support for distressed subsidiaries - a systemic risk perspective." with Nils Detering
Tuesday 6 February: "Testing Bayesian-Nash Behaviour in Binary Games with Incomplete Information and Correlated Types" with James Duffy
Tuesday 23 January: "Cointegration with Occasionally Binding Constraints" (Econometrics) with Elia Lapenta

2023

Spring 2023

Date	Time	Title	Speaker	Institution
24 January	12:00-13:00	Testing for the Cointegration Rank Between Periodically Integrated Processes	Tomás Del Barrio Castro	University of the Balearic Islands
7 February	12:00-13:00	Economic Data Science is All You Need: Evaluating UK Policies and More with Text Analysis	Arthur Turrell	Office for National Statistics UK
14 February	12:00-13:00	An Introduction to Rough Path Theory and its Relation to Stochastic Calculus	Florian Bechtold	Bielefeld University
7 March	12:00-13:00	On Computational Barriers in Inverse Problems	Luca Eva Gazdag	University of Oslo
14 March	12:00-13:00	Modelling Power Markets in Europe Towards 2030: Challenges and Results	Head of European Power Research Gabriele Martinelli	Refinitiv
21 March	12:00-13:00	An Introduction to Uses of (Causal) Machine Learning in Management and Economics Research	Ed Saiedi	BI Norwegian Business School
18 April	12:00-13:00	Detecting Giver and Receiver Spillover Groups in Large Vector Autoregressions	G. Stefan Gudmundsson	University of Aarhus
16 May	12:00-13:00	Bayesian Hyperparameter Learning	Mattias Villani	Stockholm University
23 May	12:00-13:00	Nonlinear Vector Autoregressive Models and Unit Roots	Rickard Sandberg	Stockholm School of Economics
30 May	12:00-13:00	Large Language Models Under the Hood	Andrei Kutuzov	University of Oslo
6 June	12:00-13:00	Clustering via Coherent Network Partitions	Angela Angeleska	University of Tampa, FL USA

Fall 2023

Date	Time	Title	Speaker	Institution
5 Sep	12:00-13:00	Motivation and Diagnostics for Nonlinear Structural Equation Models Using Non-Parametric Regression Among (Econometrics)	Julien P. Irme	Goethe University Frankfurt
26 Sep	12:00-13:00	Interdisciplinary Analysis of Information Diffusion in Large-scale Media Platforms (Machine Learning)	Ivan Belik	NHH
10 Oct	12:00-13:00	Predictive Ability Tests with Possibly Overlapping Models (with V. Corradi and D. Gutknecht) (Econometrics)	Jack Fosten	King’s Business School, King's College London
30 Oct (Monday!)	12:00-13:00	Open Banking and Customer Data Sharing: Implications for FinTech Borrowers	Rachel Nam	Goethe University Frankfurt
14 Nov	12:00-13:00	On Least Squares Estimation Under Random Breaks In Means For Panel Data (Econometrics)	Joakim Westerlund	Lund University
21 Nov	12:00-13:00	"RNN(p) models for Probabilistic Forecasting: Theory and Applications." (Data Science/Machine Learning)	Pietro Manzoni	Politecnico di Milano
5 Dec	12:00-13:00	Tuning-free testing of factor regression against factor-augmented sparse alternatives (Econometrics)	Jonas Striaukas	Copenhagen Business School

2022

Spring 2022

Date	Time	Title	Speaker	Institution
3 February	13:30 - 14:30	Identifying dominant units using graphical models in panel time series data	Jan Ditzen	Free University of Bozen-Bolzano
10 February	13:30 - 14:30	Automatic misinformation detection: results from the MediaEval multimedia evaluation challenge	Konstantin Pogorelov	Simula
3 March	13:30 - 14:30	Jumps or Staleness	Roberto Renò	Department of Economics, the University of Verona
24 March	13:30 - 14:30	Data assimilation: Methods, convergence, results and challenges	Håkon Hoel	Department of Mathematics, RWTH Aachen
31 March	13:30 - 14:30	Should we stop using black box machine learning models and use interpretable models instead?	Lars Henry Berge Olsen	University of Oslo
12 May	13:30 - 14:30	Back to the present: learning about the Euro-area through a now-casting model	Domenico Giannone	Amazon.com
19 May	13:30 - 14:30	Mean field models in physics and social science - from micro to macro	Avi Mayorcas	University of Cambridge
16 June	13:30 - 14:30	Adventures in data-driven software engineering	Leon Moonen	Simula and BI

Fall 2022

Date	Time	Title	Speaker	Institution
6 September	12:00 - 13:00	This Shock is Different: Estimation and Inference in Misspecified Two-Way Fixed Effects Panel Regressions?	Arturas Juodis	University of Amsterdam
15 September	12:00 - 13:00	AI risk and AI alignment at the hinge of history	Olle Häggström	Chalmers, Göteborg
20 September	12:00 - 13:00	Towards a general local search framework for job-shop scheduling problems	Karim Tamssaouet	BI Norwegian Business School
11 October	12:00 - 13:00	A Flexible Predictive Density Combination for Large Financial Data Sets in Regular and Crisis Periods	Herman van Dijk	Norges Bank, Erasmus University Rotterdam
18 October	12:00 - 13:00	Adaptive Partial Identification of Treatment Effects	Maria Nareklishvili	The Frisch Centre
25 October	12:00 - 13:00	Dynamic Programming on a Quantum Annealer: Solving the RBC Model	Isaiah Hull	Sveriges Riksbank, BI Norwegian Business School
1 November	12:00 - 13:00	A Bayesian two-way latent structure clustering model for genomic data integration with an application to subtyping breast cancer	Arnoldo Frigessi	Department of Biostatistics, University of Oslo
8 November	12:00 - 13:00	Reformulation and decomposition of Pearson and Spearman correlation coefficients into practical measures	Savas Papadopoulos	Central Bank of Greece
15 November	12:00 - 13:00	Realized principal component analysis of noisy high-frequency data	Francesco Benvenuti	Aarhus University, School of Business and Social Sciences
29 November	12:00 - 13:00	Online inference from graph-connected multi-variate time series	Baltasar Beferull-Lozano	Simula
6 December	12:00 - 13:00	High-dimensional CCE: The Cross-Sectionally Averaged Adaptive Lasso	Luca Margaritella	University of Lund

2021

Spring 2021

Date	Time	Title	Speaker	Institution	Url
10 June	12:30-13:30	Bayesian network structure learning	Wei-Ting Yang	BI	BI
Abstract: A Bayesian network is a probabilistic graphical model describing the conditional dependencies among variables via a directed acyclic graph (DAG). The structure of Bayesian networks can be established based on expert domain knowledge or learned from data. Learning structure from data is known to be a computationally challenging, NP-hard problem. In this seminar, two main structure learning methods will be introduced, namely constraint-based methods and score-based methods. Other approaches will be discussed as well. I will also present my previous research, which uses Bayesian networks to solve process control problems in semiconductor manufacturing. The structure is constructed based on sensor data collected from process machines, and some existing domain knowledge is also integrated into the learning procedure.
27 May	12:30-13:30	Multimodal learning models	Rogelio A Mancisidor	BI	BI
Abstract: Multimodal learning engages multiple action systems of a learner. For example, children learn to distinguish cats from dogs by auditory inputs in addition to the obvious visual inputs. There is a field in machine learning that aims to mimic such a learning process. The main difference in the machine learning approach to multimodal learning is that learning is not done in the input space (the auditory and visual inputs) but using a (hopefully) better representation of the inputs. Once such an alternative representation is learned, it can be used to generate or classify the inputs, e.g. images of cats or dogs. In this seminar, I will present my own work on multimodal learning models that are able to learn and generate alternative representations of the input data even when one of the inputs is missing. That is, the alternative representation that is generated contains information from both auditory and visual inputs even when it was generated only with the visual input, for example. I will show generative and classification results, for the models that I have developed, on different domains, e.g. image-to-annotation, image-to-image, acoustic-to-articulatory, and historic-to-future credit risk data.
20 May	12:30-13:30	Scalable changepoint and anomaly detection in cross-correlated data	Martin Tveten	Norwegian Computing Center	NR
Abstract: Motivated by a problem of detecting time-periods of suboptimal operation of a sensor-monitored subsea pump, this talk presents work on detecting anomalies or changes in the mean of a subset of variables in cross-correlated data. The approach is based on penalised likelihood methodology, but the maximum likelihood solution scales exponentially in the number of variables. That is, not many variables are needed before an approximation is necessary. We propose an approximation in terms of a binary quadratic program and derive a dynamic programming algorithm for computing its solution in linear time in the number of variables, given that the precision matrix is banded. Our simulations indicate that little power is lost by using the approximation in place of the exact maximum likelihood, and that our method performs well even if the sparsity structure of the precision matrix estimate is misspecified. Through the simulation study, we also aim to understand when it is worth the effort to incorporate correlations rather than assuming all variables to be independent in terms of detection power and accuracy.
29 April	12:30-13:30	Inference in second-order Bayesian networks	Magdalena Ivanovska	BI	BI
Abstract: Bayesian networks are a well-established framework for modelling and reasoning with probabilistic information, with a variety of practical applications. However, they require specification of exact probability values which can be challenging in many cases due to lack of data and/or domain expertise. In this talk I give a short introduction to Bayesian networks and their advantages as a modelling tool. Then I discuss our previous and ongoing research on inference in second-order Bayesian networks, in which the uncertainty about the probabilities is expressed by beta and Dirichlet distributions.
22 April	12:30-13:30	Research at OsloMet AI Lab	Hugo Hammer	OsloMet/ Simula@OsloMet	OsloMet
Abstract: In recent years machine learning, and especially deep learning, has received a lot of attention in medicine. In this seminar I will present two ongoing projects at OsloMet AI lab where deep learning is used to solve medical problems. The projects focus on improving assisted reproduction technology and detecting abnormalities in the gastrointestinal tract. I will also briefly review other research at OsloMet AI lab.
15 April	12:30-13:30	The regression discontinuity design and survival analysis	Emil Aas Stoltenberg	BI	BI
Abstract: In this talk I'll present ongoing work on the regression discontinuity design when applied to right-censored survival data. The regression discontinuity design is a design used to evaluate the effect of treatment (a 0-1 random variable) on some outcome. Assignment to the treatment group is determined by the value of an observable covariate lying on either side of a fixed threshold. The idea is that the individuals whose value of this covariate is in a small interval around the threshold are alike, so that by basing inference on these individuals one is controlling for unobserved confounders. The survival analysis models I'll be discussing are those where the outcome is the minimum of a true event time and a censoring variable, giving rise to right-censored survival data.
25 March	12:30-13:30	Developing NLP models without labelling data: a weak supervision approach	Pierre Lison	Norwegian Computing Center, NR	NR
Abstract: Perhaps the most important problem we need to face when developing NLP models is the lack of training data. When in-domain labelled data is available, techniques based on transfer learning and data augmentation can certainly help. But what should we do when there is no hand-labelled data for the target domain? One powerful alternative is weak supervision, which seeks to automatically annotate target-domain texts based on a combination of various labelling functions, such as heuristics, gazetteers, out-of-domain models, and document-level constraints. Those labelling functions are then aggregated into a single, unified annotation layer through unsupervised learning, taking into account the varying accuracies and correlations between labelling functions. We illustrate this approach on a Named Entity Recognition task, where we show that such an approach is able to outcompete more complex neural models based on unsupervised domain adaptation.
18 March	12:30-13:30	Measuring agreement in the light of decision theory	Jonas Moss	BI	BI
Abstract: There are many chance-corrected measures of agreement, such as the weighted Cohen's kappa, Krippendorff's alpha, Scott's pi, and Fleiss' kappa. I propose a general chance-corrected measure of agreement using decision theory, which I call the agreement coefficient. Most well-known measures of agreement are special cases of the agreement coefficient, but there are realistic circumstances where no established measure of agreement estimates it consistently. In this talk, I give a gentle introduction to the whys and hows of measures of agreement, and explain how everything is connected to decision theory.
11 March	12:30-13:30	Deep Randomized Neural Networks	Claudio Gallicchio	University of Pisa	Homepage
Abstract: Deep Neural Networks (DNNs) are a fundamental tool in the modern development of Machine Learning. Beyond the merits of the training algorithms, a great part of DNNs' success is due to the inherent properties of their layered architectures, i.e., to the introduced architectural biases. This talk explores recent classes of DNN models in which the majority of connections are untrained, i.e., randomized or more generally fixed according to some specific heuristic. Limiting the training algorithms to operate on a reduced set of weights implies intriguing features. Among them, the extreme efficiency of the learning processes is undoubtedly a striking advantage with respect to fully trained counterparts. Besides, despite the involved simplifications, randomized neural systems possess remarkable properties both in practice, achieving state-of-the-art results in multiple domains, and theoretically, allowing one to analyze the intrinsic properties of neural architectures. This talk will cover the major aspects regarding Deep Randomized Neural Networks, with a particular focus on dynamically recurrent deep neural systems for time-series and graphs.
18 Feb	12:30-13:30	Some recent contributions to psychometrics	Steffen Grønneberg	BI	BI
Abstract: I will summarize some recent psychometric papers I have written jointly with Njål Foldnes and Jonas Moss. I will focus on two papers: "Partial identification of latent correlations with binary data" (2020, Psychometrika) and "The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms" (2021, Psychological Methods, forthcoming). We consider a class of statistical models for ordinal data much in use in applied psychology and other fields within the social sciences from a critical perspective. A traditional and to a large extent unverifiable assumption is that the ordinal variables originating from a person's answers to a questionnaire on a scale of, say, 1 to 5 (e.g. "How content are you with your life", "Are you happy with your life?", etc.) are generated by "chopping up" (or more technically, discretizing) a continuous multivariate normal variable. We refused to make this assumption, and saw where it led us. The papers investigate what can be inferred from assuming only that the answers are discretizations of some variable, not necessarily normal, and to what extent existing methodology is robust towards underlying non-normality.
11 Feb	12:30-13:30	Fake news, Digital Wildfires, and Graph Neural Networks	Johannes Langguth	Simula/BI	Simula
Abstract: In recent years, misinformation in social networks, often referred to as fake news, has received significant attention from media, researchers, and the general public. We introduce the topic while focusing on digital wildfires, i.e., fast-spreading online misinformation phenomena with the potential to cause harm in the physical world. We introduce possible countermeasures from the computer science point of view and show that, among the possible countermeasures, network-based approaches are particularly suitable, both due to their generality and robustness. We then explain the recently developed graph neural networks and show why they constitute a powerful tool for network-based tracking of misinformation.

Fall 2021

Date	Time	Title	Speaker	Institution
9 September	13:30-14:30	On the asymptotic behaviour of the variance estimator of a U-statistic	Riccardo De Bin	UiO
23 September	13:30-14:30	Explaining News Spreading Phenomena in Social Networks	Daniel Thilo Schroeder	Simula
14 October	13:30-14:30	Predicting service time to improve route optimization	Tarjei Bondevik	Oda
21 October	13:30-14:30	Dynamic Combination and Calibration for Climate Predictions	Francesco Ravazzolo	BI
4 November	13:30-14:30	Probabilistic programming (and what it means for you)	Jan Kudlicka	BI
11 November	13:30-14:30	Explainable AI - from game theory with love	Inga Strümke	NTNU
2 December	13:30-14:30	SMARTboost: Efficient Boosting of Smooth Regression Trees	Paolo Giordani	BI