EPFL Statistics Seminar

  • Statistics seminar organised by UNIL

    Jonas Peters, University of Copenhagen
    Tuesday, February 20, 2018
    Time 12:15 to 13:15 - Internef - 237

    Title: Invariant Causal Prediction


    Why are we interested in the causal structure of a process? In classical prediction tasks such as regression, it seems that no causal knowledge is required. In many situations, however, we want to understand how a system reacts under interventions, e.g., in gene knock-out experiments. Here, causal models become important because they are usually considered invariant under those changes. A causal prediction uses only direct causes of the target variable as predictors; it remains valid even if we intervene on predictor variables or change the whole experimental setting. In this talk, we show how we can exploit this invariance principle to estimate causal structure from data. We apply the methodology to data sets from biology, epidemiology, and finance.
    The talk does not require any knowledge about causal concepts.

  • Dr. Łukasz Kidziński

    Stanford University
    Friday, February 23, 2018
    Time 15:15 - Room MA10

    Title: Sparse longitudinal modeling using matrix factorization


    A common problem in clinical practice is to predict disease progression from sparse observations of individual patients. The classical approach to modeling this kind of data relies on a mixed-effect model where time is considered as both a fixed effect (a population trajectory) and a random effect (an individual trajectory). In our work, we map the problem to a matrix completion framework and solve it using matrix factorization techniques. The proposed approach does not require the assumptions of the mixed-effect model, and it can be naturally extended to multivariate measurements.
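
    To make the matrix completion view concrete, here is a minimal sketch, not the speaker's method. It assumes a noiseless rank-1 model in which every patient follows the same population trend scaled by an individual severity, and it fills in unobserved visits by alternating least squares. The names `complete_rank1`, `trend`, and `severity` and all numbers are hypothetical.

```python
import random

def complete_rank1(Y, n_iter=200):
    """Fit Y[i][t] ~ w[i] * b[t] on observed entries (None = missing)
    by alternating least squares, then return the completed matrix."""
    n, T = len(Y), len(Y[0])
    w, b = [1.0] * n, [1.0] * T
    for _ in range(n_iter):
        # Update each patient factor with the time factors held fixed.
        for i in range(n):
            obs = [t for t in range(T) if Y[i][t] is not None]
            den = sum(b[t] ** 2 for t in obs)
            if den > 0:
                w[i] = sum(Y[i][t] * b[t] for t in obs) / den
        # Update each time factor with the patient factors held fixed.
        for t in range(T):
            obs = [i for i in range(n) if Y[i][t] is not None]
            den = sum(w[i] ** 2 for i in obs)
            if den > 0:
                b[t] = sum(Y[i][t] * w[i] for i in obs) / den
    return [[w[i] * b[t] for t in range(T)] for i in range(n)]

# Simulated data (assumption): 30 patients, 8 possible visits, 3 observed each.
rng = random.Random(0)
n_patients, n_visits = 30, 8
trend = [1.0 + 0.2 * t for t in range(n_visits)]              # population trajectory
severity = [rng.uniform(0.5, 1.5) for _ in range(n_patients)]  # individual scaling
full = [[s * v for v in trend] for s in severity]

sparse = [[None] * n_visits for _ in range(n_patients)]
for i in range(n_patients):
    for t in rng.sample(range(n_visits), 3):
        sparse[i][t] = full[i][t]

filled = complete_rank1(sparse)
max_err = max(abs(filled[i][t] - full[i][t])
              for i in range(n_patients) for t in range(n_visits))
print("largest error over all entries:", max_err)
```

    Real clinical data are noisy and rarely rank 1, so a practical version would need a higher rank, regularisation, and some smoothness in time; the sketch only shows how sparse per-patient observations can be pooled through a shared factorization.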

  • Dr. Giacomo Zanella

    Università Bocconi, Milano
    Friday, March 16, 2018
    Time 15:15 - Room MA10

    Title: Optimization and complexity of the Gibbs Sampler for multilevel Gaussian models


    We study the convergence properties of the Gibbs Sampler in the context of Bayesian hierarchical linear models with nested and crossed-effects structures. We develop a novel methodology based on multi-grid decompositions to derive analytic expressions for the convergence rates of the algorithm. In the nested context, our work gives a rather complete understanding of the Gibbs Sampler behavior for models with arbitrary depth, leading to simple and easy-to-implement guidelines to optimize algorithmic implementations. In the context of crossed-effect models, where classical strategies to speed up convergence are not applicable, we show that the convergence of commonly implemented Gibbs Sampler strategies deteriorates as the data size increases. This results in super-linear computational complexity (potentially even quadratic) in the number of data points. Leveraging the insight provided by the multi-grid analysis, we design a simple collapsed Gibbs Sampler whose complexity matches that of nested scenarios. The implications for scalable Bayesian inference on large multilevel models are discussed.
    Joint work with Omiros Papaspiliopoulos and Gareth Roberts.
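
    For readers unfamiliar with the algorithm, the following is a minimal sketch of a Gibbs Sampler for the simplest nested case, a two-level Gaussian model in its centred parameterisation. It is an illustration only, not the samplers analysed in the talk. The assumptions are: known variances `sigma2` and `tau2`, a flat prior on the global mean `mu`, and simulated data; all names are hypothetical.

```python
import random

def gibbs_nested(y, sigma2, tau2, n_iter=5000, seed=0):
    """Gibbs Sampler for y[i][j] ~ N(eta_i, sigma2), eta_i ~ N(mu, tau2),
    with a flat prior on mu. Returns the draws of mu."""
    rng = random.Random(seed)
    n_groups = len(y)
    sizes = [len(g) for g in y]
    means = [sum(g) / len(g) for g in y]
    mu = 0.0
    draws = []
    for _ in range(n_iter):
        # Draw each group mean from its Gaussian full conditional.
        eta = []
        for i in range(n_groups):
            prec = sizes[i] / sigma2 + 1.0 / tau2
            cond_mean = (sizes[i] * means[i] / sigma2 + mu / tau2) / prec
            eta.append(rng.gauss(cond_mean, prec ** -0.5))
        # Draw the global mean given the current group means.
        mu = rng.gauss(sum(eta) / n_groups, (tau2 / n_groups) ** 0.5)
        draws.append(mu)
    return draws

# Simulated balanced data (assumption): 8 groups of 20 observations each.
rng = random.Random(1)
groups = []
for _ in range(8):
    eta_i = rng.gauss(2.0, 1.0)
    groups.append([rng.gauss(eta_i, 1.0) for _ in range(20)])

draws = gibbs_nested(groups, sigma2=1.0, tau2=1.0)
kept = draws[500:]  # discard burn-in
post_mean = sum(kept) / len(kept)
grand_mean = sum(sum(g) / len(g) for g in groups) / len(groups)
print("posterior mean of mu:", post_mean, "grand mean:", grand_mean)
```

    In this balanced setting the exact posterior mean of `mu` equals the average of the group means, which gives a simple check on the sampler's output.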

  • Ass. Prof. Ben Shaby

    Penn State University
    Friday, March 23, 2018
    Time 15:15 - Room MA10

    Title: Max-Infinitely Divisible Models for Spatial Extremes Using Random Effects


    Rare events can have crippling effects on economies, infrastructure, and human health and wellbeing. Their outsized impacts make extreme events critical to understand, yet their defining characteristic, rareness, means that precious little information is available to study them. Extremes of environmental processes are inherently spatial in structure, as a given event necessarily occurs over a particular spatial extent at a particular collection of locations. Characterizing their probabilistic structure therefore requires moving well beyond the well-understood models that describe marginal extremal behavior at a single location. Rather, stochastic process models are needed to describe joint tail events across space. Distinguishing between the subtly different dependence characteristics implied by current families of stochastic process models for spatial extremes is difficult or impossible based on exploratory analysis of data that is by definition scarce. Furthermore, different choices of extremal dependence classes have large consequences for the resulting analysis. We present stochastic models for extreme events in space that 1) are flexible enough to transition across different classes of extremal dependence, and 2) permit inference through likelihood functions that can be computed for large datasets. We accomplish these modeling goals by representing stochastic dependence relationships conditionally, which induces desirable tail dependence properties and allows efficient inference through Markov chain Monte Carlo. We develop models for spatial extremes using max-infinitely divisible (max-id) processes, a generalization of the limiting max-stable class of processes, which has received a great deal of attention. This work extends a previous family of max-stable models based on a conditional hierarchical representation to the more flexible max-id class, thus accommodating a wider variety of extremal dependence characteristics while retaining the structure that makes it computationally attractive.

  • Dr. Quentin Berthet

    University of Cambridge
    Friday, April 13, 2018
    Time 15:15 - Room MA10

    Title: Optimal Link Prediction with Matrix Logistic Regression


    We consider the problem of link prediction, based on partial observation of a large network and on covariates associated with its vertices. The generative model is formulated as matrix logistic regression. The performance of the model is analysed in a high-dimensional regime under a structural assumption. The minimax rate for the Frobenius norm risk is established, and a combinatorial estimator based on the penalised maximum likelihood approach is shown to achieve it. Furthermore, it is shown that, under a computational complexity assumption, this rate cannot be attained by any algorithm computable in polynomial time. We will present the tools needed to establish these fundamental limits, and other problems where they appear.
    Joint work with Nicolai Baldin.

  • Ass. Prof. David Bolin

    Chalmers University of Technology
    Friday, April 27, 2018
    Time 15:15 - Room MA10

    Title: A Bayesian General Linear Modeling Approach to Cortical Surface fMRI Data Analysis


    Cortical surface fMRI (cs-fMRI) has recently grown in popularity versus traditional volumetric fMRI, as it allows for more meaningful spatial smoothing and is more compatible with the common assumptions of isotropy and stationarity in Bayesian spatial models. However, as no Bayesian spatial model has been proposed for cs-fMRI data, most analyses continue to employ the classical, voxel-wise general linear model (GLM). Here, we propose a Bayesian GLM for cs-fMRI, which employs a class of spatial processes based on stochastic partial differential equations to model latent activation fields. Bayesian inference is performed using integrated nested Laplace approximations (INLA), a computationally efficient alternative to Markov chain Monte Carlo. To identify regions of activation, we propose an excursion set method based on the joint posterior distribution of the latent fields, which eliminates the need for multiple comparisons correction. Finally, we address a gap in the existing literature by proposing a Bayesian approach for multi-subject analysis. The methods are validated and compared to the classical GLM through simulation studies and a motor task fMRI study from the Human Connectome Project. The proposed Bayesian approach results in smoother activation estimates, more accurate false positive control, and increased power to detect truly active regions.

  • Prof. Philippe Rigollet

    Thursday, May 24, 2018
    Time 15:15 - Room MA10

    Title: Learning determinantal point processes


    Determinantal Point Processes (DPPs) are a family of probabilistic models that exhibit repulsive behavior and lend themselves naturally to many tasks in machine learning (such as recommendation systems) where returning a diverse set of objects is important. While there are fast algorithms for sampling, marginalization and conditioning, much less is known about learning the parameters of a DPP. In this talk, I will present recent results related to this problem, specifically:
    - Rates of convergence for the maximum likelihood estimator: by studying the local and global geometry of the expected log-likelihood function we are able to establish rates of convergence for the MLE and give a complete characterization of the cases where these are parametric. We also give a partial description of the critical points for the expected log-likelihood.
    - Optimal rates of convergence for this problem: these are achievable by the method of moments and are governed by a combinatorial parameter, which we call the cycle sparsity.
    - A fast combinatorial algorithm to implement the method of moments efficiently.

    Co-authors: Victor-Emmanuel Brunel (MIT), Ankur Moitra (MIT), John Urschel (MIT)
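
    As background on the model being learned, here is a small sketch of the subset probabilities of a DPP. It assumes the common L-ensemble parameterisation, in which a subset S has probability det(L_S) / det(L + I); the talk may use a different parameterisation. The 3-item kernel `L` and the helper names are hypothetical.

```python
from itertools import combinations

def det(m):
    """Determinant by Gaussian elimination with partial pivoting.
    The determinant of an empty matrix is taken to be 1."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def dpp_prob(L, subset):
    """Probability that an L-ensemble DPP returns exactly `subset`."""
    n = len(L)
    idx = list(subset)
    L_sub = [[L[i][j] for j in idx] for i in idx]
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(L_sub) / det(L_plus_I)

# Hypothetical kernel: items 0 and 1 are similar, item 2 is unrelated to both.
L = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

all_subsets = [s for k in range(4) for s in combinations(range(3), k)]
total = sum(dpp_prob(L, s) for s in all_subsets)
print("sum over all subsets:", total)
print("P({0, 1}):", dpp_prob(L, (0, 1)), "P({0, 2}):", dpp_prob(L, (0, 2)))
```

    The similar pair {0, 1} receives a lower probability than the dissimilar pair {0, 2}, which is the repulsive behaviour mentioned in the abstract. Maximum likelihood estimation amounts to choosing `L` to maximise the product of such probabilities over the observed subsets.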

  • Ass. Prof. Patrick Rubin-Delanchy

    University of Bristol
    Friday, June 1, 2018
    Time 15:15 - Room MA10

    Title: The generalised random dot product graph: a statistical model underpinning spectral embedding


    Finding a statistical framework under which to perform inference about graph-valued data has proved to be surprisingly challenging, considering the wealth of prior work in the fields of (broader) Mathematics and Computer Science. In this talk, a probabilistic model is presented that allows more refined analysis of spectral embedding and clustering as statistical estimation procedures, and which has several other advantages including generality (e.g. the mixed membership and standard stochastic block models are special cases), scalability (e.g. by some arguments requiring computation of only the first few singular vectors of the adjacency matrix), and interpretability (e.g. mixtures of connectivity behaviours are represented as convex combinations in latent space). Corresponding to this canonical statistical interpretation of spectral embedding is an indefinite orthogonal group that describes the identifiability limitations on the latent positions defined by the model. This group, which is most famously relevant to the theory of special relativity, can consist of transformations that affect inter-point distances, with worrying implications for spectral clustering. All such issues are resolved by simple statistical insights on the effect of linear transformations on volumes and Gaussian mixture models, confirming a more generally recognised rule-of-thumb in data science: Gaussian clustering should be preferred over K-means. Methodology and ideas are illustrated with cyber-security applications.

Visitor Information

Directions for visitors

Mailing List

Please email Ms. Schaffner if you would like to be added to the seminar mailing list.