EPFL Statistics Seminar

Statistics Seminars 2012

  • Prof. Mathieu Ribatet

    Université de Montpellier 2
    January 13, 2012
    15.15 - MA30

    Conditional Simulation of Brown-Resnick processes


    Since many environmental processes such as heat waves or precipitation are spatial in extent, it is likely that a single extreme event affects several locations and the areal modelling of extremes is therefore essential if the spatial dependence of extremes has to be appropriately taken into account. Although some progress has been made to develop a geostatistic of extremes, conditional simulation of max-stable processes is still in its early stage. This paper proposes a framework to get conditional simulations of Brown-Resnick processes. Although closed forms for the regular conditional distribution of Brown-Resnick processes were recently found, sampling from this conditional distribution is a considerable challenge as it leads quickly to a combinatorial explosion. To bypass this computational burden, a Markov chain Monte-Carlo algorithm is presented. We test the method on simulated data and give an application to extreme rainfall around Zurich. Results show that the proposed framework provides accurate conditional simulations of Brown-Resnick processes and can handle real-sized problems.

  • Mr. Kayvan Sadeghi

    University of Oxford
    January 27, 2012
    15.15 - MA3

    Markov Equivalences for Mixed Graphs


    In this talk we describe a class of graphs with three types of edge, called loopless mixed graphs (LMGs). The class of LMGs contains almost all known classes of graphs used in the literature of graphical Markov models as its subclasses. All these graphs use the same interpretation of independence structure called $m$-separation. We motivate the use of LMGs and discuss a number of problems regarding Markov equivalences for LMGs.

  • Prof. Bin Yu [Joint Statistics/IC Seminar]

    University of California at Berkeley
    March 19, 2012
    15.15 - CM1105

    Sparse Modeling: Unified Theory and Movie Reconstruction based on Brain Signal


    Information technology has enabled the collection of massive amounts of data in science, engineering, social science, finance and beyond. Statistics is the science of data and indispensable for extracting useful information from high-dimensional data. After broad successes of statistical machine learning on prediction through regularization, interpretability is gaining attention and sparsity is used as its proxy. With the virtues of both regularization and sparsity, L1 penalized Least Squares (e.g. Lasso) has been intensively studied by researchers from statistics, applied mathematics and signal processing. Lasso is a special case of sparse modeling and has also been the focus of compressive sensing lately. In this talk, I would like to give an overview of both theory and practice of Lasso and its extensions. First, I will review theoretical results of Lasso and present an insightful unified M-estimation theory with decomposable penalties under sparse high dimensional statistical models. Second, I will present collaborative research with the Gallant Neuroscience Lab at Berkeley on understanding the human visual pathway. In particular, I will show how sparsity modeling enters our movie reconstruction work (dubbed "mind-reading computers" by TIME Magazine, which named it one of its 50 Best Inventions of 2011).
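
    As an illustration of the L1 penalized least squares problem mentioned in the abstract, the following is a minimal sketch of the Lasso solved by cyclic coordinate descent. It is not taken from the talk: the function names `soft_threshold` and `lasso_cd`, the scaling of the objective by 1/(2n), and the fixed number of sweeps are all assumptions made for this example.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; values in [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows with p entries each, y a list of n responses.
    The objective scaling and the fixed sweep count are illustrative choices.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            # One-dimensional Lasso update for coordinate j.
            b[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return b


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 0.1, 2.0, 0.1]
    print(lasso_cd(X, y, lam=0.1))
```

    The penalty sets the weakly supported second coefficient exactly to zero, which is the sense in which sparsity serves as a proxy for interpretability.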

  • Prof. Sofia Olhede

    University College London
    April 27, 2012
    15.15 - MA30

    Fourier domain estimation of Lithospheric thickness and moderate sample effects for inference of random fields in the Fourier domain


    The inference for multivariate spatial random fields can either be approached in the spatial domain or in the Fourier domain. In the latter case, asymptotics have been studied and developed for the univariate case by Guyon (1982), Stein (1995), Fuentes (2007) and Robinson (2006). In Geophysics the model for the data is often stated in the Fourier domain, leaving no choice to the analyst in terms of the domain of study, and in addition Fourier domain estimation is usually implemented faster from a numerical perspective than spatial methods. If the spatial sampling is by no means perfect, strategies to ameliorate the moderate sampling effects must be developed. We shall study this problem for the special case of estimating lithospheric flexural rigidity from the set of observations of the Earth's topography and gravity. Such estimates are important as the shallow strength of the lithosphere, Earth's outer layer, influences a variety of processes such as earthquakes, post-glacial rebound, and the like. Traditionally this estimation problem is approached by an iterated technique, which may produce ill-posedness and inefficient estimation, as well as a number of logical fallacies depending on what spectral summary is used as the starting point of the estimation. I show how a fully maximum-likelihood method can be developed, and a number of modelling issues that need to be overcome in order for such methods to work. This is joint work with Frederik Simons (Princeton), sponsored by the EPSRC and the NSF.

  • Prof. Carlos Coelho

    Universidade Nova de Lisboa
    May 4, 2012
    15.15 - MA30

    On the distribution of the product of independent Beta random variables - why near-exact distributions?


    A first approach, based on recently obtained asymptotic expansions of ratios of gamma functions, makes it possible to obtain the distribution of the product of independent and identically distributed random variables in a much more manageable form. However, for the general case, this approach leads to a form which, although much more manageable and in line with some previous results, suffers from serious problems of precision and convergence, which have been completely overlooked by other authors and which in most cases prevent its practical use. Nevertheless, it is based on these first results that the authors, using the concept of near-exact distribution, are able to obtain highly manageable but extremely accurate approximations for all cases of the distribution of the product of independent beta random variables. These near-exact approximations, given their high manageability, accuracy and proximity to the exact distribution, may in practice be used instead of the exact distribution.

  • Dr. Aurore Delaigle

    University of Melbourne
    May 10, 2012
    15.15 - MA11

    Nonparametric Regression from Group Testing Data


    To reduce cost and increase speed of large screening studies, data are often pooled in groups. In these cases, instead of carrying out a test (say a blood test) on all individuals in the study to see if they are infected or not, one only tests the pooled blood of all individuals in each group. We consider this problem when a covariate is also observed, and one is interested in estimating the conditional probability of contamination. We show how to estimate this conditional probability using a simple nonparametric estimator. We illustrate the procedure on data from the NHANES study.
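
    The pooling scheme described in the abstract can be sketched as follows. This is not the nonparametric estimator of the talk: it ignores the covariate and uses the classical relation P(pool negative) = (1 - p)^k for pools of k independent individuals with common infection probability p and an error-free test. The function names and the assumption of equally sized pools are made up for this example.

```python
def pool_results(infected, group_size):
    """Return one boolean per pool: True if any member of the pool is infected.

    `infected` is a list of booleans, one per individual. Individuals are
    pooled in consecutive blocks; the sketch assumes len(infected) is a
    multiple of group_size so that all pools have the same size.
    """
    return [any(infected[i:i + group_size])
            for i in range(0, len(infected), group_size)]


def prevalence_from_pools(pool_positive, group_size):
    """Estimate the individual infection probability from pooled tests.

    Assumes independent individuals, a common probability p and a perfect
    test, so that P(pool negative) = (1 - p) ** group_size.
    """
    frac_negative = pool_positive.count(False) / len(pool_positive)
    return 1.0 - frac_negative ** (1.0 / group_size)


if __name__ == "__main__":
    status = [False, False, True, False, False, False, False, False]
    pools = pool_results(status, group_size=2)
    print(pools, prevalence_from_pools(pools, group_size=2))
```

    Only four tests are run here instead of eight, which is the cost saving that motivates group testing.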

  • Prof. Laura Sangalli

    Politecnico di Milano
    May 25, 2012
    15.15 - Room TBA

    Spatial Regression Models with a Partial Differential Regularization


    Interfacing statistical methodology and numerical analysis techniques, we propose regression models with partial differential regularization that accurately estimate surfaces and spatial fields. In particular, the proposed models are able to deal with data scattered over complex bi-dimensional domains, including domains with irregular shapes and holes and non-planar domains; moreover, they have the capacity to include a priori information about the spatial structure of the phenomenon under study. The estimation problem is solved by resorting to finite elements, and full uncertainty quantification is provided via inferential tools. The method is illustrated in various applied contexts, including data coming from eco-dopplers and computational fluid dynamics simulations. The seminar is based on joint work with James Ramsay (McGill University), and with Laura Azzimonti, Bree Ettinger, Simona Perotto and Piercesare Secchi (Politecnico di Milano).

  • Prof. David Donoho [Joint Mathematics and I&C Colloquium]

    Stanford University
    Friday, October 5, 2012
    12:15 - Rolex Forum

    Compressed Sensing: Examples, Prehistory, and Predictions


    From 2004 to today, the research topic "Compressed Sensing" (CS) became popular in applied mathematics, signal processing, and information theory, and was applied to fields as distant as computational biology and astronomical image processing. Some early papers have gained thousands of citations. Part of the attraction is paradox: CS claims to correctly solve systems of equations with fewer equations than unknowns. One success story for CS comes in pediatric magnetic resonance imaging, where blind trials published in a flagship medical journal by Vasanawala, Lustig et al. gave a 6X MRI speedup while maintaining diagnostic-quality images. Concretely, children needed to sit still in an MRI machine for about 1 minute rather than 8 minutes. The prehistory of CS goes back on a metaphoric level to coin-balance weighing puzzles known for millennia and more specifically to convex geometry known for a hundred years, and continues throughout the last century in several very different fields of research. Part of the spectacular recent interest is that several fields, from information theory to high-dimensional geometry, are convinced that they saw the key ideas first, and that they know the best way to think about it. This talk will review success stories, precursors, and four modern ways of understanding the problem, from four different disciplines.

  • Prof. George Michailidis

    University of Michigan, Ann Arbor
    Thursday, October 25, 2012
    15.15 - CM1221

    Joint Estimation of Multiple Graphical Models


    Gaussian graphical models explore dependence relationships between random variables, through estimation of the corresponding inverse covariance (precision) matrices. We consider the problem of estimating multiple related but distinct graphical models coming from heterogeneous data sets. We introduce two models for the problem at hand. The first model aims to preserve the common structure, while allowing for differences between the graphical models and the goal is achieved through a hierarchical penalty. The second model aims to explore a priori available information about structured sparsity between the graphical models. We discuss computational issues and establish norm and variable selection consistency properties for the two modeling strategies and illustrate the methods through simulated and real examples.

  • Swiss Statistics Seminar

    University of Bern
    Friday, October 26, 2012
    14.15 - 17.45 | Room B78

  • Prof. Aleksandra Slavkovic

    Penn State University
    November 2, 2012
    15.15 - CM1100

    Private Analysis of Social Networks


    Increasing volumes of personal and sensitive data are collected and archived by health networks, government agencies, search engines, social networking websites, and other organizations. The social networks, in particular, are a prominent source of data for researchers in economics, epidemiology, sociology and many other disciplines and have sparked a flurry of research in statistical methodology for network analysis. While the social benefits of analyzing these data are significant, their release can be devastating to the privacy of individuals and organizations. In this talk, we give a brief overview of challenges associated with protecting social network data and the problem of releasing summary statistics of graphs needed to build statistical models for networks while preserving privacy of individual relations. Our goal is to ensure that a user is able to perform standard statistical inference with the released data while satisfying a rigorous definition of privacy risk. We present an algorithm for releasing graphical degree sequences of simple undirected graphs under the framework of differential privacy. The algorithm is designed to provide utility for statistical inference in random graph models whose sufficient statistics are functions of degree sequences. Specifically, we focus on the tasks of existence of maximum likelihood estimates, parameter estimation and goodness-of-fit testing for the beta model of random graphs. We show the usefulness of our algorithm by evaluating it empirically on simulated and real-life datasets. As the released degree sequence is graphical, our algorithm can also be used to release synthetic graphs under the beta model.
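
    To make the idea of a differentially private release of a degree sequence concrete, here is a minimal sketch using the standard Laplace mechanism. It is not the algorithm presented in the talk: in particular it does no post-processing, so the noisy output is generally not an integer-valued graphical degree sequence. The function names are hypothetical, and the sketch assumes edge-level privacy, where adding or removing one edge changes two degrees by one each (L1 sensitivity 2).

```python
import random


def degree_sequence(edges, n_nodes):
    """Degrees of a simple undirected graph given as a list of (u, v) pairs."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_degree_sequence(edges, n_nodes, epsilon, rng):
    """Laplace mechanism for the degree sequence under edge-level privacy.

    One edge affects two degrees, so the L1 sensitivity is 2 and the noise
    scale is 2 / epsilon. No projection onto graphical sequences is done.
    """
    scale = 2.0 / epsilon
    return [d + laplace_noise(scale, rng)
            for d in degree_sequence(edges, n_nodes)]


if __name__ == "__main__":
    edges = [(0, 1), (1, 2)]
    print(noisy_degree_sequence(edges, 3, epsilon=1.0, rng=random.Random(0)))
```

    A smaller epsilon gives stronger privacy at the price of noisier degrees, which is the privacy versus utility trade-off the abstract refers to.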

  • Prof. Guy Nason

    University of Bristol
    November 29, 2012

    Aliasing and Spectral Correction for Locally Stationary Wavelet Time Series


    Aliasing occurs when the sampling rate is not fast enough to capture all the high frequencies in a time series. If one could detect aliasing reliably then one could act to increase the sampling rate. However, if aliasing is not present one could reduce the sampling rate and potentially reduce costs. For a stationary series we never really know whether data are subject to aliasing, potentially resulting in poor modelling and estimation of the true underlying situation and hence poorer forecasting. The situation is a bit different for nonstationary time series. In principle in the nonstationary situation one could track power from lower to higher frequencies (and conversely) to judge whether aliasing might be occurring. Generally, though, it is not known how to carry out alias detection for arbitrary processes. This talk examines what happens to locally stationary wavelet (LSW) processes under dyadic subsampling (aliasing) and shows how a hypothesis test can be constructed to look for the absence of aliasing in this case. For Shannon LSW processes we additionally show that components below the Nyquist frequency can be de-aliased. We exhibit our methods on simulations and an example from the wind power industry. (Joint work with Idris Eckley, Lancaster University).
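
    The basic aliasing phenomenon described at the start of the abstract can be illustrated with a pure sinusoid. This sketch shows only the textbook frequency-folding rule, not the locally stationary wavelet test of the talk; the function names and the example frequencies are assumptions chosen for illustration.

```python
import math


def alias_frequency(f, fs):
    """Apparent frequency, in [0, fs/2], of a sinusoid at f sampled at rate fs."""
    return abs(f - fs * round(f / fs))


def sample_cosine(f, fs, n):
    """First n samples of cos(2*pi*f*t) taken at times t = k / fs."""
    return [math.cos(2 * math.pi * f * k / fs) for k in range(n)]


if __name__ == "__main__":
    fast = sample_cosine(9, 10, 20)   # 9 Hz signal, sampled too slowly
    slow = sample_cosine(1, 10, 20)   # 1 Hz signal, same sampling rate
    print(alias_frequency(9, 10))
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

    Sampled at 10 Hz, the 9 Hz and 1 Hz cosines produce the same values up to rounding error, so the sampled data alone cannot tell them apart.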

Visitor Information

Directions for visitors

Mailing List

Please email Ms. Schaffner if you would like to be added to the seminar mailing list.