Statistical colloquium
Summer semester 2024
Date and time | Details |
10.04.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
17.04.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
24.04.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
01.05.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
08.05.2024 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
15.05.2024 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
22.05.2024 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
24.05.2024 (!!) at 10:15, TU Dortmund University, M/ E 217b (!!) | Title: Designing experiments on networks Vasiliki Koutra, King's College London The design of experiments provides a formal framework for the collection of data to aid decision making. When such experiments are performed on connected units linked through a network, the resulting design and analysis are more complex; e.g., is the observed response from a given unit due to the direct effect of the treatment applied to that unit, or the result of a network, or viral, effect arising from treatments applied to connected units? In this talk, I propose a methodology for constructing efficient designs which control for variation among the experimental units arising from network interference, so that the direct treatment effects can be precisely estimated. Performance gains over conventional designs will be demonstrated via different example experiments. |
29.05.2024 at 14:15, TU Dortmund University, M/ E 21 | Title: Backtesting: two applied tales about prediction Prof. Dr. Ivan Mizera, University of Alberta Two stories about prediction under uncertainty in applied settings are told. The first concerns the prediction of commercial breaks in targeted TV advertising via stochastic means, industrial research carried out in collaboration with the Edmonton-based company INVIDI. The second addresses an important risk-assessment task in non-life insurance: the prediction of the financial reserves guaranteeing the payment of existing and potential future insurance claims via so-called run-off triangles. What both stories have in common is their highly nonparametric, model-free character, with virtually no existing approaches to build on, and their empirical vindication accomplished via so-called backtesting on historical data. |
04.06.2024 at 10:15, TU Dortmund University, M/ E 25 (!!) | Title: Modelling Dynamic Interaction Networks Using Counting Processes Alexander Kreiß, Junior Professor of Statistics, Leipzig In an interaction network one observes a network of vertices and edges. Two vertices can interact when they are connected by an edge, where an interaction is understood as an instantaneous event. A typical example would be users of a social media platform (the vertices) who interact by sending messages to each other when they are in a friendship relation (the edges). In a dynamic interaction network we allow the edges to appear and disappear over time. We assume that we observe the edges and the interactions in a given time period. In addition, we observe covariate processes which describe the network, individual vertices and the relations between pairs of vertices. In the talk we will review parametric and nonparametric models for this type of data based on counting process theory. We will show how these models can be used to incorporate global covariates and to perform parameter estimation and goodness-of-fit tests. Mathematically, a challenge in these models is the dependence between pairs of vertices. We will discuss assumptions under which this dependence can be handled. Finally, we will also look at a real-world dataset on rental bikes to illustrate the models. |
05.06.2024 at 14:15, University Duisburg-Essen, room R11 T08 C01 in building R11 (!!) | Title: Inference in Regression Discontinuity Designs with High-Dimensional Covariates Alexander Kreiß, Junior Professor of Statistics, Leipzig We study regression discontinuity designs in which many predetermined covariates, possibly many more than the number of observations, can be used to increase the precision of treatment effect estimates. We consider a two-step estimator which first selects a small number of ‘important’ covariates through a localised lasso-type procedure, and then, in a second step, estimates the treatment effect by including the selected covariates linearly in the usual local linear estimator. We provide an in-depth analysis of the algorithm’s theoretical properties, showing that, under an approximate sparsity condition, the resulting estimator is asymptotically normal, with asymptotic bias and variance that are conceptually similar to those obtained in low-dimensional settings. Bandwidth selection and inference can be carried out using standard methods. We also provide simulations and an empirical application. This is joint work with Christoph Rothe. |
12.06.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
19.06.2024 at 14:15, TU Dortmund University, M/ E 21 | Title: Testing and monitoring speculative bubbles Jörg Breitung, Institute for Econometrics and Statistics, University of Cologne We propose a heteroskedasticity-robust LBI statistic to test the hypothesis of a unit root against the alternative of an explosive root related to speculative bubbles. Compared to existing alternatives like Dickey-Fuller type tests, the proposed LBI test has a standard limiting distribution and higher power, especially in the empirically relevant case of a moderate explosive root. Further refinements, such as the point-optimal linear test, come remarkably close to the power envelope. To detect bubbles with an unknown starting date, sequential schemes based on forward and backward expanding windows are considered, with the stacked backward CUSUM procedure proposed by Otto and Breitung (2023) standing out as the most powerful sequential scheme in the homoskedastic case. For the case of time-varying volatility, a heteroskedasticity-robust MOSUM detector is proposed. Finally, we consider simple statistics for consistently estimating the starting date of the bubble. |
26.06.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
03.07.2024 at 14:15, TU Dortmund University, M/ E 21 | tba |
10.07.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
17.07.2024 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
Winter semester 2023/24
Date and time | Details |
18.10.2023 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
20.10.2023 at 09:00, TU Dortmund University, M/ E 27 | Title: Causal Discovery under Scaled Noise: Identifiability and Robust Estimation Alexander Marx, Post-Doctoral Fellow at the ETH AI Center, Zurich |
25.10.2023 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
01.11.2023 at 14:15, TU Dortmund University, M/ E 21 | no seminar (All Saints' Day) |
08.11.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: High-Dimensional L2Boosting: Rate of convergence Martin Spindler, Professor of Statistics with Applications in Business Administration, Faculty of Business Administration, University of Hamburg |
15.11.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Isotonic Distributional Regression and Total Positivity Lutz Dümbgen, Professor of Statistics, University of Bern In regression analyses, one is often interested in estimating the full conditional distribution of a response Y, given a covariate X, rather than just its mean, median or other features. Under the assumption that the conditional distribution of Y, given that X = x, is in some sense monotone increasing in x, this problem is solvable. Here one could work with the usual stochastic order or the stronger likelihood ratio order. This talk highlights both paradigms. Furthermore, we discuss connections between the likelihood ratio order and total positivity of bivariate distributions. |
22.11.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Statistical depth: Geometry of multivariate quantiles Stanislav Nagy, Assistant Professor, Charles University Prague, Czech Republic Statistical depth is a non-parametric tool applicable to multivariate and non-Euclidean data. Its goal is to reasonably generalise quantiles to multivariate and more exotic datasets. The first depth was proposed in statistics in 1975; rigorous investigation of depths started in the 1990s, and still, an abundance of open problems stimulates research in the area. We discuss two seminal depths: (i) the halfspace depth (Tukey, 1975) and (ii) the simplicial depth (Liu, 1988). We unveil surprising links between these depths and well-studied concepts from geometry and discrete mathematics. Using these relations, we partially resolve several open problems; in particular, the 30-year-old characterization conjecture, asking whether two different distributions can correspond to the same halfspace depth. |
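As a rough illustration of the halfspace depth from the abstract above: in two dimensions it can be approximated by scanning a grid of projection directions and taking the smallest fraction of the sample on one side of a hyperplane through the query point. This is a brute-force sketch for intuition only (function name, grid size, and the examples are illustrative, not from the talk):

```python
import math

def halfspace_depth(z, pts, n_dir=360):
    """Approximate Tukey (1975) halfspace depth of the point z with respect
    to the 2-D sample pts: the minimum, over a grid of unit directions u, of
    the fraction of sample points in the closed halfspace {x: <x - z, u> >= 0}."""
    zx, zy = z
    depth = 1.0
    for k in range(n_dir):
        angle = 2.0 * math.pi * k / n_dir
        ux, uy = math.cos(angle), math.sin(angle)
        # Fraction of the sample on the u-side of the hyperplane through z.
        frac = sum((x - zx) * ux + (y - zy) * uy >= 0 for x, y in pts) / len(pts)
        depth = min(depth, frac)
    return depth

square = [(1, 0), (-1, 0), (0, 1), (0, -1)]
print(halfspace_depth((0, 0), square))  # 0.5: the centre is the deepest point
print(halfspace_depth((2, 0), square))  # 0.0: outside the convex hull
```

Exact algorithms and the characterization results discussed in the talk go well beyond this grid approximation, which is merely the definition made executable.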
29.11.2023 at 14:15, TU Dortmund University, M/ E 21 | no seminar |
06.12.2023 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
13.12.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Prediction Bands for Functional Time Series Stathis Paparoditis, Professor at the Department of Mathematics and Statistics of the University of Cyprus A bootstrap procedure for constructing prediction bands for a stationary functional time series is proposed. The procedure exploits a general vector autoregressive representation of the time-reversed series of Fourier coefficients appearing in the Karhunen-Loève representation of the functional process. It generates backward-in-time functional replicates that adequately mimic the dependence structure of the underlying process in a model-free way and have the same conditionally fixed curves at the end of each functional pseudo-time series. The bootstrap prediction error distribution is then calculated as the difference between the model-free, bootstrap-generated future functional observations and the functional forecasts obtained from the model used for prediction. This allows the estimated prediction error distribution to account for the innovation and estimation errors associated with prediction and the possible errors due to model misspecification. We establish the asymptotic validity of the bootstrap procedure in estimating the conditional prediction error distribution of interest, and we also show that the procedure enables the construction of prediction bands that achieve (asymptotically) the desired coverage. Prediction bands based on a consistent estimation of the conditional distribution of the studentized prediction error process are also introduced. Such bands take the local uncertainty of prediction into account more appropriately. Through a simulation study and the analysis of two data sets, we demonstrate the capabilities and the good finite-sample performance of the proposed method. |
20.12.2023 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
24.01.2024 at 14:15, TU Dortmund University, M/ E 21 | Title: Testing for Dependence by Using Ordinal Patterns: an Introduction Christian Weiß, Professor of Quantitative Methods in Economics, HSU Hamburg About 20 years ago, ordinal patterns were introduced as a simple, robust, and flexible tool for analyzing the serial dependence structure of univariate real-valued stochastic processes. For continuously distributed processes, one can derive non-parametric tests of the null hypothesis that the process is independent and identically distributed. Recently, more sophisticated dependence-testing tasks have also been considered: serial dependence in a discrete-valued process, the sequential monitoring of serial dependence, cross-dependence in a multivariate process, and spatial dependence in a random field. This talk provides an introduction to the aforementioned topics together with illustrative examples, and it concludes by outlining perspectives for future research. |
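The basic idea behind the ordinal-pattern tests in the abstract above can be sketched in a few lines: count how often each of the m! order patterns occurs in sliding windows and compare against the uniform frequencies implied by the i.i.d. hypothesis. This is a simplified illustration only; the naive chi-square comparison below ignores the dependence between overlapping windows, which properly calibrated tests must account for (the function name is illustrative, not from the talk):

```python
import itertools
import math
from collections import Counter

def ordinal_pattern_chi2(x, m=3):
    """Chi-square-type statistic comparing the observed frequencies of the
    m! ordinal patterns of order m in the series x against the uniform
    frequencies expected under the i.i.d. hypothesis."""
    n = len(x) - m + 1  # number of sliding windows
    # The ordinal pattern of a window is the permutation that sorts it
    # (ties, absent for continuous data, are broken by time index).
    counts = Counter(
        tuple(sorted(range(m), key=lambda i: x[t + i])) for t in range(n)
    )
    expected = n / math.factorial(m)
    return sum(
        (counts.get(p, 0) - expected) ** 2 / expected
        for p in itertools.permutations(range(m))
    )

# A strictly increasing series produces only one pattern -> a large statistic.
print(ordinal_pattern_chi2(list(range(10))))
```

Because the statistic depends on the data only through ranks within windows, it is invariant under monotone transformations of the series, which is the source of the robustness mentioned in the abstract.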
Summer semester 2023
Date and time | Details |
19.04.2023 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
03.05.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Affine-equivariant inference for multivariate location under Lp loss functions Dr. Alexander Dürre, Assistant Professor, Mathematical Institute, Leiden University, Netherlands We consider the problem of estimating the location of a d-variate probability measure. The well-known multivariate mean can be defined as the minimizer of the expected squared Euclidean loss. Its respective estimator, the sample mean, is optimal under normality but behaves poorly under heavy tails. In the one-dimensional setting, the median is therefore often preferred if heavy tails cannot be ruled out. Contrary to the mean, it is defined as the minimizer of the expected absolute loss. Its intuitive multivariate generalization, the spatial median, minimizes the expected Euclidean loss. However, its estimator is not affine-equivariant, which can lead to a very low efficiency. We propose a collection of Lp location estimators that minimize the size of suitable ℓ-dimensional data-based simplices. For ℓ = 1, these estimators reduce to minimizers of empirical Euclidean losses, whereas, for ℓ = d, they are equivariant under affine transformations. Irrespective of ℓ, these estimators reduce to the sample mean for p = 2, whereas for p = 1, the estimators provide the spatial median and the Oja median for ℓ = 1 and ℓ = d, respectively. Under very mild assumptions, we derive an explicit Bahadur representation and establish asymptotic normality for the new estimators. To allow for large sample size n and/or large dimension d, we introduce a version of our estimators relying on incomplete U-statistics. We also define related location tests and derive explicit expressions for the asymptotic power under contiguous local alternatives. Data applications illustrate the importance of the choice of ℓ and p. |
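For intuition about the spatial median that serves as the starting point of the abstract above (its ℓ = 1, p = 1 case), the classical Weiszfeld iteration computes it as the minimizer of the sum of Euclidean distances. This sketch covers only that classical estimator, not the talk's new simplex-based family, and the snap-to-data-point rule is a common simplification of the full algorithm:

```python
import math

def spatial_median(pts, n_iter=200, eps=1e-12):
    """Weiszfeld iteration for the 2-D spatial median: the minimizer of
    the sum of Euclidean distances to the data points."""
    # Start from the coordinate-wise mean.
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    for _ in range(n_iter):
        wsum = wx = wy = 0.0
        for x, y in pts:
            d = math.hypot(x - mx, y - my)
            if d < eps:
                # Iterate sits on a data point; a simplification of the
                # modified step the full algorithm uses in this case.
                return (x, y)
            w = 1.0 / d  # inverse-distance weight
            wsum += w
            wx += w * x
            wy += w * y
        mx, my = wx / wsum, wy / wsum
    return (mx, my)

# Centre of a symmetric configuration is its spatial median.
print(spatial_median([(0, 0), (2, 0), (0, 2), (2, 2)]))
```

The iteration is a fixed-point reweighting: each step takes an inverse-distance-weighted mean, which downweights distant (heavy-tailed) observations relative to the sample mean. As the abstract notes, the resulting estimator is orthogonally but not affine-equivariant.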
10.05.2023 at 14:15, TU Dortmund University, M/ E 21 | Doctoral colloquium |
24.05.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Optimal sensor location for spatiotemporal systems Professor Dariusz Uciński, Institute of Control and Computation Engineering, University of Zielona Góra, Poland For dynamic systems described by partial differential equations it is impossible to observe their states over the entire spatial domain. A typical example is air pollution, which is modelled by a convection-diffusion equation. When an experiment is to be performed to estimate the unknown system parameters, a major decision problem is how to locate discrete sensors so as to get the most valuable information about these parameters. Both researchers and practitioners agree that placing sensors in an "intelligent" manner may lead to dramatic gains in the achievable accuracy of the parameter estimates, so efficient sensor location strategies are highly desirable. In turn, the complexity of the sensor location problem implies that there are very few sensor placement methods which are readily applicable to practical situations; what is more, they are little known among researchers. The aim of the talk is to give an account of both classical and recent original work on optimal sensor placement strategies for parameter identification in spatiotemporal processes. The reported work concerns the development of new techniques and algorithms, or the adaptation of methods which have been successful in the akin fields of optimal control and optimum experimental design. In the planning, real-valued functions of the Fisher information matrix of the parameters are primarily employed as the performance indices to be minimized with respect to the sensor positions. Particular emphasis is placed on determining the "best" way to guide moving and scanning sensors, and on making the solutions independent of the parameters to be identified. A couple of case studies regarding the design of air quality monitoring networks are adopted as an illustration, aiming to show the strength of the proposed approach in studying practical problems. |
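The core idea in the abstract above, optimizing a scalar function of the Fisher information matrix over candidate sensor positions, can be illustrated with a toy greedy D-optimal selection for a two-parameter linear(ized) response model. Everything here (function names, the candidate grid, greedy selection itself) is an illustrative sketch, not the talk's methodology:

```python
def greedy_d_optimal(candidates, regressors, k):
    """Greedily pick k sensor locations maximizing det of the 2x2 Fisher
    information matrix M = sum_i f(s_i) f(s_i)^T, where f = regressors(s)
    is the parameter sensitivity of the response at location s."""
    M = [[1e-9, 0.0], [0.0, 1e-9]]  # tiny ridge so early dets are defined
    chosen = []
    for _ in range(k):
        best, best_det = None, -1.0
        for s in candidates:
            if s in chosen:
                continue
            a, b = regressors(s)
            # Determinant of M after adding the rank-one update f f^T.
            m00 = M[0][0] + a * a
            m01 = M[0][1] + a * b
            m11 = M[1][1] + b * b
            det = m00 * m11 - m01 * m01
            if det > best_det:
                best, best_det = s, det
        a, b = regressors(best)
        M[0][0] += a * a
        M[0][1] += a * b
        M[1][0] = M[0][1]
        M[1][1] += b * b
        chosen.append(best)
    return chosen

# Straight-line response y = theta0 + theta1 * s over [0, 1]:
# the D-optimal two-sensor design sits at the interval ends.
grid = [i / 10 for i in range(11)]
print(greedy_d_optimal(grid, lambda s: (1.0, s), 2))
```

In a PDE setting the regressors would be sensitivities of the model output with respect to the parameters, and, as the abstract stresses, they generally depend on the unknown parameters themselves, which is what motivates robust and sequential placement schemes.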
31.05.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Statistical Plasmode Simulations - Potentials and Challenges Dr. Nicholas Schreck, Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg Statistical data simulation is essential in the development of statistical models and methods as well as in their performance evaluation. To capture complex data structures, in particular for high-dimensional data, a variety of simulation approaches have been introduced, including parametric and the so-called plasmode simulations. While there are concerns about the realism of parametrically simulated data, it is widely claimed that plasmodes generate realistic data with some aspect of the "truth" known. However, there are no explicit guidelines or established state of the art on how to perform plasmode data simulations. We review the existing literature, and motivate and introduce the concept of statistical plasmode simulation. We discuss advantages and challenges of statistical plasmodes and provide a step-wise procedure for their generation, including key steps to their implementation and reporting. Throughout the talk, we illustrate the concept of statistical plasmodes as well as the proposed plasmode generation procedure by means of a public real RNA dataset on breast carcinoma patients. |
07.06.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: Uniform Inference in High-Dimensional Additive Models Prof. Dr. Jannis Kück, Professor of Economics, especially Data Science in Economics, Düsseldorf Institute for Competition Economics (DICE), Heinrich Heine University Düsseldorf We develop a method for uniformly valid confidence bands for a nonparametric component f1 in the additive model Y = f1(X1) + ... + fp(Xp) + e in a high-dimensional setting. We employ sieve estimation and embed it in a high-dimensional Z-estimation framework that allows us to construct uniformly valid confidence bands for the first component f1. Our study extends the existing results for inference in high-dimensional additive models and clarifies the required assumptions. In a setting where the number of regressors p may increase with the sample size, a sparsity assumption is critical for our analysis. Furthermore, we run simulation studies that show that our proposed method delivers reliable results concerning the estimation and coverage properties even in small samples. Finally, we illustrate our procedure in an empirical application, demonstrating the implementation and the use of the proposed method in practice. |
14.06.2023 at 14:15, TU Dortmund University, M/ E 21 | Title: On the smoothed empirical distribution function Prof. Dr. Henryk Zähle, Mathematics Department, Saarland University In this talk, I consider a kernel-based smoothing of the empirical distribution function of a sample of size n. I will first present results on the existence and the exact rate of convergence to zero (as n → ∞) of a MISE-minimal bandwidth. I then discuss two data-based choices of the bandwidth which turn out to perform quite well and, in particular, lead to strongly consistent estimates of the unknown underlying distribution function. Finally, I also discuss the asymptotics of the corresponding (smoothed) empirical process. |
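The estimator discussed in the abstract above can be written down in a few lines: average an integrated kernel evaluated at the standardized distances to the sample points. This minimal sketch uses the Gaussian kernel's CDF as the smoother and treats the bandwidth h as given, whereas the talk is precisely about choosing h well (the function name is illustrative):

```python
import math

def smoothed_edf(data, t, h):
    """Kernel-smoothed empirical distribution function at t: the average of
    Phi((t - X_i) / h), where Phi is the standard normal CDF acting as the
    integrated kernel and h > 0 is the bandwidth."""
    def Phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sum(Phi((t - x) / h) for x in data) / len(data)

sample = [-1.0, 0.0, 1.0]
print(smoothed_edf(sample, 0.0, 0.5))  # 0.5 by symmetry of the sample
```

As h → 0 the estimate approaches the raw empirical distribution function; larger h trades step-function roughness for bias, which is exactly the tension the MISE-minimal bandwidth resolves.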
22.06.2023 at 16:45, TU Dortmund University, CDI 120 | Title: Automated Claim Detection for Assisting Fact Checkers Sami Nenno, Alexander von Humboldt Institute for Internet and Society, Berlin Disinformation and fake news are not new phenomena, but in recent years they have become an increasing problem for public discourse and democracies around the world. Even though the number of fact-checking organizations has increased as well, journalists often express the need for computational tools to handle the flood of disinformation. Accordingly, in computer science, and in NLP especially, there has been active research on automating parts of the fact-checking pipeline. Claim detection, the task of automatically retrieving textual claims that require checking, has received the most interest among fact checkers. However, researchers in the field often define the task as classifying "checkworthy" claims without critically discussing what checkworthiness actually means or involves. I review some attempts and datasets and highlight their major shortcomings. I argue for a different task design that takes inspiration from what are known as "news values" in communication science. News values are criteria that guide journalists in selecting events that are worth being reported. In a similar vein, detecting checkworthy claims can be modelled as classifying claims to truth that meet certain criteria relevant to disinformation and fact checking. I present an annotated dataset for classifying claims to truth, show some empirical results of models trained on it, and discuss how this task integrates into a broader pipeline for detecting checkworthy claims. |
09.08.2023 at 14:00, TU Dortmund University, M/ E 21 | Title: Optimistic bias in the evaluation of statistical methods: illustrations and possible solutions Christina Sauer, PhD student in the working group Biometry in Molecular Medicine, Faculty of Medicine, LMU Munich Statistical methods are a core element of empirical research, necessitating a comprehensive and unbiased evaluation of their strengths and weaknesses. Such evaluation should ideally be conducted not only by a method's authors but also by other researchers in subsequent comparison studies. In both cases, however, the high degree of flexibility in assessing method performance, including the choice of data sets, competing methods, and evaluation criteria, can substantially impact the conclusions drawn for the investigated methods. In the worst case, this flexibility, combined with researchers' hopes and expectations, can lead to optimistically biased results. In this talk, I show an example of a benchmark study in which systematic variations of benchmark components allow us to present any of the compared methods as superior or inferior, as desired. Additionally, I discuss the results of a "cross-design validation experiment", in which we select two methods designed for the same data analysis task, reproduce the original results, and then reevaluate each method based on the study design (i.e., datasets, competing methods, and evaluation criteria) that was used to show the abilities of the other method. Finally, I present an idea on how to make simulation studies more realistic and less prone to optimistic bias by systematically basing them on a sample of real data sets selected according to predefined inclusion criteria. |