Advanced science and technology provide a wealth of big data from different sources for extreme value analysis.Classic extreme value theory was extended to obtain an accelerated max-stable distribution family for modelling competing risk-based extreme data in Cao and Zhang (2021). In this paper, we establish probability models for power normalized maxima and minima from competing risks. The limit distributions consist of an extensional new accelerated max-stable and min-stable distribution family (termed as the accelerated p-max/p-min stable distribution), and its left-truncated version. The limit types of distributions are determined principally by the sample generating process and the interplay among the competing risks, which are illustrated by common examples. Further, the statistical inference concerning the maximum likelihood estimation and model diagnosis of this model was investigated. Numerical studies show first the efficient approximation of all limit scenarios as well as its comparable convergence rate in contrast with those under linear normalization, and then present the maximum likelihood estimation and diagnosis of accelerated p-max/p-min stable models for simulated data sets. Finally, two real datasets concerning annual maximum of ground level ozone and survival times of Stanford heart plant demonstrate the performance of our accelerated p-max and accelerated p-min stable models.
We explore two approaches to creatively altering vocal timbre using Differentiable Digital Signal Processing (DDSP). The first approach is inspired by classic cross-synthesis techniques. A pretrained DDSP decoder predicts a filter for a noise source and a harmonic distribution, based on pitch and loudness information extracted from the vocal input. Before synthesis, the harmonic distribution is modified by interpolating between the predicted distribution and the harmonics of the input. We provide a real-time implementation of this approach in the form of a Neutone model. In the second approach, autoencoder models are trained on datasets consisting of both vocal and instrument training data. To apply the effect, the trained autoencoder attempts to reconstruct the vocal input. We find that there is a desirable "sweet spot" during training, where the model has learned to reconstruct the phonetic content of the input vocals, but is still affected by the timbre of the instrument mixed into the training data. After further training, that effect disappears. A perceptual evaluation compares the two approaches. We find that the autoencoder in the second approach is able to reconstruct intelligible lyrical content without any explicit phonetic information provided during training.
In extreme value theory and other related risk analysis fields, probability weighted moments (PWM) have been frequently used to estimate the parameters of classical extreme value distributions. This method-of-moment technique can be applied when second moments are finite, a reasonable assumption in many environmental domains like climatological and hydrological studies. Three advantages of PWM estimators can be put forward: their simple interpretations, their rapid numerical implementation and their close connection to the well-studied class of U-statistics. Concerning the later, this connection leads to precise asymptotic properties, but non asymptotic bounds have been lacking when off-the-shelf techniques (Chernoff method) cannot be applied, as exponential moment assumptions become unrealistic in many extreme value settings. In addition, large values analysis is not immune to the undesirable effect of outliers, for example, defective readings in satellite measurements or possible anomalies in climate model runs. Recently, the treatment of outliers has sparked some interest in extreme value theory, but results about finite sample bounds in a robust extreme value theory context are yet to be found, in particular for PWMs or tail index estimators. In this work, we propose a new class of robust PWM estimators, inspired by the median-of-means framework of Devroye et al. (2016). This class of robust estimators is shown to satisfy a sub-Gaussian inequality when the assumption of finite second moments holds. Such non asymptotic bounds are also derived under the general contamination model. Our main proposition confirms theoretically a trade-off between efficiency and robustness. Our simulation study indicates that, while classical estimators of PWMs can be highly sensitive to outliers.
In epidemiological studies, the capture-recapture (CRC) method is a powerful tool that can be used to estimate the number of diseased cases or potentially disease prevalence based on data from overlapping surveillance systems. Estimators derived from log-linear models are widely applied by epidemiologists when analyzing CRC data. The popularity of the log-linear model framework is largely associated with its accessibility and the fact that interaction terms can allow for certain types of dependency among data streams. In this work, we shed new light on significant pitfalls associated with the log-linear model framework in the context of CRC using real data examples and simulation studies. First, we demonstrate that the log-linear model paradigm is highly exclusionary. That is, it can exclude, by design, many possible estimates that are potentially consistent with the observed data. Second, we clarify the ways in which regularly used model selection metrics (e.g., information criteria) are fundamentally deceiving in the effort to select a best model in this setting. By focusing attention on these important cautionary points and on the fundamental untestable dependency assumption made when fitting a log-linear model to CRC data, we hope to improve the quality of and transparency associated with subsequent surveillance-based CRC estimates of case counts.
We present new results on average causal effects in settings with unmeasured exposure-outcome confounding. Our results are motivated by a class of estimands, e.g., frequently of interest in medicine and public health, that are currently not targeted by standard approaches for average causal effects. We recognize these estimands as queries about the average causal effect of an intervening variable. We anchor our introduction of these estimands in an investigation of the role of chronic pain and opioid prescription patterns in the opioid epidemic, and illustrate how conventional approaches will lead unreplicable estimates with ambiguous policy implications. We argue that our altenative effects are replicable and have clear policy implications, and furthermore are non-parametrically identified by the classical frontdoor formula. As an independent contribution, we derive a new semiparametric efficient estimator of the frontdoor formula with a uniform sample boundedness guarantee. This property is unique among previously-described estimators in its class, and we demonstrate superior performance in finite-sample settings. Theoretical results are applied with data from the National Health and Nutrition Examination Survey.
Elliptical distribution is a basic assumption underlying many multivariate statistical methods. For example, in sufficient dimension reduction and statistical graphical models, this assumption is routinely imposed to simplify the data dependence structure. Before applying such methods, we need to decide whether the data are elliptically distributed. Currently existing tests either focus exclusively on spherical distributions, or rely on bootstrap to determine the null distribution, or require specific forms of the alternative distribution. In this paper, we introduce a general nonparametric test for elliptical distribution based on kernel embedding of the probability measure that embodies the two properties that characterize an elliptical distribution: namely, after centering and rescaling, (1) the direction and length of the random vector are independent, and (2) the directional vector is uniformly distributed on the unit sphere. We derive the null asymptotic distribution of the test statistic via von-Mises expansion, develop the sample-level procedure to determine the rejection region, and establish the consistency and validity of the proposed test. We apply our test to a SENIC dataset with and without a transformation aimed to achieve ellipticity.
A novel positive dependence property is introduced, called positive measure inducing (PMI for short), being fulfilled by numerous copula classes, including Gaussian, Fr\'echet, Farlie-Gumbel-Morgenstern and Frank copulas; it is conjectured that even all positive quadrant dependent Archimedean copulas meet this property. From a geometric viewpoint, a PMI copula concentrates more mass near the main diagonal than in the opposite diagonal. A striking feature of PMI copulas is that they impose an ordering on a certain class of copula-induced measures of concordance, the latter originating in Edwards et al. (2004) and including Spearman's rho $\rho$ and Gini's gamma $\gamma$, leading to numerous new inequalities such as $3 \gamma \geq 2 \rho$. The measures of concordance within this class are estimated using (classical) empirical copulas and the intrinsic construction via empirical checkerboard copulas, and the estimators' asymptotic behaviour is determined. Building upon the presented inequalities, asymptotic tests are constructed having the potential of being used for detecting whether the underlying dependence structure of a given sample is PMI, which in turn can be used for excluding certain copula families from model building. The excellent performance of the tests is demonstrated in a simulation study and by means of a real-data example.
We consider the multiple testing of the general regression framework aiming at studying the relationship between a univariate response and a p-dimensional predictor. To test the hypothesis of the effect of each predictor, we construct an Angular Balanced Statistic (ABS) based on the estimator of the sliced inverse regression without assuming a model of the conditional distribution of the response. According to the developed limiting distribution results in this paper, we have shown that ABS is asymptotically symmetric with respect to zero under the null hypothesis. We then propose a Model-free multiple Testing procedure using Angular balanced statistics (MTA) and show theoretically that the false discovery rate of this method is less than or equal to a designated level asymptotically. Numerical evidence has shown that the MTA method is much more powerful than its alternatives, subject to the control of the false discovery rate.
Estimating heterogeneous treatment effects is crucial for informing personalized treatment strategies and policies. While multiple studies can improve the accuracy and generalizability of results, leveraging them for estimation is statistically challenging. Existing approaches often assume identical heterogeneous treatment effects across studies, but this may be violated due to various sources of between-study heterogeneity, including differences in study design, confounders, and sample characteristics. To this end, we propose a unifying framework for multi-study heterogeneous treatment effect estimation that is robust to between-study heterogeneity in the nuisance functions and treatment effects. Our approach, the multi-study R-learner, extends the R-learner to obtain principled statistical estimation with modern machine learning (ML) in the multi-study setting. The multi-study R-learner is easy to implement and flexible in its ability to incorporate ML for estimating heterogeneous treatment effects, nuisance functions, and membership probabilities, which borrow strength across heterogeneous studies. It achieves robustness in confounding adjustment through its loss function and can leverage both randomized controlled trials and observational studies. We provide asymptotic guarantees for the proposed method in the case of series estimation and illustrate using real cancer data that it has the lowest estimation error compared to existing approaches in the presence of between-study heterogeneity.
In this brief note, we consider estimation of the bitwise combination $x_1 \lor \dots \lor x_n = \max_i x_i$ observing a set of noisy bits $\tilde x_i \in \{0, 1\}$ that represent the true, unobserved bits $x_i \in \{0, 1\}$ under randomized response. We demonstrate that various existing estimators for the extreme bit, including those based on computationally costly estimates of the sum of bits, can be reduced to a simple closed form computed in linear time (in $n$) and constant space, including in an online fashion as new $\tilde x_i$ are observed. In particular, we derive such an estimator and provide its variance using only elementary techniques.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.