The goal of anomaly detection is to identify observations that are generated by a distribution that differs from the reference distribution that qualifies normal behavior. When examining a time series, the reference distribution may evolve over time. The anomaly detector must therefore be able to adapt to such changes. In the online context, it is particularly difficult to adapt to abrupt and unpredictable changes. Our solution to this problem is based on the detection of breakpoints in order to adapt in real time to the new reference behavior of the series and to increase the accuracy of the anomaly detection. This solution also provides a control of the False Discovery Rate by extending methods developed for stationary series.
Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of $m$ new points, giving rise to $m$ conformal $p$-values. While classical results only concern their marginal distribution, we show that their joint distribution follows a P\'olya urn model, and establish a concentration inequality for their empirical distribution function. The results hold for arbitrary exchangeable scores, including adaptive ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification.
One of the fundamental steps toward understanding a complex system is identifying variation at the scale of the system's components that is most relevant to behavior on a macroscopic scale. Mutual information provides a natural means of linking variation across scales of a system due to its independence of functional relationship between observables. However, characterizing the manner in which information is distributed across a set of observables is computationally challenging and generally infeasible beyond a handful of measurements. Here we propose a practical and general methodology that uses machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. Guided by the distributed information bottleneck as a learning objective, the information decomposition identifies the variation in the measurements of the system state most relevant to specified macroscale behavior. We focus our analysis on two paradigmatic complex systems: a Boolean circuit and an amorphous material undergoing plastic deformation. In both examples, the large amount of entropy of the system state is decomposed, bit by bit, in terms of what is most related to macroscale behavior. The identification of meaningful variation in data, with the full generality brought by information theory, is made practical for studying the connection between micro- and macroscale structure in complex systems.
The direct parametrisation method for invariant manifold is a model-order reduction technique that can be applied to nonlinear systems described by PDEs and discretised e.g. with a finite element procedure in order to derive efficient reduced-order models (ROMs). In nonlinear vibrations, it has already been applied to autonomous and non-autonomous problems to propose ROMs that can compute backbone and frequency-response curves of structures with geometric nonlinearity. While previous developments used a first-order expansion to cope with the non-autonomous term, this assumption is here relaxed by proposing a different treatment. The key idea is to enlarge the dimension of the parametrising coordinates with additional entries related to the forcing. A new algorithm is derived with this starting assumption and, as a key consequence, the resonance relationships appearing through the homological equations involve multiple occurrences of the forcing frequency, showing that with this new development, ROMs for systems exhibiting a superharmonic resonance, can be derived. The method is implemented and validated on academic test cases involving beams and arches. It is numerically demonstrated that the method generates efficient ROMs for problems involving 3:1 and 2:1 superharmonic resonances, as well as converged results for systems where the first-order truncation on the non-autonomous term showed a clear limitation.
The use of discretized variables in the development of prediction models is a common practice, in part because the decision-making process is more natural when it is based on rules created from segmented models. Although this practice is perhaps more common in medicine, it is extensible to any area of knowledge where a predictive model helps in decision-making. Therefore, providing researchers with a useful and valid categorization method could be a relevant issue when developing prediction models. In this paper, we propose a new general methodology that can be applied to categorize a predictor variable in any regression model where the response variable belongs to the exponential family distribution. Furthermore, it can be applied in any multivariate context, allowing to categorize more than one continuous covariate simultaneously. In addition, a computationally very efficient method is proposed to obtain the optimal number of categories, based on a pseudo-BIC proposal. Several simulation studies have been conducted in which the efficiency of the method with respect to both the location and the number of estimated cut-off points is shown. Finally, the categorization proposal has been applied to a real data set of 543 patients with chronic obstructive pulmonary disease from Galdakao Hospital's five outpatient respiratory clinics, who were followed up for 10 years. We applied the proposed methodology to jointly categorize the continuous variables six-minute walking test and forced expiratory volume in one second in a multiple Poisson generalized additive model for the response variable rate of the number of hospital admissions by years of follow-up. The location and number of cut-off points obtained were clinically validated as being in line with the categorizations used in the literature.
Mendelian randomization uses genetic variants as instrumental variables to make causal inferences about the effects of modifiable risk factors on diseases from observational data. One of the major challenges in Mendelian randomization is that many genetic variants are only modestly or even weakly associated with the risk factor of interest, a setting known as many weak instruments. Many existing methods, such as the popular inverse-variance weighted (IVW) method, could be biased when the instrument strength is weak. To address this issue, the debiased IVW (dIVW) estimator, which is shown to be robust to many weak instruments, was recently proposed. However, this estimator still has non-ignorable bias when the effective sample size is small. In this paper, we propose a modified debiased IVW (mdIVW) estimator by multiplying a modification factor to the original dIVW estimator. After this simple correction, we show that the bias of the mdIVW estimator converges to zero at a faster rate than that of the dIVW estimator under some regularity conditions. Moreover, the mdIVW estimator has smaller variance than the dIVW estimator.We further extend the proposed method to account for the presence of instrumental variable selection and balanced horizontal pleiotropy. We demonstrate the improvement of the mdIVW estimator over the dIVW estimator through extensive simulation studies and real data analysis.
When complex Bayesian models exhibit implausible behaviour, one solution is to assemble available information into an informative prior. Challenges arise as prior information is often only available for the observable quantity, or some model-derived marginal quantity, rather than directly pertaining to the natural parameters in our model. We propose a method for translating available prior information, in the form of an elicited distribution for the observable or model-derived marginal quantity, into an informative joint prior. Our approach proceeds given a parametric class of prior distributions with as yet undetermined hyperparameters, and minimises the difference between the supplied elicited distribution and corresponding prior predictive distribution. We employ a global, multi-stage Bayesian optimisation procedure to locate optimal values for the hyperparameters. Three examples illustrate our approach: a cure-fraction survival model, where censoring implies that the observable quantity is a priori a mixed discrete/continuous quantity; a setting in which prior information pertains to $R^{2}$ -- a model-derived quantity; and a nonlinear regression model.
Fourth-order variational inequalities are encountered in various scientific and engineering disciplines, including elliptic optimal control problems and plate obstacle problems. In this paper, we consider additive Schwarz methods for solving fourth-order variational inequalities. Based on a unified framework of various finite element methods for fourth-order variational inequalities, we develop one- and two-level additive Schwarz methods. We prove that the two-level method is scalable in the sense that the convergence rate of the method depends on $H/h$ and $H/\delta$ only, where $h$ and $H$ are the typical diameters of an element and a subdomain, respectively, and $\delta$ measures the overlap among the subdomains. This proof relies on a new nonlinear positivity-preserving coarse interpolation operator, the construction of which was previously unknown. To the best of our knowledge, this analysis represents the first investigation into the scalability of the two-level additive Schwarz method for fourth-order variational inequalities. Our theoretical results are verified by numerical experiments.
Accurate and precise climate projections are required for climate adaptation and mitigation, but Earth system models still exhibit great uncertainties. Several approaches have been developed to reduce the spread of climate projections and feedbacks, yet those methods cannot capture the non-linear complexity inherent in the climate system. Using a Transfer Learning approach, we show that Machine Learning can be used to optimally leverage and merge the knowledge gained from Earth system models simulations and historical observations to more accurately project global surface air temperature fields in the 21st century. We reach an uncertainty reduction of more than 50% with respect to state-of-the-art approaches. We give evidence that our novel method provides narrower projection uncertainty together with more accurate mean climate projections, urgently required for climate adaptation.
We propose a new framework of Hessian-free force-gradient integrators that do not require the analytical expression of the force-gradient term based on the Hessian of the potential. Due to that the new class of decomposition algorithms for separable Hamiltonian systems with quadratic kinetic energy may be particularly useful when applied to Hamiltonian systems where an evaluation of the Hessian is significantly more expensive than an evaluation of its gradient, e.g. in molecular dynamics simulations of classical systems. Numerical experiments of an N-body problem, as well as applications to the molecular dynamics step in the Hybrid Monte Carlo (HMC) algorithm for lattice simulations of the Schwinger model and Quantum Chromodynamics (QCD) verify these expectations.
The Fisher-Rao distance between two probability distributions of a statistical model is defined as the Riemannian geodesic distance induced by the Fisher information metric. In order to calculate the Fisher-Rao distance in closed-form, we need (1) to elicit a formula for the Fisher-Rao geodesics, and (2) to integrate the Fisher length element along those geodesics. We consider several numerically robust approximation and bounding techniques for the Fisher-Rao distances: First, we report generic upper bounds on Fisher-Rao distances based on closed-form 1D Fisher-Rao distances of submodels. Second, we describe several generic approximation schemes depending on whether the Fisher-Rao geodesics or pregeodesics are available in closed-form or not. In particular, we obtain a generic method to guarantee an arbitrarily small additive error on the approximation provided that Fisher-Rao pregeodesics and tight lower and upper bounds are available. Third, we consider the case of Fisher metrics being Hessian metrics, and report generic tight upper bounds on the Fisher-Rao distances using techniques of information geometry. Uniparametric and biparametric statistical models always have Fisher Hessian metrics, and in general a simple test allows to check whether the Fisher information matrix yields a Hessian metric or not. Fourth, we consider elliptical distribution families and show how to apply the above techniques to these models. We also propose two new distances based either on the Fisher-Rao lengths of curves serving as proxies of Fisher-Rao geodesics, or based on the Birkhoff/Hilbert projective cone distance. Last, we consider an alternative group-theoretic approach for statistical transformation models based on the notion of maximal invariant which yields insights on the structures of the Fisher-Rao distance formula which may be used fruitfully in applications.