Average-link is widely recognized as one of the most popular and effective methods for hierarchical agglomerative clustering. The available theoretical analyses show that this method achieves much better approximation guarantees than other popular heuristics, such as single-linkage and complete-linkage, with respect to variants of Dasgupta's cost function [STOC 2016]. However, these analyses do not separate average-link from a random hierarchy, and they are not compelling for metric spaces, since every hierarchical clustering achieves a 1/2 approximation with respect to the variant of Dasgupta's function employed for dissimilarity measures [Moseley and Yang 2020]. In this paper, we present a comprehensive study of the performance of average-link in metric spaces with respect to several natural criteria that capture separability and cohesion and are more interpretable than Dasgupta's cost function and its variants. We also present experimental results on real datasets that, together with our theoretical analyses, suggest that average-link is a better choice than related methods when both cohesion and separability are important goals.
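As a point of reference for the method itself, a minimal example of average-link agglomerative clustering on points in a metric space, using SciPy's standard implementation (the two-group dataset here is synthetic and purely illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Two well-separated synthetic groups of points in the plane.
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
                    rng.normal(3.0, 0.3, (20, 2))])

# Average-link repeatedly merges the pair of clusters with the smallest
# mean pairwise distance between their points.
dists = pdist(points)                       # condensed Euclidean distance matrix
tree = linkage(dists, method="average")     # hierarchical clustering (dendrogram)

# Cut the hierarchy into two flat clusters; cohesion and separability of such
# cuts are the kind of criteria the analysis above is concerned with.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```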
Local reconstruction analysis (LRA) is a powerful and flexible technique to study images reconstructed from discrete generalized Radon transform (GRT) data, $g=\mathcal R f$. The main idea of LRA is to obtain a simple formula to accurately approximate an image, $f_\epsilon(x)$, reconstructed from discrete data $g(y_j)$ in an $\epsilon$-neighborhood of a point, $x_0$. The points $y_j$ lie on a grid with step size of order $\epsilon$ in each direction. In this paper we study an iterative reconstruction algorithm, which consists of minimizing a quadratic cost functional. The cost functional is the sum of a data fidelity term and a Tikhonov regularization term. The function $f$ to be reconstructed has a jump discontinuity across a smooth surface $\mathcal S$. Fix a point $x_0\in\mathcal S$ and any $A>0$. The main result of the paper is the computation of the limit $\Delta F_0(\check x;x_0):=\lim_{\epsilon\to0}(f_\epsilon(x_0+\epsilon\check x)-f_\epsilon(x_0))$, where $f_\epsilon$ is the solution to the minimization problem and $|\check x|\le A$. A numerical experiment with a circular GRT demonstrates that $\Delta F_0(\check x;x_0)$ accurately approximates the actual reconstruction obtained by the cost functional minimization.
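For orientation, a minimal discretized sketch of this type of minimization, with a generic matrix standing in for the discrete GRT (all names, sizes, and the direct solve below are illustrative assumptions, not the paper's iterative algorithm or setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative discrete forward operator R (a stand-in for the GRT), a true
# image f_true with a jump discontinuity, and noisy data g = R f_true + noise.
n_data, n_pix = 200, 100
R = rng.normal(size=(n_data, n_pix))
f_true = np.where(np.arange(n_pix) < n_pix // 2, 0.0, 1.0)
g = R @ f_true + 0.01 * rng.normal(size=n_data)

# Quadratic cost: ||R f - g||^2 + alpha ||f||^2 (data fidelity + Tikhonov term).
# Its minimizer solves the normal equations (R^T R + alpha I) f = R^T g,
# solved here directly for brevity rather than iteratively.
alpha = 1e-2
f_eps = np.linalg.solve(R.T @ R + alpha * np.eye(n_pix), R.T @ g)

print(np.linalg.norm(f_eps - f_true) / np.linalg.norm(f_true))  # relative error
```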
Numerical schemes that conserve invariants have demonstrated superior performance in various contexts, and several unified methods have been developed for constructing such schemes. However, the mathematical properties of these schemes remain poorly understood, except in norm-preserving cases. This study introduces a novel analytical framework applicable to general energy-preserving schemes. The proposed framework is applied to Korteweg-de Vries (KdV)-type equations, establishing global existence and convergence estimates for the numerical solutions.
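For concreteness, a standard member of this class is the KdV equation $u_t + 6uu_x + u_{xxx} = 0$, whose energy (Hamiltonian) $\mathcal H(u) = \int \big(\tfrac{1}{2}u_x^2 - u^3\big)\,dx$ is conserved along exact solutions; energy-preserving schemes conserve a discrete analogue of such a functional exactly. The precise KdV-type class and invariants treated in the study may differ from this canonical example.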
Gradient-boosted decision tree algorithms are increasingly used in actuarial applications, as they show superior predictive performance over traditional generalized linear models. Many improvements and refinements of the original gradient boosting machine algorithm exist. We present in a unified notation, and contrast, the existing point and probabilistic gradient-boosted decision tree algorithms: GBM, XGBoost, DART, LightGBM, CatBoost, EGBM, PGBM, XGBoostLSS, cyclic GBM, and NGBoost. In a comprehensive numerical study, we compare their performance on five publicly available datasets for claim frequency and severity, of various sizes and comprising different numbers of (high-cardinality) categorical variables. We explain how varying exposure-to-risk can be handled with boosting in frequency models. We compare the algorithms on the basis of computational efficiency, predictive performance, and model adequacy. LightGBM and XGBoostLSS win in terms of computational efficiency. The fully interpretable EGBM achieves predictive performance competitive with that of the black-box algorithms considered. We find that there is no trade-off between model adequacy and predictive accuracy: both are achievable simultaneously.
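As a minimal sketch of the exposure-handling point, claim counts can be modeled with a Poisson objective and the log of the exposure supplied as an offset; the synthetic data, features, and hyperparameters below are illustrative only:

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)

# Synthetic frequency data: claim counts with varying exposure-to-risk.
n = 5000
X = rng.normal(size=(n, 3))                    # rating factors
exposure = rng.uniform(0.1, 1.0, size=n)       # years at risk
rate = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1])   # true claim rate per unit exposure
y = rng.poisson(rate * exposure)               # observed claim counts

# The exposure enters as a log offset via init_score, so the trees model the
# log claim rate per unit exposure.
train = lgb.Dataset(X, label=y, init_score=np.log(exposure))
params = {"objective": "poisson", "learning_rate": 0.05, "num_leaves": 31}
booster = lgb.train(params, train, num_boost_round=300)

# Expected claim counts: exp(raw score + log exposure); init_score is not
# added automatically at prediction time, so the offset is applied manually.
raw = booster.predict(X, raw_score=True)
expected_counts = np.exp(raw + np.log(exposure))
print(expected_counts[:5], y[:5])
```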
The optimization of large-scale multibody systems is a numerically challenging task, in particular when multiple conflicting criteria are considered at the same time. In this situation, we need to approximate the Pareto set of optimal compromises, which is significantly more expensive than finding a single optimum as in single-objective optimization. To avoid these large costs, the use of surrogate models, constructed from a small but informative number of expensive model evaluations, is a popular and widely studied approach. The central challenge is then to ensure a high quality (that is, near-optimality) of the solutions obtained using the surrogate model, which can be hard to guarantee with a single pre-computed surrogate. We present a back-and-forth approach that alternates between surrogate modeling and multi-objective optimization to improve the quality of the obtained solutions. Using the example of an expensive-to-evaluate multibody system, we compare different strategies for multi-objective optimization, sampling, and surrogate modeling to identify the most promising approach in terms of computational efficiency and solution quality.
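A schematic sketch of such a back-and-forth loop, with a cheap analytic stand-in for the expensive multibody simulation and a simple scalarization-based optimization step (both are simplifications chosen for illustration, not the strategies compared in the paper):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def expensive_model(x):
    """Stand-in for an expensive multibody simulation with two criteria."""
    return np.array([np.sum((x - 0.25) ** 2), np.sum((x - 0.75) ** 2)])

# Initial design of experiments.
X = rng.uniform(0.0, 1.0, size=(10, 2))
Y = np.array([expensive_model(x) for x in X])

for _ in range(5):   # alternate between surrogate modeling and optimization
    # 1) Fit one surrogate per objective on all evaluations collected so far.
    surrogates = [GaussianProcessRegressor().fit(X, Y[:, j]) for j in range(2)]

    # 2) Optimize weighted-sum scalarizations on the cheap surrogates.
    candidates = rng.uniform(0.0, 1.0, size=(2000, 2))
    preds = np.column_stack([s.predict(candidates) for s in surrogates])
    new_points = [candidates[np.argmin(preds @ np.array([w, 1.0 - w]))]
                  for w in np.linspace(0.0, 1.0, 5)]

    # 3) Evaluate the proposed compromises on the expensive model and refine.
    X = np.vstack([X, new_points])
    Y = np.vstack([Y, [expensive_model(x) for x in new_points]])

print(Y[np.argsort(Y[:, 0])][:5])   # a few near-Pareto compromises found so far
```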
This work studies the parameter-dependent diffusion equation in a two-dimensional domain consisting of locally mirror-symmetric layers. It is assumed that the diffusion coefficient is constant in each layer. The goal is to find approximate parameter-to-solution maps that have a small number of terms. It is shown that in the case of two layers one can find a solution formula consisting of three terms with explicit dependencies on the diffusion coefficient. The formula is based on decomposing the solution into orthogonal parts associated with the two layers and the interface between them. This formula is then extended to an approximate one for the multi-layer case. We give an analytical formula for square layers and use a finite element formulation for more general layers. The results are illustrated with numerical examples and have applications to reduced basis methods through the analysis of the Kolmogorov n-width.
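For reference, the relevant notion here is the Kolmogorov n-width of the solution set $\mathcal M = \{u(\mu) : \mu \in \mathcal P\}$ in a Hilbert space $V$, $d_n(\mathcal M) = \inf_{\dim V_n = n}\, \sup_{u \in \mathcal M}\, \inf_{v \in V_n} \|u - v\|_V$, which governs the best possible accuracy of an $n$-dimensional reduced basis; a parameter-to-solution map with few explicit terms immediately yields an upper bound on this quantity. The notation here is generic and not necessarily that of the paper.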
We propose a method utilizing physics-informed neural networks (PINNs) to solve Poisson equations that serve as control variates in the computation of transport coefficients via fluctuation formulas, such as the Green--Kubo and generalized Einstein-like formulas. By leveraging approximate solutions to the Poisson equation constructed through neural networks, our approach significantly reduces the variance of the estimator at hand. We provide an extensive numerical analysis of the estimators and detail a methodology for training neural networks to solve these Poisson equations. The approximate solutions are then incorporated into Monte Carlo simulations as effective control variates, demonstrating the suitability of the method for moderately high-dimensional problems where fully deterministic solutions are computationally infeasible.
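A toy sketch of the idea in one dimension, for the Ornstein-Uhlenbeck generator and the observable $h(x)=x^2$ (the network, training details, and setting are deliberately much simpler than those in the paper and serve only to illustrate the residual loss and the variance reduction):

```python
import torch

torch.manual_seed(0)

# Toy setting: OU dynamics dX = -X dt + sqrt(2) dW with generator L u = -x u' + u''.
# The observable h(x) = x^2 has stationary mean pi(h) = 1 under pi = N(0, 1).
# Poisson equation (in this sign convention): L phi = h - pi(h).
h = lambda x: x ** 2
pi_h = 1.0

phi = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)

def apply_generator(x):
    """Compute L phi(x) = -x phi'(x) + phi''(x) via automatic differentiation."""
    u = phi(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return -x * du + d2u

for _ in range(2000):
    x = torch.randn(256, 1, requires_grad=True)       # samples from pi
    residual = apply_generator(x) - (h(x) - pi_h)     # PINN-style residual
    loss = (residual ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Control variate: h - L phi has the same mean pi(h) but a much smaller variance.
x = torch.randn(100_000, 1, requires_grad=True)
cv = h(x) - apply_generator(x)
print(h(x).var().item(), cv.var().item())
```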
Preconditioned eigenvalue solvers make it possible to incorporate preconditioners into the solution of large-scale eigenvalue problems, such as those arising from the discretization of partial differential equations. The convergence analysis of such methods is intricate. Even for the relatively simple preconditioned inverse iteration (PINVIT), which targets the smallest eigenvalue of a symmetric positive definite matrix, the celebrated analysis by Neymeyr is highly nontrivial and only yields convergence if the starting vector is fairly close to the desired eigenvector. In this work, we prove a new non-asymptotic convergence result for a variant of PINVIT. Our proof proceeds by analyzing an equivalent Riemannian steepest descent method and leveraging convexity-like properties. We show a convergence rate that nearly matches that of PINVIT. As a major benefit, we require a condition on the starting vector that tends to be less stringent. This improved global convergence property is demonstrated for two classes of preconditioners, with theoretical bounds and a range of numerical experiments.
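For reference, a plain NumPy statement of the basic PINVIT update on a small synthetic matrix with a Jacobi (diagonal) preconditioner; this illustrates the classical iteration being analyzed, not the variant or the preconditioner classes studied in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric positive definite test matrix with a strongly varying diagonal,
# so that diagonal preconditioning is actually effective.
n = 200
B = rng.normal(scale=0.01, size=(n, n))
A = np.diag(np.linspace(1.0, 100.0, n)) + (B + B.T) / 2

d = np.diag(A).copy()
apply_prec = lambda r: r / d          # Jacobi preconditioner

# PINVIT: preconditioned residual correction followed by normalization.
x = rng.normal(size=n)
x /= np.linalg.norm(x)
for _ in range(500):
    rho = x @ (A @ x)                 # Rayleigh quotient (since ||x|| = 1)
    r = A @ x - rho * x               # eigenvalue residual
    x = x - apply_prec(r)             # preconditioned update
    x /= np.linalg.norm(x)

print(x @ (A @ x), np.linalg.eigvalsh(A)[0])   # PINVIT value vs. exact smallest
```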
Many real-world processes have complex tail dependence structures that cannot be characterized using classical Gaussian processes. More flexible spatial extremes models exhibit appealing extremal dependence properties but are often computationally prohibitive to fit and simulate from in high dimensions. In this paper, we aim to push the boundaries of computation and modeling for high-dimensional spatial extremes by integrating a new spatial extremes model, with flexible and non-stationary dependence properties, into the encoding-decoding structure of a variational autoencoder, called the XVAE. The XVAE can emulate spatial observations and produce outputs that have the same statistical properties as the inputs, especially in the tail. Our approach also provides a novel way of making fast inference with complex extreme-value processes. Through extensive simulation studies, we show that our XVAE is substantially more time-efficient than traditional Bayesian inference, while outperforming many spatial extremes models with a stationary dependence structure. Lastly, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which comprises 30 years of daily measurements at 16703 grid cells. We demonstrate how to use the XVAE to identify regions susceptible to marine heatwaves under climate change and examine the spatial and temporal variability of the extremal dependence structure.
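For context, the XVAE builds on the standard variational autoencoder training objective, the evidence lower bound $\mathcal L(\theta,\phi;x) = \mathbb E_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big] - \mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big)$, with the encoder $q_\phi$ and decoder $p_\theta$ built around the spatial extremes model described above; the notation here is the generic VAE one rather than necessarily the paper's.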
A statistical network model with overlapping communities can be generated as a superposition of mutually independent random graphs of varying size. The model is parameterized by the number of nodes, the number of communities, and the joint distribution of the community size and the edge probability. This model admits sparse parameter regimes with power-law limiting degree distributions and non-vanishing clustering coefficients. This article presents large-scale approximations of clique and cycle frequencies for graph samples generated by the model, which are valid for regimes with unbounded numbers of overlapping communities. Our results reveal the growth rates of these subgraph frequencies and show that their theoretical densities can be reliably estimated from data.
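A small simulation sketch of the generative model, with community sizes and edge probabilities drawn from an arbitrary illustrative distribution rather than one calibrated to the regimes analyzed in the article:

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

n_nodes, n_communities = 1000, 300
G = nx.empty_graph(n_nodes)

for _ in range(n_communities):
    # Joint draw of a community size and its within-community edge probability.
    size = min(n_nodes, 2 + rng.geometric(0.1))       # heavy-ish size distribution
    p = rng.uniform(0.2, 0.8)
    members = rng.choice(n_nodes, size=size, replace=False)
    # Overlay an independent G(size, p) random graph on the chosen members.
    H = nx.gnp_random_graph(size, p, seed=int(rng.integers(10**9)))
    G.add_edges_from(nx.relabel_nodes(H, dict(enumerate(members))).edges())

# Empirical subgraph statistics: number of triangles (3-cliques) and clustering.
n_triangles = sum(nx.triangles(G).values()) // 3
print(n_triangles, nx.average_clustering(G))
```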
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
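A schematic illustration of the kind of shape constraint involved: density-ratio weights are projected onto the isotonic cone with respect to a one-dimensional ordering and then used to reweight an empirical objective (scikit-learn names; this illustrates the constraint only and is not the estimator or the DRL formulation proposed in the paper):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Source sample and a 1-D score defining the partial order: the unknown target
# is assumed to up-weight larger scores monotonically.
n = 2000
x = rng.normal(size=n)
score = x

# Crude initial density-ratio estimate (a placeholder; any estimator could be used).
raw_ratio = np.exp(0.5 * score + 0.2 * rng.normal(size=n))

# Project the estimated ratio onto the isotonic cone along the score, then
# renormalize so the weights average to one.
iso = IsotonicRegression(increasing=True)
weights = iso.fit_transform(score, raw_ratio)
weights = weights / weights.mean()

# Reweighted estimate of a target-distribution quantity under these weights.
print(np.average(x ** 2, weights=weights))
```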