We revisit the generalized hyperbolic (GH) distribution and its nested models. These include widely used parametric choices like the multivariate normal, skew-t, Laplace, and several others. We also introduce the multiple-choice LASSO, a novel penalized method for choosing among alternative constraints on the same parameter. A hierarchical multiple-choice LASSO penalized likelihood is optimized to perform simultaneous model selection and inference within the GH family. We illustrate our approach through a simulation study and a real data example concerning pre-term infants. The methodology proposed in this paper has been implemented in R functions, which are available as supplementary material.
We present a rigorous and precise analysis of the maximum degree and the average degree in a dynamic duplication-divergence graph model introduced by Sol\'e, Pastor-Satorras et al., in which the graph grows according to a duplication-divergence mechanism, i.e. by iteratively creating a copy of some node and then randomly altering the neighborhood of the new node with probability $p$. This model captures the growth of some real-world networks, e.g. biological or social networks. In this paper, we prove that for some $0 < p < 1$ the maximum degree and the average degree of a duplication-divergence graph on $t$ vertices are asymptotically concentrated with high probability around $t^p$ and $\max\{t^{2 p - 1}, 1\}$, respectively, i.e. they are within at most a polylogarithmic factor from these values with probability at least $1 - t^{-A}$ for any constant $A > 0$.
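The growth mechanism described above is easy to simulate. The following is a minimal sketch, assuming the common variant in which the duplicate node inherits each edge of the copied node independently with probability $p$; the function name and the choice of a single-edge seed graph are ours, not the paper's.

```python
import random

def duplication_divergence(t, p, seed=None):
    """Grow a graph to t vertices by duplication-divergence:
    pick a uniformly random existing node, copy it, and keep each
    inherited edge independently with probability p (one common variant)."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}                 # seed graph: a single edge
    for v in range(2, t):
        u = rng.randrange(v)               # node to duplicate
        kept = {w for w in adj[u] if rng.random() < p}
        adj[v] = kept                      # duplicate may end up isolated
        for w in kept:
            adj[w].add(v)                  # keep adjacency symmetric
    return adj

g = duplication_divergence(2000, 0.5, seed=42)
max_deg = max(len(nbrs) for nbrs in g.values())
avg_deg = sum(len(nbrs) for nbrs in g.values()) / len(g)
```

For $p = 0.5$ the result quoted above predicts an average degree of order $\max\{t^{0}, 1\} = 1$ up to polylogarithmic factors, so `avg_deg` stays small even as `t` grows, while `max_deg` scales roughly like $t^{1/2}$.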
We propose, analyze and realize a variational multiclass segmentation scheme that partitions a given image into multiple regions exhibiting specific properties. Our method determines multiple functions that encode the segmentation regions by minimizing an energy functional combining information from different channels. Multichannel image data can be obtained by lifting the image into a higher dimensional feature space using specific multichannel filtering, or may already be provided by the imaging modality under consideration, such as an RGB image or multimodal medical data. Experimental results show that the proposed method performs well in various scenarios. In particular, promising results are presented for two medical applications involving classification of brain abscess and tumor growth, respectively. As main theoretical contributions, we prove the existence of global minimizers of the proposed energy functional and show its stability and convergence with respect to noisy inputs. These results also apply to the special case of binary segmentation, where they are likewise novel.
The use of air traffic management (ATM) simulators for planning and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels to support ATM decision-making. XALM efficiently uncovers hidden relationships between input and output variables in ATM simulators, which are typically of interest in policy analysis. Our experiments show that XALM's predictive performance is comparable to that of the XGBoost metamodel while requiring fewer simulations. Additionally, XALM exhibits superior explanatory capabilities compared to non-active learning metamodels. Using the `Mercury' (flight and passenger) ATM simulator, XALM is applied to a real-world scenario at Paris Charles de Gaulle airport, extending an arrival manager's range and scope by analysing six variables. This case study illustrates XALM's effectiveness in enhancing simulation interpretability and understanding variable interactions. By addressing computational challenges and improving explainability, XALM complements traditional simulation-based analyses. Lastly, we discuss two practical approaches for further reducing the computational burden of the metamodelling: we introduce a stopping criterion for active learning based on the inherent uncertainty of the metamodel, and we show how the simulations used for the metamodel can be reused across key performance indicators, thus decreasing the overall number of simulations needed.
A slowly decaying Kolmogorov n-width of the solution manifold of a parametric partial differential equation precludes the realization of efficient linear projection-based reduced-order models, due to the high dimensionality of the reduced space needed to approximate the solution manifold with sufficient accuracy. To address this problem, neural networks of various architectures have been employed to build accurate nonlinear regressions of solution manifolds. However, the majority of these implementations are non-intrusive black-box surrogate models, and only some of them perform dimension reduction from the number of degrees of freedom of the discretized parametric models to a latent dimension. We present a new intrusive and explainable methodology for reduced-order modelling that employs neural networks for solution manifold approximation but does not discard the physical and numerical models underneath in the predictive/online stage. We focus on autoencoders used to further compress the dimensionality of linear approximants of solution manifolds, ultimately achieving a nonlinear dimension reduction. After obtaining an accurate nonlinear approximant, we seek the solutions on the latent manifold with the residual-based nonlinear least-squares Petrov-Galerkin method, suitably hyper-reduced so as to be independent of the number of degrees of freedom. New adaptive hyper-reduction strategies are developed, along with the employment of local nonlinear approximants. We test our methodology on two nonlinear time-dependent parametric benchmarks: a supersonic flow past a NACA airfoil with varying Mach number and an incompressible turbulent flow around the Ahmed body with varying slant angle.
Given samples from two non-negative random variables, we propose a new class of nonparametric tests for the null hypothesis that one random variable dominates the other with respect to second-order stochastic dominance. These tests are based on the Lorenz P-P plot (LPP), which is the composition of the inverse unscaled Lorenz curve of one distribution with the unscaled Lorenz curve of the other. The LPP exceeds the identity function if and only if the dominance condition is violated, providing a rather simple method to construct test statistics, given by functionals defined over the difference between the identity and the LPP. We determine a stochastic upper bound for such test statistics under the null hypothesis and derive its limit distribution, which can be approximated via bootstrap procedures. We also establish the asymptotic validity of the tests under relatively mild conditions, allowing for both dependent and independent samples. Finally, finite sample properties are investigated through simulation studies.
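The empirical ingredients of the LPP can be sketched in a few lines. The following is a minimal illustration, not the authors' construction: the unscaled Lorenz curve is the piecewise-linear empirical version $L(u) = \int_0^u F^{-1}(s)\,ds$, the composition direction is one illustrative convention, the sup-type positive-part statistic is just one simple choice of functional, and all function names are ours.

```python
import numpy as np

def unscaled_lorenz(sample, grid):
    """Empirical unscaled Lorenz curve L(u) = int_0^u F^{-1}(s) ds,
    evaluated at the points in grid via piecewise-linear interpolation."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    knots = np.arange(n + 1) / n                      # 0, 1/n, ..., 1
    vals = np.concatenate(([0.0], np.cumsum(x) / n))  # L(0)=0, L(1)=mean
    return np.interp(grid, knots, vals)

def lpp(sample_x, sample_y, grid):
    """Empirical Lorenz P-P plot: inverse unscaled Lorenz curve of Y
    composed with the unscaled Lorenz curve of X (illustrative convention)."""
    y = np.sort(np.asarray(sample_y, dtype=float))
    n = len(y)
    knots = np.arange(n + 1) / n
    Ly = np.concatenate(([0.0], np.cumsum(y) / n))
    Lx = unscaled_lorenz(sample_x, grid)
    # L_Y is increasing for positive data, so invert it by swapping axes
    return np.interp(Lx, Ly, knots)

# A simple sup-type statistic over the positive part of (LPP - identity)
rng = np.random.default_rng(0)
u = np.linspace(0.0, 1.0, 201)
x = rng.exponential(1.0, 500)
y = rng.exponential(1.5, 500)
stat = np.max(np.maximum(lpp(x, y, u) - u, 0.0))
```

Since the LPP exceeds the identity exactly when dominance is violated, a large value of `stat` is evidence against the null; its critical values would come from the bootstrap approximation discussed above.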
A physics-informed machine learning model, in the form of a multi-output Gaussian process, is formulated using the Euler-Bernoulli beam equation. Given appropriate datasets, the model can be used to regress the analytical value of the structure's bending stiffness, interpolate responses, and make probabilistic inferences on latent physical quantities. The developed model is applied to a numerically simulated cantilever beam, where the regressed bending stiffness is evaluated and the influence of measurement noise on the prediction quality is investigated. Further, the regressed probabilistic stiffness distribution is used in a structural health monitoring context, where the Mahalanobis distance is employed to reason about the possible location and extent of damage in the structural system. To validate the developed framework, an experiment is conducted and measured heterogeneous datasets are used to update the assumed analytical structural model.
Partially linear additive models generalize linear regression models by assuming that some covariates have a linear relation with the response while each of the others enters through an unknown univariate smooth function. The harmful effect of outliers, either in the residuals or in the covariates involved in the linear component, has been described for partially linear models, that is, when only one nonparametric component is involved in the model. When dealing with additive components, the problem of providing reliable estimators when atypical data arise is of practical importance, motivating the need for robust procedures. Hence, we propose a family of robust estimators for partially linear additive models by combining $B$-splines with robust linear regression estimators. We obtain consistency results, rates of convergence and asymptotic normality for the linear components under mild assumptions. A Monte Carlo study is carried out to compare the performance of the robust proposal with its classical counterpart under different models and contamination schemes. The numerical experiments show the advantage of the proposed methodology for finite samples. We also illustrate the usefulness of the proposed approach on a real data set.
Penalized $M$-estimators for logistic regression models have previously been studied for fixed dimension in order to obtain sparse statistical models and automatic variable selection. In this paper, we derive asymptotic results for penalized $M$-estimators when the dimension $p$ grows to infinity with the sample size $n$. Specifically, we obtain consistency and rate-of-convergence results for some choices of the penalty function. Moreover, we prove that these estimators select the relevant variables with probability tending to one, and we derive their asymptotic distribution.
This work outlines a time-domain numerical integration technique for linear hyperbolic partial differential equations sourced by distributions (Dirac $\delta$-functions and their derivatives). Such problems arise when studying binary black hole systems in the extreme mass ratio limit. We demonstrate that such source terms may be converted to effective domain-wide sources when discretized, and we introduce a class of time-steppers that directly account for these discontinuities in time integration. Moreover, our time-steppers are constructed to respect time reversal symmetry, a property that has been connected to conservation of physical quantities like energy and momentum in numerical simulations. To illustrate the utility of our method, we numerically study a distributionally-sourced wave equation that shares many features with the equations governing linear perturbations to black holes sourced by a point mass.
This paper concerns an expansion of first-order Belnap-Dunn logic called $\mathrm{BD}^{\supset,\mathsf{F}}$. Its connectives and quantifiers are all familiar from classical logic, and its logical consequence relation is very closely connected to that of classical logic. Results that convey this close connection are established. Fifteen classical laws of logical equivalence are used to distinguish $\mathrm{BD}^{\supset,\mathsf{F}}$ from all other four-valued logics with the same connectives and quantifiers whose logical consequence relation is as closely connected to that of classical logic. It is shown that several interesting non-classical connectives added to Belnap-Dunn logic in previously studied expansions are definable in $\mathrm{BD}^{\supset,\mathsf{F}}$. It is also established that $\mathrm{BD}^{\supset,\mathsf{F}}$ is both paraconsistent and paracomplete. Moreover, a sequent calculus proof system that is sound and complete with respect to the logical consequence relation of $\mathrm{BD}^{\supset,\mathsf{F}}$ is presented.