The number of modes in a probability density function is representative of the model's complexity and can also be viewed as the number of existing subpopulations. Despite its relevance, little research has been devoted to its estimation. Focusing on the univariate setting, we propose a novel approach targeting prediction accuracy inspired by some overlooked aspects of the problem. We argue for the need for structure in the solutions, the subjective and uncertain nature of modes, and the convenience of a holistic view blending global and local density properties. Our method builds upon a combination of flexible kernel estimators and parsimonious compositional splines. Feature exploration, model selection and mode testing are implemented in the Bayesian inference paradigm, providing soft solutions and allowing to incorporate expert judgement in the process. The usefulness of our proposal is illustrated through a case study in sports analytics, showcasing multiple companion visualisation tools. A thorough simulation study demonstrates that traditional modality-driven approaches paradoxically struggle to provide accurate results. In this context, our method emerges as a top-tier alternative offering innovative solutions for analysts.
We are interested in numerical algorithms for computing the electrical field generated by a charge distribution localized on scale $l$ in an infinite heterogeneous correlated random medium, in a situation where the medium is only known in a box of diameter $L\gg l$ around the support of the charge. We show that the algorithm of Lu, Otto and Wang, suggesting optimal Dirichlet boundary conditions motivated by the multipole expansion of Bella, Giunti and Otto, still performs well in correlated media. With overwhelming probability, we obtain a convergence rate in terms of $l$, $L$ and the size of the correlations for which optimality is supported with numerical simulations. These estimates are provided for ensembles which satisfy a multi-scale logarithmic Sobolev inequality, where our main tool is an extension of the semi-group estimates established by the first author. As part of our strategy, we construct sub-linear second-order correctors in this correlated setting which is of independent interest.
Given a boolean formula $\Phi$(X, Y, Z), the Max\#SAT problem asks for finding a partial model on the set of variables X, maximizing its number of projected models over the set of variables Y. We investigate a strict generalization of Max\#SAT allowing dependencies for variables in X, effectively turning it into a synthesis problem. We show that this new problem, called DQMax\#SAT, subsumes both the DQBF and DSSAT problems. We provide a general resolution method, based on a reduction to Max\#SAT, together with two improvements for dealing with its inherent complexity. We further discuss a concrete application of DQMax\#SAT for symbolic synthesis of adaptive attackers in the field of program security. Finally, we report preliminary results obtained on the resolution of benchmark problems using a prototype DQMax\#SAT solver implementation.
With the increasing availability of large scale datasets, computational power and tools like automatic differentiation and expressive neural network architectures, sequential data are now often treated in a data-driven way, with a dynamical model trained from the observation data. While neural networks are often seen as uninterpretable black-box architectures, they can still benefit from physical priors on the data and from mathematical knowledge. In this paper, we use a neural network architecture which leverages the long-known Koopman operator theory to embed dynamical systems in latent spaces where their dynamics can be described linearly, enabling a number of appealing features. We introduce methods that enable to train such a model for long-term continuous reconstruction, even in difficult contexts where the data comes in irregularly-sampled time series. The potential for self-supervised learning is also demonstrated, as we show the promising use of trained dynamical models as priors for variational data assimilation techniques, with applications to e.g. time series interpolation and forecasting.
Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has always been a challenging problem due to the absence of closed-form solutions for the marginal likelihood. The RJMCMC approach can be employed to samples model and coefficients jointly, but effective design of the transdimensional jumps of RJMCMC can be challenge, making it hard to implement. Alternatively, the marginal likelihood can be derived using data-augmentation scheme e.g. Polya-gamma data argumentation for logistic regression) or through other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear and survival models, and using estimations such as Laplace approximation or correlated pseudo-marginal to derive marginal likelihood within a locally informed proposal can be computationally expensive in the "large n, large p" settings. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distribution in both generalised linear models and survival models. Secondly, in the light of the approximate Laplace approximation, we also describe an efficient and accurate estimation method for the marginal likelihood which involves adaptive parameters. Additionally, we describe a new method to adapt the algorithmic tuning parameters of the PARNI proposal by replacing the Rao-Blackwellised estimates with the combination of a warm-start estimate and an ergodic average. We present numerous numerical results from simulated data and 8 high-dimensional gene fine mapping data-sets to showcase the efficiency of the novel PARNI proposal compared to the baseline add-delete-swap proposal.
The nonnegative rank of nonnegative matrices is an important quantity that appears in many fields, such as combinatorial optimization, communication complexity, and information theory. In this paper, we study the asymptotic growth of the nonnegative rank of a fixed nonnegative matrix under Kronecker product. This quantity is called the asymptotic nonnegative rank, which is already studied in information theory. By applying the theory of asymptotic spectra of V. Strassen (J. Reine Angew. Math. 1988), we introduce the asymptotic spectrum of nonnegative matrices and give a dual characterization of the asymptotic nonnegative rank. As the opposite of nonnegative rank, we introduce the notion of the subrank of a nonnegative matrix and show that it is exactly equal to the size of the maximum induced matching of the bipartite graph defined on the support of the matrix (therefore, independent of the value of entries). Finally, we show that two matrix parameters, namely rank and fractional cover number, belong to the asymptotic spectrum of nonnegative matrices.
The (modern) arbitrary derivative (ADER) approach is a popular technique for the numerical solution of differential problems based on iteratively solving an implicit discretization of their weak formulation. In this work, focusing on an ODE context, we investigate several strategies to improve this approach. Our initial emphasis is on the order of accuracy of the method in connection with the polynomial discretization of the weak formulation. We demonstrate that precise choices lead to higher-order convergences in comparison to the existing literature. Then, we put ADER methods into a Deferred Correction (DeC) formalism. This allows to determine the optimal number of iterations, which is equal to the formal order of accuracy of the method, and to introduce efficient $p$-adaptive modifications. These are defined by matching the order of accuracy achieved and the degree of the polynomial reconstruction at each iteration. We provide analytical and numerical results, including the stability analysis of the new modified methods, the investigation of the computational efficiency, an application to adaptivity and an application to hyperbolic PDEs with a Spectral Difference (SD) space discretization.
Besov priors are nonparametric priors that can model spatially inhomogeneous functions. They are routinely used in inverse problems and imaging, where they exhibit attractive sparsity-promoting and edge-preserving features. A recent line of work has initiated the study of their asymptotic frequentist convergence properties. In the present paper, we consider the theoretical recovery performance of the posterior distributions associated to Besov-Laplace priors in the density estimation model, under the assumption that the observations are generated by a possibly spatially inhomogeneous true density belonging to a Besov space. We improve on existing results and show that carefully tuned Besov-Laplace priors attain optimal posterior contraction rates. Furthermore, we show that hierarchical procedures involving a hyper-prior on the regularity parameter lead to adaptation to any smoothness level.
Simulating physical problems involving multi-time scale coupling is challenging due to the need of solving these multi-time scale processes simultaneously. In response to this challenge, this paper proposed an explicit multi-time step algorithm coupled with a solid dynamic relaxation scheme. The explicit scheme simplifies the equation system in contrast to the implicit scheme, while the multi-time step algorithm allows the equations of different physical processes to be solved under different time step sizes. Furthermore, an implicit viscous damping relaxation technique is applied to significantly reduce computational iterations required to achieve equilibrium in the comparatively fast solid response process. To validate the accuracy and efficiency of the proposed algorithm, two distinct scenarios, i.e., a nonlinear hardening bar stretching and a fluid diffusion coupled with Nafion membrane flexure, are simulated. The results show good agreement with experimental data and results from other numerical methods, and the simulation time is reduced firstly by independently addressing different processes with the multi-time step algorithm and secondly decreasing solid dynamic relaxation time through the incorporation of damping techniques.
For a singular integral equation on an interval of the real line, we study the behavior of the error of a delta-delta discretization. We show that the convergence is non-uniform, between order $O(h^{2})$ in the interior of the interval and a boundary layer where the consistency error does not tend to zero.
Digital memcomputing machines (DMMs) are a new class of computing machines that employ non-quantum dynamical systems with memory to solve combinatorial optimization problems. Here, we show that the time to solution (TTS) of DMMs follows an inverse Gaussian distribution, with the TTS self-averaging with increasing problem size, irrespective of the problem they solve. We provide both an analytical understanding of this phenomenon and numerical evidence by solving instances of the 3-SAT (satisfiability) problem. The self-averaging property of DMMs with problem size implies that they are increasingly insensitive to the detailed features of the instances they solve. This is in sharp contrast to traditional algorithms applied to the same problems, illustrating another advantage of this physics-based approach to computation.