Model-based sequential approaches to discrete "black-box" optimization, including Bayesian optimization techniques, often access the same points multiple times for a given objective function in interest, resulting in many steps to find the global optimum. Here, we numerically study the effect of a postprocessing method on Bayesian optimization that strictly prohibits duplicated samples in the dataset. We find the postprocessing method significantly reduces the number of sequential steps to find the global optimum, especially when the acquisition function is of maximum a posterior estimation. Our results provide a simple but general strategy to solve the slow convergence of Bayesian optimization for high-dimensional problems.
Optimizing multiple competing objectives is a common problem across science and industry. The inherent inextricable trade-off between those objectives leads one to the task of exploring their Pareto front. A meaningful quantity for the purpose of the latter is the hypervolume indicator, which is used in Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the computational complexity for the calculation of the hypervolume scales unfavorably with increasing number of objectives and data points, which restricts its use in those common multi-objective optimization frameworks. To overcome these restrictions we propose to approximate the hypervolume function with a deep neural network, which we call DeepHV. For better sample efficiency and generalization, we exploit the fact that the hypervolume is scale-equivariant in each of the objectives as well as permutation invariant w.r.t. both the objectives and the samples, by using a deep neural network that is equivariant w.r.t. the combined group of scalings and permutations. We evaluate our method against exact, and approximate hypervolume methods in terms of accuracy, computation time, and generalization. We also apply and compare our methods to state-of-the-art multi-objective BO methods and EAs on a range of synthetic benchmark test cases. The results show that our methods are promising for such multi-objective optimization tasks.
We develop a provably efficient importance sampling scheme that estimates exit probabilities of solutions to small-noise stochastic reaction-diffusion equations from scaled neighborhoods of a stable equilibrium. The moderate deviation scaling allows for a local approximation of the nonlinear dynamics by their linearized version. In addition, we identify a finite-dimensional subspace where exits take place with high probability. Using stochastic control and variational methods we show that our scheme performs well both in the zero noise limit and pre-asymptotically. Simulation studies for stochastically perturbed bistable dynamics illustrate the theoretical results.
We develop commuting finite element projections over smooth Riemannian manifolds. This extension of finite element exterior calculus establishes the stability and convergence of finite element methods for the Hodge-Laplace equation on manifolds. The commuting projections use localized mollification operators, building upon a classical construction by de Rham. These projections are uniformly bounded on Lebesgue spaces of differential forms and map onto intrinsic finite element spaces defined with respect to an intrinsic smooth triangulation of the manifold. We analyze the Galerkin approximation error. Since practical computations use extrinsic finite element methods over approximate computational manifolds, we also analyze the geometric error incurred.
Nonparametric maximum likelihood estimators (MLEs) in inverse problems often have non-normal limit distributions, like Chernoff's distribution. However, if one considers smooth functionals of the model, with corresponding functionals of the MLE, one gets normal limit distributions and faster rates of convergence. We demonstrate this for a model for the incubation time of a disease. The usual approach in the latter models is to use parametric distributions, like Weibull and gamma distributions, which leads to inconsistent estimators. Smoothed bootstrap methods are discussed for constructing confidence intervals. The classical bootstrap, based on the nonparametric MLE itself, has been proved to be inconsistent in this situation.
Differential geometric approaches are ubiquitous in several fields of mathematics, physics and engineering, and their discretizations enable the development of network-based mathematical and computational frameworks, which are essential for large-scale data science. The Forman-Ricci curvature (FRC) - a statistical measure based on Riemannian geometry and designed for networks - is known for its high capacity for extracting geometric information from complex networks. However, extracting information from dense networks is still challenging due to the combinatorial explosion of high-order network structures. Motivated by this challenge we sought a set-theoretic representation theory for high-order network cells and FRC, as well as their associated concepts and properties, which together provide an alternative and efficient formulation for computing high-order FRC in complex networks. We provide a pseudo-code, a software implementation coined FastForman, as well as a benchmark comparison with alternative implementations. Crucially, our representation theory reveals previous computational bottlenecks and also accelerates the computation of FRC. As a consequence, our findings open new research possibilities in complex systems where higher-order geometric computations are required.
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages. It has been argued that the current dominant paradigm in NLP of pre-training on text-only corpora will not yield robust natural language understanding systems, and the need for grounded, goal-oriented, and interactive language learning has been high lighted. In our approach, we embed a multilingual model (mBART, Liu et al., 2020) into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task. The hypothesis is that this will align multiple languages to a shared task space. We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated, including the low-resource language Nepali.
Rational function approximations provide a simple but flexible alternative to polynomial approximation, allowing one to capture complex non-linearities without oscillatory artifacts. However, there have been few attempts to use rational functions on noisy data due to the likelihood of creating spurious singularities. To avoid the creation of singularities, we use Bernstein polynomials and appropriate conditions on their coefficients to force the denominator to be strictly positive. While this reduces the range of rational polynomials that can be expressed, it keeps all the benefits of rational functions while maintaining the robustness of polynomial approximation in noisy data scenarios. Our numerical experiments on noisy data show that existing rational approximation methods continually produce spurious poles inside the approximation domain. This contrasts our method, which cannot create poles in the approximation domain and provides better fits than a polynomial approximation and even penalized splines on functions with multiple variables. Moreover, guaranteeing pole-free in an interval is critical for estimating non-constant coefficients when numerically solving differential equations using spectral methods. This provides a compact representation of the original differential equation, allowing numeric solvers to achieve high accuracy quickly, as seen in our experiments.
A new mechanical model on noncircular shallow tunnelling considering initial stress field is proposed in this paper by constraining far-field ground surface to eliminate displacement singularity at infinity, and the originally unbalanced tunnel excavation problem in existing solutions is turned to an equilibrium one of mixed boundaries. By applying analytic continuation, the mixed boundaries are transformed to a homogenerous Riemann-Hilbert problem, which is subsequently solved via an efficient and accurate iterative method with boundary conditions of static equilibrium, displacement single-valuedness, and traction along tunnel periphery. The Lanczos filtering technique is used in the final stress and displacement solution to reduce the Gibbs phenomena caused by the constrained far-field ground surface for more accurte results. Several numerical cases are conducted to intensively verify the proposed solution by examining boundary conditions and comparing with existing solutions, and all the results are in good agreements. Then more numerical cases are conducted to investigate the stress and deformation distribution along ground surface and tunnel periphery, and several engineering advices are given. Further discussions on the defects of the proposed solution are also conducted for objectivity.
The equioscillation theorem interleaves the Haar condition, the existence and uniqueness and strong uniqueness of the optimal Chebyshev approximation and its characterization by the equioscillation condition in a way that cannot extend to multivariate approximation: Rice~[\emph{Transaction of the AMS}, 1963] says ''A form of alternation is still present for functions of several variables. However, there is apparently no simple method of distinguishing between the alternation of a best approximation and the alternation of other approximating functions. This is due to the fact that there is no natural ordering of the critical points.'' In addition, in the context of multivariate approximation the Haar condition is typically not satisfied and strong uniqueness may hold or not. The present paper proposes an multivariate equioscillation theorem, which includes such a simple alternation condition on error extrema, existence and a sufficient condition for strong uniqueness. To this end, the relationship between the properties interleaved in the univariate equioscillation theorem is clarified: first, a weak Haar condition is proposed, which simply implies existence. Second, the equioscillation condition is shown to be equivalent to the optimality condition of convex optimization, hence characterizing optimality independently from uniqueness. It is reformulated as the synchronized oscillations between the error extrema and the components of a related Haar matrix kernel vector, in a way that applies to multivariate approximation. Third, an additional requirement on the involved Haar matrix and its kernel vector, called strong equioscillation, is proved to be sufficient for the strong uniqueness of the solution. These three disconnected conditions give rise to a multivariate equioscillation theorem, where existence, characterization by equioscillation and strong uniqueness are separated, without involving the too restrictive Haar condition. Remarkably, relying on optimality condition of convex optimization gives rise to a quite simple proof. Instances of multivariate problems with strongly unique, non-strong but unique and non-unique solutions are presented to illustrate the scope of the theorem.
We develop randomized matrix-free algorithms for estimating partial traces. Our algorithm improves on the typicality-based approach used in [T. Chen and Y-C. Cheng, Numerical computation of the equilibrium-reduced density matrix for strongly coupled open quantum systems, J. Chem. Phys. 157, 064106 (2022)] by deflating important subspaces (e.g. corresponding to the low-energy eigenstates) explicitly. This results in a significant variance reduction for matrices with quickly decaying singular values. We then apply our algorithm to study the thermodynamics of several Heisenberg spin systems, particularly the entanglement spectrum and ergotropy.