We consider the fundamental task of optimizing a real-valued function defined in a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use the warped Riemannian geometry notions to redefine the optimisation problem of a function on Euclidean space to a Riemannian manifold with a warped metric, and then find the function's optimum along this manifold. The warped metric chosen for the search domain induces a computational friendly metric-tensor for which optimal search directions associate with geodesic curves on the manifold becomes easier to compute. Performing optimization along geodesics is known to be generally infeasible, yet we show that in this specific manifold we can analytically derive Taylor approximations up to third-order. In general these approximations to the geodesic curve will not lie on the manifold, however we construct suitable retraction maps to pull them back onto the manifold. Therefore, we can efficiently optimize along the approximate geodesic curves. We cover the related theory, describe a practical optimization algorithm and empirically evaluate it on a collection of challenging optimisation benchmarks. Our proposed algorithm, using third-order approximation of geodesics, outperforms standard Euclidean gradient-based counterparts in term of number of iterations until convergence and an alternative method for Hessian-based optimisation routines.
Inspired by biology's most sophisticated computer, the brain, neural networks constitute a profound reformulation of computational principles. Remarkably, analogous high-dimensional, highly-interconnected computational architectures also arise within information-processing molecular systems inside living cells, such as signal transduction cascades and genetic regulatory networks. Might neuromorphic collective modes be found more broadly in other physical and chemical processes, even those that ostensibly play non-information-processing roles such as protein synthesis, metabolism, or structural self-assembly? Here we examine nucleation during self-assembly of multicomponent structures, showing that high-dimensional patterns of concentrations can be discriminated and classified in a manner similar to neural network computation. Specifically, we design a set of 917 DNA tiles that can self-assemble in three alternative ways such that competitive nucleation depends sensitively on the extent of co-localization of high-concentration tiles within the three structures. The system was trained in-silico to classify a set of 18 grayscale 30 x 30 pixel images into three categories. Experimentally, fluorescence and atomic force microscopy monitoring during and after a 150-hour anneal established that all trained images were correctly classified, while a test set of image variations probed the robustness of the results. While slow compared to prior biochemical neural networks, our approach is surprisingly compact, robust, and scalable. This success suggests that ubiquitous physical phenomena, such as nucleation, may hold powerful information processing capabilities when scaled up as high-dimensional multicomponent systems.
A deep generative model yields an implicit estimator for the unknown distribution or density function of the observation. This paper investigates some statistical properties of the implicit density estimator pursued by VAE-type methods from a nonparametric density estimation framework. More specifically, we obtain convergence rates of the VAE-type density estimator under the assumption that the underlying true density function belongs to a locally H\"{o}lder class. Remarkably, a near minimax optimal rate with respect to the Hellinger metric can be achieved by the simplest network architecture, a shallow generative model with a one-dimensional latent variable.
We extend classical work by Janusz Czelakowski on the closure properties of the class of matrix models of entailment relations - nowadays more commonly called multiple-conclusion logics - to the setting of non-deterministic matrices (Nmatrices), characterizing the Nmatrix models of an arbitrary logic through a generalization of the standard class operators to the non-deterministic setting. We highlight the main differences that appear in this more general setting, in particular: the possibility to obtain Nmatrix quotients using any compatible equivalence relation (not necessarily a congruence); the problem of determining when strict homomorphisms preserve the logic of a given Nmatrix; the fact that the operations of taking images and preimages cannot be swapped, which determines the exact sequence of operators that generates, from any complete semantics, the class of all Nmatrix models of a logic. Many results, on the other hand, generalize smoothly to the non-deterministic setting: we show for instance that a logic is finitely based if and only if both the class of its Nmatrix models and its complement are closed under ultraproducts. We conclude by mentioning possible developments in adapting the Abstract Algebraic Logic approach to logics induced by Nmatrices and the associated equational reasoning over non-deterministic algebras.
Recently, advancements in deep learning-based superpixel segmentation methods have brought about improvements in both the efficiency and the performance of segmentation. However, a significant challenge remains in generating superpixels that strictly adhere to object boundaries while conveying rich visual significance, especially when cross-surface color correlations may interfere with objects. Drawing inspiration from neural structure and visual mechanisms, we propose a biological network architecture comprising an Enhanced Screening Module (ESM) and a novel Boundary-Aware Label (BAL) for superpixel segmentation. The ESM enhances semantic information by simulating the interactive projection mechanisms of the visual cortex. Additionally, the BAL emulates the spatial frequency characteristics of visual cortical cells to facilitate the generation of superpixels with strong boundary adherence. We demonstrate the effectiveness of our approach through evaluations on both the BSDS500 dataset and the NYUv2 dataset.
We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajectories to compute, resulting in sluggish credit assignment issues due to use of entire trajectories and a learning signal present only at the terminal time. In this work, we present Diffusion Generative Flow Samplers (DGFS), a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments, via parameterizing an additional "flow function". Our method takes inspiration from the theory developed for generative flow networks (GFlowNets), allowing us to make use of intermediate learning signals and benefit from off-policy exploration capabilities. Through a variety of challenging experiments, we demonstrate that DGFS results in more accurate estimates of the normalization constant than closely-related prior methods.
We present a priori error estimates for a multirate time-stepping scheme for coupled differential equations. The discretization is based on Galerkin methods in time using two different time meshes for two parts of the problem. We aim at surface coupled multiphysics problems like two-phase flows. Special focus is on the handling of the interface coupling to guarantee a coercive formulation as key to optimal order error estimates. In a sequence of increasing complexity, we begin with the coupling of two ordinary differential equations, coupled heat conduction equation, and finally a coupled Stokes problem. For this we show optimal multi-rate estimates in velocity and a suboptimal result in pressure. The a priori estimates prove that the multirate method decouples the two subproblems exactly. This is the basis for adaptive methods which can choose optimal lattices for the respective subproblems.
Fitted finite element methods are constructed for a singularly perturbed convection-diffusion problem in two space dimensions. Exponential splines as basis functions are combined with Shishkin meshes to obtain a stable parameter-uniform numerical method. These schemes satisfy a discrete maximum principle. In the classical case, the numerical approximations converge, in the maximum pointwise norm, at a rate of second order and the approximations converge at a rate of first order for all values of the singular perturbation parameter.
When dealing with a parametric statistical model, a Riemannian manifold can naturally appear by endowing the parameter space with the Fisher information metric. The geometry induced on the parameters by this metric is then referred to as the Fisher-Rao information geometry. Interestingly, this yields a point of view that allows for leveragingmany tools from differential geometry. After a brief introduction about these concepts, we will present some practical uses of these geometric tools in the framework of elliptical distributions. This second part of the exposition is divided into three main axes: Riemannian optimization for covariance matrix estimation, Intrinsic Cram\'er-Rao bounds, and classification using Riemannian distances.
Semitopologies model consensus in distributed system by equating the notion of a quorum -- a set of participants sufficient to make local progress -- with that of an open set. This yields a topology-like theory of consensus, but semitopologies generalise topologies, since the intersection of two quorums need not necessarily be a quorum. The semitopological model of consensus is naturally heterogeneous and local, just like topologies can be heterogenous and local, and for the same reasons: points may have different quorums and there is no restriction that open sets / quorums be uniformly generated (e.g. open sets can be something other than two-thirds majorities of the points in the space). Semiframes are an algebraic abstraction of semitopologies. They are to semitopologies as frames are to topologies. We give a notion of semifilter, which plays a role analogous to filters, and show how to build a semiframe out of the open sets of a semitopology, and a semitopology out of the semifilters of a semiframe. We define suitable notions of category and morphism and prove a categorical duality between (sober) semiframes and (spatial) semitopologies, and investigate well-behavedness properties on semitopologies and semiframes across the duality. Surprisingly, the structure of semiframes is not what one might initially expect just from looking at semitopologies, and the canonical structure required for the duality result -- a compatibility relation *, generalising sets intersection -- is also canonical for expressing well-behavedness properties. Overall, we deliver a new categorical, algebraic, abstract framework within which to study consensus on distributed systems, and which is also simply interesting to consider as a mathematical theory in its own right.
It is well known since 1960s that by exploring the tensor product structure of the discrete Laplacian on Cartesian meshes, one can develop a simple direct Poisson solver with an $\mathcal O(N^{\frac{d+1}d})$ complexity in $d$-dimension. The GPU acceleration of numerically solving PDEs has been explored successfully around fifteen years ago and become more and more popular in the past decade, driven by significant advancement in both hardware and software technologies, especially in the recent few years. We present in this paper a simple but extremely fast MATLAB implementation on a modern GPU, which can be easily reproduced, for solving 3D Poisson type equations using a spectral-element method. In particular, it costs less than one second on a Nvidia A100 for solving a Poisson equation with one billion degree of freedoms.