A novel compressed matrix format is proposed that combines an adaptive hierarchical partitioning of the matrix with low-rank approximation. One typical application is the approximation of discretized functions on rectangular domains; the flexibility of the format makes it possible to deal with functions that feature singularities in small, localized regions. To deal with time evolution and relocation of singularities, the partitioning can be dynamically adjusted based on features of the underlying data. Our format can be leveraged to efficiently solve linear systems with Kronecker product structure, as they arise from discretized partial differential equations (PDEs). For this purpose, these linear systems are rephrased as linear matrix equations and a recursive solver is derived from low-rank updates of such equations. We demonstrate the effectiveness of our framework for stationary and time-dependent, linear and nonlinear PDEs, including the Burgers' and Allen-Cahn equations.
A wide variety of biological phenomena can be modeled by the collective activity of a population of individual units. A common strategy for simulating such a system, the population density approach, is to take the macroscopic limit and update its population density function. However, in many cases, the coupling between the units and noise gives rise to complex behaviors challenging to existing population density approach methods. To address these challenges, we develop the asymmetric particle population density (APPD) method that efficiently and accurately simulates such populations consist of coupled elements. The APPD is well-suited for a parallel implementation. We compare the performance of the method against direct Monte-Carlo simulation and verify its accuracy by applying it to the well-studied Hodgkin-Huxley model, with a range of challenging scenarios. We find that our method can accurately reproduce complex macroscopic behaviors such as inhibitory coupling-induced clustering and noise-induced firing while being faster than the direct simulation.
Runge-Kutta (RK) schemes, especially Gauss-Legendre and some other fully implicit RK (FIRK) schemes, are desirable for the time integration of parabolic partial differential equations due to their A-stability and high-order accuracy. However, it is significantly more challenging to construct optimal preconditioners for them compared to diagonally implicit RK (or DIRK) schemes. To address this challenge, we first introduce mathematically optimal preconditioners called block complex Schur decomposition (BCSD), block real Schur decomposition (BRSD), and block Jordan form (BJF), motivated by block-circulant preconditioners and Jordan form solution techniques for IRK. We then derive an efficient, near-optimal singly-diagonal approximate BRSD (SABRSD) by approximating the quasi-triangular matrix in real Schur decomposition using an optimized upper-triangular matrix with a single diagonal value. A desirable feature of SABRSD is that it has comparable memory requirements and factorization (or setup) cost as singly DIRK (SDIRK). We approximate the diagonal blocks in these preconditioning techniques using an incomplete factorization with (near) linear complexity, such as multilevel ILU, ILU(0), or a multigrid method with an ILU-based smoother. We apply the block preconditioners in right-preconditioned GMRES to solve the advection-diffusion equation in 3D using finite element and finite difference methods. We show that BCSD, BRSD, and BJF significantly outperform other preconditioners in terms of GMRES iterations, and SABRSD is competitive with them and the prior state of the art in terms of computational cost while requiring the least amount of memory.
This paper proposes a mesh-free computational framework and machine learning theory for solving elliptic PDEs on unknown manifolds, identified with point clouds, based on diffusion maps (DM) and deep learning. The PDE solver is formulated as a supervised learning task to solve a least-squares regression problem that imposes an algebraic equation approximating a PDE (and boundary conditions if applicable). This algebraic equation involves a graph-Laplacian type matrix obtained via DM asymptotic expansion, which is a consistent estimator of second-order elliptic differential operators. The resulting numerical method is to solve a highly non-convex empirical risk minimization problem subjected to a solution from a hypothesis space of neural-network type functions. In a well-posed elliptic PDE setting, when the hypothesis space consists of feedforward neural networks with either infinite width or depth, we show that the global minimizer of the empirical loss function is a consistent solution in the limit of large training data. When the hypothesis space is a two-layer neural network, we show that for a sufficiently large width, the gradient descent method can identify a global minimizer of the empirical loss function. Supporting numerical examples demonstrate the convergence of the solutions and the effectiveness of the proposed solver in avoiding numerical issues that hampers the traditional approach when a large data set becomes available, e.g., large matrix inversion.
Spectral methods include a family of algorithms related to the eigenvectors of certain data-generated matrices. In this work, we are interested in studying the geometric landscape of the eigendecomposition problem in various spectral methods. In particular, we first extend known results regarding the landscape at critical points to larger regions near the critical points in a special case of finding the leading eigenvector of a symmetric matrix. For a more general eigendecomposition problem, inspired by recent findings on the connection between the landscapes of empirical risk and population risk, we then build a novel connection between the landscape of an eigendecomposition problem that uses random measurements and the one that uses the true data matrix. We also apply our theory to a variety of low-rank matrix optimization problems and conduct a series of simulations to illustrate our theoretical findings.
In this work, we consider the linear inverse problem $y=Ax+\epsilon$, where $A\colon X\to Y$ is a known linear operator between the separable Hilbert spaces $X$ and $Y$, $x$ is a random variable in $X$ and $\epsilon$ is a zero-mean random process in $Y$. This setting covers several inverse problems in imaging including denoising, deblurring, and X-ray tomography. Within the classical framework of regularization, we focus on the case where the regularization functional is not given a priori but learned from data. Our first result is a characterization of the optimal generalized Tikhonov regularizer, with respect to the mean squared error. We find that it is completely independent of the forward operator $A$ and depends only on the mean and covariance of $x$. Then, we consider the problem of learning the regularizer from a finite training set in two different frameworks: one supervised, based on samples of both $x$ and $y$, and one unsupervised, based only on samples of $x$. In both cases, we prove generalization bounds, under some weak assumptions on the distribution of $x$ and $\epsilon$, including the case of sub-Gaussian variables. Our bounds hold in infinite-dimensional spaces, thereby showing that finer and finer discretizations do not make this learning problem harder. The results are validated through numerical simulations.
This paper considers the decentralized composite optimization problem. We propose a novel decentralized variance-reduction proximal-gradient algorithmic framework, called PMGT-VR, which is based on a combination of several techniques including multi-consensus, gradient tracking, and variance reduction. The proposed framework relies on an imitation of centralized algorithms and we demonstrate that algorithms under this framework achieve convergence rates similar to that of their centralized counterparts. We also describe and analyze two representative algorithms, PMGT-SAGA and PMGT-LSVRG, and compare them to existing state-of-the-art proximal algorithms. To the best of our knowledge, PMGT-VR is the first linearly convergent decentralized stochastic algorithm that can solve decentralized composite optimization problems. Numerical experiments are provided to demonstrate the effectiveness of the proposed algorithms.
The need to understand the role of statistical methods for the forecasting of climatological parameters cannot be trivialized. This study gives an in depth review on the different variations of the Mann-Kendall (M-K) trend test and how they can be applied, regression techniques (Simple and Multiple), the Angstrom-Prescott model for solar radiation, etc. The study then goes ahead to apply some of them with data obtained from the Nigerian Meteorological Agency (NiMet), and applying tools like the python programming language and Wolfram Mathematica. Results show that the maximum ambient temperature for Calabar is increasing (Z=2.52) significantly after the calculated p-value < 0.05 (significant level). The seasonal M-K test was also applied for the dry and wet seasons and both were found to be increasing (Z=3.23 and Z=4.04 respectively) after their calculated p-values < 0.05. The relationship between refractivity and other meteorological parameters relating to it was discerned using partial differential equations giving the gradient of each with refractivity; this was compared with results from the correlation matrix to show that the water vapour contents of the atmosphere contributes significantly to the variation of refractivity. Multiple linear regression has also been adopted to give an accurate model for the prediction of refractivity in the region after the residual error between the calculated refractivity and predicted refractivity was minimal.
Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a simple but realistic ``expansion'' assumption, which states that a low-probability subset of the data must expand to a neighborhood with large probability relative to the subset. We also assume that neighborhoods of examples in different classes have minimal overlap. We prove that under these assumptions, the minimizers of population objectives based on self-training and input-consistency regularization will achieve high accuracy with respect to ground-truth labels. By using off-the-shelf generalization bounds, we immediately convert this result to sample complexity guarantees for neural nets that are polynomial in the margin and Lipschitzness. Our results help explain the empirical successes of recently proposed self-training algorithms which use input consistency regularization.
In this paper we present an approach to determine the smallest possible number of neurons in a layer of a neural network in such a way that the topology of the input space can be learned sufficiently well. We introduce a general procedure based on persistent homology to investigate topological invariants of the manifold on which we suspect the data set. We specify the required dimensions precisely, assuming that there is a smooth manifold on or near which the data are located. Furthermore, we require that this space is connected and has a commutative group structure in the mathematical sense. These assumptions allow us to derive a decomposition of the underlying space whose topology is well known. We use the representatives of the $k$-dimensional homology groups from the persistence landscape to determine an integer dimension for this decomposition. This number is the dimension of the embedding that is capable of capturing the topology of the data manifold. We derive the theory and validate it experimentally on toy data sets.
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.