The linear varying coefficient model posits a linear relationship between an outcome and covariates in which the covariate effects are modeled as functions of additional effect modifiers. Despite a long history of study and use in statistics and econometrics, state-of-the-art varying coefficient modeling methods cannot accommodate multivariate effect modifiers without imposing restrictive functional form assumptions or involving computationally intensive hyperparameter tuning. In response, we introduce VCBART, which flexibly estimates the covariate effects in a varying coefficient model using Bayesian Additive Regression Trees. With simple default settings, VCBART outperforms existing varying coefficient methods in terms of covariate effect estimation, uncertainty quantification, and outcome prediction. We illustrate the utility of VCBART with two case studies: one examining how the association between later-life cognition and measures of socioeconomic position varies with respect to age and socio-demographics, and another estimating how temporal trends in urban crime vary at the neighborhood level. An R package implementing VCBART is available at https://github.com/skdeshpande91/VCBART
In Gaussian graphical models, the likelihood equations must typically be solved iteratively. We investigate two algorithms: a version of iterative proportional scaling which avoids inversion of large matrices, and an algorithm based on convex duality that operates on the covariance matrix by neighbourhood coordinate descent, corresponding to the graphical lasso with zero penalty. For large, sparse graphs, the iterative proportional scaling algorithm appears feasible and has simple convergence properties. The algorithm based on neighbourhood coordinate descent is extremely fast and less dependent on sparsity, but needs a positive definite starting value to converge. We give an algorithm for finding such a starting value for graphs with low colouring number. As a consequence, we also obtain a simplified proof of the existence of the maximum likelihood estimator in such cases.
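As a rough illustration of the first algorithm family mentioned in the abstract above, the following is a minimal sketch of textbook iterative proportional scaling for a Gaussian graphical model. It is not the authors' variant: it inverts the full concentration matrix at every step, which is exactly the cost their version is said to avoid. The function name `ips_gaussian`, the demo matrix `S_demo`, and the 4-cycle clique list are assumptions made for this example.

```python
import numpy as np

def ips_gaussian(S, cliques, n_sweeps=500, tol=1e-12):
    """Textbook IPS sketch (hypothetical helper, not the paper's algorithm).

    S       : sample covariance matrix (p x p, positive definite)
    cliques : list of index lists that together cover all vertices
    Returns the fitted concentration matrix K.
    """
    # Diagonal start: positive definite, with zeros for every non-edge.
    K = np.diag(1.0 / np.diag(S))
    for _ in range(n_sweeps):
        max_change = 0.0
        for c in cliques:
            idx = np.ix_(c, c)
            Sigma = np.linalg.inv(K)  # full inversion: the expensive step
            # Adjust the clique block so that Sigma_cc matches S_cc exactly.
            update = np.linalg.inv(S[idx]) - np.linalg.inv(Sigma[idx])
            K[idx] += update
            max_change = max(max_change, np.abs(update).max())
        if max_change < tol:
            break
    return K

# Demo on a 4-cycle 0-1-2-3-0 (edges are the cliques); values are made up.
S_demo = np.array([
    [2.0, 0.6, 0.3, 0.5],
    [0.6, 1.5, 0.4, 0.2],
    [0.3, 0.4, 1.8, 0.7],
    [0.5, 0.2, 0.7, 1.6],
])
cliques_demo = [[0, 1], [1, 2], [2, 3], [0, 3]]
K_demo = ips_gaussian(S_demo, cliques_demo)
Sigma_demo = np.linalg.inv(K_demo)
```

At convergence, the fitted covariance agrees with the sample covariance on every clique, while the entries of `K_demo` for the non-edges (0, 2) and (1, 3) stay at zero because no update ever touches them.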
Singularly perturbed boundary value problems pose a significant challenge for numerical approximation because of the presence of sharp boundary layers. These sharp boundary layers are responsible for the stiffness of solutions, which leads to large computational errors if not properly handled. It is well known that classical numerical methods as well as Physics-Informed Neural Networks (PINNs) require special treatment near the boundary, e.g., extensive mesh refinement or finer collocation points, in order to obtain an accurate approximate solution, especially inside the stiff boundary layer. In this article, we modify the PINNs and construct our new semi-analytic SL-PINNs suitable for singularly perturbed boundary value problems. Performing the boundary layer analysis, we first find the corrector functions describing the singular behavior of the stiff solutions inside boundary layers. Then we obtain the SL-PINN approximations of the singularly perturbed problems by embedding the explicit correctors in the structure of PINNs or by training the correctors together with the PINN approximations. Our numerical experiments confirm that our new SL-PINN methods produce stable and accurate approximations for stiff solutions.
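To make the role of a corrector function concrete, here is a minimal sketch on a model problem chosen for this example (it is not taken from the article): the reaction-diffusion equation $-\varepsilon^2 u'' + u = 1$ on $(0,1)$ with $u(0)=u(1)=0$. The smooth outer solution is $u^0 = 1$, and the boundary layers are captured by the explicit correctors $-e^{-x/\varepsilon}$ and $-e^{-(1-x)/\varepsilon}$. No neural network is involved; the sketch only compares the exact solution with the outer solution, with and without correctors. All function names are hypothetical.

```python
import math

def exact_solution(x, eps):
    # Closed-form solution of -eps^2 u'' + u = 1 with u(0) = u(1) = 0.
    return 1.0 - math.cosh((x - 0.5) / eps) / math.cosh(0.5 / eps)

def corrected_approximation(x, eps):
    outer = 1.0                            # limit solution away from the boundary
    left = -math.exp(-x / eps)             # corrector for the layer at x = 0
    right = -math.exp(-(1.0 - x) / eps)    # corrector for the layer at x = 1
    return outer + left + right

eps = 0.05
grid = [i / 1000 for i in range(1001)]

# Maximum error of the outer solution alone, and of outer + correctors.
err_outer = max(abs(exact_solution(x, eps) - 1.0) for x in grid)
err_corrected = max(
    abs(exact_solution(x, eps) - corrected_approximation(x, eps)) for x in grid
)
```

On this model problem the outer solution alone is off by order one at the boundary, whereas adding the correctors leaves only an exponentially small remainder of order $e^{-1/\varepsilon}$. This is the kind of singular behavior that, per the abstract, the SL-PINN construction builds into the approximation.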
The accurate and efficient evaluation of Newtonian potentials over general 2-D domains is important for the numerical solution of Poisson's equation and volume integral equations. In this paper, we present a simple and efficient high-order algorithm for computing the Newtonian potential over a planar domain discretized by an unstructured mesh. The algorithm is based on the use of Green's third identity for transforming the Newtonian potential into a collection of layer potentials over the boundaries of the mesh elements, which can be easily evaluated by the Helsing-Ojala method. One important component of our algorithm is the use of high-order (up to order 20) bivariate polynomial interpolation in the monomial basis, for which we provide extensive justification. The performance of our algorithm is illustrated through several numerical experiments.
We present a new stability and error analysis of fully discrete approximation schemes for the transient Stokes equation. For the spatial discretization, we consider a wide class of Galerkin finite element methods which includes both inf-sup stable spaces and symmetric pressure stabilized formulations. We extend the results from Burman and Fern\'andez [\textit{SIAM J. Numer. Anal.}, 47 (2009), pp. 409-439] and provide a unified theoretical analysis of backward difference formulae (BDF methods) of order 1 to 6. The main novelty of our approach lies in the use of Dahlquist's G-stability concept together with multiplier techniques introduced by Nevanlinna-Odeh and recently by Akrivis et al. [\textit{SIAM J. Numer. Anal.}, 59 (2021), pp. 2449-2472] to derive optimal stability and error estimates for both the velocity and the pressure. When combined with a method-dependent Ritz projection for the initial data, unconditional stability can be shown, while for arbitrary interpolation, pressure stability is subordinate to the fulfillment of a mild inverse CFL-type condition between space and time discretizations.
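For readers unfamiliar with the BDF family referred to above, the following is a minimal sketch of BDF methods of order 1 to 6 applied to the scalar test equation $y' = \lambda y$. This is a stand-in chosen for brevity, not the transient Stokes discretization analyzed in the paper. The coefficients are generated from the standard backward-difference form $\sum_{j=1}^{k} \frac{1}{j}\nabla^j y_n = h f(t_n, y_n)$; the function names and the use of exact starting values are assumptions of this example.

```python
from fractions import Fraction
from math import comb, exp

def bdf_coefficients(k):
    """Return delta with sum_i delta[i] * y_{n-i} = h * f(t_n, y_n) for BDF-k."""
    delta = [Fraction(0)] * (k + 1)
    for j in range(1, k + 1):
        for i in range(j + 1):
            # j-th backward difference: sum_i (-1)^i C(j, i) y_{n-i}, weighted by 1/j
            delta[i] += Fraction((-1) ** i * comb(j, i), j)
    return delta

def bdf_linear(k, lam, h, n_steps):
    """Integrate y' = lam * y, y(0) = 1, with BDF-k and exact starting values."""
    delta = [float(d) for d in bdf_coefficients(k)]
    y = [exp(lam * i * h) for i in range(k)]  # assumption: exact start-up values
    for n in range(k, n_steps + 1):
        history = sum(delta[i] * y[n - i] for i in range(1, k + 1))
        # Implicit step: (delta[0] - h * lam) * y_n = -history
        y.append(-history / (delta[0] - h * lam))
    return y

def error_at_one(k, n_steps):
    """Absolute error at t = 1 for y' = -y using n_steps uniform steps."""
    y = bdf_linear(k, -1.0, 1.0 / n_steps, n_steps)
    return abs(y[-1] - exp(-1.0))
```

For instance, `bdf_coefficients(2)` yields the familiar weights 3/2, -2, 1/2, and halving the step size in `error_at_one(2, ...)` should reduce the error by roughly a factor of four, consistent with second-order accuracy.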
Finite-dimensional truncations are routinely used to approximate partial differential equations (PDEs), either to obtain numerical solutions or to derive reduced-order models. The resulting discretized equations are known to violate certain physical properties of the system. In particular, first integrals of the PDE may not remain invariant after discretization. Here, we use the method of reduced-order nonlinear solutions (RONS) to ensure that the conserved quantities of the PDE survive its finite-dimensional truncation. In particular, we develop two methods: Galerkin RONS and finite volume RONS. Galerkin RONS ensures the conservation of first integrals in Galerkin-type truncations, whether used for direct numerical simulations or reduced-order modeling. Similarly, finite volume RONS conserves any number of first integrals of the system, including its total energy, after finite volume discretization. Both methods are applicable to general time-dependent PDEs and can be easily incorporated in existing Galerkin-type or finite volume code. We demonstrate the efficacy of our methods on two examples: direct numerical simulations of the shallow water equation and a reduced-order model of the nonlinear Schrödinger equation. As a byproduct, we also generalize RONS to phenomena described by a system of PDEs.
The one-to-one mapping of control inputs to actuator outputs results in elaborate routing architectures that limit how complex fluidic soft robot behaviours can currently become. Embodied intelligence can be used as a tool to counteract this phenomenon. Control functionality can be embedded directly into actuators by leveraging the characteristics of fluid flow phenomena. Whilst prior soft robotics work has focused exclusively on actuators operating in a state of transient/no flow (constant pressure), or pulsatile/alternating flow, our work begins to explore the possibilities granted by operating in the closed-loop flow recirculation regime. Here we introduce the concept of FlowBots: soft robots that utilise the characteristics of continuous fluid flow to enable the embodiment of complex control functionality directly into the structure of the robot. FlowBots have robust, integrated, no-moving-part control systems, and these architectures enable monolithic additive manufacturing methods, rapid prototyping, greater sustainability, and an expansive range of applications. Using three FlowBot examples (a bidirectional actuator, a gripper, and a quadruped swimmer), we demonstrate how the characteristics of flow recirculation contribute to simplifications in fluidic analogue control architectures. We conclude by outlining our design and rapid prototyping methodology to empower others in the field to explore this emerging design space and design their own FlowBots.
Most characterizations of probability distributions are based on properties of functions of possibly independent random variables. We investigate characterizations of probability distributions through properties of minima or maxima of max-independent, min-independent and quasi-independent random variables, generalizing results for independent random variables due to Kotlarski (1978), Prakasa Rao (1992) and Klebanov (1973).
This work puts forth low-complexity Riemannian subspace descent algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold. Unlike existing Riemannian gradient descent variants, the proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix. The resulting updates avoid costly matrix operations such as matrix exponentiation and dense matrix multiplication, which are generally required in almost all other Riemannian optimization algorithms on the SPD manifold. We further identify a broad class of functions, arising in diverse applications such as kernel matrix learning, covariance estimation of Gaussian distributions, maximum likelihood parameter estimation of elliptically contoured distributions, and parameter estimation in Gaussian mixture model problems, over which the Riemannian gradients can be calculated efficiently. The proposed uni-directional and multi-directional Riemannian subspace descent variants incur per-iteration complexities of $\mathcal{O}(n)$ and $\mathcal{O}(n^2)$ respectively, as compared to the $\mathcal{O}(n^3)$ or higher complexity incurred by all existing Riemannian gradient descent variants. The superior runtime and low per-iteration complexity of the proposed algorithms are also demonstrated via numerical tests on large-scale covariance estimation and matrix square root problems.
Strong spatial mixing (SSM) is an important quantitative notion of correlation decay for Gibbs distributions arising in statistical physics, probability theory, and theoretical computer science. A longstanding conjecture is that the uniform distribution on proper $q$-colorings on a $\Delta$-regular tree exhibits SSM whenever $q \ge \Delta+1$. Moreover, it is widely believed that as long as SSM holds on bounded-degree trees with $q$ colors, one would obtain an efficient sampler for $q$-colorings on all bounded-degree graphs via simple Markov chain algorithms. It is surprising that such a basic question is still open, even on trees, but then again it also highlights how much we still have to learn about random colorings. In this paper, we show the following: (1) For any $\Delta \ge 3$, SSM holds for random $q$-colorings on trees of maximum degree $\Delta$ whenever $q \ge \Delta + 3$. Thus we almost fully resolve the aforementioned conjecture. Our result substantially improves upon the previous best bound, which requires $q \ge 1.59\Delta+\gamma^*$ for an absolute constant $\gamma^* > 0$. (2) For any $\Delta\ge 3$ and girth $g = \Omega_\Delta(1)$, we establish optimal mixing of the Glauber dynamics for $q$-colorings on graphs of maximum degree $\Delta$ and girth $g$ whenever $q \ge \Delta+3$. Our approach is based on a new general reduction from spectral independence on large-girth graphs to SSM on trees that is of independent interest. Using the same techniques, we also prove near-optimal bounds on weak spatial mixing (WSM), a notion closely related to SSM, for the antiferromagnetic Potts model on trees.
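The Glauber dynamics mentioned above is the standard single-site Markov chain on proper colorings: pick a uniformly random vertex and recolor it with a color chosen uniformly among those not used by its neighbors. The sketch below illustrates one common formulation on a small example. The graph (a 5-cycle, so $\Delta = 2$), the choice $q = 5$, and all names are assumptions of this example, and nothing here reflects the paper's analysis of mixing times.

```python
import random

def glauber_step(graph, coloring, q, rng):
    """One heat-bath Glauber update on a proper q-coloring (modified in place)."""
    v = rng.choice(list(graph))                    # uniformly random vertex
    forbidden = {coloring[u] for u in graph[v]}    # colors used by neighbors
    allowed = [c for c in range(q) if c not in forbidden]
    coloring[v] = rng.choice(allowed)              # non-empty whenever q >= Delta + 1

def is_proper(graph, coloring):
    """Check that no edge joins two vertices of the same color."""
    return all(coloring[v] != coloring[u] for v in graph for u in graph[v])

# Demo: 5-cycle as an adjacency dict, with q = Delta + 3 = 5 colors.
n, q = 5, 5
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
coloring = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}         # a proper starting coloring
rng = random.Random(0)
for _ in range(1000):
    glauber_step(cycle, coloring, q, rng)
```

Each update preserves properness, so the chain stays inside the set of proper $q$-colorings; how many such steps are needed to approach the uniform distribution is the mixing question the abstract addresses.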
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.