Heterogeneity is a dominant factor in the behaviour of many biological processes. Despite this, it is common for mathematical and statistical analyses to ignore biological heterogeneity as a source of variability in experimental data. Therefore, methods for exploring the identifiability of models that explicitly incorporate heterogeneity through variability in model parameters are relatively underdeveloped. We develop a new likelihood-based framework, based on moment matching, for inference and identifiability analysis of differential equation models that capture biological heterogeneity through parameters that vary according to probability distributions. As our novel method is based on an approximate likelihood function, it is highly flexible; we demonstrate identifiability analysis using both a frequentist approach based on profile likelihood and a Bayesian approach based on Markov chain Monte Carlo. Through three case studies, we provide a didactic guide to inference and identifiability analysis of hyperparameters that relate to the statistical moments of model parameters, using independently observed data. Our approach has a computational cost comparable to that of analyses which neglect heterogeneity, a significant improvement over many existing alternatives. We demonstrate how analysis of random parameter models can aid better understanding of the sources of heterogeneity in biological data.
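To make the moment-matching idea concrete, the following is a minimal Python sketch under assumptions not taken from the abstract: a hypothetical logistic growth ODE whose rate parameter $r$ is normally distributed, with the induced mean and variance of the state estimated by Monte Carlo and plugged into a normal approximate likelihood over the hyperparameters $(\mu_r, \sigma_r)$.

```python
# Minimal sketch of a moment-matching approximate likelihood for a random-parameter ODE.
# Hypothetical model (not from the paper): logistic growth dx/dt = r*x*(1 - x/K),
# with heterogeneity entering through r ~ Normal(mu_r, sigma_r^2).
import numpy as np
from scipy.integrate import solve_ivp

def simulate_moments(mu_r, sigma_r, t_obs, K=100.0, x0=5.0, n_mc=200, seed=0):
    """Monte Carlo estimate of the mean/variance of x(t) induced by the random rate r."""
    rng = np.random.default_rng(seed)
    sols = np.array([
        solve_ivp(lambda t, x: r * x * (1.0 - x / K), (0.0, t_obs[-1]), [x0],
                  t_eval=t_obs).y[0]
        for r in rng.normal(mu_r, sigma_r, n_mc)
    ])
    return sols.mean(axis=0), sols.var(axis=0)

def neg_log_likelihood(theta, t_obs, data, sigma_obs=2.0):
    """Approximate likelihood: observations ~ Normal(moment-matched mean, total variance)."""
    mu_r, sigma_r = theta
    m, v = simulate_moments(mu_r, abs(sigma_r), t_obs)
    total_var = v + sigma_obs**2          # parameter-induced variance + measurement noise
    return 0.5 * np.sum(np.log(2.0 * np.pi * total_var) + (data - m) ** 2 / total_var)

# Profiling mu_r: fix it on a grid and minimise neg_log_likelihood over sigma_r at each point.
```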
In this work, we give efficient algorithms for privately estimating a Gaussian distribution in both the pure and approximate differential privacy (DP) models with optimal dependence on the dimension in the sample complexity. In the pure DP setting, we give an efficient algorithm that estimates an unknown $d$-dimensional Gaussian distribution up to an arbitrarily small total variation error using $\widetilde{O}(d^2 \log \kappa)$ samples while tolerating a constant fraction of adversarial outliers. Here, $\kappa$ is the condition number of the target covariance matrix. The sample bound matches that of the best non-private estimators in the dependence on the dimension (up to a polylogarithmic factor). We prove a new lower bound on differentially private covariance estimation to show that the dependence on the condition number $\kappa$ in the above sample bound is also tight. Prior to our work, only identifiability results (yielding inefficient super-polynomial-time algorithms) were known for the problem. In the approximate DP setting, we give an efficient algorithm to estimate an unknown Gaussian distribution up to an arbitrarily small total variation error using $\widetilde{O}(d^2)$ samples while tolerating a constant fraction of adversarial outliers. Prior to our work, all efficient approximate DP algorithms incurred a super-quadratic sample cost or were not outlier-robust. For the special case of mean estimation, our algorithm achieves the optimal sample complexity of $\widetilde O(d)$, improving on the $\widetilde O(d^{1.5})$ bound from prior work. Our pure DP algorithm relies on a recursive private preconditioning subroutine that utilizes recent work on private mean estimation [Hopkins et al., 2022]. Our approximate DP algorithms are based on a substantial upgrade of the method of stabilizing convex relaxations introduced in [Kothari et al., 2022].
Recombination is a fundamental evolutionary force, but it is difficult to quantify because the effect of a recombination event on patterns of variation in a sample of genetic data can be hard to discern. Estimators for the recombination rate, which are usually based on the idea of integrating over the unobserved possible evolutionary histories of a sample, can therefore be noisy. Here we consider a related question: how would an estimator behave if the evolutionary history actually was observed? This would offer an upper bound on the performance of estimators used in practice. In this paper we derive an expression for the maximum likelihood estimator for the recombination rate based on a continuously observed, multi-locus, Wright--Fisher diffusion of haplotype frequencies, complementing existing work for an estimator of selection. We show that, contrary to selection, the estimator has unusual properties because the observed information matrix can explode in finite time whereupon the recombination parameter is learned without error. We also show that the recombination estimator is robust to the presence of selection in the sense that incorporating selection into the model leaves the estimator unchanged. We study the properties of the estimator by simulation and show that its distribution can be quite sensitive to the underlying mutation rates.
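As a concrete companion to the simulation study, here is a minimal sketch, not from the paper, of the kind of evolutionary process the estimator is defined over: a discrete two-locus Wright--Fisher model with recombination rate $r$, in which recombination pulls haplotype frequencies against the linkage disequilibrium $D$ and multinomial resampling supplies genetic drift.

```python
# Hedged sketch: two-locus Wright-Fisher model with recombination. Haplotype order is
# (AB, Ab, aB, ab); D = f_AB*f_ab - f_Ab*f_aB is the linkage disequilibrium.
import numpy as np

def wright_fisher_recomb(f0, r, N, generations, seed=0):
    rng = np.random.default_rng(seed)
    f = np.asarray(f0, dtype=float)
    traj = [f.copy()]
    for _ in range(generations):
        D = f[0] * f[3] - f[1] * f[2]
        p = f + r * np.array([-D, D, D, -D])      # deterministic recombination pull
        f = rng.multinomial(2 * N, p) / (2 * N)   # genetic drift: multinomial resampling
        traj.append(f.copy())
    return np.array(traj)

traj = wright_fisher_recomb([0.25, 0.25, 0.25, 0.25], r=0.01, N=500, generations=200)
```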
Matrix splitting iteration methods play a vital role in solving large sparse linear systems. Their performance heavily depends on the splitting parameters, yet approaches for selecting optimal splitting parameters remain underdeveloped. In this paper, we present a multitask kernel-learning parameter prediction method to automatically obtain relatively optimal splitting parameters, which combines simultaneous prediction of multiple parameters with data-driven kernel learning. For solving time-dependent linear systems, including linear differential systems and linear matrix systems, we give a new matrix splitting Kronecker product method, together with its convergence analysis and preconditioning strategy. Numerical results illustrate that our methods can save an enormous amount of time in selecting relatively optimal splitting parameters compared with existing methods. Moreover, used as a preconditioner, our iteration method effectively accelerates GMRES. As the dimension of the systems increases, all the advantages of our approaches become more significant. In particular, for solving the differential Sylvester matrix equation, the speedup ratio can reach tens to hundreds of times when the scale of the system is larger than one hundred thousand.
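For intuition about why the splitting parameter matters, here is a small illustrative sketch (not the paper's method): a damped Jacobi splitting $A = M(\omega) - N(\omega)$ with $M = D/\omega$, where a naive grid search over $\omega$ stands in for the kernel-learning parameter prediction.

```python
# Illustrative sketch: a parameterised splitting A = M(w) - N(w) with M = D/w
# (damped Jacobi), showing how convergence hinges on the splitting parameter.
import numpy as np

def splitting_iteration(A, b, w, tol=1e-10, max_iter=5000):
    D = np.diag(np.diag(A))
    M = D / w                              # splitting A = M - N
    N = M - A
    x = np.zeros_like(b)
    for k in range(max_iter):
        x_new = np.linalg.solve(M, N @ x + b)
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

# A naive grid search stands in for the paper's kernel-learning parameter prediction:
A = np.diag(4.0 * np.ones(100)) + np.diag(-np.ones(99), 1) + np.diag(-np.ones(99), -1)
b = np.ones(100)
iters = {w: splitting_iteration(A, b, w)[1] for w in np.linspace(0.2, 1.2, 11)}
best_w = min(iters, key=iters.get)         # "optimal" parameter on this grid
```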
Exponential family models, generalized linear models (GLMs), generalized linear mixed models (GLMMs) and generalized additive models (GAMs) are widely used methods in statistics. However, many scientific applications necessitate that constraints, such as shape and linear inequality constraints, be placed on model parameters. Constrained estimation and inference remains a pervasive problem in statistics, where many methods rely on modifying rigid large-sample-theory assumptions for inference. We propose a flexible slice sampler Gibbs algorithm for Bayesian GLMMs and GAMs with linear inequality and shape constraints. We prove that our posterior samples follow a Markov chain central limit theorem (CLT) by establishing uniform ergodicity of our Markov chain and the existence of a moment generating function for our posterior distributions. We use our CLT results to derive joint bands and multiplicity-adjusted Bayesian inference for nonparametric functional effects. Our rigorous CLT results address a shortcoming in the literature by obtaining valid estimation and inference on constrained parameters in finite-sample settings. Our algorithmic and proof techniques are adaptable to a myriad of important statistical modeling problems. We apply our Bayesian GAM to a real data analysis involving proportional odds regression for concussion recovery in children, with shape constraints and smoothed nonparametric effects. We obtain multiplicity-adjusted inference on the monotonic nonparametric time effect to elucidate recovery trends in children as a function of time.
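As a building-block illustration, here is a minimal sketch, assuming a univariate target with a nonnegativity constraint, of the slice sampling step that a slice-within-Gibbs scheme of this kind would repeat coordinate by coordinate; it is not the paper's algorithm.

```python
# Hedged sketch: one-dimensional slice sampling under a linear inequality constraint
# (here theta >= 0), using the standard step-out and shrinkage procedure.
import numpy as np

def slice_sample(logpdf, x0, n, width=1.0, lower=0.0, seed=0):
    rng = np.random.default_rng(seed)
    samples, x = [], x0
    for _ in range(n):
        logu = logpdf(x) + np.log(rng.uniform())         # auxiliary "slice" height
        lo, hi = max(lower, x - width * rng.uniform()), x + width
        while logpdf(lo) > logu and lo > lower:           # step out (respecting constraint)
            lo = max(lower, lo - width)
        while logpdf(hi) > logu:
            hi += width
        while True:                                       # shrinkage sampling on the slice
            x_new = rng.uniform(lo, hi)
            if logpdf(x_new) > logu:
                x = x_new
                break
            lo, hi = (x_new, hi) if x_new < x else (lo, x_new)
        samples.append(x)
    return np.array(samples)

# Example: sample a half-normal target log p(x) = -x^2/2 on x >= 0.
draws = slice_sample(lambda x: -0.5 * x * x if x >= 0 else -np.inf, x0=1.0, n=2000)
```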
We consider the problem of estimating the interaction neighborhood of a Markov Random Field model with finite support and homogeneous pairwise interactions based on relative positions in a two-dimensional lattice. Using a Bayesian framework, we propose a Reversible Jump Markov Chain Monte Carlo algorithm that jumps across subsets of a maximal-range neighborhood, allowing us to perform model selection based on a marginal pseudoposterior distribution over models. To show the strength of our proposed methodology, we perform a simulation study and apply the method to a real dataset from discrete texture image analysis.
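For concreteness, here is a hedged sketch of the pseudolikelihood on which a marginal pseudoposterior can be built, assuming an Ising-type model with homogeneous pairwise interactions and periodic boundaries; the neighborhood is passed as a list of relative positions, mirroring the model-selection target.

```python
# Hedged sketch: log-pseudolikelihood of an Ising-type MRF on a 2-D lattice. The
# conditional of each site given its neighbours factorises into local logistic terms.
import numpy as np

def log_pseudolikelihood(x, theta, neighborhood):
    """x: 2-D array of +/-1 spins; neighborhood: list of relative positions (di, dj),
    one interaction parameter theta[k] per position (homogeneous pairwise model)."""
    n, m = x.shape
    lpl = 0.0
    for i in range(n):
        for j in range(m):
            field = sum(
                t * (x[(i + di) % n, (j + dj) % m] + x[(i - di) % n, (j - dj) % m])
                for t, (di, dj) in zip(theta, neighborhood)
            )
            lpl += -np.logaddexp(0.0, -2.0 * x[i, j] * field)  # log sigmoid(2*x_ij*field)
    return lpl
```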
In this paper, we study non-monotone DR-submodular function maximization over the integer lattice. A submodular property, analogous to the submodularity of set functions, has been defined for functions over the integer lattice; DR-submodularity is a further extension of this concept that captures the diminishing return property. Such functions find many applications in machine learning, social networks, wireless networks, etc. Techniques for submodular set function maximization can be applied to DR-submodular function maximization; e.g., the double greedy algorithm has a $1/2$-approximation ratio with running time $O(nB)$, where $n$ is the size of the ground set and $B$ is the integer bound of a coordinate. In our study, we design a $1/2$-approximate binary search double greedy algorithm, and we prove that its time complexity is $O(n\log B)$, which significantly improves the running time. Specifically, we consider its application to the profit maximization problem in social networks with a bipartite model. The goal of this problem is to maximize the net profit gained from a product-promoting activity, namely the difference between the influence gain and the promoting cost. We prove that the objective function is DR-submodular over the integer lattice, apply the binary search double greedy algorithm to this problem, and verify its effectiveness.
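The key idea behind the $O(n\log B)$ bound is that DR-submodularity makes the single-coordinate marginal gain monotone in the level, so the right level can be located by binary search. The following is a hedged sketch of that level-selection step (a simplification; the full double greedy also maintains an upper solution and compares both gains).

```python
# Hedged sketch: DR-submodularity makes g_i(t) = f(x with x_i=t+1) - f(x with x_i=t)
# non-increasing in t, so each coordinate's stopping level needs O(log B) evaluations.
def best_level(f, x, i, lo, hi):
    """Smallest level t in [lo, hi] at which raising x_i from t to t+1 stops paying off."""
    def gain(t):                          # marginal value of one more unit at level t
        x_lo, x_hi = list(x), list(x)
        x_lo[i], x_hi[i] = t, t + 1
        return f(x_hi) - f(x_lo)
    while lo < hi:
        mid = (lo + hi) // 2
        if gain(mid) >= 0:
            lo = mid + 1                  # still profitable at mid: level lies above
        else:
            hi = mid
    return lo

def greedy_pass(f, n, B):
    x = [0] * n
    for i in range(n):                    # simplified single-solution pass
        x[i] = best_level(f, x, i, 0, B)
    return x

# Example: a separable concave objective over {0,...,B}^n is DR-submodular.
f = lambda x: sum(10 * v - v * v for v in x)
print(greedy_pass(f, n=3, B=8))           # -> [5, 5, 5] (each coordinate's peak)
```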
In this article we present a numerical analysis for a third-order differential equation with non-periodic boundary conditions and time-dependent coefficients, namely, the linear Korteweg-de Vries Burgers equation. This numerical analysis is motivated by the dispersive and dissipative phenomena that govern this kind of equation. The work builds on previous methods for dispersive equations with constant coefficients, expanding the field to a new class of equations with time-evolving parameters that has until now eluded analysis. More precisely, through the Legendre-Petrov-Galerkin method we prove stability and convergence results for the approximation in appropriate weighted Sobolev spaces. These results allow us to show the role and trade-off of these temporal parameters in the model. Afterwards, we numerically investigate the dispersion-dissipation relation for several profiles and provide insights into the implementation of the method, exhibiting the accuracy and efficiency of our numerical algorithms.
The anisotropic diffusion equation is central to understanding cosmic ray (CR) diffusion across the Galaxy and the heliosphere, as well as its interplay with the ambient magnetic field. This diffusion term contributes to the highly stiff nature of the CR transport equation. In order to conduct numerical simulations of time-dependent cosmic ray transport, implicit integrators have traditionally been favoured over CFL-bound explicit integrators because they can take large step sizes. We propose exponential methods that directly compute the exponential of the matrix to solve the linear anisotropic diffusion equation. These methods allow us to take even larger step sizes; in certain cases, we are able to choose a step size as large as the simulation time, i.e., only one time step. This can substantially speed up the simulations whilst generating highly accurate solutions ($\ell_2$ error $\leq 10^{-10}$). Additionally, we test an approach based on extracting a constant diffusion coefficient from the anisotropic diffusion equation, where the constant-coefficient term is solved implicitly or exponentially and the remainder is treated using some explicit method. We find that this approach, for homogeneous linear problems, is unable to improve on the exponential-based methods that directly evaluate the matrix exponential.
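As a minimal illustration of the direct approach, the sketch below advances a discretised linear diffusion equation $du/dt = Au$ in a single step via $u(T) = e^{TA}u_0$; a 1-D Laplacian stands in for the anisotropic operator, and all parameter values are placeholders.

```python
# Minimal sketch: one-step solution of du/dt = A u via the matrix exponential,
# as an alternative to CFL-bound explicit stepping for stiff diffusion.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import expm_multiply

n, D, T = 200, 1.0, 0.1
dx = 1.0 / n
lap = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / dx**2
A = D * lap                                  # diffusion operator; anisotropy would enter here
u0 = np.exp(-((np.linspace(0, 1, n) - 0.5) ** 2) / 0.01)
u_T = expm_multiply(T * A, u0)               # one "time step" spanning the whole simulation
```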
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations of differential equations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
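The residual-networks-as-discretisations observation admits a few-line demonstration. The sketch below, with placeholder random weights, shows that stacking residual blocks $y_{n+1} = y_n + h\, f_\theta(y_n)$ is exactly the explicit Euler scheme for $dy/dt = f_\theta(y)$.

```python
# Illustrative sketch of the "ResNet = Euler-discretised neural ODE" observation.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))

def f_theta(y):
    """A small MLP vector field (shared across steps, as in a neural ODE)."""
    return W2 @ np.tanh(W1 @ y)

def neural_ode_euler(y0, t1, n_steps):
    y, h = y0, t1 / n_steps
    for _ in range(n_steps):      # each iteration is exactly one residual block
        y = y + h * f_theta(y)
    return y

y_T = neural_ode_euler(rng.normal(size=16), t1=1.0, n_steps=32)
```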
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
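As a concrete anchor for the survey, here is a minimal sketch of uniform affine (asymmetric) quantisation, the basic scheme most surveyed methods build on; the parameter names ($s$ for scale, $z$ for zero point) follow common convention rather than any one method.

```python
# Minimal sketch of uniform affine quantisation to b-bit integers:
# q = clip(round(x / s) + z, 0, 2^b - 1), with dequantisation x_hat = s * (q - z).
import numpy as np

def quantize(x, b=4):
    qmin, qmax = 0, 2**b - 1
    s = (x.max() - x.min()) / (qmax - qmin)       # scale: real-valued step size
    z = qmin - int(np.round(x.min() / s))         # zero point: integer offset
    q = np.clip(np.round(x / s) + z, qmin, qmax).astype(np.int32)
    return q, s, z

def dequantize(q, s, z):
    return s * (q.astype(np.float32) - z)         # reconstruction leaves rounding error only

x = np.random.default_rng(1).normal(size=1000).astype(np.float32)
q, s, z = quantize(x, b=4)
err = np.abs(dequantize(q, s, z) - x).max()       # bounded by s/2 inside the clipping range
```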