This paper is concerned with convergence estimates for fully discrete tree tensor network approximations of high-dimensional functions from several model classes. For functions having standard or mixed Sobolev regularity, new estimates generalizing and refining known results are obtained, based on notions of linear widths of multivariate functions. In the main results of this paper, such techniques are applied to classes of functions with compositional structure, which are known to be particularly suitable for approximation by deep neural networks. As shown here, such functions can also be approximated by tree tensor networks without a curse of dimensionality -- however, subject to certain conditions, in particular on the depth of the underlying tree. In addition, a constructive encoding of compositional functions in tree tensor networks is given.
A class of models that have been widely used are the exponential random graph (ERG) models, which form a comprehensive family of models that include independent and dyadic edge models, Markov random graphs, and many other graph distributions, in addition to allow the inclusion of covariates that can lead to a better fit of the model. Another increasingly popular class of models in statistical network analysis are stochastic block models (SBMs). They can be used for the purpose of grouping nodes into communities or discovering and analyzing a latent structure of a network. The stochastic block model is a generative model for random graphs that tends to produce graphs containing subsets of nodes characterized by being connected to each other, called communities. Many researchers from various areas have been using computational tools to adjust these models without, however, analyzing their suitability for the data of the networks they are studying. The complexity involved in the estimation process and in the goodness-of-fit verification methodologies for these models can be factors that make the analysis of adequacy difficult and a possible discard of one model in favor of another. And it is clear that the results obtained through an inappropriate model can lead the researcher to very wrong conclusions about the phenomenon studied. The purpose of this work is to present a simple methodology, based on Hypothesis Tests, to verify if there is a model specification error for these two cases widely used in the literature to represent complex networks: the ERGM and the SBM. We believe that this tool can be very useful for those who want to use these models in a more careful way, verifying beforehand if the models are suitable for the data under study.
Estimating and reacting to external disturbances is of fundamental importance for robust control of quadrotors. Existing estimators typically require significant tuning or training with a large amount of data, including the ground truth, to achieve satisfactory performance. This paper proposes a data-efficient differentiable moving horizon estimation (DMHE) algorithm that can automatically tune the MHE parameters online and also adapt to different scenarios. We achieve this by deriving the analytical gradient of the estimated trajectory from MHE with respect to the tuning parameters, enabling end-to-end learning for auto-tuning. Most interestingly, we show that the gradient can be calculated efficiently from a Kalman filter in a recursive form. Moreover, we develop a model-based policy gradient algorithm to learn the parameters directly from the trajectory tracking errors without the need for the ground truth. The proposed DMHE can be further embedded as a layer with other neural networks for joint optimization. Finally, we demonstrate the effectiveness of the proposed method via both simulation and experiments on quadrotors, where challenging scenarios such as sudden payload change and flying in downwash are examined.
Low-rank approximation of images via singular value decomposition is well-received in the era of big data. However, singular value decomposition (SVD) is only for order-two data, i.e., matrices. It is necessary to flatten a higher order input into a matrix or break it into a series of order-two slices to tackle higher order data such as multispectral images and videos with the SVD. Higher order singular value decomposition (HOSVD) extends the SVD and can approximate higher order data using sums of a few rank-one components. We consider the problem of generalizing HOSVD over a finite dimensional commutative algebra. This algebra, referred to as a t-algebra, generalizes the field of complex numbers. The elements of the algebra, called t-scalars, are fix-sized arrays of complex numbers. One can generalize matrices and tensors over t-scalars and then extend many canonical matrix and tensor algorithms, including HOSVD, to obtain higher-performance versions. The generalization of HOSVD is called THOSVD. Its performance of approximating multi-way data can be further improved by an alternating algorithm. THOSVD also unifies a wide range of principal component analysis algorithms. To exploit the potential of generalized algorithms using t-scalars for approximating images, we use a pixel neighborhood strategy to convert each pixel to "deeper-order" t-scalar. Experiments on publicly available images show that the generalized algorithm over t-scalars, namely THOSVD, compares favorably with its canonical counterparts.
We show how probabilistic numerics can be used to convert an initial value problem into a Gauss--Markov process parametrised by the dynamics of the initial value problem. Consequently, the often difficult problem of parameter estimation in ordinary differential equations is reduced to hyperparameter estimation in Gauss--Markov regression, which tends to be considerably easier. The method's relation and benefits in comparison to classical numerical integration and gradient matching approaches is elucidated. In particular, the method can, in contrast to gradient matching, handle partial observations, and has certain routes for escaping local optima not available to classical numerical integration. Experimental results demonstrate that the method is on par or moderately better than competing approaches.
In this work, we study a random orthogonal projection based least squares estimator for the stable solution of a multivariate nonparametric regression (MNPR) problem. More precisely, given an integer $d\geq 1$ corresponding to the dimension of the MNPR problem, a positive integer $N\geq 1$ and a real parameter $\alpha\geq -\frac{1}{2},$ we show that a fairly large class of $d-$variate regression functions are well and stably approximated by its random projection over the orthonormal set of tensor product $d-$variate Jacobi polynomials with parameters $(\alpha,\alpha).$ The associated uni-variate Jacobi polynomials have degree at most $N$ and their tensor products are orthonormal over $\mathcal U=[0,1]^d,$ with respect to the associated multivariate Jacobi weights. In particular, if we consider $n$ random sampling points $\mathbf X_i$ following the $d-$variate Beta distribution, with parameters $(\alpha+1,\alpha+1),$ then we give a relation involving $n, N, \alpha$ to ensure that the resulting $(N+1)^d\times (N+1)^d$ random projection matrix is well conditioned. Moreover, we provide squared integrated as well as $L^2-$risk errors of this estimator. Precise estimates of these errors are given in the case where the regression function belongs to an isotropic Sobolev space $H^s(I^d),$ with $s> \frac{d}{2}.$ Also, to handle the general and practical case of an unknown distribution of the $\mathbf X_i,$ we use Shepard's scattered interpolation scheme in order to generate fairly precise approximations of the observed data at $n$ i.i.d. sampling points $\mathbf X_i$ following a $d-$variate Beta distribution. Finally, we illustrate the performance of our proposed multivariate nonparametric estimator by some numerical simulations with synthetic as well as real data.
Contextual bandit algorithms have been recently studied under the federated learning setting to satisfy the demand of keeping data decentralized and pushing the learning of bandit models to the client side. But limited by the required communication efficiency, existing solutions are restricted to linear models to exploit their closed-form solutions for parameter estimation. Such a restricted model choice greatly hampers these algorithms' practical utility. In this paper, we take the first step to addressing this challenge by studying generalized linear bandit models under a federated learning setting. We propose a communication-efficient solution framework that employs online regression for local update and offline regression for global update. We rigorously proved that, though the setting is more general and challenging, our algorithm can attain sub-linear rate in both regret and communication cost, which is also validated by our extensive empirical evaluations.
Recently Physically Informed Neural Networks have gained more and more popularity to solve partial differential equations, given the fact they escape the course of dimensionality. First Physically Informed Neural Networks are viewed as an underdetermined point matching collocation method then we expose the connection between Galerkin Least Square (GALS) and PINNs, to develop an a priori error estimate, in the context of elliptic problems. In particular techniques that belong to the realm of the least square finite elements and Rademacher complexity analysis will be used to obtain the above mentioned error estimate.
We present a method to extend the finite element library FEniCS to solve problems with domains in dimensions above three by constructing tensor product finite elements. This methodology only requires that the high dimensional domain is structured as a Cartesian product of two lower dimensional subdomains. In this study we consider linear partial differential equations, though the methodology can be extended to non-linear problems. The utilization of tensor product finite elements allows us to construct a global system of linear algebraic equations that only relies on the finite element infrastructure of the lower dimensional subdomains contained in FEniCS. We demonstrate the effectiveness of our methodology in three distinctive test cases. The first test case is a Poisson equation posed in a four dimensional domain which is a Cartesian product of two unit squares solved using the classical Galerkin finite element method. The second test case is the wave equation in space-time, where the computational domain is a Cartesian product of a two dimensional space grid and a one dimensional time interval. In this second case we also employ the Galerkin method. Finally, the third test case is an advection dominated advection-diffusion equation where the global domain is a Cartesian product of two one dimensional intervals. The streamline upwind Petrov-Galerkin method is applied to ensure discrete stability. In all three cases, optimal convergence rates are achieved with respect to h refinement.
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime i.e. it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and suddenly make high-dimensional robust estimation a realistic possibility.