Traditional static functional data analysis faces new challenges from streaming data, which flow in constantly. A major challenge is that storing such an ever-increasing amount of data in memory is nearly impossible. In addition, existing inferential tools in online learning are mainly developed for finite-dimensional problems, while inference methods for functional data focus on the batch learning setting. In this paper, we tackle these issues by developing functional stochastic gradient descent algorithms and proposing an online bootstrap resampling procedure to systematically study the inference problem for functional linear regression. In particular, the proposed estimation and inference procedures use only one pass over the data; they are thus easy to implement and suitable for situations where data arrive in a streaming manner. Furthermore, we establish the convergence rate as well as the asymptotic distribution of the proposed estimator. Meanwhile, the perturbed estimator produced by the bootstrap procedure is shown to enjoy the same theoretical properties, which provides the theoretical justification for our online inference tool. As far as we know, this is the first inference result for the functional linear regression model with streaming data. Simulation studies are conducted to investigate the finite-sample performance of the proposed procedure. An application is illustrated with the Beijing multi-site air-quality data.
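A minimal sketch of the one-pass estimation idea, assuming the slope function is represented in a fixed Fourier basis with a polynomially decaying step size; the basis size, step-size schedule, and data generation are illustrative choices, not the authors' exact algorithm. The online bootstrap would rerun the same loop with random multiplier weights attached to each gradient.

```python
import numpy as np

# One-pass functional SGD sketch for Y = \int X(t) beta(t) dt + eps.
# Everything below (grid, basis size, n^{-3/4} step size) is illustrative.
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 101)
dt = grid[1] - grid[0]
n_basis = 15
basis = np.array([np.ones_like(grid)] +
                 [np.sqrt(2) * np.cos(np.pi * k * grid) for k in range(1, n_basis)])
beta_true = (1.0 / (1 + np.arange(n_basis)) ** 2) @ basis   # true slope function

coef = np.zeros(n_basis)                       # basis coefficients of the estimate
for n in range(1, 10_001):                     # one pass over the data stream
    X = rng.standard_normal(n_basis) @ basis   # functional covariate on the grid
    y = (X * beta_true).sum() * dt + 0.1 * rng.standard_normal()
    Z = (basis * X).sum(axis=1) * dt           # scores <X, basis_k>
    coef -= n ** -0.75 * (Z @ coef - y) * Z    # Robbins-Monro gradient step

beta_hat = coef @ basis                        # estimated slope function
```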
We propose a new reduced order modeling strategy for tackling parametrized Partial Differential Equations (PDEs) with linear constraints, in particular Darcy flow systems in which the constraint is given by mass conservation. Our approach employs classical neural network architectures and supervised learning, but it is constructed in such a way that the resulting Reduced Order Model (ROM) is guaranteed to satisfy the linear constraints exactly. The procedure is based on a splitting of the PDE solution into a particular solution satisfying the constraint and a homogeneous solution. The homogeneous solution is approximated by mapping a suitable potential function, generated by a neural network model, onto the kernel of the constraint operator; for the particular solution, instead, we propose an efficient spanning tree algorithm. Starting from this paradigm, we present three approaches that follow this methodology, obtained by exploring different choices of the potential spaces: from empirical ones, derived via Proper Orthogonal Decomposition (POD), to more abstract ones based on differential complexes. All proposed approaches combine computational efficiency with rigorous mathematical interpretation, thus guaranteeing the explainability of the model outputs. To demonstrate the efficacy of the proposed strategies and to emphasize their advantages over vanilla black-box approaches, we present a series of numerical experiments on fluid flows in porous media, ranging from mixed-dimensional problems to nonlinear systems. This research lays the foundation for further exploration and development in the realm of model order reduction, potentially unlocking new capabilities and solutions in computational geosciences and beyond.
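The constraint-preserving split can be illustrated in a few lines of linear algebra: any output of the form u = u_p + N psi, with B u_p = g and the columns of N spanning ker(B), satisfies B u = g exactly, no matter how the potential psi is generated. The sketch below uses a least-squares particular solution and an SVD null-space basis as stand-ins for the paper's spanning tree algorithm and neural network potential.

```python
import numpy as np

# Sketch of the split u = u_p + N @ psi for a linear constraint B u = g.
# B, g, and psi are illustrative stand-ins for the operators in the paper.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 10))             # constraint operator (full row rank)
g = rng.standard_normal(4)

u_p = np.linalg.lstsq(B, g, rcond=None)[0]   # one particular solution, B u_p = g
_, s, Vt = np.linalg.svd(B)
N = Vt[B.shape[0]:].T                        # orthonormal basis of ker(B)

psi = rng.standard_normal(N.shape[1])        # potential (would come from the NN)
u = u_p + N @ psi                            # reduced solution

assert np.allclose(B @ u, g)                 # constraint satisfied exactly
```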
We present methodology for constructing pointwise confidence intervals for the cumulative distribution function and the quantiles of mixing distributions on the unit interval from binomial mixture distribution samples. No assumptions are made on the shape of the mixing distribution. The confidence intervals are constructed by inverting exact tests of composite null hypotheses regarding the mixing distribution. Our method may be applied to any deconvolution approach that produces test statistics whose distribution is stochastically monotone for stochastic increase of the mixing distribution. We propose a hierarchical Bayes approach, which uses finite Polya Trees for modelling the mixing distribution, that provides stable and accurate deconvolution estimates without the need for additional tuning parameters. Our main technical result establishes the stochastic monotonicity property of the test statistics produced by the hierarchical Bayes approach. Leveraging the stochastic monotonicity requirement, we explicitly derive the smallest asymptotic confidence intervals that may be constructed using our methodology, raising the question of whether smaller confidence intervals for the mixing distribution can be constructed without parametric assumptions on its shape.
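The test-inversion principle at the core of the construction is easy to illustrate: a confidence set collects exactly those null values that an exact test fails to reject. The sketch below applies this to a single binomial proportion rather than to the full mixing-distribution problem, which additionally requires the stochastic monotonicity machinery described above.

```python
import numpy as np
from scipy.stats import binom

# Confidence interval by inversion of an exact two-sided binomial test
# (Clopper-Pearson style); illustrative of the general principle only.
def inverted_ci(x, n, alpha=0.05, grid=np.linspace(0, 1, 2001)):
    keep = []
    for p0 in grid:
        p_lo = binom.cdf(x, n, p0)       # P(X <= x) under H0: p = p0
        p_hi = binom.sf(x - 1, n, p0)    # P(X >= x) under H0: p = p0
        if 2 * min(p_lo, p_hi) > alpha:  # fail to reject -> keep p0
            keep.append(p0)
    return (min(keep), max(keep)) if keep else None

print(inverted_ci(x=7, n=20))            # CI for p from 7 successes in 20 trials
```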
We address the problem of testing the conditional mean and conditional variance of non-stationary data. We build e-values and p-values for four types of non-parametric composite hypotheses with specified mean and variance, as well as other conditions on the shape of the data-generating distribution. These shape conditions include symmetry, unimodality, and their combination. Using the obtained e-values and p-values, we construct tests via e-processes, also known as testing by betting, as well as some tests based on combining p-values for comparison. Although we mainly focus on one-sided tests, the two-sided test for the mean is also studied. Simulation and empirical studies are conducted in several settings and illustrate the features of the e-process-based methods.
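As a concrete illustration of testing by betting, consider H0: the conditional mean of X_t in [0,1] is at most mu0. The wealth process M_t = prod_i (1 + lam_i (X_i - mu0)), with bets 0 <= lam_i <= 1/mu0, is a nonnegative supermartingale under H0, so rejecting when M_t >= 1/alpha controls the type-I error by Ville's inequality. The constant bet below is an illustrative choice, not the authors' betting strategy.

```python
import numpy as np

# One-sided test by betting for H0: conditional mean <= mu0, data in [0,1].
rng = np.random.default_rng(2)
mu0, alpha, lam = 0.5, 0.05, 0.5      # lam <= 1/mu0 keeps wealth nonnegative

wealth, t = 1.0, 0
while wealth < 1 / alpha and t < 10_000:
    x = rng.beta(4, 2)                # true mean 2/3 > mu0, so H0 is false
    wealth *= 1 + lam * (x - mu0)     # bet on the next observation
    t += 1

print(f"rejected H0 after {t} observations (wealth={wealth:.1f})")
```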
We present a new Krylov subspace recycling method for solving a linear system of equations, or a sequence of slowly changing linear systems. Our new method, named GMRES-SDR, combines randomized sketching and deflated restarting in a way that avoids orthogonalizing a full Krylov basis. We provide new theory that characterizes sketched GMRES, with and without augmentation, as a projection method using a semi-inner product. We present results of numerical experiments demonstrating the effectiveness of GMRES-SDR over competitor methods such as GMRES-DR and GCRO-DR.
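The core computational idea, minus recycling and deflation, fits in a short sketch: build a Krylov basis with only cheap local orthogonalization and solve the small sketched least-squares problem min ||S(b - A V y)|| in place of the full one. The Gaussian embedding, dimensions, and truncated orthogonalization below are illustrative choices, not the GMRES-SDR algorithm itself.

```python
import numpy as np

# Sketched-GMRES illustration: non-orthogonal Krylov basis + sketched LS.
rng = np.random.default_rng(3)
n, m, s = 500, 40, 160                         # size, basis dim, sketch dim
A = np.eye(n) + 0.05 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
S = rng.standard_normal((s, n)) / np.sqrt(s)   # Gaussian sketching operator

V = np.empty((n, m))
V[:, 0] = b / np.linalg.norm(b)
for j in range(1, m):                          # only local (truncated) orthogonalization
    w = A @ V[:, j - 1]
    P = V[:, max(0, j - 2):j]
    w -= P @ (P.T @ w)
    V[:, j] = w / np.linalg.norm(w)

y, *_ = np.linalg.lstsq(S @ (A @ V), S @ b, rcond=None)   # sketched least squares
x = V @ y
print("residual norm:", np.linalg.norm(b - A @ x))
```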
Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that balances the tradeoff between minimizing the fitting error and the norm of the learned model coefficients. As this hyperparameter is scalar, it can be easily selected via random or grid search optimizing a cross-validation criterion. However, using a scalar hyperparameter limits the algorithm's flexibility and potential for better generalization. In this paper, we address the problem of linear regression with l2-regularization, where a different regularization hyperparameter is associated with each input variable. We optimize these hyperparameters using a gradient-based approach, wherein the gradient of a cross-validation criterion with respect to the regularization hyperparameters is computed analytically through matrix differential calculus. Additionally, we introduce two strategies tailored to sparse model learning problems, aimed at reducing the risk of overfitting to the validation data. Numerical examples demonstrate that our multi-hyperparameter regularization approach outperforms LASSO, Ridge, and Elastic Net regression. Moreover, the analytical computation of the gradient proves to be more efficient in terms of computational time compared to automatic differentiation, especially when handling a large number of input variables. An application to the identification of over-parameterized Linear Parameter-Varying models is also presented.
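To make the matrix-calculus step concrete, here is a hedged sketch for per-feature ridge regression: with A = X'X + diag(d) and beta = A^{-1} X'y, differentiating A beta = X'y gives dbeta/dd_j = -A^{-1} e_j beta_j, so the gradient of a validation loss L = ||Xv beta - yv||^2 / nv is available in closed form as dL/dd_j = -(2/nv) beta_j [A^{-1} Xv' r]_j with r = Xv beta - yv. The log-parametrization and plain gradient loop are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

# Per-feature l2-regularization with an analytic hyperparameter gradient.
rng = np.random.default_rng(4)
n, nv, p = 200, 100, 10
X, Xv = rng.standard_normal((n, p)), rng.standard_normal((nv, p))
w = np.r_[np.ones(3), np.zeros(p - 3)]        # sparse ground-truth coefficients
y = X @ w + 0.3 * rng.standard_normal(n)
yv = Xv @ w + 0.3 * rng.standard_normal(nv)

theta = np.zeros(p)                           # d = exp(theta) keeps penalties > 0
for _ in range(200):
    d = np.exp(theta)
    A = X.T @ X + np.diag(d)
    beta = np.linalg.solve(A, X.T @ y)
    r = Xv @ beta - yv
    grad_d = -beta * np.linalg.solve(A, Xv.T @ r) * 2 / nv   # closed-form gradient
    theta -= 0.5 * grad_d * d                 # chain rule through d = exp(theta)

print("learned per-feature penalties:", np.round(np.exp(theta), 2))
```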
Many time-dependent differential equations are equipped with invariants. Preserving such invariants under discretization can be important, e.g., to improve the qualitative and quantitative properties of numerical solutions. Recently, relaxation methods have been proposed as small modifications of standard time integration schemes guaranteeing the correct evolution of functionals of the solution. Here, we investigate how to combine these relaxation techniques with efficient step size control mechanisms based on local error estimates for explicit Runge-Kutta methods. We demonstrate our results in several numerical experiments including ordinary and partial differential equations.
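For a quadratic invariant the relaxation parameter is available in closed form, which makes the basic mechanism easy to demonstrate: rescale the Runge-Kutta increment by a factor gamma so the invariant of the updated solution is preserved exactly. The fixed-step loop below is only an illustration; combining this mechanism with embedded-error step size control is what the paper studies.

```python
import numpy as np

# Relaxation step for the quadratic invariant eta(u) = ||u||^2,
# conserved by u' = J u with J skew-symmetric.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
f = lambda u: J @ u

def rk4_increment(u, dt):
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

u, dt = np.array([1.0, 0.0]), 0.1
for _ in range(1000):
    d = rk4_increment(u, dt)
    gamma = -2 * np.dot(u, d) / np.dot(d, d)   # solves ||u + gamma d|| = ||u||
    u = u + gamma * d                          # relaxed update

print("energy drift:", abs(np.dot(u, u) - 1.0))  # ~ machine precision
```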
We address the communication overhead of distributed sparse matrix-(multiple)-vector multiplication in the context of large-scale eigensolvers, using filter diagonalization as an example. The basis of our study is a performance model which includes a communication metric that is computed directly from the matrix sparsity pattern without running any code. The performance model quantifies the extent to which scalability and parallel efficiency are lost due to communication overhead. To restore scalability, we identify two orthogonal layers of parallelism in the filter diagonalization technique. In the horizontal layer the rows of the sparse matrix are distributed across individual processes. In the vertical layer bundles of multiple vectors are distributed across separate process groups. An analysis in terms of the communication metric predicts that scalability can be restored if, and only if, one implements the two orthogonal layers of parallelism via different distributed vector layouts. Our theoretical analysis is corroborated by benchmarks for application matrices from quantum and solid state physics, road networks, and nonlinear programming. Finally, we demonstrate the benefits of using orthogonal layers of parallelism with two exemplary application cases -- an exciton and a strongly correlated electron system -- which incur either small or large communication overhead.
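A toy version of such a pattern-derived communication metric, assuming a simple block row distribution: each process must receive every vector entry whose column index appears in its rows but is owned by another process, and this count can be read off the sparsity pattern alone. The partitioning and metric below are illustrative simplifications, not the paper's exact model.

```python
import numpy as np
from scipy.sparse import random as sprandom

# Count remote vector entries per process directly from the sparsity pattern.
P, n = 4, 1000
A = sprandom(n, n, density=0.01, format="csr", random_state=5)
rows_per_p = n // P

for p in range(P):
    block = A[p * rows_per_p:(p + 1) * rows_per_p]       # rows owned by process p
    cols = np.unique(block.indices)                      # vector entries it touches
    remote = cols[(cols < p * rows_per_p) | (cols >= (p + 1) * rows_per_p)]
    print(f"process {p}: receives {remote.size} remote vector entries")
```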
We consider the task of estimating a specific class of nonsmooth functions, the so-called tame functions. These functions appear in a wide range of applications: deep learning training, value functions of mixed-integer programs, and wave functions of small molecules. We show that tame functions can be approximated by piecewise polynomials on any full-dimensional cube. We then present the first mixed-integer programming formulation of piecewise polynomial regression. Together, these results can be used to estimate tame functions. We demonstrate promising computational results.
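A generic big-M sketch of how piecewise polynomial regression can be cast as a mixed-integer program (illustrative only; the paper's actual formulation may differ): binary variables assign each sample to one polynomial piece, and since each p_k is linear in its coefficients theta_k, all constraints remain linear.

```latex
\begin{align*}
\min_{\theta,\,z,\,e}\quad & \sum_{i=1}^{n} e_i \\
\text{s.t.}\quad
  & e_i \ge \bigl(y_i - p_k(x_i;\theta_k)\bigr) - M\,(1 - z_{ik}), \\
  & e_i \ge -\bigl(y_i - p_k(x_i;\theta_k)\bigr) - M\,(1 - z_{ik}), \\
  & \sum_{k=1}^{K} z_{ik} = 1, \qquad z_{ik} \in \{0,1\},
\end{align*}
```

Here z_{ik} = 1 assigns sample i to piece k and M bounds the largest possible residual, so minimizing the sum of the e_i fits the K polynomials and the assignment jointly.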
Nonlinear Fokker-Planck equations play a major role in modeling large systems of interacting particles, with proven effectiveness in describing real-world phenomena ranging from classical fields such as fluids and plasmas to social and biological dynamics. Their mathematical formulation often has to account for physical forces with a significant random component, or for particles living in a random environment whose characterization may be deduced from experimental data, consequently leading to uncertainty-dependent equilibrium states. In this work, to address the problem of effectively solving stochastic Fokker-Planck systems, we construct a new equilibrium-preserving scheme through a micro-macro approach based on stochastic Galerkin methods. In contrast to the direct application of a stochastic Galerkin projection in the parameter space of the unknowns of the underlying Fokker-Planck model, the resulting numerical method leads to a highly accurate description of the uncertainty-dependent large-time behavior. Several numerical tests in the context of collective behavior for the social and life sciences are presented to assess the validity of the present methodology against standard approaches.
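In illustrative notation (an assumption about the setup, not the paper's exact formulation), the micro-macro idea can be summarized as expanding only the fluctuation around the uncertainty-dependent equilibrium in a generalized polynomial chaos basis:

```latex
\begin{align*}
f(v,t,z) &= f^{\infty}(v,z) + g(v,t,z), \\
g(v,t,z) &\approx \sum_{k=0}^{M} g_k(v,t)\,\Phi_k(z),
\qquad \mathbb{E}_z\!\left[\Phi_h \Phi_k\right] = \delta_{hk},
\end{align*}
```

so that the projected system is exactly at equilibrium whenever all g_k vanish, which is how the scheme can preserve z-dependent large-time states that a direct projection of f itself would only approximate.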
Robotic capabilities in object manipulation still fall far short of those of humans. Besides years of learning, humans rely heavily on the richness of information from physical interaction with the environment. In particular, tactile sensing is crucial in providing such rich feedback. Despite its potential contribution to robotic manipulation, tactile sensing remains underexploited, mainly due to the complexity of the time series provided by tactile sensors. In this work, we propose a method for assessing grasp stability using tactile sensing. More specifically, we propose a methodology to extract task-relevant features and design efficient classifiers to detect object slippage with respect to individual fingertips. We compare two classification models: support vector machine and logistic regression. We use highly sensitive Uskin tactile sensors mounted on an Allegro hand to test and validate our method. Our results demonstrate that the proposed method is effective for online slippage detection.
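A minimal sketch of the slip/no-slip classification step, comparing the two model families on synthetic features. The feature generation below (shifted, noisier statistics under slippage) is a hypothetical stand-in for the task-relevant features extracted from the Uskin-style tactile time series, not the paper's feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in features: stable grasps vs. slippage events.
rng = np.random.default_rng(6)
n, d = 400, 8
X_stable = rng.normal(0.0, 1.0, size=(n, d))
X_slip = rng.normal(0.8, 1.3, size=(n, d))     # slippage: shifted and noisier
X = np.vstack([X_stable, X_slip])
y = np.r_[np.zeros(n), np.ones(n)]

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)
for clf in (LogisticRegression(max_iter=1000), SVC(kernel="rbf")):
    clf.fit(Xtr, ytr)
    print(type(clf).__name__, "accuracy:", clf.score(Xte, yte))
```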