In this paper, we present a predictor-corrector strategy for constructing rank-adaptive dynamical low-rank approximations (DLRAs) of matrix-valued ODE systems. The strategy is a compromise between (i) low-rank step-truncation approaches that alternately evolve and compress solutions and (ii) strict DLRA approaches that augment the low-rank manifold using subspaces generated locally in time by the DLRA integrator. The strategy is based on an analysis of the error between a forward temporal update into the ambient full-rank space, which is typically computed in a step-truncation approach before re-compressing, and the standard DLRA update, which is forced to live in a low-rank manifold. We use this error, without requiring its full-rank representation, to correct the DLRA solution. A key ingredient for maintaining a low-rank representation of the error is a randomized singular value decomposition (SVD), which introduces some degree of stochastic variability into the implementation. The strategy is formulated and implemented in the context of discontinuous Galerkin spatial discretizations of partial differential equations and applied to several versions of DLRA methods found in the literature, as well as a new variant. Numerical experiments comparing the predictor-corrector strategy to other methods demonstrate its robustness in overcoming the shortcomings of step-truncation and strict DLRA approaches: the former may require more memory than is strictly needed, while the latter may miss transient solution features that cannot be recovered. The effects of randomization, tolerances, and other implementation parameters are also explored.
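The randomized SVD used to keep the error term in low-rank form can be sketched as follows. This is a generic Halko-style randomized SVD in Python, not the authors' implementation; the function name, oversampling parameter, and defaults are illustrative assumptions.

```python
import numpy as np

def randomized_svd(A, rank, n_oversample=5, rng=None):
    """Sketch of a randomized SVD: project A onto a random subspace
    capturing its dominant range, then take an exact SVD of the
    small projected matrix (Halko-et-al. style; names illustrative)."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    k = min(rank + n_oversample, min(m, n))
    # Gaussian test matrix: A @ Omega spans (approximately) the top range of A.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)           # orthonormal basis for the range
    B = Q.T @ A                               # small (k x n) projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank]
```

For a matrix whose numerical rank does not exceed the requested rank, the reconstruction `(U * s) @ Vt` is accurate to machine precision; the randomness enters only through the test matrix `Omega`.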
In domains where sample sizes are limited, efficient learning algorithms are critical. Learning using privileged information (LuPI) offers increased sample efficiency by allowing prediction models access to auxiliary information at training time that is unavailable when the models are used. In recent work, it was shown that for prediction in linear-Gaussian dynamical systems, a LuPI learner with access to intermediate time series data is never worse and often better in expectation than any unbiased classical learner. We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. In addition, we propose algorithms based on random features and representation learning for the case when this map is unknown. A suite of empirical results confirms the theoretical findings and shows the potential of using privileged time-series information in nonlinear prediction.
The selection of the time step plays a crucial role in improving stability and efficiency in the discontinuous Galerkin (DG) solution of hyperbolic conservation laws on adaptive moving meshes, which typically employs explicit time stepping. A commonly used time step selection is a direct extension of Courant-Friedrichs-Lewy (CFL) conditions established for fixed, uniform meshes. In this work, we provide a mathematical justification for the time step selection strategies used in practical adaptive DG computations. A stability analysis is presented for a moving mesh DG method for linear scalar conservation laws. Based on the analysis, a new time step selection strategy is proposed, which takes into consideration the coupling of the $\alpha$-function (which is related to the eigenvalues of the Jacobian matrix of the flux and the mesh movement velocity) and the heights of the mesh elements. The analysis also suggests several stable combinations of the choices of the $\alpha$-function in the numerical scheme and in the time step selection. Numerical results obtained with a moving mesh DG method for Burgers' and Euler equations are presented. For comparison purposes, numerical results obtained with an error-based time step-size selection strategy are also given.
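The CFL-type restriction alluded to above is, for explicit DG with polynomial degree $p$, commonly taken in the form $\Delta t \le \mathrm{CFL}\cdot h / ((2p+1)\,|\alpha|_{\max})$. A minimal sketch under that assumption follows; the function name, the $(2p+1)$ constant, and the default CFL number are illustrative, not the paper's proposed strategy, which additionally couples $\alpha$ with the moving-mesh element heights.

```python
def cfl_time_step(h_min, max_wave_speed, degree, cfl=0.9):
    """Hedged sketch of a common explicit-DG time step restriction:
    dt <= CFL * h / ((2p+1) * |alpha|_max), where p is the polynomial
    degree, h the smallest element height, and alpha the local wave
    speed (all names illustrative)."""
    return cfl * h_min / ((2 * degree + 1) * max_wave_speed)
```

Note the time step shrinks both with finer elements and with higher polynomial degree, which is why adaptive moving meshes with strongly nonuniform element heights require the element-wise coupling the paper analyzes.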
Traffic speed is central to characterizing the fluidity of the road network. Many transportation applications rely on it, such as real-time navigation, dynamic route planning, and congestion management. Rapid advances in sensing and communication techniques make traffic speed detection easier than ever. However, due to the sparse deployment of static sensors and the low penetration of mobile sensors, the detected speeds are incomplete and far from network-wide use. In addition, sensors are prone to error or missing data for various reasons, so the speeds they report can be highly noisy. These drawbacks call for effective techniques to recover credible estimates from incomplete data. In this work, we first formulate the problem as a spatiotemporal kriging problem and propose a unified graph embedded tensor (SGET) learning framework featuring both low-rankness and multi-dimensional correlations for network-wide traffic speed kriging under limited observations. Specifically, three types of speed correlation, namely temporal continuity, temporal periodicity, and spatial proximity, are carefully chosen. We then design an efficient solution algorithm via several effective numerical techniques to scale the proposed model up to network-wide kriging. Experiments on two public million-level traffic speed datasets show that the proposed SGET achieves state-of-the-art kriging performance even under low observation rates, while reducing computing time by more than half compared with baseline methods. Some insights into spatiotemporal traffic data kriging at the network level are provided as well.
In solving multi-modal, multi-objective optimization problems (MMOPs), the objective is not only to find a good representation of the Pareto-optimal front (PF) in the objective space but also to find all equivalent Pareto-optimal subsets (PSS) in the variable space. Such problems are practically relevant when a decision maker (DM) is interested in identifying alternative designs with similar performance. There has been significant research interest in recent years in developing efficient algorithms to deal with MMOPs. However, the existing algorithms still require a prohibitive number of function evaluations (often in the several thousands) to deal with problems involving as few as two objectives and two variables. These algorithms are typically embedded with sophisticated, customized mechanisms that require additional parameters to manage the diversity and convergence in the variable and objective spaces. In this letter, we introduce a steady-state evolutionary algorithm for solving MMOPs, with a simple design and no additional user-defined parameters that need tuning compared to a standard EA. We report its performance on 21 MMOPs from various test suites that are widely used for benchmarking, using a low computational budget of 1000 function evaluations. The performance of the proposed algorithm is compared with six state-of-the-art algorithms (MO Ring PSO SCD, DN-NSGAII, TriMOEA-TA&R, CPDEA, MMOEA/DC and MMEA-WI). The proposed algorithm exhibits significantly better performance than the above algorithms based on established metrics including IGDX, PSP and IGD. We hope this study will encourage the design of simple, efficient and generalized algorithms and improve their uptake for practical applications.
Two-stage randomized experiments are becoming an increasingly popular experimental design for causal inference when the outcome of one unit may be affected by the treatment assignments of other units in the same cluster. In this paper, we provide a methodological framework for general tools of statistical inference and power analysis for two-stage randomized experiments. Under the randomization-based framework, we consider the estimation of a new direct effect of interest as well as the average direct and spillover effects studied in the literature. We provide unbiased estimators of these causal quantities and their conservative variance estimators in a general setting. Using these results, we then develop hypothesis testing procedures and derive sample size formulas. We theoretically compare the two-stage randomized design with the completely randomized and cluster randomized designs, which represent two limiting designs. Finally, we conduct simulation studies to evaluate the empirical performance of our sample size formulas. For empirical illustration, the proposed methodology is applied to the randomized evaluation of the Indian national health insurance program. An open-source software package is available for implementing the proposed methodology.
This paper is concerned with a blood flow problem coupled with slow plaque growth at the artery wall. In the model, the micro (fast) system is the Navier-Stokes equation with a periodically applied force, and the macro (slow) system is a fractional reaction equation, which is used to describe the plaque growth with memory effect. We construct an auxiliary temporal periodic problem and an effective time-average equation to approximate the original problem and analyze the approximation error of the corresponding linearized PDE (Stokes) system, where a simple front-tracking technique is used to update the slowly moving boundary. An effective multiscale method is then designed based on the approximate problem and the front-tracking framework. We also present a temporal finite difference scheme with a spatially continuous finite element method and analyze its temporal discretization error. Furthermore, a fast iterative procedure is designed to find the initial value of the temporal periodic problem, and its convergence is analyzed as well. Our front-tracking framework and the iterative procedure for solving the temporal periodic problem make it easy to implement the multiscale method on existing PDE-solving software. The numerical method is implemented by a combination of the finite element platform COMSOL Multiphysics and the mainstream software MATLAB, which significantly reduces the programming effort and easily handles the fluid-structure interaction, especially moving boundaries with more complex geometries. We present some numerical examples of ODEs and a 2-D Navier-Stokes system to demonstrate the effectiveness of the multiscale method. Finally, we present a numerical experiment on the plaque growth problem and discuss the physical implications of the fractional-order parameter.
Variational Bayes methods are a scalable estimation approach for many complex state space models. However, existing methods exhibit a trade-off between accurate estimation and computational efficiency. This paper proposes a variational approximation that mitigates this trade-off. This approximation is based on importance densities that have been proposed in the context of efficient importance sampling. By directly conditioning on the observed data, the proposed method produces an accurate approximation to the exact posterior distribution. Because the steps required for its calibration are computationally efficient, the approach is faster than existing variational Bayes methods. The proposed method can be applied to any state space model that has a closed-form measurement density function and a state transition distribution that belongs to the exponential family of distributions. We illustrate the method in numerical experiments with stochastic volatility models and a macroeconomic empirical application using a high-dimensional state space model.
High-dimensional matrix-variate time series data are becoming widely available in many scientific fields, such as economics, biology, and meteorology. To achieve significant dimension reduction while preserving the intrinsic matrix structure and temporal dynamics in such data, Wang et al. (2017) proposed a matrix factor model that is shown to provide effective analysis. In this paper, we establish a general framework for incorporating domain or prior knowledge into the matrix factor model through linear constraints. The proposed framework is shown to be useful in achieving parsimonious parameterization, facilitating interpretation of the latent matrix factors, and identifying specific factors of interest. Fully utilizing the prior-knowledge-induced constraints results in more efficient and accurate modeling, inference, and dimension reduction, as well as a clearer and better interpretation of the results. We develop constrained, multi-term, and partially constrained factor models for matrix-variate time series, together with efficient estimation procedures and their asymptotic properties. We show that the convergence rates of the constrained factor loading matrices are much faster than those of conventional matrix factor analysis in many situations. Simulation studies are carried out to demonstrate the finite-sample performance of the proposed method and its associated asymptotic properties. We illustrate the proposed model with three applications, in which the constrained matrix-factor models outperform their unconstrained counterparts in the power of variance explanation under the out-of-sample 10-fold cross-validation setting.
We introduce a Fourier-based fast algorithm for Gaussian process regression. It approximates a translationally-invariant covariance kernel by complex exponentials on an equispaced Cartesian frequency grid of $M$ nodes. This results in a weight-space $M\times M$ system matrix with Toeplitz structure, which can thus be applied to a vector in ${\mathcal O}(M \log{M})$ operations via the fast Fourier transform (FFT), independent of the number of data points $N$. The linear system can be set up in ${\mathcal O}(N + M \log{M})$ operations using nonuniform FFTs. This enables efficient massive-scale regression via an iterative solver, even for kernels with fat-tailed spectral densities (large $M$). We include a rigorous error analysis of the kernel approximation, the resulting accuracy (relative to "exact" GP regression), and the condition number. Numerical experiments for squared-exponential and Mat\'ern kernels in one, two and three dimensions often show 1-2 orders of magnitude acceleration over state-of-the-art rank-structured solvers at comparable accuracy. Our method allows 2D Mat\'ern-${\small \frac{3}{2}}$ regression from $N=10^9$ data points to be performed in 2 minutes on a standard desktop, with posterior mean accuracy $10^{-3}$. This opens up spatial statistics applications 100 times larger than previously possible.
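The ${\mathcal O}(M \log M)$ Toeplitz matrix-vector product that the iterative solver relies on can be sketched via the standard circulant embedding plus FFT trick. This is a generic illustration in Python, not the paper's code; the function and variable names are assumptions.

```python
import numpy as np

def toeplitz_matvec(c, r, x):
    """Apply an M x M Toeplitz matrix (first column c, first row r,
    with c[0] == r[0]) to a vector x in O(M log M) by embedding the
    Toeplitz matrix in a 2M x 2M circulant and diagonalizing with the FFT."""
    M = len(c)
    # First column of the circulant embedding: [c, 0, reversed r[1:]].
    col = np.concatenate([c, [0.0], r[:0:-1]])
    n = len(col)                              # n = 2M
    # Circulant matvec = inverse FFT of (FFT of column * FFT of zero-padded x).
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x, n))
    return y[:M].real
```

Inside a conjugate-gradient loop, each iteration then costs one pair of length-$2M$ FFTs instead of the ${\mathcal O}(M^2)$ dense product, which is what makes the method's cost independent of the number of data points once the system is set up.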
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and the solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
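One concrete instance of the smoothed max operator described above uses an entropic regularizer, giving the log-sum-exp, whose gradient is the softmax; substituting it for max inside a DP recursion is what makes the recursion differentiable. A minimal sketch, with illustrative names and a temperature parameter `gamma` as an assumption:

```python
import numpy as np

def smoothed_max(x, gamma=1.0):
    """Entropy-regularized smoothing of max: gamma * logsumexp(x / gamma).
    As gamma -> 0 this recovers the hard max; larger gamma smooths more.
    Shifting by the max keeps the exponentials numerically stable."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return gamma * np.log(np.sum(np.exp((x - m) / gamma))) + m

def smoothed_max_grad(x, gamma=1.0):
    """Gradient of smoothed_max: a softmax distribution over the
    arguments, i.e. a soft version of the argmax indicator."""
    x = np.asarray(x, dtype=float)
    z = np.exp((x - x.max()) / gamma)
    return z / z.sum()
```

Because the gradient is a probability distribution over the competing subproblems rather than a one-hot argmax, backpropagating through a recursion built from `smoothed_max` yields soft alignments or soft paths, as in the smoothed Viterbi and DTW instantiations.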