Time-fractional parabolic equations with a Caputo time derivative are considered. For such equations, we explore and further develop the new methodology of the a-posteriori error estimation and adaptive time stepping proposed in [7]. We improve the earlier time stepping algorithm based on this theory, and specifically address its stable and efficient implementation in the context of high-order methods. The considered methods include an L1-2 method and continuous collocation methods of arbitrary order, for which adaptive temporal meshes are shown to yield optimal convergence rates in the presence of solution singularities.
This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by non-ignorable cluster sizes we mean that "large'' clusters and "small'' clusters may be heterogeneous, and, in particular, the effects of the treatment may vary across clusters of differing sizes. In order to permit this sort of flexibility, we consider a sampling framework in which cluster sizes themselves are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using a covariate-adaptive stratified randomization procedure. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study and empirical demonstration show the practical relevance of our theoretical results.
Learning precise surrogate models of complex computer simulations and physical machines often require long-lasting or expensive experiments. Furthermore, the modeled physical dependencies exhibit nonlinear and nonstationary behavior. Machine learning methods that are used to produce the surrogate model should therefore address these problems by providing a scheme to keep the number of queries small, e.g. by using active learning and be able to capture the nonlinear and nonstationary properties of the system. One way of modeling the nonstationarity is to induce input-partitioning, a principle that has proven to be advantageous in active learning for Gaussian processes. However, these methods either assume a known partitioning, need to introduce complex sampling schemes or rely on very simple geometries. In this work, we present a simple, yet powerful kernel family that incorporates a partitioning that: i) is learnable via gradient-based methods, ii) uses a geometry that is more flexible than previous ones, while still being applicable in the low data regime. Thus, it provides a good prior for active learning procedures. We empirically demonstrate excellent performance on various active learning tasks.
In this paper, we will show the $L^p$-resolvent estimate for the finite element approximation of the Stokes operator for $p \in \left( \frac{2N}{N+2}, \frac{2N}{N-2} \right)$, where $N \ge 2$ is the dimension of the domain. It is expected that this estimate can be applied to error estimates for finite element approximation of the non-stationary Navier--Stokes equations, since studies in this direction are successful in numerical analysis of nonlinear parabolic equations. To derive the resolvent estimate, we introduce the solution of the Stokes resolvent problem with a discrete external force. We then obtain local energy error estimate according to a novel localization technique and establish global $L^p$-type error estimates. The restriction for $p$ is caused by the treatment of lower-order terms appearing in the local energy error estimate. Our result may be a breakthrough in the $L^p$-theory of finite element methods for the non-stationary Navier--Stokes equations.
Time series modeling is a well-established problem, which often requires that methods (1) expressively represent complicated dependencies, (2) forecast long horizons, and (3) efficiently train over long sequences. State-space models (SSMs) are classical models for time series, and prior works combine SSMs with deep learning layers for efficient sequence modeling. However, we find fundamental limitations with these prior approaches, proving their SSM representations cannot express autoregressive time series processes. We thus introduce SpaceTime, a new state-space time series architecture that improves all three criteria. For expressivity, we propose a new SSM parameterization based on the companion matrix -- a canonical representation for discrete-time processes -- which enables SpaceTime's SSM layers to learn desirable autoregressive processes. For long horizon forecasting, we introduce a "closed-loop" variation of the companion SSM, which enables SpaceTime to predict many future time-steps by generating its own layer-wise inputs. For efficient training and inference, we introduce an algorithm that reduces the memory and compute of a forward pass with the companion matrix. With sequence length $\ell$ and state-space size $d$, we go from $\tilde{O}(d \ell)$ na\"ively to $\tilde{O}(d + \ell)$. In experiments, our contributions lead to state-of-the-art results on extensive and diverse benchmarks, with best or second-best AUROC on 6 / 7 ECG and speech time series classification, and best MSE on 14 / 16 Informer forecasting tasks. Furthermore, we find SpaceTime (1) fits AR($p$) processes that prior deep SSMs fail on, (2) forecasts notably more accurately on longer horizons than prior state-of-the-art, and (3) speeds up training on real-world ETTh1 data by 73% and 80% relative wall-clock time over Transformers and LSTMs.
The total generalized variation extends the total variation by incorporating higher-order smoothness. Thus, it can also suffer from similar discretization issues related to isotropy. Inspired by the success of novel discretization schemes of the total variation, there has been recent work to improve the second-order total generalized variation discretization, based on the same design idea. In this work, we propose to extend this to a general discretization scheme based on interpolation filters, for which we prove variational consistency. We then describe how to learn these interpolation filters to optimize the discretization for various imaging applications. We illustrate the performance of the method on a synthetic data set as well as for natural image denoising.
We design and compute first-order implicit-in-time variational schemes with high-order spatial discretization for initial value gradient flows in generalized optimal transport metric spaces. We first review some examples of gradient flows in generalized optimal transport spaces from the Onsager principle. We then use a one-step time relaxation optimization problem for time-implicit schemes, namely generalized Jordan-Kinderlehrer-Otto schemes. Their minimizing systems satisfy implicit-in-time schemes for initial value gradient flows with first-order time accuracy. We adopt the first-order optimization scheme ALG2 (Augmented Lagrangian method) and high-order finite element methods in spatial discretization to compute the one-step optimization problem. This allows us to derive the implicit-in-time update of initial value gradient flows iteratively. We remark that the iteration in ALG2 has a simple-to-implement point-wise update based on optimal transport and Onsager's activation functions. The proposed method is unconditionally stable for convex cases. Numerical examples are presented to demonstrate the effectiveness of the methods in two-dimensional PDEs, including Wasserstein gradient flows, Fisher--Kolmogorov-Petrovskii-Piskunov equation, and two and four species reversible reaction-diffusion systems.
Many two-sample network hypothesis testing methodologies operate under the implicit assumption that the vertex correspondence across networks is a priori known. In this paper, we consider the degradation of power in two-sample graph hypothesis testing when there are misaligned/label-shuffled vertices across networks. In the context of random dot product and stochastic block model networks, we theoretically explore the power loss due to shuffling for a pair of hypothesis tests based on Frobenius norm differences between estimated edge probability matrices or between adjacency matrices. The loss in testing power is further reinforced by numerous simulations and experiments, both in the stochastic block model and in the random dot product graph model, where we compare the power loss across multiple recently proposed tests in the literature. Lastly, we demonstrate the impact that shuffling can have in real-data testing in a pair of examples from neuroscience and from social network analysis.
We present and analyze a multiscale method for wave propagation problems, posed on spatial networks. By introducing a coarse scale, using a finite element space interpolated onto the network, we construct a discrete multiscale space using the localized orthogonal decomposition (LOD) methodology. The spatial discretization is then combined with an energy conserving temporal scheme to form the proposed method. Under the assumption of well-prepared initial data, we derive an a priori error bound of optimal order with respect to the space and time discretization. In the analysis, we combine the theory derived for stationary elliptic problems on spatial networks with classical finite element results for hyperbolic problems. Finally, we present numerical experiments that confirm our theoretical findings.
The Internet of Things (IoT) system generates massive high-speed temporally correlated streaming data and is often connected with online inference tasks under computational or energy constraints. Online analysis of these streaming time series data often faces a trade-off between statistical efficiency and computational cost. One important approach to balance this trade-off is sampling, where only a small portion of the sample is selected for the model fitting and update. Motivated by the demands of dynamic relationship analysis of IoT system, we study the data-dependent sample selection and online inference problem for a multi-dimensional streaming time series, aiming to provide low-cost real-time analysis of high-speed power grid electricity consumption data. Inspired by D-optimality criterion in design of experiments, we propose a class of online data reduction methods that achieve an optimal sampling criterion and improve the computational efficiency of the online analysis. We show that the optimal solution amounts to a strategy that is a mixture of Bernoulli sampling and leverage score sampling. The leverage score sampling involves auxiliary estimations that have a computational advantage over recursive least squares updates. Theoretical properties of the auxiliary estimations involved are also discussed. When applied to European power grid consumption data, the proposed leverage score based sampling methods outperform the benchmark sampling method in online estimation and prediction. The general applicability of the sampling-assisted online estimation method is assessed via simulation studies.
This work outlines a fast, high-precision time-domain solver for scalar, electromagnetic and gravitational perturbations on hyperboloidal foliations of Kerr space-times. Time-domain Teukolsky equation solvers have typically used explicit methods, which numerically violate Noether symmetries and are Courant-limited. These restrictions can limit the performance of explicit schemes when simulating long-time extreme mass ratio inspirals, expected to appear in LISA band for 2-5 years. We thus explore symmetric (exponential, Pad\'e or Hermite) integrators, which are unconditionally stable and known to preserve certain Noether symmetries and phase-space volume. For linear hyperbolic equations, these implicit integrators can be cast in explicit form, making them well-suited for long-time evolution of black hole perturbations. The 1+1 modal Teukolsky equation is discretized in space using polynomial collocation methods and reduced to a linear system of ordinary differential equations, coupled via mode-coupling arrays and discretized (matrix) differential operators. We use a matricization technique to cast the mode-coupled system in a form amenable to a method-of-lines framework, which simplifies numerical implementation and enables efficient parallelization on CPU and GPU architectures. We test our numerical code by studying late-time tails of Kerr spacetime perturbations in the sub-extremal and extremal cases.