This paper analyzes the $\theta$-method combined with a 3-point time filter. The approach adds only one line of code to an existing implementation of the $\theta$-method. We prove the method's $0$-stability, accuracy, and $A$-stability for both constant and variable time steps. Numerical tests are performed to validate the theoretical results.
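To make the "one additional line" concrete, here is a minimal Python sketch (our illustration, not the paper's code): a $\theta$-method step for $y' = f(t,y)$ followed by the 3-point filter. The filter coefficient `nu` and the fixed-point inner solver are illustrative choices, not the paper's tuned values.

```python
import numpy as np

def theta_method_filtered(f, y0, t0, T, dt, theta=0.5, nu=0.1):
    """theta-method for y' = f(t, y) plus a 3-point time filter.

    The filter below is the "one additional line"; the coefficient nu and
    the fixed-point inner solver are illustrative, not the paper's choices.
    """
    ts = np.arange(t0, T + dt / 2, dt)
    ys = [float(y0)]
    for n in range(len(ts) - 1):
        t, y = ts[n], ys[-1]
        y1 = y + dt * f(t, y)               # explicit predictor
        for _ in range(50):                 # solve the implicit theta step
            y1 = y + dt * ((1 - theta) * f(t, y) + theta * f(t + dt, y1))
        if n >= 1:                          # the added 3-point time filter:
            y1 = y1 - nu * (y1 - 2.0 * ys[-1] + ys[-2])
        ys.append(y1)
    return ts, np.array(ys)

# Example: y' = -y, y(0) = 1, on [0, 1].
ts, ys = theta_method_filtered(lambda t, y: -y, 1.0, 0.0, 1.0, 0.01)
print(ys[-1], np.exp(-1.0))                 # approximation vs exact value
```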
We investigate the effect of the well-known Mycielski construction on the Shannon capacity of graphs and on one of its most prominent upper bounds, the (complementary) Lov\'asz theta number. We prove that if the Shannon capacity of a graph, the distinguishability graph of a noisy channel, is attained by some finite power, then its Mycielskian has strictly larger Shannon capacity than the graph itself. For the complementary Lov\'asz theta function we show that its value on the Mycielskian of a graph is completely determined by its value on the original graph, a phenomenon similar to the one discovered for the fractional chromatic number by Larsen, Propp and Ullman. We also consider possible generalizations of our results to the Sperner capacity of directed graphs and to the generalized Mycielski construction. Possible connections with what Zuiddam calls the asymptotic spectrum of graphs are discussed as well.
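For concreteness, the standard Mycielski construction can be sketched in a few lines of Python; we assume `networkx` is available, and the node labels `('u', v)` and `'w'` are our own naming.

```python
import networkx as nx

def mycielskian(G):
    """Mycielskian mu(G): originals v, shadow copies ('u', v), and an apex 'w'.

    Each shadow ('u', v) is joined to the G-neighbors of v, and 'w' is
    joined to every shadow vertex.
    """
    M = nx.Graph()
    M.add_nodes_from(G.nodes)
    M.add_edges_from(G.edges)
    for v in G.nodes:
        for nbr in G.neighbors(v):
            M.add_edge(('u', v), nbr)
        M.add_edge('w', ('u', v))
    return M

# The Mycielskian of the 5-cycle is the 11-vertex Grotzsch graph.
print(mycielskian(nx.cycle_graph(5)).number_of_nodes())   # 11
```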
This paper develops a theory of the matrix Dyson equation (MDE) for correlated linearizations and uses it to derive an asymptotic deterministic equivalent for the test error in random features regression. The theory developed for the correlated MDE includes existence and uniqueness, spectral support bounds, and stability properties of the MDE. It provides new tools for constructing deterministic equivalents for pseudoresolvents of a class of correlated linear pencils. As an application, this theory is used to give a deterministic equivalent of the test error in random features ridge regression, in a proportional scaling regime, conditioned on both the training and test datasets.
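For orientation, the MDE in its standard Wigner-type form (due to Ajanki, Erd\H{o}s and Kr\"uger) is the matrix equation
\[
-\,M(z)^{-1} = zI - A + \mathcal{S}[M(z)], \qquad \operatorname{Im} M(z) \succ 0 \ \text{ for } \operatorname{Im} z > 0,
\]
where $A$ is self-adjoint and $\mathcal{S}$ is a positivity-preserving linear self-energy operator; the correlated-linearization theory developed in the paper generalizes this setup to pseudoresolvents of linear pencils, so the exact equation used there differs from this standard form.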
A class of projection methods based on (block) rational Krylov subspaces for solving the large-scale continuous-time algebraic Riccati equation (CARE) $0 = \mathcal{R}(X) := A^HX + XA + C^HC - XBB^HX$ with a large, sparse $A$ and with $B$ and $C$ of low rank is proposed. The CARE is projected onto a block rational Krylov subspace $\mathcal{K}_j$ spanned by blocks of the form $(A^H+ s_kI)^{-1}C^H$ for some shifts $s_k$, $k = 1, \ldots, j$. The considered projections do not need to be orthogonal and are built from the matrices appearing in the block rational Arnoldi decomposition associated with $\mathcal{K}_j$. The resulting projected Riccati equation is solved for the small, square, Hermitian matrix $Y_j$. Then the Hermitian low-rank approximation $X_j = Z_jY_jZ_j^H$ to $X$ is set up, where the columns of $Z_j$ span $\mathcal{K}_j$. The residual norm $\|\mathcal{R}(X_j)\|_F$ can be computed efficiently via the norm of a readily available $2p \times 2p$ matrix. We suggest reducing the rank of the approximate solution $X_j$ even further by truncating small eigenvalues from $X_j$. This truncated approximate solution can be interpreted as the solution of the Riccati residual projected onto a subspace of $\mathcal{K}_j$, which gives us a way to efficiently evaluate the norm of the resulting residual. Numerical examples are presented.
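The following Python sketch shows the projection idea in its simplest (orthogonal, Galerkin) form, using SciPy's dense CARE solver for the small projected equation. The paper's method instead allows non-orthogonal projections built from the block rational Arnoldi decomposition; here `Z` is assumed to be a precomputed rational Krylov basis.

```python
import numpy as np
from scipy.linalg import qr, solve_continuous_are

def galerkin_care(A, B, C, Z):
    """Galerkin projection sketch for 0 = A^H X + X A + C^H C - X B B^H X.

    Z is assumed to hold a precomputed (rational Krylov) basis.  We
    orthonormalize it and project, a simplification of the paper's
    possibly oblique projections.
    """
    Q, _ = qr(Z, mode='economic')            # orthonormal basis of span(Z)
    Aj = Q.conj().T @ A @ Q                  # projected coefficients
    Bj = Q.conj().T @ B
    Cj = C @ Q
    # small dense CARE:  Aj^H Y + Y Aj - Y Bj Bj^H Y + Cj^H Cj = 0
    Yj = solve_continuous_are(Aj, Bj, Cj.conj().T @ Cj, np.eye(B.shape[1]))
    return Q @ Yj @ Q.conj().T               # low-rank approximation X_j
```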
By approximating posterior distributions with weighted samples, particle filters (PFs) provide an efficient mechanism for solving non-linear sequential state estimation problems. While the effectiveness of particle filters has been recognised in various applications, their performance relies on the knowledge of dynamic models and measurement models, as well as the construction of effective proposal distributions. An emerging trend involves constructing components of particle filters using neural networks and optimising them by gradient descent, and such data-adaptive particle filtering approaches are often called differentiable particle filters. Due to the expressiveness of neural networks, differentiable particle filters are a promising computational tool for performing inference on sequential data in complex, high-dimensional tasks, such as vision-based robot localisation. In this paper, we review recent advances in differentiable particle filters and their applications. We place special emphasis on different design choices for key components of differentiable particle filters, including dynamic models, measurement models, proposal distributions, optimisation objectives, and differentiable resampling techniques.
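As background for the components listed above, a minimal bootstrap particle filter for a scalar state might look as follows. This is a sketch of the classical, non-differentiable baseline; differentiable particle filters replace the hand-specified `transition` and `likelihood` with neural networks and make the resampling step differentiable.

```python
import numpy as np

def bootstrap_pf(ys, n_particles, transition, likelihood, init, rng):
    """Minimal bootstrap particle filter for a scalar state.

    transition(x, rng) -> propagated particles (dynamic model)
    likelihood(y, x)   -> p(y | x) evaluated per particle (measurement model)
    init(n, rng)       -> initial particles
    """
    x = init(n_particles, rng)
    means = []
    for y in ys:
        x = transition(x, rng)                   # propagate
        w = likelihood(y, x)                     # weight
        w = w / w.sum()
        means.append(np.sum(w * x))              # posterior-mean estimate
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x = x[idx]                               # multinomial resampling
                                                 # (the non-differentiable step)
    return np.array(means)

# Toy linear-Gaussian model:
rng = np.random.default_rng(0)
transition = lambda x, rng: 0.9 * x + rng.normal(0.0, 0.5, x.shape)
likelihood = lambda y, x: np.exp(-0.5 * (y - x) ** 2)   # unit-variance Gaussian
init = lambda n, rng: rng.normal(0.0, 1.0, n)
print(bootstrap_pf([0.3, -0.1, 0.5], 1000, transition, likelihood, init, rng))
```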
This paper develops a general asymptotic theory of local polynomial (LP) regression for spatial data observed at irregularly spaced locations in a sampling region $R_n \subset \mathbb{R}^d$. We adopt a stochastic sampling design that can generate irregularly spaced sampling sites in a flexible manner, including both the pure increasing domain and the mixed increasing domain frameworks. We first introduce a nonparametric regression model for spatial data defined on $\mathbb{R}^d$ and then establish the asymptotic normality of LP estimators of general order $p \geq 1$. We also propose methods for constructing confidence intervals and establish uniform convergence rates of LP estimators. Our dependence structure conditions on the underlying processes cover a wide class of random fields, such as L\'evy-driven continuous autoregressive moving average random fields. As an application of our main results, we discuss a two-sample testing problem for mean functions and their partial derivatives.
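The weighted least-squares fit underlying LP regression is easy to state in code. The sketch below is the simplest instance (one-dimensional design, local linear, i.e. $p = 1$, Gaussian kernel), whereas the paper treats general order $p$ and irregularly spaced spatial designs in $\mathbb{R}^d$.

```python
import numpy as np

def local_linear(x0, X, Y, h):
    """Local linear (p = 1) estimate of the regression function m and its
    derivative at x0, from 1-D design points X and responses Y, using a
    Gaussian kernel with bandwidth h."""
    K = np.exp(-0.5 * ((X - x0) / h) ** 2)           # kernel weights
    D = np.column_stack([np.ones_like(X), X - x0])   # local design [1, x - x0]
    W = K[:, None] * D
    beta = np.linalg.solve(D.T @ W, W.T @ Y)         # weighted least squares
    return beta[0], beta[1]                          # estimates of m(x0), m'(x0)
```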
Formalized $1$-category theory forms a core component of various libraries of mathematical proofs. However, more sophisticated results in fields from algebraic topology to theoretical physics, where objects have "higher structure," rely on infinite-dimensional categories in place of $1$-dimensional categories, and $\infty$-category theory has thus far proved unamenable to computer formalization. Using a new proof assistant called Rzk, which is designed to support Riehl--Shulman's simplicial extension of homotopy type theory for synthetic $\infty$-category theory, we provide the first formalizations of results from $\infty$-category theory. These include, in particular, a formalization of the Yoneda lemma, often regarded as the fundamental theorem of category theory, which roughly states that an object of a given category is determined by its relationship to all of the other objects of the category. A key feature of our framework is that, thanks to the synthetic theory, many constructions are automatically natural or functorial. We plan to use Rzk to formalize further results from $\infty$-category theory, such as the theory of limits and colimits and of adjunctions.
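For reference, the classical $1$-categorical Yoneda lemma states that for a functor $F\colon \mathcal{C} \to \mathbf{Set}$ and an object $a$ of $\mathcal{C}$,
\[
\operatorname{Nat}\big(\mathcal{C}(a,-),\,F\big) \;\cong\; F(a),
\]
naturally in both $a$ and $F$; the formalization in Rzk proves the synthetic $\infty$-categorical analogue of this statement.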
We propose a general optimization-based framework for computing differentially private M-estimators and a new method for constructing differentially private confidence regions. Firstly, we show that robust statistics can be used in conjunction with noisy gradient descent or noisy Newton methods in order to obtain optimal private estimators with global linear or quadratic convergence, respectively. We establish local and global convergence guarantees, under both local strong convexity and self-concordance, showing that our private estimators converge with high probability to a small neighborhood of the non-private M-estimators. Secondly, we tackle the problem of parametric inference by constructing differentially private estimators of the asymptotic variance of our private M-estimators. This naturally leads to approximate pivotal statistics for constructing confidence regions and conducting hypothesis testing. We demonstrate the effectiveness of a bias correction that leads to enhanced small-sample empirical performance in simulations. We illustrate the benefits of our methods in several numerical examples.
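A generic sketch of noisy gradient descent in Python is given below. The clipping used here is an illustrative way to bound gradient sensitivity, whereas the paper obtains bounded gradients through robust (bounded-influence) scores; the privacy accounting that calibrates `sigma` to a target $(\varepsilon, \delta)$ is also omitted.

```python
import numpy as np

def noisy_gradient_descent(per_sample_grad, theta0, n_iter, eta, sigma, clip, rng):
    """Noisy (differentially private) gradient descent, generic sketch.

    per_sample_grad(theta) -> array of shape (n, d), one gradient per sample.
    Clipping bounds each sample's influence; Gaussian noise scaled to the
    clipping bound is added to the averaged gradient.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        g = per_sample_grad(theta)                                  # (n, d)
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g * np.minimum(1.0, clip / np.maximum(norms, 1e-12))    # clip rows
        noise = rng.normal(0.0, sigma * clip, size=theta.shape)
        theta = theta - eta * (g.sum(axis=0) + noise) / g.shape[0]  # noisy step
    return theta
```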
In this paper, we present a rigorous analysis of root-exponential convergence of Hermite approximations, including projection and interpolation methods, for functions that are analytic in an infinite strip containing the real axis and satisfy certain restrictions on the asymptotic behavior at infinity within this strip. Asymptotically sharp error bounds in the weighted and maximum norms are derived. The key ingredients of our analysis are some remarkable contour integral representations for the Hermite coefficients and the remainder of Hermite spectral interpolations. Further extensions to Gauss--Hermite quadrature, Hermite spectral differentiations, generalized Hermite spectral approximations and the scaling factor of Hermite approximation are also discussed. Numerical experiments confirm our theoretical results.
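A quick numerical illustration of the convergence behavior (our example, with an entire integrand): Gauss--Hermite quadrature of $\int_{\mathbb{R}} \cos(x)\, e^{-x^2}\, dx = \sqrt{\pi}\, e^{-1/4}$, for which the error decays at least root-exponentially in the number of nodes.

```python
import numpy as np

# Gauss-Hermite quadrature: sum_k w_k f(x_k) approximates int f(x) e^{-x^2} dx.
exact = np.sqrt(np.pi) * np.exp(-0.25)          # int cos(x) e^{-x^2} dx
for n in (4, 16, 64):
    x, w = np.polynomial.hermite.hermgauss(n)   # nodes and weights
    print(n, abs(w @ np.cos(x) - exact))        # error shrinks rapidly with n
```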
We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our method is adept at handling sparsely and irregularly sampled functional data in which the observation times are random for each subject. Conventional approaches built upon functional principal components analysis usually assume a homogeneous covariance structure across groups; justifying this assumption in real-world scenarios, however, can be challenging. To eliminate the need for a homogeneous covariance structure, we first develop the functional Bahadur representation for the mean estimator under the RKHS framework; this representation naturally leads to the desired pointwise limiting distributions. Moreover, we establish weak convergence for the mean estimator, allowing us to construct a test statistic for the mean difference. Our method is easily implementable and outperforms some conventional tests in controlling type I errors across various settings. We demonstrate the finite-sample performance of our approach through extensive simulations and two real-world applications.
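A sketch of an RKHS mean estimator for pooled, irregularly sampled data is given below: a kernel ridge fit with an assumed Gaussian kernel and our own bandwidth and regularization choices. The paper's test additionally requires the Bahadur representation and the limiting distributions, which are not reproduced here.

```python
import numpy as np

def rkhs_mean(t_obs, y_obs, t_grid, lam, ell=0.2):
    """Kernel ridge estimate of a mean function from pooled, irregularly
    sampled functional data.

    t_obs, y_obs: pooled observation times/values across all subjects.
    lam: ridge penalty; ell: Gaussian-kernel bandwidth (illustrative).
    """
    k = lambda s, t: np.exp(-0.5 * (s[:, None] - t[None, :]) ** 2 / ell ** 2)
    K = k(t_obs, t_obs)
    alpha = np.linalg.solve(K + lam * len(t_obs) * np.eye(len(t_obs)), y_obs)
    return k(t_grid, t_obs) @ alpha     # estimated mean function on t_grid
```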
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
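As a minimal numerical illustration of the implicit-regularization point (our example, not taken from the survey): in an overparametrized linear model, gradient descent initialized at zero converges to the minimum-$\ell_2$-norm interpolator, which the pseudoinverse computes directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 500                             # overparametrized: d > n
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
theta = np.linalg.pinv(X) @ y              # minimum-l2-norm interpolator
print(np.allclose(X @ theta, y))           # True: training data fit exactly
# Every other interpolant adds a null-space direction of X, and since theta
# lies in the row space of X, doing so can only increase the norm:
v = np.linalg.svd(X)[2][-1]                # unit vector with X @ v ~ 0
print(np.linalg.norm(theta) < np.linalg.norm(theta + v))   # True
```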