Gaussian processes (GPs) are widely used tools in spatial statistics and machine learning. The formulae for the mean function and covariance kernel of a GP $Tu$ that is the image of another GP $u$ under a linear transformation $T$ acting on the sample paths of $u$ are well known, almost to the point of being folklore. However, these formulae are often used without rigorous attention to technical details, particularly when $T$ is an unbounded operator such as a differential operator, which is common in many modern applications. This note provides a self-contained proof of the claimed formulae for the case of a closed, densely defined operator $T$ acting on the sample paths of a square-integrable (not necessarily Gaussian) stochastic process. Our proof technique relies upon Hille's theorem for the Bochner integral of a Banach-valued random variable.
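For orientation, the folklore formulae in question can be stated as follows; this is a hedged sketch in notation of our choosing, not quoted from the note itself. If $u$ has mean function $m$ and covariance kernel $k$, then, under suitable conditions on $T$,

```latex
m_{Tu}(t) = (T m)(t), \qquad
k_{Tu}(s, t) = \bigl( T_{s} T_{t}\, k \bigr)(s, t),
```

where $T_s$ and $T_t$ denote $T$ applied to $k(\cdot,\cdot)$ in its first and second argument, respectively; making these expressions rigorous for unbounded $T$ is precisely the note's contribution.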
We study a class of nonlocal partial differential equations with a spatially varying tensor mobility, obtained asymptotically from nonlocal dynamics on localising infinite graphs. Our strategy relies on the variational structure of both equations, which are a Riemannian and a Finslerian gradient flow, respectively. More precisely, we prove that weak solutions of the nonlocal interaction equation on graphs converge to weak solutions of the aforementioned class of nonlocal interaction equations with a tensor mobility in Euclidean space. This highlights an interesting property of the graph: it provides a potential spatial discretisation of the equation under study.
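As a rough illustration of the limiting object (in our own notation, assumed rather than taken from the abstract), a nonlocal interaction equation with a spatially varying tensor mobility $\mathbb{T}(x)$ may be written as

```latex
\partial_t \rho_t \;=\; \nabla \cdot \Bigl( \rho_t \, \mathbb{T}(x) \, \nabla \bigl( K * \rho_t \bigr) \Bigr),
```

where $K$ is the interaction kernel; for $\mathbb{T} \equiv \mathrm{Id}$ this reduces to the classical nonlocal interaction equation in Euclidean space.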
Rational best approximations (in a Chebyshev sense) to real functions are characterized by an equioscillating approximation error. Similar results do not hold true for rational best approximations to complex functions in general. In the present work, we consider unitary rational approximations to the exponential function on the imaginary axis, which map the imaginary axis to the unit circle. In the class of unitary rational functions, best approximations are shown to exist, to be uniquely characterized by equioscillation of a phase error, and to possess a super-linear convergence rate. Furthermore, the best approximations have full degree (i.e., they are non-degenerate), attain their maximum approximation error at the points of equioscillation, and interpolate at intermediate points. Asymptotic properties of the poles, interpolation nodes, and equioscillation points of these approximants are studied. Three algorithms that have proven very effective for computing unitary rational approximations, including candidates for best approximations, are briefly sketched. Some consequences for numerical time-integration are discussed. In particular, time propagators based on unitary best approximants are unitary, symmetric, and A-stable.
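To see why a phase error is the natural quantity here, note (a hedged sketch in our own notation) that a unitary rational function satisfies $|r(ix)| = 1$, so one may write $r(ix) = e^{i\psi(x)}$ for a real phase $\psi$, and then

```latex
\bigl| r(ix) - e^{ix} \bigr|
  = \bigl| e^{i\psi(x)} - e^{ix} \bigr|
  = 2 \left| \sin\!\left( \frac{\psi(x) - x}{2} \right) \right|,
```

so the approximation error is a monotone function of the magnitude of the phase error $\psi(x) - x$ wherever the latter is small, and equioscillation of the phase error corresponds to equioscillation of the error in modulus.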
Deep learning techniques have dominated the literature on aspect-based sentiment analysis (ABSA), achieving state-of-the-art performance. However, deep models generally suffer from spurious correlations between input features and output labels, which substantially hurt robustness and generalization capability. In this paper, we propose to reduce spurious correlations for ABSA via a novel Contrastive Variational Information Bottleneck framework (called CVIB). The proposed CVIB framework is composed of an original network and a self-pruned network, and these two networks are optimized simultaneously via contrastive learning. Concretely, we employ the Variational Information Bottleneck (VIB) principle to learn an informative and compressed network (the self-pruned network) from the original network, which discards superfluous patterns and spurious correlations between input features and prediction labels. Then, self-pruning contrastive learning is devised to pull together semantically similar positive pairs and push apart dissimilar pairs: the representations of an anchor sentence learned by the original and self-pruned networks, respectively, are regarded as a positive pair, while the representations of two different sentences within a mini-batch are treated as a negative pair. To verify the effectiveness of our CVIB method, we conduct extensive experiments on five benchmark ABSA datasets; the experimental results show that our approach outperforms strong competitors in terms of overall prediction performance, robustness, and generalization. Code and data to reproduce the results in this paper are available at: //github.com/shesshan/CVIB.
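The pairing scheme described above can be written as a standard InfoNCE-style objective; the following is a minimal sketch under that assumption (the function name, temperature, and batch layout are ours and are not taken from the paper's code):

```python
import torch
import torch.nn.functional as F

def self_pruning_contrastive_loss(z_orig, z_pruned, temperature=0.1):
    """InfoNCE-style loss: for each sentence, the representations from the
    original network and from the self-pruned network form a positive pair;
    representations of other sentences in the mini-batch act as negatives."""
    z_orig = F.normalize(z_orig, dim=-1)          # (B, d) anchor representations
    z_pruned = F.normalize(z_pruned, dim=-1)      # (B, d) pruned-network representations
    logits = z_orig @ z_pruned.t() / temperature  # (B, B) cosine-similarity matrix
    labels = torch.arange(z_orig.size(0), device=z_orig.device)
    return F.cross_entropy(logits, labels)        # diagonal entries are the positives
```

In this sketch the cross-entropy over the similarity matrix pulls each sentence's two representations together while pushing apart representations of different sentences in the batch.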
Functions with singularities are notoriously difficult to approximate with conventional approximation schemes. In computational applications they are often resolved with low-order piecewise polynomials, multilevel schemes or other types of grading strategies. Rational functions are an exception to this rule: for univariate functions with point singularities, such as branch points, rational approximations exist with root-exponential convergence in the rational degree. This is typically enabled by the clustering of poles near the singularity. Both the theory and computational practice of rational functions for function approximation have focused on the univariate case, with extensions to two dimensions via identification with the complex plane. Multivariate rational functions, i.e., quotients of polynomials of several variables, are relatively unexplored in comparison. Yet, apart from a steep increase in theoretical complexity, they also offer a wealth of opportunities. A first observation is that singularities of multivariate rational functions may be continuous curves of poles, rather than isolated ones. By generalizing the clustering of poles from points to curves, we explore constructions of multivariate rational approximations to functions with curves of singularities.
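The univariate mechanism referred to here, clustering poles root-exponentially near a point singularity, can be illustrated with a simple least-squares fit; the following is a hedged sketch in the spirit of "lightning" approximation (all parameter choices are illustrative, and in practice the basis is orthogonalized to tame ill-conditioning):

```python
import numpy as np

# Approximate f(x) = sqrt(x) on [0, 1] by a rational function whose poles
# cluster root-exponentially at the branch point x = 0.
f = np.sqrt
n = 40                                        # number of poles
sigma = 4.0                                   # clustering parameter
j = np.arange(1, n + 1)
poles = -np.exp(-sigma * (np.sqrt(n) - np.sqrt(j)))   # poles on (-1, 0), dense near 0
x = np.linspace(1e-12, 1.0, 2000)             # sample points on [0, 1]
# Basis: partial fractions 1/(x - p_j) plus a low-degree polynomial part.
A = np.hstack([1.0 / (x[:, None] - poles[None, :]),
               x[:, None] ** np.arange(4)[None, :]])
c, *_ = np.linalg.lstsq(A, f(x), rcond=None)  # least-squares coefficients
print("max abs error:", np.abs(A @ c - f(x)).max())
```

Increasing $n$ shrinks the error roughly like $e^{-c\sqrt{n}}$, the root-exponential rate mentioned above; the multivariate constructions in the paper generalize this pole clustering from points to curves.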
Measurement-based quantum computation (MBQC) is a paradigm for quantum computation in which computation is driven by local measurements on a suitably entangled resource state. In this work we show that MBQC is related to a model of quantum computation based on Clifford quantum cellular automata (CQCA). Specifically, we show that certain MBQCs can be directly constructed from CQCAs, which yields a simple and intuitive circuit-model representation of MBQC in terms of quantum computation based on CQCAs. We apply this description to construct various MBQC-based Ans\"atze for parameterized quantum circuits, demonstrating that the different Ans\"atze may lead to significantly different performance on different learning tasks. In this way, MBQC yields a family of hardware-efficient Ans\"atze that may be adapted to specific problem settings and is particularly well suited to architectures with translationally invariant gates, such as neutral atoms.
While generalized linear mixed models (GLMMs) are a fundamental tool in applied statistics, many specifications -- such as those involving categorical factors with many levels or interaction terms -- can be computationally challenging to estimate due to the need to compute or approximate high-dimensional integrals. Variational inference (VI) methods are a popular way to perform such computations, especially in the Bayesian context. However, naive VI methods can provide unreliable uncertainty quantification. We show that this is indeed the case in the GLMM context, proving that standard VI (i.e., mean-field VI) dramatically underestimates posterior uncertainty in high dimensions. We then show how appropriately relaxing the mean-field assumption leads to VI methods whose uncertainty quantification does not deteriorate in high dimensions, and whose total computational cost scales linearly with the number of parameters and observations. Our theoretical and numerical results focus on GLMMs with Gaussian or binomial likelihoods, and rely on connections to random graph theory to obtain a sharp high-dimensional asymptotic analysis. We also provide generic results, which are of independent interest, relating the accuracy of variational inference to the convergence rate of the corresponding coordinate ascent variational inference (CAVI) algorithm for Gaussian targets. Our proposed partially-factorized VI (PF-VI) methodology for GLMMs is implemented in the R package vglmer, see //github.com/mgoplerud/vglmer . Numerical results with simulated and real data examples illustrate the favourable computation cost versus accuracy trade-off of PF-VI.
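The underestimation phenomenon is easy to see for a Gaussian target, where mean-field VI is available in closed form; the following is an illustrative sketch (the example matrix is ours, not the paper's), using the standard fact that the mean-field Gaussian approximation to $N(\mu, \Lambda^{-1})$ has marginal variances $1/\Lambda_{ii}$, while the true marginals are $(\Lambda^{-1})_{ii}$:

```python
import numpy as np

# Equicorrelated precision matrix: strong conditional dependence between
# coordinates, loosely analogous to crossed random effects in GLMMs.
d, rho = 50, 0.9
Lam = (1 - rho) * np.eye(d) + rho * np.ones((d, d))
true_var = np.diag(np.linalg.inv(Lam))   # true marginal variances, ~9.8 here
mf_var = 1.0 / np.diag(Lam)              # mean-field VI variances, exactly 1.0
print("true marginal variance:", true_var[0])
print("mean-field variance:   ", mf_var[0])   # severe underestimate
```

As the coupling grows, the ratio between the two variances can be made arbitrarily large, which is the kind of deterioration the partially-factorized PF-VI approach is designed to avoid.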
This study investigates the misclassification excess risk bound in the context of 1-bit matrix completion, a significant problem in machine learning involving the recovery of an unknown matrix from a limited subset of its entries. Matrix completion has garnered considerable attention in the last two decades due to its diverse applications across various fields. Unlike conventional approaches that deal with real-valued samples, 1-bit matrix completion is concerned with binary observations. While prior research has predominantly focused on the estimation error of proposed estimators, our study shifts attention to the prediction error. This paper offers a theoretical analysis of the prediction errors of two previous works based on the logistic regression model: one employing max-norm constrained minimization and the other nuclear-norm penalization. Significantly, our findings demonstrate that the latter achieves the minimax-optimal rate without the need for an additional logarithmic term. These novel results contribute to a deeper understanding of 1-bit matrix completion by shedding light on the predictive performance of specific methodologies.
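For concreteness, the standard 1-bit matrix completion setup with a logistic link, in notation we assume here rather than take from the paper: binary entries $Y_{ij} \in \{-1, +1\}$ are observed on a sampled index set $\Omega$ according to

```latex
\mathbb{P}\bigl( Y_{ij} = 1 \bigr) = \sigma\bigl( M^{*}_{ij} \bigr),
\qquad \sigma(t) = \frac{1}{1 + e^{-t}},
\qquad
\widehat{M} \in \arg\min_{M}\; \frac{1}{|\Omega|} \sum_{(i,j)\in\Omega}
   \log\bigl( 1 + e^{-Y_{ij} M_{ij}} \bigr) \;+\; \lambda\, \| M \|_{*},
```

where the second expression is the (schematic) nuclear-norm penalized estimator; the misclassification risk then compares the sign predictor $\mathrm{sign}(\widehat{M}_{ij})$ against the Bayes predictor $\mathrm{sign}(M^{*}_{ij})$.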
In spatial blind source separation the observed multivariate random fields are assumed to be mixtures of latent spatially dependent random fields. The objective is to recover the latent random fields by estimating the unmixing transformation. Currently, the algorithms for spatial blind source separation can only estimate linear unmixing transformations, and nonlinear blind source separation methods for spatial data are scarce. In this paper we extend an identifiable variational autoencoder, which can estimate nonlinear unmixing transformations, to spatially dependent data and demonstrate its performance for both stationary and nonstationary spatial data using simulations. In addition, we introduce scaled mean absolute Shapley additive explanations for interpreting the latent components through the nonlinear mixing transformation. The spatial identifiable variational autoencoder is applied to a geochemical dataset to find the latent random fields, which are then interpreted by using the scaled mean absolute Shapley additive explanations. Finally, we illustrate how the proposed method can be used as a pre-processing method when making multivariate predictions.
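In a hedged notation of our own choosing, the setting can be summarized as observing a multivariate random field $x(s)$ that is a nonlinear mixture of latent spatially dependent fields $z(s)$,

```latex
x(s) = f\bigl( z(s) \bigr), \qquad s \in \mathcal{S} \subset \mathbb{R}^{2},
```

where $f$ is an unknown, possibly nonlinear, mixing function; the goal is to estimate an unmixing map $g \approx f^{-1}$, whereas classical spatial blind source separation methods restrict $f$ to be linear.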
Infinitary and cyclic proof systems are proof systems for logical formulas with fixed-point operators or inductive definitions. A cyclic proof system is a restriction of the corresponding infinitary proof system, so the two systems need not coincide: the cyclic system may be weaker than the infinitary one. For several logics, the infinitary proof systems have been shown to be cut-free complete. For cyclic proof systems, however, many questions about (cut-free) completeness and the cut-elimination property remain open. In this study, we show that, for some propositional logics with fixed-point operators or inductive definitions, the infinitary and cyclic proof systems have the same provability, and that the cyclic proof systems are cut-free complete.
In this contribution we investigate the application of phase-field fracture models to non-linear multiscale computational homogenization schemes. In particular, we introduce separate phase fields on a two-scale problem and develop a thermodynamically consistent model. On the one hand, this allows for the prediction of local micro-fracture patterns, which effectively act as an anisotropic damage model on the macroscale. On the other hand, the macro-fracture phase-field model allows complex fracture patterns to be predicted with regard to local microstructures. Both phase fields are introduced in a common framework, such that a joint consistent linearization for the Newton-Raphson iteration can be developed. Finally, the limits of both models, as well as their applicability, are demonstrated in several numerical examples.
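For orientation, a widely used single-scale phase-field fracture energy (an AT2-type functional; this is a generic sketch, not necessarily the paper's exact two-scale formulation) reads

```latex
E(\mathbf{u}, d) = \int_{\Omega} (1 - d)^{2}\, \psi_{e}\bigl( \boldsymbol{\varepsilon}(\mathbf{u}) \bigr)\, \mathrm{d}V
  \;+\; G_{c} \int_{\Omega} \left( \frac{d^{2}}{2\ell} + \frac{\ell}{2}\, |\nabla d|^{2} \right) \mathrm{d}V,
```

with displacement $\mathbf{u}$, crack phase field $d \in [0,1]$, elastic energy density $\psi_e$, fracture toughness $G_c$, and regularization length $\ell$; in the two-scale setting considered here, such functionals are posed on both the macro- and the micro-domain within a common thermodynamically consistent framework.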