In this study, we explore the effects of including noise predictors and noise observations when fitting linear regression models. We present empirical and theoretical results showing that double descent occurs in both cases, albeit with contradictory implications: the implication for noise predictors is that complex models are often better than simple ones, while the implication for noise observations is that simple models are often better than complex ones. We resolve this contradiction by showing that it is not model complexity as such but rather the implicit shrinkage induced by including noise in the model that drives the double descent. Specifically, we show how noise predictors or observations shrink the estimators of the regression coefficients and drive the test error toward an asymptote, and then how the asymptotes of the test error and the ``condition number anomaly'' together ensure that double descent occurs. We also show that including noise observations in the model makes the (usually unbiased) ordinary least squares estimator biased and indicates that the ridge regression estimator may need a negative ridge parameter to avoid over-shrinkage.
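The shrinkage mechanism can be sketched numerically. The following Python snippet is a toy illustration, not the paper's experiments; all dimensions, seeds, and noise levels are arbitrary choices. It appends pure-noise predictors to a linear model and takes the minimum-norm least squares fit; the estimated coefficients of the true predictors come out smaller than under OLS on the true predictors alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 10, 400          # samples, true predictors, noise predictors
beta = np.ones(p)
X = rng.standard_normal((n, p))
y = X @ beta + 0.5 * rng.standard_normal(n)

# OLS on the true predictors alone (n > p, the classical regime)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Append pure-noise predictors and take the minimum-norm interpolator
Z = rng.standard_normal((n, k))
XZ = np.hstack([X, Z])
b_full = np.linalg.pinv(XZ) @ y        # minimum-norm least squares
b_signal = b_full[:p]                  # coefficients on the true predictors

print(np.linalg.norm(b_ols), np.linalg.norm(b_signal))  # the second is smaller
```

The overparameterized fit interpolates the data exactly, yet its signal coefficients are shrunk, which is the ridge-like effect the abstract describes.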
Optimal solutions of combinatorial optimization problems can be sensitive to changes in the cost of one or more elements of the ground set E. Single and set tolerances measure the supremum (or infimum) cost change for which the current solution remains optimal, for changes to a single element or to a subset of elements, respectively. The current definitions do not apply to all elements of E or to all subsets of E. In this work, we broaden the definition to all elements for single tolerances and to all subsets of elements for set tolerances, while proving that key theoretical and computational properties still apply.
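To make the notion of a single tolerance concrete, here is a small Python sketch (with a hypothetical graph, not an example from this work) that computes the upper tolerance of each minimum spanning tree edge by brute force: the largest cost increase for which the tree remains optimal equals the cost of the cheapest replacement edge across the cut minus the edge's own cost.

```python
# Hypothetical 4-node example: upper single tolerances of the edges of a
# minimum spanning tree, computed by brute force.
edges = {('a', 'b'): 1, ('b', 'c'): 2, ('a', 'c'): 4, ('c', 'd'): 3, ('b', 'd'): 7}

def mst(costs):
    """Kruskal's algorithm; returns the set of chosen edges."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x
    tree = set()
    for (u, v), c in sorted(costs.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.add((u, v))
    return tree

tree = mst(edges)

def upper_tolerance(e):
    """How much can the cost of tree edge e rise before the MST changes?"""
    comp = {}
    def root(x):
        while comp.get(x, x) != x:
            x = comp[x]
        return x
    for (u, v) in tree - {e}:          # components of the tree minus e
        comp[root(u)] = root(v)
    cut = [c for (u, v), c in edges.items()
           if (u, v) != e and root(u) != root(v)]
    # cheapest replacement edge across the cut, minus the current cost
    return (min(cut) - edges[e]) if cut else float('inf')

for e in sorted(tree):
    print(e, upper_tolerance(e))
```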
We present a new Krylov subspace recycling method for solving a linear system of equations, or a sequence of slowly changing linear systems. Our approach reduces the computational overhead of recycling techniques while still benefiting from the acceleration they afford. To this end, the method augments an unprojected Krylov subspace and combines randomized sketching with deflated restarting in a way that avoids orthogonalizing a full Krylov basis. We call this new method GMRES-SDR (sketched deflated restarting). Alongside the method, we provide new theory, which first characterizes unaugmented sketched GMRES as a projection method whose projectors involve the sketching operator. We demonstrate that sketched GMRES and its sibling method sketched FOM are an MR/OR pairing, just like GMRES and FOM, and we obtain residual convergence estimates. Building on this, we characterize GMRES-SDR in terms of sketching-based projectors as well. Compression of the augmented Krylov subspace for recycling is performed using a sketched version of harmonic Ritz vectors. We present results of numerical experiments demonstrating the effectiveness of GMRES-SDR over competitor methods such as GMRES-DR and GCRO-DR.
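To illustrate the sketch-and-solve idea in its simplest form, the following Python snippet implements a plain, unaugmented sketched GMRES (no deflation, recycling, or restarting): a Krylov basis is built with truncated orthogonalization, a random Gaussian sketch compresses the residual minimization, and the full basis is never orthogonalized. The test matrix, sketch size, and truncation depth are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, t = 300, 30, 2            # system size, Krylov dimension, orthogonalization depth
s = 4 * (m + 1)                 # sketch size

# A toy symmetric test system (stand-in for the paper's slowly changing systems)
D = np.diag(np.linspace(2.0, 4.0, n))
R = rng.standard_normal((n, n))
A = D + 0.05 * (R + R.T) / np.sqrt(n)
b = rng.standard_normal(n)

S = rng.standard_normal((s, n)) / np.sqrt(s)   # Gaussian sketching operator

# Truncated Arnoldi: orthogonalize only against the last t basis vectors,
# so a full Krylov basis is never orthogonalized.
V = np.zeros((n, m + 1))
V[:, 0] = b / np.linalg.norm(b)
for j in range(m):
    w = A @ V[:, j]
    for i in range(max(0, j - t + 1), j + 1):
        w -= (V[:, i] @ w) * V[:, i]
    V[:, j + 1] = w / np.linalg.norm(w)

# Sketched GMRES: minimize the *sketched* residual over the Krylov subspace
AV = A @ V[:, :m]
y, *_ = np.linalg.lstsq(S @ AV, S @ b, rcond=None)
x = V[:, :m] @ y
res = np.linalg.norm(b - A @ x)
print(res / np.linalg.norm(b))
```

The least-squares problem has only s rows instead of n, which is where the overhead reduction comes from; the sketch distorts the residual norm only by a modest factor.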
In this study, we address the challenge of solving elliptic equations with quasiperiodic coefficients. To achieve accurate and efficient computation, we introduce the projection method, which embeds quasiperiodic systems into higher-dimensional periodic systems. To enhance computational efficiency, we propose a compressed storage strategy for the stiffness matrix that exploits its multi-level block circulant structure, significantly reducing memory requirements. Furthermore, we design a diagonal preconditioner that reduces the condition number of the stiffness matrix, allowing the resulting high-dimensional linear system to be solved efficiently. These techniques collectively contribute to the computational effectiveness of our approach. Convergence analysis shows the spectral accuracy of the proposed method. We demonstrate the effectiveness and accuracy of our approach through a series of numerical examples. Moreover, we apply our method to achieve a highly accurate computation of the homogenized coefficients for a quasiperiodic multiscale elliptic equation.
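The one-level analogue of the compressed storage idea can be sketched as follows: a circulant matrix is fully determined by its first column, and its matrix-vector products reduce to circular convolutions computed with the FFT. (The paper's stiffness matrix is multi-level block circulant; this snippet shows only the one-level principle.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
c = rng.standard_normal(n)     # first column: all that needs to be stored

# Dense circulant matrix, built here only to verify the compressed product
C = np.empty((n, n))
for j in range(n):
    C[:, j] = np.roll(c, j)

x = rng.standard_normal(n)
# Circulant matvec as a circular convolution, via the FFT in O(n log n)
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

print(np.allclose(C @ x, y_fft))
```

Storage drops from n^2 entries to n, and each product costs O(n log n) instead of O(n^2); the multi-level block structure applies the same idea recursively.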
In this work, we develop Crank-Nicolson-type iterative decoupled algorithms for a three-field formulation of Biot's consolidation model using total pressure. We begin by constructing an equivalent fully implicit coupled algorithm using the standard Crank-Nicolson method for the three-field formulation of Biot's model. Employing an iterative decoupled scheme to decompose the resulting coupled system, we derive two distinct forms of Crank-Nicolson-type iterative decoupled algorithms based on the order of temporal computation and iteration: a time-stepping iterative decoupled algorithm and a global-in-time iterative decoupled algorithm. Notably, the proposed global-in-time algorithm is partially parallel-in-time. Capitalizing on the convergence properties of the iterative decoupled scheme, both algorithms exhibit second-order accuracy in time and unconditional stability. Through numerical experiments, we validate the theoretical predictions and demonstrate the effectiveness and efficiency of these novel approaches.
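The building block of both algorithms is the Crank-Nicolson scheme, whose second-order accuracy in time can be checked on a scalar test problem. This is a generic illustration, not a discretization of Biot's model:

```python
import numpy as np

def crank_nicolson(lam, y0, T, N):
    """Crank-Nicolson for y' = lam * y: average the right-hand side
    at the old and new time levels."""
    h = T / N
    g = (1 + 0.5 * h * lam) / (1 - 0.5 * h * lam)
    y = y0
    for _ in range(N):
        y = g * y
    return y

lam, T = -1.0, 1.0
exact = np.exp(lam * T)
e1 = abs(crank_nicolson(lam, 1.0, T, 50) - exact)
e2 = abs(crank_nicolson(lam, 1.0, T, 100) - exact)
print(e1 / e2)  # close to 4, as expected for a second-order method
```

Halving the time step divides the error by roughly four; the unconditional stability follows from |g| < 1 for every step size when lam < 0.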
Given a finite set of matrices with integer entries, the matrix mortality problem asks whether there exists a product of these matrices equal to the zero matrix. We consider a special case of this problem where all entries of the matrices are nonnegative. This case is equivalent to the NFA mortality problem, which, given an NFA, asks for a word $w$ such that the image of every state under $w$ is the empty set. The size of the alphabet of the NFA is then equal to the number of matrices in the set. We study the length of the shortest such words as a function of the size of the alphabet. We show that for an NFA with $n$ states this length can be at least $2^n - 1$ for an alphabet of size $n$, $2^{(n - 4)/2}$ for an alphabet of size $3$, and $2^{(n - 2)/3}$ for an alphabet of size $2$. We also discuss further open problems related to the mortality of NFAs and DFAs.
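A shortest mortal word can be found by breadth-first search over the tuple of images of all states, which is equivalent to searching for a zero product of the associated Boolean matrices. The following Python sketch uses a small hypothetical NFA, not an instance from this work:

```python
from collections import deque

# Hypothetical 3-state NFA over the alphabet {a, b}
delta = {
    'a': {0: {1}, 1: {2}, 2: set()},
    'b': {0: {0, 1}, 1: set(), 2: {0}},
}
states = [0, 1, 2]

def step(images, letter):
    """Apply one letter to the current image of every state."""
    return tuple(frozenset().union(*[delta[letter][q] for q in img])
                 for img in images)

start = tuple(frozenset([q]) for q in states)
dead = tuple(frozenset() for _ in states)   # every image empty: mortal

# Breadth-first search for a shortest mortal word
seen, queue, word = {start}, deque([(start, '')]), None
while queue:
    cur, w = queue.popleft()
    if cur == dead:
        word = w
        break
    for letter in delta:
        nxt = step(cur, letter)
        if nxt not in seen:
            seen.add(nxt)
            queue.append((nxt, w + letter))

print(word)  # a shortest word sending every state's image to the empty set
```

The search space is exponential in $n$, which is consistent with the exponential lower bounds on shortest mortal words stated above.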
Symbolic systems are powerful frameworks for modeling cognitive processes as they encapsulate the rules and relationships fundamental to many aspects of human reasoning and behavior. Central to these models are systematicity, compositionality, and productivity, making them invaluable in both cognitive science and artificial intelligence. However, certain limitations remain. For instance, the integration of structured symbolic processes and latent sub-symbolic processes has been implemented at the computational level through fiat methods such as quantization or softmax sampling, which assume, rather than derive, the operations underpinning discretization and symbolicization. In this work, we introduce a novel neural stochastic dynamical systems model that integrates attractor dynamics with symbolic representations to model cognitive processes akin to the probabilistic language of thought (PLoT). Our model segments the continuous representational space into discrete basins whose attractor states correspond to symbolic sequences, and it does so through unsupervised learning rather than by relying on pre-defined primitives, thereby reflecting the semanticity and compositionality characteristic of symbolic systems. Moreover, like PLoT, our model learns to sample a diverse distribution of attractor states that reflects the mutual information between the input data and the symbolic encodings. This approach establishes a unified framework that integrates both symbolic and sub-symbolic processing through neural dynamics, a neuro-plausible substrate with proven expressivity in AI, offering a more comprehensive model that mirrors the complex duality of cognitive operations.
Spatial variables can be observed in many different forms, such as regularly sampled random fields (lattice data), point processes, and randomly sampled spatial processes. Joint analysis of such collections of observations is clearly desirable, but is complicated by the lack of an easily implementable analysis framework. It is well known that Fourier transforms provide such a framework, but its precise form has eluded data analysts. We formalize it by providing a multitaper analysis framework that uses coupled discrete and continuous data tapers, combined with the discrete Fourier transform for inference. This set of tools is important because it forms the backbone of practical spectral analysis. In higher dimensions it is important not to be constrained to Cartesian product domains, and so we develop the methodology for spectral analysis using irregular-domain data tapers and the tapered discrete Fourier transform. We discuss their fast implementation and their asymptotic as well as large finite-domain properties. Estimators of partial association between different spatial processes are provided, as are principled methods to determine their significance, and we demonstrate their practical utility on a large-scale ecological dataset.
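As a one-dimensional illustration of the multitaper principle, the following snippet averages tapered periodograms over an orthonormal family of sine tapers (a simple stand-in for the coupled discrete and continuous tapers developed here) and recovers the frequency of a noisy sinusoid:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 1024, 8                      # series length, number of tapers
t = np.arange(N)
x = np.sin(2 * np.pi * 0.2 * t) + 0.5 * rng.standard_normal(N)

# Orthonormal sine tapers (a simple taper family; the paper's coupled
# discrete/continuous tapers for irregular domains are more general)
tapers = np.array([np.sqrt(2 / (N + 1)) * np.sin(np.pi * (k + 1) * (t + 1) / (N + 1))
                   for k in range(K)])

# Multitaper estimate: average the K tapered periodograms
eigspecs = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
S = eigspecs.mean(axis=0)

freqs = np.fft.rfftfreq(N)
print(freqs[np.argmax(S)])  # close to the true frequency 0.2
```

Averaging over several orthogonal tapers reduces the variance of the spectral estimate relative to a single periodogram, which is the core motivation for multitapering.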
In this paper, we study backward stochastic differential equations driven by G-Brownian motion (G-BSDEs) under the condition that the generator is time-varying Lipschitz continuous with respect to y and time-varying uniformly continuous with respect to z. With the help of the linearization method and G-stochastic analysis techniques, we construct approximating sequences of G-BSDEs and obtain some precise a priori estimates. By combining these with the approximation method, we prove the existence and uniqueness of the solution under the time-varying conditions, as well as a comparison theorem.
A probabilistic fatigue lifetime model is developed in conjunction with a multi-scale method for structures with pores whose exact distribution, i.e. geometries and locations, is unknown. The model accounts for uncertainty in fatigue lifetimes due to defects at two scales: micro-scale heterogeneity and meso-scale pores. An element-wise probabilistic strain-life model, with its criterion modified for multiaxial loading, captures the effect of micro-scale defects on the fatigue lifetime. The effect of meso-scale pores is captured via statistical modelling of the expected pore populations within a finite element method, based on tomographic scans of a small region of the porous material used to make the structure. A previously implemented Neuber-type plastic correction algorithm is used for fast full-field approximation of the strain-life criterion around the statistically generated pore fields. The probability of failure of a porous structure is obtained via a weakest-link assumption at the level of its constituent finite elements. The fatigue model can be identified via a maximum likelihood estimate on experimental fatigue data of structures containing different types of pore populations. The proposed method is tested on an existing data set for an aluminium alloy with two levels of porosity. The model requires less data for identification than traditional models that treat porous media as a homogeneous material, since the same base material is used for the two grades of porous material. Numerical studies on synthetically generated data show that the model captures the statistical size effect in fatigue, and demonstrate that the fatigue properties of subsurface porous material are lower than those of core porous material, which makes homogenisation of the model non-trivial.
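The weakest-link step admits a compact illustration: if each finite element fails independently with some probability, the structure's survival probability is the product of the elements' survival probabilities. The per-element probabilities below are hypothetical placeholders, not outputs of the strain-life model:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-element failure probabilities at a given lifetime,
# e.g. as produced by an element-wise probabilistic strain-life criterion
p_elem = rng.uniform(0.0, 0.01, size=500)

# Weakest link: the structure survives only if every element survives
p_fail = 1.0 - np.prod(1.0 - p_elem)
# Numerically safer form for many small probabilities
p_fail_log = 1.0 - np.exp(np.sum(np.log1p(-p_elem)))

# Statistical size effect: a structure with fewer elements is more reliable
p_half = 1.0 - np.prod(1.0 - p_elem[:250])

print(p_fail, p_half)
```

The monotone growth of the failure probability with the number of elements is the statistical size effect mentioned above.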
It is no secret that statistical modelling often involves making simplifying assumptions when attempting to study complex stochastic phenomena. Spatial modelling of extreme values is no exception, with one of the most common such assumptions being stationarity in the marginal and/or dependence features. If non-stationarity has been detected in the marginal distributions, it is tempting to try to model this while assuming stationarity in the dependence, without necessarily putting this latter assumption through thorough testing. However, margins and dependence are often intricately connected and the detection of non-stationarity in one feature might affect the detection of non-stationarity in the other. This work is an in-depth case study of this interrelationship, with a particular focus on a spatio-temporal environmental application exhibiting well-documented marginal non-stationarity. Specifically, we compare and contrast four different marginal detrending approaches in terms of our post-detrending ability to detect temporal non-stationarity in the spatial extremal dependence structure of a sea surface temperature dataset from the Red Sea.
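As a generic illustration of marginal detrending (not one of the four approaches compared here, and on synthetic data rather than the Red Sea dataset), one can remove a fitted trend and then use a rank-based probability integral transform so that dependence features can be examined on approximately uniform margins:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
time = np.arange(n)
# Synthetic series with a marginal trend (standing in for non-stationary margins)
x = 0.002 * time + rng.standard_normal(n)

# One simple marginal detrending approach: remove a least-squares linear trend
coef = np.polyfit(time, x, 1)
resid = x - np.polyval(coef, time)

# Rank-based probability integral transform to approximately uniform margins,
# after which the extremal dependence structure can be studied separately
ranks = np.argsort(np.argsort(resid)) + 1
u = ranks / (n + 1)
```

The point of the case study is precisely that the choice made in the detrending step can alter what non-stationarity is subsequently detected in the dependence.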