A first proposal of a sparse and cellwise robust PCA method is presented. Robustness to single outlying cells in the data matrix is achieved by substituting the squared loss function for the approximation error by a robust version. The integration of a sparsity-inducing $L_1$ or elastic net penalty offers additional modeling flexibility. For the resulting challenging optimization problem, an algorithm based on Riemannian stochastic gradient descent is developed, with the advantage of being scalable to high-dimensional data, both in terms of many variables as well as observations. The resulting method is called SCRAMBLE (Sparse Cellwise Robust Algorithm for Manifold-based Learning and Estimation). Simulations reveal the superiority of this approach in comparison to established methods, both in the casewise and cellwise robustness paradigms. Two applications from the field of tribology underline the advantages of a cellwise robust and sparse PCA method.
This paper presents a novel method using conformal prediction to generate two-sided or one-sided prediction intervals for survival times. Specifically, the method provides both lower and upper predictive bounds for individuals deemed sufficiently similar to the non-censored population, while returning only a lower bound for others. The prediction intervals offer finite-sample coverage guarantees, requiring no distributional assumptions other than the sampled data points are independent and identically distributed. The performance of the procedure is assessed using both synthetic and real-world datasets.
Methods for analyzing representations in neural systems are increasingly popular tools in neuroscience and mechanistic interpretability. Measures comparing neural activations across conditions, architectures, and species give scalable ways to understand information transformation within different neural networks. However, recent findings show that some metrics respond to spurious signals, leading to misleading results. Establishing benchmark test cases is thus essential for identifying the most reliable metric and potential improvements. We propose that compositional learning in recurrent neural networks (RNNs) can provide a test case for dynamical representation alignment metrics. Implementing this case allows us to evaluate if metrics can identify representations that develop throughout learning and determine if representations identified by metrics reflect the network's actual computations. Building both attractor and RNN based test cases, we show that the recently proposed Dynamical Similarity Analysis (DSA) is more noise robust and reliably identifies behaviorally relevant representations compared to prior metrics (Procrustes, CKA). We also demonstrate how such test cases can extend beyond metric evaluation to study new architectures. Specifically, testing DSA in modern (Mamba) state space models suggests that these models, unlike RNNs, may not require changes in recurrent dynamics due to their expressive hidden states. Overall, we develop test cases that showcase how DSA's enhanced ability to detect dynamical motifs makes it highly effective for identifying ongoing computations in RNNs and revealing how networks learn tasks.
Mass scaling is widely used in finite element models of structural dynamics for increasing the critical time step of explicit time integration methods. While the field has been flourishing over the years, it still lacks a strong theoretical basis and mostly relies on numerical experiments as the only means of assessment. This contribution thoroughly reviews existing methods and connects them to established linear algebra results to derive rigorous eigenvalue bounds and condition number estimates. Our results cover some of the most successful mass scaling techniques, unraveling for the first time well-known numerical observations.
We propose an implementable, neural network-based structure preserving probabilistic numerical approximation for a generalized obstacle problem describing the value of a zero-sum differential game of optimal stopping with asymmetric information. The target solution depends on three variables: the time, the spatial (or state) variable, and a variable from a standard $(I-1)$-simplex which represents the probabilities with which the $I$ possible configurations of the game are played. The proposed numerical approximation preserves the convexity of the continuous solution as well as the lower and upper obstacle bounds. We show convergence of the fully-discrete scheme to the unique viscosity solution of the continuous problem and present a range of numerical studies to demonstrate its applicability.
In cluster-randomized experiments, there is emerging interest in exploring the causal mechanism in which a cluster-level treatment affects the outcome through an intermediate outcome. Despite an extensive development of causal mediation methods in the past decade, only a few exceptions have been considered in assessing causal mediation in cluster-randomized studies, all of which depend on parametric model-based estimators. In this article, we develop the formal semiparametric efficiency theory to motivate several doubly-robust methods for addressing several mediation effect estimands corresponding to both the cluster-average and the individual-level treatment effects in cluster-randomized experiments -- the natural indirect effect, natural direct effect, and spillover mediation effect. We derive the efficient influence function for each mediation effect, and carefully parameterize each efficient influence function to motivate practical strategies for operationalizing each estimator. We consider both parametric working models and data-adaptive machine learners to estimate the nuisance functions, and obtain semiparametric efficient causal mediation estimators in the latter case. Our methods are illustrated via extensive simulations and two completed cluster-randomized experiments.
We propose a new second-order accurate lattice Boltzmann formulation for linear elastodynamics that is stable for arbitrary combinations of material parameters under a CFL-like condition. The construction of the numerical scheme uses an equivalent first-order hyperbolic system of equations as an intermediate step, for which a vectorial lattice Boltzmann formulation is introduced. The only difference to conventional lattice Boltzmann formulations is the usage of vector-valued populations, so that all computational benefits of the algorithm are preserved. Using the asymptotic expansion technique and the notion of pre-stability structures we further establish second-order consistency as well as analytical stability estimates. Lastly, we introduce a second-order consistent initialization of the populations as well as a boundary formulation for Dirichlet boundary conditions on 2D rectangular domains. All theoretical derivations are numerically verified by convergence studies using manufactured solutions and long-term stability tests.
In this work we revisit the arithmetic and bit complexity of Hermitian eigenproblems. We first provide an analysis for the divide-and-conquer tridiagonal eigensolver of Gu and Eisenstat [GE95] in the Real RAM model, when accelerated with the Fast Multipole Method. The analysis asserts the claimed nearly-$O(n^2)$ complexity to compute a full diagonalization of a symmetric tridiagonal matrix. Combined with the tridiagonal reduction algorithm of Sch\"onhage [Sch72], it implies that a Hermitian matrix can be diagonalized deterministically in $O(n^{\omega}\log(n)+n^2\mathrm{polylog}(n/\epsilon))$ arithmetic operations, where $\omega\lesssim 2.371$ is the square matrix multiplication exponent. This improves the classic deterministic $O(n^3)$ diagonalization algorithms, and derandomizes the $ O(n^{\omega}\log^2(n/\epsilon))$ algorithm of [BGVKS, FOCS '20]. Ultimately, this has a direct application to the SVD, which is widely used as a subroutine in advanced algorithms, but its complexity and approximation guarantees are often unspecified. In finite precision, we show that Sch\"onhage's algorithm is stable in floating point using $O(\log(n/\epsilon))$ bits. Combined with the (rational arithmetic) algorithm of Bini and Pan [BP91], it provides a deterministic algorithm to compute all the eigenvalues of a Hermitian matrix in $O\left(n^{\omega}F\left(\log(n/\epsilon)\right)+n^2\mathrm{polylog}(n/\epsilon)\right)$ bit operations, where $F(b)\in\widetilde{O}(b)$ is the bit complexity of a single floating point operation on $b$ bits. This improves the best known $\widetilde{O}(n^3)$ deterministic and $O\left( n^{\omega}\log^2(n/\epsilon)F\left(\log^4(n/\epsilon)\log(n)\right)\right)$ randomized complexities. We conclude with some other useful subroutines such as computing spectral gaps, condition numbers, and spectral projectors, and few open problems.
Ridge functions are used to describe and study the lower bound of the approximation done by the neural networks which can be written as a linear combination of activation functions. If the activation functions are also ridge functions, these networks are called explainable neural networks. In this brief paper, we first show that quantum neural networks which are based on variational quantum circuits can be written as a linear combination of ridge functions by following matrix notations. Consequently, we show that the interpretability and explainability of such quantum neural networks can be directly considered and studied as an approximation with the linear combination of ridge functions.
Matrix perturbation bounds (such as Weyl and Davis-Kahan) are frequently used in many branches of mathematics. Most of the classical results in this area are optimal, in the worst case analysis. However, in modern applications, both the ground and the nose matrices frequently have extra structural properties. For instance, it is often assumed that the ground matrix is essentially low rank, and the nose matrix is random or pseudo-random. We aim to rebuild a part of perturbation theory, adapting to these modern assumptions. We will do this using a contour expansion argument, which enables us to exploit the skewness among the leading eigenvectors of the ground and the noise matrix (which is significant when the two are uncorrelated) to our advantage. In the current paper, we focus on the perturbation of eigenspaces. This helps us to introduce the arguments in the cleanest way, avoiding the more technical consideration of the general case. In applications, this case is also one of the most useful. More general results appear in a subsequent paper. Our method has led to several improvements, which have direct applications in central problems. Among others, we derive a sharp result for perturbation of a low rank matrix with random perturbation, answering an open question in this area. Next, we derive new results concerning the spike model, an important model in statistics, bridging two different directions of current research. Finally, we use our results on the perturbation of eigenspaces to derive new results concerning eigenvalues of deterministic and random matrices. In particular, we obtain new results concerning the outliers in the deformed Wigner model and the least singular value of random matrices with non-zero mean.
Coupled systems of free flow and porous media arise in a variety of technical and environmental applications. For laminar flow regimes, such systems are described by the Stokes equations in the free-flow region and Darcy's law in the porous medium. An appropriate set of coupling conditions is needed on the fluid-porous interface. Discretisations of the Stokes-Darcy problems yield large, sparse, ill-conditioned, and, depending on the interface conditions, non-symmetric linear systems. Therefore, robust and efficient preconditioners are needed to accelerate convergence of the applied Krylov method. In this work, we consider the second order MAC scheme for the coupled Stokes-Darcy problems and develop and investigate block diagonal, block triangular and constraint preconditioners. We apply two classical sets of coupling conditions considering the Beavers-Joseph and the Beavers-Joseph-Saffman condition for the tangential velocity. For the Beavers-Joseph interface condition, the resulting system is non-symmetric, therefore GMRES method is used for both cases. Spectral analysis is conducted for the exact versions of the preconditioners identifying clusters and bounds. Furthermore, for practical use we develop efficient inexact versions of the preconditioners. We demonstrate effectiveness and robustness of the proposed preconditioners in numerical experiments.