PCA-Net is a recently proposed neural operator architecture which combines principal component analysis (PCA) with neural networks to approximate operators between infinite-dimensional function spaces. The present work develops approximation theory for this approach, improving and significantly extending previous work in this direction: First, a novel universal approximation result is derived, under minimal assumptions on the underlying operator and the data-generating distribution. Then, two potential obstacles to efficient operator learning with PCA-Net are identified, and made precise through lower complexity bounds; the first relates to the complexity of the output distribution, measured by a slow decay of the PCA eigenvalues. The other obstacle relates to the inherent complexity of the space of operators between infinite-dimensional input and output spaces, resulting in a rigorous and quantifiable statement of the curse of dimensionality. In addition to these lower bounds, upper complexity bounds are derived. A suitable smoothness criterion is shown to ensure an algebraic decay of the PCA eigenvalues. Furthermore, it is shown that PCA-Net can overcome the general curse of dimensionality for specific operators of interest, arising from the Darcy flow and the Navier-Stokes equations.
Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks, due to their event-driven spiking computation. However, state-of-the-art deep SNNs (including Spikformer and SEW ResNet) suffer from non-spike computations (integer-float multiplications) caused by the structure of their residual connection. These non-spike computations increase SNNs' power consumption and make them unsuitable for deployment on mainstream neuromorphic hardware, which only supports spike operations. In this paper, we propose a hardware-friendly spike-driven residual learning architecture for SNNs to avoid non-spike computations. Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network. We evaluate Spikingformer on ImageNet, CIFAR10, CIFAR100, CIFAR10-DVS and DVS128 Gesture datasets, and demonstrate that Spikingformer outperforms the state-of-the-art in directly trained pure SNNs as a novel advanced backbone (75.85$\%$ top-1 accuracy on ImageNet, + 1.04$\%$ compared with Spikformer). Furthermore, our experiments verify that Spikingformer effectively avoids non-spike computations and significantly reduces energy consumption by 57.34$\%$ compared with Spikformer on ImageNet. To our best knowledge, this is the first time that a pure event-driven transformer-based SNN has been developed.
As we progress towards physical implementation of quantum algorithms it is vital to determine the explicit resource costs needed to run them. Solving linear systems of equations is a fundamental problem with a wide variety of applications across many fields of science, and there is increasing effort to develop quantum linear solver algorithms. Here we introduce a quantum linear solver algorithm combining ideas from adiabatic quantum computing with filtering techniques based on quantum signal processing. We give a closed formula for the non-asymptotic query complexity $Q$ -- the exact number of calls to a block-encoding of the linear system matrix -- as a function of condition number $\kappa$, error tolerance $\epsilon$ and block-encoding scaling factor $\alpha$. Our protocol reduces the cost of quantum linear solvers over state-of-the-art close to an order of magnitude for early implementations. The asymptotic scaling is $O(\kappa \log(\kappa/\epsilon))$, slightly looser than the $O(\kappa \log(1/\epsilon))$ scaling of the asymptotically optimal algorithm of Costa et al. However, our algorithm outperforms the latter for all condition numbers up to $\kappa \approx 10^{32}$, at which point $Q$ is comparably large, and both algorithms are anyway practically unfeasible. The present optimized analysis is both problem-agnostic and architecture-agnostic, and hence can be deployed in any quantum algorithm that uses linear solvers as a subroutine.
A multiplicity queue is a concurrently-defined data type which relaxes the conditions of a linearizable FIFO queue to allow concurrent Dequeue instances to return the same value. It would seem that this should allow faster implementations, as processes should not need to wait as long to learn about concurrent operations at remote processes and previous work has shown that multiplicity queues are computationally less complex than the unrelaxed version. Intriguingly, recent work has shown that there is, in fact, not much speedup possible versus an unrelaxed queue implementation. Seeking to understand this difference between intuition and real behavior, we extend that work, increasing the lower bound for uniform algorithms. Further, we outline a path forward toward building proofs for even higher lower bounds, allowing us to hypothesize that the worst-case time to Dequeue approaches maximum message delay, which is similar to the time required for an unrelaxed Dequeue. We also give an upper bound for a special case to show that our bounds are tight at that point. To achieve our lower bounds, we use extended shifting arguments, which have been rarely used but allow larger lower bounds than traditional shifting arguments. We use these in series of inductive indistinguishability proofs which allow us to extend our proofs beyond the usual limitations of shifting arguments. This proof structure is an interesting contribution independently of the main result, as developing new lower bound proof techniques may have many uses in future work.
For solving linear inverse problems, particularly of the type that appear in tomographic imaging and compressive sensing, this paper develops two new approaches. The first approach is an iterative algorithm that minimizers a regularized least squares objective function where the regularization is based on a compound Gaussian prior distribution. The Compound Gaussian prior subsumes many of the commonly used priors in image reconstruction, including those of sparsity-based approaches. The developed iterative algorithm gives rise to the paper's second new approach, which is a deep neural network that corresponds to an "unrolling" or "unfolding" of the iterative algorithm. Unrolled deep neural networks have interpretable layers and outperform standard deep learning methods. This paper includes a detailed computational theory that provides insight into the construction and performance of both algorithms. The conclusion is that both algorithms outperform other state-of-the-art approaches to tomographic image formation and compressive sensing, especially in the difficult regime of low training.
Matrix recovery from sparse observations is an extensively studied topic emerging in various applications, such as recommendation system and signal processing, which includes the matrix completion and compressed sensing models as special cases. In this work we propose a general framework for dynamic matrix recovery of low-rank matrices that evolve smoothly over time. We start from the setting that the observations are independent across time, then extend to the setting that both the design matrix and noise possess certain temporal correlation via modified concentration inequalities. By pooling neighboring observations, we obtain sharp estimation error bounds of both settings, showing the influence of the underlying smoothness, the dependence and effective samples. We propose a dynamic fast iterative shrinkage thresholding algorithm that is computationally efficient, and characterize the interplay between algorithmic and statistical convergence. Simulated and real data examples are provided to support such findings.
Extended Dynamic Mode Decomposition (EDMD) is a data-driven tool for forecasting and model reduction of dynamics, which has been extensively taken up in the physical sciences. While the method is conceptually simple, in deterministic chaos it is unclear what its properties are or even what it converges to. In particular, it is not clear how EDMD's least-squares approximation treats the classes of regular functions needed to make sense of chaotic dynamics. In this paper we develop a general, rigorous theory of EDMD on the simplest examples of chaotic maps: analytic expanding maps of the circle. To do this, we prove a new result in the theory of orthogonal polynomials on the unit circle (OPUC) and apply methods from transfer operator theory. We show that in the infinite-data limit, the least-squares projection is exponentially efficient for trigonometric polynomial observable dictionaries. As a result, we show that the forecasts and Koopman spectral data produced using EDMD in this setting converge to the physically meaningful limits, exponentially quickly in the size of the dictionary. This demonstrates that with only a relatively small polynomial dictionary, EDMD can be very effective, even when the sampling measure is not uniform. Furthermore, our OPUC result suggests that data-based least-squares projections may be a very effective approximation strategy.
Polynomial zonotopes, a non-convex set representation, have a wide range of applications from real-time motion planning and control in robotics, to reachability analysis of nonlinear systems and safety shielding in reinforcement learning.Despite this widespread use, a frequently overlooked difficulty associated with polynomial zonotopes is intersection checking. Determining whether the reachable set, represented as a polynomial zonotope, intersects an unsafe set is not straightforward.In fact, we show that this fundamental operation is NP-hard, even for a simple class of polynomial zonotopes.The standard method for intersection checking with polynomial zonotopes is a two-part algorithm that overapproximates a polynomial zonotope with a regular zonotope and then, if the overapproximation error is deemed too large, splits the set and recursively tries again.Beyond the possible need for a large number of splits, we identify two sources of concern related to this algorithm: (1) overapproximating a polynomial zonotope with a zonotope has unbounded error, and (2) after splitting a polynomial zonotope, the overapproximation error can actually increase.Taken together, this implies there may be a possibility that the algorithm does not always terminate.We perform a rigorous analysis of the method and detail necessary conditions for the union of overapproximations to provably converge to the original polynomial zonotope.
This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially observable problem in the landmark state to Markov decision process (MDP), a differentiable field of view to shape the reward, and an attention-based neural network to represent the control policy. The approach is further unified with active volumetric mapping to promote exploration in addition to landmark localization. The performance is demonstrated in several simulated landmark localization tasks in comparison with benchmark methods.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph related tasks. However, existing GNN models mainly focus on designing graph convolution operations. The graph pooling (or downsampling) operations, that play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of graph's topological information, we further introduce a structure learning mechanism to learn a refined graph structure for the pooled graph at each layer. By combining HGP-SL operator with graph neural networks, we perform graph level representation learning with focus on graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.