In this paper, we study a general low-rank matrix recovery problem with linear measurements corrupted by some noise. The objective is to understand under what conditions on the restricted isometry property (RIP) of the problem local search methods can find the ground truth with a small error. By analyzing the landscape of the non-convex problem, we first propose a global guarantee on the maximum distance between an arbitrary local minimizer and the ground truth under the assumption that the RIP constant is smaller than $1/2$. We show that this distance shrinks to zero as the intensity of the noise reduces. Our new guarantee is sharp in terms of the RIP constant and is much stronger than the existing results. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. Next, we prove the strict saddle property, which guarantees the global convergence of the perturbed gradient descent method in polynomial time. The developed results demonstrate how the noise intensity and the RIP constant of the problem affect the landscape of the problem.
In this work, we develop a novel efficient quadrature and sparse grid based polynomial interpolation method to price American options with multiple underlying assets. The approach is based on first formulating the pricing of American options using dynamic programming, and then employing static sparse grids to interpolate the continuation value function at each time step. To achieve high efficiency, we first transform the domain from $\mathbb{R}^d$ to $(-1,1)^d$ via a scaled tanh map, and then remove the boundary singularity of the resulting multivariate function over $(-1,1)^d$ by a bubble function and simultaneously, to significantly reduce the number of interpolation points. We rigorously establish that with a proper choice of the bubble function, the resulting function has bounded mixed derivatives up to a certain order, which provides theoretical underpinnings for the use of sparse grids. Numerical experiments for American arithmetic and geometric basket put options with the number of underlying assets up to 16 are presented to validate the effectiveness of the approach.
In this paper, we consider a learning problem among non-cooperative agents interacting in a time-varying system. Specifically, we focus on repeated linear quadratic network games, in which the network of interactions changes with time and agents may not be present at each iteration. To get tractability, we assume that at each iteration, the network of interactions is sampled from an underlying random network model and agents participate at random with a given probability. Under these assumptions, we consider a gradient-based learning algorithm and establish almost sure convergence of the agents' strategies to the Nash equilibrium of the game played over the expected network. Additionally, we prove, in the large population regime, that the learned strategy is an $\epsilon$-Nash equilibrium for each stage game with high probability. We validate our results over an online market application.
Tensor network techniques, known for their low-rank approximation ability that breaks the curse of dimensionality, are emerging as a foundation of new mathematical methods for ultra-fast numerical solutions of high-dimensional Partial Differential Equations (PDEs). Here, we present a mixed Tensor Train (TT)/Quantized Tensor Train (QTT) approach for the numerical solution of time-independent Boltzmann Neutron Transport equations (BNTEs) in Cartesian geometry. Discretizing a realistic three-dimensional (3D) BNTE by (i) diamond differencing, (ii) multigroup-in-energy, and (iii) discrete ordinate collocation leads to huge generalized eigenvalue problems that generally require a matrix-free approach and large computer clusters. Starting from this discretization, we construct a TT representation of the PDE fields and discrete operators, followed by a QTT representation of the TT cores and solving the tensorized generalized eigenvalue problem in a fixed-point scheme with tensor network optimization techniques. We validate our approach by applying it to two realistic examples of 3D neutron transport problems, currently solved by the PARallel TIme-dependent SN (PARTISN) solver. We demonstrate that our TT/QTT method, executed on a standard desktop computer, leads to a yottabyte compression of the memory storage, and more than 7500 times speedup with a discrepancy of less than 1e-5 when compared to the PARTISN solution.
Dexterous manipulation of objects once held in hand remains a challenge. Such skills are, however, necessary for robotics to move beyond gripper-based manipulation and use all the dexterity offered by anthropomorphic robotic hands. One major challenge when manipulating an object within the hand is that fingers must move around the object while avoiding collision with other fingers or the object. Such collision-free paths must be computed in real-time, as the smallest deviation from the original plan can easily lead to collisions. We present a real-time approach to computing collision-free paths in a high-dimensional space. To guide the exploration, we learn an explicit representation of the free space, retrievable in real-time. We further combine this representation with closed-loop control via dynamical systems and sampling-based motion planning and show that the combination increases performance compared to alternatives, offering efficient search of feasible paths and real-time obstacle avoidance in a multi-fingered robotic hand.
In this work, we study minimum data rate tracking of a dynamical system under a neuromorphic event-based sensing paradigm. We begin by bridging the gap between continuous-time (CT) system dynamics and information theory's causal rate distortion theory. We motivate the use of non-singular source codes to quantify bitrates in event-based sampling schemes. This permits an analysis of minimum bitrate event-based tracking using tools already established in the control and information theory literature. We derive novel, nontrivial lower bounds to event-based sensing, and compare the lower bound with the performance of well-known schemes in the established literature.
In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when missing-modality occurs either during training or testing in real-world situations; and 2) when the computation resources are not available to finetune on heavy transformer models. To this end, we propose to utilize prompt learning and mitigate the above two challenges together. Specifically, our modality-missing-aware prompts can be plugged into multimodal transformers to handle general missing-modality cases, while only requiring less than 1% learnable parameters compared to training the entire model. We further explore the effect of different prompt configurations and analyze the robustness to missing modality. Extensive experiments are conducted to show the effectiveness of our prompt learning framework that improves the performance under various missing-modality cases, while alleviating the requirement of heavy model re-training. Code is available.
In this paper, we proposed to apply meta learning approach for low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on unseen target language, via recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, since MAML's model-agnostic property, this paper also opens new research direction of applying meta learning to more speech-related applications.
Video anomaly detection under weak labels is formulated as a typical multiple-instance learning problem in previous works. In this paper, we provide a new perspective, i.e., a supervised learning task under noisy labels. In such a viewpoint, as long as cleaning away label noise, we can directly apply fully supervised action classifiers to weakly supervised anomaly detection, and take maximum advantage of these well-developed classifiers. For this purpose, we devise a graph convolutional network to correct noisy labels. Based upon feature similarity and temporal consistency, our network propagates supervisory signals from high-confidence snippets to low-confidence ones. In this manner, the network is capable of providing cleaned supervision for action classifiers. During the test phase, we only need to obtain snippet-wise predictions from the action classifier without any extra post-processing. Extensive experiments on 3 datasets at different scales with 2 types of action classifiers demonstrate the efficacy of our method. Remarkably, we obtain the frame-level AUC score of 82.12% on UCF-Crime.
The potential of graph convolutional neural networks for the task of zero-shot learning has been demonstrated recently. These models are highly sample efficient as related concepts in the graph structure share statistical strength allowing generalization to new classes when faced with a lack of data. However, knowledge from distant nodes can get diluted when propagating through intermediate nodes, because current approaches to zero-shot learning use graph propagation schemes that perform Laplacian smoothing at each layer. We show that extensive smoothing does not help the task of regressing classifier weights in zero-shot learning. In order to still incorporate information from distant nodes and utilize the graph structure, we propose an Attentive Dense Graph Propagation Module (ADGPM). ADGPM allows us to exploit the hierarchical graph structure of the knowledge graph through additional connections. These connections are added based on a node's relationship to its ancestors and descendants and an attention scheme is further used to weigh their contribution depending on the distance to the node. Finally, we illustrate that finetuning of the feature representation after training the ADGPM leads to considerable improvements. Our method achieves competitive results, outperforming previous zero-shot learning approaches.
With the rapid growth of knowledge bases (KBs), question answering over knowledge base, a.k.a. KBQA has drawn huge attention in recent years. Most of the existing KBQA methods follow so called encoder-compare framework. They map the question and the KB facts to a common embedding space, in which the similarity between the question vector and the fact vectors can be conveniently computed. This, however, inevitably loses original words interaction information. To preserve more original information, we propose an attentive recurrent neural network with similarity matrix based convolutional neural network (AR-SMCNN) model, which is able to capture comprehensive hierarchical information utilizing the advantages of both RNN and CNN. We use RNN to capture semantic-level correlation by its sequential modeling nature, and use an attention mechanism to keep track of the entities and relations simultaneously. Meanwhile, we use a similarity matrix based CNN with two-directions pooling to extract literal-level words interaction matching utilizing CNNs strength of modeling spatial correlation among data. Moreover, we have developed a new heuristic extension method for entity detection, which significantly decreases the effect of noise. Our method has outperformed the state-of-the-arts on SimpleQuestion benchmark in both accuracy and efficiency.