In this paper, we study the problem of recovering two unknown signals from their convolution, which is commonly referred to as blind deconvolution. Reformulation of blind deconvolution as a low-rank recovery problem has led to multiple theoretical recovery guarantees in the past decade due to the success of the nuclear norm minimization heuristic. In particular, in the absence of noise, exact recovery has been established for sufficiently incoherent signals contained in lower-dimensional subspaces. However, if the convolution is corrupted by additive bounded noise, the stability of the recovery problem remains much less understood. In particular, existing reconstruction bounds involve large dimension factors and therefore fail to explain the empirical evidence for dimension-independent robustness of nuclear norm minimization. Recently, theoretical evidence has emerged for ill-posed behavior of low-rank matrix recovery for sufficiently small noise levels. In this work, we develop improved recovery guarantees for blind deconvolution with adversarial noise which exhibit square-root scaling in the noise level. Hence, our results are consistent with existing counterexamples which speak against linear scaling in the noise level as demonstrated for related low-rank matrix recovery problems.
We propose a novel understanding of Sharpness-Aware Minimization (SAM) in the context of adversarial robustness. In this paper, we point out that both SAM and adversarial training (AT) can be viewed as specific feature perturbations, which improve adversarial robustness. However, we note that SAM and AT are distinct in terms of perturbation strength, leading to different accuracy and robustness trade-offs. We provide theoretical evidence for these claims in a simplified model with rigorous mathematical proofs. Furthermore, we conduct experiment to demonstrate that only utilizing SAM can achieve superior adversarial robustness compared to standard training, which is an unexpected benefit. As adversarial training can suffer from a decrease in clean accuracy, we show that using SAM alone can improve robustness without sacrificing clean accuracy. Code is available at //github.com/weizeming/SAM_AT.
Graph neural networks (GNNs), a type of neural network that can learn from graph-structured data and learn the representation of nodes by aggregating their neighbors, have shown excellent performance in downstream tasks.However, it is known that the performance of graph neural networks (GNNs) degrades gradually as the number of layers increases. Based on k-hop subgraph aggregation, which is a new concept, we propose a new perspective to understand the expressive power of GNN.From this perspective, we reveal the potential causes of the performance degradation of the deep traditional GNN - aggregated subgraph overlap, and the fact that the residual-based graph neural networks in fact exploit the aggregation results of 1 to k hop subgraphs to improve the effectiveness.Further, we propose a new sampling-based node-level residual module named SDF, which is shown by theoretical derivation to obtain a superior expressive power compared to previous residual methods by using information from 1 to k hop subgraphs more flexibly. Extensive experiments show that the performance and efficiency of GNN with the SDF module outperform other methods.
The ParaOpt algorithm was recently introduced as a time-parallel solver for optimal-control problems with a terminal-cost objective, and convergence results have been presented for the linear diffusive case with implicit-Euler time integrators. We reformulate ParaOpt for tracking problems and provide generalized convergence analyses for both objectives. We focus on linear diffusive equations and prove convergence bounds that are generic in the time integrators used. For large problem dimensions, ParaOpt's performance depends crucially on having a good preconditioner to solve the arising linear systems. For the case where ParaOpt's cheap, coarse-grained propagator is linear, we introduce diagonalization-based preconditioners inspired by recent advances in the ParaDiag family of methods. These preconditioners not only lead to a weakly-scalable ParaOpt version, but are themselves invertible in parallel, making maximal use of available concurrency. They have proven convergence properties in the linear diffusive case that are generic in the time discretization used, similarly to our ParaOpt results. Numerical results confirm that the iteration count of the iterative solvers used for ParaOpt's linear systems becomes constant in the limit of an increasing processor count. The paper is accompanied by a sequential MATLAB implementation.
It is known that standard stochastic Galerkin methods encounter challenges when solving partial differential equations with high dimensional random inputs, which are typically caused by the large number of stochastic basis functions required. It becomes crucial to properly choose effective basis functions, such that the dimension of the stochastic approximation space can be reduced. In this work, we focus on the stochastic Galerkin approximation associated with generalized polynomial chaos (gPC), and explore the gPC expansion based on the analysis of variance (ANOVA) decomposition. A concise form of the gPC expansion is presented for each component function of the ANOVA expansion, and an adaptive ANOVA procedure is proposed to construct the overall stochastic Galerkin system. Numerical results demonstrate the efficiency of our proposed adaptive ANOVA stochastic Galerkin method.
Anomalous diffusion is often modelled in terms of the subdiffusion equation, which can involve a weakly singular source term. For this case, many predominant time stepping methods, including the correction of high-order BDF schemes [{\sc Jin, Li, and Zhou}, SIAM J. Sci. Comput., 39 (2017), A3129--A3152], may suffer from a severe order reduction. To fill in this gap, we propose a smoothing method for time stepping schemes, where the singular term is regularized by using a $m$-fold integral-differential calculus and the equation is discretized by the $k$-step BDF convolution quadrature, called ID$m$-BDF$k$ method. We prove that the desired $k$th-order convergence can be recovered even if the source term is a weakly singular and the initial data is not compatible. Numerical experiments illustrate the theoretical results.
Linear inverse problems arise in diverse engineering fields especially in signal and image reconstruction. The development of computational methods for linear inverse problems with sparsity tool is one of the recent trends in this area. The so-called optimal $k$-thresholding is a newly introduced method for sparse optimization and linear inverse problems. Compared to other sparsity-aware algorithms, the advantage of optimal $k$-thresholding method lies in that it performs thresholding and error metric reduction simultaneously and thus works stably and robustly for solving medium-sized linear inverse problems. However, the runtime of this method remains high when the problem size is relatively large. The purpose of this paper is to propose an acceleration strategy for this method. Specifically, we propose a heavy-ball-based optimal $k$-thresholding (HBOT) algorithm and its relaxed variants for sparse linear inverse problems. The convergence of these algorithms is shown under the restricted isometry property. In addition, the numerical performance of the heavy-ball-based relaxed optimal $k$-thresholding pursuit (HBROTP) has been studied, and simulations indicate that HBROTP admits robust capability for signal and image reconstruction even in noisy environments.
Deep learning techniques have received much attention in the area of image denoising. However, there are substantial differences in the various types of deep learning methods dealing with image denoising. Specifically, discriminative learning based on deep learning can ably address the issue of Gaussian noise. Optimization models based on deep learning are effective in estimating the real noise. However, there has thus far been little related research to summarize the different deep learning techniques for image denoising. In this paper, we offer a comparative study of deep techniques in image denoising. We first classify the deep convolutional neural networks (CNNs) for additive white noisy images; the deep CNNs for real noisy images; the deep CNNs for blind denoising and the deep CNNs for hybrid noisy images, which represents the combination of noisy, blurred and low-resolution images. Then, we analyze the motivations and principles of the different types of deep learning methods. Next, we compare the state-of-the-art methods on public denoising datasets in terms of quantitative and qualitative analysis. Finally, we point out some potential challenges and directions of future research.
Video anomaly detection under weak labels is formulated as a typical multiple-instance learning problem in previous works. In this paper, we provide a new perspective, i.e., a supervised learning task under noisy labels. In such a viewpoint, as long as cleaning away label noise, we can directly apply fully supervised action classifiers to weakly supervised anomaly detection, and take maximum advantage of these well-developed classifiers. For this purpose, we devise a graph convolutional network to correct noisy labels. Based upon feature similarity and temporal consistency, our network propagates supervisory signals from high-confidence snippets to low-confidence ones. In this manner, the network is capable of providing cleaned supervision for action classifiers. During the test phase, we only need to obtain snippet-wise predictions from the action classifier without any extra post-processing. Extensive experiments on 3 datasets at different scales with 2 types of action classifiers demonstrate the efficacy of our method. Remarkably, we obtain the frame-level AUC score of 82.12% on UCF-Crime.
Adversarial attacks to image classification systems present challenges to convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks. Motivated by this observation, we develop new network architectures that increase adversarial robustness by performing feature denoising. Specifically, our networks contain blocks that denoise the features using non-local means or other filters; the entire networks are trained end-to-end. When combined with adversarial training, our feature denoising networks substantially improve the state-of-the-art in adversarial robustness in both white-box and black-box attack settings. On ImageNet, under 10-iteration PGD white-box attacks where prior art has 27.9% accuracy, our method achieves 55.7%; even under extreme 2000-iteration PGD white-box attacks, our method secures 42.6% accuracy. A network based on our method was ranked first in Competition on Adversarial Attacks and Defenses (CAAD) 2018 --- it achieved 50.6% classification accuracy on a secret, ImageNet-like test dataset against 48 unknown attackers, surpassing the runner-up approach by ~10%. Code and models will be made publicly available.
We study how to generate captions that are not only accurate in describing an image but also discriminative across different images. The problem is both fundamental and interesting, as most machine-generated captions, despite phenomenal research progresses in the past several years, are expressed in a very monotonic and featureless format. While such captions are normally accurate, they often lack important characteristics in human languages - distinctiveness for each caption and diversity for different images. To address this problem, we propose a novel conditional generative adversarial network for generating diverse captions across images. Instead of estimating the quality of a caption solely on one image, the proposed comparative adversarial learning framework better assesses the quality of captions by comparing a set of captions within the image-caption joint space. By contrasting with human-written captions and image-mismatched captions, the caption generator effectively exploits the inherent characteristics of human languages, and generates more discriminative captions. We show that our proposed network is capable of producing accurate and diverse captions across images.