The solution of inverse problems is central to a wide range of applications including medicine, biology, and engineering. These problems require recovering an unknown quantity from noisy observations. A key feature of inverse problems is their ill-posedness, which leads to unstable behavior under noise when standard solution methods are used. For this reason, regularization methods have been developed that trade off data fitting against prior structure. Recently, data-driven variational regularization methods have been introduced, where the prior, in the form of a regularizer, is learned from provided ground-truth data. However, these methods have mainly been analyzed for Tikhonov regularization, referred to as Network Tikhonov Regularization (NETT). In this paper, we propose and analyze Morozov regularization in combination with a learned regularizer. The regularizers, which can be adapted to the training data, are defined by neural networks and are therefore non-convex. We give a convergence analysis in the non-convex setting that allows for noise-dependent regularizers, and we propose a possible training strategy. We present numerical results for attenuation correction in the context of photoacoustic tomography.
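To make the contrast concrete, here is a minimal sketch of the two formulations (notation assumed here: forward operator $A$, noisy data $y^\delta$ with noise level $\delta$, learned regularizer $\mathcal{R}_\theta$). NETT-type Tikhonov regularization penalizes the regularizer,
\[ \min_x \; \|A x - y^\delta\|^2 + \alpha\, \mathcal{R}_\theta(x), \]
whereas Morozov regularization minimizes the regularizer subject to a discrepancy constraint,
\[ \min_x \; \mathcal{R}_\theta(x) \quad \text{subject to} \quad \|A x - y^\delta\| \le \tau \delta, \qquad \tau \ge 1. \]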
We consider the numerical approximation of a continuum model of antiferromagnetic and ferrimagnetic materials. The state of the material is described in terms of two unit-length vector fields, which can be interpreted as the magnetizations averaging the spins of two sublattices. For the static setting, which requires the solution of a constrained energy minimization problem, we introduce a discretization based on first-order finite elements and prove its $\Gamma$-convergence. Then, we propose and analyze two iterative algorithms for the computation of low-energy stationary points. The algorithms are obtained from (semi-)implicit time discretizations of gradient flows of the energy. Finally, we extend the algorithms to the dynamic setting, which consists of a nonlinear system of two Landau-Lifshitz-Gilbert equations satisfied by the two fields, and we prove unconditional stability and convergence of the finite element approximations toward a weak solution of the problem. Numerical experiments assess the performance of the algorithms and demonstrate their applicability for the simulation of physical processes involving antiferromagnetic and ferrimagnetic materials.
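In schematic form (notation assumed here, not taken from the paper): writing $\mathbf{m}_1, \mathbf{m}_2$ for the sublattice magnetizations and $E$ for the material energy, the static problem is the constrained minimization
\[ \min\, E(\mathbf{m}_1, \mathbf{m}_2) \quad \text{subject to} \quad |\mathbf{m}_1| = |\mathbf{m}_2| = 1 \ \text{a.e.}, \]
while the dynamic setting couples two Landau-Lifshitz-Gilbert equations in Gilbert form,
\[ \partial_t \mathbf{m}_i = -\mathbf{m}_i \times \mathbf{h}_{\mathrm{eff},i}(\mathbf{m}_1, \mathbf{m}_2) + \alpha\, \mathbf{m}_i \times \partial_t \mathbf{m}_i, \qquad i = 1, 2, \]
with effective fields $\mathbf{h}_{\mathrm{eff},i}$ proportional to $-\delta E / \delta \mathbf{m}_i$; the precise coefficients and fields depend on the material model.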
Linear transformation of the state variable (linear preconditioning) is a common technique that often drastically improves the practical performance of a Markov chain Monte Carlo algorithm. Despite this, the benefits of linear preconditioning are not well-studied theoretically, and rigorous guidelines for choosing preconditioners are not always readily available. Mixing time bounds for various samplers have been produced in recent works for the class of strongly log-concave targets with Lipschitz gradients, and these bounds depend strongly on a quantity known as the condition number. We study linear preconditioning for this class of distributions, and under appropriate assumptions we provide bounds on the condition number after using a given linear preconditioner. Under the same assumptions, we provide bounds on the spectral gap of the random walk Metropolis (RWM) algorithm that are tight in their dependence on the condition number. Finally, we offer a review and analysis of popular preconditioners. Of particular note, we identify a surprising case in which preconditioning with the diagonal of the target covariance can actually make the condition number \emph{increase} relative to doing no preconditioning at all.
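As a toy illustration of the quantity being controlled (our construction, not the paper's): for a Gaussian target with covariance $\Sigma$, the change of variables $x = Lz$ yields a target with covariance $L^{-1} \Sigma L^{-\top}$, so the effect of a candidate preconditioner on the condition number can be checked directly:

```python
import numpy as np

def cond(M):
    # condition number of a symmetric positive definite matrix:
    # ratio of its extreme eigenvalues
    w = np.linalg.eigvalsh(M)
    return w[-1] / w[0]

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Sigma = A @ A.T + 0.1 * np.eye(5)   # target covariance (Gaussian case)

# diagonal preconditioner L = diag(marginal std), as in the "surprising" case
L = np.diag(np.sqrt(np.diag(Sigma)))
Linv = np.linalg.inv(L)
Sigma_pre = Linv @ Sigma @ Linv.T   # covariance after preconditioning

print(cond(Sigma), cond(Sigma_pre))  # compare condition numbers
```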
Deep learning methods are emerging as popular computational tools for solving forward and inverse problems in traffic flow. In this paper, we study a neural operator framework for learning solutions to nonlinear hyperbolic partial differential equations with applications in macroscopic traffic flow models. In this framework, an operator is trained to map heterogeneous and sparse traffic input data to the complete macroscopic traffic state in a supervised learning setting. We choose a physics-informed Fourier neural operator ($\pi$-FNO) as the operator, where an additional physics loss based on a discrete conservation law regularizes the problem during training to improve the shock predictions. We also propose to use training data generated from random piecewise constant input data to systematically capture the shock and rarefaction solutions. In experiments using the LWR traffic flow model, we found superior accuracy in predicting the density dynamics of a ring-road network and an urban signalized road. We also found that the operator can be trained using simple traffic density dynamics, e.g., consisting of $2$-$3$ vehicle queues and $1$-$2$ traffic signal cycles, and that it can predict density dynamics for heterogeneous vehicle queue distributions and multiple traffic signal cycles $(\geq 2)$ with an acceptable error. The extrapolation error grew sub-linearly with input complexity for a proper choice of the model architecture and training data. Adding a physics regularizer aided in learning long-term traffic density dynamics, especially for problems with periodic boundary data.
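To indicate what such a regularizer can look like, here is a minimal sketch under explicit assumptions (Greenshields flux, Godunov interface fluxes, uniform space-time grid; the paper's exact discretization and loss weighting may differ):

```python
import torch

V_MAX, RHO_MAX = 1.0, 1.0
RHO_C = RHO_MAX / 2.0  # critical density of the Greenshields flux

def flux(rho):
    # Greenshields flux Q(rho) = v_max * rho * (1 - rho / rho_max)
    return V_MAX * rho * (1.0 - rho / RHO_MAX)

def godunov_flux(rho_l, rho_r):
    # demand-supply form of the Godunov numerical flux for a concave flux
    demand = flux(torch.clamp(rho_l, max=RHO_C))
    supply = flux(torch.clamp(rho_r, min=RHO_C))
    return torch.minimum(demand, supply)

def physics_loss(rho, dx, dt):
    # rho: predicted density on a (time, space) grid, shape (T, X)
    f = godunov_flux(rho[:-1, :-1], rho[:-1, 1:])       # interface fluxes
    residual = (rho[1:, 1:-1] - rho[:-1, 1:-1]) / dt \
             + (f[:, 1:] - f[:, :-1]) / dx              # discrete conservation law
    return residual.pow(2).mean()
```

Penalizing this residual alongside the supervised data loss encourages the operator output to respect the conservation structure near shocks.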
Fredholm integral equations of the second kind that are defined on a finite or infinite interval arise in many applications. This paper discusses Nystr\"om methods based on Gauss quadrature rules for the solution of such integral equations. It is important to be able to estimate the error in the computed solution, because this allows one to choose an appropriate number of nodes for the Gauss quadrature rule used. This paper explores the application of averaged and weighted averaged Gauss quadrature rules for this purpose, and it introduces new stability properties for them.
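For reference, the setting in standard notation (our sketch): a Fredholm integral equation of the second kind,
\[ f(x) - \int_a^b k(x, t)\, f(t)\, dt = g(x), \qquad x \in [a, b], \]
is discretized by a Nystr\"om method by replacing the integral with an $n$-node Gauss rule with nodes $t_j$ and weights $w_j$ and collocating at the nodes,
\[ \widetilde{f}(t_i) - \sum_{j=1}^{n} w_j\, k(t_i, t_j)\, \widetilde{f}(t_j) = g(t_i), \qquad i = 1, \dots, n, \]
which is an $n \times n$ linear system for the values $\widetilde{f}(t_j)$; error estimates then guide the choice of $n$.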
Current approaches to generic segmentation start by creating a hierarchy of nested image partitions and then specifying a segmentation from it. Our first contribution is to describe several ways, most of them new, of specifying segmentations using the hierarchy elements. Then, we consider the best hierarchy-induced segmentation specified by a limited number of hierarchy elements. We focus on a common quality measure for binary segmentations, the Jaccard index (also known as IoU). Optimizing the Jaccard index is highly non-trivial, and yet we propose an efficient approach for doing exactly that. In this way, we obtain algorithm-independent upper bounds on the quality of any segmentation created from the hierarchy. We found that the obtainable segmentation quality varies significantly depending on the way the segments are specified by the hierarchy elements, and that representing a segmentation with only a few hierarchy elements is often possible. (Code is available.)
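For completeness, the quality measure in question compares a candidate binary segmentation against ground truth; a direct evaluation (our minimal sketch, separate from the paper's optimization over hierarchy elements) is:

```python
import numpy as np

def jaccard(pred, gt):
    # Jaccard index (IoU) of two binary masks:
    # |pred AND gt| / |pred OR gt|
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return intersection / union if union else 1.0
```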
This paper presents a general methodology for deriving information-theoretic generalization bounds for learning algorithms. The main technical tool is a probabilistic decorrelation lemma based on a change of measure and a relaxation of Young's inequality in $L_{\psi_p}$ Orlicz spaces. Using the decorrelation lemma in combination with other techniques, such as symmetrization, couplings, and chaining in the space of probability measures, we obtain new upper bounds on the generalization error, both in expectation and in high probability, and recover as special cases many of the existing generalization bounds, including the ones based on mutual information, conditional mutual information, stochastic chaining, and PAC-Bayes inequalities. In addition, the Fernique-Talagrand upper bound on the expected supremum of a subgaussian process emerges as a special case.
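One representative special case recovered by such frameworks is the classical mutual-information bound: if the loss is $\sigma$-subgaussian, then for a training sample $S$ of size $n$ and algorithm output $W$,
\[ \bigl|\mathbb{E}[\mathrm{gen}(W, S)]\bigr| \;\le\; \sqrt{\frac{2\sigma^2\, I(W; S)}{n}}, \]
where $I(W; S)$ is the mutual information between the learned hypothesis and the training data.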
We consider the application of finite element exterior calculus (FEEC) methods to a class of canonical Hamiltonian PDE systems involving differential forms. Solutions to these systems satisfy a local multisymplectic conservation law, which generalizes the more familiar symplectic conservation law for Hamiltonian systems of ODEs, and which is connected with physically important reciprocity phenomena, such as Lorentz reciprocity in electromagnetics. We characterize hybrid FEEC methods whose numerical traces satisfy a version of the multisymplectic conservation law, and we apply this characterization to several specific classes of FEEC methods, including conforming Arnold-Falk-Winther-type methods and various hybridizable discontinuous Galerkin (HDG) methods. Interestingly, the HDG-type and other nonconforming methods are shown, in general, to be multisymplectic in a stronger sense than the conforming FEEC methods. This substantially generalizes previous work of McLachlan and Stern [Found. Comput. Math., 20 (2020), pp. 35-69] on the more restricted class of canonical Hamiltonian PDEs in the de Donder-Weyl "grad-div" form.
Physics-informed neural networks (PINNs) have recently been applied with great success to efficiently approximate inverse problems for PDEs. We focus on a particular class of inverse problems, the so-called data assimilation or unique continuation problems, and prove rigorous estimates on the generalization error of PINNs approximating them. An abstract framework is presented, and conditional stability estimates for the underlying inverse problem are employed to derive the estimate on the PINN generalization error, providing rigorous justification for the use of PINNs in this context. The abstract framework is illustrated with examples of four prototypical linear PDEs. Numerical experiments validating the proposed theory are also presented.
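To fix ideas, the PINN objective in this setting combines a PDE residual with a data misfit on the observation subdomain, with no boundary conditions imposed. A schematic sketch, assuming a one-dimensional Poisson equation $-u'' = f$ as a stand-in for the paper's four examples:

```python
import torch

def pinn_loss(model, f, x_pde, x_obs, u_obs):
    # PDE residual: enforce -u'' = f at collocation points (1D stand-in)
    x_pde.requires_grad_(True)
    u = model(x_pde)
    du = torch.autograd.grad(u.sum(), x_pde, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x_pde, create_graph=True)[0]
    pde_residual = (-d2u - f(x_pde)).pow(2).mean()

    # data misfit: match observations on the measurement subdomain only;
    # no boundary conditions are imposed (unique continuation setting)
    data_misfit = (model(x_obs) - u_obs).pow(2).mean()
    return pde_residual + data_misfit
```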
Many imaging science tasks can be modeled as a discrete linear inverse problem. Solving linear inverse problems is often challenging, with ill-conditioned operators and potentially non-unique solutions. Embedding prior knowledge, such as smoothness, into the solution can overcome these challenges. In this work, we encode prior knowledge using a non-negative patch dictionary, which effectively learns a basis from a training set of natural images. In this dictionary basis, we desire solutions that are non-negative and sparse (i.e., contain many zero entries). With these constraints, standard methods for solving discrete linear inverse problems are not directly applicable. One method that enforces non-negativity is the modified residual norm steepest descent (MRNSD) method, which produces non-negative solutions but does not induce sparsity. In this paper, we provide two methods based on MRNSD that promote sparsity. In our first method, we add an $\ell_1$-regularization term with a new, optimal step size. In our second method, we propose a new non-negative, sparsity-promoting mapping of the solution. We compare the performance of our proposed methods in a number of numerical experiments, including deblurring, image completion, computed tomography, and super-resolution. Our results show that these methods effectively solve discrete linear inverse problems with non-negativity and sparsity constraints.
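For orientation, here is a minimal sketch of the baseline MRNSD iteration that both proposed methods build on (our simplified version; the paper's variants add the $\ell_1$ term with an optimal step size or the sparsity-promoting mapping):

```python
import numpy as np

def mrnsd(A, b, x0, iters=100):
    # modified residual norm steepest descent: steepest descent on
    # 0.5 * ||Ax - b||^2 with the direction scaled by diag(x), which
    # (with a suitably capped step) keeps the iterates non-negative
    x = x0.copy()
    for _ in range(iters):
        g = A.T @ (A @ x - b)                 # gradient
        d = -x * g                            # scaled descent direction
        Ad = A @ d
        alpha_uc = (g @ (x * g)) / (Ad @ Ad)  # unconstrained line-search minimizer
        pos = g > 0
        alpha_bd = (1.0 / g[pos]).min() if pos.any() else np.inf
        x = x + min(alpha_uc, 0.99 * alpha_bd) * d   # step capped to stay >= 0
    return x
```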
Human-in-the-loop aims to train an accurate prediction model with minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and, with the help of machine-based approaches, can directly accomplish tasks in the pipeline that are hard for computers. In this paper, we survey existing works on human-in-the-loop from a data perspective and classify them into three categories with a progressive relationship: (1) work that improves model performance through data processing, (2) work that improves model performance through interventional model training, and (3) the design of independent human-in-the-loop systems. Using this categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, and we briefly classify and discuss applications in natural language processing, computer vision, and other areas. In addition, we highlight some open challenges and opportunities. This survey is intended to provide a high-level summary of human-in-the-loop research and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.