We consider inverse problems of estimating distributed parameters from indirect noisy observations through discretization of continuum models described by partial differential or integral equations. It is well understood that the errors arising from the discretization can be detrimental for ill-posed inverse problems, as discretization error behaves as correlated noise. While this problem can be avoided with a discretization fine enough to suppress the modeling error below the level of the exogenous noise, which is addressed, e.g., by regularization, the computational resources needed to deal with the additional degrees of freedom may require a high-performance computing environment. Following an earlier idea, we advocate the notion that the discretization is one of the unknowns of the inverse problem and is updated iteratively together with the solution. In this approach, the discretization, defined in terms of an underlying metric, is refined selectively only where the representation power of the current mesh is insufficient. In this paper we allow the metrics and meshes to be anisotropic, and we show that this leads to a significant reduction of memory allocation and computing time.
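As a hedged sketch of the setting (our notation, not the paper's), the observation model and the discretized forward map can be written as

$$ y \;=\; \mathcal{A}(q) + e \;=\; A_h(q) + \underbrace{\big(\mathcal{A}(q) - A_h(q)\big)}_{\text{modeling error}} + e, $$

where $A_h$ is the forward map induced by a mesh of resolution $h$. The modeling error is parameter-dependent and correlated across observations, so it cannot be absorbed into the exogenous noise $e$ by simple variance inflation; this is what motivates refining the mesh, here anisotropically through the underlying metric, only where the modeling error dominates $e$.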
Categorical random variables can faithfully represent the discrete and uncertain aspects of data as part of a discrete latent variable model. Learning in such models necessitates taking gradients with respect to the parameters of the categorical probability distributions, which is often intractable due to their combinatorial nature. A popular technique to estimate these otherwise intractable gradients is the Log-Derivative trick. This trick forms the basis of the well-known REINFORCE gradient estimator and its many extensions. While the Log-Derivative trick allows us to differentiate through samples drawn from categorical distributions, it does not take into account the discrete nature of the distribution itself. Our first contribution addresses this shortcoming by introducing the CatLog-Derivative trick, a variation of the Log-Derivative trick tailored to categorical distributions. Secondly, we use the CatLog-Derivative trick to introduce IndeCateR, a novel and unbiased gradient estimator for the important case of products of independent categorical distributions with provably lower variance than REINFORCE. Thirdly, we empirically show that IndeCateR can be efficiently implemented and that its gradient estimates have significantly lower bias and variance for the same number of samples compared to the state of the art.
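For context, a minimal sketch of the Log-Derivative (REINFORCE) estimator that the abstract takes as its starting point; the function names and the toy objective are ours, and the CatLog-Derivative and IndeCateR estimators themselves are not reproduced here.

```python
# Minimal sketch (not the paper's code): the Log-Derivative trick, i.e.
#   grad_theta E_{x ~ p_theta}[f(x)] = E[ f(x) * grad_theta log p_theta(x) ],
# instantiated for a single categorical distribution with logits theta.
import numpy as np

def reinforce_grad(theta, f, num_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    p = np.exp(theta - theta.max())
    p /= p.sum()
    xs = rng.choice(len(p), size=num_samples, p=p)
    score = np.eye(len(p))[xs] - p             # grad of log-softmax at each sample
    fx = np.array([f(x) for x in xs])
    return (fx[:, None] * score).mean(axis=0)  # unbiased, but high variance

# toy objective over 5 categories
grad = reinforce_grad(np.zeros(5), lambda x: (x - 2.0) ** 2)
```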
This paper concerns a class of DC composite optimization problems which, as an extension of convex composite optimization problems and DC programs with nonsmooth components, often arises in robust factorization models of low-rank matrix recovery. For this class of nonconvex and nonsmooth problems, we propose an inexact linearized proximal algorithm (iLPA) by computing in each step an inexact minimizer of a strongly convex majorization constructed with a partial linearization of the objective function at the current iterate, and establish the convergence of the generated iterate sequence under the Kurdyka-\L\"ojasiewicz (KL) property of a potential function. In particular, by leveraging the composite structure, we provide a verifiable condition for the potential function to have the KL property of exponent $1/2$ at the limit point, and hence for the iterate sequence to converge locally at an R-linear rate. Finally, we apply the proposed iLPA to a robust factorization model for matrix completion with outliers and non-uniform sampling, and numerical comparison with the Polyak subgradient method confirms its superiority in terms of computing time and quality of solutions.
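As a hedged illustration (our notation; the paper's precise problem class and majorization may differ), one common template for such a step is the following: given a DC composite objective $\phi(x) = g\big(F(x)\big) - h(x)$ with $g$ and $h$ convex and $F$ smooth, the iteration partially linearizes $F$ and $h$ at the current iterate $x^k$ and inexactly minimizes the resulting strongly convex subproblem,

$$ x^{k+1} \approx \operatorname*{arg\,min}_{x}\; g\big(F(x^k) + F'(x^k)(x - x^k)\big) - \langle v^k, x - x^k\rangle + \frac{\mu}{2}\|x - x^k\|^2, \qquad v^k \in \partial h(x^k), $$

so that each subproblem is convex and strongly convex even though $\phi$ itself is not.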
The parallel alternating direction method of multipliers (ADMM) algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical learning models. However, there is currently limited research on parallel algorithms specifically designed for high-dimensional regression with combined (composite) regularization terms. These terms, such as elastic-net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have gained significant attention in various fields due to their ability to incorporate prior information and promote sparsity within specific groups or fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for certain proximal operators associated with them. In this paper, we propose a unified constrained optimization formulation based on the consensus problem for these types of convex and nonconvex regression problems and derive the corresponding parallel ADMM algorithms. Furthermore, we prove that the proposed algorithm not only converges globally but also exhibits a linear convergence rate. Extensive simulation experiments, along with a financial example, serve to demonstrate the reliability, stability, and scalability of our algorithm. The R package for implementing the proposed algorithms can be obtained at //github.com/xfwu1016/CPADMM.
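For orientation, a hedged sketch of generic consensus ADMM for distributed regression with an elastic-net penalty (one of the composite regularizers listed above that does admit a closed-form proximal operator); the variable names are ours and this is not the paper's CPADMM implementation.

```python
# Illustrative sketch of consensus ADMM for distributed least squares with an
# elastic-net penalty: min sum_i 0.5*||y_i - X_i b||^2 + lam1*||b||_1 + (lam2/2)*||b||^2.
import numpy as np

def prox_elastic_net(v, a, b):
    # prox of a*||.||_1 + (b/2)*||.||_2^2
    return np.sign(v) * np.maximum(np.abs(v) - a, 0.0) / (1.0 + b)

def consensus_admm(X_blocks, y_blocks, lam1=0.1, lam2=0.1, rho=1.0, iters=200):
    p, M = X_blocks[0].shape[1], len(X_blocks)
    xs = [np.zeros(p) for _ in range(M)]   # local variables (one per data block)
    us = [np.zeros(p) for _ in range(M)]   # scaled dual variables
    z = np.zeros(p)                        # consensus (global) variable
    # cache Cholesky factors of (X_i^T X_i + rho I) for the local ridge solves
    chols = [np.linalg.cholesky(X.T @ X + rho * np.eye(p)) for X in X_blocks]
    for _ in range(iters):
        for i, (X, yb) in enumerate(zip(X_blocks, y_blocks)):
            rhs = X.T @ yb + rho * (z - us[i])
            xs[i] = np.linalg.solve(chols[i].T, np.linalg.solve(chols[i], rhs))
        v = np.mean(xs, axis=0) + np.mean(us, axis=0)
        z = prox_elastic_net(v, lam1 / (rho * M), lam2 / (rho * M))
        for i in range(M):
            us[i] += xs[i] - z             # dual ascent on the consensus constraint
    return z
```

The local x-updates are embarrassingly parallel across blocks, which is the feature that makes the consensus formulation attractive for data stored in a distributed manner.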
We propose data thinning, an approach for splitting an observation into two or more independent parts that sum to the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter. This very general proposal is applicable to any convolution-closed distribution, a class that includes the Gaussian, Poisson, negative binomial, gamma, and binomial distributions, among others. Data thinning has a number of applications to model selection, evaluation, and inference. For instance, cross-validation via data thinning provides an attractive alternative to the usual approach of cross-validation via sample splitting, especially in settings in which the latter is not applicable. In simulations and in an application to single-cell RNA-sequencing data, we show that data thinning can be used to validate the results of unsupervised learning approaches, such as k-means clustering and principal components analysis, for which traditional sample splitting is unattractive or unavailable.
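A minimal sketch of data thinning in the Poisson case, one member of the convolution-closed family mentioned above: if $X \sim \text{Poisson}(\mu)$ and $X_1 \mid X \sim \text{Binomial}(X, \epsilon)$, then $X_1 \sim \text{Poisson}(\epsilon\mu)$ and $X_2 = X - X_1 \sim \text{Poisson}((1-\epsilon)\mu)$ are independent. The code below is illustrative, not the authors' implementation.

```python
# Poisson data thinning: split each count into two independent folds that sum
# to the original and follow Poisson distributions with scaled means.
import numpy as np

def poisson_thin(X, eps=0.5, seed=1):
    rng = np.random.default_rng(seed)
    X1 = rng.binomial(X, eps)   # fold 1 ~ Poisson(eps * mu), elementwise
    X2 = X - X1                 # fold 2 ~ Poisson((1 - eps) * mu), independent of X1
    return X1, X2

# e.g. fit k-means (or PCA) on X_train and evaluate the fit on X_test
X = np.random.default_rng(0).poisson(5.0, size=(100, 20))
X_train, X_test = poisson_thin(X)
```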
State transformation problems such as compressing quantum information or breaking quantum commitments are fundamental quantum tasks. However, their computational difficulty cannot easily be characterized using traditional complexity theory, which focuses on tasks with classical inputs and outputs. To study the complexity of such state transformation tasks, we introduce a framework for unitary synthesis problems, including notions of reductions and unitary complexity classes. We use this framework to study the complexity of transforming one entangled state into another via local operations. We formalize this as the Uhlmann Transformation Problem, an algorithmic version of Uhlmann's theorem. Then, we prove structural results relating the complexity of the Uhlmann Transformation Problem, polynomial space quantum computation, and zero knowledge protocols. The Uhlmann Transformation Problem allows us to characterize the complexity of a variety of tasks in quantum information processing, including decoding noisy quantum channels, breaking falsifiable quantum cryptographic assumptions, implementing optimal prover strategies in quantum interactive proofs, and decoding the Hawking radiation of black holes. Our framework for unitary complexity thus provides new avenues for studying the computational complexity of many natural quantum information processing tasks.
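For reference, a standard statement of Uhlmann's theorem in our notation (the paper's precise algorithmic formulation may differ): for states $\rho, \sigma$ on register $A$ with purifications $|\psi_\rho\rangle, |\psi_\sigma\rangle$ on $AB$,

$$ \mathrm{F}(\rho, \sigma) \;=\; \max_{U_B} \big| \langle \psi_\sigma | (I_A \otimes U_B) | \psi_\rho \rangle \big|, $$

and the Uhlmann Transformation Problem, informally, asks to implement a unitary $U_B$ on the purifying register that (approximately) attains this maximum, i.e., that transforms one entangled state into the other by acting only locally.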
We consider isogeometric discretizations of the Poisson model problem, focusing on high polynomial degrees and strong hierarchical refinements. We derive a posteriori error estimates by equilibrated fluxes, i.e., vector-valued mapped piecewise polynomials lying in the $\boldsymbol{H}({\rm div})$ space which appropriately approximate the desired divergence constraint. Our estimates are constant-free in the leading term, locally efficient, and robust with respect to the polynomial degree. They are also robust with respect to the number of hanging nodes arising in adaptive mesh refinement employing hierarchical B-splines. Two partitions of unity are designed, one with larger supports corresponding to the mapped splines, and one with small supports corresponding to mapped piecewise multilinear finite element hat basis functions. The equilibration is only performed on the small supports, avoiding the higher computational price of equilibration on the large supports or even the solution of a global system. Thus, the derived estimates are also as inexpensive as possible. An abstract framework for such a setting is developed, whose application to a specific situation only requires the verification of a few clearly identified assumptions. Numerical experiments illustrate the theoretical developments.
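A typical estimate of this kind, sketched in our notation for the Poisson problem $-\Delta u = f$ with discrete solution $u_h$ (the paper's estimator and assumptions may differ in detail):

$$ \|\nabla(u - u_h)\| \;\le\; \Big( \sum_{K} \big( \|\nabla u_h + \boldsymbol{\sigma}_h\|_{K} + \tfrac{h_K}{\pi}\, \|f - \nabla\!\cdot\!\boldsymbol{\sigma}_h\|_{K} \big)^{2} \Big)^{1/2}, $$

where $\boldsymbol{\sigma}_h \in \boldsymbol{H}({\rm div})$ is the equilibrated flux and the second term accounts for the only approximately satisfied divergence constraint; the leading term carries no generic constant, which is the constant-free property referred to above.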
The strong convergence of the semi-implicit Euler-Maruyama (EM) method for stochastic differential equations with non-linear coefficients driven by a class of L\'evy processes is investigated. The dependence of the convergence order of the numerical scheme on the parameters of the class of L\'evy processes is discovered, which differs from existing results. In addition, the existence and uniqueness of the numerical invariant measure of the semi-implicit EM method are studied, and its convergence to the underlying invariant measure is also proved. Numerical examples are provided to confirm our theoretical results.
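For orientation, the generic form of the scheme named above, in our notation (coefficient assumptions and the specific L\'evy class are as in the paper): for $\mathrm{d}X_t = f(X_{t-})\,\mathrm{d}t + g(X_{t-})\,\mathrm{d}L_t$ with step size $\Delta$, the semi-implicit (drift-implicit) EM method iterates

$$ Y_{k+1} \;=\; Y_k + f(Y_{k+1})\,\Delta + g(Y_k)\,\big(L_{(k+1)\Delta} - L_{k\Delta}\big), $$

so that only the drift is treated implicitly while the jump increment enters explicitly.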
The probe and singular sources methods are two well-known classical direct reconstruction methods in inverse obstacle problems governed by partial differential equations. In this paper, by considering an inverse obstacle problem governed by the Laplace equation in a bounded domain as a prototype case, an integrated theory of the probe and singular sources methods is proposed. The theory consists of three parts: (i) introducing the singular sources method combined with the notion of the probe method; (ii) finding {\it a third indicator function} whose two-way decomposition yields the indicator functions of the probe and singular sources methods; (iii) finding the completely integrated version of the probe and singular sources methods.
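As a hedged sketch of the prototype setting in our notation (the paper's precise boundary condition and data operator may differ): let $\Omega \subset \mathbb{R}^n$ be a bounded domain containing an unknown obstacle $D$ with $\overline{D} \subset \Omega$, and suppose one measures the Dirichlet-to-Neumann map

$$ \Lambda_D : u|_{\partial\Omega} \longmapsto \frac{\partial u}{\partial \nu}\Big|_{\partial\Omega}, \qquad \Delta u = 0 \ \text{in } \Omega \setminus \overline{D}, \quad u = 0 \ \text{on } \partial D. $$

The task is to recover $D$ from $\Lambda_D$; both the probe and singular sources methods build pointwise indicator functions from the difference $\Lambda_D - \Lambda_\emptyset$, where $\Lambda_\emptyset$ denotes the corresponding map for the obstacle-free domain.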
We consider the problem of detecting causal relationships between discrete time series, in the presence of potential confounders. A hypothesis test is introduced for identifying the temporally causal influence of $(x_n)$ on $(y_n)$, causally conditioned on a possibly confounding third time series $(z_n)$. Under natural Markovian modeling assumptions, it is shown that the null hypothesis, corresponding to the absence of temporally causal influence, is equivalent to the underlying `causal conditional directed information rate' being equal to zero. The plug-in estimator for this functional is identified with the log-likelihood ratio test statistic for the desired test. This statistic is shown to be asymptotically normal under the alternative hypothesis and asymptotically $\chi^2$ distributed under the null, facilitating the computation of $p$-values when used on empirical data. The effectiveness of the resulting hypothesis test is illustrated on simulated data, validating the underlying theory. The test is also employed in the analysis of spike train data recorded from neurons in the V4 and FEF brain regions of behaving animals during a visual attention task. There, the test results are seen to identify interesting and biologically relevant information.
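For reference, a hedged sketch of the functional involved, in our notation (the exact causal-conditioning convention used in the paper may differ): the causally conditioned directed information and its rate are

$$ I(X^n \to Y^n \,\|\, Z^n) \;=\; \sum_{i=1}^{n} I\big(X^i ; Y_i \,\big|\, Y^{i-1}, Z^{i}\big), \qquad \bar{I}(X \to Y \,\|\, Z) \;=\; \lim_{n \to \infty} \frac{1}{n}\, I(X^n \to Y^n \,\|\, Z^n), $$

and the null hypothesis of no temporally causal influence corresponds to this rate being zero; up to the usual scaling, its plug-in estimate coincides with the log-likelihood ratio statistic whose null distribution is asymptotically $\chi^2$.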
We consider the problem of sequential evaluation, in which an evaluator observes candidates in a sequence and assigns scores to these candidates in an online, irrevocable fashion. Motivated by the psychology literature that has studied sequential bias in such settings -- namely, dependencies between the evaluation outcome and the order in which the candidates appear -- we propose a natural model for the evaluator's rating process that captures the lack of calibration inherent to such a task. We conduct crowdsourcing experiments to demonstrate various facets of our model. We then proceed to study how to correct sequential bias under our model by posing this as a statistical inference problem. We propose a near-linear time, online algorithm for this task and prove guarantees in terms of two canonical ranking metrics. We also prove that our algorithm is information theoretically optimal, by establishing matching lower bounds in both metrics. Finally, we perform a host of numerical experiments to show that our algorithm often outperforms the de facto method of using the rankings induced by the reported scores, both in simulation and on the crowdsourcing data that we collected.
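Purely as a hypothetical toy illustration of sequential miscalibration (this is neither the paper's model nor its correction algorithm): an evaluator who scores each candidate relative to those already seen produces raw scores whose induced ranking can deviate from the true ordering, which is the failure mode the proposed algorithm targets.

```python
# Toy simulation: order-dependent scoring degrades the ranking induced by raw scores.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
n = 200
quality = rng.normal(size=n)                  # latent candidate qualities
scores = np.empty(n)
for t in range(n):
    seen = quality[: t + 1]
    # score = empirical percentile of the current candidate among those seen so far,
    # a simple stand-in for an uncalibrated, order-dependent rating process
    scores[t] = (seen <= quality[t]).mean() + rng.normal(scale=0.05)

tau_raw, _ = kendalltau(quality, scores)
print(f"Kendall tau of raw-score ranking vs. truth: {tau_raw:.3f}")
```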