Lifted samplers form a class of Markov chain Monte Carlo methods that has drawn a lot of attention in recent years due to their superior performance in challenging Bayesian applications. A canonical example of such a sampler is the one derived from a random walk Metropolis algorithm for a totally-ordered state space such as the integers or the real numbers. The lifted sampler is obtained by splitting the proposal distribution in two: one part proposing moves in the increasing direction, and the other part proposing moves in the decreasing direction. The sampler keeps following one direction until a rejection, upon which it flips the direction. In terms of asymptotic variances, it outperforms the random walk Metropolis algorithm, regardless of the target distribution, at no additional computational cost. Other studies show, however, that beyond this simple case, lifted samplers do not always outperform their Metropolis counterparts. In this paper, we leverage the celebrated work of Tierney (1998) to provide an analysis in a general framework encompassing a broad class of lifted samplers. Our finding is that, essentially, the asymptotic variances cannot increase by a factor of more than 2, regardless of the target distribution, the way the directions are induced, and the type of algorithm from which the lifted sampler is derived (be it a Metropolis--Hastings algorithm, a reversible jump algorithm, etc.). This result indicates that, while there is potentially a lot to gain from lifting a sampler, there is not much to lose.
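To make the mechanism concrete, here is a minimal sketch (our own illustration; the function names, unit step size, and target are assumptions, not from the paper) of a lifted random walk Metropolis sampler on the integers:

```python
import math
import random

def lifted_rwm(log_pi, x0, n_steps, seed=0):
    """Minimal sketch of a lifted random walk Metropolis sampler on the
    integers: proposals always move one step in the current direction d,
    and d is flipped on every rejection. With this symmetric +/-1 kernel,
    the acceptance ratio reduces to pi(y) / pi(x)."""
    rng = random.Random(seed)
    x, d = x0, +1                  # current state and lifted direction
    samples = []
    for _ in range(n_steps):
        y = x + d                  # deterministic move along direction d
        if math.log(rng.random()) < log_pi(y) - log_pi(x):
            x = y                  # accept: keep travelling in direction d
        else:
            d = -d                 # reject: flip the direction
        samples.append(x)
    return samples

# Example target: a discretized Gaussian on the integers.
samples = lifted_rwm(lambda x: -0.5 * (x / 5.0) ** 2, x0=0, n_steps=10_000)
```

The persistent direction suppresses the diffusive back-and-forth behavior of the reversible random walk, which is the source of the variance reduction discussed above.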
This paper draws on two similar areas that have so far remained detached from each other. On the one hand, Dung's pioneering contributions to abstract argumentation, almost thirty years ago, gave rise to a plethora of successors, including abstract dialectical frameworks (ADFs). On the other hand, Boolean networks (BNs), devised as models of gene regulation, have been successful for studying the behavior of molecular processes within cells. ADFs and BNs are similar to each other: both can be viewed as functions from vectors of bits to vectors of bits. Yet as soon as similarities between these two formalisms emerge, differences appear as well. For example, conflict-freedom is prominent in argumentation (where we are interested in a self-consistent, i.e., conflict-free, set of beliefs) but absent in BNs. By contrast, asynchrony (where only one gene is updated at a time) is conspicuous in BNs and lacking in argumentation. Finally, while a monotonicity-based notion occurs in the signed reasoning of both argumentation and gene regulation, a different, derivative-based notion appears only in the BN literature. To identify common mathematical structure between the two formalisms, these differences need clarification. This contribution is a partial review of both areas, in which we cover enough ground to exhibit their more evident similarities and then to reconcile some of their apparent differences. We highlight a range of research avenues resulting from ironing out the discrepancies between these two fields. Unveiling their common concerns should enable the two areas to cross-fertilize, transferring ideas and results between each other.
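As a concrete illustration of the shared view of both formalisms as functions on bit vectors, and of the synchronous/asynchronous distinction mentioned above, consider the following toy sketch (our own example; the particular update functions are arbitrary):

```python
from itertools import product

# A Boolean network / ADF viewed as a function from bit vectors to bit
# vectors: each component function computes the next value of one variable.
f = (
    lambda s: s[1] and not s[2],   # x0' = x1 AND NOT x2
    lambda s: s[0],                # x1' = x0
    lambda s: not s[0],            # x2' = NOT x0
)

def sync_step(state):
    """Synchronous update: all components are updated at once."""
    return tuple(fi(state) for fi in f)

def async_successors(state):
    """Asynchronous update (common in BNs): one component at a time."""
    succs = []
    for i, fi in enumerate(f):
        s = list(state)
        s[i] = fi(state)
        succs.append(tuple(s))
    return succs

# Fixed points of f, a notion meaningful in both formalisms:
fixed = [s for s in product([False, True], repeat=3) if sync_step(s) == s]
```

The same object f supports both update schemes; the choice between them is one of the differences the paper sets out to reconcile.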
A nontrivial connected graph is matching covered if each edge belongs to some perfect matching. For most problems pertaining to perfect matchings, one may restrict attention to matching covered graphs; thus, there is extensive literature on them. A cornerstone of this theory is an ear decomposition result due to Lov\'asz and Plummer. Their theorem is a fundamental problem-solving tool, and it also yields interesting open problems; we discuss two such problems below and solve one of them. A subgraph $H$ of a graph $G$ is conformal if $G-V(H)$ has a perfect matching. This notion is intrinsically related to the aforementioned ear decomposition theorem -- which implies that each matching covered graph (apart from $K_2$ and even cycles) contains a conformal bisubdivision of $\theta$, or a conformal bisubdivision of $K_4$, or possibly both. (Here, $\theta$ refers to the graph with two vertices joined by three edges.) This immediately leads to two problems: characterize the $\theta$-free (likewise, the $K_4$-free) matching covered graphs. A characterization of planar $K_4$-free matching covered graphs was obtained by Kothari and Murty [J. Graph Theory, 82 (1), 2016]; the nonplanar case is open. We provide a characterization of $\theta$-free matching covered graphs that immediately implies a polynomial-time algorithm for the corresponding decision problem. Our characterization relies heavily on a seminal result due to Edmonds, Lov\'asz and Pulleyblank [Combinatorica, 2, 1982] pertaining to the tight cut decomposition theory of matching covered graphs. As corollaries, we provide two upper bounds on the size of a $\theta$-free matching covered graph, namely, $m\leq 2n-1$ and $m\leq \frac{3n}{2}+b-1$, where $n$ and $m$ denote the numbers of vertices and edges, and $b$ denotes the number of bricks obtained in any tight cut decomposition of the graph; for each bound, we provide a characterization of the tight examples. The Petersen graph and $K_4$ play key roles in our results.
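The definition of matching covered graphs translates directly into a naive polynomial-time check, since an edge $uv$ lies in some perfect matching of $G$ exactly when $G - \{u, v\}$ has a perfect matching. A minimal sketch of this check (our own illustration, using networkx):

```python
import networkx as nx

def has_perfect_matching(G):
    """True if G has a perfect matching (max matching covers all vertices)."""
    M = nx.max_weight_matching(G, maxcardinality=True)
    return 2 * len(M) == G.number_of_nodes()

def is_matching_covered(G):
    """A nontrivial connected graph is matching covered if every edge lies
    in some perfect matching, i.e. for each edge uv, G - {u, v} still has
    a perfect matching."""
    if G.number_of_nodes() <= 1 or not nx.is_connected(G):
        return False
    for u, v in G.edges():
        H = G.copy()
        H.remove_nodes_from([u, v])
        if H.number_of_nodes() > 0 and not has_perfect_matching(H):
            return False
    return True

# K_4 and the Petersen graph, both central to the paper, are matching covered:
assert is_matching_covered(nx.complete_graph(4))
assert is_matching_covered(nx.petersen_graph())
```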
Bayesian geoacoustic inversion problems are conventionally solved by Markov chain Monte Carlo methods or their variants, which are computationally expensive. This paper extends the classic Bayesian geoacoustic inversion framework by deriving important statistics of the multidimensional posterior probability density (PPD) using mixture density network (MDN) theory. These statistics make it convenient to train the network directly on the whole parameter space and obtain the multidimensional PPD of the model parameters. The present approach provides a much more efficient way to solve geoacoustic inversion problems within the Bayesian inference framework. The network is trained on a simulated dataset of surface-wave dispersion curves with shear-wave velocities as labels, and tested on both synthetic and real data cases. The results show that the network gives reliable predictions and generalizes well to unseen data. Once trained, the network can rapidly (within seconds) give a fully probabilistic solution comparable to that of Monte Carlo methods. It provides a promising approach for real-time inversion.
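For reference, the core of a mixture density network is a standard construction: the network outputs the weights, means, and spreads of a Gaussian mixture and is trained by minimizing the negative log-likelihood of the targets. A minimal PyTorch sketch (our own illustration; the architecture and dimensions are placeholders, not those of the paper):

```python
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Minimal mixture density network: maps an input (e.g. a dispersion
    curve) to the parameters of a Gaussian mixture approximating the
    posterior over one model parameter (e.g. a shear-wave velocity)."""
    def __init__(self, in_dim, n_components=5, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.pi = nn.Linear(hidden, n_components)         # mixture weights
        self.mu = nn.Linear(hidden, n_components)         # component means
        self.log_sigma = nn.Linear(hidden, n_components)  # log std devs

    def forward(self, x):
        h = self.body(x)
        return (torch.log_softmax(self.pi(h), dim=-1),
                self.mu(h), self.log_sigma(h))

def mdn_nll(log_pi, mu, log_sigma, y):
    """Negative log-likelihood of targets y under the predicted mixture."""
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = dist.log_prob(y.unsqueeze(-1)) + log_pi
    return -torch.logsumexp(log_prob, dim=-1).mean()

net = MDN(in_dim=50)
x, y = torch.randn(32, 50), torch.randn(32)   # dummy batch
loss = mdn_nll(*net(x), y)
```

Once trained, evaluating the network on a new observation yields the full mixture PPD in a single forward pass, which is what makes the approach fast enough for real-time use.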
We describe a randomized algorithm for producing a near-optimal hierarchical off-diagonal low-rank (HODLR) approximation to an $n\times n$ matrix $\mathbf{A}$, accessible only through matrix-vector products with $\mathbf{A}$ and $\mathbf{A}^{\mathsf{T}}$. We prove that, for the rank-$k$ HODLR approximation problem, our method achieves a $(1+\beta)^{\log(n)}$-optimal approximation in expected Frobenius norm using $O(k\log(n)/\beta^3)$ matrix-vector products. In particular, the algorithm obtains a $(1+\varepsilon)$-optimal approximation with $O(k\log^4(n)/\varepsilon^3)$ matrix-vector products, and for any constant $c$, an $n^c$-optimal approximation with $O(k \log(n))$ matrix-vector products. Apart from matrix-vector products, the additional computational cost of our method is just $O(n \operatorname{poly}(\log(n), k, \beta))$. We complement the upper bound with a lower bound, which shows that any matrix-vector query algorithm requires at least $\Omega(k\log(n) + k/\varepsilon)$ queries to obtain a $(1+\varepsilon)$-optimal approximation. Our algorithm can be viewed as a robust version of widely used "peeling" methods for recovering HODLR matrices and is, to the best of our knowledge, the first matrix-vector query algorithm to enjoy theoretical worst-case guarantees for approximation by any hierarchical matrix class. To control the propagation of error between levels of the hierarchical approximation, we introduce a new perturbation bound for low-rank approximation, which shows that the widely used Generalized Nystr\"om method enjoys inherent stability when implemented with noisy matrix-vector products. We also introduce a novel randomly perforated matrix sketching method to further control the error in the peeling algorithm.
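The Generalized Nystr\"om method mentioned above admits a compact matvec-only implementation. The following sketch (our own, with arbitrary sketch sizes) illustrates the construction $\mathbf{A} \approx (\mathbf{A}\mathbf{X})(\mathbf{Y}^{\mathsf{T}}\mathbf{A}\mathbf{X})^{+}(\mathbf{Y}^{\mathsf{T}}\mathbf{A})$ whose stability under noisy products the paper analyzes:

```python
import numpy as np

def generalized_nystrom(matvec, rmatvec, n, k, oversample=2, seed=0):
    """Minimal sketch of Generalized Nystrom low-rank approximation using
    only products with A and A^T:  A ~= (A X) pinv(Y^T A X) (Y^T A)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k))               # right sketch, n x k
    Y = rng.standard_normal((n, oversample * k))  # left sketch, n x l
    AX = matvec(X)            # k matvecs with A
    YtA = rmatvec(Y).T        # l matvecs with A^T
    core = YtA @ X            # l x k core matrix Y^T A X
    # In practice the core is applied via a QR-based least-squares solve;
    # an explicit pseudoinverse keeps this sketch short.
    return AX @ np.linalg.pinv(core), YtA         # A ~= left @ right

# Usage, with an explicit matrix standing in for the matvec oracles:
A = np.random.default_rng(1).standard_normal((200, 200))
left, right = generalized_nystrom(lambda V: A @ V, lambda V: A.T @ V, 200, k=20)
rel_err = np.linalg.norm(A - left @ right) / np.linalg.norm(A)
```

In the peeling setting, the products delivered to this routine are themselves contaminated by approximation error from other levels, which is precisely the noisy-matvec regime the perturbation bound addresses.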
Keyword spotting (KWS) is one of the speech recognition tasks most sensitive to the quality of the feature representation. However, research on KWS has traditionally focused on new model topologies, putting little emphasis on other aspects like feature extraction. This paper investigates the use of the multitaper technique to create improved features for KWS. The experimental study is carried out for different test scenarios, windows and parameters, datasets, and neural networks commonly used in embedded KWS applications. Experimental results confirm the advantages of using the proposed improved features.
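The multitaper technique itself is standard: a frame is multiplied by several orthogonal DPSS (Slepian) tapers and the resulting periodograms are averaged, giving a lower-variance spectral estimate than a single window. A minimal sketch (our own illustration; the uniform taper averaging and the parameter values are assumptions):

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(frame, n_tapers=6, nw=4.0):
    """Minimal multitaper spectral estimate for one frame: average the
    periodograms obtained with orthogonal DPSS tapers, instead of using a
    single (e.g. Hamming) window as in conventional features."""
    tapers = dpss(len(frame), NW=nw, Kmax=n_tapers)       # (n_tapers, N)
    specs = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return specs.mean(axis=0)   # lower-variance estimate than one window

# From here, the usual pipeline (mel filterbank, log, DCT) yields
# multitaper MFCC-style features for the KWS model.
frame = np.random.default_rng(0).standard_normal(400)    # e.g. 25 ms at 16 kHz
spec = multitaper_spectrum(frame)
```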
We propose a way to maintain strong consistency and facilitate error analysis in the context of dissipation-based WENO stabilization for continuous and discontinuous Galerkin discretizations of conservation laws. Following Kuzmin and Vedral (J. Comput. Phys. 487:112153, 2023) and Vedral (arXiv preprint arXiv:2309.12019), we use WENO shock detectors to determine appropriate amounts of low-order artificial viscosity. In contrast to existing WENO methods, our approach blends candidate polynomials using residual-based nonlinear weights. The shock-capturing terms of our stabilized Galerkin methods vanish if residuals do. This enables us to achieve improved accuracy compared to weakly consistent alternatives. As we show in the context of steady convection-diffusion-reaction (CDR) equations, nonlinear local projection stabilization terms can be included in a way that preserves the coercivity of local bilinear forms. For the corresponding Galerkin-WENO discretization of a CDR problem, we rigorously derive a priori error estimates. Additionally, we demonstrate the stability and accuracy of the proposed method through one- and two-dimensional numerical experiments for hyperbolic conservation laws and systems thereof. The numerical results for representative test problems are superior to those obtained with traditional WENO schemes, particularly in scenarios involving shocks and steep gradients.
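For context, classical WENO blending computes nonlinear weights from smoothness indicators $\beta_j$; the residual-based variant described above replaces these indicators so that the stabilization vanishes when the residuals do. A schematic sketch of the classical weight formula (our own illustration of the general mechanism, not the paper's formulas):

```python
import numpy as np

def weno_weights(indicators, linear_weights, eps=1e-12, p=2):
    """Classical WENO-JS nonlinear weights: candidates with small
    indicators dominate the blend. In the residual-based approach, the
    smoothness indicators beta_j are replaced by residuals of the candidate
    polynomials, so the shock-capturing terms vanish for residual-free
    (i.e. exactly consistent) data."""
    alpha = linear_weights / (eps + indicators) ** p
    return alpha / alpha.sum()

# A candidate with a (near-)zero indicator receives nearly all the weight:
w = weno_weights(np.array([1e-10, 0.4, 0.7]), np.array([0.6, 0.2, 0.2]))
```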
Symbolic regression plays a crucial role in modern scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data. A grand challenge lies in the arduous search for parsimonious and generalizable mathematical formulas, in an infinite search space, while also fitting the training data. For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency when handling complex problems, which essentially hinders the pace of applying symbolic regression for scientific exploration across interdisciplinary domains. To this end, we introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data. Through a series of extensive experiments, we demonstrate the superior accuracy and efficiency of PTS for equation discovery, which greatly outperforms the state-of-the-art baseline models on over 80 synthetic and experimental datasets (e.g., achieving up to a 99% accuracy improvement and a one-order-of-magnitude speedup). PTS represents a key advance in accurate and efficient data-driven discovery of symbolic, interpretable models (e.g., underlying physical laws) and marks a pivotal transition towards scalable symbolic learning.
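To fix ideas, expression-tree search in symbolic regression can be illustrated with a toy random search over a small grammar (our own generic illustration, unrelated to the PTS model's actual search strategy):

```python
import random
import numpy as np

# Expressions are trees over a small grammar, scored by fit to data;
# the search keeps the best tree found.
OPS = {"+": np.add, "-": np.subtract, "*": np.multiply}

def random_tree(depth, rng):
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.7 else round(rng.uniform(-2, 2), 2)
    op = rng.choice(list(OPS))
    return (op, random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return np.full_like(x, tree, dtype=float)
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

rng = random.Random(0)
x = np.linspace(-2, 2, 100)
y = x ** 2 + x            # target law to rediscover
best = min((random_tree(3, rng) for _ in range(5000)),
           key=lambda t: np.mean((evaluate(t, x) - y) ** 2))
```

Even this brute-force version makes the combinatorial explosion evident: the search space grows exponentially with tree depth, which is the bottleneck a parallelized, guided tree search is designed to tame.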
The automated finite element analysis of complex CAD models using boundary-fitted meshes is rife with difficulties. Immersed finite element methods are intrinsically more robust but usually less accurate. In this work, we introduce an efficient, robust, high-order immersed finite element method for complex CAD models. Our approach relies on three adaptive structured grids: a geometry grid for representing the implicit geometry, a finite element grid for discretising physical fields and a quadrature grid for evaluating the finite element integrals. The geometry grid is a sparse VDB (Volumetric Dynamic B+ tree) grid that is highly refined close to physical domain boundaries. The finite element grid consists of a forest of octree grids distributed over several processors, and the quadrature grid in each finite element cell is an octree grid constructed in a bottom-up fashion. We discretise physical fields on the finite element grid using high-order Lagrange basis functions. The resolution of the quadrature grid ensures that finite element integrals are evaluated with sufficient accuracy and that any sub-grid geometric features, like small holes or corners, are resolved up to a desired resolution. The conceptual simplicity and modularity of our approach make it possible to reuse open-source libraries, namely openVDB and p4est for implementing the geometry and finite element grids, respectively, and BDDCML for iteratively solving the discrete systems of equations in parallel using domain decomposition. We demonstrate the efficiency and robustness of the proposed approach by solving the Poisson equation on domains given by complex CAD models and discretised with tens of millions of degrees of freedom.
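The bottom-up construction of an adaptive quadrature grid can be pictured with a simplified 2D analogue (a quadtree rather than an octree; entirely our own illustration): cells cut by the implicit boundary are recursively subdivided until a prescribed depth, and exterior cells are discarded.

```python
# Simplified 2D illustration of adaptive refinement near an implicit
# boundary phi(x, y) = 0; cells are (xmin, ymin, xmax, ymax) tuples.
def refine(cell, phi, depth, max_depth, leaves):
    (x0, y0, x1, y1) = cell
    corners = [phi(x, y) for x in (x0, x1) for y in (y0, y1)]
    cut = min(corners) < 0 < max(corners)   # sign change: boundary crosses
    if cut and depth < max_depth:
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        for sub in [(x0, y0, xm, ym), (xm, y0, x1, ym),
                    (x0, ym, xm, y1), (xm, ym, x1, y1)]:
            refine(sub, phi, depth + 1, max_depth, leaves)
    elif max(corners) < 0 or cut:   # keep interior and residual cut cells
        leaves.append(cell)

leaves = []
circle = lambda x, y: x**2 + y**2 - 0.5   # implicit geometry: a disk
refine((-1.0, -1.0, 1.0, 1.0), circle, 0, max_depth=6, leaves=leaves)
# `leaves` now holds quadrature cells, highly refined near the boundary.
```

The corner-sign test is a deliberate simplification; a production implementation would use a more robust cut-cell classification.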
This work addresses the open question of implementing fault-tolerant quantum random linear codes (QRLCs) with feasible computational overhead. We present a new decoder for QRLCs capable of dealing with imperfect decoding operations. A first approach, introduced by Cruz et al., considered only channel errors and assumed perfect gates at the decoder. Here, we analyze the fault-tolerant characteristics of QRLCs with a new noise-guessing decoding technique, considering preparation, measurement, and gate errors in the syndrome extraction procedure, while also accounting for error degeneracy. Our findings indicate a threshold error rate ($\pth$) of approximately $\pnum$ in the asymptotic limit, considering realistic noise levels in the aforementioned physical procedures.
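Noise-guessing decoding has a simple classical analogue (a GRAND-style sketch; our own illustration over a classical binary code, not the paper's quantum decoder): test error patterns in order of decreasing likelihood, i.e. increasing weight for a symmetric channel, and return the first pattern that reproduces the observed syndrome.

```python
import numpy as np
from itertools import combinations

def noise_guess_decode(H, syndrome, n, max_weight=3):
    """Guess error patterns in order of increasing Hamming weight and stop
    at the first one matching the observed syndrome."""
    for w in range(max_weight + 1):
        for support in combinations(range(n), w):
            e = np.zeros(n, dtype=int)
            e[list(support)] = 1
            if np.array_equal(H @ e % 2, syndrome):
                return e   # most likely error consistent with the syndrome
    return None            # give up beyond max_weight (decoding failure)

# [7,4] Hamming code parity-check matrix:
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
e_true = np.zeros(7, dtype=int)
e_true[4] = 1
e_hat = noise_guess_decode(H, H @ e_true % 2, n=7)   # recovers e_true
```

The quantum setting adds the complications the abstract lists: the syndrome itself is extracted by noisy circuits, and degenerate errors with identical action on the code space must be counted together.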
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. This lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inferences conditioned on the explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparison to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
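Shepard's law states that generalization decays exponentially with distance in psychological similarity space; applied here, it yields a one-line prediction of how strongly an explainee identifies the AI's explanation with their own. A minimal sketch (our own illustration; representing saliency maps as flat vectors and the sensitivity parameter are assumptions):

```python
import numpy as np

def shepard_similarity(own_explanation, ai_explanation, sensitivity=1.0):
    """Shepard (1987): generalization g(d) = exp(-k d), where d is the
    distance between the two explanations in psychological space."""
    d = np.linalg.norm(own_explanation - ai_explanation)
    return np.exp(-sensitivity * d)

# Saliency maps (flattened) as points in similarity space:
rng = np.random.default_rng(0)
own = rng.random(64)                                 # explainee's own map
ai = own + 0.1 * rng.standard_normal(64)             # AI's saliency map
p_agree = shepard_similarity(own, ai)   # predicted agreement with the AI
```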