$\textbf{Background and aims}$: Artificial Intelligence (AI) Computer-Aided Detection (CADe) is commonly used for polyp detection, but the data encountered in clinical settings can differ from the data used for model training. Few studies evaluate how well CADe detectors perform on colonoscopies from countries not seen during training, and none are able to evaluate performance without collecting expensive and time-intensive labels. $\textbf{Methods}$: We trained a CADe polyp detector on Israeli colonoscopy videos (5004 videos, 1106 hours) and evaluated it on Japanese videos (354 videos, 128 hours) by measuring the True Positive Rate (TPR) versus false alarms per minute (FAPM). We introduce a colonoscopy dissimilarity measure, the "MAsked mediCal Embedding Distance" (MACE), to quantify differences between colonoscopies without labels. We evaluated CADe on all Japan videos and on those with the highest MACE. $\textbf{Results}$: MACE correctly quantifies that narrow-band imaging (NBI) and chromoendoscopy (CE) frames are less similar to the Israel data than Japan white-light frames (bootstrapped z-test, |z| > 690, p < $10^{-8}$ for both). Despite differences in the data, CADe performance on Japan colonoscopies was non-inferior to Israel ones without additional training (TPR at 0.5 FAPM: 0.957 and 0.972 for Israel and Japan; TPR at 1.0 FAPM: 0.972 and 0.989 for Israel and Japan; superiority test t > 45.2, p < $10^{-8}$). Despite the model not being trained on NBI or CE, TPR on those subsets was non-inferior to Japan overall (non-inferiority test t > 47.3, p < $10^{-8}$, $\delta$ = 1.5% for both). $\textbf{Conclusion}$: Dataset differences of the kind that degrade AI detector performance in non-medical settings did not degrade the performance of our AI CADe polyp detector when applied to data from a new country. MACE can help medical AI models internationalize by identifying the most "dissimilar" data on which to evaluate models.
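The abstract does not specify MACE's masking scheme, embedding model, or metric, but the idea of a label-free dissimilarity score can be illustrated with a generic embedding-distance sketch; the encoder output, the nearest-neighbour distance, and the synthetic embeddings below are assumptions for illustration, not MACE itself.

```python
# Conceptual sketch only: a generic, label-free dissimilarity between a new-country
# dataset and the training data, using frame embeddings from some frozen encoder.
# The embeddings here are synthetic stand-ins; MACE's actual masking and metric are
# not specified in the abstract.
import numpy as np
from scipy.spatial.distance import cdist

def dissimilarity(train_emb, new_emb):
    """Mean distance from each new-domain embedding to its nearest training embedding."""
    return cdist(new_emb, train_emb).min(axis=1).mean()

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, (1000, 128))     # stand-in for Israel white-light embeddings
japan_wl = rng.normal(0.1, 1.0, (200, 128))   # stand-in for Japan white-light embeddings
japan_nbi = rng.normal(1.0, 1.0, (200, 128))  # stand-in for NBI embeddings (shifted further)
print(dissimilarity(train, japan_wl), dissimilarity(train, japan_nbi))
```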
Exact methods for the exponentiation of matrices of dimension $N$ can be computationally expensive in terms of execution time ($N^{3}$) and memory requirements ($N^{2}$), not to mention numerical precision issues. A type of matrix often exponentiated in the sciences is the rate matrix. Here we explore five methods to exponentiate rate matrices, some of which apply even more broadly to other matrix types. Three of the methods leverage a mathematical analogy between computing matrix elements of a matrix exponential and computing transition probabilities of a dynamical process (technically a Markov jump process, MJP, typically simulated using the Gillespie algorithm). In doing so, we identify a novel MJP-based method, relying on restricting the number of "trajectory" jumps based on the magnitude of the matrix elements, with favorable computational scaling. We then discuss this method's downstream implications for the mixing properties of Monte Carlo posterior samplers. We also benchmark two other methods of matrix exponentiation valid for any matrix (beyond rate matrices and, more generally, positive definite matrices) related to solving differential equations: Runge-Kutta integrators and Krylov subspace methods. Under conditions where both the largest matrix element and the number of non-vanishing elements scale linearly with $N$ -- reasonable conditions for the rate matrices often exponentiated -- the computational time scaling of the most competitive methods (Krylov and one of the MJP-based methods) reduces to $N^2$, with total memory requirements of $N$.
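As a point of reference for the Krylov route mentioned above, the following sketch (with an assumed synthetic sparse rate matrix) contrasts a dense matrix exponential, which costs $N^3$ time and $N^2$ memory, with a Krylov-style action of the exponential on a probability vector, which only requires matrix-vector products:

```python
# Minimal sketch (assumed setup): dense exponentiation vs. a Krylov-based action of
# the exponential for a sparse rate matrix in the master-equation convention dp/dt = Q p.
import numpy as np
from scipy.linalg import expm
from scipy.sparse import random as sparse_random, diags
from scipy.sparse.linalg import expm_multiply

N = 500
# Sparse non-negative off-diagonal rates; diagonal chosen so columns sum to zero.
R = sparse_random(N, N, density=5.0 / N, format="lil", random_state=0)
R.setdiag(0.0)
Q = (R - diags(np.asarray(R.sum(axis=0)).ravel())).tocsr()

p0 = np.zeros(N); p0[0] = 1.0   # start in state 0
t = 0.5

p_dense = expm(Q.toarray() * t) @ p0     # O(N^3) time, O(N^2) memory
p_krylov = expm_multiply(Q * t, p0)      # only needs sparse matrix-vector products

print(np.max(np.abs(p_dense - p_krylov)))   # agreement up to numerical error
```

Krylov methods touch the matrix only through products $Qv$, which is where the favorable $N^2$ time and $N$ memory scaling quoted above comes from when the matrix is sparse.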
Input-output conformance simulation (iocos) has been proposed by Gregorio-Rodr\'iguez, Llana and Mart\'inez-Torres as a simulation-based behavioural preorder underlying model-based testing. This relation is inspired by Tretmans' classic ioco relation, but has better worst-case complexity than ioco and supports stepwise refinement. The goal of this paper is to develop the theory of iocos by studying logical characterisations of this relation, rule formats for it, and its compositionality. More specifically, this article presents characterisations of iocos in terms of modal logics and compares them with an existing logical characterisation for ioco proposed by Beohar and Mousavi. It also offers a characteristic-formula construction for iocos over finite processes in an extension of the proposed modal logics with greatest fixed points. A precongruence rule format for iocos and a rule format ensuring that operations take quiescence properly into account are also given. Both rule formats are based on the GSOS format by Bloom, Istrail and Meyer. The general modal decomposition methodology of Fokkink and van Glabbeek is used to show how to check the satisfaction of properties expressed in the logic for iocos in a compositional way for operations specified by rules in the precongruence rule format for iocos.
We consider the estimation of the cumulative hazard function, and equivalently the distribution function, with censored data under a setup that preserves the privacy of the survival database. This is achieved through an $\alpha$-locally differentially private mechanism for the failure indicators and by proposing a non-parametric kernel estimator for the cumulative hazard function that remains consistent under the privatization. Under mild conditions, we also prove lower bounds for the minimax rates of convergence and show that the estimator is minimax optimal under a well-chosen bandwidth.
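A minimal sketch of the kind of setup described above might pair randomized response on the failure indicators with a debiased Nelson-Aalen-type estimator; the specific mechanism, the debiasing step, and the simulated data below are assumptions for illustration and may differ from the paper's kernel estimator.

```python
# Minimal sketch (assumptions: randomized response on the failure indicators and a
# debiased Nelson-Aalen-type estimator; the paper's exact mechanism/estimator may differ).
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 2000, 1.0

# Simulated right-censored data: exponential lifetimes, uniform censoring.
lifetimes = rng.exponential(1.0, n)
censor = rng.uniform(0, 3, n)
T = np.minimum(lifetimes, censor)            # observed times
delta = (lifetimes <= censor).astype(float)  # true failure indicators (kept private)

# alpha-LDP randomized response: report delta truthfully with prob p, flip otherwise.
p = np.exp(alpha) / (1.0 + np.exp(alpha))
Z = np.where(rng.uniform(size=n) < p, delta, 1.0 - delta)
delta_priv = (Z - (1.0 - p)) / (2.0 * p - 1.0)   # unbiased surrogate for delta

def cum_hazard(t_grid, T, d):
    """Nelson-Aalen-type cumulative hazard using (possibly privatized) indicators."""
    order = np.argsort(T)
    T_s, d_s = T[order], d[order]
    at_risk = n - np.arange(n)                   # number at risk at each sorted time
    increments = d_s / at_risk
    return np.array([increments[T_s <= t].sum() for t in t_grid])

grid = np.linspace(0.1, 2.0, 20)
print(np.c_[grid, cum_hazard(grid, T, delta), cum_hazard(grid, T, delta_priv)])
```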
Vecchia approximation has been widely used to accurately scale Gaussian-process (GP) inference to large datasets, by expressing the joint density as a product of conditional densities with small conditioning sets. We study fixed-domain asymptotic properties of Vecchia-based GP inference for a large class of covariance functions (including Mat\'ern covariances) with boundary conditioning. In this setting, we establish that consistency and asymptotic normality of maximum exact-likelihood estimators imply those of maximum Vecchia-likelihood estimators, and that exact GP prediction can be approximated accurately by Vecchia GP prediction, provided that the size of the conditioning sets grows polylogarithmically with the data size. Hence, Vecchia-based inference with quasilinear complexity is asymptotically equivalent to exact GP inference with cubic complexity. This also provides a general new result on the screening effect. Our findings are illustrated by numerical experiments, which also show that Vecchia approximation can be more accurate than alternative approaches such as covariance tapering and reduced-rank approximations.
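A minimal sketch of a Vecchia-type log-likelihood follows, assuming a Matérn-3/2 covariance, a fixed ordering of the data, and nearest-neighbour conditioning sets of size m (the paper's boundary conditioning and asymptotic regime are not reproduced here).

```python
# Minimal sketch (assumed setup): Vecchia approximation of a GP log-likelihood with a
# Matern-3/2 covariance and nearest-neighbour conditioning sets of size m.
import numpy as np
from scipy.spatial.distance import cdist

def matern32(X1, X2, sigma2=1.0, rho=0.2):
    d = cdist(X1, X2)
    return sigma2 * (1.0 + np.sqrt(3) * d / rho) * np.exp(-np.sqrt(3) * d / rho)

def vecchia_loglik(y, X, m=10, **kern):
    n = len(y)
    ll = 0.0
    for i in range(n):
        if i == 0:
            mu, var = 0.0, matern32(X[:1], X[:1], **kern)[0, 0]
        else:
            # condition on (at most) the m nearest previously ordered points
            prev = np.arange(i)
            nn = prev[np.argsort(np.linalg.norm(X[prev] - X[i], axis=1))[:m]]
            K_nn = matern32(X[nn], X[nn], **kern)
            k_in = matern32(X[i:i + 1], X[nn], **kern).ravel()
            w = np.linalg.solve(K_nn, k_in)
            mu = w @ y[nn]
            var = matern32(X[i:i + 1], X[i:i + 1], **kern)[0, 0] - w @ k_in
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 2))
y = rng.multivariate_normal(np.zeros(500), matern32(X, X) + 1e-8 * np.eye(500))
print(vecchia_loglik(y, X, m=10))
```

Each conditional term only requires factorizing an m-by-m matrix, which is the source of the quasilinear complexity mentioned in the abstract.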
We consider finite element approximations to the optimal constant for the Hardy inequality with exponent $p=2$ in bounded domains of dimension $n=1$ or $n \geq 3$. For finite element spaces of piecewise linear and continuous functions on a mesh of size $h$, we prove that the approximate Hardy constant converges to the optimal Hardy constant at a rate proportional to $1/| \log h |^2$. This result holds in dimension $n=1$, in any dimension $n \geq 3$ if the domain is the unit ball and the finite element discretization exploits the rotational symmetry of the problem, and in dimension $n=3$ for general finite element discretizations of the unit ball. In the first two cases, our estimates show excellent quantitative agreement with values of the discrete Hardy constant obtained computationally.
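For the one-dimensional case, the discrete Hardy constant can be computed as the smallest eigenvalue of a generalized eigenvalue problem between the stiffness matrix and a weighted mass matrix; the sketch below assumes the model problem on (0,1) with weight $1/x^2$ and homogeneous Dirichlet conditions, which may differ in detail from the paper's precise setting.

```python
# Rough sketch (assumed 1D model problem): discrete Hardy constant on (0,1) with weight
# 1/x^2, P1 elements on a uniform mesh, computed as the smallest eigenvalue of the
# generalized problem K u = lam * M_w u.  The optimal constant is 1/4.
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.linalg import eigh

M = 1000                      # number of elements
h = 1.0 / M
nodes = np.linspace(0.0, 1.0, M + 1)
gp, gw = leggauss(4)          # Gauss points/weights on [-1, 1]

nI = M - 1                    # interior nodes (Dirichlet at both endpoints)
K = np.zeros((nI, nI))
Mw = np.zeros((nI, nI))
for k in range(M):            # loop over elements (x_k, x_{k+1})
    a, b = nodes[k], nodes[k + 1]
    xq = 0.5 * (b - a) * gp + 0.5 * (a + b)
    wq = 0.5 * (b - a) * gw
    phi = np.vstack([(b - xq) / h, (xq - a) / h])   # local hat functions
    dphi = np.array([-1.0 / h, 1.0 / h])
    for i_loc, i_glob in enumerate((k, k + 1)):
        for j_loc, j_glob in enumerate((k, k + 1)):
            if 1 <= i_glob <= nI and 1 <= j_glob <= nI:
                K[i_glob - 1, j_glob - 1] += dphi[i_loc] * dphi[j_loc] * (b - a)
                Mw[i_glob - 1, j_glob - 1] += np.sum(wq * phi[i_loc] * phi[j_loc] / xq**2)

lam = eigh(K, Mw, eigvals_only=True)[0]
print(lam)   # approaches 1/4 from above only slowly, consistent with a 1/|log h|^2 rate
```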
We explore a linear inhomogeneous elasticity equation with random Lam\'e parameters. The latter are parameterized by a countably infinite number of terms in separated expansions. The main aim of this work is to estimate expected values (considered as infinite-dimensional integrals over the parametric space corresponding to the random coefficients) of linear functionals acting on the solution of the elasticity equation. To achieve this, the expansions of the random parameters are truncated, a high-order quasi-Monte Carlo (QMC) rule is combined with a sparse grid approach to approximate the high-dimensional integral, and a Galerkin finite element method (FEM) is introduced to approximate the solution of the elasticity equation over the physical domain. The error estimates from (1) truncating the infinite expansion, (2) the Galerkin FEM, and (3) the QMC sparse grid quadrature rule are all studied. For this purpose, we show certain required regularity properties of the continuous solution with respect to both the parametric and physical variables. To achieve our theoretical regularity and convergence results, some reasonable assumptions on the expansions of the random coefficients are imposed. Finally, some numerical results are presented.
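The full elasticity setting is too long to sketch here, but the overall pipeline (truncate the expansion, solve a Galerkin FEM problem per parameter sample, average a linear functional over the samples) can be illustrated on a 1D diffusion surrogate; the coefficient expansion, the Sobol rule standing in for the paper's high-order QMC/sparse-grid combination, and all constants below are assumptions for illustration only.

```python
# Minimal sketch (simplified stand-in): estimating E[G(u)] for a 1D diffusion problem with
# a truncated random-coefficient expansion, P1 FEM, and a QMC rule (Sobol points here,
# instead of the paper's high-order rule; the paper treats elasticity in higher dimensions).
import numpy as np
from scipy.stats import qmc

s, M = 16, 256                       # truncation dimension, number of elements
h = 1.0 / M
x_mid = (np.arange(M) + 0.5) * h     # element midpoints

def solve_fem(y):
    # coefficient a(x, y) = 1 + sum_j y_j sin(j pi x) / j^2, evaluated per element
    j = np.arange(1, s + 1)
    a = 1.0 + np.sin(np.outer(x_mid, j) * np.pi) @ (y / j**2)
    # assemble tridiagonal stiffness matrix for -(a u')' = 1, u(0) = u(1) = 0
    main = (a[:-1] + a[1:]) / h
    K = np.diag(main) + np.diag(-a[1:-1] / h, 1) + np.diag(-a[1:-1] / h, -1)
    u = np.linalg.solve(K, np.full(M - 1, h))   # load f = 1
    return h * u.sum()                          # linear functional G(u) ~ integral of u

sobol = qmc.Sobol(d=s, scramble=True, seed=0)
Y = sobol.random(2**10) - 0.5                   # shift to [-1/2, 1/2]^s
print(np.mean([solve_fem(y) for y in Y]))       # QMC estimate of E[G(u)]
```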
For a sequence of random structures with $n$-element domains over a relational signature, we define its first order (FO) complexity as a certain subset of the Banach space $\ell^{\infty}/c_0$. The well-known FO zero-one law and FO convergence law correspond to FO complexities equal to $\{0,1\}$ and a subset of $\mathbb{R}$, respectively. We present a hierarchy of FO complexity classes, introduce a stochastic FO reduction that allows one to transfer complexity results between different random structures, and use this tool to deduce several new logical limit laws for binomial random structures. Finally, we introduce a conditional distribution on graphs, subject to an FO sentence $\varphi$, that generalises certain well-known random graph models, show instances of this distribution for every complexity class, and prove that the set of all $\varphi$ validating the 0--1 law is not recursively enumerable.
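As a toy illustration of the zero-one law referenced above (not the paper's construction), one can estimate by Monte Carlo the probability that the binomial random graph $G(n,1/2)$ satisfies the FO sentence "there exist three pairwise adjacent vertices"; the estimates tend to 1 as $n$ grows. The sketch below assumes the networkx library.

```python
# Illustrative sketch (not from the paper): Monte Carlo estimate of Pr[G(n, 1/2) contains
# a triangle], an FO property whose limiting probability is 1 by the FO zero-one law.
import networkx as nx

def has_triangle(G):
    return any(t > 0 for t in nx.triangles(G).values())

trials = 200
for n in (4, 8, 16, 32):
    hits = sum(has_triangle(nx.gnp_random_graph(n, 0.5, seed=k * 1000 + n))
               for k in range(trials))
    print(n, hits / trials)   # estimates approach 1 as n grows
```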
New low-order $H(\textrm{div})$-conforming finite elements for symmetric tensors are constructed in arbitrary dimension. The space of shape functions is defined by enriching the symmetric quadratic polynomial space with the $(d+1)$-order normal-normal face bubble space. The reduced counterpart has only $d(d+1)^2$ degrees of freedom. Basis functions are explicitly given in terms of barycentric coordinates. Low-order conforming finite element elasticity complexes, starting from the Bell element, are developed in two dimensions. These finite elements for symmetric tensors are applied to devise robust mixed finite element methods for the linear elasticity problem, which possess uniform error estimates with respect to the Lam\'{e} coefficient $\lambda$ and superconvergence for the displacement. Numerical results are provided to verify the theoretical convergence rates.
The joint bidiagonalization (JBD) process iteratively reduces a matrix pair $\{A,L\}$ to two bidiagonal forms simultaneously, which can be used for computing a partial generalized singular value decomposition (GSVD) of $\{A,L\}$. The process has a nested inner-outer iteration structure, where the inner iteration usually cannot be computed exactly. In this paper, we study the inaccurately computed inner iterations of the JBD process by first investigating the influence of the computational error of the inner iteration on the outer iteration, and then proposing a reorthogonalized JBD (rJBD) process that maintains the orthogonality of part of the Lanczos vectors. An error analysis of the rJBD is carried out to build connections with Lanczos bidiagonalizations. The results are then used to investigate the convergence and accuracy of the rJBD-based GSVD computation. It is shown that the accuracy of the computed GSVD components depends on the computing accuracy of the inner iterations and the condition number of $(A^T,L^T)^T$, while the convergence rate is not affected very much. For practical JBD-based GSVD computations, our results provide a guideline for choosing a proper computing accuracy for the inner iterations in order to obtain approximate GSVD components with a desired accuracy. Numerical experiments confirm our theoretical results.
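The JBD recurrences themselves are beyond the scope of a short sketch, but the Lanczos-type bidiagonalization they build on, and the role of reorthogonalization in keeping the generated vectors orthogonal, can be illustrated with the classical Golub-Kahan process; the code below is that standard building block with full reorthogonalization, not the paper's rJBD.

```python
# Minimal sketch (classical building block, not the paper's JBD process itself):
# Golub-Kahan / Lanczos bidiagonalization with full reorthogonalization, the kind of
# orthogonality maintenance the reorthogonalized JBD applies to part of its vectors.
import numpy as np

def golub_kahan(A, b, k):
    m, n = A.shape
    U = np.zeros((m, k + 1)); V = np.zeros((n, k))
    alphas, betas = np.zeros(k), np.zeros(k + 1)
    betas[0] = np.linalg.norm(b); U[:, 0] = b / betas[0]
    for j in range(k):
        v = A.T @ U[:, j] - (betas[j] * V[:, j - 1] if j > 0 else 0)
        v -= V[:, :j] @ (V[:, :j].T @ v)          # full reorthogonalization
        alphas[j] = np.linalg.norm(v); V[:, j] = v / alphas[j]
        u = A @ V[:, j] - alphas[j] * U[:, j]
        u -= U[:, :j + 1] @ (U[:, :j + 1].T @ u)  # full reorthogonalization
        betas[j + 1] = np.linalg.norm(u); U[:, j + 1] = u / betas[j + 1]
    return U, V, alphas, betas

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 40))
U, V, _, _ = golub_kahan(A, rng.normal(size=60), 20)
print(np.linalg.norm(U.T @ U - np.eye(21)), np.linalg.norm(V.T @ V - np.eye(20)))
```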
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of such a theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
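A schematic rendering of the comparison step might look as follows; the distance, the temperature parameter tau, and the synthetic saliency maps are assumptions for illustration and not the authors' fitted model.

```python
# Schematic sketch (assumed rendering of the idea, not the authors' fitted model):
# the explainee compares the AI's saliency map to the maps they would give for each
# candidate label, with similarity following Shepard's exponential law exp(-d).
import numpy as np

def predict_inference(ai_saliency, own_saliency_per_label, tau=1.0):
    """Return a probability over labels for what the explainee thinks the AI decided."""
    sims = np.array([
        np.exp(-tau * np.linalg.norm(ai_saliency - s))   # Shepard's law of generalization
        for s in own_saliency_per_label
    ])
    return sims / sims.sum()

rng = np.random.default_rng(0)
ai_map = rng.random((8, 8))
own_maps = [ai_map + 0.1 * rng.random((8, 8)), rng.random((8, 8))]  # label 0 resembles AI's map
print(predict_inference(ai_map, own_maps))   # probability mass concentrates on the similar label
```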