We present a non-intrusive gradient algorithm for parameter estimation problems in non-stationary elasticity. To avoid multiple (and potentially expensive) solutions of the underlying partial differential equation (PDE), we approximate the PDE solver by a neural network within the gradient algorithm. The network is trained offline for a given set of parameters. The algorithm is applied to an unsteady linear-elastic contact problem; its convergence and approximation properties are investigated numerically.
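A minimal sketch of the surrogate-in-the-loop idea (all function names, the synthetic forward map, and the step size are hypothetical; the actual network is trained offline on PDE solves):

```python
import numpy as np

def surrogate_pde(theta):
    """Stand-in for the offline-trained network that replaces the PDE
    solver; here just a smooth synthetic forward map for illustration."""
    return np.array([np.sin(theta[0]) + theta[1], theta[0] * theta[1]])

def misfit(theta, u_obs):
    # Least-squares data misfit between surrogate output and measurements.
    return 0.5 * np.sum((surrogate_pde(theta) - u_obs) ** 2)

def grad_fd(theta, u_obs, h=1e-6):
    # Finite-difference gradient of the misfit; a real implementation
    # could instead backpropagate through the trained network.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        g[i] = (misfit(theta + e, u_obs) - misfit(theta - e, u_obs)) / (2 * h)
    return g

u_obs = surrogate_pde(np.array([0.7, 1.3]))  # synthetic "measurements"
theta = np.array([0.0, 0.5])                 # initial parameter guess
for _ in range(500):                         # plain gradient descent
    theta = theta - 0.1 * grad_fd(theta, u_obs)
print(theta)  # moves toward a parameter set matching u_obs
```

Because every gradient step queries only the cheap surrogate, the (potentially expensive) PDE solver is never called inside the loop; this is what makes the scheme non-intrusive.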
We consider the classical problems of interpolating a polynomial given a black box for evaluation, and of multiplying two polynomials, in the setting where the bit-lengths of the coefficients may vary widely, so-called unbalanced polynomials. Writing $s$ for the total bit-length and $D$ for the degree, our new algorithms have expected running time $\tilde{O}(s \log D)$, whereas previous methods for dense and sparse arithmetic have bit complexity at least $\tilde{O}(sD)$ and $\tilde{O}(s^2)$, respectively.
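To make the unbalanced regime concrete, consider the illustrative example (ours, not drawn from the paper)
\[
f(x) \;=\; 2^{1000} \;+\; \sum_{i=1}^{D} x^i,
\]
which has degree $D$ and total bit-length $s \approx 1000 + D$: one very long coefficient and $D$ short ones. A dense representation pads every coefficient to the maximum length, so merely writing the polynomial down costs roughly $1000\,D$ bits even though $s \approx 1000 + D$; a bound of the form $\tilde{O}(s \log D)$ instead scales with the bits actually present.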
The broad class of multivariate unified skew-normal (SUN) distributions has recently been shown to possess fundamental conjugacy properties. When used as priors for the vector of parameters in general probit, tobit, and multinomial probit models, these distributions yield posteriors that still belong to the SUN family. Although this core result has led to important advancements in Bayesian inference and computation, its applicability beyond likelihoods associated with fully-observed, discretized, or censored realizations from multivariate Gaussian models remains unexplored. This article closes this gap by proving that the wider family of multivariate unified skew-elliptical (SUE) distributions, which extends SUNs to more general perturbations of elliptical densities, guarantees conjugacy for broader classes of models beyond those relying on fully-observed, discretized, or censored Gaussians. The proof leverages the closure of the SUE family under linear combinations, conditioning, and marginalization to show that it is conjugate to the likelihood induced by general multivariate regression models for fully-observed, censored, or dichotomized realizations from skew-elliptical distributions. This advancement substantially enlarges the set of models that enable conjugate Bayesian inference, extending it to general formulations arising from elliptical and skew-elliptical families, including the multivariate Student's $t$ and skew-$t$, among others.
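As a concrete instance of the SUN conjugacy being generalized here (the known probit case, with the parameter updates omitted for brevity):
\[
\beta \sim \mathrm{SUN}, \qquad p(y \mid \beta) = \prod_{i=1}^{n} \Phi\big((2y_i - 1)\, x_i^\top \beta\big) \;\;\Longrightarrow\;\; \beta \mid y \sim \mathrm{SUN}.
\]
The article establishes the analogous closure with SUE priors and with likelihoods built from skew-elliptical, rather than Gaussian, realizations.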
Covariance and Hessian matrices have been analyzed separately in the literature for classification problems. Integrating the two, however, has the potential to improve classification performance beyond what either achieves alone. We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with that of a Hessian matrix evaluated on a deep learning model to achieve optimal class separability in binary classification tasks. Our approach is substantiated by formal proofs that establish its capability to maximize the between-class mean distance and minimize the within-class variances. By projecting data into the combined space of the most relevant eigendirections from both matrices, we achieve optimal class separability as per the linear discriminant analysis (LDA) criteria. Empirical validation across neural and health datasets consistently supports our theoretical framework and demonstrates that our method outperforms established methods. Our method stands out by addressing both LDA criteria, unlike PCA and the Hessian method, which predominantly emphasize one criterion each. This comprehensive approach captures intricate patterns and relationships, enhancing classification performance. Furthermore, by utilizing both LDA criteria, our method outperforms LDA itself by leveraging higher-dimensional feature spaces, in accordance with Cover's theorem, which favors linear separability in higher dimensions. Our method also surpasses kernel-based methods and manifold learning techniques in performance. Additionally, our approach sheds light on complex DNN decision-making, rendering it comprehensible within a 2D space.
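A minimal numpy sketch of the combined projection (the data and the stand-in Hessian are synthetic; a real pipeline would take the Hessian of a trained deep model's loss):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical inputs: features X (n x d) and binary labels y.
X = rng.normal(size=(200, 5))
X[100:] += 2.0
y = np.repeat([0, 1], 100)
H = np.cov(X[y == 0].T) + np.cov(X[y == 1].T)  # illustrative stand-in Hessian
C = np.cov(X.T)                                # covariance on the training set

def top_eigvec(M):
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the last column is the most relevant eigendirection.
    _, V = np.linalg.eigh(M)
    return V[:, -1]

# Combined 2D space: one eigendirection from the covariance matrix and
# one from the Hessian, mirroring the projection described in the abstract.
P = np.stack([top_eigvec(C), top_eigvec(H)], axis=1)  # d x 2
Z = X @ P                                             # data in the combined space
print(Z.shape)                                        # (200, 2)
```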
Quantization of a Borel probability measure refers to approximating a given probability measure by a discrete probability measure whose support contains only finitely many elements. In this paper, we consider a Borel probability measure $P$ on $\mathbb R^2$ whose support is a nonuniform stretched Sierpi\'{n}ski triangle generated by a set of three contractive similarity mappings on $\mathbb R^2$. For this probability measure, we investigate the optimal sets of $n$-means and the $n$th quantization errors for all positive integers $n$.
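For reference, the central objects are defined as follows (the standard definitions, stated here for convenience): the $n$th quantization error of $P$ is
\[
V_n(P) \;=\; \inf\Big\{ \int \min_{a \in \alpha} \|x - a\|^2 \, dP(x) \;:\; \alpha \subset \mathbb{R}^2,\; 1 \le \operatorname{card}(\alpha) \le n \Big\},
\]
and any set $\alpha$ attaining the infimum is called an optimal set of $n$-means.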
Mesh-based Graph Neural Networks (GNNs) have recently shown the capability to simulate complex multiphysics problems with accelerated performance. However, mesh-based GNNs require a large number of message-passing (MP) steps and suffer from over-smoothing for problems involving very fine meshes. In this work, we develop a multiscale mesh-based GNN framework that mimics a conventional iterative multigrid solver, coupled with adaptive mesh refinement (AMR), to mitigate these challenges. We use the framework to accelerate phase field (PF) fracture problems involving coupled partial differential equations with a near-singular operator due to the near-zero modulus inside the crack. We define the initial graph representation using all mesh resolution levels. We perform a series of downsampling steps using Transformer MP GNNs to reach the coarsest graph, followed by upsampling steps to return to the original graph; a minimal sketch of this coarsen-then-refine pass appears below. We use skip connections from the embeddings generated during coarsening to prevent over-smoothing. We use Transfer Learning (TL) to significantly reduce the size of the training datasets needed to simulate different crack configurations and loading conditions. The trained framework shows accelerated simulation times while maintaining high accuracy in all cases compared to the physics-based PF fracture model. Finally, this work provides a new approach to accelerating a variety of mesh-based engineering multiphysics problems.
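The sketch below shows one coarsen-then-refine pass with a skip connection (graph sizes, the pooling operator, and the plain mean-aggregation MP step are all illustrative stand-ins; the paper uses Transformer MP layers on AMR meshes):

```python
import numpy as np

rng = np.random.default_rng(0)

def mp_step(A, H, W):
    """One message-passing step: mean-aggregate neighbor features,
    then apply a learned transform and a nonlinearity."""
    deg = A.sum(axis=1, keepdims=True) + 1e-9
    return np.tanh((A @ H) / deg @ W)

# Hypothetical two-level hierarchy: a fine graph (8 nodes) and a coarse one (4).
A_fine = (rng.random((8, 8)) < 0.4).astype(float)
A_fine = np.maximum(A_fine, A_fine.T)            # symmetrize adjacency
A_coarse = np.ones((4, 4)) - np.eye(4)
Pool = np.kron(np.eye(4), np.ones((1, 2))) / 2   # 4x8 averaging pool (illustrative)

H = rng.normal(size=(8, 16))                     # initial node embeddings
W1, W2, W3 = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))

H_fine = mp_step(A_fine, H, W1)                  # MP on the fine level
H_coarse = mp_step(A_coarse, Pool @ H_fine, W2)  # downsample, MP on coarse level
H_up = Pool.T @ H_coarse * 2                     # upsample back to the fine level
H_out = mp_step(A_fine, H_up + H_fine, W3)       # skip connection counters over-smoothing
print(H_out.shape)                               # (8, 16)
```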
We present an information-theoretic lower bound for the problem of parameter estimation with time-uniform coverage guarantees. Via a new reduction to sequential testing, we obtain stronger lower bounds that capture the hardness of the time-uniform setting. In the case of location model estimation, logistic regression, and exponential family models, our $\Omega(\sqrt{n^{-1}\log \log n})$ lower bound is sharp to within constant factors in typical settings.
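For intuition (a classical fact, not the paper's argument): in i.i.d. Gaussian mean estimation the law of the iterated logarithm gives
\[
\limsup_{n \to \infty} \frac{\sqrt{n}\,\lvert \bar{X}_n - \mu \rvert}{\sigma \sqrt{2 \log\log n}} \;=\; 1 \quad \text{a.s.},
\]
so any confidence sequence that is valid uniformly over time must have width of order $\sqrt{n^{-1}\log\log n}$, which is precisely the rate the lower bound certifies.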
Karppa & Kaski (2019) proposed a novel ``broken'' or ``opportunistic'' matrix multiplication algorithm, based on a variant of Strassen's algorithm, and used it to develop new algorithms for Boolean matrix multiplication, among other tasks. Their algorithm can compute Boolean matrix multiplication in $O(n^{2.778})$ time. While asymptotically faster matrix multiplication algorithms exist, most such algorithms are infeasible for practical problems. We describe an alternative way to use the broken multiplication algorithm to approximately compute matrix multiplication, for either real-valued or Boolean matrices. In brief, instead of running multiple iterations of the broken algorithm on the original input matrix, we form a new, larger matrix by sampling and run a single iteration of the broken algorithm on it. Asymptotically, our algorithm has runtime $O(n^{2.763})$, a slight improvement over the Karppa-Kaski algorithm. Since the goal is to obtain new practical matrix-multiplication algorithms, we also estimate the concrete runtime of our algorithm on some large-scale sample problems. It appears that for these parameters, further optimizations are still needed to make our algorithm competitive.
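For orientation, here is one level of the (unbroken) Strassen scheme that the broken variant modifies; the broken variant computes only a subset of the seven products and tolerates errors in some output entries, with details as in Karppa & Kaski (not reproduced here):

```python
import numpy as np

def strassen_one_level(A, B):
    """One level of Strassen's algorithm on even-sized square matrices."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    # Strassen's seven products; a broken/opportunistic variant would
    # skip some of these, accepting errors in parts of the output.
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

A = np.random.rand(4, 4)
B = np.random.rand(4, 4)
print(np.allclose(strassen_one_level(A, B), A @ B))  # True
```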
We investigate pointwise estimation of the function-valued velocity field of a second-order linear SPDE. Based on multiple spatially localised measurements, we construct a weighted augmented MLE and study its convergence properties as the spatial resolution of the observations tends to zero and the number of measurements increases. By imposing H\"older smoothness conditions, we recover the pointwise convergence rate known to be minimax-optimal in the linear regression framework. The optimality of the rate in the current setting is verified by adapting the lower bound ansatz based on the RKHS of local measurements to the nonparametric situation.
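For context, the classical benchmark this echoes (a standard fact, not the paper's exact statement): for pointwise estimation of a function that is H\"older-smooth of order $\beta$ from $n$ noisy observations, the minimax rate is
\[
n^{-\beta/(2\beta + 1)},
\]
and it is a rate of this type that the weighted augmented MLE attains and that the RKHS-based lower bound matches.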
In exploratory factor analysis, model parameters are usually estimated by the maximum likelihood method. The maximum likelihood estimate is obtained by solving a complicated multivariate algebraic equation. Since the solution to the equation is usually intractable, it is typically computed with continuous optimization methods, such as Newton-Raphson methods. With this procedure, however, the solution inevitably depends on the estimation algorithm and the initial value, since the log-likelihood function is highly non-concave. In particular, the estimates of the unique variances can turn out to be zero or negative, so-called improper solutions, in which case the maximum likelihood estimate can be severely unstable. To delve into the instability of the maximum likelihood estimate, we compute exact solutions to the multivariate algebraic equation by means of algebraic computation. We provide a computationally efficient algorithm, based on these algebraic computations, that is specifically optimized for maximum likelihood factor analysis. Specifically, we employ Gr\"obner bases and cylindrical algebraic decomposition, powerful tools for solving multivariate algebraic equations. Our proposed procedure produces all exact solutions to the algebraic equation; these solutions are therefore independent of the initial value and the estimation algorithm. We conduct Monte Carlo simulations to investigate the characteristics of the maximum likelihood solutions.
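A minimal sketch of the Gr\"obner-basis step on a toy polynomial system (the system and variable names are illustrative stand-ins for the much larger likelihood equations of factor analysis):

```python
import sympy as sp

x, y = sp.symbols('x y')
# Toy stationarity equations standing in for the likelihood equations.
F = [x**2 + y**2 - 4, x*y - 1]

# A lex-order Groebner basis triangularizes the system: its last element
# is a polynomial in y alone, so every solution can be found exactly by
# back-substitution, independent of any initial value.
G = sp.groebner(F, x, y, order='lex')
print(G.exprs)

# Enumerate all exact solutions of the system.
print(sp.solve(F, [x, y], dict=True))
```

Because the basis captures the full solution variety, no starting point or iterative scheme is involved, which is exactly the property exploited to study the stability of the maximum likelihood solutions.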
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. This lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
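A minimal sketch of the generalization mechanism (Shepard's law: similarity decays exponentially with distance in psychological space); the embeddings, labels, and dimensions below are hypothetical:

```python
import numpy as np

def shepard_similarity(x, y, k=1.0):
    """Shepard's universal law: generalization decays exponentially
    with distance in the similarity space."""
    return np.exp(-k * np.linalg.norm(x - y))

# Hypothetical points in a similarity space: the explanation (saliency map)
# the AI gave, and the explanations the participant would give per label.
e_ai = np.array([0.9, 0.2])
e_human = {"cat": np.array([1.0, 0.1]), "dog": np.array([0.1, 0.9])}

# Predicted belief about the AI's decision: compare the AI's explanation
# to the explainee's own candidate explanations, then normalize.
scores = {c: shepard_similarity(e_ai, e) for c, e in e_human.items()}
Z = sum(scores.values())
print({c: s / Z for c, s in scores.items()})  # more mass on "cat"
```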