Post's Correspondence Problem (the PCP) is a classical decision problem in theoretical computer science that asks whether, for a pair of free monoid morphisms $g, h\colon\Sigma^*\to\Delta^*$, there exists any non-trivial $x\in\Sigma^*$ such that $g(x)=h(x)$. Post's Correspondence Problem for a group $\Gamma$ takes pairs of group homomorphisms $g, h\colon F(\Sigma)\to \Gamma$ instead, and similarly asks whether there exists an $x$ such that $g(x)=h(x)$ holds for non-elementary reasons. The restrictions imposed on $x$ in order to obtain non-elementary solutions lead to several interpretations of the problem; we consider the natural restriction asking that $x \notin \ker(g) \cap \ker(h)$ and prove that the resulting interpretation of the PCP is undecidable for arbitrary hyperbolic $\Gamma$, but decidable when $\Gamma$ is virtually nilpotent. We also study this problem for group constructions such as subgroups, direct products and finite extensions. This problem is equivalent to an interpretation due to Myasnikov, Nikolaev and Ushakov when one of the maps is injective.
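For illustration, a standard instance of the classical problem (not taken from the present work): take $\Sigma=\{1,2,3\}$, $\Delta=\{a,b\}$ and define $g,h$ on the generators by
\[
g(1)=a,\quad g(2)=ab,\quad g(3)=bba, \qquad h(1)=baa,\quad h(2)=aa,\quad h(3)=bb.
\]
Then $x=3231$ is a non-trivial solution, since $g(x)=bba\cdot ab\cdot bba\cdot a=bbaabbbaa=bb\cdot aa\cdot bb\cdot baa=h(x)$.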
Generalized linear models (GLMs) are popular for data analysis in almost all quantitative sciences, but the choice of likelihood family and link function is often difficult. This motivates the search for likelihoods and links that minimize the impact of potential misspecification. We perform a large-scale simulation study on double-bounded and lower-bounded response data in which we systematically vary both true and assumed likelihoods and links. In contrast to previous studies, we examine posterior calibration and uncertainty metrics in addition to point-estimate accuracy. Our results indicate that certain likelihoods and links can be remarkably robust to misspecification, performing almost on par with their respective true counterparts. Additionally, normal likelihood models with identity link (i.e., linear regression) often achieve calibration comparable to the more structurally faithful alternatives, at least in the studied scenarios. On the basis of our findings, we provide practical suggestions for robust likelihood and link choices in GLMs.
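For reference (standard GLM background, not part of the study's results), a GLM couples a likelihood family $\mathcal{D}$ with dispersion $\phi$ and a link function $g$ via
\[
y_i \sim \mathcal{D}(\mu_i, \phi), \qquad g(\mu_i) = x_i^\top \beta,
\]
for instance a beta likelihood with a logit link for double-bounded responses, or a gamma likelihood with a log link for lower-bounded ones; misspecification then means fitting with a family-link pair different from the one that generated the data.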
Stochastic PDEs of Fluctuating Hydrodynamics are a powerful tool for the description of fluctuations in many-particle systems. In this paper, we develop and analyze a Multilevel Monte Carlo (MLMC) scheme for the Dean-Kawasaki equation, a pivotal representative of this class of SPDEs. We prove analytically and demonstrate numerically that our MLMC scheme provides a significant speed-up (with respect to a standard Monte Carlo method) in the simulation of the Dean-Kawasaki equation. Specifically, we quantify how the speed-up factor increases with the average particle density, and show that sizeable speed-ups can be obtained even in regimes of low particle density. Numerical simulations are provided in the two-dimensional case, confirming our theoretical predictions. Our results are formulated entirely in terms of the law of the solution rather than in terms of strong spatial norms: this is what crucially allows for MLMC speed-ups at all, despite the Dean-Kawasaki equation being highly singular.
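For orientation, the generic MLMC identity behind such schemes, stated here in generic notation for an observable $\varphi$ evaluated at discretization levels $\ell=0,\dots,L$, is
\[
\mathbb{E}\big[\varphi(X_L)\big]=\mathbb{E}\big[\varphi(X_0)\big]+\sum_{\ell=1}^{L}\mathbb{E}\big[\varphi(X_\ell)-\varphi(X_{\ell-1})\big],
\]
where each term is estimated with independent Monte Carlo samples and the coarse/fine pairs in the correction terms are coupled so that their variance decreases with $\ell$; a law-based notion of accuracy fits this framework naturally, since only expectations of observables are required.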
In modern computational materials science, deep learning has shown the capability to predict interatomic potentials, thereby supporting and accelerating conventional simulations. However, existing models typically sacrifice either accuracy or efficiency. Moreover, lightweight models are in high demand, since they allow simulation of considerably larger systems at reduced computational cost. A century ago, Felix Bloch demonstrated how leveraging the equivariance of the translation operation on a crystal lattice (with geometric symmetry) could significantly reduce the computational cost of determining wavefunctions and of accurately calculating material properties. Here, we introduce a lightweight equivariant interaction graph neural network (LEIGNN) that enables accurate and efficient interatomic potential and force predictions in crystals. Rather than relying on higher-order representations, LEIGNN employs a scalar-vector dual representation to encode equivariant features. By extracting both local and global structures from vector representations and learning geometric symmetry information, our model remains lightweight while ensuring prediction accuracy and robustness through equivariance. Our results show that LEIGNN consistently outperforms representative baselines in prediction accuracy while achieving significantly better efficiency across diverse datasets, which include catalysts, molecules, and organic isomers. Finally, to further validate the interatomic potentials predicted by our model, we conduct classical molecular dynamics (MD) and ab initio MD simulations across various systems, including solid, liquid, and gas phases. We find that LEIGNN achieves the accuracy of ab initio MD while retaining the computational efficiency of classical MD across all examined systems, demonstrating its accuracy, efficiency, and universality.
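As a purely illustrative sketch (a generic scalar-vector equivariant update in plain NumPy, not the LEIGNN architecture itself), the idea of keeping scalar features rotation-invariant and vector features rotation-equivariant can look as follows:

```python
# Toy sketch of a scalar-vector dual representation update (generic illustration,
# NOT the LEIGNN architecture): scalars are built from rotation-invariant
# quantities (distances), vectors from relative directions, so rotating the
# input positions rotates the output vectors and leaves the scalars unchanged.
import numpy as np

def scalar_vector_update(pos, s):
    """pos: (N, 3) atomic positions; s: (N,) invariant scalar features."""
    n = pos.shape[0]
    s_new = s.copy()
    v_new = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = pos[j] - pos[i]
            d = np.linalg.norm(r)
            w = np.exp(-d)                # invariant weight from the distance
            s_new[i] += w * s[j]          # scalar message: rotation-invariant
            v_new[i] += w * s[j] * r / d  # vector message: rotation-equivariant
    return s_new, v_new
```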
A simple way of obtaining robust estimates of the "center" (or "location") and of the "spread" of a dataset is to use the maximum likelihood estimate for a class of heavy-tailed distributions, regardless of the "true" distribution generating the data. We observe that the maximum likelihood problem for the Cauchy distributions, which have particularly heavy tails, is geodesically convex and therefore efficiently solvable (Cauchy distributions are parametrized by the upper half plane, i.e. by the hyperbolic plane). Moreover, it has an appealing geometric meaning: the data points, living on the boundary of the hyperbolic plane, attract the parameter with unit forces, and we seek the point where these forces are in equilibrium. This picture generalizes to several classes of multivariate distributions with heavy tails, including, in particular, the multivariate Cauchy distributions. The hyperbolic plane is replaced by symmetric spaces of noncompact type. Geodesic convexity gives us an efficient numerical solution of the maximum likelihood problem for these distribution classes, which can then be used for robust estimates of location and spread, thanks to the heavy tails of these distributions.
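A minimal numerical sketch of the resulting robust estimator (illustrative only; generic numerical optimization rather than the geodesic methods discussed in the paper):

```python
# Hypothetical sketch (not the paper's algorithm): robust location/scale estimate
# via the univariate Cauchy maximum likelihood problem, solved by generic optimization.
import numpy as np
from scipy.optimize import minimize

def cauchy_mle(data):
    """Return (location x0, scale gamma) maximizing the Cauchy likelihood."""
    def neg_log_lik(params):
        x0, log_gamma = params
        gamma = np.exp(log_gamma)          # keep the scale positive
        # -sum of log densities  gamma / (pi * ((t - x0)^2 + gamma^2))
        return (np.sum(np.log((data - x0) ** 2 + gamma ** 2) - np.log(gamma))
                + data.size * np.log(np.pi))
    res = minimize(neg_log_lik, x0=[np.median(data), 0.0])  # robust starting point
    x0, log_gamma = res.x
    return x0, np.exp(log_gamma)

# Heavy outliers barely move the estimate, unlike the sample mean.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, 100), [500.0, -800.0]])
print(cauchy_mle(sample), sample.mean())
```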
The convolution quadrature method originally developed for the Riemann-Liouville fractional calculus is extended in this work to the Hadamard fractional calculus by using exponential-type meshes. Local truncation error analysis is presented for singular solutions. By adopting the fractional BDF-$p$ ($1\leq p \leq 6$) for the Caputo-Hadamard fractional derivative when solving the subdiffusion problem with singular source terms, and using the finite element method to discretize the space variable, we rigorously carry out a sharp error analysis and obtain optimal accuracy by means of a novel correction technique. Our correction method is a natural generalization of the one developed for subdiffusion problems with smooth source terms. Numerical tests confirm our theoretical results.
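For orientation, recall the classical convolution quadrature in the Riemann-Liouville setting on a uniform mesh with step size $\tau$ (standard background, stated in generic notation; the present work transfers this construction to the Hadamard calculus on exponential-type meshes):
\[
\partial_t^{\alpha}u(t_n)\;\approx\;\tau^{-\alpha}\sum_{j=0}^{n}\omega_j\,u(t_{n-j}),
\qquad
\sum_{j=0}^{\infty}\omega_j\zeta^{j}=\delta_p(\zeta)^{\alpha},
\qquad
\delta_p(\zeta)=\sum_{k=1}^{p}\frac{(1-\zeta)^{k}}{k},
\]
where $\delta_p$ is the generating polynomial of BDF-$p$.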
In this paper we study the convergence of a fully discrete Crank-Nicolson Galerkin scheme for the initial value problem associated with the fractional Korteweg-de Vries (KdV) equation, which involves the fractional Laplacian and non-linear convection terms. Our proof relies on a Kato-type local smoothing effect to estimate the localized $H^{\alpha/2}$-norm of the approximate solution, where $\alpha \in [1,2)$. We demonstrate that the scheme converges strongly in $L^2(0,T;L^2_{loc}(\mathbb{R}))$ to a weak solution of the fractional KdV equation provided the initial data is in $L^2(\mathbb{R})$. Assuming the initial data is sufficiently regular, we obtain the rate of convergence for the numerical scheme. Finally, the theoretical convergence rates are verified through several numerical illustrations.
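For reference, the fractional Laplacian appearing here can be characterized as a Fourier multiplier (standard definition),
\[
\widehat{(-\Delta)^{\alpha/2}u}(\xi)=|\xi|^{\alpha}\,\widehat{u}(\xi),
\]
with $\alpha\in[1,2)$ as above.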
In this article, the stabilizer-free weak Galerkin (SFWG) finite element method is applied to the Ciarlet-Raviart mixed form of the biharmonic equation. We utilize SFWG solutions of second-order elliptic problems to define projection operators, build error equations, and further derive error estimates. Finally, numerical examples support the theoretical results.
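For orientation, the Ciarlet-Raviart splitting rewrites the fourth-order problem as a pair of coupled second-order problems (generic notation; boundary conditions as appropriate for the problem):
\[
\Delta^2 u = f \ \text{in}\ \Omega
\qquad\Longleftrightarrow\qquad
-\Delta u = w, \quad -\Delta w = f \ \text{in}\ \Omega,
\]
which is precisely why the SFWG machinery for second-order elliptic problems can be reused to define projections and derive error estimates.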
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of an explainee's inferences conditioned on the explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
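In its simplest form, Shepard's law states that perceived similarity decays exponentially with distance in a psychological similarity space,
\[
s(x,y)=\exp\!\big(-c\,d(x,y)\big),\qquad c>0,
\]
and it is this kind of similarity that formalizes the comparison between the AI's explanation and the explanation a participant would give.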
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy obtained when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
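In symbols, one way to write the quantity just described (informal notation, for modalities $m_1$ and $m_2$) is
\[
u(m_1 \mid m_2)\;=\;\operatorname{acc}(m_1,m_2)\;-\;\operatorname{acc}(m_2),
\]
the accuracy gained by giving the model access to $m_1$ on top of $m_2$.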
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention-based graph neural network called Hyper-SAGNN that is applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
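As a generic illustration (not the Hyper-SAGNN model itself), a self-attention block over the node embeddings of a candidate hyperedge handles variable hyperedge sizes naturally, since attention and pooling are defined for any number of nodes:

```python
# Generic sketch (not Hyper-SAGNN itself): scoring a variable-sized candidate
# hyperedge by running self-attention over its node embeddings and pooling.
import numpy as np

def hyperedge_score(node_emb, Wq, Wk, Wv, w_out):
    """node_emb: (k, d) embeddings of the k nodes in a candidate hyperedge."""
    q, k_, v = node_emb @ Wq, node_emb @ Wk, node_emb @ Wv
    att = q @ k_.T / np.sqrt(q.shape[1])            # (k, k) attention logits
    att = np.exp(att - att.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)           # row-wise softmax
    contextual = att @ v                             # per-node contextual embeddings
    pooled = contextual.mean(axis=0)                 # works for any hyperedge size k
    return 1.0 / (1.0 + np.exp(-pooled @ w_out))     # probability the set is a hyperedge
```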