The elastic energy of a bending-resistant interface depends both on its geometry and on its material composition. We consider such a heterogeneous interface in the plane, modeled by a curve equipped with an additional density function. The resulting energy captures the complex interplay between curvature and density effects, resembling the Canham-Helfrich functional. We describe the curve by its inclination angle, so that the equilibrium equations reduce to an elliptic system of second order. After a brief variational discussion, we investigate the associated nonlocal $L^2$-gradient flow evolution, a coupled quasilinear parabolic problem. We analyze the (non)preservation of properties such as convexity, positivity, and symmetry, as well as the asymptotic behavior of the system. The results are illustrated by numerical experiments.
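The inclination-angle description can be made concrete in the classical homogeneous case, which is standard; the density-weighted variant written last is only a schematic illustration of how a density may enter, not necessarily the exact functional considered here.

```latex
% Arclength-parametrized planar curve with inclination angle \theta:
%   \gamma'(s) = (\cos\theta(s), \sin\theta(s)), \qquad \kappa = \theta_s .
% Classical elastic (Euler) bending energy in terms of \theta:
E(\gamma) = \frac{1}{2}\int_\gamma \kappa^2 \,\mathrm{d}s
          = \frac{1}{2}\int_0^L \theta_s^2 \,\mathrm{d}s .
% Schematic density-weighted variant (\beta, \mu hypothetical coefficients):
E(\gamma,\rho) = \int_0^L \bigl(\beta(\rho)\,\theta_s^2 + \mu(\rho)\bigr)\,\mathrm{d}s .
```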
Deep learning has gained immense popularity in the Earth sciences as it enables us to formulate purely data-driven models of complex Earth system processes. Deep learning-based weather prediction (DLWP) models have made significant progress in the last few years, achieving forecast skills comparable to established numerical weather prediction models at substantially lower computational cost. In order to train accurate, reliable, and tractable DLWP models with several million parameters, the model design needs to incorporate suitable inductive biases that encode structural assumptions about the data and the modelled processes. When chosen appropriately, these biases enable faster learning and better generalisation to unseen data. Although inductive biases play a crucial role in successful DLWP models, they are often not stated explicitly and their contribution to model performance remains unclear. Here, we review and analyse the inductive biases of state-of-the-art DLWP models with respect to five key design elements: data selection, learning objective, loss function, architecture, and optimisation method. We identify the most important inductive biases and highlight potential avenues towards more efficient and probabilistic DLWP models.
The Runge--Kutta discontinuous Galerkin (RKDG) method is a high-order technique for solving hyperbolic conservation laws that has been refined over recent decades and is effective in handling shock discontinuities. Despite these advancements, the RKDG method faces challenges, such as stringent constraints on the explicit time-step size and reduced robustness when dealing with strong discontinuities. On the other hand, the Gas-Kinetic Scheme (GKS), based on a high-order gas evolution model, also delivers high accuracy and stability in solving hyperbolic conservation laws through refined spatial and temporal discretizations. Unlike RKDG, GKS allows for more flexible CFL number constraints and features an advanced flow evolution mechanism at cell interfaces. Additionally, the compact spatial reconstruction of GKS enhances the method's accuracy and its ability to capture strong discontinuities stably. In this study, we conduct a thorough examination of the RKDG method using various numerical fluxes and of the GKS method employing both compact and non-compact spatial reconstructions. Both methods are applied within the framework of explicit time discretization and are tested solely in inviscid scenarios. We present numerous numerical tests and provide a comparative analysis of the outcomes derived from these two computational approaches.
In evidence synthesis, effect modifiers are typically described as variables that induce treatment effect heterogeneity at the individual level, through treatment-covariate interactions in an outcome model parametrized at that level. As such, effect modification is defined with respect to a conditional measure, but marginal effect estimates are required for population-level decisions in health technology assessment. For non-collapsible measures, purely prognostic variables that are not determinants of treatment response at the individual level may modify marginal effects, even where there is individual-level treatment effect homogeneity. With heterogeneity, marginal effects for measures that are not directly collapsible cannot be expressed in terms of marginal covariate moments, and generally depend on the joint distribution of conditional effect measure modifiers and purely prognostic variables. There are implications for recommended practices in evidence synthesis. Unadjusted anchored indirect comparisons can be biased even in the absence of individual-level treatment effect heterogeneity, or even when marginal covariate moments are balanced across studies. Covariate adjustment may be necessary to account for cross-study imbalances in joint covariate distributions involving purely prognostic variables. In the absence of individual patient data for the target, covariate adjustment approaches are inherently limited in their ability to remove bias for measures that are not directly collapsible. Directly collapsible measures would facilitate the transportability of marginal effects between studies by: (1) reducing dependence on model-based covariate adjustment where there is individual-level treatment effect homogeneity or marginal covariate moments are balanced; and (2) facilitating the selection of baseline covariates for adjustment where there is individual-level treatment effect heterogeneity.
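Non-collapsibility of the odds ratio, which drives the phenomenon described above, can be seen with a short numerical illustration. The stratum risks below are hypothetical placeholders chosen for the example, not figures from any study: the conditional odds ratio is exactly 2.0 in both strata of a purely prognostic covariate, yet the marginal odds ratio differs.

```python
# Illustration of odds-ratio non-collapsibility: the same conditional odds
# ratio of 2.0 holds in both covariate strata, yet the marginal
# (population-averaged) odds ratio is smaller, despite individual-level
# treatment effect homogeneity. All probabilities are hypothetical.

def odds(p):
    return p / (1 - p)

# Control-arm risks in two equally sized strata of a purely prognostic
# binary covariate X (illustrative values).
p0 = {"X=0": 0.2, "X=1": 0.7}
conditional_or = 2.0

# Apply the identical conditional odds ratio within each stratum.
p1 = {x: odds(p) * conditional_or / (1 + odds(p) * conditional_or)
      for x, p in p0.items()}

# Marginal risks, averaging the two equally weighted strata.
m0 = sum(p0.values()) / 2
m1 = sum(p1.values()) / 2
marginal_or = odds(m1) / odds(m0)

print(f"conditional OR = {conditional_or}, marginal OR = {marginal_or:.3f}")
```

Here the marginal odds ratio (about 1.68) is attenuated relative to the conditional one, even though no individual-level effect modification is present; shifting the distribution of X across studies would shift the marginal estimand.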
The uniform interpolation property in a given logic can be understood as the definability of propositional quantifiers. We mechanise the computation of these quantifiers and prove correctness in the Coq proof assistant for three modal logics, namely: (1) the modal logic K, for which a pen-and-paper proof exists; (2) G\"odel-L\"ob logic GL, for which our formalisation clarifies an important point in an existing, but incomplete, sequent-style proof; and (3) intuitionistic strong L\"ob logic iSL, for which this is the first proof-theoretic construction of uniform interpolants. Our work also yields verified programs that allow one to compute the propositional quantifiers on any formula in these logics.
Fo-bicategories are a categorification of Peirce's calculus of relations. Notably, their laws provide a proof system for first-order logic that is both purely equational and complete. This paper establishes a correspondence between fo-bicategories and Lawvere's hyperdoctrines. To streamline our proof, we introduce Peircean bicategories, which offer a more succinct characterization of fo-bicategories.
Correspondence analysis (CA) is a popular technique to visualize the relationship between two categorical variables. CA uses the data from a two-way contingency table and is affected by the presence of outliers. The supplementary points method is a popular way to handle outliers; its disadvantage is that the information from entire rows or columns is removed. However, outliers can also be caused by individual cells. In this paper, a reconstitution algorithm is introduced to cope with such cells. This algorithm can reduce the contribution of cells in CA instead of deleting entire rows or columns, so that the remaining information in the rows and columns involved can still be used in the analysis. The reconstitution algorithm is compared with two alternative methods for handling outliers, the supplementary points method and MacroPCA. It is shown that the proposed strategy works well.
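For readers unfamiliar with the baseline method, classical CA coordinates can be obtained from the SVD of the standardized residuals of the contingency table. The sketch below shows this standard computation on a toy table; it illustrates plain CA only, not the reconstitution algorithm proposed here, and the table values are made up.

```python
import numpy as np

# Minimal sketch of classical correspondence analysis: SVD of the matrix of
# standardized residuals of a two-way contingency table. This is the standard
# baseline method, not the reconstitution algorithm.

def ca_coordinates(N):
    """Principal row/column coordinates for a two-way contingency table N."""
    P = N / N.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    F = U * sv / np.sqrt(r)[:, None]      # principal row coordinates
    G = Vt.T * sv / np.sqrt(c)[:, None]   # principal column coordinates
    return F, G, sv

# Toy 3x3 table (hypothetical counts); total inertia is the sum of the
# squared singular values, and one singular value is zero because the
# trivial dimension has been removed by centering.
N = np.array([[20.0, 10.0,  5.0],
              [10.0, 25.0, 10.0],
              [ 5.0, 10.0, 30.0]])
F, G, sv = ca_coordinates(N)
print("singular values:", np.round(sv, 4))
```

An outlying cell inflates its standardized residual and can dominate the leading dimensions, which is exactly the situation the reconstitution algorithm is designed to mitigate at the cell level.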
Consistency of the maximum likelihood estimator for mixtures of elliptically symmetric distributions, as an estimator of its population version, is shown, where the underlying distribution $P$ is nonparametric and does not necessarily belong to the class of mixtures on which the estimator is based. In a situation where $P$ is a mixture of sufficiently well-separated but nonparametric distributions, it is shown that the components of the population version of the estimator correspond to the well-separated components of $P$. This provides some theoretical justification for the use of such estimators for cluster analysis in the case that $P$ has well-separated subpopulations, even if these subpopulations differ from what the mixture model assumes.
Twin nodes in a static network capture the idea of being substitutes for each other for maintaining paths of the same length anywhere in the network. In dynamic networks, we model twin nodes over a time-bounded interval, denoted $(\Delta,d)$-twins, as follows. A periodic undirected time-varying graph $\mathcal G=(G_t)_{t\in\mathbb N}$ of period $p$ is an infinite sequence of static graphs where $G_t=G_{t+p}$ for every $t\in\mathbb N$. For two integers $\Delta$ and $d$, two distinct nodes $u$ and $v$ in $\mathcal G$ are $(\Delta,d)$-twins if, starting at some instant, the outside neighbourhoods of $u$ and $v$ have non-empty intersection and differ by at most $d$ elements for $\Delta$ consecutive instants. In particular, when $d=0$, $u$ and $v$ can act during the $\Delta$ instants as substitutes for each other in order to maintain journeys of the same length in the time-varying graph $\mathcal G$. We propose a distributed deterministic algorithm enabling each node to enumerate its $(\Delta,d)$-twins in $2p$ rounds, using messages of size $O(\delta_\mathcal G\log n)$, where $n$ is the total number of nodes and $\delta_\mathcal G$ is the maximum degree of the graphs $G_t$. Moreover, using randomized techniques borrowed from distributed hash function sampling, we reduce the message size down to $O(\log n)$ w.h.p.
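The definition above can be checked directly on a small example. The sketch below is a centralized brute-force test following our reading of the definition (outside neighbourhood of $u$ with respect to $v$ taken as $N(u)\setminus\{v\}$), not the distributed algorithm; the toy graphs are made up.

```python
# Brute-force (Delta, d)-twin check on a periodic time-varying graph given as
# one period of adjacency sets. Centralized sketch, not the distributed
# algorithm from the paper.

def outside_nbhd(adj, u, v):
    # Neighbourhood of u excluding v.
    return adj[u] - {v}

def is_twins(period, u, v, Delta, d):
    """period: list of dicts node -> set(neighbours), one per instant."""
    p = len(period)
    # By periodicity it suffices to try every starting instant in one period.
    for start in range(p):
        ok = True
        for k in range(Delta):
            adj = period[(start + k) % p]
            Nu, Nv = outside_nbhd(adj, u, v), outside_nbhd(adj, v, u)
            # Require non-empty intersection and symmetric difference <= d.
            if not (Nu & Nv) or len(Nu ^ Nv) > d:
                ok = False
                break
        if ok:
            return True
    return False

# Toy example with period 2: nodes 1 and 2 share neighbour 0 at every instant,
# and their outside neighbourhoods differ by at most one element.
G0 = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}}
G1 = {0: {1, 2}, 1: {0}, 2: {0}, 3: set()}
period = [G0, G1]
print(is_twins(period, 1, 2, Delta=2, d=1))  # True
print(is_twins(period, 1, 2, Delta=2, d=0))  # False: G0 differs by node 3
```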
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
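The conditional utilization rate described above can be written down directly from held-out accuracies. The accuracy numbers below are hypothetical placeholders, not results from the experiments; they only illustrate how an imbalance between modalities would show up in the measure.

```python
# Conditional utilization rate u(m1 | m2): the accuracy gained by giving the
# model modality m1 in addition to m2. Accuracies are hypothetical.

acc = {
    frozenset({"audio"}): 0.62,
    frozenset({"video"}): 0.78,
    frozenset({"audio", "video"}): 0.80,
}

def utilization(acc, m, other):
    # Gain in accuracy from adding modality m on top of modality other.
    return acc[frozenset({m, other})] - acc[frozenset({other})]

u_audio = utilization(acc, "audio", "video")  # small: video already dominates
u_video = utilization(acc, "video", "audio")  # large: model leans on video
print(f"u(audio|video)={u_audio:.2f}, u(video|audio)={u_video:.2f}")
```

In this toy setting the rates are heavily imbalanced (0.02 vs 0.18), the signature of greedy reliance on a single modality that the balancing algorithm targets.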
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.