Threshold tolerance graphs and their complement graphs (known as co-TT graphs) were introduced by Monma, Reed and Trotter [24]. Introducing the concept of negative intervals, Hell et al. [19] defined signed-interval bigraphs/digraphs and showed that they are equivalent to several seemingly different classes of bigraphs/digraphs. They also showed that co-TT graphs are equivalent to symmetric signed-interval digraphs. In this paper we characterize signed-interval bigraphs and signed-interval graphs in terms of their biadjacency matrices and adjacency matrices, respectively. Finally, based on the geometric representation of signed-interval graphs, we settle the open problem of the forbidden induced subgraph characterization of co-TT graphs posed by Monma, Reed and Trotter in the same paper.
Data on neighbourhood characteristics are not typically collected in epidemiological studies, yet they are useful in the study of small-area health inequalities. Neighbourhood characteristics are collected in some surveys and could be linked to the data of other studies. We propose to use kriging based on semi-variogram models to predict values at non-observed locations, with the aim of constructing bespoke indices of neighbourhood characteristics to be linked to data from epidemiological studies. We perform a simulation study to assess the feasibility of the method, as well as a case study using data from the RECORD study. Apart from having enough observed data at small distances from the non-observed locations, a well-fitting semi-variogram, a larger range and the absence of a nugget effect in the semi-variogram model are factors leading to higher reliability.
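To make the prediction step concrete, the following is a minimal ordinary-kriging sketch in Python/NumPy, assuming an exponential semi-variogram model; the semi-variogram parameters, coordinates and neighbourhood scores are purely illustrative and are not taken from the RECORD study.
\begin{verbatim}
import numpy as np

def exp_semivariogram(h, nugget, sill, corr_range):
    # Exponential model: gamma(h) = nugget + (sill - nugget) * (1 - exp(-h / corr_range))
    return nugget + (sill - nugget) * (1.0 - np.exp(-h / corr_range))

def ordinary_kriging(coords, values, target, nugget=0.0, sill=1.0, corr_range=500.0):
    # Predict the value at `target` from observations at `coords` (ordinary kriging).
    n = len(values)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gamma = exp_semivariogram(dists, nugget, sill, corr_range)
    # Kriging system with a Lagrange multiplier enforcing that the weights sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_semivariogram(np.linalg.norm(coords - target, axis=1),
                              nugget, sill, corr_range)
    w = np.linalg.solve(A, b)
    prediction = w[:n] @ values
    kriging_variance = w @ b  # includes the Lagrange-multiplier term
    return prediction, kriging_variance

# Illustrative data: five observed neighbourhood scores, one non-observed location.
coords = np.array([[0., 0.], [100., 0.], [0., 100.], [100., 100.], [50., 80.]])
values = np.array([1.2, 0.9, 1.5, 1.1, 1.4])
print(ordinary_kriging(coords, values, target=np.array([60., 40.])))
\end{verbatim}
The kriging variance returned alongside the prediction is what makes the reliability assessment described above possible: it grows when the non-observed location is far from the observed data relative to the range.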
This paper presents a new approach for training two-stage object detection ensemble models, more specifically Faster R-CNN models, to estimate uncertainty. We propose that training one Region Proposal Network (RPN)~\cite{//doi.org/10.48550/arxiv.1506.01497} and multiple Fast R-CNN prediction heads is all you need to build a robust deep ensemble network for estimating uncertainty in object detection. We present this approach and provide experiments showing that it is much faster than the naive method of fully training all $n$ models in an ensemble. We also estimate the uncertainty by measuring the ensemble model's Expected Calibration Error (ECE). We then compare the performance of this model with that of Gaussian YOLOv3, a variant of YOLOv3 that models uncertainty using predicted bounding box coordinates. The source code is released at \url{//github.com/Akola-Mbey-Denis/EfficientEnsemble}
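For readers unfamiliar with ECE, the following is a minimal NumPy sketch of the standard binned Expected Calibration Error; the bin count, detection scores and match labels are illustrative and this is not the released EfficientEnsemble code.
\begin{verbatim}
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # ECE: weighted average gap between mean confidence and accuracy per bin.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)  # 1.0 if the detection matched ground truth, else 0.0
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        avg_conf = confidences[in_bin].mean()
        accuracy = correct[in_bin].mean()
        ece += in_bin.mean() * abs(avg_conf - accuracy)
    return ece

# Illustrative usage with ensemble-averaged detection scores.
scores = [0.92, 0.81, 0.40, 0.65, 0.97, 0.30]
hits   = [1,    1,    0,    1,    1,    0]
print(expected_calibration_error(scores, hits))
\end{verbatim}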
This paper introduces and studies the sequential composition and decomposition of propositional logic programs. We show that acyclic programs can be decomposed into single-rule programs and provide a general decomposition result for arbitrary programs. We show that the immediate consequence operator of a program can be represented via composition, which allows us to compute its least model without any explicit reference to operators. This bridges the conceptual gap between the syntax and semantics of a propositional logic program in a mathematically satisfactory way.
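As background, here is a minimal Python sketch of the immediate consequence operator $T_P$ and its least fixed point for a definite propositional program; the example program is hypothetical and the paper's composition-based representation is not reproduced here.
\begin{verbatim}
def t_p(program, interpretation):
    # T_P(I): heads of all rules whose bodies are satisfied by I.
    return {head for head, body in program if body <= interpretation}

def least_model(program):
    # Iterate T_P from the empty interpretation until a fixed point is reached.
    interpretation = set()
    while True:
        nxt = t_p(program, interpretation)
        if nxt == interpretation:
            return interpretation
        interpretation = nxt

# Program: a.   b :- a.   c :- a, b.
program = [("a", frozenset()),
           ("b", frozenset({"a"})),
           ("c", frozenset({"a", "b"}))]
print(least_model(program))  # {'a', 'b', 'c'}
\end{verbatim}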
The topological entropy of a topological dynamical system, introduced in a foundational paper by Adler, Konheim and McAndrew [Trans. Am. Math. Soc., 1965], is a nonnegative number that measures the uncertainty or disorder of the system. Compared with positive entropy systems, zero entropy systems are much less understood. In order to distinguish between zero entropy systems, Huang and Ye [Adv. Math., 2009] introduced the concept of maximal pattern entropy of a topological dynamical system. At the heart of their analysis is a Sauer-Shelah type lemma. In the present paper, we provide a shorter and more conceptual proof of a strengthening of this lemma, and discuss its surprising connections to dynamical systems, combinatorics and a recent breakthrough in communication complexity. We also improve one of the main results of Huang and Ye on the maximal pattern entropy of zero-dimensional systems, by proving a new Sauer-Shelah type lemma, which unifies and enhances various extremal results on VC-dimension, Natarajan dimension and Steele dimension.
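For context, the classical Sauer-Shelah lemma, of which the "Sauer-Shelah type" lemmas discussed here are variants (the strengthened versions themselves are not reproduced), can be stated as
\[
|\mathcal{F}| > \sum_{i=0}^{k-1} \binom{n}{i} \;\Longrightarrow\; \mathcal{F} \subseteq 2^{[n]} \text{ shatters some set of size } k,
\]
or, equivalently, a set system on $[n]$ of VC-dimension $d$ contains at most $\sum_{i=0}^{d} \binom{n}{i}$ sets.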
We propose a Monte Carlo method to efficiently find, count, and sample abstract triangulations of a given manifold M. The method is based on a biased random walk through all possible triangulations of M (in the Pachner graph), constructed by combining (bi-stellar) moves with suitably chosen accept/reject probabilities (Metropolis-Hastings). Asymptotically, the method guarantees that triangulations are sampled from a chosen probability distribution. This enables us not only to sample (rare) triangulations of particular interest but also to estimate the (extremely small) probability of obtaining them when isomorphism types of triangulations are sampled uniformly at random. We implement our general method for surface triangulations and 1-vertex triangulations of 3-manifolds. To showcase its usefulness, we present a number of experiments: (a) we recover asymptotic growth rates for the number of isomorphism types of simplicial triangulations of the 2-dimensional sphere; (b) we experimentally observe that the growth rate for the number of isomorphism types of 1-vertex triangulations of the 3-dimensional sphere appears to be singly exponential in the number of their tetrahedra; and (c) we present experimental evidence that a randomly chosen isomorphism type of 1-vertex n-tetrahedra 3-sphere triangulation, for n tending to infinity, almost surely shows a fixed edge-degree distribution which decays exponentially for large degrees, but shows non-monotonic behaviour for small degrees.
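The accept/reject mechanism follows the generic Metropolis-Hastings pattern; below is a minimal Python skeleton of that pattern under the assumption that a `propose` function supplies a random move together with its forward/backward proposal log-probabilities. The toy integer walk at the end is purely illustrative; it is not the paper's Pachner-move implementation.
\begin{verbatim}
import math
import random

def metropolis_hastings(initial_state, propose, log_weight, n_steps, rng=random):
    # Generic Metropolis-Hastings walk.
    # propose(state) -> (candidate, log_q_forward, log_q_backward)
    # log_weight(state) -> log of the (unnormalised) target probability.
    # For triangulations, propose would apply a random bi-stellar (Pachner) move.
    state = initial_state
    samples = []
    for _ in range(n_steps):
        candidate, log_q_fwd, log_q_bwd = propose(state)
        log_alpha = (log_weight(candidate) - log_weight(state)) + (log_q_bwd - log_q_fwd)
        if math.log(rng.random()) < min(0.0, log_alpha):
            state = candidate  # accept the move
        samples.append(state)
    return samples

# Toy usage: walk on the integers with target weight proportional to 2^(-|n|).
def propose(n):
    return n + random.choice([-1, 1]), 0.0, 0.0  # symmetric proposal

chain = metropolis_hastings(0, propose, lambda n: -abs(n) * math.log(2.0), 10_000)
\end{verbatim}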
Structural convergence is a framework for convergence of graphs by Ne\v{s}et\v{r}il and Ossona de Mendez that unifies the dense (left) graph convergence and Benjamini-Schramm convergence. They posed a problem asking whether, for a given sequence of graphs $(G_n)$ converging to a limit $L$ and a vertex $r$ of $L$, it is possible to find a sequence of vertices $(r_n)$ such that $L$ rooted at $r$ is the limit of the graphs $G_n$ rooted at $r_n$. A counterexample was found by Christofides and Kr\'{a}l', but they showed that the statement holds for almost all vertices $r$ of $L$. We offer another perspective on the original problem by considering the size of definable sets to which the root $r$ belongs. We prove that if $r$ is an algebraic vertex (i.e., a vertex belonging to a finite definable set), the sequence of roots $(r_n)$ always exists.
This paper presents a new weak Galerkin (WG) method for elliptic interface problems on general curved polygonal partitions. The method's key innovation lies in its ability to transform the complex interface jump condition into a more manageable Dirichlet boundary condition, which simplifies the theoretical analysis significantly. The numerical scheme is designed using locally constructed weak gradients on the curved polygonal partitions. We establish error estimates of optimal order for the numerical approximation in both the discrete $H^1$ and $L^2$ norms. Additionally, we present various numerical results that illustrate the robust numerical performance of the proposed WG interface method.
The generalized Golub-Kahan bidiagonalization has been used to solve saddle-point systems where the leading block is symmetric and positive definite. We extend this iterative method to the case where the symmetry condition no longer holds. We do so by relying on the known connection the algorithm has with the Conjugate Gradient method and following the line of reasoning that adapts the latter into the Full Orthogonalization Method. We propose appropriate stopping criteria based on the residual and an estimate of the energy norm of the error associated with the primal variable. Numerical comparison with GMRES highlights the advantages of our proposed strategy, in particular its low memory requirements and their practical implications.
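To illustrate the CG-to-FOM analogy invoked above, here is a textbook Full Orthogonalization Method in Python/NumPy (Arnoldi process plus the Galerkin condition on the small Hessenberg system). It is only a sketch of the classical FOM, not the saddle-point solver proposed in the paper, and the test matrix is illustrative.
\begin{verbatim}
import numpy as np

def fom(A, b, x0=None, max_iter=50, tol=1e-10):
    # Full Orthogonalization Method: Arnoldi basis + Galerkin condition.
    n = b.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    if beta < tol:
        return x0
    V = np.zeros((n, max_iter + 1))
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = r0 / beta
    for j in range(max_iter):
        w = A @ V[:, j]
        for i in range(j + 1):              # modified Gram-Schmidt orthogonalisation
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        # Galerkin condition: solve the small Hessenberg system H_j y = beta * e_1.
        e1 = np.zeros(j + 1)
        e1[0] = beta
        y = np.linalg.solve(H[:j + 1, :j + 1], e1)
        if H[j + 1, j] * abs(y[-1]) < tol:  # cheap FOM residual estimate
            return x0 + V[:, :j + 1] @ y
        V[:, j + 1] = w / H[j + 1, j]
    return x0 + V[:, :j + 1] @ y

# Illustrative usage on a random, well-conditioned nonsymmetric matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 50.0 * np.eye(50)
b = rng.standard_normal(50)
x = fom(A, b)
print(np.linalg.norm(b - A @ x))
\end{verbatim}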
With the emergence of Transformer architectures and their powerful understanding of textual data, a new horizon has opened up for predicting molecular properties from text descriptions. While SMILES strings are the most common form of representation, they lack robustness, rich information and canonicity, which limits their effectiveness as generalizable representations. Here, we present GPT-MolBERTa, a self-supervised large language model (LLM) which uses detailed textual descriptions of molecules to predict their properties. Text-based descriptions of 326,000 molecules were collected using ChatGPT and used to train the LLM to learn molecular representations. To predict the properties for the downstream tasks, both BERT and RoBERTa models were used in the finetuning stage. Experiments show that GPT-MolBERTa performs well on various molecular property benchmarks, approaching state-of-the-art performance on regression tasks. Additionally, further analysis of the attention mechanisms shows that GPT-MolBERTa is able to pick up important information from the input textual data, demonstrating the interpretability of the model.
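A minimal sketch of the finetuning stage for a regression property is given below, assuming the Hugging Face transformers and PyTorch APIs; the checkpoint name, description text and target value are illustrative, and this is not the released GPT-MolBERTa code.
\begin{verbatim}
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=1, problem_type="regression")

texts = ["Ethanol is a small polar molecule containing a hydroxyl group ..."]
targets = torch.tensor([[-0.77]])            # hypothetical regression label

batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=targets)     # MSE loss when problem_type="regression"
outputs.loss.backward()
optimizer.step()
print(float(outputs.loss))
\end{verbatim}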
In situations where both extreme and non-extreme data are of interest, modelling the whole data set accurately is important. In a univariate framework, modelling the bulk and the tail of a distribution has been studied extensively. However, when more than one variable is of concern, models that aim specifically at capturing both regions correctly are scarce in the literature. A dependence model that blends two copulas with different characteristics over the whole range of the data support is proposed. One copula is tailored to the bulk and the other to the tail, with a dynamic weighting function employed to transition smoothly between them. Tail dependence properties are investigated numerically, and simulation is used to confirm that the blended model is sufficiently flexible to capture a wide variety of structures. The model is applied to study the dependence between temperature and ozone concentration at two sites in the UK and compared with a single copula fit. The proposed model provides a better, more flexible fit to the data and is also capable of capturing complex dependence structures.
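To illustrate the blending idea, the following Python sketch combines a Clayton copula (bulk) and a Gumbel copula (upper tail) through a smooth logistic weight that increases towards the joint upper tail. The copula families, parameter values and weight form are placeholders chosen for illustration, not the paper's specification, and an arbitrary blend of this form is not automatically a valid copula without the conditions imposed in the paper.
\begin{verbatim}
import numpy as np

def clayton_cdf(u, v, theta=1.5):
    # Clayton copula CDF (playing the role of the 'bulk' copula here).
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def gumbel_cdf(u, v, theta=2.0):
    # Gumbel copula CDF (upper tail dependent; the 'tail' copula here).
    return np.exp(-(((-np.log(u)) ** theta + (-np.log(v)) ** theta) ** (1.0 / theta)))

def weight(u, v, midpoint=0.8, slope=20.0):
    # Smooth weight rising from ~0 in the bulk to ~1 near the joint upper tail.
    return 1.0 / (1.0 + np.exp(-slope * (np.maximum(u, v) - midpoint)))

def blended_cdf(u, v):
    # Dynamic blend: bulk copula away from the tail, tail copula near (1, 1).
    w = weight(u, v)
    return (1.0 - w) * clayton_cdf(u, v) + w * gumbel_cdf(u, v)

print(blended_cdf(0.5, 0.5), blended_cdf(0.95, 0.97))
\end{verbatim}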