We prove that the number of edges of a multigraph $G$ with $n$ vertices is at most $O(n^2\log n)$, provided that any two edges cross at most once, parallel edges are noncrossing, and the lens enclosed by every pair of parallel edges in $G$ contains at least one vertex. As a consequence, we prove the following extension of the Crossing Lemma of Ajtai, Chv\'atal, Newborn, Szemer\'edi and Leighton: if $G$ has $e \geq 4n$ edges, then in any drawing of $G$ with the above property, the number of crossings is $\Omega\left(\frac{e^3}{n^2\log(e/n)}\right)$. This answers a question of Kaufmann et al. and is tight up to the logarithmic factor.
We present a loosely coupled, non-iterative time-splitting scheme based on Robin-Robin coupling conditions. We develop a novel unified analysis of this scheme applied to both a parabolic/parabolic coupled system and a parabolic/hyperbolic coupled system. We show for both systems that the scheme is stable and that the error converges as $\mathcal{O}\big(\Delta t \sqrt{T +\log{\frac{1}{\Delta t}}}\big)$, where $\Delta t$ is the time step.
How and where proteins interface with one another can ultimately impact the proteins' functions along with a range of other biological processes. As such, precise computational methods for protein interface prediction (PIP) are highly sought after, as they could yield significant advances in drug discovery and design as well as protein function analysis. However, the traditional benchmark dataset for this task, Docking Benchmark 5 (DB5), contains only a modest 230 complexes for training, validating, and testing different machine learning algorithms. In this work, we expand on a dataset recently introduced for this task, the Database of Interacting Protein Structures (DIPS), to present DIPS-Plus, an enhanced, feature-rich dataset of 42,112 complexes for geometric deep learning of protein interfaces. The previous version of DIPS contains only the Cartesian coordinates and types of the atoms comprising a given protein complex, whereas DIPS-Plus now includes a plethora of new residue-level features including protrusion indices, half-sphere amino acid compositions, and new profile hidden Markov model (HMM)-based sequence features for each amino acid, giving researchers a large, well-curated feature bank for training protein interface prediction methods. We demonstrate through rigorous benchmarks that training an existing state-of-the-art (SOTA) model for PIP on DIPS-Plus yields SOTA results, surpassing the performance of all other models trained on residue-level and atom-level encodings of protein complexes to date.
We show that it is provable in PA that there is an arithmetically definable sequence $\{\phi_{n}:n \in \omega\}$ of $\Pi^{0}_{2}$-sentences such that (i) PRA+$\{\phi_{n}:n \in \omega\}$ is $\Pi^{0}_{2}$-sound and $\Pi^{0}_{1}$-complete, (ii) the length of $\phi_{n}$ is bounded above by a polynomial function of $n$ with positive leading coefficient, and (iii) PRA+$\phi_{n+1}$ always proves the 1-consistency of PRA+$\phi_{n}$. The growth in logical strength is in some sense "as fast as possible": the total general recursive functions whose totality is asserted by the true $\Pi^{0}_{2}$-sentences in the sequence are cofinal, in terms of growth rate, in the set of all total general recursive functions. We then develop an argument which makes use of a sequence of sentences constructed by an application of the diagonal lemma; these are generalisations, in a broad sense, of Hugh Woodin's "Tower of Hanoi" construction as outlined in his essay "Tower of Hanoi" in Chapter 18 of the anthology "Truth in Mathematics". The argument establishes that it is provable in PA that $P \neq NP$. We indicate how to pull the argument all the way down into EFA.
The Infection Fatality Rate (IFR) of COVID-19 is difficult to estimate because the number of infections is unknown and there is a lag between each infection and the potentially subsequent death. We introduce a new approach for estimating the IFR by first estimating the entire sequence of daily infections. Unlike prior approaches, we incorporate existing data on the number of daily COVID-19 tests into our estimation; knowing the test rates helps us estimate the ratio between the number of cases and the number of infections. Also unlike prior approaches, rather than determining a constant lag from studying a group of patients, we treat the lag as a random variable whose parameters we determine empirically by fitting our infection sequence to the sequence of deaths. Our approach allows us to narrow our estimation to smaller time intervals in order to observe how the IFR changes over time. We analyze a 250-day period starting on March 1, 2020. We estimate that the IFR in the U.S. decreases from a high of $0.68\%$ down to $0.24\%$ over the course of this time period. We also provide IFR and lag estimates for Italy, Denmark, and the Netherlands, all of which also exhibit decreasing IFRs, but to different degrees.
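To sketch the kind of lag model this involves (our notation and a simplifying assumption, not necessarily the authors' exact specification), suppose the lag follows a parametric distribution with discretized density $p_\theta$; the expected number of deaths on day $t$ is then a convolution of earlier infections with that density,
\[
\mathbb{E}[D_t] \;=\; \mathrm{IFR} \cdot \sum_{\ell \geq 0} I_{t-\ell}\, p_\theta(\ell),
\]
where $I_s$ denotes the estimated number of new infections on day $s$. Fitting the parameters $\theta$ (and the IFR) so that the predicted deaths match the observed death sequence is what allows the lag to be treated as a random variable rather than a constant.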
A connected partition is a partition of the vertices of a graph into sets that induce connected subgraphs. Such partitions naturally occur in many application areas, such as road networks and image processing. We consider Balanced Connected Partitions (BCP), where the two classical objectives are to maximize the weight of the smallest component or to minimize the weight of the largest component. We study BCP on $c$-claw-free graphs, the class of graphs that do not have $K_{1,c}$ as an induced subgraph, and present efficient $(c-1)$-approximation algorithms for both objectives. In particular, since line graphs are claw-free (i.e., $K_{1,3}$-free), this also implies a 2-approximation for the edge-partition version of BCP in general graphs. In the 1970s, Gy\H{o}ri and Lov\'{a}sz showed that if $G$ is $k$-connected and $w_1,\dots,w_k$ are natural numbers summing to the number of vertices, then there exists a connected $k$-partition with part sizes $w_1,\dots,w_k$. However, to this day no polynomial-time algorithm to compute such partitions is known for $k>4$. Towards finding such a partition $T_1,\dots, T_k$, we show how to efficiently compute connected partitions that at least approximately meet the target weights, subject to the mild assumption that each $w_i$ is greater than the weight of the heaviest vertex. In particular, we give a 3-approximation for both the lower- and the upper-bounded version, i.e., we guarantee that each $T_i$ has weight at least $\frac{w_i}{3}$ or that each $T_i$ has weight at most $3w_i$, respectively. Also, we present a both-side bounded version that produces a connected partition where each $T_i$ has weight at least $\frac{w_i}{3}$ and at most $\max\{r,3\}\, w_i$, where $r \geq 1$ is the ratio between the largest and smallest value in $w_1, \dots, w_k$. In particular, for the balanced version, i.e., $w_1=w_2=\dots=w_k$, this gives a partition with $\frac{1}{3}w_i \leq w(T_i) \leq 3w_i$.
Dual-energy X-ray tomography is considered in a context where the target under imaging consists of two distinct materials. The materials are assumed to be possibly intertwined in space, but at any given location there is only one material present. Further, two X-ray energies are chosen so that there is a clear difference in the spectral dependence of the attenuation coefficients of the two materials. A novel regularizer is presented for the inverse problem of reconstructing separate tomographic images for the two materials. A combination of two ingredients, (a) a non-negativity constraint and (b) a penalty term containing the inner product between the two material images, promotes the presence of at most one material in a given pixel. A preconditioned interior point method is derived for the minimization of the regularization functional. Numerical tests with digital phantoms suggest that the new algorithm outperforms the baseline method, Joint Total Variation regularization, in terms of the number of correctly material-characterized pixels. While the method is tested only in a two-dimensional setting with two materials and two energies, the approach readily generalizes to three dimensions and to more materials; the number of materials just needs to match the number of energies used in imaging.
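A schematic form of such a regularized problem (our notation; the paper's exact functional may differ) with material images $u_1, u_2$, a dual-energy forward operator $\mathcal{A}$, measured data $m$, and a coupling weight $\alpha > 0$ is
\[
\min_{u_1 \geq 0,\; u_2 \geq 0}\ \tfrac{1}{2}\big\| \mathcal{A}(u_1,u_2) - m \big\|_2^2 \;+\; \alpha \, \langle u_1, u_2 \rangle .
\]
Because both images are constrained to be non-negative, the inner-product term $\langle u_1, u_2\rangle = \sum_j (u_1)_j (u_2)_j$ is small only when, in each pixel $j$, at most one of the two images is significantly nonzero, which is exactly the one-material-per-location prior described above.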
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
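To make the question concrete, the scheme most commonly discussed in this setting is uniform affine quantization to $b$ bits (a standard textbook formulation rather than any one paper's method): a real value $x$ is mapped to an integer $q$ and back via
\[
q = \operatorname{clamp}\!\Big( \Big\lfloor \tfrac{x}{s} \Big\rceil + z,\ 0,\ 2^{b}-1 \Big), \qquad \hat{x} = s\,(q - z),
\]
where the scale $s$ and zero-point $z$ are chosen from the range of the values being quantized, and the quantization error $|x - \hat{x}|$ is what the surveyed methods try to keep from degrading model accuracy.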
Graph Neural Networks (GNN) come in many flavors, but should always be either invariant (permutation of the nodes of the input graph does not affect the output) or equivariant (permutation of the input permutes the output). In this paper, we consider a specific class of invariant and equivariant networks, for which we prove new universality theorems. More precisely, we consider networks with a single hidden layer, obtained by summing channels formed by applying an equivariant linear operator, a pointwise non-linearity and either an invariant or equivariant linear operator. Recently, Maron et al. (2019) showed that by allowing higher-order tensorization inside the network, universal invariant GNNs can be obtained. As a first contribution, we propose an alternative proof of this result, which relies on the Stone-Weierstrass theorem for algebras of real-valued functions. Our main contribution is then an extension of this result to the equivariant case, which appears in many practical applications but has been less studied from a theoretical point of view. The proof relies on a new generalized Stone-Weierstrass theorem for algebras of equivariant functions, which is of independent interest. Finally, unlike many previous settings that consider a fixed number of nodes, our results show that a GNN defined by a single set of parameters can approximate uniformly well a function defined on graphs of varying size.
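Schematically (in our notation), a one-hidden-layer network of the kind considered here can be written as
\[
f(G) \;=\; \sum_{c=1}^{C} H_c\, \rho\big( L_c\, G \big),
\]
where $G$ is the tensor representing the input graph, each $L_c$ is an equivariant linear operator, $\rho$ is a pointwise non-linearity, and each $H_c$ is an invariant (respectively equivariant) linear operator, so that the sum over the channels $c$ is invariant (respectively equivariant).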
Generative adversarial nets (GANs) have generated a lot of excitement. Despite their popularity, they exhibit a number of well-documented issues in practice, which apparently contradict theoretical guarantees. A number of enlightening papers have pointed out that these issues arise from unjustified assumptions that are commonly made, but the message seems to have been lost amid the optimism of recent years. We believe the identified problems deserve more attention, and highlight their implications for both the properties of GANs and the trajectory of research on probabilistic models. We recently proposed an alternative method that sidesteps these problems.
Recent years have witnessed the enormous success of low-dimensional vector space representations of knowledge graphs to predict missing facts or find erroneous ones. Currently, however, it is not yet well-understood how ontological knowledge, e.g. given as a set of (existential) rules, can be embedded in a principled way. To address this shortcoming, in this paper we introduce a framework based on convex regions, which can faithfully incorporate ontological knowledge into the vector space embedding. Our technical contribution is two-fold. First, we show that some of the most popular existing embedding approaches are not capable of modelling even very simple types of rules. Second, we show that our framework can represent ontologies that are expressed using so-called quasi-chained existential rules in an exact way, such that any set of facts which is induced using that vector space embedding is logically consistent and deductively closed with respect to the input ontology.
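As a simple illustration of the general idea (our example, not taken from the paper), a rule such as $\forall x\,(\mathrm{Cat}(x) \rightarrow \mathrm{Mammal}(x))$ can be captured in a region-based embedding by requiring the convex region assigned to $\mathrm{Cat}$ to be contained in the one assigned to $\mathrm{Mammal}$,
\[
R_{\mathrm{Cat}} \subseteq R_{\mathrm{Mammal}} \subseteq \mathbb{R}^d ,
\]
so that every entity whose embedding lies in $R_{\mathrm{Cat}}$ is automatically counted as a $\mathrm{Mammal}$; the induced set of facts is then deductively closed with respect to this rule by construction.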