In Linear Logic ($\mathsf{LL}$), the exponential modality $!$ brings forth a distinction between non-linear proofs and linear proofs, where linear means using an argument exactly once. Differential Linear Logic ($\mathsf{DiLL}$) is an extension of Linear Logic with additional rules for $!$ that encode differentiation and the ability to linearize proofs. Graded Linear Logic ($\mathsf{GLL}$), on the other hand, is a variation of Linear Logic in which $!$ is indexed over a semiring $R$. This $R$-grading allows for non-linear proofs of degree $r \in R$, with the linear proofs being those of degree $1 \in R$. There has been recent interest in combining these two variations of $\mathsf{LL}$ and developing Graded Differential Linear Logic ($\mathsf{GDiLL}$). In this paper we present a sequent calculus for $\mathsf{GDiLL}$ and introduce its categorical semantics, which we call graded differential categories, using both coderelictions and deriving transformations. We prove that symmetric powers always give graded differential categories, and we provide other examples of graded differential categories. We also discuss graded versions of (monoidal) coalgebra modalities, additive bialgebra modalities, and the Seely isomorphisms, as well as their implementations in the sequent calculus of $\mathsf{GDiLL}$.
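To fix notation, the following LaTeX sketch records the standard $\mathsf{DiLL}$ codereliction and one plausible graded rendering; the symbol $!_r$ and the degree assigned to codereliction are illustrative assumptions, not necessarily the paper's conventions.

```latex
% DiLL codereliction: inject a linear argument into a non-linear position.
\[ \bar{\mathsf{d}} \colon A \multimap \mathop{!} A \]
% Graded analogue (assumed notation): with !_r recording usage of degree
% r \in R, and linear proofs sitting at degree 1 \in R, codereliction
% should land in the degree-1 modality:
\[ \bar{\mathsf{d}} \colon A \multimap \mathop{!_{1}} A \]
```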
Explicit Runge--Kutta (\rk{}) methods are susceptible to a reduction in the observed order of convergence when applied to initial-boundary value problems with time-dependent boundary conditions. We study conditions on \erk{} methods that guarantee high-order convergence for linear problems; we refer to these conditions as weak stage order conditions. We prove a general relationship between a method's order, weak stage order, and number of stages. We derive \erk{} methods with high weak stage order and demonstrate, through numerical tests, that they avoid the order reduction phenomenon up to any order for linear problems and up to order three for nonlinear problems.
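As a concrete illustration (not taken from the paper), the sketch below computes the stage-order residuals $\tau^{(k)} = c^k/k - A c^{k-1}$ for the classical RK4 tableau and checks an orthogonality criterion of the form $b^\top A^j \tau^{(k)} = 0$, which is one common algebraic reading of weak stage order; the paper's precise definition may differ.

```python
import numpy as np

# Classical RK4 Butcher tableau (A, b, c).
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1/6, 1/3, 1/3, 1/6])
c = A.sum(axis=1)

def stage_order_residual(A, c, k):
    # tau^(k) = c^k/k - A c^(k-1); stage order q means tau^(k) = 0 for k <= q.
    return c**k / k - A @ c**(k - 1)

for k in range(1, 4):
    tau = stage_order_residual(A, c, k)
    # One algebraic reading of weak stage order: the residual tau^(k) is
    # invisible to b along powers of A, i.e. b^T A^j tau^(k) = 0 for all j
    # (checked here up to j = 3).
    moment = max(abs(b @ np.linalg.matrix_power(A, j) @ tau) for j in range(4))
    print(f"k={k}: ||tau^(k)|| = {np.linalg.norm(tau):.2e}, "
          f"max_j |b^T A^j tau^(k)| = {moment:.2e}")
```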
Let $P$ be a set of at most $n$ points and let $R$ be a set of at most $n$ geometric ranges, such as disks or rectangles, where each $p \in P$ has an associated supply $s_{p} > 0$, and each $r \in R$ has an associated demand $d_{r} > 0$. An assignment is a set $\mathcal{A}$ of ordered triples $(p,r,a_{pr}) \in P \times R \times \mathbb{R}_{>0}$ such that $p \in r$. We show how to compute a maximum assignment that satisfies the constraints given by the supplies and demands. Using our techniques, we can also solve minimum bottleneck problems, such as computing a perfect matching between a set of $n$ red points~$P$ and a set of $n$ blue points $Q$ that minimizes the length of the longest edge. For the $L_\infty$-metric, we can do this in time $O(n^{1+\varepsilon})$ in any fixed dimension; for the $L_2$-metric in the plane, in time $O(n^{4/3 + \varepsilon})$, for any $\varepsilon > 0$.
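For intuition, here is a minimal brute-force sketch of the bottleneck matching objective in the $L_\infty$-metric: binary search over candidate distances combined with augmenting-path matching. It runs in polynomial but far slower time than the $O(n^{1+\varepsilon})$ bound above and is not the paper's algorithm.

```python
import numpy as np

def bottleneck_matching_cost(P, Q):
    """Smallest t such that a perfect matching between P and Q exists
    using only pairs at L_inf distance <= t (brute-force illustration)."""
    n = len(P)
    D = np.max(np.abs(P[:, None, :] - Q[None, :, :]), axis=2)  # L_inf distances
    cands = np.unique(D)  # the optimum is always one of the pairwise distances

    def has_perfect_matching(t):
        match = [-1] * n  # match[j] = index of the P-point matched to Q[j]
        def augment(i, seen):  # Kuhn's augmenting-path search from P[i]
            for j in range(n):
                if D[i, j] <= t and j not in seen:
                    seen.add(j)
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False
        return all(augment(i, set()) for i in range(n))

    lo, hi = 0, len(cands) - 1
    while lo < hi:  # binary search for the smallest feasible threshold
        mid = (lo + hi) // 2
        if has_perfect_matching(cands[mid]):
            hi = mid
        else:
            lo = mid + 1
    return cands[lo]

rng = np.random.default_rng(0)
P, Q = rng.random((8, 2)), rng.random((8, 2))
print(bottleneck_matching_cost(P, Q))
```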
We examine a method for solving an infinite-dimensional tensor eigenvalue problem $H x = \lambda x$, where the infinite-dimensional symmetric matrix $H$ exhibits a translation-invariant structure. We provide a formulation of this type of problem from a numerical linear algebra point of view and describe how a power method applied to $e^{-Ht}$ is used to obtain an approximation to the desired eigenvector. This infinite-dimensional eigenvector is represented compactly by a translation-invariant infinite Tensor Ring (iTR). Low-rank approximation is used to keep the cost of subsequent power iterations bounded while preserving the iTR structure of the approximate eigenvector. We show how the averaged Rayleigh quotient of an iTR eigenvector approximation can be computed efficiently, and we introduce a projected residual to monitor its convergence. In the numerical examples, we illustrate that the norm of this projected iTR residual can also be used to automatically adjust the time step $t$ to ensure accurate and rapid convergence of the power method.
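The following sketch shows the finite-dimensional analogue of the iteration: power iteration on $e^{-Ht}$ with a residual-driven adjustment of $t$. The test matrix, tolerance, and step heuristic are illustrative assumptions; the paper's iTR machinery is what makes this feasible in infinite dimensions.

```python
import numpy as np
from scipy.linalg import expm

# Power iteration on exp(-t H) converges to the eigenvector of a symmetric
# H with smallest eigenvalue; a residual norm monitors convergence and
# adapts t (crude heuristic, assumed for illustration).
rng = np.random.default_rng(1)
n = 50
H = rng.standard_normal((n, n)); H = (H + H.T) / 2   # symmetric test matrix

t = 0.1
x = rng.standard_normal(n); x /= np.linalg.norm(x)
for it in range(500):
    x = expm(-t * H) @ x
    x /= np.linalg.norm(x)
    lam = x @ H @ x                         # Rayleigh quotient
    res = np.linalg.norm(H @ x - lam * x)   # residual, orthogonal to x
    if res < 1e-8:
        break
    t = min(1.0, 0.5 / max(res, 1e-6))      # residual-driven step adjustment

print(it, lam, res, np.linalg.eigvalsh(H)[0])  # lam should match min eigenvalue
```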
The reconfiguration graph $\mathcal{C}_k(G)$ for the $k$-colourings of a graph $G$ has a vertex for each proper $k$-colouring of $G$, and two vertices of $\mathcal{C}_k(G)$ are adjacent precisely when those $k$-colourings differ on a single vertex of $G$. Much work has focused on bounding the maximum value of ${\rm{diam}}~\mathcal{C}_k(G)$ over all $n$-vertex graphs $G$. We consider the analogous problems for list colourings and for correspondence colourings. We conjecture that if $L$ is a list-assignment for a graph $G$ with $|L(v)|\ge d(v)+2$ for all $v\in V(G)$, then ${\rm{diam}}~\mathcal{C}_L(G)\le n(G)+\mu(G)$. We also conjecture that if $(L,H)$ is a correspondence cover for a graph $G$ with $|L(v)|\ge d(v)+2$ for all $v\in V(G)$, then ${\rm{diam}}~\mathcal{C}_{(L,H)}(G)\le n(G)+\tau(G)$. (Here $\mu(G)$ and $\tau(G)$ denote the matching number and vertex cover number of $G$.) For every graph $G$, we give constructions showing that both conjectures are best possible. Our first main result proves the upper bounds (for the list and correspondence versions, respectively) ${\rm{diam}}~\mathcal{C}_L(G)\le n(G)+2\mu(G)$ and ${\rm{diam}}~\mathcal{C}_{(L,H)}(G)\le n(G)+2\tau(G)$. Our second main result proves that both conjectured bounds hold, whenever all $v$ satisfy $|L(v)|\ge 2d(v)+1$. We conclude by proving one or both conjectures for various classes of graphs such as complete bipartite graphs, subcubic graphs, cactuses, and graphs with bounded maximum average degree.
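For small graphs one can build $\mathcal{C}_k(G)$ explicitly and measure its diameter; the sketch below (assuming networkx) does this for $C_4$ with $k=3$. This is purely illustrative, as the construction is exponential in $n(G)$.

```python
import itertools
import networkx as nx

def reconfiguration_graph(G, k):
    """Build C_k(G): one vertex per proper k-colouring of G, with edges
    between colourings that differ on exactly one vertex of G."""
    G = nx.convert_node_labels_to_integers(G)
    edges = list(G.edges)
    colourings = [c for c in itertools.product(range(k), repeat=G.number_of_nodes())
                  if all(c[u] != c[v] for u, v in edges)]
    C = nx.Graph()
    C.add_nodes_from(colourings)
    for a, b in itertools.combinations(colourings, 2):
        if sum(x != y for x, y in zip(a, b)) == 1:  # single-vertex recolouring
            C.add_edge(a, b)
    return C

C = reconfiguration_graph(nx.cycle_graph(4), 3)
print(C.number_of_nodes(),
      nx.diameter(C) if nx.is_connected(C) else "disconnected")
```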
We give an isomorphism test for graphs of Euler genus $g$ running in time $2^{O(g^4 \log g)}n^{O(1)}$. Our algorithm provides the first explicit upper bound on the dependence on $g$ for an fpt isomorphism test parameterized by the Euler genus of the input graphs. The only previous fpt algorithm runs in time $f(g)n$ for some function $f$ (Kawarabayashi 2015). Actually, our algorithm even works when the input graphs only exclude $K_{3,h}$ as a minor. For such graphs, no fpt isomorphism test was known before. The algorithm builds on an elegant combination of simple group-theoretic, combinatorial, and graph-theoretic approaches. In particular, we introduce $(t,k)$-WL-bounded graphs which provide a powerful tool to combine group-theoretic techniques with the standard Weisfeiler-Leman algorithm. This concept may be of independent interest.
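For readers unfamiliar with the combinatorial side, the sketch below implements standard 1-dimensional Weisfeiler-Leman colour refinement, the basic routine that the paper's $(t,k)$-WL-bounded framework builds on; the $(t,k)$ machinery itself is substantially more involved.

```python
import networkx as nx

def wl_refinement(G, rounds=None):
    """Standard 1-dimensional Weisfeiler-Leman colour refinement:
    iteratively replace each vertex colour by (old colour, multiset of
    neighbours' colours) until the partition stabilises."""
    colour = {v: 0 for v in G}
    for _ in range(rounds or G.number_of_nodes()):
        sig = {v: (colour[v], tuple(sorted(colour[u] for u in G[v]))) for v in G}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: palette[sig[v]] for v in G}
        if new == colour:  # partition is stable
            break
        colour = new
    return colour

# On a path, refinement separates endpoints, their neighbours, and the centre.
print(wl_refinement(nx.path_graph(5)))
```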
Rejection sampling is a common tool for low-dimensional problems ($d \leq 2$), often touted as an "easy" way to obtain valid samples from a distribution $f(\cdot)$ of interest. In practice it is non-trivial to apply, often requiring considerable mathematical effort to devise a good proposal distribution $g(\cdot)$ and to select a supremum $C$. More advanced samplers require additional mathematical derivations, restrictions on $f(\cdot)$, or even cross-validation, making them difficult to apply. We devise a new approximate approach to rejection sampling that works with less information and is therefore easier to use: we refine a parameterized proposal distribution with a loss derived from the acceptance threshold, requiring only that a differentiable $f(\cdot)$ be specified. In this manner we obtain acceptance rates comparable to or better than those of current benchmarks, improving on them by up to $7.3\times$, with no extra assumptions or derivations. While approximate, the results are correct with high probability, and in all tests they pass a distributional check. This makes our approach easy to use, easy to reproduce, and effective.
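For contrast with the learned-proposal approach, here is the textbook rejection sampler the abstract refers to, with a hand-derived proposal $g$ and supremum $C$ (target and proposal chosen for illustration); deriving $g$ and $C$ is exactly the effort the proposed method avoids.

```python
import numpy as np

def rejection_sample(f, g_sample, g_pdf, C, n, rng):
    """Textbook rejection sampling: accept x ~ g when u <= f(x) / (C g(x)).
    The proposal g and the supremum C >= sup f/g must be hand-derived."""
    out = []
    while len(out) < n:
        x = g_sample(rng)
        if rng.uniform() <= f(x) / (C * g_pdf(x)):
            out.append(x)
    return np.array(out)

# Target: unnormalised half-Gaussian density; proposal: Exp(1).
f = lambda x: np.exp(-x**2 / 2)
g_pdf = lambda x: np.exp(-x)
g_sample = lambda rng: rng.exponential(1.0)
C = np.exp(0.5)  # sup_x f(x)/g(x) = exp(x - x^2/2), attained at x = 1

samples = rejection_sample(f, g_sample, g_pdf, C, 10_000, np.random.default_rng(0))
print(samples.mean())  # ~ E[|Z|] = sqrt(2/pi) ≈ 0.798
```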
We propose an efficient $\epsilon$-differentially private algorithm that, given a simple {\em weighted} $n$-vertex, $m$-edge graph $G$ with \emph{maximum unweighted} degree $\Delta(G) \leq n-1$, outputs a synthetic graph which approximates the spectrum of $G$ with a purely additive error bounded by $\widetilde{O}(\min\{\Delta(G), \sqrt{n}\})$. To the best of our knowledge, this is the first $\epsilon$-differentially private algorithm with a non-trivial additive error for approximating the spectrum of a graph. One of the subroutines of our algorithm precisely simulates the exponential mechanism over a non-convex set, which may be of independent interest in light of recent work on sampling from a {\em log-concave distribution} defined over a convex set. Spectral approximation also allows us to approximate all possible $(S,T)$-cuts, but it incurs an error that depends on the maximum degree $\Delta(G)$. We further show that, using our sampler, we can also output a synthetic graph that approximates the sizes of all $(S,T)$-cuts of an $n$-vertex weighted graph $G$ with $m$ edges while preserving $(\epsilon,\delta)$-differential privacy with an additive error of $\widetilde{O}(\sqrt{mn}/\epsilon)$. We also give a matching lower bound (with respect to all parameters) for private cut approximation on weighted graphs. This closes the gap of $\sqrt{W_{\mathsf{avg}}}$ between the upper and lower bounds of Eli{\'a}{\v{s}}, Kapralov, Kulkarni, and Lee (SODA 2020), where $W_{\mathsf{avg}}$ is the average edge weight.
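To make the notion of purely additive spectral error concrete, the toy sketch below uses simple input perturbation (Laplace noise on edge weights) rather than the paper's algorithm; the sensitivity and error estimates in the comments are standard for this baseline, not the paper's bounds.

```python
import numpy as np

def private_adjacency(W, eps, rng):
    """Toy input-perturbation baseline, NOT the paper's algorithm: add
    Laplace noise to each upper-triangular edge weight. For neighbouring
    graphs differing by at most 1 in one edge weight, Laplace(1/eps) noise
    per edge gives eps-DP, with a purely additive spectral error."""
    n = W.shape[0]
    noise = rng.laplace(scale=1.0 / eps, size=(n, n))
    noise = np.triu(noise, 1); noise += noise.T  # symmetric, zero diagonal
    return W + noise

rng = np.random.default_rng(0)
n = 100
W = np.triu(rng.random((n, n)), 1); W += W.T     # random weighted graph
Wp = private_adjacency(W, eps=1.0, rng=rng)
err = np.max(np.abs(np.linalg.eigvalsh(Wp) - np.linalg.eigvalsh(W)))
print("max additive eigenvalue error:", err)     # grows roughly like sqrt(n)/eps
```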
The random feature model with a nonlinear activation function has been shown to be asymptotically equivalent to a Gaussian model in terms of training and generalization errors. Analysis of the equivalent model reveals an important yet not fully understood role played by the activation function. To address this issue, we study the "parameters" of the equivalent model in order to achieve improved generalization performance for a given supervised learning problem. We show that the parameters acquired from the Gaussian model enable us to define a set of optimal nonlinearities. We provide two example classes from this set, namely second-order polynomial and piecewise linear functions. These functions are optimized to improve generalization performance regardless of their precise functional form. We experiment with regression and classification problems, including synthetic and real (e.g., CIFAR10) data. Our numerical results validate that the optimized nonlinearities achieve better generalization performance than widely used nonlinear functions such as ReLU. Furthermore, we illustrate that the proposed nonlinearities also mitigate the so-called double descent phenomenon, i.e., the non-monotonic behavior of the generalization error with respect to sample size and model size.
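The sketch below compares random feature ridge regression under a ReLU activation and an illustrative second-order polynomial; the polynomial's coefficients are placeholders, not the optimized nonlinearity derived in the paper.

```python
import numpy as np

def random_feature_test_error(act, d=50, N=200, n=150, trials=10, ridge=1e-3):
    """Ridge regression on random features act(W x) for a linear teacher;
    returns the average test MSE over several trials."""
    rng = np.random.default_rng(0)
    errs = []
    for _ in range(trials):
        beta = rng.standard_normal(d) / np.sqrt(d)        # teacher weights
        W = rng.standard_normal((N, d)) / np.sqrt(d)      # random feature matrix
        Xtr, Xte = rng.standard_normal((n, d)), rng.standard_normal((1000, d))
        ytr, yte = Xtr @ beta, Xte @ beta
        Ftr, Fte = act(Xtr @ W.T), act(Xte @ W.T)
        a = np.linalg.solve(Ftr.T @ Ftr + ridge * np.eye(N), Ftr.T @ ytr)
        errs.append(np.mean((Fte @ a - yte) ** 2))
    return np.mean(errs)

relu = lambda z: np.maximum(z, 0)
quad = lambda z: z + 0.1 * (z**2 - 1)   # placeholder second-order polynomial
print("ReLU:", random_feature_test_error(relu))
print("quad:", random_feature_test_error(quad))
```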
A \emph{mixed interval graph} is an interval graph that has, for every pair of intersecting intervals, either an arc (directed arbitrarily) or an (undirected) edge. We are particularly interested in scenarios where edges and arcs are defined by the geometry of intervals. In a proper coloring of a mixed interval graph $G$, an interval $u$ must receive a lower color than an interval $v$ if $G$ contains the arc $(u,v)$, and a different color if $G$ contains the edge $\{u,v\}$. Coloring of mixed graphs has applications, for example, in scheduling with precedence constraints; see the survey by Sotskov [Mathematics, 2020]. For coloring general mixed interval graphs, we present a $\min \{\omega(G), \lambda(G)+1 \}$-approximation algorithm, where $\omega(G)$ is the size of a largest clique and $\lambda(G)$ is the length of a longest directed path in $G$. For the subclass of \emph{bidirectional interval graphs} (introduced recently for an application in graph drawing), we show that optimal coloring is NP-hard; this was known for general mixed interval graphs. We introduce a new natural class of mixed interval graphs, which we call \emph{containment interval graphs}. In such a graph, there is an arc $(u,v)$ if interval $u$ contains interval $v$, and there is an edge $\{u,v\}$ if $u$ and $v$ overlap. We show that these graphs can be recognized in polynomial time, that coloring them with the minimum number of colors is NP-hard, and that there is a 2-approximation algorithm for coloring.
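The sketch below illustrates the flavour of colouring a mixed graph: arcs impose a strict ordering on colours while edges only force distinctness. It is a plain greedy pass in topological order, not the $\min\{\omega(G), \lambda(G)+1\}$-approximation algorithm of the paper.

```python
import networkx as nx

def greedy_mixed_coloring(arcs, edges, nodes):
    """Greedy colouring of a mixed graph (illustrative): process vertices in
    a topological order of the arcs, give each vertex the smallest colour
    strictly above all arc-predecessors and distinct from coloured
    edge-neighbours. Assumes the arcs form a DAG."""
    D = nx.DiGraph(); D.add_nodes_from(nodes); D.add_edges_from(arcs)
    undirected = {frozenset(e) for e in edges}
    colour = {}
    for v in nx.topological_sort(D):
        floor = max((colour[u] + 1 for u, _ in D.in_edges(v)), default=0)
        taken = {colour[u] for u in colour if frozenset((u, v)) in undirected}
        c = floor
        while c in taken:
            c += 1
        colour[v] = c
    return colour

# Interval a contains b (arc), while b and c merely overlap (edge):
print(greedy_mixed_coloring(arcs=[("a", "b")], edges=[("b", "c")],
                            nodes=["a", "b", "c"]))
```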
This paper develops the exact linear relationship between the leading eigenvector of the unnormalized modularity matrix and the eigenvectors of the adjacency matrix. We propose a method for approximating the leading eigenvector of the modularity matrix, and we derive the error of the approximation. We also give a complete proof of the equivalence between normalized adjacency clustering and normalized modularity clustering. Numerical experiments show that normalized adjacency clustering can be up to twice as efficient as normalized modularity clustering.
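The relationship can be probed numerically: the sketch below forms the modularity matrix $B = A - dd^\top/(2m)$ on a standard test graph (assuming networkx) and expands its leading eigenvector in the eigenbasis of $A$. The exact linear relationship proved in the paper predicts where this weight concentrates.

```python
import numpy as np
import networkx as nx

# Compare the leading eigenvector of the unnormalized modularity matrix
# B = A - d d^T / (2m) with the eigenvectors of the adjacency matrix A.
G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
d = A.sum(axis=1)
B = A - np.outer(d, d) / d.sum()          # d.sum() equals 2m

_, Va = np.linalg.eigh(A)                 # eigenvectors of A, ascending order
_, Vb = np.linalg.eigh(B)
lead_B = Vb[:, -1]                        # leading eigenvector of B

# Expand B's leading eigenvector in the eigenbasis of A and report how much
# weight falls on A's top few eigenvectors.
coeffs = Va.T @ lead_B
print(np.round(np.abs(coeffs[-5:]), 3))   # weights on A's top 5 eigenvectors
```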