Determining the matrix multiplication exponent $\omega$ is one of the greatest open problems in theoretical computer science. We show that it is impossible to prove $\omega = 2$ by starting with structure tensors of modules of fixed degree and using arbitrary restrictions. It implies that the same is impossible by starting with $1_A$-generic non-diagonal tensors of fixed size with minimal border rank. This generalizes the work of Bl\"aser and Lysikov [3]. Our methods come from both commutative algebra and complexity theory.
We study the problem of computing the Hamming weight of an $n$-bit string modulo $m$, for any positive integer $m \leq n$ whose only prime factors are 2 and 3. We show that the exact quantum query complexity of this problem is $\left\lceil n(1 - 1/m) \right\rceil$. The upper bound is via an iterative query algorithm whose core components are the well-known 1-query quantum algorithm (essentially due to Deutsch) to compute the Hamming weight a 2-bit string mod 2 (i.e., the parity of the input bits), and a new 2-query quantum algorithm to compute the Hamming weight of a 3-bit string mod 3. We show a matching lower bound (in fact for arbitrary moduli $m$) via a variant of the polynomial method [de Wolf, SIAM J. Comput., 32(3), 2003]. This bound is for the weaker task of deciding whether or not a given $n$-bit input has Hamming weight 0 modulo $m$, and it holds even in the stronger non-deterministic quantum query model where an algorithm must have positive acceptance probability iff its input evaluates to 1. For $m>2$ our lower bound exceeds $n/2$, beating the best lower bound provable using the general polynomial method [Theorem 4.3, Beals et al., J. ACM 48(4), 2001].
The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized by a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Inspired by this dichotomy, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is equivalent to the uniform case of the query reliability problem, where the probability of every tuple is required to be 1/2. Of course, for hierarchical queries, uniform reliability is solvable in polynomial time, like the reliability problem. We show that being hierarchical is also necessary for this tractability (under conventional complexity assumptions). In fact, we establish a generalization of the dichotomy that covers every restricted case of reliability in which the probabilities of tuples are determined by their relation.
The point of this work is to explore axiomatisations of concurrent computation using the technology of proof theory and realizability. To deal with this problem, we redefine the Concurrent Realizability of Beffara using as realizers a $\pi$-calculus with global fusions. We define a variant of the Conjunctive Structures of \'E Miquey as a general structure where belong realizers and truth values from realizability. As for Secuential Realizability, we encode the realizers into the algebraic structure by means of a combinatory presentation, following the work of Honda & Yoshida. In this first work we restricted to work with the $\pi$-calculus without replication and its corresponding type system is the multiplicative linear logic (MLL).
Let $Q_{n}^{r}$ be the graph with vertex set $\{-1,1\}^{n}$ in which two vertices are joined if their Hamming distance is at most $r$. The edge-isoperimetric problem for $Q_{n}^{r}$ is that: For every $(n,r,M)$ such that $1\le r\le n$ and $1\le M\le2^{n}$, determine the minimum edge-boundary size of a subset of vertices of $Q_{n}^{r}$ with a given size $M$. In this paper, we apply two different approaches to prove bounds for this problem. The first approach is a linear programming approach and the second is a probabilistic approach. Our bound derived by the first approach generalizes the tight bound for $M=2^{n-1}$ derived by Kahn, Kalai, and Linial in 1989. Moreover, our bound is also tight for $M=2^{n-2}$ and $r\le\frac{n}{2}-1$. Our bounds derived by the second approach are expressed in terms of the \emph{noise stability}, and they are shown to be asymptotically tight as $n\to\infty$ when $r=2\lfloor\frac{\beta n}{2}\rfloor+1$ and $M=\lfloor\alpha2^{n}\rfloor$ for fixed $\alpha,\beta\in(0,1)$, and is tight up to a factor $2$ when $r=2\lfloor\frac{\beta n}{2}\rfloor$ and $M=\lfloor\alpha2^{n}\rfloor$. In fact, the edge-isoperimetric problem is equivalent to a ball-noise stability problem which is a variant of the traditional (i.i.d.-) noise stability problem. Our results can be interpreted as bounds for the ball-noise stability problem.
In order to gain a better understanding of the state space of programs, with the aim of making their verification more tractable, models based on directed topological spaces have been introduced, allowing to take in account equivalence between execution traces, as well as translate features of the execution (such as the presence of deadlocks) into geometrical situations. In this context, many algorithms were introduced, based on a description of the geometrical models as regions consisting of unions of rectangles. We explain here that these constructions can actually be performed directly on the syntax of programs, thus resulting in representations which are more natural and easier to implement. In order to do so, we start from the observation that positions in a program can be described as partial explorations of the program. The operational semantics induces a partial order on positions, and regions can be defined as formal unions of intervals in the resulting poset. We then study the structure of such regions and show that, under reasonable conditions, they form a boolean algebra and admit a representation in normal form (which corresponds to covering a space by maximal intervals), thus supporting the constructions needed for the purpose of studying programs. All the operations involved here are given explicit algorithmic descriptions.
Aiming to recover the data from several concurrent node failures, linear $r$-LRC codes with locality $r$ were extended into $(r, \delta)$-LRC codes with locality $(r, \delta)$ which can enable the local recovery of a failed node in case of more than one node failure. Optimal LRC codes are those whose parameters achieve the generalized Singleton bound with equality. In the present paper, we are interested in studying optimal LRC codes over small fields and, more precisely, over $\mathbb{F}_4$. We shall adopt an approach by investigating optimal quaternary $(r,\delta)$-LRC codes through their parity-check matrices. Our study includes determining the structural properties of optimal $(r,\delta)$-LRC codes, their constructions, and their complete classification over $\F_4$ by browsing all possible parameters. We emphasize that the precise structure of optimal quaternary $(r,\delta)$-LRC codes and their classification are obtained via the parity-check matrix approach use proofs-techniques different from those used recently for optimal binary and ternary $(r,\delta)$-LRC codes obtained by Hao et al. in [IEEE Trans. Inf. Theory, 2020, 66(12): 7465-7474].
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
Matter evolved under influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales, and developed non-Gaussian features in the Universe, known as the Cosmic Web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, to evolve trillions of galaxies over billions of years even with the simplest physics is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis, demonstrates that D$^3$M outperforms the second order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, 2-point correlation, and 3-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data, and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
This paper develops a general framework for learning interpretable data representation via Long Short-Term Memory (LSTM) recurrent neural networks over hierarchal graph structures. Instead of learning LSTM models over the pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization. We thus call this model the structure-evolving LSTM. In particular, starting with an initial element-level graph representation where each node is a small data element, the structure-evolving LSTM gradually evolves the multi-level graph representations by stochastically merging the graph nodes with high compatibilities along the stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two connected nodes from their corresponding LSTM gate outputs, which is used to generate a merging probability. The candidate graph structures are accordingly generated where the nodes are grouped into cliques with their merging probabilities. We then produce the new graph structure with a Metropolis-Hasting algorithm, which alleviates the risk of getting stuck in local optimums by stochastic sampling with an acceptance probability. Once a graph structure is accepted, a higher-level graph is then constructed by taking the partitioned cliques as its nodes. During the evolving process, representation becomes more abstracted in higher-levels where redundant information is filtered out, allowing more efficient propagation of long-range data dependencies. We evaluate the effectiveness of structure-evolving LSTM in the application of semantic object parsing and demonstrate its advantage over state-of-the-art LSTM models on standard benchmarks.