We consider the problem of computing a function of $n$ variables using noisy queries, where each query is incorrect with some fixed and known probability $p \in (0,1/2)$. Specifically, we consider the computation of the $\mathsf{OR}$ function of $n$ bits (where queries correspond to noisy readings of the bits) and the $\mathsf{MAX}$ function of $n$ real numbers (where queries correspond to noisy pairwise comparisons). We show that an expected number of queries of \[ (1 \pm o(1)) \frac{n\log \frac{1}{\delta}}{D_{\mathsf{KL}}(p \| 1-p)} \] is both sufficient and necessary to compute both functions with a vanishing error probability $\delta = o(1)$, where $D_{\mathsf{KL}}(p \| 1-p)$ denotes the Kullback-Leibler divergence between $\mathsf{Bern}(p)$ and $\mathsf{Bern}(1-p)$ distributions. Compared to previous work, our results tighten the dependence on $p$ in both the upper and lower bounds for the two functions.
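The leading-order query count above is straightforward to evaluate numerically. The following sketch (not from the paper; the function names are ours) computes the Kullback-Leibler divergence between $\mathsf{Bern}(p)$ and $\mathsf{Bern}(1-p)$ and the resulting expected number of queries:

```python
import math

def kl_bern(p, q):
    """KL divergence between Bern(p) and Bern(q), in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def expected_queries(n, delta, p):
    """Leading-order expected query count n*log(1/delta) / D_KL(p || 1-p),
    ignoring the (1 +/- o(1)) factor from the theorem."""
    return n * math.log(1 / delta) / kl_bern(p, 1 - p)
```

For $p = 1/4$, $D_{\mathsf{KL}}(p \| 1-p) = \tfrac{1}{2}\log 3 \approx 0.549$, so roughly $1.82\, n \log\frac{1}{\delta}$ queries suffice in expectation; as $p \to 1/2$ the divergence vanishes and the cost blows up, as one would expect for nearly uninformative queries.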
Higher-order multiway data is ubiquitous in machine learning and statistics and often exhibits community-like structures, where each component (node) along each different mode has a community membership associated with it. In this paper we propose the tensor mixed-membership blockmodel, a generalization of the tensor blockmodel positing that memberships need not be discrete, but instead are convex combinations of latent communities. We establish the identifiability of our model and propose a computationally efficient estimation procedure based on the higher-order orthogonal iteration algorithm (HOOI) for tensor SVD composed with a simplex corner-finding algorithm. We then demonstrate the consistency of our estimation procedure by providing a per-node error bound, which showcases the effect of higher-order structures on estimation accuracy. To prove our consistency result, we develop an $\ell_{2,\infty}$ tensor perturbation bound for HOOI under independent, possibly heteroskedastic, subgaussian noise, which may be of independent interest. Our analysis uses a novel leave-one-out construction for the iterates, and our bounds depend only on spectral properties of the underlying low-rank tensor under nearly optimal signal-to-noise ratio conditions such that tensor SVD is computationally feasible. Whereas other leave-one-out analyses typically focus on sequences constructed by analyzing the output of a given algorithm with a small part of the noise removed, our leave-one-out construction uses both the previous iterates and the additional tensor structure to eliminate a potential additional source of error. Finally, we apply our methodology to real and simulated data, including applications to two flight datasets and a trade network dataset, demonstrating some effects not identifiable from the model with discrete community memberships.
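The HOOI algorithm underlying the estimation procedure alternates between projecting the tensor onto the current factor subspaces and refreshing each factor from an SVD of the projected unfolding. A minimal NumPy sketch of generic HOOI (our own illustration, not the paper's full estimator, which additionally composes HOOI with a simplex corner-finding step):

```python
import numpy as np

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def unfold(T, mode):
    """Mode-k unfolding: move mode k to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hooi(T, ranks, n_iter=10):
    """Higher-order orthogonal iteration for a truncated tensor SVD.

    Initializes with HOSVD, then cyclically updates each factor as the top
    left singular vectors of the tensor projected onto the other factors.
    Returns the core tensor G and the list of factor matrices U.
    """
    d = T.ndim
    U = [np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :ranks[k]]
         for k in range(d)]
    for _ in range(n_iter):
        for k in range(d):
            G = T
            for j in range(d):
                if j != k:
                    G = mode_product(G, U[j].T, j)
            U[k] = np.linalg.svd(unfold(G, k), full_matrices=False)[0][:, :ranks[k]]
    G = T
    for j in range(d):
        G = mode_product(G, U[j].T, j)
    return G, U
```

On a tensor of exact multilinear rank matching `ranks`, the returned core and factors reconstruct the input exactly; under noise, the factors estimate the singular subspaces that the paper's $\ell_{2,\infty}$ perturbation bound controls.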
We propose $\mathbb{VD}$-$\mathbb{GR}$ - a novel visual dialog model that combines pre-trained language models (LMs) with graph neural networks (GNNs). Prior works mainly focused on one class of models at the expense of the other, thus missing out on the opportunity of combining their respective benefits. At the core of $\mathbb{VD}$-$\mathbb{GR}$ is a novel integration mechanism that alternates between spatial-temporal multi-modal GNNs and BERT layers, and that covers three distinct contributions: First, we use multi-modal GNNs to process the features of each modality (image, question, and dialog history) and exploit their local structures before performing BERT global attention. Second, we propose hub-nodes that link to all other nodes within one modality graph, allowing the model to propagate information from one GNN (modality) to the other in a cascaded manner. Third, we augment the BERT hidden states with fine-grained multi-modal GNN features before passing them to the next $\mathbb{VD}$-$\mathbb{GR}$ layer. Evaluations on VisDial v1.0, VisDial v0.9, VisDialConv, and VisPro show that $\mathbb{VD}$-$\mathbb{GR}$ achieves new state-of-the-art results across all four datasets.
We construct a monotone continuous $Q^1$ finite element method on the uniform mesh for the anisotropic diffusion problem with a diagonally dominant diffusion coefficient matrix. The monotonicity implies the discrete maximum principle. Convergence of the new scheme is rigorously proven. On quadrilateral meshes, the matrix coefficient conditions translate into a specific mesh constraint.
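Monotonicity of a discrete operator means its matrix has an entrywise nonnegative inverse, which is what yields the discrete maximum principle: nonnegative data produce nonnegative solutions. A minimal one-dimensional finite-difference illustration of this property (our own toy example, not the paper's $Q^1$ anisotropic scheme):

```python
import numpy as np

def fd_laplacian_1d(n, h):
    """Tridiagonal finite-difference Laplacian with homogeneous Dirichlet BCs."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def is_monotone(A, tol=1e-10):
    """A matrix is monotone iff its inverse is entrywise nonnegative."""
    return bool((np.linalg.inv(A) >= -tol).all())
```

Since the 1D Laplacian is an M-matrix, `is_monotone` returns `True`, and solving $Au = f$ with $f \ge 0$ yields $u \ge 0$, the discrete analogue of the maximum principle.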
We describe Bayes factor functions based on z, t, $\chi^2$, and F statistics and the prior distributions used to define alternative hypotheses. The non-local alternative prior distributions are centered on standardized effects, which index the Bayes factor function. The prior densities include a dispersion parameter that models the variation of effect sizes across replicated experiments. We examine the convergence rates of Bayes factor functions under true null and true alternative hypotheses. Several examples illustrate the application of the Bayes factor functions to replicated experimental designs and compare the conclusions from these analyses to other default Bayes factor methods.
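A Bayes factor function reports evidence as a function of the hypothesized standardized effect. The simplest instance, shown below, uses a point alternative at standardized effect $\mu$ for a z statistic, where the Bayes factor reduces to the exact likelihood ratio $\exp(\mu z - \mu^2/2)$. This is our own simplified illustration; the paper's functions instead use non-local priors with a dispersion parameter rather than point alternatives:

```python
import math

def bf10_point(z, mu):
    """Bayes factor N(z; mu, 1) / N(z; 0, 1) for a point alternative at
    standardized effect mu -- the simplest 'Bayes factor function' of mu."""
    return math.exp(mu * z - 0.5 * mu * mu)
```

Evaluating `bf10_point` over a grid of `mu` values traces the evidence curve: at $z = 0$ every alternative is disfavored ($\mathrm{BF}_{10} < 1$), while a large observed $z$ favors alternatives centered near it.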
We consider various iterative algorithms for solving the linear equation $ax=b$ using a quantum computer operating on the principle of quantum annealing. Assuming that the computer's output is described by the Boltzmann distribution, it is shown under which conditions the equation-solving algorithms converge, and an estimate of their convergence rate is provided. The application of this approach to algorithms using both an infinite number of qubits and a small number of qubits is discussed.
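For intuition, the classical analogue of such an iterative scheme is a fixed-point update whose residual shrinks geometrically when a convergence condition holds. A minimal sketch (our own classical illustration; the paper's algorithms instead realize the update via quantum annealing samples from a Boltzmann distribution):

```python
def solve_iterative(a, b, alpha, x0=0.0, tol=1e-12, max_iter=10000):
    """Richardson-type iteration x <- x + alpha*(b - a*x) for a*x = b.

    Converges iff |1 - alpha*a| < 1, with rate |1 - alpha*a| per step.
    """
    x = x0
    for _ in range(max_iter):
        r = b - a * x
        if abs(r) < tol:
            return x
        x = x + alpha * r
    return x
```

With $\alpha = 1/a$ the contraction factor $|1 - \alpha a|$ is zero and the iteration converges in one step; the interesting regime, as in the annealing setting, is when each step is only approximately correct.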
For a hypergraph $H$, the transversal is a subset of vertices whose intersection with every edge is nonempty. The cardinality of a minimum transversal is the transversal number of $H$, denoted by $\tau(H)$. The Tuza constant $c_k$ is defined as $\sup{\tau(H)/ (m+n)}$, where $H$ ranges over all $k$-uniform hypergraphs, with $m$ and $n$ being the number of edges and vertices, respectively. We give an upper bound and a lower bound on $c_k$. The upper bound improves the known ones for $k\geq 7$, and the lower bound improves the known ones for $k\in\{7, 8, 10, 11, 13, 14, 17\}$.
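The quantity $\tau(H)/(m+n)$ is easy to evaluate exactly on small instances by brute force. A sketch (our own helper, useful only for tiny hypergraphs since it enumerates all vertex subsets):

```python
from itertools import combinations

def transversal_number(vertices, edges):
    """Smallest size of a vertex subset meeting every edge (brute force)."""
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            if all(s & set(e) for e in edges):
                return size
    return len(vertices)
```

For example, the Fano plane is a 3-uniform hypergraph with $n = 7$ vertices and $m = 7$ edges and has $\tau = 3$, giving the ratio $3/14 \approx 0.214$ for $k = 3$.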
The monotonicity of the discrete Laplacian implies the discrete maximum principle, which in general does not hold for high order schemes. The $Q^2$ spectral element method has been proven monotone on a uniform rectangular mesh. In this paper we prove the monotonicity of the $Q^2$ spectral element method on quasi-uniform rectangular meshes under certain mesh constraints. In particular, we propose a relaxed Lorenz condition for proving monotonicity.
Extremal Type II $\mathbb{Z}_4$-codes are a class of self-dual $\mathbb{Z}_4$-codes with Euclidean weights divisible by eight and the largest possible minimum Euclidean weight for a given length. A small number of such codes is known for lengths greater than or equal to $48.$ The doubling method is a method for constructing Type II $\mathbb{Z}_4$-codes from a given Type II $\mathbb{Z}_4$-code. Based on the doubling method, in this paper we develop a method to construct new extremal Type II $\mathbb{Z}_4$-codes starting from an extremal Type II $\mathbb{Z}_4$-code of type $4^k$ with an extremal residue code and length $48, 56$ or $64$. Using this method, we construct three new extremal Type II $\mathbb{Z}_4$-codes of length $64$ and type $4^{31}2^2$. Extremal Type II $\mathbb{Z}_4$-codes of length $64$ of this type were not known before. Moreover, the residue codes of the constructed extremal $\mathbb{Z}_4$-codes are new best known $[64,31]$ binary codes and the supports of the minimum weight codewords of the residue code and the torsion code of one of these codes form self-orthogonal $1$-designs.
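The Euclidean weight over $\mathbb{Z}_4$ assigns each coordinate $a$ the value $\min(a, 4-a)^2$, i.e. $0, 1, 4, 1$ for $a = 0, 1, 2, 3$; a Type II code requires every codeword's Euclidean weight to be divisible by eight. A small helper illustrating the definition (our own sketch, not code from the paper):

```python
def euclidean_weight(codeword):
    """Euclidean weight of a Z4 codeword: coordinates 0,1,2,3
    contribute 0,1,4,1 respectively."""
    w = {0: 0, 1: 1, 2: 4, 3: 1}
    return sum(w[c % 4] for c in codeword)
```

For instance, the length-64 extremal codes discussed above have minimum Euclidean weight as large as the divisibility-by-eight constraint permits for that length.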
Byzantine agreement (BA), the task of having $n$ parties agree on one of their input bits in the face of malicious agents, is a powerful primitive that lies at the core of a vast range of distributed protocols. Interestingly, in protocols with the best overall communication, the demands of the parties are highly unbalanced: the amortized cost is $\tilde O(1)$ bits per party, but some parties must send $\Omega(n)$ bits. In the best known balanced protocols, the overall communication is sub-optimal, with each party communicating $\tilde O(\sqrt{n})$ bits. In this work, we ask whether asymmetry is inherent for optimizing total communication. Our contributions in this line are as follows: 1) We define a cryptographic primitive, succinctly reconstructed distributed signatures (SRDS), that suffices for constructing $\tilde O(1)$ balanced BA. We provide two constructions of SRDS from different cryptographic and Public-Key Infrastructure (PKI) assumptions. 2) The SRDS-based BA follows a paradigm of boosting from "almost-everywhere" agreement to full agreement, and does so in a single round. We prove that PKI setup and cryptographic assumptions are necessary for such protocols in which every party sends $o(n)$ messages. 3) We further explore connections between a natural approach toward attaining SRDS and average-case succinct non-interactive argument systems (SNARGs) for a particular type of NP-Complete problems (generalizing Subset-Sum and Subset-Product). Our results provide new approaches forward, as well as limitations and barriers, towards minimizing per-party communication of BA. In particular, we construct the first two BA protocols with $\tilde O(1)$ balanced communication, offering a tradeoff between setup and cryptographic assumptions, and answering an open question presented by King and Saia (DISC'09).
Directional interpolation is a fast and efficient compression technique for high-frequency Helmholtz boundary integral equations, but it requires a very large amount of storage in its original form. Algebraic recompression can significantly reduce the storage requirements and speed up the solution process accordingly. During the recompression process, weight matrices are required to correctly measure the influence of different basis vectors on the final result, and for highly accurate approximations, these weight matrices require more storage than the final compressed matrix. We present a compression method for the weight matrices and demonstrate that it introduces only a controllable error to the overall approximation. Numerical experiments show that the new method leads to a significant reduction in storage requirements.