We give a detailed proof of the fact that the inverse Ackermann function is computable in linear time.
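As a minimal sketch of what this statement concerns, the following computes one common single-argument variant of the inverse Ackermann function, built from iterated "star" inverses (halving, then roughly $\log$, then $\log^*$, and so on). Indexing conventions and the cutoff vary across texts; this illustrates the definition, not the paper's linear-time algorithm.

```python
def inverse_ackermann(n):
    """One common variant of the inverse Ackermann function: level 1 is
    halving, and each next level counts how many applications of the
    previous level drive m down to 1 (the 'star' operation)."""
    def star(f):
        def g(m):
            count = 0
            while m > 1:
                m = f(m)
                count += 1
            return count
        return g

    f = lambda m: m // 2   # level 1: halving; its star is ~log2, then log*, ...
    k = 1
    while f(n) >= 4:       # the cutoff 4 is one of several conventions
        f = star(f)
        k += 1
    return k
```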
We give a deterministic algorithm for finding the minimum (weight) cut of an undirected graph on $n$ vertices and $m$ edges using $\text{polylog}(n)$ calls to any maximum flow subroutine. Using the current best deterministic maximum flow algorithms, this yields an overall running time of $\tilde O(m \cdot \min(\sqrt{m}, n^{2/3}))$ for weighted graphs, and $m^{4/3+o(1)}$ for unweighted (multi)-graphs. This marks the first improvement for this problem since a running time bound of $\tilde O(mn)$ was established by several papers in the early 1990s. To obtain this result, we introduce a new tool for finding minimum cuts of an undirected graph: *isolating cuts*. Given a set of vertices $R$, this entails finding cuts of minimum weight that separate (or isolate) each individual vertex $v\in R$ from the rest of the vertices $R\setminus \{v\}$. Na\"ively, this can be done using $|R|$ maxflow calls, but we show that just $O(\log |R|)$ suffice for finding isolating cuts for any set of vertices $R$. We call this the *isolating cut lemma*.
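To make the bit-partition idea behind the isolating cut lemma concrete, here is a hedged sketch of its first phase, using networkx's maxflow-based `minimum_cut` as the subroutine. The function name and the super-terminals `"s"`/`"t"` are illustrative, and the lemma's second phase (intersecting the returned cuts to localize each $v\in R$ and solving a small maxflow inside each region) is omitted.

```python
import math
import networkx as nx

def isolating_cuts_phase1(G, R):
    """Sketch of the O(log |R|) bit-partition phase: for each bit position,
    split R in two by that bit of each vertex's index, attach the halves to
    super-terminals, and run a single s-t minimum cut.
    Assumes G is a networkx graph whose edges carry a 'capacity' attribute
    and that "s"/"t" are not existing node names."""
    R = list(R)
    cuts = []
    for i in range(max(1, math.ceil(math.log2(len(R))))):
        side0 = [v for j, v in enumerate(R) if (j >> i) & 1 == 0]
        side1 = [v for j, v in enumerate(R) if (j >> i) & 1 == 1]
        if not side0 or not side1:
            continue
        H = G.copy()
        for v in side0:
            H.add_edge("s", v)  # missing 'capacity' = infinite capacity
        for v in side1:
            H.add_edge("t", v)
        cut_value, (S, T) = nx.minimum_cut(H, "s", "t")
        cuts.append((cut_value, S - {"s"}, T - {"t"}))
    return cuts
```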
We consider lattice coding for the Gaussian wiretap channel, where the challenge is to ensure reliable communication between two authorized parties while preventing an eavesdropper from learning the transmitted messages. Recently, a measure called the secrecy function of a lattice coding scheme was proposed as a design criterion to characterize the eavesdropper's probability of correct decision. In this paper, the family of formally unimodular lattices is presented and shown to possess the same secrecy function behavior as unimodular and isodual lattices. Based on Construction A, we provide a universal approach to determine the secrecy gain, i.e., the maximum value of the secrecy function, for formally unimodular lattices obtained from formally self-dual codes. Furthermore, we show that formally unimodular lattices can achieve higher secrecy gain than the best-known unimodular lattices from the literature.
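For orientation, the secrecy function referred to here is commonly defined (following Oggier, Solé, and Belfiore) through theta series: for an $n$-dimensional lattice $\Lambda$ of volume $\nu(\Lambda)$,
\[
\Xi_\Lambda(y) \;=\; \frac{\Theta_{\nu(\Lambda)^{1/n}\,\mathbb{Z}^n}(y)}{\Theta_\Lambda(y)},
\qquad
\Theta_\Lambda(y) \;=\; \sum_{\boldsymbol{x}\in\Lambda} e^{-\pi y \lVert \boldsymbol{x}\rVert^2},
\quad y>0,
\]
and the secrecy gain is $\sup_{y>0}\Xi_\Lambda(y)$. This normalization follows the cited line of work and may differ from the paper's by a scaling convention.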
Minimax optimization has been central in addressing various applications in machine learning, game theory, and control theory. Prior literature has thus far mainly focused on studying such problems in the continuous domain, e.g., convex-concave minimax optimization is now understood to a significant extent. Nevertheless, minimax problems extend far beyond the continuous domain to mixed continuous-discrete domains or even fully discrete domains. In this paper, we study mixed continuous-discrete minimax problems where the minimization is over a continuous variable belonging to Euclidean space and the maximization is over subsets of a given ground set. We introduce the class of convex-submodular minimax problems, where the objective is convex with respect to the continuous variable and submodular with respect to the discrete variable. Even though such problems appear frequently in machine learning applications, little is known about how to address them from algorithmic and theoretical perspectives. For such problems, we first show that obtaining saddle points is hard up to any approximation, and thus introduce new notions of (near-) optimality. We then provide several algorithmic procedures for solving convex and monotone-submodular minimax problems and characterize their convergence rates, computational complexity, and quality of the final solution according to our notions of optimality. Our proposed algorithms are iterative and combine tools from both discrete and continuous optimization. Finally, we provide numerical experiments to showcase the effectiveness of our proposed methods.
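To convey the flavor of such iterative procedures, here is a hedged sketch (not the paper's exact algorithm) that alternates a classic $(1-1/e)$ greedy step for the monotone-submodular inner maximization with a subgradient step on the continuous variable; all names and the step-size rule are illustrative.

```python
def convex_submodular_minimax(F, grad_x, ground_set, k, x0, steps, lr):
    """Illustrative alternating scheme for min_x max_{|S|<=k} f(x, S),
    with f convex in x and monotone submodular in S.
    F(x, S): objective value; grad_x(x, S): a subgradient of f(., S) at x."""
    x = x0
    S = set()
    for _ in range(steps):
        # inner approximate maximization: greedy gives a (1 - 1/e) guarantee
        S = set()
        for _ in range(min(k, len(ground_set))):
            best = max((e for e in ground_set if e not in S),
                       key=lambda e: F(x, S | {e}))
            S = S | {best}
        # outer subgradient step on the continuous variable
        x = x - lr * grad_x(x, S)
    return x, S
```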
Inferring inductive invariants is one of the main challenges of formal verification. The theory of abstract interpretation provides a rich framework to devise invariant inference algorithms. One of the latest breakthroughs in invariant inference is property-directed reachability (PDR), but the research community views PDR and abstract interpretation as mostly unrelated techniques. This paper shows that, surprisingly, propositional PDR can be formulated as an abstract interpretation algorithm in a logical domain. More precisely, we define a version of PDR, called $\Lambda$-PDR, in which all generalizations of counterexamples are used to strengthen a frame. In this way, there is no need to refine frames after their creation, because all the possible supporting facts are included in advance. We analyze this algorithm using notions from Bshouty's monotone theory, originally developed in the context of exact learning. We show that there is an inherent overapproximation between the algorithm's frames that is related to the monotone theory. We then define a new abstract domain in which the best abstract transformer performs this overapproximation, and show that it captures the invariant inference process, i.e., $\Lambda$-PDR corresponds to Kleene iterations with the best transformer in this abstract domain. We provide some sufficient conditions for when this process converges in a small number of iterations, sometimes with an exponential gap from the number of iterations required for naive exact forward reachability. These results provide a firm theoretical foundation for the benefits of the way PDR tackles forward reachability.
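To fix terminology, the Kleene iteration referred to here is the generic fixpoint computation sketched below; the paper's contribution is identifying an abstract domain (built from Bshouty's monotone theory) in which $\Lambda$-PDR's frames coincide with these iterates. The sketch is domain-agnostic and purely illustrative.

```python
def kleene_iterations(best_transformer, bottom, leq):
    """Generic Kleene iteration in an abstract domain: repeatedly apply
    the best abstract transformer until a fixpoint, yielding the sequence
    of iterates (the analogue of PDR's frames)."""
    elem = bottom
    while True:
        yield elem
        nxt = best_transformer(elem)
        if leq(nxt, elem):   # no new information: fixpoint reached
            return
        elem = nxt
```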
The spatial impulse response (SIR) method is a well-known approach to calculating transient acoustic fields of arbitrarily shaped transducers. It involves the evaluation of a time-dependent surface integral. Although analytic expressions of the SIR exist for some geometries, numerical methods based on the discretization of transducer surfaces have become the standard. The proposed method consists of representing the transducer as a non-uniform rational B-spline (NURBS) surface and decomposing it into smooth B\'ezier patches onto which quadrature rules can be deployed. The evaluation of the SIR can then be expressed in B-spline bases, resulting in a sum of shifted-and-weighted basis functions. Field signals are eventually obtained by a convolution of the basis coefficients, derived from the excitation waveform, with the basis SIR. The use of NURBS enables exact representations of common transducer elements. High-order Gaussian quadrature rules enable high accuracy with few quadrature points. High-order B-spline bases are ideally suited to efficiently exploit the bandlimited nature of excitation waveforms. Numerical experiments demonstrate that the proposed approach enables sampling the SIR at the low sampling rates required by the excitation waveform, without introducing additional errors in the simulated field signals.
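As an illustration of the quadrature component, the sketch below places a tensor-product Gauss-Legendre rule on one B\'ezier patch. Here `patch_eval` is an assumed surface evaluator returning a point and the local Jacobian magnitude, and the parametric square $[0,1]^2$ is an illustrative convention, not necessarily the paper's exact setup.

```python
import numpy as np

def patch_quadrature(patch_eval, order):
    """Tensor-product Gauss-Legendre quadrature on a single Bezier patch.
    patch_eval(u, v) -> (xyz point, |Jacobian|) on the parametric square."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    u = 0.5 * (nodes + 1.0)          # map [-1, 1] to [0, 1]
    points, wts = [], []
    for i, ui in enumerate(u):
        for j, vj in enumerate(u):
            xyz, jac = patch_eval(ui, vj)
            # 0.25 = (1/2)^2 accounts for rescaling both parametric axes
            points.append(xyz)
            wts.append(0.25 * weights[i] * weights[j] * jac)
    return np.array(points), np.array(wts)
```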
A major challenge in current optimization research for deep learning is to automatically find optimal step sizes for each update step. The optimal step size is closely related to the shape of the loss in the update step direction. However, this shape has not yet been examined in detail. This work shows empirically that the batch loss along lines in the negative gradient direction is mostly locally convex and well suited for one-dimensional parabolic approximations. By exploiting this parabolic property, we introduce a simple and robust line search approach that performs loss-shape-dependent update steps. Our approach combines well-known methods such as parabolic approximation, line search, and conjugate gradient to perform efficiently. It surpasses other step size estimation methods and competes with common optimization methods on a large variety of experiments, without the need for hand-designed step size schedules. Thus, it is of interest for objectives where step size schedules are unknown or do not perform well. Our extensive evaluation includes multiple comprehensive hyperparameter grid searches on several datasets and architectures. Finally, we provide a general investigation of exact line searches in the context of batch losses and exact losses, including their relation to our line search approach.
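The core parabolic step can be stated in a few lines. The sketch below samples the batch loss at three points along the normalized negative gradient, fits a parabola, and steps to its vertex; the measuring distance `mu` and the fallback are illustrative choices, not the paper's tuned values.

```python
def parabolic_step(loss_along, mu=0.1, fallback=0.01):
    """loss_along(t): batch loss at offset t along the (normalized)
    negative gradient. Fits q(t) = a t^2 + b t + c through t = 0, mu, 2mu
    and returns the vertex -b / (2a) as the step size."""
    l0, l1, l2 = loss_along(0.0), loss_along(mu), loss_along(2.0 * mu)
    a = (l2 - 2.0 * l1 + l0) / (2.0 * mu * mu)   # estimated curvature
    b = (4.0 * l1 - l2 - 3.0 * l0) / (2.0 * mu)  # estimated slope at t = 0
    if a <= 0.0 or b >= 0.0:   # line not convex and decreasing: safe step
        return fallback
    return -b / (2.0 * a)      # vertex of the fitted parabola
```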
In this work we advance the understanding of the fundamental limits of computation for Binary Polynomial Optimization (BPO), the problem of maximizing a given polynomial function over all binary points. Our main result identifies a novel class of BPO instances that can be solved efficiently, both from a theoretical and a computational perspective. In fact, we give a strongly polynomial-time algorithm for instances whose corresponding hypergraph is beta-acyclic. We note that the beta-acyclicity assumption is natural in several applications, including relational database schemes and the lifted multicut problem on trees. Due to the novelty of our proof technique, we obtain an algorithm that is also interesting from a practical viewpoint: it is very simple to implement, and its running time is a polynomial of very low degree in the number of nodes and edges of the hypergraph. Our result completely settles the computational complexity of BPO over acyclic hypergraphs, since the problem is NP-hard on alpha-acyclic instances. Our algorithm can also be applied to any general BPO problem that contains beta-cycles. For such problems, the algorithm returns a smaller instance together with a rule to extend any optimal solution of the smaller instance to an optimal solution of the original instance.
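To pin down the problem itself (not the paper's algorithm), here is an exponential-time reference implementation of BPO; `terms` encodes the polynomial as (coefficient, index-set) pairs, an encoding chosen only for this sketch. The point of the beta-acyclic algorithm is precisely to avoid this enumeration.

```python
from itertools import product

def bpo_brute_force(terms, n):
    """Maximize sum of c * prod(x[i] for i in S) over x in {0, 1}^n.
    A monomial contributes its coefficient exactly when all its variables
    are set to 1 (an empty index set is a constant term)."""
    best = None
    for x in product((0, 1), repeat=n):
        val = sum(c for c, S in terms if all(x[i] for i in S))
        best = val if best is None else max(best, val)
    return best
```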
A \emph{beer graph} is an undirected graph $G$, in which each edge has a positive weight and some vertices have a beer store. A \emph{beer path} between two vertices $u$ and $v$ in $G$ is any path in $G$ between $u$ and $v$ that visits at least one beer store. We show that any outerplanar beer graph $G$ with $n$ vertices can be preprocessed in $O(n)$ time into a data structure of size $O(n)$, such that for any two query vertices $u$ and $v$, (i) the weight of the shortest beer path between $u$ and $v$ can be reported in $O(\alpha(n))$ time (where $\alpha(n)$ is the inverse Ackermann function), and (ii) the shortest beer path between $u$ and $v$ can be reported in $O(L)$ time, where $L$ is the number of vertices on this path. Both results are optimal, even when $G$ is a beer tree (i.e., a beer graph whose underlying graph is a tree).
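For contrast with the $O(\alpha(n))$-query structure, note that shortest beer path weights decompose over beer stores, which yields the simple (but query-inefficient) baseline below; the function name is illustrative and edge weights are assumed to sit in a 'weight' attribute.

```python
import networkx as nx

def beer_distance(G, beer_stores, u, v):
    """Baseline: the shortest beer path weight equals
    min over stores b of d(u, b) + d(b, v),
    computable with two single-source Dijkstra runs."""
    inf = float("inf")
    du = nx.single_source_dijkstra_path_length(G, u)  # uses 'weight' attr
    dv = nx.single_source_dijkstra_path_length(G, v)
    return min(du.get(b, inf) + dv.get(b, inf) for b in beer_stores)
```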
Search in social networks such as Facebook poses different challenges than classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. The searcher's social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in web search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to the Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system built to serve embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences in end-to-end optimization of the whole system, including ANN parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced modeling topics. We evaluated EBR on verticals of Facebook Search, with significant metric gains observed in online A/B experiments. We believe this paper will provide useful insights and experiences to help people develop embedding-based retrieval systems in search engines.
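As a minimal sketch of the retrieval-side scoring (the unified embedding model and the ANN serving stack are beyond an abstract-level example), embedding-based retrieval reduces at query time to nearest-neighbor search in the embedding space; the brute-force cosine top-k below stands in for the ANN index.

```python
import numpy as np

def retrieve_top_k(query_vec, doc_vecs, k=10):
    """Score documents by cosine similarity to the query embedding and
    return the indices of the top k. Production systems replace this
    exhaustive scan with an approximate nearest-neighbor (ANN) index."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]
```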
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than that of solving the overall original problem, in which one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and improved computational complexity compared to existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
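Purely as a schematic of the matrix-based idea (not the paper's actual estimator), side information can be thought of as a per-sample relevance weight that biases a second-moment matrix toward the target component; the weighting and the read-off below are illustrative assumptions.

```python
import numpy as np

def weighted_component_direction(X, w):
    """X: (num_samples, dim) data; w: nonnegative side-information scores,
    larger when a sample more likely comes from the target component.
    Returns the top eigenvector of the w-weighted second-moment matrix."""
    p = w / w.sum()
    M = (X * p[:, None]).T @ X        # sum_i p_i x_i x_i^T
    vals, vecs = np.linalg.eigh(M)    # symmetric eigendecomposition
    return vecs[:, -1]                # top eigenvector as the estimate
```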