Given a set $S$ of $n$ points in $\mathbb{R}^d$, a $k$-set is a subset of $k$ points of $S$ that can be strictly separated by a hyperplane from the remaining $n-k$ points. Similarly, one may consider $k$-facets, which are hyperplanes that pass through $d$ points of $S$ and have $k$ points on one side. A notorious open problem is to determine the asymptotics of the maximum number of $k$-sets. In this paper we study a variation on the $k$-set/$k$-facet problem with hyperplanes replaced by algebraic surfaces. In stark contrast to the original $k$-set/$k$-facet problem, there are some natural families of algebraic curves for which the number of $k$-facets can be counted exactly. For example, we show that the number of halving conic sections for any set of $2n+5$ points in general position in the plane is $2\binom{n+2}{2}^2$. To understand the limits of our argument we study a class of maps we call \emph{generally neighborly embeddings}, which map generic point sets into neighborly position. Additionally, we give a simple argument which improves the best known bound on the number of $k$-sets/$k$-facets for point sets in convex position.
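As a concrete illustration of the planar case of these definitions (not the paper's counting argument), the following brute-force sketch counts $k$-facets ($k$-edges) of a point set in general position by checking, for every pair of points, how many of the remaining points lie strictly on each side of their connecting line; the function names are ours.

```python
from itertools import combinations

def cross(o, a, b):
    """Orientation test: > 0 iff b lies strictly to the left of the directed line o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def count_k_edges(points, k):
    """Count unordered pairs {p, q} whose connecting line has exactly k of the
    remaining points strictly on one of its two sides (a planar k-facet).
    Assumes general position (no three points collinear); each pair is counted once."""
    count = 0
    for p, q in combinations(points, 2):
        left = sum(1 for r in points
                   if r is not p and r is not q and cross(p, q, r) > 0)
        right = len(points) - 2 - left  # no other point lies on the line pq
        if k in (left, right):
            count += 1
    return count

# Example: the unit square has exactly 2 halving edges (its diagonals):
# count_k_edges([(0, 0), (1, 0), (1, 1), (0, 1)], 1)  # -> 2
```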
Given a fixed finite metric space $(V,\mu)$, the {\em minimum $0$-extension problem}, denoted as ${\tt 0\mbox{-}Ext}[\mu]$, is equivalent to the following optimization problem: minimize a function of the form $f(x)=\sum_i f_i(x_i) + \sum_{ij}c_{ij}\mu(x_i,x_j)$ over $x\in V^n$, where $c_{ij},c_{vi}$ are given nonnegative costs and $f_i:V\rightarrow \mathbb R$ are functions given by $f_i(x_i)=\sum_{v\in V}c_{vi}\mu(x_i,v)$. The computational complexity of ${\tt 0\mbox{-}Ext}[\mu]$ was recently established by Karzanov and by Hirai: if the metric $\mu$ is {\em orientable modular} then ${\tt 0\mbox{-}Ext}[\mu]$ can be solved in polynomial time; otherwise ${\tt 0\mbox{-}Ext}[\mu]$ is NP-hard. To prove the tractability part, Hirai developed a theory of discrete convex functions on orientable modular graphs generalizing several known classes of functions in discrete convex analysis, such as $L^\natural$-convex functions. We consider a more general version of the problem in which the unary functions $f_i(x_i)$ may additionally contain terms of the form $c_{uv;i}\mu(x_i,\{u,v\})$ for $\{u,v\}\in F$, where the set $F\subseteq\binom{V}{2}$ is fixed. We extend the complexity classification above by providing an explicit condition on $(\mu,F)$ under which the problem is tractable. To prove the tractability part, we generalize Hirai's theory and define a larger class of discrete convex functions. It covers, in particular, another well-known class of functions, namely submodular functions on an integer lattice. Finally, we improve the complexity of Hirai's algorithm for solving ${\tt 0\mbox{-}Ext}[\mu]$ on orientable modular graphs.
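To make the objective concrete, here is a minimal brute-force sketch (not Hirai's polynomial-time algorithm): it evaluates $\sum_i f_i(x_i)+\sum_{ij}c_{ij}\mu(x_i,x_j)$ with $f_i(x_i)=\sum_v c_{vi}\mu(x_i,v)$ and minimizes it by exhaustive search over $V^n$. The data structures (`mu` as a nested dict, `c_unary`, `c_pair`) are illustrative choices, not from the paper.

```python
from itertools import product

def zero_ext_value(mu, c_unary, c_pair, x):
    """Objective of 0-Ext[mu] at a labeling x = (x_1, ..., x_n):
    sum_i f_i(x_i) + sum_{i<j} c_ij * mu(x_i, x_j),
    where f_i(x_i) = sum_v c_{v,i} * mu(x_i, v)."""
    n = len(x)
    unary = sum(c_unary[i].get(v, 0) * mu[x[i]][v]
                for i in range(n) for v in mu)
    pairwise = sum(c_pair.get((i, j), 0) * mu[x[i]][x[j]]
                   for i in range(n) for j in range(i + 1, n))
    return unary + pairwise

def solve_brute_force(mu, c_unary, c_pair, n):
    """Exhaustive minimization over V^n (exponential; tiny instances only)."""
    V = list(mu)
    return min(product(V, repeat=n),
               key=lambda x: zero_ext_value(mu, c_unary, c_pair, x))
```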
The well-known middle levels conjecture asserts that for every integer $n\geq 1$, all binary strings of length $2(n+1)$ with exactly $n+1$ zeros and $n+1$ ones can be ordered cyclically so that any two consecutive strings differ in swapping the first bit with a complementary bit at some later position. In his book `The Art of Computer Programming Vol.~4A', Knuth raised a stronger form of this conjecture (Problem 56 in Chapter 7, Section 2.1.3), which requires that the sequence of positions with which the first bit is swapped in each step of such an ordering consists of $2n+1$ blocks of the same length, each block obtained by adding $s=1$ (modulo $2n+1$) to the previous block. In this work, we prove Knuth's conjecture in a more general form, allowing for arbitrary shifts $s\geq 1$ that are coprime to $2n+1$. We also present an algorithm to compute this ordering, generating each new bitstring in $\mathcal{O}(n)$ time and using $\mathcal{O}(n)$ memory in total.
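For readers who want to experiment, the following sketch checks the basic cyclic property underlying the conjecture: that consecutive strings in a given ordering of all binary strings of length $2(n+1)$ with exactly $n+1$ ones differ by swapping the first bit with a complementary bit at a later position. It does not check Knuth's stronger block condition, and the helper names are ours.

```python
from itertools import combinations

def is_star_transposition_step(a, b):
    """True iff b is obtained from a by swapping the first bit with a
    complementary bit at some later position."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return len(diff) == 2 and diff[0] == 0 and a[0] != a[diff[1]]

def verify_ordering(strings, n):
    """Check that `strings` is a cyclic star-transposition ordering of all
    binary strings of length 2(n+1) with exactly n+1 ones."""
    length = 2 * (n + 1)
    universe = {''.join('1' if i in ones else '0' for i in range(length))
                for ones in combinations(range(length), n + 1)}
    if set(strings) != universe or len(strings) != len(universe):
        return False
    return all(is_star_transposition_step(strings[i], strings[(i + 1) % len(strings)])
               for i in range(len(strings)))

# A valid cycle for n = 1 (length-4 strings with two ones):
# verify_ordering(["1100", "0110", "1010", "0011", "1001", "0101"], 1)  # -> True
```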
Learning constraint networks is known to require a number of membership queries exponential in the number of variables. In this paper, we learn constraint networks by asking the user partial queries; that is, we ask the user to classify assignments to subsets of the variables as positive or negative. We provide an algorithm, called QUACQ, that, given a negative example, focuses on a constraint of the target network using a number of queries logarithmic in the size of the example. The whole constraint network can then be learned with a polynomial number of partial queries. We give information-theoretic lower bounds for learning some simple classes of constraint networks and show that our generic algorithm is optimal in some cases.
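To illustrate how a negative example can be narrowed down to the scope of a single violated constraint with few partial queries, here is a simplified divide-and-conquer sketch in the spirit of that localization step (a QuickXplain-style recursion, not the actual QUACQ procedure). Here `reject(S)` stands for the user classifying as negative the restriction of the example to the variable set `S`; all names are ours.

```python
def locate_violated_scope(variables, reject):
    """Given the variables of a complete assignment that the user rejects,
    return a minimal subset of variables whose partial assignment is still
    rejected -- the scope of one violated constraint.  `reject(S)` asks a
    partial query on the restriction of the example to S.  For a scope of
    size k among n variables this uses roughly O(k log n) queries."""
    def helper(background, candidates):
        # Invariant: reject(background + candidates) is True.
        if reject(background):
            return []
        if len(candidates) == 1:
            return list(candidates)
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        need_right = helper(background + left, right)
        need_left = helper(background + tuple(need_right), left)
        return need_left + need_right
    return helper((), tuple(variables))
```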
Symmetric quantum signal processing provides a parameterized representation of a real polynomial, which can be translated into an efficient quantum circuit for performing a wide range of computational tasks on quantum computers. For a given polynomial, the parameters (called phase factors) can be obtained by solving an optimization problem. However, the cost function is non-convex and has a very complex energy landscape with numerous global and local minima. It is therefore surprising that the solution can be robustly obtained in practice, starting from a fixed initial guess $\Phi^0$ that contains no information about the input polynomial. To investigate this phenomenon, we first explicitly characterize all the global minima of the cost function. We then prove that one particular global minimum (called the maximal solution) belongs to a neighborhood of $\Phi^0$ on which the cost function is strongly convex under suitable conditions. This explains the aforementioned success of optimization algorithms and solves the open problem of finding phase factors using only standard double-precision arithmetic operations.
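To make this concrete, the sketch below writes down one common parameterization of the QSP cost; the convention, the sampling nodes, and the function names are our illustrative choices and may differ from the paper's exact setup. `qsp_response` evaluates the real part of the upper-left entry of the phase-factor product, and `qsp_cost` measures its least-squares mismatch with a target polynomial on Chebyshev nodes; up to convention, the fixed initial guess $\Phi^0$ mentioned above corresponds to $(\pi/4, 0, \ldots, 0, \pi/4)$.

```python
import numpy as np

def qsp_response(phi, x):
    """Re of the (0,0) entry of e^{i phi_0 Z} * prod_j [ W(x) e^{i phi_j Z} ],
    with W(x) = [[x, i*s], [i*s, x]], s = sqrt(1 - x^2).
    One common QSP convention; others differ by factors of 2 or signs."""
    s = np.sqrt(1.0 - x * x)
    W = np.array([[x, 1j * s], [1j * s, x]])
    U = np.diag([np.exp(1j * phi[0]), np.exp(-1j * phi[0])])
    for p in phi[1:]:
        U = U @ W @ np.diag([np.exp(1j * p), np.exp(-1j * p)])
    return U[0, 0].real

def qsp_cost(phi, target, num_nodes=None):
    """Least-squares mismatch between the QSP response and target(x) on
    Chebyshev nodes -- a non-convex cost of the kind studied in the abstract."""
    m = num_nodes or len(phi)
    nodes = np.cos((2 * np.arange(1, m + 1) - 1) * np.pi / (2 * m))
    return np.mean([(qsp_response(phi, x) - target(x)) ** 2 for x in nodes])

# Fixed initial guess of the form mentioned in the abstract (up to convention):
# phi0 = np.array([np.pi / 4] + [0.0] * (d - 1) + [np.pi / 4])
```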
For a graph whose vertex set is a finite set of points in $\mathbb R^d$, consider the closed (open) balls whose diameters are the edges of the graph. The graph is called a (an open) Tverberg graph if these closed (open) balls have a common point. Using the idea of halving lines, we show that (i) for any finite set of points in the plane, there exists a Hamiltonian cycle that is a Tverberg graph; (ii) for any $n$ red and $n$ blue points in the plane, there exists a perfect red-blue matching that is a Tverberg graph. Also, we prove that (iii) for any set of an even number of points in $\mathbb R^d$, there exists a perfect matching that is an open Tverberg graph; (iv) for any $n$ red and $n$ blue points in $\mathbb R^d$, there exists a perfect red-blue matching that is a Tverberg graph.
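As a quick numerical illustration of the definition (not part of the proofs), one can test whether the closed balls induced by the edges of a candidate matching have a common point by minimizing the worst violation $\max_i(\|p-c_i\|-r_i)$. The sketch below does this with SciPy's Nelder-Mead method; it is a heuristic check, not a certified one, and all names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def edge_balls(points, matching):
    """Closed balls whose diameters are the edges of a matching on `points`
    (each point an np.array); returns (center, radius) pairs."""
    return [((points[i] + points[j]) / 2.0,
             np.linalg.norm(points[i] - points[j]) / 2.0)
            for i, j in matching]

def have_common_point(balls, tol=1e-9):
    """Heuristically test whether a family of closed balls has a common point
    by minimizing p -> max_i (||p - c_i|| - r_i) from the centroid of centers."""
    centers = np.array([c for c, _ in balls], dtype=float)
    radii = np.array([r for _, r in balls], dtype=float)
    worst = lambda p: np.max(np.linalg.norm(centers - p, axis=1) - radii)
    res = minimize(worst, centers.mean(axis=0), method="Nelder-Mead")
    return bool(res.fun <= tol)

# A matching is (numerically) a Tverberg graph if
# have_common_point(edge_balls(points, matching)) returns True.
```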
We present a uniform description of sets of $m$ linear forms in $n$ variables over the field of rational numbers whose computation requires $m(n - 1)$ additions.
We study the problem of finding approximate first-order stationary points in optimization problems of the form $\min_{x \in X} \max_{y \in Y} f(x,y)$, where the sets $X,Y$ are convex and $Y$ is compact. The objective function $f$ is smooth, but assumed neither convex in $x$ nor concave in $y$. Our approach relies on replacing the function $f(x,\cdot)$ with its $k$th-order Taylor approximation (in $y$) and finding a near-stationary point of the resulting surrogate problem. To guarantee its success, we establish the following result: let the Euclidean diameter of $Y$ be small in terms of the target accuracy $\varepsilon$, namely $O(\varepsilon^{\frac{2}{k+1}})$ for $k \in \mathbb{N}$ and $O(\varepsilon)$ for $k = 0$, with the constant factors controlled by certain regularity parameters of $f$; then any $\varepsilon$-stationary point of the surrogate problem remains $O(\varepsilon)$-stationary for the original problem. Moreover, we show that these upper bounds are nearly optimal: the aforementioned reduction provably fails when the diameter of $Y$ is larger. For $0 \le k \le 2$ the surrogate function can be efficiently maximized in $y$; our general approximation result then yields efficient algorithms for finding a near-stationary point in nonconvex-nonconcave min-max problems, for which we also provide convergence guarantees.
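For intuition, here is what the inner maximization looks like in the simplest nontrivial case $k=1$ when $Y$ is a Euclidean ball: the first-order Taylor model of $f(x,\cdot)$ around the center can be maximized in closed form. This is only an illustration of the surrogate construction under these assumptions (ball-shaped $Y$, user-supplied gradient), not the paper's full algorithm or its stationarity guarantees.

```python
import numpy as np

def taylor_surrogate_max(f, grad_y, x, y0, R):
    """Maximize the k = 1 Taylor model of f(x, .) around y0 over the ball
    Y = {y : ||y - y0|| <= R}:
        max_y  f(x, y0) + <grad_y f(x, y0), y - y0>
             = f(x, y0) + R * ||grad_y f(x, y0)||,
    attained at y0 + R * g / ||g||.  Returns (surrogate value, maximizer)."""
    g = np.asarray(grad_y(x, y0), dtype=float)
    norm = np.linalg.norm(g)
    y_star = y0 if norm == 0.0 else y0 + R * g / norm
    return f(x, y0) + R * norm, y_star
```

An outer first-order method would then be run on the resulting surrogate in $x$; the abstract's guarantee is that, when the diameter of $Y$ is $O(\varepsilon^{2/(k+1)})$, near-stationary points of the surrogate problem remain $O(\varepsilon)$-stationary for the original one.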
Domain generalization (DG), i.e., out-of-distribution generalization, has attracted increasing interest in recent years. Domain generalization deals with a challenging setting in which one or several different but related domains are given, and the goal is to learn a model that can generalize to an unseen test domain. Great progress has been achieved over the years. This paper presents the first review of recent advances in domain generalization. First, we provide a formal definition of domain generalization and discuss several related fields. Next, we thoroughly review the theories related to domain generalization and carefully analyze the theory behind generalization. We then categorize recent algorithms into three classes and present them in detail: data manipulation, representation learning, and learning strategy, each of which contains several popular algorithms. After that, we introduce the commonly used datasets and applications. Finally, we summarize the existing literature and present some potential research topics for the future.
Generative Adversarial Nets (GANs) have received considerable attention since the 2014 groundbreaking work by Goodfellow et al. Such attention has led to an explosion of new ideas, techniques and applications of GANs. To better understand GANs we need to understand the mathematical foundation behind them. This paper attempts to provide an overview of GANs from a mathematical point of view. Many students in mathematics may find papers on GANs difficult to fully understand because most of them are written from a computer science and engineering point of view. The aim of this paper is to give more mathematically oriented students an introduction to GANs in a language that is more familiar to them.
Two types of knowledge, triples from knowledge graphs and texts from documents, have been studied for knowledge-aware open-domain conversation generation: graph paths can narrow down vertex candidates for the knowledge selection decision, while texts can provide rich information for response generation. Fusing a knowledge graph with texts might therefore yield mutually reinforcing advantages, but this has received little study. To address this challenge, we propose a knowledge-aware chatting machine with three components: an augmented knowledge graph containing both triples and texts, a knowledge selector, and a knowledge-aware response generator. We formulate knowledge selection on the graph as a multi-hop graph reasoning problem so as to effectively capture the conversation flow, which is more explainable and flexible than previous work. To fully leverage the long-text information that differentiates our graph from others, we improve a state-of-the-art reasoning algorithm with machine reading comprehension technology. We demonstrate the effectiveness of our system on two datasets in comparison with state-of-the-art models.