A cap set is a subset of $\mathbb{F}_3^n$ with no solutions to $x+y+z=0$ other than when $x=y=z$. In this paper, we provide a new lower bound on the size of a maximal cap set. Building on a construction of Edel, we use improved computational methods and new theoretical ideas to show that, for large enough $n$, there is always a cap set in $\mathbb{F}_3^n$ of size at least $2.218^n$.
This is an expository note explaining how the geometric notions of local connectedness and properness are related to the $\Sigma$-type and $\Pi$-type constructors of dependent type theory.
Let $X$ be a $d$-dimensional simplicial complex. A function $F\colon X(k)\to \{0,1\}^k$ is said to be a direct product function if there exists a function $f\colon X(1)\to \{0,1\}$ such that $F(\sigma) = (f(\sigma_1), \ldots, f(\sigma_k))$ for each $k$-face $\sigma$. In an effort to simplify components of the PCP theorem, Goldreich and Safra introduced the problem of direct product testing, which asks whether one can test if $F\colon X(k)\to \{0,1\}^k$ is correlated with a direct product function by querying $F$ on only $2$ inputs. Dinur and Kaufman conjectured that there exist bounded degree complexes with a direct product test in the small soundness regime. We resolve their conjecture by showing that for all $\delta>0$, there exists a family of high-dimensional expanders with degree $O_{\delta}(1)$ and a $2$-query direct product tester with soundness $\delta$. We use the characterization given by a subset of the authors and independently by Dikstein and Dinur, who showed that some form of non-Abelian coboundary expansion (which they called "Unique-Games coboundary expansion") is a necessary and sufficient condition for a complex to admit such direct product testers. Our main technical contribution is a general technique for showing coboundary expansion of complexes with coefficients in a non-Abelian group. This allows us to prove that the high dimensional expanders constructed by Chapman and Lubotzky satisfies the necessary conditions, thus admitting a 2-query direct product tester with small soundness.
This paper is concerned with sampling from probability distributions $\pi$ on $\mathbb{R}^d$ admitting a density of the form $\pi(x) \propto e^{-U(x)}$, where $U(x)=F(x)+G(Kx)$ with $K$ being a linear operator and $G$ being non-differentiable. Two different methods are proposed, both employing a subgradient step with respect to $G\circ K$, but, depending on the regularity of $F$, either an explicit or an implicit gradient step with respect to $F$ can be implemented. For both methods, non-asymptotic convergence proofs are provided, with improved convergence results for more regular $F$. Further, numerical experiments are conducted for simple 2D examples, illustrating the convergence rates, and for examples of Bayesian imaging, showing the practical feasibility of the proposed methods for high dimensional data.
A fundamental issue in the $\lambda$-calculus is to find appropriate notions for meaningfulness. It is well-known that in the call-by-name $\lambda$-calculus (CbN) the meaningful terms can be identified with the solvable ones, and that this notion is not appropriate in the call-by-value $\lambda$-calculus (CbV). This paper validates the challenging claim that yet another notion, previously introduced in the literature as potential valuability, appropriately represents meaningfulness in CbV. Akin to CbN, this claim is corroborated by proving two essential properties. The first one is genericity, stating that meaningless subterms have no bearing on evaluating normalizing terms. To prove this (which was an open problem), we use a novel approach based on stratified reduction, indifferently applicable to CbN and CbV, and in a quantitative way. The second property concerns consistency of the smallest congruence relation resulting from equating all meaningless terms. While the consistency result is not new, we provide the first direct operational proof of it. We also show that such a congruence has a unique consistent and maximal extension, which coincides with a well-known notion of observational equivalence. Our results thus supply the formal concepts and tools that validate the informal notion of meaningfulness underlying CbN and CbV.
In Linear Logic ($\mathsf{LL}$), the exponential modality $!$ brings forth a distinction between non-linear proofs and linear proofs, where linear means using an argument exactly once. Differential Linear Logic ($\mathsf{DiLL}$) is an extension of Linear Logic which includes additional rules for $!$ which encode differentiation and the ability of linearizing proofs. On the other hand, Graded Linear Logic ($\mathsf{GLL}$) is a variation of Linear Logic in such a way that $!$ is now indexed over a semiring $R$. This $R$-grading allows for non-linear proofs of degree $r \in R$, such that the linear proofs are of degree $1 \in R$. There has been recent interest in combining these two variations of $\mathsf{LL}$ together and developing Graded Differential Linear Logic ($\mathsf{GDiLL}$). In this paper we present a sequent calculus for $\mathsf{GDiLL}$, as well as introduce its categorical semantics, which we call graded differential categories, using both coderelictions and deriving transformations. We prove that symmetric powers always give graded differential categories, and provide other examples of graded differential categories. We also discuss graded versions of (monoidal) coalgebra modalities, additive bialgebra modalities, and the Seely isomorphisms, as well as their implementations in the sequent calculus of $\mathsf{GDiLL}$.
If $G$ is a group, we say a subset $S$ of $G$ is product-free if the equation $xy=z$ has no solutions with $x,y,z \in S$. For $D \in \mathbb{N}$, a group $G$ is said to be $D$-quasirandom if the minimal dimension of a nontrivial complex irreducible representation of $G$ is at least $D$. Gowers showed that in a $D$-quasirandom finite group $G$, the maximal size of a product-free set is at most $|G|/D^{1/3}$. This disproved a longstanding conjecture of Babai and S\'os from 1985. For the special unitary group, $G=SU(n)$, Gowers observed that his argument yields an upper bound of $n^{-1/3}$ on the measure of a measurable product-free subset. In this paper, we improve Gowers' upper bound to $\exp(-cn^{1/3})$, where $c>0$ is an absolute constant. In fact, we establish something stronger, namely, product-mixing for measurable subsets of $SU(n)$ with measure at least $\exp(-cn^{1/3})$; for this product-mixing result, the $n^{1/3}$ in the exponent is sharp. Our approach involves introducing novel hypercontractive inequalities, which imply that the non-Abelian Fourier spectrum of the indicator function of a small set concentrates on high-dimensional irreducible representations. Our hypercontractive inequalities are obtained via methods from representation theory, harmonic analysis, random matrix theory and differential geometry. We generalize our hypercontractive inequalities from $SU(n)$ to an arbitrary $D$-quasirandom compact connected Lie group for $D$ at least an absolute constant, thereby extending our results on product-free sets to such groups. We also demonstrate various other applications of our inequalities to geometry (viz., non-Abelian Brunn-Minkowski type inequalities), mixing times, and the theory of growth in compact Lie groups.
In open-set recognition (OSR), a promising strategy is exploiting pseudo-unknown data outside given $K$ known classes as an additional $K$+$1$-th class to explicitly model potential open space. However, treating unknown classes without distinction is unequal for them relative to known classes due to the category-agnostic and scale-agnostic of the unknowns. This inevitably not only disrupts the inherent distributions of unknown classes but also incurs both class-wise and instance-wise imbalances between known and unknown classes. Ideally, the OSR problem should model the whole class space as $K$+$\infty$, but enumerating all unknowns is impractical. Since the core of OSR is to effectively model the boundaries of known classes, this means just focusing on the unknowns nearing the boundaries of targeted known classes seems sufficient. Thus, as a compromise, we convert the open classes from infinite to $K$, with a novel concept Target-Aware Universum (TAU) and propose a simple yet effective framework Dual Contrastive Learning with Target-Aware Universum (DCTAU). In details, guided by the targeted known classes, TAU automatically expands the unknown classes from the previous $1$ to $K$, effectively alleviating the distribution disruption and the imbalance issues mentioned above. Then, a novel Dual Contrastive (DC) loss is designed, where all instances irrespective of known or TAU are considered as positives to contrast with their respective negatives. Experimental results indicate DCTAU sets a new state-of-the-art.
The Ulam distance of two permutations on $[n]$ is $n$ minus the length of their longest common subsequence. In this paper, we show that for every $\varepsilon>0$, there exists some $\alpha>0$, and an infinite set $\Gamma\subseteq \mathbb{N}$, such that for all $n\in\Gamma$, there is an explicit set $C_n$ of $(n!)^{\alpha}$ many permutations on $[n]$, such that every pair of permutations in $C_n$ has pairwise Ulam distance at least $(1-\varepsilon)\cdot n$. Moreover, we can compute the $i^{\text{th}}$ permutation in $C_n$ in poly$(n)$ time and can also decode in poly$(n)$ time, a permutation $\pi$ on $[n]$ to its closest permutation $\pi^*$ in $C_n$, if the Ulam distance of $\pi$ and $\pi^*$ is less than $ \frac{(1-\varepsilon)\cdot n}{4} $. Previously, it was implicitly known by combining works of Goldreich and Wigderson [Israel Journal of Mathematics'23] and Farnoud, Skachek, and Milenkovic [IEEE Transactions on Information Theory'13] in a black-box manner, that it is possible to explicitly construct $(n!)^{\Omega(1)}$ many permutations on $[n]$, such that every pair of them have pairwise Ulam distance at least $\frac{n}{6}\cdot (1-\varepsilon)$, for any $\varepsilon>0$, and the bound on the distance can be improved to $\frac{n}{4}\cdot (1-\varepsilon)$ if the construction of Goldreich and Wigderson is directly analyzed in the Ulam metric.
Earlier papers \cite{VB2022,VB2023a} introduced the notions of a core and an index of a relation (an index being a special case of a core). A limited form of the axiom of choice was postulated -- specifically that all partial equivalence relations (pers) have an index -- and the consequences of adding the axiom to axiom systems for point-free reasoning were explored. In this paper, we define a partial ordering on relations, which we call the \textsf{thins} ordering. We show that our axiom of choice is equivalent to the property that core relations are the minimal elements of the \textsf{thins} ordering. We also postulate a novel axiom that guarantees that, when \textsf{thins} is restricted to non-empty pers, equivalence relations are maximal. This and other properties of \textsf{thins} provide further evidence that our axiom of choice is a desirable means of strengthening point-free reasoning on relations. Although our novel axiom is valid for concrete relations and is a sufficient condition for characterising maximality, we show that it is not a necessary condition in the abstract point-free algebra. This leaves open the problem of deriving a necessary and sufficient condition.
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax