Kalai's $3^d$-conjecture states that every centrally symmetric $d$-polytope has at least $3^d$ faces. We give short proofs for two special cases: if $P$ is unconditional (that is, invariant with respect to reflection in every coordinate hyperplane), and more generally, if $P$ is locally anti-blocking (that is, looks like an unconditional polytope in every orthant). In both cases we show that the minimum is attained exactly for the Hanner polytopes.
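The bound is tight for the $d$-cube, the simplest Hanner polytope: each nonempty face fixes some coordinates to $\pm 1$ and leaves the rest free, so there are exactly $3^d$ of them. A minimal sketch verifying this count (the function name is ours, for illustration):

```python
from math import comb

def cube_face_count(d):
    # Nonempty faces of the d-cube: a k-dimensional face fixes d-k
    # coordinates to +1 or -1 and leaves k coordinates free, giving
    # C(d, k) * 2^(d-k) faces of dimension k. Summing over k yields
    # 3^d by the binomial theorem.
    return sum(comb(d, k) * 2 ** (d - k) for k in range(d + 1))

for d in range(1, 7):
    assert cube_face_count(d) == 3 ** d
```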
I propose an alternative algorithm to compute the MMS voting rule. Instead of relying on linear programming, the new algorithm computes the maximin support value of a committee by solving a sequence of maximum flow problems.
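To illustrate the reduction, here is a minimal, dependency-free sketch (not the paper's algorithm): checking whether every committee member can receive support at least $t$ is a max-flow feasibility problem (source to each voter with capacity equal to the voter's weight, voters to approved committee members with infinite capacity, members to sink with capacity $t$), and the maximin support value can then be located by binary search over $t$. All function names here are ours.

```python
from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp on a dict-of-dicts capacity graph (modified in place).
    flow = 0.0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:        # walk back from sink to source
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in path)
        for u, v in path:                   # push flow, update residuals
            cap[u][v] -= aug
            cap.setdefault(v, {}).setdefault(u, 0.0)
            cap[v][u] += aug
        flow += aug

def maximin_support(ballots, committee, iters=40):
    # ballots: list of sets of approved candidates (unit voter weights).
    # Support level t is feasible iff the max flow equals t * |committee|.
    lo, hi = 0.0, float(len(ballots))
    for _ in range(iters):
        t = (lo + hi) / 2
        cap = {'s': {('v', i): 1.0 for i in range(len(ballots))}}
        for i, b in enumerate(ballots):
            cap[('v', i)] = {('c', c): float('inf') for c in b & committee}
        for c in committee:
            cap[('c', c)] = {'t': t}
        if max_flow(cap, 's', 't') >= t * len(committee) - 1e-9:
            lo = t
        else:
            hi = t
    return lo
```

For example, with two voters approving only $a$ and one approving only $b$, the committee $\{a, b\}$ has maximin support $1$: candidate $b$ can never receive more than one unit.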
The action of a noise operator on a code transforms it into a distribution on the respective space. Some common examples from information theory include Bernoulli noise acting on a code in the Hamming space and Gaussian noise acting on a lattice in the Euclidean space. We aim to characterize the cases when the output distribution is close to the uniform distribution on the space, as measured by R\'enyi divergence of order $\alpha \in (1,\infty]$. A version of this question is known as the channel resolvability problem in information theory, and it has implications for security guarantees in wiretap channels, error correction, discrepancy, worst-to-average case complexity reductions, and many other problems. Our work quantifies the requirements for asymptotic uniformity (perfect smoothing) and identifies explicit code families that achieve it under the action of the Bernoulli and ball noise operators on the code. We derive expressions for the minimum rate of codes required to attain asymptotically perfect smoothing. In proving our results, we leverage recent results from harmonic analysis of functions on the Hamming space. Another result pertains to the use of code families in Wyner's transmission scheme on the binary wiretap channel. We identify explicit families that guarantee strong secrecy when applied in this scheme, showing that nested Reed-Muller codes can transmit messages reliably and securely over a binary symmetric wiretap channel with a positive rate. Finally, we establish a connection between smoothing and error correction in the binary symmetric channel.
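The central quantities can be made concrete on a toy scale. The following sketch (ours, not from the paper) forms the output distribution of Bernoulli noise acting on a uniformly random codeword of a small binary code, and measures its distance to uniform by Rényi divergence of finite order $\alpha > 1$; the order $\alpha = \infty$ case would instead use $\log_2 (N \max_x p(x))$.

```python
from itertools import product
from math import log2

def bernoulli_output(code, p, n):
    # Distribution on {0,1}^n of (uniform codeword) + Bernoulli(p)^n noise.
    dist = {}
    for x in product((0, 1), repeat=n):
        mass = 0.0
        for c in code:
            d = sum(a != b for a, b in zip(x, c))  # Hamming distance
            mass += p ** d * (1 - p) ** (n - d)
        dist[x] = mass / len(code)
    return dist

def renyi_to_uniform(dist, alpha):
    # D_alpha(P || Uniform) = (1/(alpha-1)) * log2(N^(alpha-1) * sum_x p(x)^alpha)
    # for finite alpha > 1; equals 0 iff P is exactly uniform.
    N = len(dist)
    s = sum(q ** alpha for q in dist.values())
    return (1 / (alpha - 1)) * log2(N ** (alpha - 1) * s)
```

With $p = 1/2$ the output is exactly uniform and the divergence vanishes for any code; with small $p$ and a sparse code it stays bounded away from zero, which is the regime smoothing bounds must overcome.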
Angular Minkowski $p$-distance is a dissimilarity measure obtained by replacing Euclidean distance in the definition of cosine dissimilarity with other Minkowski $p$-distances. Cosine dissimilarity is frequently used with datasets containing token frequencies, and angular Minkowski $p$-distance may be an even better choice for certain tasks. In a case study based on the 20-newsgroups dataset, we evaluate classification performance for classical weighted nearest neighbours, as well as fuzzy rough nearest neighbours. In addition, we analyse the relationship between the hyperparameter $p$, the dimensionality $m$ of the dataset, the number of neighbours $k$, the choice of weights and the choice of classifier. We conclude that it is possible to obtain substantially higher classification performance with angular Minkowski $p$-distance with suitable values for $p$ than with classical cosine dissimilarity.
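One natural reading of the construction, shown below as an illustrative sketch (the paper's exact definition may differ in details such as the choice of normalisation): normalise each vector by its Minkowski $p$-norm, then take the Minkowski $p$-distance between the normalised vectors. For $p = 2$ this recovers the chord distance, which is monotone in cosine dissimilarity.

```python
def minkowski(x, y, p):
    # Minkowski p-distance between equal-length numeric sequences.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def angular_minkowski(x, y, p):
    # Sketch: scale each vector to unit p-norm, then compare with the
    # p-distance, so that only the "direction" of the vectors matters.
    nx = sum(abs(a) ** p for a in x) ** (1 / p)
    ny = sum(abs(b) ** p for b in y) ** (1 / p)
    return minkowski([a / nx for a in x], [b / ny for b in y], p)
```

In particular, collinear vectors are at distance zero regardless of magnitude, mirroring the scale invariance that makes cosine dissimilarity attractive for token-frequency data.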
Dynamic Ensemble Selection (DES) is a Multiple Classifier Systems (MCS) approach that aims to select an ensemble for each query sample during the selection phase. Although many DES approaches have been proposed, no single DES technique is the best choice across different problems. Thus, we hypothesize that selecting the best DES approach per query instance can lead to better accuracy. To evaluate this idea, we introduce the Post-Selection Dynamic Ensemble Selection (PS-DES) approach, a post-selection scheme that evaluates ensembles selected by several DES techniques using different metrics. Experimental results show that using accuracy as the metric to select the ensembles, PS-DES performs better than individual DES techniques. PS-DES source code is available in a GitHub repository.
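The post-selection idea can be sketched in a few lines. This is a hypothetical toy version, not the released PS-DES code: each DES technique proposes an ensemble for the query, each proposal is scored by a metric (here, accuracy on a local validation region), and the best-scoring ensemble casts the final vote.

```python
from collections import Counter

def majority_vote(ensemble, x):
    # Each classifier is a callable mapping a sample to a label.
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]

def accuracy(ensemble, region):
    # region: list of (sample, label) pairs used to score an ensemble.
    return sum(majority_vote(ensemble, x) == y for x, y in region) / len(region)

def ps_des_predict(query, des_techniques, region):
    # Post-selection sketch: every DES technique proposes an ensemble
    # for the query; the proposal with the best local accuracy wins.
    best = max((des(query) for des in des_techniques),
               key=lambda ens: accuracy(ens, region))
    return majority_vote(best, query)
```

In the paper the scoring metric itself is a design choice; accuracy is the one reported to work best.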
This article presents two new algebraic algorithms to perform fast matrix-vector products for $N$-body problems in $d$ dimensions, namely nHODLR$d$D (nested algorithm) and s-nHODLR$d$D (semi-nested or partially nested algorithm). The nHODLR$d$D and s-nHODLR$d$D algorithms are the nested and semi-nested versions, respectively, of our previously proposed fast algorithm, the hierarchically off-diagonal low-rank matrix in $d$ dimensions (HODLR$d$D), where the admissible clusters are certain far-field clusters and the vertex-sharing clusters. We rely on algebraic low-rank approximation techniques (ACA and NCA) and develop both algorithms in a black-box (kernel-independent) fashion. The initialization time of the proposed hierarchical structures scales quasi-linearly. Using the nHODLR$d$D and s-nHODLR$d$D hierarchical structures, one can multiply a dense matrix (arising out of $N$-body problems) with a vector in $\mathcal{O}(pN)$ and $\mathcal{O}(pN \log(N))$ time, respectively, where $p$ grows at most polylogarithmically with $N$. The numerical results in $2$D and $3$D $(d=2,3)$ show that the proposed nHODLR$d$D algorithm is competitive with the algebraic Fast Multipole Method in $d$ dimensions with respect to matrix-vector product time and space complexity. The C++ implementation with OpenMP parallelization of the proposed algorithms is available at \url{//github.com/riteshkhan/nHODLRdD/}.
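The algebraic compression the algorithms build on can be illustrated with a minimal, dependency-free sketch of partially pivoted ACA; the repository's C++ implementation is of course far more elaborate, and this toy version omits the usual stopping heuristics based on Frobenius-norm estimates.

```python
def aca(get_entry, m, n, tol=1e-8, max_rank=30):
    # Partially pivoted adaptive cross approximation: builds a low-rank
    # factorisation A ~ sum_k u_k v_k^T from individual matrix entries,
    # without ever forming the full m x n block.
    us, vs = [], []
    i, used_rows = 0, set()
    for _ in range(max_rank):
        # residual of row i under the current approximation
        row = [get_entry(i, j) - sum(u[i] * v[j] for u, v in zip(us, vs))
               for j in range(n)]
        j = max(range(n), key=lambda c: abs(row[c]))  # column pivot
        if abs(row[j]) < tol:
            break
        v = [x / row[j] for x in row]                 # scaled pivot row
        col = [get_entry(r, j) - sum(u[r] * vv[j] for u, vv in zip(us, vs))
               for r in range(m)]                     # residual pivot column
        us.append(col)
        vs.append(v)
        used_rows.add(i)
        # next row pivot: largest entry of the new column among unused rows
        i = max((r for r in range(m) if r not in used_rows),
                key=lambda r: abs(col[r]), default=0)
    return us, vs
```

On a smooth kernel block such as $K(i,j) = 1/(1+i+j)$ the numerical rank is tiny, so a handful of crosses already reproduces the block to near machine precision.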
We identify a family of $O(|E(G)|^2)$ nontrivial facets of the connected matching polytope of a graph $G$, that is, the convex hull of incidence vectors of matchings in $G$ whose covered vertices induce a connected subgraph. Accompanying software to further inspect the polytope of an input graph is available.
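For intuition about the objects in play, here is a brute-force enumeration sketch (ours, not the accompanying software): a connected matching is a matching whose covered vertices induce a connected subgraph; by the usual convention we count the empty matching and single edges as connected.

```python
from itertools import combinations

def connected_matchings(vertices, edges):
    # Enumerate all matchings of the graph whose covered vertices
    # induce a connected subgraph. Exponential time; small graphs only.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def induced_connected(S):
        if len(S) <= 1:
            return True
        start = next(iter(S))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in (adj[u] & S) - seen:
                seen.add(w)
                stack.append(w)
        return seen == S

    result = []
    for k in range(len(edges) + 1):
        for M in combinations(edges, k):
            covered = {v for e in M for v in e}
            if len(covered) == 2 * k and induced_connected(covered):
                result.append(M)
    return result
```

On the path $1\text{-}2\text{-}3\text{-}4$, for instance, the connected matchings are the empty matching, the three single edges, and $\{12, 34\}$ (whose covered vertices induce the whole, connected, path).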
In this paper, we provide a geometric interpretation of the structure of Deep Learning (DL) networks, characterized by $L$ hidden layers, a ramp activation function, an ${\mathcal L}^2$ Schatten class (or Hilbert-Schmidt) cost function, and input and output spaces ${\mathbb R}^Q$ with equal dimension $Q\geq1$. The hidden layers are likewise defined on ${\mathbb R}^{Q}$. We apply our recent results on shallow neural networks to construct an explicit family of minimizers for the global minimum of the cost function in the case $L\geq Q$, which we show to be degenerate. In the context presented here, the hidden layers of the DL network "curate" the training inputs by recursive application of a truncation map that minimizes the noise-to-signal ratio of the training inputs. Moreover, we determine a set of $2^Q-1$ distinct degenerate local minima of the cost function.
The $2$-adic complexity has been well-analyzed in the periodic case. However, we are not aware of any theoretical results on the $N$th $2$-adic complexity of any promising candidate for a pseudorandom sequence of finite length $N$, or on a length-$N$ part of the period of a periodic sequence. Here we introduce the first method for this aperiodic case. More precisely, we study the relation between the $N$th maximum-order complexity and the $N$th $2$-adic complexity of binary sequences and prove a lower bound on the $N$th $2$-adic complexity in terms of the $N$th maximum-order complexity. Then any known lower bound on the $N$th maximum-order complexity implies a lower bound on the $N$th $2$-adic complexity of the same order of magnitude. In the periodic case, one can prove a slightly better result. The latter bound is sharp, as illustrated by the maximum-order complexity of $\ell$-sequences. The idea of the proof helps us to characterize the maximum-order complexity of periodic sequences in terms of the unique rational number defined by the sequence. We also show that a periodic sequence of maximal maximum-order complexity must also have maximal $2$-adic complexity.
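For concreteness, the $N$th maximum-order complexity of a finite binary sequence is the smallest $M$ such that some (not necessarily linear) feedback function of the previous $M$ bits generates the sequence, equivalently, the smallest $M$ for which no two equal length-$M$ windows are followed by different bits. A brute-force sketch of that window characterisation (our illustration, not the paper's method):

```python
def max_order_complexity(s):
    # Smallest M such that every length-M window of s determines the
    # next symbol, i.e. no two equal length-M substrings are followed
    # by different bits. Brute force: O(n^2) windows in the worst case.
    n = len(s)
    for M in range(n):
        successor = {}
        consistent = True
        for i in range(n - M):
            w = tuple(s[i:i + M])
            if w in successor and successor[w] != s[i + M]:
                consistent = False
                break
            successor[w] = s[i + M]
        if consistent:
            return M
    return n
```

For example, the alternating sequence $0101\ldots$ has maximum-order complexity $1$ (each bit determines the next), while $001$ already requires $M = 2$.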
We prove tight bounds on the site percolation threshold for $k$-uniform hypergraphs of maximum degree $\Delta$ and for $k$-uniform hypergraphs of maximum degree $\Delta$ in which any pair of edges overlaps in at most $r$ vertices. The hypergraphs that achieve these bounds are hypertrees, but unlike in the case of graphs, there are many different $k$-uniform, $\Delta$-regular hypertrees. Determining the extremal tree for a given $k, \Delta, r$ involves an optimization problem, and our bounds arise from a convex relaxation of this problem. By combining our percolation bounds with the method of disagreement percolation we obtain improved bounds on the uniqueness threshold for the hard-core model on hypergraphs satisfying the same constraints. Our uniqueness conditions imply exponential weak spatial mixing, and go beyond the known bounds for rapid mixing of local Markov chains and existence of efficient approximate counting and sampling algorithms. Our results lead to natural conjectures regarding the aforementioned algorithmic tasks, based on the intuition that uniqueness thresholds for the extremal hypertrees for percolation determine computational thresholds.
In recent years, object detection has experienced impressive progress. Despite these improvements, there is still a significant gap between the detection performance on small and large objects. We analyze the current state-of-the-art model, Mask-RCNN, on a challenging dataset, MS COCO. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) even within those images, small objects do not appear often enough. We thus propose to oversample images with small objects and augment each of those images by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects with that on small objects. We evaluate different pasting augmentation strategies, and ultimately, we achieve a 9.7\% relative improvement on instance segmentation and 7.1\% on object detection of small objects, compared to the current state-of-the-art method on MS COCO.
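The copy-paste idea can be sketched on raw pixel arrays. This is a hypothetical toy version only: the actual augmentation operates on segmentation masks and avoids pasting over existing objects, neither of which this sketch models.

```python
import random

def paste_small_objects(image, objects, n_copies=3, rng=random):
    # Sketch of copy-paste augmentation: each small object (a 2D pixel
    # patch) is pasted n_copies times at uniformly random positions
    # chosen so the patch stays entirely inside the image.
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]        # leave the input untouched
    for patch in objects:
        ph, pw = len(patch), len(patch[0])
        for _ in range(n_copies):
            y = rng.randrange(h - ph + 1)
            x = rng.randrange(w - pw + 1)
            for dy in range(ph):
                for dx in range(pw):
                    out[y + dy][x + dx] = patch[dy][dx]
    return out
```

Because placements are independent, two pastes of the same patch may overlap; a faithful implementation would reject collisions with existing annotations and update the ground-truth boxes/masks accordingly.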