We solve a long-standing open problem about the optimal codebook structure of codes in $n$-dimensional Euclidean space that consist of $n+1$ codewords subject to a codeword energy constraint, where optimality means minimizing the average decoding error probability. The conjecture states that optimal codebooks are formed by the $n+1$ vertices of a regular simplex (the $n$-dimensional generalization of a regular tetrahedron) inscribed in the unit sphere. We provide a self-contained proof of this conjecture that hinges on symmetry arguments and on a relaxation approach that jointly optimizes the codebook and the decision regions, rather than the codeword locations alone.
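To make the conjectured optimum concrete, the following sketch (a Python illustration, not part of the paper) constructs the $n+1$ vertices of a regular simplex inscribed in the unit sphere of $\mathbb{R}^n$ and checks the defining property that every codeword has unit energy and all pairwise inner products equal $-1/n$.

```python
import numpy as np

def regular_simplex(n):
    """Return an (n+1) x n array whose rows are the vertices of a
    regular simplex inscribed in the unit sphere of R^n."""
    # The n+1 standard basis vectors of R^{n+1} form a regular simplex
    # lying in the hyperplane sum(x) = 1.
    E = np.eye(n + 1)
    # Center at the origin so the simplex lives in an n-dimensional subspace.
    V = E - E.mean(axis=0)
    # Orthonormal basis of that subspace; express the vertices in it.
    Q, _ = np.linalg.qr(V.T)          # first n columns span the subspace
    coords = V @ Q[:, :n]             # (n+1) x n coordinates
    # Rescale every vertex to unit energy (unit norm).
    coords /= np.linalg.norm(coords, axis=1, keepdims=True)
    return coords

V = regular_simplex(4)
G = V @ V.T
print(np.allclose(np.diag(G), 1.0))                   # unit-energy codewords
print(np.allclose(G[~np.eye(5, dtype=bool)], -1/4))   # pairwise inner products = -1/n
```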
The Noisy Max mechanism and its variations are fundamental private selection algorithms used to select items from a set of candidates (such as the most common diseases in a population) while controlling the privacy leakage in the underlying data. A recently proposed extension, Noisy Top-k with Gap, provides numerical information about how much better the selected items are compared to the non-selected items (e.g., how much more common the selected diseases are). This extra information comes at no privacy cost, but its privacy guarantees crucially rely on infinite-precision arithmetic. In this paper, we provide a finite-precision secure implementation of this algorithm that takes advantage of integer arithmetic.
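The paper's secure implementation is not reproduced here. As a rough illustration of the ingredients, the sketch below implements a plain noisy top-$k$ selection that also reports the gaps between consecutive noisy scores, with two-sided geometric ("discrete Laplace") noise sampled using exact integer and rational arithmetic only, so no floating-point rounding enters the noise generation. The function names, the choice of noise distribution, and the parameters are illustrative assumptions rather than the paper's construction, and no privacy guarantee is claimed for this sketch.

```python
import random
from fractions import Fraction

def bernoulli(p: Fraction) -> bool:
    """Exact Bernoulli(p) for rational p in [0, 1], using integers only."""
    return random.randrange(p.denominator) < p.numerator

def bernoulli_exp(gamma: Fraction) -> bool:
    """Exact Bernoulli(exp(-gamma)) for rational gamma >= 0
    (standard rejection-style sampler, no floating point)."""
    if gamma <= 1:
        k = 1
        while bernoulli(gamma / k):
            k += 1
        return k % 2 == 1
    # exp(-gamma) = exp(-1)^floor(gamma) * exp(-(gamma - floor(gamma)))
    for _ in range(int(gamma)):
        if not bernoulli_exp(Fraction(1)):
            return False
    return bernoulli_exp(gamma - int(gamma))

def two_sided_geometric(eps: Fraction) -> int:
    """Two-sided geometric (discrete Laplace) noise with parameter eps."""
    while True:
        sign = 1 if bernoulli(Fraction(1, 2)) else -1
        mag = 0
        while bernoulli_exp(eps):
            mag += 1
        if not (sign == -1 and mag == 0):   # avoid double-counting zero
            return sign * mag

def noisy_top_k_with_gap(scores, k, eps: Fraction):
    """Illustrative only (not the paper's exact algorithm): add integer noise
    to integer scores, return the indices of the k largest noisy scores and
    the gaps between consecutive noisy scores. Assumes len(scores) > k."""
    noisy = [(s + two_sided_geometric(eps), i) for i, s in enumerate(scores)]
    noisy.sort(reverse=True)
    top = noisy[:k + 1]
    indices = [i for _, i in top[:k]]
    gaps = [top[j][0] - top[j + 1][0] for j in range(k)]
    return indices, gaps

print(noisy_top_k_with_gap([50, 42, 41, 7, 3], k=2, eps=Fraction(1, 2)))
```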
In Bayesian inference, a widespread technique for computing integrals against a high-dimensional posterior is to use a Gaussian proxy to the posterior known as the Laplace approximation. We address the accuracy of this approximation in TV distance, in the regime in which the dimension $d$ grows with the sample size $n$. Multiple prior works have shown that the requirement $d^3\ll n$ is sufficient for accuracy of the approximation. However, in a recent breakthrough, Kasprzak et al. (2022) derived an upper bound scaling as $d/\sqrt n$. In this work, we further refine our understanding of the Laplace approximation error by decomposing the TV error into an $O(d/\sqrt n)$ leading-order term and an $O(d^2/n)$ remainder. This decomposition has far-reaching implications: first, we use it to prove that the requirement $d^2\ll n$ cannot in general be improved, by showing $\mathrm{TV}\gtrsim d/\sqrt n$ for a posterior stemming from logistic regression with Gaussian design. Second, the decomposition provides tighter and more easily computable upper bounds on the TV error. Our result also opens the door to proving the Bernstein-von Mises theorem (BvM) in the $d^2\ll n$ regime and to correcting the Laplace approximation to account for skew; both directions are pursued in two follow-up works.
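For concreteness, the Laplace approximation referenced above is the Gaussian proxy centered at the posterior mode with inverse-Hessian covariance; in standard (assumed) notation, with $\pi_n$ the posterior and $\hat\theta$ its mode, $$\hat\gamma_n = \mathcal N\big(\hat\theta,\, H^{-1}\big), \qquad H = -\nabla^2 \log \pi_n(\hat\theta),$$ and the decomposition described in the abstract reads $$\mathrm{TV}\big(\pi_n, \hat\gamma_n\big) = L_n + R_n, \qquad L_n = O\big(d/\sqrt n\big), \quad R_n = O\big(d^2/n\big),$$ with $L_n$ the computable leading-order term and $R_n$ the remainder; this is a notational sketch, not a verbatim statement of the paper's theorem.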
Let $X$ be a set of $n$ items that contains a set of defective items $I \subseteq X$. In group testing, a {\it test} refers to a subset of items $Q \subset X$. The outcome of a test is $1$ if $Q$ contains at least one defective item, i.e., $Q\cap I \neq \emptyset$, and $0$ otherwise. We give a novel approach to obtaining lower bounds in non-adaptive randomized group testing. The technique produces lower bounds that are within a factor of $1/{\log\log\stackrel{k}{\cdots}\log n}$ of the existing upper bounds for any constant~$k$. Employing this new method, we prove the following result. For any fixed constant $k$, any non-adaptive randomized algorithm that, for any set of defective items $I$, with probability at least $2/3$, returns an estimate of the number of defective items $|I|$ to within a constant factor requires at least $$\Omega\left(\frac{\log n}{\log\log\stackrel{k}{\cdots}\log n}\right)$$ tests. Our result almost matches the upper bound of $O(\log n)$ and solves the open problem posed by Damaschke and Sheikh Muhammad [COCOA 2010 and Discrete Math., Alg. and Appl., 2010]. Additionally, it improves upon the lower bound of $\Omega(\log n/\log\log n)$ previously established by Bshouty [ISAAC 2019].
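As a point of reference for the model only (the lower-bound argument itself is not code), here is a minimal Python sketch of the test oracle and of a non-adaptive design, in which all tests are fixed before any outcome is observed. The set sizes, number of tests, and inclusion probability are arbitrary illustration values.

```python
import random

def test_outcome(Q, I):
    """Group-testing oracle from the abstract: a test Q (a subset of items)
    returns 1 iff it contains at least one defective item from I."""
    return int(not Q.isdisjoint(I))

# Non-adaptive design: every test is chosen up front, independently of outcomes.
n = 1000
items = range(n)
I = {3, 17, 512}                      # unknown defective set (simulation only)
tests = [{x for x in items if random.random() < 0.01} for _ in range(30)]
outcomes = [test_outcome(Q, I) for Q in tests]
print(outcomes)
```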
Equipping the rototranslation group $SE(2)$ with a sub-Riemannian structure inspired by the visual cortex V1, we propose algorithms for image inpainting and enhancement based on hypoelliptic diffusion. We innovate on previous implementations of the methods by Citti, Sarti, and Boscain et al. by proposing an alternative that prevents fading and is capable of producing sharper results, in a procedure that we call WaxOn-WaxOff. We also exploit the sub-Riemannian structure to define a completely new unsharp filter using $SE(2)$, analogous to the classical unsharp filter for 2D image processing, with applications to image enhancement. We demonstrate our method on blood vessel enhancement in retinal scans.
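The $SE(2)$ filter itself is not reproduced here; as a point of reference, the sketch below (a Python illustration under the stated assumptions, not the paper's code) shows the classical 2D unsharp masking that the proposed filter generalizes: blur the image, take the residual, and add a scaled copy of it back. In the paper's construction the Gaussian blur is replaced by hypoelliptic (sub-Riemannian) smoothing on $SE(2)$.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image: np.ndarray, sigma: float = 2.0, amount: float = 1.0) -> np.ndarray:
    """Classical 2D unsharp masking on an image with values in [0, 1]:
    sharpen by adding back the high-frequency residual (image minus its blur)."""
    blurred = gaussian_filter(image.astype(float), sigma=sigma)
    residual = image - blurred            # high-frequency detail
    return np.clip(image + amount * residual, 0.0, 1.0)

# Toy usage on a synthetic image.
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
sharpened = unsharp_mask(img, sigma=3.0, amount=0.7)
```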
We establish that the constructive continued fraction dimension, originally defined using $s$-gales, is robust, but, surprisingly, that the effective continued fraction dimension and the effective (base-$b$) Hausdorff dimension of the same real can be unequal in general. We first provide an equivalent characterization of continued fraction dimension using Kolmogorov complexity. In the process, we construct an optimal lower semi-computable $s$-gale for continued fractions. We also prove new bounds on the Lebesgue measure of continued fraction cylinders, which may be of independent interest. We apply these bounds to reveal an unexpected behavior of continued fraction dimension. It is known that feasible dimension is invariant with respect to base conversion. We also know that Martin-L\"of randomness and computable randomness are invariant not only with respect to base conversion, but also with respect to the continued fraction representation. In contrast, for any $0 < \varepsilon < 0.5$, we prove the existence of a real whose effective Hausdorff dimension is less than $\varepsilon$, but whose effective continued fraction dimension is greater than or equal to $0.5$. This phenomenon is related to the ``non-faithfulness'' of certain families of covers, investigated by Peres and Torbin and by Albeverio, Ivanenko, Lebid and Torbin. We also establish that for any real, the constructive Hausdorff dimension is at most its effective continued fraction dimension.
We present Self-Driven Strategy Learning ($\textit{sdsl}$), a lightweight online learning methodology for automated reasoning tasks that involve solving a set of related problems. $\textit{sdsl}$ does not require offline training; instead, it automatically constructs a dataset while solving earlier problems and fits a machine learning model to these data, which is then used to adjust the solving strategy for later problems. We formally define the approach as a set of abstract transition rules. We describe a concrete instance of the sdsl calculus which uses conditional sampling for generating data and random forests as the underlying machine learning model. We implement the approach on top of the Kissat solver and show that the combination of Kissat+$\textit{sdsl}$ certifies larger bounds and finds more counterexamples than other state-of-the-art bounded model checking approaches on benchmarks obtained from the latest Hardware Model Checking Competition.
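The calculus itself is defined abstractly in the paper; the following Python sketch is only a schematic rendering of the online loop described above: accumulate data while solving earlier problems, fit a random forest once enough data exists, and use it to pick the strategy for later problems. The `solve` and `featurize` interfaces and the numeric encoding of strategies are hypothetical placeholders, not Kissat's API or the paper's exact instantiation.

```python
from sklearn.ensemble import RandomForestClassifier

def sdsl_loop(problems, strategies, solve, featurize, warmup=10):
    """Schematic online strategy learning (not the paper's exact calculus).
    `solve(problem, strategy) -> bool` and `featurize(problem) -> list[float]`
    are assumed interfaces; each strategy is a list of numeric parameters."""
    X, y, results = [], [], []
    model = None
    for i, problem in enumerate(problems):
        feats = featurize(problem)
        if model is None:
            # Bootstrap phase: cycle through strategies to generate training data.
            strategy = strategies[i % len(strategies)]
        else:
            # Score every strategy; pick the one predicted most likely to succeed.
            scores = [model.predict_proba([feats + s])[0][1] for s in strategies]
            strategy = strategies[scores.index(max(scores))]
        success = solve(problem, strategy)
        results.append((problem, strategy, success))
        X.append(feats + strategy)            # strategy encoded as extra features
        y.append(int(success))
        if len(X) >= warmup and len(set(y)) > 1:
            model = RandomForestClassifier(n_estimators=100).fit(X, y)
    return results
```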
Neural machine translation (NMT) is a deep learning-based approach to machine translation that yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in real-world applications, domain-specific corpora are usually scarce or nonexistent, and vanilla NMT therefore performs poorly in such scenarios. Domain adaptation, which leverages both out-of-domain parallel corpora and monolingual corpora for in-domain translation, is thus very important for domain-specific translation. In this paper, we give a comprehensive survey of state-of-the-art domain adaptation techniques for NMT.
We propose a novel approach to multimodal sentiment analysis using deep neural networks that combine visual analysis and natural language processing. Our goal is different from the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model, combining both text and image features, outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model, compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images, both photographs and memes, on social networks.
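As a rough illustration only (the paper's architecture is not specified in the abstract and is not reproduced here), a multimodal model of this kind can be as simple as concatenating a text feature vector with an image feature vector and classifying the result. The sketch below uses PyTorch; the feature dimensions, hidden size, and number of emotion tags are assumed values.

```python
import torch
import torch.nn as nn

class LateFusionEmotionClassifier(nn.Module):
    """Illustrative late-fusion model (not the paper's exact architecture):
    concatenate text and image features, then predict one of `num_emotions`
    self-reported emotion tags."""
    def __init__(self, text_dim=300, image_dim=2048, hidden=256, num_emotions=15):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, num_emotions),
        )

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, text_dim), e.g. averaged word embeddings
        # image_feats: (batch, image_dim), e.g. penultimate CNN activations
        return self.classifier(torch.cat([text_feats, image_feats], dim=1))

# Dummy forward pass with random features.
model = LateFusionEmotionClassifier()
logits = model(torch.randn(4, 300), torch.randn(4, 2048))
```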
Deep convolutional neural networks have pushed the state of the art in semantic segmentation, provided that a large number of images with pixel-wise annotations is available. Data collection is expensive, and transfer learning is one way to alleviate this cost: it reduces the amount of annotated data required to train the network, but it does not eliminate this heavy annotation step altogether. We propose a method of transfer learning without annotations on the target task, for datasets with redundant content and distinct pixel distributions. Our method takes advantage of the approximate content alignment of the images between the two datasets when the approximation error prevents reusing the annotations from one dataset for the other. Given the annotations for only one dataset, we train a first network in a supervised manner. This network autonomously learns to generate deep data representations relevant to semantic segmentation. Then, given the images of the new dataset, we train a second network to generate a deep data representation that matches the one produced by the first network on the previous dataset. The training consists of a regression between feature maps and does not require any annotations on the new dataset. We show that this method reaches performance similar to classic transfer learning on the PASCAL VOC dataset with synthetic transformations.
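To make the "regression between feature maps" step concrete, here is a minimal PyTorch sketch (an illustration under assumptions, not the paper's code): the first, supervised network is frozen, and the second network is trained so that its feature maps match the frozen network's feature maps on approximately content-aligned image pairs, with no labels on the new dataset. Both networks are assumed to output feature maps of the same shape.

```python
import torch
import torch.nn as nn

def feature_regression_step(frozen_net, new_net, optimizer, new_images, old_images):
    """One training step of annotation-free transfer: regress the new network's
    feature maps onto those of the frozen, supervised network evaluated on the
    corresponding images from the annotated dataset."""
    with torch.no_grad():
        target = frozen_net(old_images)   # representation learned with labels
    pred = new_net(new_images)            # representation for the unlabeled dataset
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```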
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch leads to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model and design two domain adaptation components, at the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers at the different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach on multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
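Adversarially trained domain classifiers of this kind are commonly implemented with a gradient reversal layer; the PyTorch sketch below (an illustration under assumptions, not the authors' released code) shows only an image-level domain classifier on backbone feature maps. The channel count and the small architecture are assumed values, and the instance-level classifier and the consistency regularization are not shown.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda in the
    backward pass, so the backbone learns to fool the domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class ImageLevelDomainClassifier(nn.Module):
    """Illustrative image-level domain classifier (source vs. target) applied
    to backbone feature maps."""
    def __init__(self, in_channels=512, lam=0.1):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(256, 1, kernel_size=1),   # per-location domain logit
        )

    def forward(self, feature_map):
        reversed_feats = GradReverse.apply(feature_map, self.lam)
        return self.net(reversed_feats)

# Usage: domain_logits = ImageLevelDomainClassifier()(backbone_features),
# trained with binary cross-entropy against the domain label (0 = source, 1 = target).
```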