We study the consistent k-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the number of changes made to the set of centers after each point insertion or deletion. Previous works by Lattanzi and Vassilvitskii [ICML '12] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] showed that in the incremental setting, where deletions are not allowed, one can obtain $k \cdot \textrm{polylog}(n) / n$ amortized recourse for both $k$-center and $k$-median, and demonstrated a matching lower bound. However, no algorithm for the fully dynamic setting achieves less than the trivial $O(k)$ changes per update, which can be obtained by simply reclustering the full dataset after every update. In this work, we give the first algorithm for consistent $k$-center clustering for the fully dynamic setting, i.e., when both point insertions and deletions are allowed, and improves upon a trivial $O(k)$ recourse bound. Specifically, our algorithm maintains a constant factor approximate solution while ensuring worst-case constant recourse per update, which is optimal in the fully dynamic setting. Moreover, our algorithm is deterministic and is therefore correct even if an adaptive adversary chooses the insertions and deletions.
The Ising $p$-spin glass and the random $k$-SAT models exhibit symmetric multi Overlap Gap Property ($m$-OGP), an intricate geometrical property which is a rigorous barrier against many important classes of algorithms. We establish that for both models, the symmetric $m$-OGP undergoes a sharp phase transition and we identify the phase transition point for each model: for any $m\in\mathbb{N}$, there exists $\gamma_m$ (depending on the model) such that the model exhibits $m$-OGP for $\gamma>\gamma_m$ and the $m$-OGP is provably absent for $\gamma<\gamma_m$, both with high probability, where $\gamma$ is some natural parameter of the model. Our results for the Ising $p$-spin glass model are valid for all large enough $p$ that remain constant as the number $n$ of spins grows, $p=O(1)$; our results for the random $k$-SAT are valid for all $k$ that grow mildly in the number $n$ of Boolean variables, $k=\Omega(\ln n)$. To the best of our knowledge, these are the first sharp phase transition results regarding the $m$-OGP. Our proofs are based on an application of the second moment method combined with a concentration property regarding a suitable random variable. While a standard application of the second moment method fails, we circumvent this issue by leveraging an elegant argument of Frieze~\cite{frieze1990independence} together with concentration.
Two lines of work are taking the central stage in AI research. On the one hand, the community is making increasing efforts to build models that discard spurious correlations and generalize better in novel test environments. Unfortunately, the bitter lesson so far is that no proposal convincingly outperforms a simple empirical risk minimization baseline. On the other hand, large language models (LLMs) have erupted as algorithms able to learn in-context, generalizing on-the-fly to the eclectic contextual circumstances that users enforce by means of prompting. In this paper, we argue that context $\approx$ environment, and posit that in-context learning holds the key to better domain generalization. Via extensive theory and experiments, we show that paying attention to context$\unicode{x2013}\unicode{x2013}$unlabeled examples as they arrive$\unicode{x2013}\unicode{x2013}$allows our proposed In-Context Risk Minimization (ICRM) algorithm to zoom-in on the test environment risk minimizer, leading to significant out-of-distribution performance improvements. From all of this, two messages are worth taking home. Researchers in domain generalization should consider environment as context, and harness the adaptive power of in-context learning. Researchers in LLMs should consider context as environment to better structure data towards generalization.
Consider the problem of estimating a random variable $X$ from noisy observations $Y = X+ Z$, where $Z$ is standard normal, under the $L^1$ fidelity criterion. It is well known that the optimal Bayesian estimator in this setting is the conditional median. This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian. Along the way, several other results are presented. In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution. Additionally, we consider other $L^p$ losses and observe the following phenomenon: for $p \in [1,2]$, Gaussian is the only prior distribution that induces a linear optimal Bayesian estimator, and for $p \in (2,\infty)$, infinitely many prior distributions on $X$ can induce linearity. Finally, extensions are provided to encompass noise models leading to conditional distributions from certain exponential families.
Matrix product codes are generalizations of some well-known constructions of codes, such as Reed-Muller codes, $[u+v,u-v]$-construction, etc. Recently, a bound for the symbol-pair distance of a matrix product code was given in \cite{LEL}, and new families of MDS symbol-pair codes were constructed by using this bound. In this paper, we generalize this bound to the $b$-symbol distance of a matrix product code and determine all minimum $b$-symbol distances of Reed-Muller codes. We also give a bound for the minimum $b$-symbol distance of codes obtained from the $[u+v,u-v]$-construction, and use this bound to construct some $[2n,2n-2]_q$-linear $b$-symbol almost MDS codes with arbitrary length. All the minimum $b$-symbol distances of $[n,n-1]_q$-linear codes and $[n,n-2]_q$-linear codes for $1\leq b\leq n$ are determined. Some examples are presented to illustrate these results.
We study the problem of Out-of-Distribution (OOD) detection, that is, detecting whether a learning algorithm's output can be trusted at inference time. While a number of tests for OOD detection have been proposed in prior work, a formal framework for studying this problem is lacking. We propose a definition for the notion of OOD that includes both the input distribution and the learning algorithm, which provides insights for the construction of powerful tests for OOD detection. We propose a multiple hypothesis testing inspired procedure to systematically combine any number of different statistics from the learning algorithm using conformal p-values. We further provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that threshold-based tests proposed in prior work perform well in specific settings, but not uniformly well across different types of OOD instances. In contrast, our proposed method that combines multiple statistics performs uniformly well across different datasets and neural networks.
Secure multi-party computation using a physical deck of cards, often called card-based cryptography, has been extensively studied during the past decade. Card-based protocols to compute various Boolean functions have been developed. As each input bit is typically encoded by two cards, computing an $n$-variable Boolean function requires at least $2n$ cards. We are interested in optimal protocols that use exactly $2n$ cards. In particular, we focus on symmetric functions. In this paper, we formulate the problem of developing $2n$-card protocols to compute $n$-variable symmetric Boolean functions by classifying all such functions into several NPN-equivalence classes. We then summarize existing protocols that can compute some representative functions from these classes, and also solve some open problems in the cases $n=4$, 5, 6, and 7. In particular, we develop a protocol to compute a function $k$Mod3, which determines whether the sum of all inputs is congruent to $k$ modulo 3 ($k \in \{0,1,2\}$).
The structure learning problem consists of fitting data generated by a Directed Acyclic Graph (DAG) to correctly reconstruct its arcs. In this context, differentiable approaches constrain or regularize the optimization problem using a continuous relaxation of the acyclicity property. The computational cost of evaluating graph acyclicity is cubic on the number of nodes and significantly affects scalability. In this paper we introduce COSMO, a constraint-free continuous optimization scheme for acyclic structure learning. At the core of our method, we define a differentiable approximation of an orientation matrix parameterized by a single priority vector. Differently from previous work, our parameterization fits a smooth orientation matrix and the resulting acyclic adjacency matrix without evaluating acyclicity at any step. Despite the absence of explicit constraints, we prove that COSMO always converges to an acyclic solution. In addition to being asymptotically faster, our empirical analysis highlights how COSMO performance on graph reconstruction compares favorably with competing structure learning methods.
Most existing salient object detection methods mostly use U-Net or feature pyramid structure, which simply aggregates feature maps of different scales, ignoring the uniqueness and interdependence of them and their respective contributions to the final prediction. To overcome these, we propose the M$^3$Net, i.e., the Multilevel, Mixed and Multistage attention network for Salient Object Detection (SOD). Firstly, we propose Multiscale Interaction Block which innovatively introduces the cross-attention approach to achieve the interaction between multilevel features, allowing high-level features to guide low-level feature learning and thus enhancing salient regions. Secondly, considering the fact that previous Transformer based SOD methods locate salient regions only using global self-attention while inevitably overlooking the details of complex objects, we propose the Mixed Attention Block. This block combines global self-attention and window self-attention, aiming at modeling context at both global and local levels to further improve the accuracy of the prediction map. Finally, we proposed a multilevel supervision strategy to optimize the aggregated feature stage-by-stage. Experiments on six challenging datasets demonstrate that the proposed M$^3$Net surpasses recent CNN and Transformer-based SOD arts in terms of four metrics. Codes are available at //github.com/I2-Multimedia-Lab/M3Net.
In many fields of robotics, knowing the relative position and orientation between two sensors is a mandatory precondition to operate with multiple sensing modalities. In this context, the pair LiDAR-RGB cameras offer complementary features: LiDARs yield sparse high quality range measurements, while RGB cameras provide a dense color measurement of the environment. Existing techniques often rely either on complex calibration targets that are expensive to obtain, or extracted virtual correspondences that can hinder the estimate's accuracy. In this paper we address the problem of LiDAR-RGB calibration using typical calibration patterns (i.e. A3 chessboard) with minimal human intervention. Our approach exploits the planarity of the target to find correspondences between the sensors measurements, leading to features that are robust to LiDAR noise. Moreover, we estimate a solution by solving a joint non-linear optimization problem. We validated our approach by carrying on quantitative and comparative experiments with other state-of-the-art approaches. Our results show that our simple schema performs on par or better than other approches using complex calibration targets. Finally, we release an open-source C++ implementation at \url{//github.com/srrg-sapienza/ca2lib}
This work proposes $\texttt{NePhi}$, a neural deformation model which results in approximately diffeomorphic transformations. In contrast to the predominant voxel-based approaches, $\texttt{NePhi}$ represents deformations functionally which allows for memory-efficient training and inference. This is of particular importance for large volumetric registrations. Further, while medical image registration approaches representing transformation maps via multi-layer perceptrons have been proposed, $\texttt{NePhi}$ facilitates both pairwise optimization-based registration $\textit{as well as}$ learning-based registration via predicted or optimized global and local latent codes. Lastly, as deformation regularity is a highly desirable property for most medical image registration tasks, $\texttt{NePhi}$ makes use of gradient inverse consistency regularization which empirically results in approximately diffeomorphic transformations. We show the performance of $\texttt{NePhi}$ on two 2D synthetic datasets as well as on real 3D lung registration. Our results show that $\texttt{NePhi}$ can achieve similar accuracies as voxel-based representations in a single-resolution registration setting while using less memory and allowing for faster instance-optimization.