We consider the $k$-min-sum-radii ($k$-MSR) clustering problem with fairness constraints. The $k$-min-sum-radii problem is a mixture of the classical $k$-center and $k$-median problems. We are given a set of points $P$ in a metric space and a number $k$ and aim to partition the points into $k$ clusters, each of the clusters having one designated center. The objective to minimize is the sum of the radii of the $k$ clusters (where in $k$-center we would only consider the maximum radius and in $k$-median we would consider the sum of the individual points' costs). Various notions of fair clustering have been introduced lately, and we follow the definitions due to Chierichetti, Kumar, Lattanzi and Vassilvitskii [NeurIPS 2017] which demand that cluster compositions shall follow the proportions of the input point set with respect to some given sensitive attribute. For the easier case where the sensitive attribute only has two possible values and each is equally frequent in the input, the aim is to compute a clustering where all clusters have a 1:1 ratio with respect to this attribute. We call this the 1:1 case. There has been a surge of FPT-approximation algorithms for the $k$-MSR problem lately, solving the problem both in the unconstrained case and in several constrained problem variants. We add to this research area by designing an FPT $(6+\epsilon)$-approximation that works for $k$-MSR under the mentioned general fairness notion. For the special 1:1 case, we improve our algorithm to achieve a $(3+\epsilon)$-approximation.
Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insights into the inner mechanisms of complex systems. Yet, dealing with high-dimensional data presents a formidable challenge, due to its size and the presence of intricate relationships. Recently proposed neural methods employing variational lower bounds on the mutual information have gained prominence. However, these approaches suffer from either high bias or high variance, as the sample size and the structure of the loss function directly influence the training process. In this paper, we propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off. The comparison with state-of-the-art neural estimators, through extensive experimentation within established reference scenarios, shows that our approach offers higher accuracy and lower complexity.
In this paper, we establish anti-concentration inequalities for additive noise mechanisms which achieve $f$-differential privacy ($f$-DP), a notion of privacy phrased in terms of a tradeoff function $f$ which limits the ability of an adversary to determine which individuals were in the database. We show that canonical noise distributions (CNDs), proposed by Awan and Vadhan (2023), match the anti-concentration bounds at half-integer values, indicating that their tail behavior is near-optimal. We also show that all CNDs are sub-exponential, regardless of the $f$-DP guarantee. In the case of log-concave CNDs, we show that they are the stochastically smallest noise compared to any other noise distributions with the same privacy guarantee. In terms of integer-valued noise, we propose a new notion of discrete CND and prove that a discrete CND always exists, can be constructed by rounding a continuous CND, and that the discrete CND is unique when designed for a statistic with sensitivity 1. We further show that the discrete CND at sensitivity 1 is stochastically smallest compared to other integer-valued noises. Our theoretical results shed light on the different types of privacy guarantees possible in the $f$-DP framework and can be incorporated in more complex mechanisms to optimize performance.
In previous work, a description of the result of applying the Householder tridiagonalization algorithm to a G$\beta$E random matrix is provided by Edelman and Dumitriu. The resulting tridiagonal ensemble makes sense for all $\beta>0$, and has spectrum given by the $\beta$-ensemble for all $\beta>0$. Moreover, the tridiagonal model has useful stochastic operator limits which was introduced and analyzed in subsequent studies. In this work, we analogously study the result of applying the Householder tridiagonalization algorithm to a G$\beta$E process which has eigenvalues governed by $\beta$-Dyson Brownian motion. We propose an explicit limit of the upper left $k \times k$ minor of the $n \times n$ tridiagonal process as $n \to \infty$ and $k$ remains fixed. We prove the result for $\beta=1$, and also provide numerical evidence for $\beta=1,2,4$. This leads us to conjecture the form of a dynamical $\beta$-stochastic Airy operator with smallest $k$ eigenvalues evolving according to the $n \to \infty$ limit of the largest, centered and re-scaled, $k$ eigenvalues of $\beta$-Dyson Brownian motion.
This paper presents a novel approach to one-class classifier fusion through locally adaptive learning with dynamic $\ell$p-norm constraints. We introduce a framework that dynamically adjusts fusion weights based on local data characteristics, addressing fundamental challenges in ensemble-based anomaly detection. Our method incorporates an interior-point optimization technique that significantly improves computational efficiency compared to traditional Frank-Wolfe approaches, achieving up to 19-fold speed improvements in complex scenarios. The framework is extensively evaluated on standard UCI benchmark datasets and specialized temporal sequence datasets, demonstrating superior performance across diverse anomaly types. Statistical validation through Skillings-Mack tests confirms our method's significant advantages over existing approaches, with consistent top rankings in both pure and non-pure learning scenarios. The framework's ability to adapt to local data patterns while maintaining computational efficiency makes it particularly valuable for real-time applications where rapid and accurate anomaly detection is crucial.
Implicit surface representations such as the signed distance function (SDF) have emerged as a promising approach for image-based surface reconstruction. However, existing optimization methods assume solid surfaces and are therefore unable to properly reconstruct semi-transparent surfaces and thin structures, which also exhibit low opacity due to the blending effect with the background. While neural radiance field (NeRF) based methods can model semi-transparency and achieve photo-realistic quality in synthesized novel views, their volumetric geometry representation tightly couples geometry and opacity, and therefore cannot be easily converted into surfaces without introducing artifacts. We present $\alpha$Surf, a novel surface representation with decoupled geometry and opacity for the reconstruction of semi-transparent and thin surfaces where the colors mix. Ray-surface intersections on our representation can be found in closed-form via analytical solutions of cubic polynomials, avoiding Monte-Carlo sampling and is fully differentiable by construction. Our qualitative and quantitative evaluations show that our approach can accurately reconstruct surfaces with semi-transparent and thin parts with fewer artifacts, achieving better reconstruction quality than state-of-the-art SDF and NeRF methods. Website: //alphasurf.netlify.app/
Submodular optimization has become increasingly prominent in machine learning and fairness has drawn much attention. In this paper, we propose to study the fair $k$-submodular maximization problem and develop a $\frac{1}{3}$-approximation greedy algorithm with a running time of $\mathcal{O}(knB)$. To the best of our knowledge, our work is the first to incorporate fairness in the context of $k$-submodular maximization, and our theoretical guarantee matches the best-known $k$-submodular maximization results without fairness constraints. In addition, we have developed a faster threshold-based algorithm that achieves a $(\frac{1}{3} - \epsilon)$ approximation with $\mathcal{O}(\frac{kn}{\epsilon} \log \frac{B}{\epsilon})$ evaluations of the function $f$. Furthermore, for both algorithms, we provide approximation guarantees when the $k$-submodular function is not accessible but only can be approximately accessed. We have extensively validated our theoretical findings through empirical research and examined the practical implications of fairness. Specifically, we have addressed the question: ``What is the price of fairness?" through case studies on influence maximization with $k$ topics and sensor placement with $k$ types. The experimental results show that the fairness constraints do not significantly undermine the quality of solutions.
QAC$^0$ is the class of constant-depth quantum circuits with polynomially many ancillary qubits, where Toffoli gates on arbitrarily many qubits are allowed. In this work, we show that the parity function cannot be computed in QAC$^0$, resolving a long-standing open problem in quantum circuit complexity more than twenty years old. As a result, this proves ${\rm QAC}^0 \subsetneqq {\rm QAC}_{\rm wf}^0$. We also show that any QAC circuit of depth $d$ that approximately computes parity on $n$ bits requires $2^{\widetilde{\Omega}(n^{1/d})}$ ancillary qubits, which is close to tight. This implies a similar lower bound on approximately preparing cat states using QAC circuits. Finally, we prove a quantum analog of the Linial-Mansour-Nisan theorem for QAC$^0$. This implies that, for any QAC$^0$ circuit $U$ with $a={\rm poly}(n)$ ancillary qubits, and for any $x\in\{0,1\}^n$, the correlation between $Q(x)$ and the parity function is bounded by ${1}/{2} + 2^{-\widetilde{\Omega}(n^{1/d})}$, where $Q(x)$ denotes the output of measuring the output qubit of $U|x,0^a\rangle$. All the above consequences rely on the following technical result. If $U$ is a QAC$^0$ circuit with $a={\rm poly}(n)$ ancillary qubits, then there is a distribution $\mathcal{D}$ of bounded polynomials of degree polylog$(n)$ such that with high probability, a random polynomial from $\mathcal{D}$ approximates the function $\langle x,0^a| U^\dag Z_{n+1} U |x,0^a\rangle$ for a large fraction of $x\in \{0,1\}^n$. This result is analogous to the Razborov-Smolensky result on the approximation of AC$^0$ circuits by random low-degree polynomials.
While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit a lot from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our main contribution is to propose an $SE(3)$-equivariant model that maps each point in the cloud to a continuous grasp quality function over the 2-sphere $S^2$ using spherical harmonic basis functions. Compared with reasoning about a finite set of samples, this formulation improves the accuracy and efficiency of our model when a large number of samples would otherwise be needed. In order to accomplish this, we propose a novel variation on EquiFormerV2 that leverages a UNet-style encoder-decoder architecture to enlarge the number of points the model can handle. Our resulting method, which we name $\textit{OrbitGrasp}$, significantly outperforms baselines in both simulation and physical experiments.
Depth estimation is a crucial technology in robotics. Recently, self-supervised depth estimation methods have demonstrated great potential as they can efficiently leverage large amounts of unlabelled real-world data. However, most existing methods are designed under the assumption of static scenes, which hinders their adaptability in dynamic environments. To address this issue, we present D$^3$epth, a novel method for self-supervised depth estimation in dynamic scenes. It tackles the challenge of dynamic objects from two key perspectives. First, within the self-supervised framework, we design a reprojection constraint to identify regions likely to contain dynamic objects, allowing the construction of a dynamic mask that mitigates their impact at the loss level. Second, for multi-frame depth estimation, we introduce a cost volume auto-masking strategy that leverages adjacent frames to identify regions associated with dynamic objects and generate corresponding masks. This provides guidance for subsequent processes. Furthermore, we propose a spectral entropy uncertainty module that incorporates spectral entropy to guide uncertainty estimation during depth fusion, effectively addressing issues arising from cost volume computation in dynamic environments. Extensive experiments on KITTI and Cityscapes datasets demonstrate that the proposed method consistently outperforms existing self-supervised monocular depth estimation baselines. Code is available at \url{//github.com/Csyunling/D3epth}.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.