The geometric median of a tuple of vectors is the vector that minimizes the sum of Euclidean distances to the vectors of the tuple. Classically called the Fermat-Weber problem and applied to facility location, it has become a major component of the robust learning toolbox. It is typically used to aggregate the (processed) inputs of different data providers, whose motivations may diverge, especially in applications like content moderation. Interestingly, as a voting system, the geometric median has well-known desirable properties: it is a provably good average approximation, it is robust to a minority of malicious voters, and it satisfies the "one voter, one unit force" fairness principle. However, what was not known is the extent to which the geometric median is strategyproof. Namely, can a strategic voter significantly gain by misreporting their preferred vector? We prove in this paper that, perhaps surprisingly, the geometric median is not even $\alpha$-strategyproof, where $\alpha$ bounds how much a voter can gain by deviating from truthfulness. But we also prove that, in the limit of a large number of voters with i.i.d. preferred vectors, the geometric median is asymptotically $\alpha$-strategyproof, and we show how to compute this bound $\alpha$. We then generalize our results to voters who care more about some dimensions. Roughly, we show that, if some dimensions are more polarized and regarded as more important, then the geometric median becomes less strategyproof. Interestingly, we also show how skewed geometric medians can improve strategyproofness. Nevertheless, if voters care differently about different dimensions, we prove that no skewed geometric median can achieve strategyproofness for all voters. Overall, our results constitute a coherent set of insights into the extent to which the geometric median is suitable for aggregating high-dimensional disagreements.
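To make the object concrete, here is a minimal Python sketch of the geometric median computed by Weiszfeld's fixed-point iteration; the algorithm, variable names, and the toy votes are illustrative and not taken from the paper.

```python
import numpy as np

def geometric_median(points, n_iter=100, eps=1e-9):
    """Weiszfeld's fixed-point iteration for the geometric median.

    points: (n, d) array of the voters' preferred vectors.
    Returns an approximate minimizer of the sum of Euclidean distances.
    """
    z = points.mean(axis=0)  # start from the coordinate-wise mean
    for _ in range(n_iter):
        dists = np.linalg.norm(points - z, axis=1)
        dists = np.maximum(dists, eps)          # avoid division by zero
        weights = 1.0 / dists
        z_new = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

# Example: 5 voters reporting 2-dimensional preferred vectors.
votes = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [10., 10.]])
print(geometric_median(votes))  # pulled far less toward the outlier than the mean
```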
In this paper, we show that metric learning suffers from a serious inseparability problem when informative sample mining is not used. Since inseparable samples are often mixed with hard samples, current informative sample mining strategies used to address the inseparability problem can introduce side effects, such as instability of the objective function. To alleviate this problem, we propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML). In ANML, we design two thresholds that adaptively identify inseparable similar and dissimilar samples during training, so that inseparable sample removal and metric parameter learning are carried out within the same procedure. Because the resulting formulation is discontinuous, we develop a function, named the \emph{log-exp mean function}, to construct a continuous surrogate that can be efficiently optimized by gradient descent. Like the Triplet loss, ANML can be used to learn both linear and deep embeddings. Analyzing the proposed method reveals several interesting properties. For example, when ANML is used to learn a linear embedding, well-known metric learning algorithms such as large margin nearest neighbor (LMNN) and neighbourhood components analysis (NCA) become special cases of ANML under particular parameter settings. When it is used to learn deep features, state-of-the-art deep metric learning algorithms such as the Triplet loss, Lifted structure loss, and Multi-similarity loss become special cases of ANML. Furthermore, the \emph{log-exp mean function} proposed in our method gives a new perspective on deep metric learning methods such as Proxy-NCA and the N-pairs loss. Finally, promising experimental results demonstrate the effectiveness of the proposed method.
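The abstract does not spell out the exact form of the \emph{log-exp mean function}; the sketch below shows one common log-exp ("soft mean") construction of that flavour, purely as an assumed illustration of how such a function yields a continuous, gradient-friendly surrogate for a non-smooth quantity.

```python
import numpy as np

def log_exp_mean(x, beta=10.0):
    """A smooth, differentiable soft mean of the entries of x:
    (1/beta) * log(mean(exp(beta * x))).

    As beta -> infinity it approaches max(x); for moderate beta it behaves
    like a mean that emphasizes large entries. This is one common
    'log-exp mean' construction; the exact form used by ANML may differ.
    """
    x = np.asarray(x, dtype=float)
    m = x.max()  # subtract the max for numerical stability
    return m + np.log(np.mean(np.exp(beta * (x - m)))) / beta

print(log_exp_mean([0.1, 0.2, 0.9], beta=1.0))   # close to the plain mean
print(log_exp_mean([0.1, 0.2, 0.9], beta=50.0))  # close to the max
```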
In this paper, we prove a local limit theorem for the chi-square distribution with $r > 0$ degrees of freedom and noncentrality parameter $\lambda \geq 0$. We use it to develop refined normal approximations for the survival function. Our maximal errors go down to an order of $r^{-2}$, which is significantly smaller than the maximal error bounds of order $r^{-1/2}$ recently found by Horgan & Murphy (2013) and Seri (2015). Our results allow us to drastically reduce the number of observations required to obtain negligible errors in the energy detection problem, from $250$, as recommended in the seminal work of Urkowitz (1967), to only $8$ here with our new approximations. We also obtain an upper bound on several probability metrics between the central and noncentral chi-square distributions and the standard normal distribution, and we obtain an approximation for the median that improves the lower bound previously obtained by Robert (1990).
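For orientation, the sketch below contrasts the exact noncentral chi-square survival function with the crude moment-matched normal approximation whose $O(r^{-1/2})$ error the abstract alludes to; the paper's refined $O(r^{-2})$ approximations are not reproduced here, and the parameter values are illustrative.

```python
import numpy as np
from scipy.stats import ncx2, norm

r, lam = 8.0, 4.0            # degrees of freedom and noncentrality (illustrative)
x = np.linspace(0.0, 40.0, 9)

exact = ncx2.sf(x, df=r, nc=lam)                      # exact survival function
mu, sigma = r + lam, np.sqrt(2.0 * (r + 2.0 * lam))   # moment matching
approx = norm.sf((x - mu) / sigma)                    # basic normal approximation

# Maximal error of the crude approximation over the grid; the paper's refined
# approximations aim to shrink this from order r^{-1/2} down to order r^{-2}.
print(np.max(np.abs(exact - approx)))
```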
We develop a category-theoretic criterion for determining the equivalence of causal models having different but homomorphic directed acyclic graphs over discrete variables. Following Jacobs et al. (2019), we define a causal model as a probabilistic interpretation of a causal string diagram, i.e., a functor from the ``syntactic'' category $\textsf{Syn}_G$ of graph $G$ to the category $\textsf{Stoch}$ of finite sets and stochastic matrices. The equivalence of causal models is then defined in terms of a natural transformation or isomorphism between two such functors, which we call a $\Phi$-abstraction and $\Phi$-equivalence, respectively. It is shown that when one model is a $\Phi$-abstraction of another, the intervention calculus of the former can be consistently translated into that of the latter. We also identify the condition under which a model accommodates a $\Phi$-abstraction, when transformations are deterministic.
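As a rough guide to the terminology, the display below states the naturality condition behind a $\Phi$-abstraction in a simplified setting with a single syntactic category; the symbols $\mathcal{M}$, $\mathcal{N}$, and $\Phi_X$ are notation assumed here, and the paper's setting with two different but homomorphic graphs is more general.

```latex
% Simplified sketch: a \Phi-abstraction from a causal model
% \mathcal{M} : \textsf{Syn}_G \to \textsf{Stoch} to a model
% \mathcal{N} : \textsf{Syn}_G \to \textsf{Stoch} is a family of stochastic
% matrices (\Phi_X)_X, one per object X, making every naturality square commute:
\[
  \Phi_Y \circ \mathcal{M}(f) \;=\; \mathcal{N}(f) \circ \Phi_X
  \qquad \text{for all } f : X \to Y \text{ in } \textsf{Syn}_G .
\]
% A \Phi-equivalence is the case in which every \Phi_X is invertible,
% i.e., the natural transformation is a natural isomorphism.
```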
While many works exploiting an existing Lie group structure have been proposed for state estimation, in particular the Invariant Extended Kalman Filter (IEKF), few papers address the construction of a group structure that allows casting a given system into the framework of invariant filtering. In this paper we introduce a large class of systems encompassing most problems involving a navigating vehicle encountered in practice. For those systems we introduce a novel methodology that systematically provides a group structure for the state space, including vectors of the body frame such as biases. We use it to derive observers having properties akin to those of linear observers or filters. The proposed unifying and versatile framework encompasses all systems where the IEKF has proved successful, improves the state-of-the-art "imperfect" IEKF for inertial navigation with sensor biases, and allows addressing novel examples, such as GNSS antenna lever arm estimation.
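For readers unfamiliar with invariant filtering, the display below recalls the standard invariant error definitions and the group-affine property underlying the IEKF; conventions for left and right errors vary across papers, and this is background rather than the paper's new construction.

```latex
% Standard background (conventions vary). For a state \chi evolving on a
% matrix Lie group \mathcal{G}, the invariant errors between the true state
% \chi and the estimate \hat{\chi} are
\[
  \eta^{L} = \chi^{-1}\hat{\chi}, \qquad \eta^{R} = \hat{\chi}\,\chi^{-1}.
\]
% The IEKF error dynamics become independent of the state trajectory when the
% dynamics f are group affine, i.e.
\[
  f(\chi_1 \chi_2) \;=\; f(\chi_1)\,\chi_2 + \chi_1\, f(\chi_2)
  \;-\; \chi_1\, f(\mathrm{Id})\,\chi_2 .
\]
```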
The commonly quoted error rate for QMC integration with an infinite low discrepancy sequence is $O(n^{-1}\log(n)^r)$ with $r=d$ for extensible sequences and $r=d-1$ otherwise. Such rates hold uniformly over all $d$-dimensional integrands of Hardy-Krause variation one when using $n$ evaluation points. Implicit in those bounds is that for any sequence of QMC points, the integrand can be chosen to depend on $n$. In this paper we show that rates with any $r<(d-1)/2$ cannot hold for every fixed integrand $f$ as $n\to\infty$: some integrand of bounded variation has QMC errors exceeding such a rate for infinitely many $n$. This is accomplished following a suggestion of Erich Novak to use some unpublished results of Trojan from the 1980s as given in the information-based complexity monograph of Traub, Wasilkowski and Wo\'zniakowski. The proof combines a technique of Roth with the theorem of Trojan. The proof is nonconstructive, and we do not know of any integrand of bounded variation in the sense of Hardy and Krause for which the QMC error exceeds $(\log n)^{1+\epsilon}/n$ for infinitely many $n$ when using a digital sequence such as one of Sobol's. An empirical search when $d=2$ for integrands designed to exploit known weaknesses in certain point sets showed no evidence that $r>1$ is needed. An example with $d=3$ and $n$ up to $2^{100}$ might possibly require $r>1$.
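Below is a small Python sketch of the kind of fixed-integrand error measurement discussed above, using SciPy's Sobol' generator and an illustrative smooth integrand that is not taken from the paper; it simply prints how the observed error compares with $n^{-1}\log n$.

```python
import numpy as np
from scipy.stats import qmc

d = 2

def f(x):
    # A fixed smooth integrand on [0,1]^d with exact integral 1
    # (an illustrative choice of bounded Hardy-Krause variation).
    return np.prod(1.0 + 0.5 * (x - 0.5), axis=1)

for m in range(8, 19, 2):                       # n = 2^m Sobol' points
    x = qmc.Sobol(d=d, scramble=False).random_base2(m)
    n = 2 ** m
    err = abs(f(x).mean() - 1.0)
    # If the ratio stays bounded, the error behaves no worse than n^{-1} log n.
    print(f"n=2^{m:2d}  error={err:.3e}  n*error/log(n)={n * err / np.log(n):.3f}")
```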
The $s$-Club problem asks, for a given undirected graph $G$, whether $G$ contains a vertex set $S$ of size at least $k$ such that $G[S]$, the subgraph of $G$ induced by $S$, has diameter at most $s$. We consider variants of $s$-Club where one additionally demands that each vertex of $G[S]$ is contained in at least $\ell$ triangles in $G[S]$, that each edge of $G[S]$ is contained in at least $\ell$ triangles in $G[S]$, or that $S$ contains a given set $W$ of seed vertices. We show that in general these variants are W[1]-hard when parameterized by the solution size $k$, making them significantly harder than the unconstrained $s$-Club problem. On the positive side, we obtain some FPT algorithms for the case when $\ell=1$ and for the case when $G[W]$, the graph induced by the set of seed vertices, is a clique.
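As an illustration of the vertex-triangle variant, here is a small Python/NetworkX checker for whether a given vertex set induces an $s$-club in which every vertex lies in at least $\ell$ triangles; the function name and example graph are made up for illustration, and finding a large such set is of course the hard part.

```python
import networkx as nx

def is_vertex_l_triangle_s_club(G, S, s, ell):
    """Check whether G[S] has diameter at most s and every vertex of G[S]
    lies in at least ell triangles of G[S] (an illustrative verifier only)."""
    H = G.subgraph(S)
    if len(H) == 0 or not nx.is_connected(H) or nx.diameter(H) > s:
        return False
    return all(t >= ell for t in nx.triangles(H).values())

# Example: a 4-cycle plus one chord has diameter 2 and two triangles.
G = nx.cycle_graph(4)
G.add_edge(0, 2)
print(is_vertex_l_triangle_s_club(G, [0, 1, 2, 3], s=2, ell=1))  # True
```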
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets, whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time-series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves, which generalizes both discrete Fr\'echet and Dynamic Time Warping distances. These are the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with arbitrarily good approximation factor, at the expense of increased space usage and preprocessing time over existing methods. Query time complexity is comparable to, or significantly better than, that of existing methods; our algorithms are especially efficient when the length of the curves is bounded.
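The following sketch illustrates only the randomized-projection ingredient in the simplest possible setting (curves with equally many points, flattened and compared in $\ell_2$); it is an assumed toy, not the paper's data structure, which handles discrete Fréchet and DTW distances.

```python
import numpy as np

rng = np.random.default_rng(0)

def project_curves(curves, k):
    """Gaussian (Johnson-Lindenstrauss style) random projection of discretized
    curves, each an (m, d) array with the same m, into R^k.  This only
    illustrates the randomized-projection idea; the flattening used here does
    not capture Frechet or DTW alignments."""
    X = np.stack([c.reshape(-1) for c in curves])          # (n, m*d)
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)      # projection matrix
    return X @ R

curves = [rng.normal(size=(20, 3)) for _ in range(100)]    # 100 curves, 20 points in R^3
Y = project_curves(curves, k=16)
query = Y[0]
nearest = np.argmin(np.linalg.norm(Y - query, axis=1))     # brute-force NN in projected space
print(nearest)  # 0: the query curve finds itself
```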
Metric learning learns a metric function from training data to calculate the similarity or distance between samples. From the perspective of feature learning, metric learning essentially learns a new feature space through a feature transformation (e.g., a Mahalanobis distance metric). However, traditional metric learning algorithms are shallow: they learn only one metric space (feature transformation). Can we learn a better metric space from the learnt metric space? In other words, can we learn metrics progressively and nonlinearly, as in deep learning, using only existing metric learning algorithms? To this end, we present a hierarchical metric learning scheme and implement an online deep metric learning framework, namely ODML. Specifically, we take one online metric learning algorithm as a metric layer, follow it with a nonlinear layer (i.e., ReLU), and then stack these layers in the manner of deep learning. The proposed ODML enjoys some nice properties: it can indeed learn metrics progressively, and it achieves superior performance on some datasets. Various experiments with different settings have been conducted to verify these properties of the proposed ODML.
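A schematic PyTorch sketch of the stacking idea follows: a "metric layer" followed by ReLU, repeated. In ODML each metric layer is an online metric learning algorithm; the plain linear map used here is only a stand-in to illustrate the architecture.

```python
import torch
import torch.nn as nn

class MetricLayer(nn.Module):
    """A linear 'metric layer': x -> L x, so that Euclidean distances in the
    output space are Mahalanobis distances with M = L^T L in the input space.
    In ODML this role is played by an online metric learning algorithm; a
    plain linear map is used here only to illustrate the stacking scheme."""
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.L = nn.Linear(dim_in, dim_out, bias=False)

    def forward(self, x):
        return self.L(x)

# Stack metric layers with ReLU nonlinearities in between, as in the
# hierarchical scheme described in the abstract.
model = nn.Sequential(MetricLayer(64, 32), nn.ReLU(),
                      MetricLayer(32, 16), nn.ReLU(),
                      MetricLayer(16, 8))
x1, x2 = torch.randn(5, 64), torch.randn(5, 64)
dist = torch.norm(model(x1) - model(x2), dim=1)  # distances under the stacked metric
print(dist.shape)  # torch.Size([5])
```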
We propose an Active Learning approach to image segmentation that exploits geometric priors to streamline the annotation process. We demonstrate this for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels or voxels are most in need of annotation. For multi-class settings, we additionally introduce two novel uncertainty criteria. In the 3D case, we use the resulting uncertainty measure to show the annotator voxels lying on the same planar patch, which makes batch annotation much easier than if the voxels were randomly distributed in the volume. The planar patch is selected by a branch-and-bound algorithm that searches for the patch containing the most informative instances. We evaluate our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on regular images of horses and faces, and demonstrate a substantial performance increase over state-of-the-art approaches.
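Below is a minimal sketch of the classical uncertainty ingredient (per-pixel entropy of predicted class probabilities) that such an approach combines with geometric priors; the smoothness prior, the two new multi-class criteria, and the branch-and-bound patch search are not reproduced, and all shapes and values are illustrative.

```python
import numpy as np

def entropy_uncertainty(probs):
    """Per-pixel entropy of predicted class probabilities.
    probs: (H, W, C) array of class probabilities from any segmentation model.
    This is only the classical uncertainty term, not the paper's full criterion."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(64, 64, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
u = entropy_uncertainty(probs)

# Query the annotator about the most uncertain pixels first.
rows, cols = np.unravel_index(np.argsort(u, axis=None)[::-1][:100], u.shape)
print(list(zip(rows[:5], cols[:5])))
```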
In this paper we introduce a covariance framework for the analysis of EEG and MEG data that takes into account observed temporal stationarity on small time scales and trial-to-trial variations. We formulate a model for the covariance matrix, which is a Kronecker product of three components that correspond to space, time and epochs/trials, and consider maximum likelihood estimation of the unknown parameter values. An iterative algorithm that finds approximations of the maximum likelihood estimates is proposed. We perform a simulation study to assess the performance of the estimator and investigate the influence of different assumptions about the covariance factors on the estimated covariance matrix and on its components. We also illustrate our method on real EEG and MEG data sets. The proposed covariance model is applicable in a variety of cases where spontaneous EEG or MEG acts as a source of noise and realistic noise covariance estimates are needed for accurate dipole localization, such as in evoked activity studies, or where the properties of spontaneous EEG or MEG are themselves the topic of interest, such as in combined EEG/fMRI experiments in which the correlation between EEG and fMRI signals is investigated.
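A small NumPy sketch of the Kronecker-structured covariance model follows, with illustrative sizes and random positive-definite factors (the maximum likelihood algorithm itself is not reproduced); it also shows one reason the factorization keeps computations with the full matrix tractable.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    """A random symmetric positive-definite factor, standing in for an
    estimated trial, time, or spatial covariance component."""
    A = rng.normal(size=(n, n))
    return A @ A.T + n * np.eye(n)

# Covariance model from the abstract: Kronecker product of trial, time and
# space components (ordering and sizes here are illustrative).
C_trials, C_time, C_space = random_spd(3), random_spd(5), random_spd(4)
C = np.kron(C_trials, np.kron(C_time, C_space))   # (3*5*4) x (3*5*4)

# The log-determinant of the full covariance follows from the small factors:
# logdet(A (x) B (x) C) = sum over factors of logdet(factor) * N / size(factor).
sizes = np.array([3, 5, 4])
logdets = [np.linalg.slogdet(M)[1] for M in (C_trials, C_time, C_space)]
logdet_C = sum(ld * np.prod(sizes) / n for ld, n in zip(logdets, sizes))
print(np.allclose(logdet_C, np.linalg.slogdet(C)[1]))  # True
```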