We investigate the high-dimensional properties of robust regression estimators in the presence of heavy-tailed contamination of both the covariates and the response. In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise distributions, including cases where second and higher-order moments do not exist. We show that, despite being consistent, the Huber loss with optimally tuned location parameter $\delta$ is suboptimal in the high-dimensional regime in the presence of heavy-tailed noise, highlighting the necessity of further regularisation to achieve optimal performance. This result also uncovers the existence of a curious transition in $\delta$ as a function of the sample complexity and contamination. Moreover, we derive the decay rates for the excess risk of ridge regression. We show that, while ridge regression is both optimal and universal for noise distributions with finite second moment, its decay rate can be considerably faster when the covariates' second moment does not exist. Finally, we show that our formulas readily generalise to a richer family of models and data distributions, such as generalised linear estimation with arbitrary convex regularisation trained on mixture models.
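As an illustrative, non-authoritative sketch of the kind of estimator analysed above, the snippet below fits a ridge-regularised Huber M-estimator on synthetic heavy-tailed data; the location parameter (scikit-learn's `epsilon`) and the ridge strength `alpha` are placeholder values, not the optimally tuned quantities studied in the paper.

```python
# Minimal sketch: Huber loss with a ridge penalty vs. plain ridge regression
# on heavy-tailed covariates and noise. Parameter values are illustrative only.
import numpy as np
from sklearn.linear_model import HuberRegressor, Ridge

rng = np.random.default_rng(0)
n, d = 500, 50
theta = rng.normal(size=d) / np.sqrt(d)
X = rng.standard_t(df=3, size=(n, d))           # heavy-tailed covariates
y = X @ theta + rng.standard_t(df=2, size=n)    # heavy-tailed noise

huber = HuberRegressor(epsilon=1.35, alpha=1e-2).fit(X, y)  # Huber loss + ridge penalty
ridge = Ridge(alpha=1e-2).fit(X, y)                          # square loss + ridge penalty

for name, est in [("huber", huber), ("ridge", ridge)]:
    err = np.linalg.norm(est.coef_ - theta) ** 2 / d
    print(f"{name}: per-coordinate estimation error = {err:.4f}")
```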
This paper investigates goal-oriented communication for the remote estimation of multiple Markov sources in resource-constrained networks. An agent selects the update order of the sources and transmits packets to a remote destination over an unreliable channel with delay. The destination is tasked with source reconstruction for the purpose of actuation. We utilize the cost of actuation error (CAE) metric to capture the significance (semantics) of error at the point of actuation. We aim to find an optimal sampling policy that minimizes the time-averaged CAE subject to average resource constraints. We formulate this problem as an average-cost constrained Markov Decision Process (CMDP) and transform it into an unconstrained MDP by utilizing Lyapunov drift techniques. Then, we propose a low-complexity drift-plus-penalty (DPP) policy for systems with known source/channel statistics and a Lyapunov optimization-based deep reinforcement learning (LO-DRL) policy for unknown environments. Our policies achieve near-optimal performance in CAE minimization and significantly reduce the number of uninformative transmissions.
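For intuition, here is a hedged sketch of a generic drift-plus-penalty scheduler of the kind referred to above: at each slot the agent picks the action minimising V * penalty + Q * resource cost, then updates a virtual queue against the average-resource budget. The `estimated_cae` and `cost` functions below are hypothetical stand-ins, not the paper's exact cost-of-actuation-error formulation.

```python
# Generic drift-plus-penalty (DPP) step with a single virtual queue Q.
def dpp_step(Q, V, actions, estimated_cae, cost, budget):
    # choose the action with the smallest drift-plus-penalty value
    best = min(actions, key=lambda a: V * estimated_cae(a) + Q * cost(a))
    # Lyapunov virtual-queue update enforcing the average resource constraint
    Q = max(Q + cost(best) - budget, 0.0)
    return best, Q

# Toy usage: two sources plus "stay idle"; transmitting costs one channel use.
actions = ["idle", "sample_src_1", "sample_src_2"]
cae = {"idle": 3.0, "sample_src_1": 1.0, "sample_src_2": 2.0}.get
cost = {"idle": 0.0, "sample_src_1": 1.0, "sample_src_2": 1.0}.get

Q = 0.0
for t in range(5):
    action, Q = dpp_step(Q, V=10.0, actions=actions,
                         estimated_cae=cae, cost=cost, budget=0.5)
    print(t, action, round(Q, 2))
```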
Optimal allocation of resources across sub-units in centralized decision-making systems, such as bank branches or supermarket chains, is a classical application of operations research and management science. In this paper, we develop quantile allocation models to examine how much output and productivity could potentially increase if resources were efficiently allocated between units. We increase robustness to random noise and heteroscedasticity by locally estimating multiple production functions with convex quantile regression. The quantile allocation models then rely on the estimated shadow prices rather than detailed unit-level data, and allow for the entry and exit of units. Our empirical results on Finland's business sector reveal a large potential for productivity gains through better allocation, keeping the current technology and resources fixed.
Robots operating in human-centric environments require the integration of visual grounding and grasping capabilities to effectively manipulate objects based on user instructions. This work focuses on the task of referring grasp synthesis, which predicts a grasp pose for an object referred to through natural language in cluttered scenes. Existing approaches often employ multi-stage pipelines that first segment the referred object and then propose a suitable grasp, and they are evaluated on private datasets or simulators that do not capture the complexity of natural indoor scenes. To address these limitations, we develop a challenging benchmark based on cluttered indoor scenes from the OCID dataset, for which we generate referring expressions and connect them with 4-DoF grasp poses. Further, we propose a novel end-to-end model (CROG) that leverages the visual grounding capabilities of CLIP to learn grasp synthesis directly from image-text pairs. Our results show that a vanilla integration of CLIP with pretrained models transfers poorly to our challenging benchmark, while CROG achieves significant improvements in terms of both grounding and grasping. Extensive robot experiments in both simulation and hardware demonstrate the effectiveness of our approach in challenging interactive object grasping scenarios that include clutter.
In this paper, we study the problem of estimating the autocovariance sequence resulting from a reversible Markov chain. A motivating application for studying this problem is the estimation of the asymptotic variance in central limit theorems for Markov chains. We propose a novel shape-constrained estimator of the autocovariance sequence, which is based on the key observation that the representability of the autocovariance sequence as a moment sequence imposes certain shape constraints. We examine the theoretical properties of the proposed estimator and provide strong consistency guarantees for our estimator. In particular, for geometrically ergodic reversible Markov chains, we show that our estimator is strongly consistent for the true autocovariance sequence with respect to an $\ell_2$ distance, and that our estimator leads to strongly consistent estimates of the asymptotic variance. Finally, we perform empirical studies to illustrate the theoretical properties of the proposed estimator as well as to demonstrate the effectiveness of our estimator in comparison with other current state-of-the-art methods for Markov chain Monte Carlo variance estimation, including batch means, spectral variance estimators, and the initial convex sequence estimator.
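For context, the snippet below is a minimal, hedged sketch of the quantities involved: the empirical autocovariance sequence of a Markov chain trace and a Geyer-style initial positive sequence estimate of the asymptotic variance. It illustrates the estimation target and one of the baseline approaches, not the paper's shape-constrained estimator.

```python
# Empirical autocovariances and a simple initial-positive-sequence estimate
# of the asymptotic variance for a reversible Markov chain trace.
import numpy as np

def empirical_autocov(x, max_lag):
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return np.array([xc[: n - k] @ xc[k:] / n for k in range(max_lag + 1)])

def initial_positive_sequence_var(x, max_lag=1000):
    # sum pairs gamma_{2m} + gamma_{2m+1} while they remain positive
    gamma = empirical_autocov(x, max_lag)
    sigma2, m = -gamma[0], 0
    while 2 * m + 1 <= max_lag:
        pair = gamma[2 * m] + gamma[2 * m + 1]
        if pair <= 0:
            break
        sigma2 += 2.0 * pair
        m += 1
    return sigma2

# Toy usage: AR(1) chain with rho = 0.7; true asymptotic variance is 1 / (1 - rho)^2 ≈ 11.1
rng = np.random.default_rng(1)
rho, n = 0.7, 20000
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.normal()
print("estimated asymptotic variance:", initial_positive_sequence_var(x))
```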
The pursuit of autonomous driving technology hinges on the sophisticated integration of perception, decision-making, and control systems. Traditional approaches, both data-driven and rule-based, have been hindered by their inability to grasp the nuances of complex driving environments and the intentions of other road users. This has been a significant bottleneck, particularly in the development of the common-sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving. The advent of Visual Language Models (VLMs) represents a novel frontier in realizing fully autonomous vehicle driving. This report provides an exhaustive evaluation of the latest state-of-the-art VLM, \modelnamefull, and its application to autonomous driving scenarios. We explore the model's ability to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. Our comprehensive tests span from basic scene recognition to complex causal reasoning and real-time decision-making under varying conditions. Our findings reveal that \modelname demonstrates superior performance in scene understanding and causal reasoning compared to existing autonomous systems. It showcases the potential to handle out-of-distribution scenarios, recognize intentions, and make informed decisions in real driving contexts. However, challenges remain, particularly in direction discernment, traffic light recognition, vision grounding, and spatial reasoning tasks. These limitations underscore the need for further research and development. The project is available on GitHub for interested parties to access and utilize: \url{https://github.com/PJLab-ADG/GPT4V-AD-Exploration}
The deployment of machine learning solutions in real-world scenarios often involves addressing the challenge of out-of-distribution (OOD) detection. While significant efforts have been devoted to OOD detection in classical supervised settings, the context of weakly supervised learning, particularly the Multiple Instance Learning (MIL) framework, remains under-explored. In this study, we tackle this challenge by adapting post-hoc OOD detection methods to the MIL setting and introducing a novel benchmark specifically designed to assess OOD detection performance in weakly supervised scenarios. Across extensive experiments on diverse public datasets, KNN emerges as the best-performing method overall. However, it exhibits significant shortcomings on some datasets, emphasizing the complexity of this under-explored and challenging topic. Our findings shed light on the complex nature of OOD detection under the MIL framework, emphasizing the importance of developing novel, robust, and reliable methods that can generalize effectively in a weakly supervised context. The code for the paper is available here: https://github.com/loic-lb/OOD_MIL.
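To make the KNN-based post-hoc score concrete, here is a minimal sketch: the OOD score of a test embedding is its distance to the k-th nearest in-distribution embedding. In an MIL pipeline these would typically be bag-level (or pooled instance-level) features from the trained model; the random features below are placeholders, not the paper's setup.

```python
# Post-hoc KNN OOD scoring on (placeholder) bag embeddings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_ood_scores(id_feats, test_feats, k=10):
    # score = distance to the k-th nearest in-distribution embedding
    # (features are often L2-normalised first; omitted here for brevity)
    nn = NearestNeighbors(n_neighbors=k).fit(id_feats)
    dists, _ = nn.kneighbors(test_feats)
    return dists[:, -1]          # larger score => more likely OOD

rng = np.random.default_rng(0)
id_fit = rng.normal(size=(1000, 16))            # in-distribution bag embeddings
id_test = rng.normal(size=(50, 16))             # held-out in-distribution bags
ood_test = rng.normal(loc=3.0, size=(50, 16))   # shifted "OOD" bags

print("mean ID score: ", knn_ood_scores(id_fit, id_test).mean())
print("mean OOD score:", knn_ood_scores(id_fit, ood_test).mean())
```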
We propose a way to split a given bivariate P-recursive sequence into a summable part and a non-summable part in such a way that the non-summable part is minimal in some sense. This decomposition gives rise to a new reduction-based creative telescoping algorithm based on the concept of integral bases.
This study investigates the influence of varying illumination levels on architectural experiences by employing a comprehensive approach that combines self-reported assessments and neurophysiological measurements. Thirty participants were exposed to nine distinct illumination conditions in a controlled virtual reality environment. Subjective assessments were collected through questionnaires in which participants were asked to rate how pleasant, interesting, exciting, calming, complex, bright, and spacious they found the space. Objective measurements of brain activity were collected by electroencephalography (EEG). Data analysis demonstrated that illumination levels significantly influenced cognitive engagement and different architectural experience indicators. This alignment between the subjective assessments and the EEG data underscores the relationship between illuminance and architectural experience. The study bridges the gap between quantitative and qualitative assessments, providing a deeper understanding of the intricate connection between lighting conditions and human responses. These findings contribute to the enhancement of environmental design based on neuroscientific insights, emphasizing the critical role of well-considered daylighting design in positively influencing occupants' cognitive and emotional states within built environments.
Defensive deception is a promising approach for cyberdefense. Although defensive deception is increasingly popular in the research community, there has not been a systematic investigation of its key components, the underlying principles, and its tradeoffs in various problem settings. This survey paper focuses on defensive deception research centered on game theory and machine learning, since these are prominent families of artificial intelligence approaches that are widely employed in defensive deception. This paper brings forth insights, lessons, and limitations from prior work. It closes with an outline of some research directions to tackle major gaps in current defensive deception research.
The recent proliferation of knowledge graphs (KGs), coupled with incomplete or partial information in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to capture the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this end, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we encapsulate relation clusters and multi-hop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model, and we show marked performance gains in comparison to state-of-the-art methods on all datasets.
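As a hedged illustration of attention over a triple's local neighborhood (a hypothetical simplification, not the paper's exact architecture), the sketch below combines entity and relation embeddings of each neighbour into a message and aggregates the messages with softmax attention weights.

```python
# Attention-weighted aggregation of (entity, relation) neighbourhood messages.
import torch
import torch.nn.functional as F

def neighbourhood_attention(h_e, nbr_entities, nbr_relations, W, a):
    # h_e: (d,) centre entity; nbr_*: (n, d) neighbour entity / relation embeddings
    msgs = torch.cat([h_e.expand(nbr_entities.size(0), -1),
                      nbr_entities, nbr_relations], dim=-1) @ W    # (n, d_out)
    scores = F.leaky_relu(msgs @ a)                                # (n,) attention logits
    alpha = torch.softmax(scores, dim=0)                           # attention weights
    return (alpha.unsqueeze(-1) * msgs).sum(dim=0)                 # aggregated embedding

d, d_out, n = 8, 16, 5
W = torch.randn(3 * d, d_out)   # message projection
a = torch.randn(d_out)          # attention vector
out = neighbourhood_attention(torch.randn(d), torch.randn(n, d), torch.randn(n, d), W, a)
print(out.shape)  # torch.Size([16])
```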