We study the excess minimum risk in statistical inference, defined as the difference between the minimum expected loss in estimating a random variable from an observed feature vector and the minimum expected loss in estimating the same random variable from a transformation (statistic) of the feature vector. After characterizing lossless transformations, i.e., transformations for which the excess risk is zero for all loss functions, we construct a partitioning test statistic for the hypothesis that a given transformation is lossless and show that for i.i.d. data the test is strongly consistent. More generally, we develop information-theoretic upper bounds on the excess risk that uniformly hold over fairly general classes of loss functions. Based on these bounds, we introduce the notion of a delta-lossless transformation and give sufficient conditions for a given transformation to be universally delta-lossless. Applications to classification, nonparametric regression, portfolio strategies, information bottleneck, and deep learning, are also surveyed.
We propose a method for analyzing the distributed random coordinate descent algorithm for solving separable resource allocation problems in the context of an open multiagent system, where agents can be replaced during the process. In particular, we characterize the evolution of the distance to the minimizer in expectation by following a time-varying optimization approach which builds on two components. First, we establish the linear convergence of the algorithm in closed systems, in terms of the estimate towards the minimizer, for general graphs and appropriate step-size. Second, we estimate the change of the optimal solution after a replacement, in order to evaluate its effect on the distance between the current estimate and the minimizer. From these two elements, we derive stability conditions in open systems and establish the linear convergence of the algorithm towards a steady-state expected error. Our results enable to characterize the trade-off between speed of convergence and robustness to agent replacements, under the assumptions that local functions are smooth, strongly convex, and have their minimizers located in a given ball. The approach proposed in this paper can moreover be extended to other algorithms guaranteeing linear convergence in closed system.
Despite recognition of the relationship between infrastructure resilience and community recovery, very limited empirical evidence exists regarding the extent to which the disruptions in and restoration of infrastructure services contribute to the speed of community recovery. To address this gap, this study investigates the relationship between community and infrastructure systems in the context of hurricane impacts, focusing on the recovery dynamics of population activity and power infrastructure restoration. Empirical observational data were utilized to analyze the extent of impact, recovery duration, and recovery types of both systems in the aftermath of Hurricane Ida. The study reveals three key findings. First, power outage duration positively correlates with outage extent until a certain impact threshold is reached. Beyond this threshold, restoration time remains relatively stable regardless of outage magnitude. This finding underscores the need to strengthen power infrastructure, particularly in extreme weather conditions, to minimize outage restoration time. Second, power was fully restored in 70\% of affected areas before population activity levels normalized. This finding suggests the role infrastructure functionality plays in post-disaster community recovery. Interestingly, quicker power restoration did not equate to rapid population activity recovery due to other possible factors such as transportation, housing damage, and business interruptions. Finally, if power outages last beyond two weeks, community activity resumes before complete power restoration, indicating adaptability in prolonged outage scenarios. This implies the capacity of communities to adapt to ongoing power outages and continue daily life activities...
This study explores the concept of prominence as a candidate trait, understood as the perceived worthiness of attention candidates elicit from regular citizens in the context of low information elections. It proposes two dimensions of candidate prominence, political and public, operationalized as having held high visibility roles within the party and having social influence through social media presence. Employing a conjoint analysis experimental design, the study tests whether political and public prominence serve as heuristic mechanisms in low-information electoral settings by estimating conditional effects on respondents' self-assessed interest in politics, educational level and self-assessed ideological placement. The results contribute experimental evidence to support the hypothesis of differential heuristic choices by voters based on varying levels of perceived public and political prominence, conditional on voters' characteristics.
We consider the problem of learning error covariance matrices for robotic state estimation. The convergence of a state estimator to the correct belief over the robot state is dependent on the proper tuning of noise models. During inference, these models are used to weigh different blocks of the Jacobian and error vector resulting from linearization and hence, additionally affect the stability and convergence of the non-linear system. We propose a gradient-based method to estimate well-conditioned covariance matrices by formulating the learning process as a constrained bilevel optimization problem over factor graphs. We evaluate our method against baselines across a range of simulated and real-world tasks and demonstrate that our technique converges to model estimates that lead to better solutions as evidenced by the improved tracking accuracy on unseen test trajectories.
Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs), which are nowadays introducing disruptive effects in many AI-based applications, can be used to build or improve sequential recommendation approaches. Specifically, we devise and evaluate three approaches to leverage the power of LLMs in different ways. Our results from experiments on two datasets show that initializing the state-of-the-art sequential recommendation model BERT4Rec with embeddings obtained from an LLM improves NDCG by 15-20% compared to the vanilla BERT4Rec model. Furthermore, we find that a simple approach that leverages LLM embeddings for producing recommendations, can provide competitive performance by highlighting semantically related items. We publicly share the code and data of our experiments to ensure reproducibility.
We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames which can be solved in parallel. Applying optimal stopping theory we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.
We consider Upper Domination, the problem of finding the minimal dominating set of maximum cardinality. Very few exact algorithms have been described for solving Upper Domination. In particular, no binary programming formulations for Upper Domination have been described in literature, although such formulations have proved quite successful for other kinds of domination problems. We introduce two such binary programming formulations, and show that both can be improved with the addition of extra constraints which reduce the number of feasible solutions. We compare the performance of the formulations on various kinds of graphs, and demonstrate that (a) the additional constraints improve the performance of both formulations, and (b) the first formulation outperforms the second in most cases, although the second performs better for very sparse graphs. Also included is a short proof that the upper domination number of any generalized Petersen graph P(n,k) is equal to n.
Loop closing and relocalization are crucial techniques to establish reliable and robust long-term SLAM by addressing pose estimation drift and degeneration. This article begins by formulating loop closing and relocalization within a unified framework. Then, we propose a novel multi-head network LCR-Net to tackle both tasks effectively. It exploits novel feature extraction and pose-aware attention mechanism to precisely estimate similarities and 6-DoF poses between pairs of LiDAR scans. In the end, we integrate our LCR-Net into a SLAM system and achieve robust and accurate online LiDAR SLAM in outdoor driving environments. We thoroughly evaluate our LCR-Net through three setups derived from loop closing and relocalization, including candidate retrieval, closed-loop point cloud registration, and continuous relocalization using multiple datasets. The results demonstrate that LCR-Net excels in all three tasks, surpassing the state-of-the-art methods and exhibiting a remarkable generalization ability. Notably, our LCR-Net outperforms baseline methods without using a time-consuming robust pose estimator, rendering it suitable for online SLAM applications. To our best knowledge, the integration of LCR-Net yields the first LiDAR SLAM with the capability of deep loop closing and relocalization. The implementation of our methods will be made open-source.
This paper delves into the intersection of computational theory and music, examining the concept of undecidability and its significant, yet overlooked, implications within the realm of modern music composition and production. It posits that undecidability, a principle traditionally associated with theoretical computer science, extends its relevance to the music industry. The study adopts a multidimensional approach, focusing on five key areas: (1) the Turing completeness of Ableton, a widely used digital audio workstation, (2) the undecidability of satisfiability in sound creation utilizing an array of effects, (3) the undecidability of constraints on polymeters in musical compositions, (4) the undecidability of satisfiability in just intonation harmony constraints, and (5) the undecidability of "new ordering systems". In addition to providing theoretical proof for these assertions, the paper elucidates the practical relevance of these concepts for practitioners outside the field of theoretical computer science. The ultimate aim is to foster a new understanding of undecidability in music, highlighting its broader applicability and potential to influence contemporary computer-assisted (and traditional) music making.
Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on "emotion shift" frequency within a conversation, then the conversations are scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. For UC, it is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability in identifying the confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models and we are able to achieve new state-of-the-art results on four public ERC datasets.