This paper concerns the convergence of empirical measures in high dimensions. We propose a new class of probability metrics and show that under such metrics, the convergence is free of the curse of dimensionality (CoD). Such a feature is critical for high-dimensional analysis and stands in contrast to classical metrics ({\it e.g.}, the Wasserstein metric). The proposed metrics fall into the category of integral probability metrics, for which we specify criteria of test function spaces to guarantee the property of being free of CoD. Examples of the selected test function spaces include the reproducing kernel Hilbert spaces, Barron space, and flow-induced function spaces. Three applications of the proposed metrics are presented: 1. The convergence of empirical measure in the case of random variables; 2. The convergence of $n$-particle system to the solution to McKean-Vlasov stochastic differential equation; 3. The construction of an $\varepsilon$-Nash equilibrium for a homogeneous $n$-player game by its mean-field limit. As a byproduct, we prove that, given a distribution close to the target distribution measured by our metric and a certain representation of the target distribution, we can generate a distribution close to the target one in terms of the Wasserstein metric and relative entropy. Overall, we show that the proposed class of metrics is a powerful tool to analyze the convergence of empirical measures in high dimensions without CoD.
A lot of recent machine learning research papers have "Open-ended learning" in their title. But very few of them attempt to define what they mean when using the term. Even worse, when looking more closely there seems to be no consensus on what distinguishes open-ended learning from related concepts such as continual learning, lifelong learning or autotelic learning. In this paper, we contribute to fixing this situation. After illustrating the genealogy of the concept and more recent perspectives about what it truly means, we outline that open-ended learning is generally conceived as a composite notion encompassing a set of diverse properties. In contrast with these previous approaches, we propose to isolate a key elementary property of open-ended processes, which is to always produce novel elements from time to time over an infinite horizon. From there, we build the notion of open-ended learning problems and focus in particular on the subset of open-ended goal-conditioned reinforcement learning problems, as this framework facilitates the definition of learning a growing repertoire of skills. Finally, we highlight the work that remains to be performed to fill the gap between our elementary definition and the more involved notions of open-ended learning that developmental AI researchers may have in mind.
Consumers frequently interact with reputation systems to rate products, services, and deliveries. While past research extensively studied different conceptual approaches to realize such systems securely and privacy-preservingly, these concepts are not yet in use in business-to-business environments. In this paper, (1) we thus outline which specific challenges privacy-cautious stakeholders in volatile supply chain networks introduce, (2) give an overview of the diverse landscape of privacy-preserving reputation systems and their properties, and (3) based on well-established concepts from supply chain information systems and cryptography, we further propose an initial concept that accounts for the aforementioned challenges by utilizing fully homomorphic encryption. For future work, we identify the need of evaluating whether novel systems address the supply chain-specific privacy and confidentiality needs.
This paper is a compilation of well-known results about Zadoff-Chu sequences, including all proofs with a consistent mathematical notation, for easy reference. Moreover, for a Zadoff-Chu sequence $x_u[n]$ of prime length $N_{\text{ZC}}$ and root index $u$, a formula is derived that allows computing the first term (frequency zero) of its discrete Fourier transform, $X_u[0]$, with constant complexity independent of the sequence length, as opposed to accumulating all its $N_{\text{ZC}}$ terms. The formula stems from a famous result in analytic number theory and is an interesting complement to the fact that the discrete Fourier transform of a Zadoff-Chu sequence is itself a Zadoff-Chu sequence whose terms are scaled by $X_u[0]$. Finally, the paper concludes with a brief analysis of time-continuous signals derived from Zadoff-Chu sequences, especially those obtained by OFDM-modulating a Zadoff-Chu sequence.
A recent development in Bayesian optimization is the use of local optimization strategies, which can deliver strong empirical performance on high-dimensional problems compared to traditional global strategies. The "folk wisdom" in the literature is that the focus on local optimization sidesteps the curse of dimensionality; however, little is known concretely about the expected behavior or convergence of Bayesian local optimization routines. We first study the behavior of the local approach, and find that the statistics of individual local solutions of Gaussian process sample paths are surprisingly good compared to what we would expect to recover from global methods. We then present the first rigorous analysis of such a Bayesian local optimization algorithm recently proposed by M\"uller et al. (2021), and derive convergence rates in both the noisy and noiseless settings.
Because most of the scientific literature data is unmarked, it makes semantic representation learning based on unsupervised graph become crucial. At the same time, in order to enrich the features of scientific literature, a learning method of semantic representation of scientific literature based on adaptive features and graph neural network is proposed. By introducing the adaptive feature method, the features of scientific literature are considered globally and locally. The graph attention mechanism is used to sum the features of scientific literature with citation relationship, and give each scientific literature different feature weights, so as to better express the correlation between the features of different scientific literature. In addition, an unsupervised graph neural network semantic representation learning method is proposed. By comparing the mutual information between the positive and negative local semantic representation of scientific literature and the global graph semantic representation in the potential space, the graph neural network can capture the local and global information, thus improving the learning ability of the semantic representation of scientific literature. The experimental results show that the proposed learning method of semantic representation of scientific literature based on adaptive feature and graph neural network is competitive on the basis of scientific literature classification, and has achieved good results.
Robotic systems that are intended to augment human capabilities commonly require the use of semi-autonomous control and artificial sensing, while at the same time aiming to empower the user to make decisions and take actions. This work identifies principles and techniques from the literature that can help to resolve this apparent contradiction. It is postulated that augmentative robots must function as tools that have partial agency, as collaborative agents that provide conditional transparency, and ideally, serve as extensions of the human body.
This paper presents a distributed rule-based Lloyd algorithm (RBL) for multi-robot motion planning and control. The main limitations of the basic Loyd-based algorithm (LB) concern deadlock issues and the failure to address dynamic constraints effectively. Our contribution is twofold. First, we show how RBL is able to provide safety and convergence to the goal region without relying on communication between robots, nor neighbors control inputs, nor synchronization between the robots. We considered both case of holonomic and non-holonomic robots with control inputs saturation. Second, we show that the Lloyd-based algorithm (without rules) can be successfully used as a safety layer for learning-based approaches, leading to non-negligible benefits. We further prove the soundness, reliability, and scalability of RBL through extensive simulations, an updated comparison with the state of the art, and experimental validations on small-scale car-like robots.
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (eg. sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, ie. that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models as demonstrated on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviate catastrophic forgetting issues during adaptation.
Human-in-the-loop aims to train an accurate prediction model with minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and directly accomplish some tasks that are hard for computers in the pipeline with the help of machine-based approaches. In this paper, we survey existing works on human-in-the-loop from a data perspective and classify them into three categories with a progressive relationship: (1) the work of improving model performance from data processing, (2) the work of improving model performance through interventional model training, and (3) the design of the system independent human-in-the-loop. Using the above categorization, we summarize major approaches in the field, along with their technical strengths/ weaknesses, we have simple classification and discussion in natural language processing, computer vision, and others. Besides, we provide some open challenges and opportunities. This survey intends to provide a high-level summarization for human-in-the-loop and motivates interested readers to consider approaches for designing effective human-in-the-loop solutions.
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.