This paper describes a purely functional library for computing level-$p$-complexity of Boolean functions, and applies it to two-level iterated majority. Boolean functions are simply functions from $n$ bits to one bit, and they can describe digital circuits, voting systems, etc. An example of a Boolean function is majority, which returns the value that has majority among the $n$ input bits for odd $n$. The complexity of a Boolean function $f$ measures the cost of evaluating it: how many bits of the input are needed to be certain about the result of $f$. There are many competing complexity measures but we focus on level-$p$-complexity -- a function of the probability $p$ that a bit is 1. The level-$p$-complexity $D_p(f)$ is the minimum expected cost when the input bits are independent and identically distributed with Bernoulli($p$) distribution. We specify the problem as choosing the minimum expected cost of all possible decision trees -- which directly translates to a clearly correct, but very inefficient implementation. The library uses thinning and memoization for efficiency and type classes for separation of concerns. The complexity is represented using (sets of) polynomials, and the order relation used for thinning is implemented using polynomial factorisation and root-counting. Finally we compute the complexity for two-level iterated majority and improve on an earlier result by J.~Jansson.
We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to two tasks: speech resynthesis and speech-to-speech translation. In both cases, the benchmark evaluates the ability of the model to encode emphasis in the speech input and accurately reproduce it in the output, potentially across a change of speaker and language. As part of the evaluation pipeline, we introduce EmphaClass, a new model that classifies emphasis at the frame or word level.
In this paper, we define two particular forms of non-termination, namely loops and binary chains, in an abstract framework that encompasses term rewriting and logic programming. The definition of loops relies on the notion of compatibility of binary relations. We also present a syntactic criterion for the detection of a special case of binary chains. Moreover, we describe our implementation NTI and compare its results at the Termination Competition 2023 with those of leading analyzers.
This paper introduces a deep learning approach to dynamic spectrum access, leveraging the synergy of multi-modal image and spectrum data for the identification of potential transmitters. We consider an edge device equipped with a camera that is taking images of potential objects such as vehicles that may harbor transmitters. Recognizing the computational constraints and trust issues associated with on-device computation, we propose a collaborative system wherein the edge device communicates selectively processed information to a trusted receiver acting as a fusion center, where a decision is made to identify whether a potential transmitter is present, or not. To achieve this, we employ task-oriented communications, utilizing an encoder at the transmitter for joint source coding, channel coding, and modulation. This architecture efficiently transmits essential information of reduced dimension for object classification. Simultaneously, the transmitted signals may reflect off objects and return to the transmitter, allowing for the collection of target sensing data. Then the collected sensing data undergoes a second round of encoding at the transmitter, with the reduced-dimensional information communicated back to the fusion center through task-oriented communications. On the receiver side, a decoder performs the task of identifying a transmitter by fusing data received through joint sensing and task-oriented communications. The two encoders at the transmitter and the decoder at the receiver are jointly trained, enabling a seamless integration of image classification and wireless signal detection. Using AWGN and Rayleigh channel models, we demonstrate the effectiveness of the proposed approach, showcasing high accuracy in transmitter identification across diverse channel conditions while sustaining low latency in decision making.
Metric embeddings traditionally study how to map $n$ items to a target metric space such that distance lengths are not heavily distorted; but what if we only care to preserve the relative order of the distances (and not their length)? In this paper, we are motivated by the following basic question: given triplet comparisons of the form ``item $i$ is closer to item $j$ than to item $k$,'' can we find low-dimensional Euclidean representations for the $n$ items that respect those distance comparisons? Such order-preserving embeddings naturally arise in important applications and have been studied since the 1950s, under the name of ordinal or non-metric embeddings. Our main results are: 1. Nearly-Tight Bounds on Triplet Dimension: We introduce the natural concept of triplet dimension of a dataset, and surprisingly, we show that in order for an ordinal embedding to be triplet-preserving, its dimension needs to grow as $\frac n2$ in the worst case. This is optimal (up to constant) as $n-1$ dimensions always suffice. 2. Tradeoffs for Dimension vs (Ordinal) Relaxation: We then relax the requirement that every triplet should be exactly preserved and present almost tight lower bounds for the maximum ratio between distances whose relative order was inverted by the embedding; this ratio is known as (ordinal) relaxation in the literature and serves as a counterpart to (metric) distortion. 3. New Bounds on Terminal and Top-$k$-NNs Embeddings: Going beyond triplets, we then study two well-motivated scenarios where we care about preserving specific sets of distances (not necessarily triplets). The first scenario is Terminal Ordinal Embeddings and the second scenario is top-$k$-NNs Ordinal Embeddings. To the best of our knowledge, these are some of the first tradeoffs on triplet-preserving ordinal embeddings and the first study of Terminal and Top-$k$-NNs Ordinal Embeddings.
This paper jointly considers privacy preservation and Byzantine-robustness in decentralized learning. In a decentralized network, honest-but-curious agents faithfully follow the prescribed algorithm, but expect to infer their neighbors' private data from messages received during the learning process, while dishonest-and-Byzantine agents disobey the prescribed algorithm, and deliberately disseminate wrong messages to their neighbors so as to bias the learning process. For this novel setting, we investigate a generic privacy-preserving and Byzantine-robust decentralized stochastic gradient descent (SGD) framework, in which Gaussian noise is injected to preserve privacy and robust aggregation rules are adopted to counteract Byzantine attacks. We analyze its learning error and privacy guarantee, discovering an essential tradeoff between privacy preservation and Byzantine-robustness in decentralized learning -- the learning error caused by defending against Byzantine attacks is exacerbated by the Gaussian noise added to preserve privacy. For a class of state-of-the-art robust aggregation rules, we give unified analysis of the "mixing abilities". Building upon this analysis, we reveal how the "mixing abilities" affect the tradeoff between privacy preservation and Byzantine-robustness. The theoretical results provide guidelines for achieving a favorable tradeoff with proper design of robust aggregation rules. Numerical experiments are conducted and corroborate our theoretical findings.
This paper proposes well-conditioned boundary integral equations based on the Burton-Miller method for solving transmission problems. The Burton-Miller method is widely accepted as a highly accurate numerical method based on the boundary integral equation for solving exterior wave problems. While this method can also be applied to solve the transmission problems, a straightforward formulation may yield ill-conditioned integral equations. Consequently, the resulting linear algebraic equations may involve a coefficient matrix with a huge condition number, leading to poor convergence of Krylov-based linear solvers. To address this challenge, our study enhances Burton-Miller-type boundary integral equations tailored for transmission problems by exploiting the Calderon formula. In cases where a single material exists in an unbounded host medium, we demonstrate the formulation of the boundary integral equation such that the underlying integral operator ${\cal A}$ is spectrally well-conditioned. Specifically, ${\cal A}$ can be designed in a simple manner that ensures ${\cal A}^2$ is bounded and has only a single eigenvalue accumulation point. Furthermore, we extend our analysis to the multi-material case, proving that the square of the proposed operator has only a few eigenvalues except for a compact perturbation. Through numerical examples of several benchmark problems, we illustrate that our formulation reduces the iteration number required by iterative linear solvers, even in the presence of material junction points; locations where three or more sub-domains meet on the boundary.
Agent-based modeling and simulation has evolved as a powerful tool for modeling complex systems, offering insights into emergent behaviors and interactions among diverse agents. Integrating large language models into agent-based modeling and simulation presents a promising avenue for enhancing simulation capabilities. This paper surveys the landscape of utilizing large language models in agent-based modeling and simulation, examining their challenges and promising future directions. In this survey, since this is an interdisciplinary field, we first introduce the background of agent-based modeling and simulation and large language model-empowered agents. We then discuss the motivation for applying large language models to agent-based simulation and systematically analyze the challenges in environment perception, human alignment, action generation, and evaluation. Most importantly, we provide a comprehensive overview of the recent works of large language model-empowered agent-based modeling and simulation in multiple scenarios, which can be divided into four domains: cyber, physical, social, and hybrid, covering simulation of both real-world and virtual environments. Finally, since this area is new and quickly evolving, we discuss the open problems and promising future directions.
Meta-reinforcement learning algorithms can enable robots to acquire new skills much more quickly, by leveraging prior experience to learn how to learn. However, much of the current research on meta-reinforcement learning focuses on task distributions that are very narrow. For example, a commonly used meta-reinforcement learning benchmark uses different running velocities for a simulated robot as different tasks. When policies are meta-trained on such narrow task distributions, they cannot possibly generalize to more quickly acquire entirely new tasks. Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors. In this paper, we propose an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. Our aim is to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks. We evaluate 6 state-of-the-art meta-reinforcement learning and multi-task learning algorithms on these tasks. Surprisingly, while each task and its variations (e.g., with different object positions) can be learned with reasonable success, these algorithms struggle to learn with multiple tasks at the same time, even with as few as ten distinct training tasks. Our analysis and open-source environments pave the way for future research in multi-task learning and meta-learning that can enable meaningful generalization, thereby unlocking the full potential of these methods.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the same or even more importance compared with the answer itself, since it makes the question and answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.