We consider the message complexity of verifying whether a given subgraph of the communication network forms a tree with specific properties both in the KT-$\rho$ (nodes know their $\rho$-hop neighborhood, including node IDs) and the KT-$0$ (nodes do not have this knowledge) models. We develop a rather general framework that helps in establishing tight lower bounds for various tree verification problems. We also consider two different verification requirements: namely that every node detects in the case the input is incorrect, as well as the requirement that at least one node detects. The results are stronger than previous ones in the sense that we assume that each node knows the number $n$ of nodes in the graph (in some cases) or an $\alpha$ approximation of $n$ (in other cases). For spanning tree verification, we show that the message complexity inherently depends on the quality of the given approximation of $n$: We show a tight lower bound of $\Omega(n^2)$ for the case $\alpha \ge \sqrt{2}$ and a much better upper bound (i.e., $O(n \log n)$) when nodes are given a tighter approximation. On the other hand, our framework also yields an $\Omega(n^2)$ lower bound on the message complexity of verifying a minimum spanning tree (MST), which reveals a polynomial separation between ST verification and MST verification. This result holds for randomized algorithms with perfect knowledge of the network size, and even when just one node detects illegal inputs, thus improving over the work of Kor, Korman, and Peleg (2013). For verifying a $d$-approximate BFS tree, we show that the same lower bound holds even if nodes know $n$ exactly, however, the lower bound is sensitive to $d$, which is the stretch parameter.
Population protocols are a model of distributed computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs to decide some property of their initial configuration. We investigate the behaviour of population protocols under adversarial faults that cause agents to silently crash and no longer interact with other agents. As a starting point, we consider the property ``the number of agents exceeds a given threshold $t$'', represented by the predicate $x \geq t$, and show that the standard protocol for $x \geq t$ is very fragile: one single crash in a computation with $x:=2t-1$ agents can already cause the protocol to answer incorrectly that $x \geq t$ does not hold. However, a slightly less known protocol is robust: for any number $t' \geq t$ of agents, at least $t' - t+1$ crashes must occur for the protocol to answer that the property does not hold. We formally define robustness for arbitrary population protocols, and investigate the question whether every predicate computable by population protocols has a robust protocol. Angluin et al. proved in 2007 that population protocols decide exactly the Presburger predicates, which can be represented as Boolean combinations of threshold predicates of the form $\sum_{i=1}^n a_i \cdot x_i \geq t$ for $a_1,...,a_n, t \in \mathbb{Z}$ and modulo prdicates of the form $\sum_{i=1}^n a_i \cdot x_i \bmod m \geq t $ for $a_1, \ldots, a_n, m, t \in \mathbb{N}$. We design robust protocols for all threshold and modulo predicates. We also show that, unfortunately, the techniques in the literature that construct a protocol for a Boolean combination of predicates given protocols for the conjuncts do not preserve robustness. So the question remains open.
We propose an instrumental variable framework for identifying and estimating causal effects of discrete and continuous treatments with binary instruments. The basis of our approach is a local copula representation of the joint distribution of the potential outcomes and unobservables determining treatment assignment. This representation allows us to introduce an identifying assumption, so-called copula invariance, that restricts the local dependence of the copula with respect to the treatment propensity. We show that copula invariance identifies treatment effects for the entire population and other subpopulations such as the treated. The identification results are constructive and lead to practical estimation and inference procedures based on distribution regression. An application to estimating the effect of sleep on well-being uncovers interesting patterns of heterogeneity.
The study of biomechanics during locomotion provides valuable insights into the effects of varying conditions on specific movement patterns. This research focuses on examining the influence of different shoe parameters on walking biomechanics, aiming to understand their impact on gait patterns. To achieve this, various methodologies are explored to estimate human body biomechanics, including computer vision techniques and wearable devices equipped with advanced sensors. Given privacy considerations and the need for robust, accurate measurements, this study employs wearable devices with Inertial Measurement Unit (IMU) sensors. These devices offer a non-invasive, precise, and high-resolution approach to capturing biomechanical data during locomotion. Raw sensor data collected from wearable devices is processed using an Extended Kalman Filter to reduce noise and extract meaningful information. This includes calculating joint angles throughout the gait cycle, enabling a detailed analysis of movement dynamics. The analysis identifies correlations between shoe parameters and key gait characteristics, such as stability, mobility, step time, and propulsion forces. The findings provide deeper insights into how footwear design influences walking efficiency and biomechanics. This study paves the way for advancements in footwear technology and contributes to the development of personalized solutions for enhancing gait performance and mobility.
We study decentralized multiagent optimization over networks, modeled as undirected graphs. The optimization problem consists of minimizing a nonconvex smooth function plus a convex extended-value function, which enforces constraints or extra structure on the solution (e.g., sparsity, low-rank). We further assume that the objective function satisfies the Kurdyka-{\L}ojasiewicz (KL) property, with given exponent $\theta\in [0,1)$. The KL property is satisfied by several (nonconvex) functions of practical interest, e.g., arising from machine learning applications; in the centralized setting, it permits to achieve strong convergence guarantees. Here we establish convergence of the same type for the notorious decentralized gradient-tracking-based algorithm SONATA. Specifically, $\textbf{(i)}$ when $\theta\in (0,1/2]$, the sequence generated by SONATA converges to a stationary solution of the problem at R-linear rate;$ \textbf{(ii)} $when $\theta\in (1/2,1)$, sublinear rate is certified; and finally $\textbf{(iii)}$ when $\theta=0$, the iterates will either converge in a finite number of steps or converges at R-linear rate. This matches the convergence behavior of centralized proximal-gradient algorithms except when $\theta=0$. Numerical results validate our theoretical findings.
Neural networks are commonly known to be vulnerable to adversarial attacks mounted through subtle perturbation on the input data. Recent development in voice-privacy protection has shown the positive use cases of the same technique to conceal speaker's voice attribute with additive perturbation signal generated by an adversarial network. This paper examines the reversibility property where an entity generating the adversarial perturbations is authorized to remove them and restore original speech (e.g., the speaker him/herself). A similar technique could also be used by an investigator to deanonymize a voice-protected speech to restore criminals' identities in security and forensic analysis. In this setting, the perturbation generative module is assumed to be known in the removal process. To this end, a joint training of perturbation generation and removal modules is proposed. Experimental results on the LibriSpeech dataset demonstrated that the subtle perturbations added to the original speech can be predicted from the anonymized speech while achieving the goal of privacy protection. By removing these perturbations from the anonymized sample, the original speech can be restored. Audio samples can be found in \url{//voiceprivacy.github.io/Perturbation-Generation-Removal/}.
Recent improvements in the quality of the generations by large language models have spurred research into identifying machine-generated text. Such work often presents high-performing detectors. However, humans and machines can produce text in different styles and domains, yet the performance impact of such on machine generated text detection systems remains unclear. In this paper, we audit the classification performance for detecting machine-generated text by evaluating on texts with varying writing styles. We find that classifiers are highly sensitive to stylistic changes and differences in text complexity, and in some cases degrade entirely to random classifiers. We further find that detection systems are particularly susceptible to misclassify easy-to-read texts while they have high performance for complex texts, leading to concerns about the reliability of detection systems. We recommend that future work attends to stylistic factors and reading difficulty levels of human-written and machine-generated text.
A multichannel extension to the RVQGAN neural coding method is proposed, and realized for data-driven compression of third-order Ambisonics audio. The input- and output layers of the generator and discriminator models are modified to accept multiple (16) channels without increasing the model bitrate. We also propose a loss function for accounting for spatial perception in immersive reproduction, and transfer learning from single-channel models. Listening test results with 7.1.4 immersive playback show that the proposed extension is suitable for coding scene-based, 16-channel Ambisonics content with good quality at 16 kbps when trained and tested on the EigenScape database. The model has potential applications for learning other types of content and multichannel formats.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (eg. sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, ie. that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models as demonstrated on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviate catastrophic forgetting issues during adaptation.
While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task has been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by huge margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.