Neuro-symbolic reinforcement learning (NS-RL) has emerged as a promising paradigm for explainable decision-making, characterized by the interpretability of symbolic policies. NS-RL entails structured state representations for tasks with visual observations, but previous methods are unable to refine the structured states with rewards due to a lack of efficiency. Accessibility also remains to be an issue, as extensive domain knowledge is required to interpret symbolic policies. In this paper, we present a framework for learning structured states and symbolic policies jointly, whose key idea is to distill vision foundation models into a scalable perception module and refine it during policy learning. Moreover, we design a pipeline to generate language explanations for policies and decisions using large language models. In experiments on nine Atari tasks, we verify the efficacy of our approach, and we also present explanations for policies and decisions.
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data while preserving user privacy. However, the typical paradigm of FL faces challenges of both privacy and robustness: the transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates. Current solutions attempting to address both problems under the one-server FL setting fall short in the following aspects: 1) designed for simple validity checks that are insufficient against advanced attacks (e.g., checking norm of individual update); and 2) partial privacy leakage for more complicated robust aggregation algorithms (e.g., distances between model updates are leaked for multi-Krum). In this work, we formalize a novel security notion of aggregated privacy that characterizes the minimum amount of user information, in the form of some aggregated statistics of users' updates, that is necessary to be revealed to accomplish more advanced robust aggregation. We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy. As concrete instantiations of PriRoAgg, we construct two secure and robust protocols based on state-of-the-art robust algorithms, for which we provide full theoretical analyses on security and complexity. Extensive experiments are conducted for these protocols, demonstrating their robustness against various model integrity attacks, and their efficiency advantages over baselines.
Tensor processing units (TPUs) are one of the most well-known machine learning (ML) accelerators utilized at large scale in data centers as well as in tiny ML applications. TPUs offer several improvements and advantages over conventional ML accelerators, like graphical processing units (GPUs), being designed specifically to perform the multiply-accumulate (MAC) operations required in the matrix-matrix and matrix-vector multiplies extensively present throughout the execution of deep neural networks (DNNs). Such improvements include maximizing data reuse and minimizing data transfer by leveraging the temporal dataflow paradigms provided by the systolic array architecture. While this design provides a significant performance benefit, the current implementations are restricted to a single dataflow consisting of either input, output, or weight stationary architectures. This can limit the achievable performance of DNN inference and reduce the utilization of compute units. Therefore, the work herein consists of developing a reconfigurable dataflow TPU, called the Flex-TPU, which can dynamically change the dataflow per layer during run-time. Our experiments thoroughly test the viability of the Flex-TPU comparing it to conventional TPU designs across multiple well-known ML workloads. The results show that our Flex-TPU design achieves a significant performance increase of up to 2.75x compared to conventional TPU, with only minor area and power overheads.
In federated learning (FL), the common paradigm that FedAvg proposes and most algorithms follow is that clients train local models with their private data, and the model parameters are shared for central aggregation, mostly averaging. In this paradigm, the communication cost is often a challenge, as modern massive neural networks can contain millions to billions parameters. We suggest that clients do not share model parameters but local data summaries, to decrease the cost of sharing. We develop a new algorithm FedLog with Bayesian inference, which shares only sufficient statistics of local data. FedLog transmits messages as small as the last layer of the original model. We conducted comprehensive experiments to show we outperform other FL algorithms that aim at decreasing the communication cost. To provide formal privacy guarantees, we further extend FedLog with differential privacy and show the trade-off between privacy budget and accuracy.
Preference-based reinforcement learning (PbRL) is emerging as a promising approach to teaching robots through human comparative feedback, sidestepping the need for complex reward engineering. However, the substantial volume of feedback required in existing PbRL methods often lead to reliance on synthetic feedback generated by scripted teachers. This approach necessitates intricate reward engineering again and struggles to adapt to the nuanced preferences particular to human-robot interaction (HRI) scenarios, where users may have unique expectations toward the same task. To address these challenges, we introduce PrefCLM, a novel framework that utilizes crowdsourced large language models (LLMs) as simulated teachers in PbRL. We utilize Dempster-Shafer Theory to fuse individual preferences from multiple LLM agents at the score level, efficiently leveraging their diversity and collective intelligence. We also introduce a human-in-the-loop pipeline that facilitates collective refinements based on user interactive feedback. Experimental results across various general RL tasks show that PrefCLM achieves competitive performance compared to traditional scripted teachers and excels in facilitating more more natural and efficient behaviors. A real-world user study (N=10) further demonstrates its capability to tailor robot behaviors to individual user preferences, significantly enhancing user satisfaction in HRI scenarios.
Parameter-efficient transfer learning (PETL) has emerged as a flourishing research field for adapting large pre-trained models to downstream tasks, greatly reducing trainable parameters while grappling with memory challenges during fine-tuning. To address it, memory-efficient series (METL) avoid backpropagating gradients through the large backbone. However, they compromise by exclusively relying on frozen intermediate outputs and limiting the exhaustive exploration of prior knowledge from pre-trained models. Moreover, the dependency and redundancy between cross-layer features are frequently overlooked, thereby submerging more discriminative representations and causing an inherent performance gap (vs. conventional PETL methods). Hence, we propose an innovative METL strategy called SHERL for resource-limited scenarios to decouple the entire adaptation into two successive and complementary processes. In the early route, intermediate outputs are consolidated via an anti-redundancy operation, enhancing their compatibility for subsequent interactions; thereby in the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead and regulate these fairly flexible features into more adaptive and powerful representations for new domains. Extensive ablations on vision-and-language and language-only tasks show that SHERL combines the strengths of both parameter and memory-efficient techniques, performing on-par or better across diverse architectures with lower memory during fine-tuning. Our code is publicly available at: //github.com/Paranioar/SHERL.
Federated learning is one popular paradigm to train a joint model in a distributed, privacy-preserving environment. But partial annotations pose an obstacle meaning that categories of labels are heterogeneous over clients. We propose to learn a joint backbone in a federated manner, while each site receives its own multi-label segmentation head. By using Bayesian techniques we observe that the different segmentation heads although only trained on the individual client's labels also learn information about the other labels not present at the respective site. This information is encoded in their predictive uncertainty. To obtain a final prediction we leverage this uncertainty and perform a weighted averaging of the ensemble of distributed segmentation heads, which allows us to segment "locally unknown" structures. With our method, which we refer to as FUNAvg, we are even on-par with the models trained and tested on the same dataset on average. The code is publicly available at //github.com/Cardio-AI/FUNAvg.
Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids (in Lagrangian descriptions). Existing approaches, however, require the supervision of consecutive particle properties, including positions and velocities. In this paper, we consider a partially observable scenario known as fluid dynamics grounding, that is, inferring the state transitions and interactions within the fluid particle systems from sequential visual observations of the fluid surface. We propose a differentiable two-stage network named NeuroFluid. Our approach consists of (i) a particle-driven neural renderer, which involves fluid physical properties into the volume rendering function, and (ii) a particle transition model optimized to reduce the differences between the rendered and the observed images. NeuroFluid provides the first solution to unsupervised learning of particle-based fluid dynamics by training these two models jointly. It is shown to reasonably estimate the underlying physics of fluids with different initial shapes, viscosity, and densities. It is a potential alternative approach to understanding complex fluid mechanics, such as turbulence, that are difficult to model using traditional methods of mathematical physics.
There recently has been a surge of interest in developing a new class of deep learning (DL) architectures that integrate an explicit time dimension as a fundamental building block of learning and representation mechanisms. In turn, many recent results show that topological descriptors of the observed data, encoding information on the shape of the dataset in a topological space at different scales, that is, persistent homology of the data, may contain important complementary information, improving both performance and robustness of DL. As convergence of these two emerging ideas, we propose to enhance DL architectures with the most salient time-conditioned topological information of the data and introduce the concept of zigzag persistence into time-aware graph convolutional networks (GCNs). Zigzag persistence provides a systematic and mathematically rigorous framework to track the most important topological features of the observed data that tend to manifest themselves over time. To integrate the extracted time-conditioned topological descriptors into DL, we develop a new topological summary, zigzag persistence image, and derive its theoretical stability guarantees. We validate the new GCNs with a time-aware zigzag topological layer (Z-GCNETs), in application to traffic forecasting and Ethereum blockchain price prediction. Our results indicate that Z-GCNET outperforms 13 state-of-the-art methods on 4 time series datasets.
Semi-supervised learning on class-imbalanced data, although a realistic problem, has been under studied. While existing semi-supervised learning (SSL) methods are known to perform poorly on minority classes, we find that they still generate high precision pseudo-labels on minority classes. By exploiting this property, in this work, we propose Class-Rebalancing Self-Training (CReST), a simple yet effective framework to improve existing SSL methods on class-imbalanced data. CReST iteratively retrains a baseline SSL model with a labeled set expanded by adding pseudo-labeled samples from an unlabeled set, where pseudo-labeled samples from minority classes are selected more frequently according to an estimated class distribution. We also propose a progressive distribution alignment to adaptively adjust the rebalancing strength dubbed CReST+. We show that CReST and CReST+ improve state-of-the-art SSL algorithms on various class-imbalanced datasets and consistently outperform other popular rebalancing methods.
Graph-based semi-supervised learning (SSL) is an important learning problem where the goal is to assign labels to initially unlabeled nodes in a graph. Graph Convolutional Networks (GCNs) have recently been shown to be effective for graph-based SSL problems. GCNs inherently assume existence of pairwise relationships in the graph-structured data. However, in many real-world problems, relationships go beyond pairwise connections and hence are more complex. Hypergraphs provide a natural modeling tool to capture such complex relationships. In this work, we explore the use of GCNs for hypergraph-based SSL. In particular, we propose HyperGCN, an SSL method which uses a layer-wise propagation rule for convolutional neural networks operating directly on hypergraphs. To the best of our knowledge, this is the first principled adaptation of GCNs to hypergraphs. HyperGCN is able to encode both the hypergraph structure and hypernode features in an effective manner. Through detailed experimentation, we demonstrate HyperGCN's effectiveness at hypergraph-based SSL.