Recent quadrotor vehicles have transcended conventional designs, placing greater emphasis on foldable and reconfigurable bodies. However, the state of the art still focuses on the mechanical feasibility of such designs, with limited discussion of the vehicle's tracking performance during configuration switching. In this paper, we propose a complete control and planning framework for attitude tracking during configuration switching that curbs switch-induced disturbances, which can violate safety constraints and cause crashes. The framework comprises a morphology-aware adaptive controller with an estimator that accounts for parameter variation, and a minimum-jerk trajectory planner that enables stable flight while switching. Stability analysis for attitude tracking is presented using the theory of switched systems, and simulation results validate the proposed framework for a foldable quadrotor's flight through a passageway.
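As a rough illustration of the planning component, the sketch below evaluates a standard minimum-jerk (quintic) profile between two attitude set-points over a switching window; the boundary conditions, duration, and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def min_jerk(q0, qf, T, t):
    """Minimum-jerk interpolation from q0 to qf over duration T,
    with zero velocity and acceleration at both endpoints."""
    s = np.clip(t / T, 0.0, 1.0)
    return q0 + (qf - q0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

# Example: roll-angle reference during a 0.5 s configuration switch (illustrative values).
t = np.linspace(0.0, 0.5, 50)
roll_ref = min_jerk(q0=0.0, qf=0.3, T=0.5, t=t)
```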
In this work, we formulate a novel framework for adversarial robustness using the manifold hypothesis. Our framework provides sufficient conditions for defending against adversarial examples. Using this formulation together with variational inference, we develop a test-time defense method that combines manifold learning with the Bayesian framework to provide adversarial robustness without the need for adversarial training. We show that the proposed approach can provide adversarial robustness even if attackers are aware of the existence of the test-time defense. In addition, our approach can also serve as a test-time defense mechanism for variational autoencoders.
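A minimal sketch of the test-time idea, assuming a pre-trained VAE encoder/decoder and a downstream classifier (all placeholders here): the input is projected back onto the learned manifold by optimizing the latent code to reconstruct it, and the classifier is applied to the purified reconstruction. This only illustrates manifold-based purification, not the paper's exact variational-inference procedure.

```python
import torch
import torch.nn as nn

# Placeholder networks standing in for a pre-trained VAE and classifier.
enc = nn.Linear(784, 32)      # encoder mean head
dec = nn.Linear(32, 784)      # decoder
clf = nn.Linear(784, 10)      # downstream classifier

def purify_and_classify(x, steps=50, lr=0.1):
    """Project x onto the decoder's manifold, then classify the reconstruction."""
    z = enc(x).detach().clone().requires_grad_(True)   # initialize latent from the encoder
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = torch.sigmoid(dec(z))
        loss = ((recon - x) ** 2).mean() + 1e-3 * (z ** 2).mean()  # reconstruction + prior term
        loss.backward()
        opt.step()
    with torch.no_grad():
        return clf(torch.sigmoid(dec(z))).argmax(dim=-1)

x = torch.rand(4, 784)          # stand-in for (possibly adversarial) inputs
preds = purify_and_classify(x)
```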
The currently dominant wearable body-motion sensor is the inertial measurement unit (IMU). This work presents an alternative wearable motion-sensing approach: human body capacitance (HBC, also commonly described as the body-area electric field). While less robust for tracking posture and trajectory, HBC has two properties that make it attractive. First, the sensing node does not need to be deployed on the body part being tracked. Second, HBC is sensitive to the body's interaction with its surroundings, including both touching and being in the immediate proximity of people and objects. We first describe the sensing principle of HBC, the sensor architecture and implementation, and the evaluation methods. We then present two case studies demonstrating the usefulness of HBC as a complement or alternative to IMUs. In the first, we explore exercise recognition and repetition counting for seven machine-free, leg-only exercises and eleven general gym workouts using HBC and IMU signals. HBC sensing shows significant advantages over the IMU signals in classifying (0.89 vs. 0.78 F-score) and counting (0.982 vs. 0.938 accuracy) the leg-only exercises. For the general gym workouts, HBC improves recognition only for certain workouts, such as the adductor exercise, in which the legs alone complete the movement, but it outperforms the IMU for workout counting (0.800 vs. 0.756 with the sensors worn on the wrist). In the second case study, we recognize object-manipulation actions and physical collaboration between users with a wrist-worn HBC sensing unit. We detect collaboration between users with an F-score of 0.69 when receiving data from a single user and 0.78 when receiving data from both users. The capacitive sensor improves the recognition of collaborative activities by 16\% in F-score over a single wrist-worn accelerometer.
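As a hedged illustration of the repetition-counting step, the sketch below counts peaks in a smoothed HBC-like signal with scipy; the synthetic signal, sampling rate, and thresholds are placeholders, not the parameters used in the study.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 50                                     # assumed sampling rate in Hz
t = np.arange(0, 20, 1 / fs)
signal = np.sin(2 * np.pi * 0.5 * t) + 0.1 * np.random.randn(t.size)  # synthetic "exercise" signal

# Smooth with a short moving average, then count peaks as repetitions.
kernel = np.ones(fs // 5) / (fs // 5)
smoothed = np.convolve(signal, kernel, mode="same")
peaks, _ = find_peaks(smoothed, height=0.5, distance=fs)  # at most one rep per second
print("estimated repetitions:", len(peaks))
```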
The exponential growth of machine learning (ML) has prompted great interest in quantifying the uncertainty of each prediction at a user-defined confidence level, since ML is increasingly used in high-stakes settings. Reliable ML via prediction intervals (PIs) that jointly account for epistemic and aleatoric uncertainty is therefore imperative and is a step towards increased trust in model forecasts. Conformal prediction (CP) is a lightweight, distribution-free uncertainty quantification framework that works for any black-box model, yielding PIs that are valid under the mild assumption of exchangeability. CP-type methods are gaining popularity because they are easy to implement and computationally cheap; however, the exchangeability assumption immediately rules out time-series forecasting. Although recent papers tackle distribution shift and asymptotic versions of CP, this is not enough for the general time-series forecasting problem of producing H-step-ahead valid PIs. To attain this goal, we propose a new method called AEnbMIMOCQR (adaptive ensemble batch multi-input multi-output conformalized quantile regression), which produces asymptotically valid PIs and is appropriate for heteroscedastic time series. We compare the proposed method against state-of-the-art methods on the NN5 forecasting competition dataset. All code and data needed to reproduce the experiments are made available.
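For intuition, here is a minimal sketch of the conformalized-quantile-regression step that underlies CQR-style methods: conformity scores on a calibration set yield a correction that widens (or narrows) the predicted quantile interval. The quantile forecasts below are placeholders; the adaptive ensemble, multi-output, and online-update machinery of AEnbMIMOCQR is not shown.

```python
import numpy as np

def conformalize(q_lo_cal, q_hi_cal, y_cal, q_lo_test, q_hi_test, alpha=0.1):
    """Split-CQR correction: widen quantile forecasts by the calibrated score quantile."""
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)      # conformity scores
    n = len(y_cal)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n                 # finite-sample adjustment
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    return q_lo_test - qhat, q_hi_test + qhat

# Toy usage with placeholder quantile forecasts and calibration targets.
rng = np.random.default_rng(0)
point = rng.normal(size=200)                      # placeholder point forecasts on the calibration set
y_cal = point + rng.normal(scale=0.8, size=200)   # observed calibration targets
lo, hi = conformalize(point - 1.0, point + 1.0, y_cal,
                      q_lo_test=np.array([-1.0]), q_hi_test=np.array([1.0]))
```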
Contact-rich manipulation is challenging due to the dynamically changing physical constraints imposed by the contact mode changes that occur during manipulation. This paper proposes a versatile local planning and control framework for contact-rich manipulation that determines the continuous control action under variable contact modes online. We model the physical characteristics of contact-rich manipulation with quasistatic dynamics and complementarity constraints. We then propose a linear complementarity quadratic program (LCQP) that efficiently determines the control action while implicitly deciding on the contact modes under these constraints. In the LCQP, we relax the complementarity constraints to alleviate ill-conditioned problems that are typically caused by measurement noise or model mismatch. We conduct dynamical simulations in a 3D physics simulator and demonstrate that the proposed method can achieve various contact-rich manipulation tasks by determining the control action, including the contact modes, in real time.
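The toy sketch below illustrates the relaxed-complementarity idea on a one-dimensional pushing example (block, wall, quasistatic update): the exact condition 0 <= lambda ⟂ phi >= 0 is replaced by lambda >= 0, phi >= 0, lambda*phi <= eps and handed to a generic NLP solver. The model, cost weights, and solver are illustrative assumptions, not the paper's LCQP formulation.

```python
import numpy as np
from scipy.optimize import minimize

dt, x, x_wall, x_goal, eps = 0.1, 0.9, 1.0, 1.2, 1e-4

def step(z):
    u, lam = z
    return x + dt * (u - lam)            # quasistatic update: the wall force opposes the push

def cost(z):
    return (step(z) - x_goal) ** 2 + 1e-3 * z[0] ** 2

constraints = [
    {"type": "ineq", "fun": lambda z: x_wall - step(z)},                 # gap phi >= 0
    {"type": "ineq", "fun": lambda z: eps - z[1] * (x_wall - step(z))},  # relaxed lambda * phi <= eps
]
res = minimize(cost, x0=np.zeros(2), method="SLSQP",
               bounds=[(-5.0, 5.0), (0.0, None)], constraints=constraints)
u_opt, lam_opt = res.x
print("control:", u_opt, "contact force:", lam_opt)
```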
Most previous neural text-to-speech (TTS) methods are based on supervised learning, which means they depend on a large training dataset and struggle to achieve comparable performance under low-resource conditions. To address this issue, we propose a semi-supervised learning method for neural TTS with limited labeled target data, which also mitigates the exposure-bias problem of previous auto-regressive models. Specifically, we pre-train a reference model based on FastSpeech2 on abundant source data and fine-tune it on a limited target dataset. Meanwhile, pseudo labels generated by the original reference model are used to further guide the fine-tuned model's training, providing a regularization effect and reducing overfitting on the limited target data. Experimental results show that our proposed semi-supervised learning scheme with limited target data significantly improves voice quality on test data, achieving naturalness and robustness in speech synthesis.
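At a high level, the fine-tuning objective can be sketched as a supervised loss on the limited target data plus a consistency term toward the frozen reference model's pseudo labels; the mel-spectrogram shapes, loss choices, and weighting below are illustrative assumptions rather than the exact training recipe.

```python
import numpy as np

def l1(a, b):
    return np.abs(a - b).mean()

def semi_supervised_loss(pred_mel, target_mel, teacher_mel, alpha=0.5):
    """Supervised reconstruction on labeled target data plus pseudo-label regularization."""
    supervised = l1(pred_mel, target_mel)     # ground-truth mel from the limited target set
    pseudo = l1(pred_mel, teacher_mel)        # pseudo label from the frozen reference model
    return supervised + alpha * pseudo

# Toy shapes: (frames, mel bins).
rng = np.random.default_rng(0)
pred, target, teacher = (rng.normal(size=(120, 80)) for _ in range(3))
print(semi_supervised_loss(pred, target, teacher))
```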
Knowledge Graph Question Answering (KGQA) involves retrieving entities as answers from a Knowledge Graph (KG) using natural language queries. The challenge is to learn to reason over question-relevant KG facts that traverse KG entities and lead to the question answers. To facilitate reasoning, the question is decoded into instructions, which are dense question representations used to guide the KG traversals. However, if the derived instructions do not exactly match the underlying KG information, they may lead to reasoning under irrelevant context. Our method, termed ReaRev, introduces a new approach to KGQA reasoning with respect to both instruction decoding and execution. To improve instruction decoding, we perform reasoning in an adaptive manner, where KG-aware information is used to iteratively update the initial instructions. To improve instruction execution, we emulate breadth-first search (BFS) with graph neural networks (GNNs). The BFS strategy treats the instructions as a set and allows our method to decide on their execution order on the fly. Experimental results on three KGQA benchmarks demonstrate ReaRev's effectiveness compared with previous state-of-the-art methods, especially when the KG is incomplete or when we tackle complex questions. Our code is publicly available at //github.com/cmavro/ReaRev_KGQA.
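The sketch below gives a minimal, numpy-only picture of one instruction-conditioned message-passing step: each instruction gates node features, messages are aggregated over the KG adjacency (so the reachable frontier grows one hop per layer, as in BFS), and the instructions are treated as an unordered set by summing their contributions. Dimensions and the gating form are assumptions, not ReaRev's exact update.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reason_step(H, A, instructions, W):
    """One GNN reasoning step over node states H (n x d) and adjacency A (n x n)."""
    out = np.zeros_like(H)
    for ins in instructions:                   # instructions handled as a set (order-free sum)
        gate = sigmoid(H @ ins)                # per-node relevance to this instruction
        out += A @ (gate[:, None] * H)         # propagate gated features to neighbors (one hop)
    return np.maximum(out @ W, 0.0)            # linear transform + ReLU

rng = np.random.default_rng(0)
n, d = 6, 8
H, A, W = rng.normal(size=(n, d)), (rng.random((n, n)) < 0.3).astype(float), rng.normal(size=(d, d))
H = reason_step(H, A, instructions=[rng.normal(size=d) for _ in range(2)], W=W)
```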
Distributed stochastic gradient descent (SGD) with gradient compression has emerged as a communication-efficient solution to accelerate distributed learning. Top-K sparsification is one of the most popular gradient compression methods and sparsifies the gradient to a fixed degree during model training. However, an approach to adaptively adjust the degree of sparsification so as to maximize model performance or training speed has been lacking. This paper addresses this issue by proposing a novel adaptive Top-K SGD framework that adapts the degree of sparsification at each gradient descent step to maximize convergence performance by exploring the trade-off between communication cost and convergence error. First, we derive an upper bound on the convergence error for the adaptive sparsification scheme and the loss function. Second, we design the algorithm by minimizing the convergence error under communication cost constraints. Finally, numerical results show that the proposed adaptive Top-K SGD achieves a significantly better convergence rate than state-of-the-art methods.
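A minimal numpy sketch of the compression step: keep the k largest-magnitude gradient entries, accumulate the rest as local error feedback, and let k vary per step (here via a placeholder schedule; the paper instead chooses k by minimizing a convergence-error bound under a communication budget).

```python
import numpy as np

def top_k_sparsify(grad, k, residual):
    """Top-K sparsification with error feedback: returns the sparse update and new residual."""
    corrected = grad + residual                        # add back previously dropped mass
    idx = np.argpartition(np.abs(corrected), -k)[-k:]  # indices of the k largest magnitudes
    sparse = np.zeros_like(corrected)
    sparse[idx] = corrected[idx]
    return sparse, corrected - sparse                  # residual carries the dropped entries

rng = np.random.default_rng(0)
grad, residual = rng.normal(size=10_000), np.zeros(10_000)
for step in range(5):
    k = max(100, 1000 - 200 * step)                    # placeholder adaptive schedule for k
    sparse_grad, residual = top_k_sparsify(grad, k, residual)
```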
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from the observation that energy-based models exhibit free-energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically show that a simple yet efficient energy-based sampling strategy selects more valuable target samples than existing approaches, which require particular architectures or the computation of distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty into every selection round. Meanwhile, by compacting the free energy of target data around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
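For intuition, a free-energy score can be computed from classifier logits as F(x) = -T * logsumexp(logits / T); target samples with high free energy (poorly covered by the source-trained energy landscape) are queried first. The temperature, logits, and selection budget below are placeholder values, not EADA's full grouped selection criterion.

```python
import numpy as np
from scipy.special import logsumexp

def free_energy(logits, T=1.0):
    """Free energy of each sample under an energy-based view of the classifier."""
    return -T * logsumexp(logits / T, axis=1)

rng = np.random.default_rng(0)
target_logits = rng.normal(size=(500, 10))        # placeholder logits on unlabeled target data
budget = 20
query_idx = np.argsort(free_energy(target_logits))[-budget:]  # highest free energy = most "unfamiliar"
```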
While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting in which the model is randomly initialized. In this work, we consider semi-supervised learning and transfer learning jointly, leading to a more practical and competitive paradigm that can utilize powerful pre-trained models from a source domain as well as labeled and unlabeled data in the target domain. To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization, which consists of two complementary components: Adaptive Knowledge Consistency (AKC) between the source and target models on selected examples, and Adaptive Representation Consistency (ARC) of the target model between labeled and unlabeled examples. Examples involved in the consistency regularization are adaptively selected according to their potential contribution to the target task. We conduct extensive experiments on several popular benchmarks, including CUB-200-2011, MIT Indoor-67, and MURA, by fine-tuning an ImageNet pre-trained ResNet-50 model. Results show that our proposed adaptive consistency regularization outperforms state-of-the-art semi-supervised learning techniques such as Pseudo Label, Mean Teacher, and MixMatch. Moreover, our algorithm is orthogonal to existing methods and can thus gain additional improvements on top of MixMatch and FixMatch. Our code is available at //github.com/SHI-Labs/Semi-Supervised-Transfer-Learning.
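The two consistency terms can be sketched as follows: a knowledge-consistency term that pulls the target model's predictions toward the frozen source model's on confidently predicted examples, and a representation-consistency term that aligns feature statistics of labeled and unlabeled target data. The confidence thresholding and mean matching below are illustrative simplifications of the adaptive selection, not the paper's exact losses.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def akc_loss(source_logits, target_logits, conf_thresh=0.8):
    """Adaptive knowledge consistency: KL(source || target) on confident source predictions."""
    p, q = softmax(source_logits), softmax(target_logits)
    keep = p.max(axis=1) > conf_thresh                   # adaptively select examples
    if not keep.any():
        return 0.0
    return np.mean(np.sum(p[keep] * (np.log(p[keep] + 1e-8) - np.log(q[keep] + 1e-8)), axis=1))

def arc_loss(labeled_feats, unlabeled_feats):
    """Adaptive representation consistency: match feature means of labeled and unlabeled data."""
    return np.sum((labeled_feats.mean(axis=0) - unlabeled_feats.mean(axis=0)) ** 2)

rng = np.random.default_rng(0)
loss = (akc_loss(rng.normal(size=(32, 67)), rng.normal(size=(32, 67)))
        + 0.1 * arc_loss(rng.normal(size=(16, 128)), rng.normal(size=(64, 128))))
```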
Previous work on event extraction has mainly focused on predicting event triggers and argument roles, treating entity mentions as being provided by human annotators. This is unrealistic, as entity mentions are usually predicted by existing toolkits whose errors can propagate to event trigger and argument role recognition. Only a few recent studies have addressed this problem by jointly predicting entity mentions, event triggers, and arguments. However, such work is limited to using discrete engineered features to represent contextual information for the individual tasks and their interactions. In this work, we propose a novel model that jointly predicts entity mentions, event triggers, and arguments based on shared hidden representations from deep learning. The experiments demonstrate the benefits of the proposed method, which achieves state-of-the-art performance for event extraction.
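The joint architecture can be sketched as a shared encoder whose hidden states feed three task-specific heads (entity mentions, event triggers, and argument roles); the toy dimensions, label-set sizes, and random "encoder" output below are placeholders for the deep-learning representations described in the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
seq_len, hidden = 12, 64
H = rng.normal(size=(seq_len, hidden))          # shared hidden states (placeholder encoder output)

W_entity = rng.normal(size=(hidden, 5))         # BIO-style entity-mention tags (toy label set)
W_trigger = rng.normal(size=(hidden, 9))        # event-trigger types (toy label set)
W_argument = rng.normal(size=(2 * hidden, 4))   # argument roles for (trigger, entity) pairs

entity_probs = softmax(H @ W_entity)            # per-token entity predictions
trigger_probs = softmax(H @ W_trigger)          # per-token trigger predictions
pair = np.concatenate([H[3], H[7]])             # example (trigger token, entity token) pair
argument_probs = softmax(pair @ W_argument)     # role prediction from the shared representations
```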