Multi-robot system for manufacturing is an Industry Internet of Things (IIoT) paradigm with significant operational cost savings and productivity improvement, where Unmanned Aerial Vehicles (UAVs) are employed to control and implement collaborative productions without human intervention. This mission-critical system relies on 3-Dimension (3-D) scene recognition to improve operation accuracy in the production line and autonomous piloting. However, implementing 3-D point cloud learning, such as Pointnet, is challenging due to limited sensing and computing resources equipped with UAVs. Therefore, we propose a Digital Twin (DT) empowered Knowledge Distillation (KD) method to generate several lightweight learning models and select the optimal model to deploy on UAVs. With a digital replica of the UAVs preserved at the edge server, the DT system controls the model sharing network topology and learning model structure to improve recognition accuracy further. Moreover, we employ network calculus to formulate and solve the model sharing configuration problem toward minimal resource consumption, as well as convergence. Simulation experiments are conducted over a popular point cloud dataset to evaluate the proposed scheme. Experiment results show that the proposed model sharing scheme outperforms the individual model in terms of computing resource consumption and recognition accuracy.
Incomplete multi-view clustering is a hot and emerging topic. It is well known that unavoidable data incompleteness greatly weakens the effective information of multi-view data. To date, existing incomplete multi-view clustering methods usually bypass unavailable views according to prior missing information, which is considered as a second-best scheme based on evasion. Other methods that attempt to recover missing information are mostly applicable to specific two-view datasets. To handle these problems, in this paper, we propose an information recovery-driven deep incomplete multi-view clustering network, termed as RecFormer. Concretely, a two-stage autoencoder network with the self-attention structure is built to synchronously extract high-level semantic representations of multiple views and recover the missing data. Besides, we develop a recurrent graph reconstruction mechanism that cleverly leverages the restored views to promote the representation learning and the further data reconstruction. Visualization of recovery results are given and sufficient experimental results confirm that our RecFormer has obvious advantages over other top methods.
Robots can greatly alleviate physical demands on construction workers while enhancing both the productivity and safety of construction projects. Leveraging a Building Information Model (BIM) offers a natural and promising approach to drive a robotic construction workflow. However, because of uncertainties inherent on construction sites, such as discrepancies between the designed and as-built workpieces, robots cannot solely rely on the BIM to guide field construction work. Human workers are adept at improvising alternative plans with their creativity and experience and thus can assist robots in overcoming uncertainties and performing construction work successfully. This research introduces an interactive closed-loop digital twin system that integrates a BIM into human-robot collaborative construction workflows. The robot is primarily driven by the BIM, but it adaptively adjusts its plan based on actual site conditions while the human co-worker supervises the process. If necessary, the human co-worker intervenes in the robot's plan by changing the task sequence or target position, requesting a new motion plan, or modifying the construction component(s)/material(s) to help the robot navigate uncertainties. To investigate the physical deployment of the system, a drywall installation case study is conducted with an industrial robotic arm in a laboratory. In addition, a block pick-and-place experiment is carried out to evaluate system performance. Integrating the flexibility of human workers and the autonomy and accuracy afforded by the BIM, the system significantly increases the robustness of construction robots in the performance of field construction work.
The problem of generalization and transportation of treatment effect estimates from a study sample to a target population is central to empirical research and statistical methodology. In both randomized experiments and observational studies, weighting methods are often used with this objective. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. In this work, we provide a justification and an implementation for weighting in a single step. We show a formal connection between this one-step method and inverse probability and inverse odds weighting. We demonstrate that the resulting estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We evaluate the performance of the one-step estimator in a simulation study. We illustrate its use in a case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California. We provide R code implementing the methodology.
Finding multiple temporal relationships among locations can benefit a bunch of urban applications, such as dynamic offline advertising and smart public transport planning. While some efforts have been made on finding static relationships among locations, little attention is focused on studying time-aware location relationships. Indeed, abundant location-based human activities are time-varying and the availability of these data enables a new paradigm for understanding the dynamic relationships in a period among connective locations. To this end, we propose to study a new problem, namely multi-Temporal relationship inference among locations (Trial for short), where the major challenge is how to integrate dynamic and geographical influence under the relationship sparsity constraint. Specifically, we propose a solution to Trial with a graph learning scheme, which includes a spatially evolving graph neural network (SEENet) with two collaborative components: spatially evolving graph convolution module (SEConv) and spatially evolving self-supervised learning strategy (SE-SSL). SEConv performs the intra-time aggregation and inter-time propagation to capture the multifaceted spatially evolving contexts from the view of location message passing. In addition, SE-SSL designs time-aware self-supervised learning tasks in a global-local manner with additional evolving constraint to enhance the location representation learning and further handle the relationship sparsity. Finally, experiments on four real-world datasets demonstrate the superiority of our method over several state-of-the-art approaches.
Deep neural network (DNN) models have become increasingly crucial components in intelligent software systems. However, training a DNN model is typically expensive in terms of both time and money. To address this issue, researchers have recently focused on reusing existing DNN models - borrowing the idea of code reuse in software engineering. However, reusing an entire model could cause extra overhead or inherits the weakness from the undesired functionalities. Hence, existing work proposes to decompose an already trained model into modules, i.e., modularizing-after-training, and enable module reuse. Since trained models are not built for modularization, modularizing-after-training incurs huge overhead and model accuracy loss. In this paper, we propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT). We train a model to be structurally modular through two loss functions that optimize intra-module cohesion and inter-module coupling. We have implemented the proposed approach for modularizing Convolutional Neural Network (CNN) models in this work. The evaluation results on representative models demonstrate that MwT outperforms the state-of-the-art approach. Specifically, the accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline. The kernel retention rate of the modules generated by MwT is only 14.58%, with a reduction of 74.31% over the state-of-the-art approach. Furthermore, the total time cost required for training and modularizing is only 108 minutes, half of the baseline.
In this paper, we present a comprehensive investigation of the challenges of Monocular Visual Simultaneous Localization and Mapping (vSLAM) methods for underwater robots. While significant progress has been made in state estimation methods that utilize visual data in the past decade, most evaluations have been limited to controlled indoor and urban environments, where impressive performance was demonstrated. However, these techniques have not been extensively tested in extremely challenging conditions, such as underwater scenarios where factors such as water and light conditions, robot path, and depth can greatly impact algorithm performance. Hence, our evaluation is conducted in real-world AUV scenarios as well as laboratory settings which provide precise external reference. A focus is laid on understanding the impact of environmental conditions, such as optical properties of the water and illumination scenarios, on the performance of monocular vSLAM methods. To this end, we first show that all methods perform very well in in-air settings and subsequently show the degradation of their performance in challenging underwater environments. The final goal of this study is to identify techniques that can improve accuracy and robustness of SLAM methods in such conditions. To achieve this goal, we investigate the potential of image enhancement techniques to improve the quality of input images used by the SLAM methods, specifically in low visibility and extreme lighting scenarios in scattering media. We present a first evaluation on calibration maneuvers and simple image restoration techniques to determine their ability to enable or enhance the performance of monocular SLAM methods in underwater environments.
The agricultural sector currently faces significant challenges in water resource conservation and crop yield optimization, primarily due to concerns over freshwater scarcity. Traditional irrigation scheduling methods often prove inadequate in meeting the needs of large-scale irrigation systems. To address this issue, this paper proposes a predictive irrigation scheduler that leverages the three paradigms of machine learning to optimize irrigation schedules. The proposed scheduler employs the k-means clustering approach to divide the field into distinct irrigation management zones based on soil hydraulic parameters and topology information. Furthermore, a long short-term memory network is employed to develop dynamic models for each management zone, enabling accurate predictions of soil moisture dynamics. Formulated as a mixed-integer model predictive control problem, the scheduler aims to maximize water uptake while minimizing overall water consumption and irrigation costs. To tackle the mixed-integer optimization challenge, the proximal policy optimization algorithm is utilized to train a reinforcement learning agent responsible for making daily irrigation decisions. To evaluate the performance of the proposed scheduler, a 26.4-hectare field in Lethbridge, Canada, was chosen as a case study for the 2015 and 2022 growing seasons. The results demonstrate the superiority of the proposed scheduler compared to a traditional irrigation scheduling method in terms of water use efficiency and crop yield improvement for both growing seasons. Notably, the proposed scheduler achieved water savings ranging from 6.4% to 22.8%, along with yield increases ranging from 2.3% to 4.3%.
Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large quantities of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge towards downstream tasks, but ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting problem and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. It is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and fine-tuned model for task-specific knowledge. Besides, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervisions for pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high sparsity scenarios. With only 3% model parameters reserved (i.e., 97% sparsity), CAP successfully achieves 99.2% and 96.3% of the original BERT performance in QQP and MNLI tasks. In addition, our probing experiments demonstrate that the model pruned by CAP tends to achieve better generalization ability.
To solve the information explosion problem and enhance user experience in various online applications, recommender systems have been developed to model users preferences. Although numerous efforts have been made toward more personalized recommendations, recommender systems still suffer from several challenges, such as data sparsity and cold start. In recent years, generating recommendations with the knowledge graph as side information has attracted considerable interest. Such an approach can not only alleviate the abovementioned issues for a more accurate recommendation, but also provide explanations for recommended items. In this paper, we conduct a systematical survey of knowledge graph-based recommender systems. We collect recently published papers in this field and summarize them from two perspectives. On the one hand, we investigate the proposed algorithms by focusing on how the papers utilize the knowledge graph for accurate and explainable recommendation. On the other hand, we introduce datasets used in these works. Finally, we propose several potential research directions in this field.
To address the sparsity and cold start problem of collaborative filtering, researchers usually make use of side information, such as social networks or item attributes, to improve recommendation performance. This paper considers the knowledge graph as the source of side information. To address the limitations of existing embedding-based and path-based methods for knowledge-graph-aware recommendation, we propose Ripple Network, an end-to-end framework that naturally incorporates the knowledge graph into recommender systems. Similar to actual ripples propagating on the surface of water, Ripple Network stimulates the propagation of user preferences over the set of knowledge entities by automatically and iteratively extending a user's potential interests along links in the knowledge graph. The multiple "ripples" activated by a user's historically clicked items are thus superposed to form the preference distribution of the user with respect to a candidate item, which could be used for predicting the final clicking probability. Through extensive experiments on real-world datasets, we demonstrate that Ripple Network achieves substantial gains in a variety of scenarios, including movie, book and news recommendation, over several state-of-the-art baselines.