Reinforcement Learning (RL) is increasingly being used to learn and adapt application behavior in many domains, including large-scale and safety-critical systems such as autonomous driving. With the advent of plug-and-play RL libraries, its applicability has further increased, enabling users to integrate RL algorithms into their applications. We note, however, that the majority of such code is not developed by RL engineers, which may consequently lead to poor program quality, resulting in bugs, suboptimal performance, and maintainability and evolution problems for RL-based projects. In this paper we begin exploring this hypothesis, specifically for code that uses RL, by analyzing projects found in the wild and assessing their quality from a software engineering perspective. Our study covers 24 popular RL-based Python projects, analyzed with standard software engineering metrics. Our results, aligned with similar analyses of ML code in general, show that popular and widely reused RL repositories contain many code smells (3.95% of the code base on average), significantly affecting the projects' maintainability. The most common code smells detected are long method and long method chain, highlighting problems in the definition and interaction of agents. The detected code smells point to problems in the separation of responsibilities and in the appropriateness of current abstractions for defining RL algorithms.
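As a minimal sketch of how such smells can be flagged, the snippet below uses Python's standard ast module to detect the "long method" smell in an RL code base; the statement-count threshold and the agent.py file name are illustrative assumptions, not the metrics or tooling used in the study.

    import ast

    LONG_METHOD_THRESHOLD = 38  # hypothetical threshold, not the study's value

    def long_methods(source: str):
        """Yield (name, n_statements) for functions exceeding the threshold."""
        tree = ast.parse(source)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # count statements inside the function (excluding the def itself)
                n = sum(1 for sub in ast.walk(node) if isinstance(sub, ast.stmt)) - 1
                if n > LONG_METHOD_THRESHOLD:
                    yield node.name, n

    with open("agent.py") as f:  # hypothetical RL agent module
        for name, n in long_methods(f.read()):
            print(f"long method: {name} ({n} statements)")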
Sequential decoding, commonly applied to substitution channels, is a sub-optimal alternative to Viterbi decoding with significantly reduced memory costs. In this work, a sequential decoder for convolutional codes over channels prone to insertion, deletion, and substitution errors is described and analyzed. Our decoder expands the code trellis with a new channel-state variable, called the drift state, as proposed by Davey and MacKay. A decoding metric on that trellis suitable for sequential decoding is derived, generalizing the original Fano metric. The decoder is also extended to facilitate the simultaneous decoding of multiple received sequences that arise from a single transmitted sequence. In low-noise environments, our decoding approach reduces the decoding complexity by a couple of orders of magnitude compared to Viterbi's algorithm, albeit at slightly higher bit error rates. An analytical method to determine the computational cutoff rate is also suggested. This analysis is supported by numerical evaluations of bit error rates and computational complexity, which are compared against optimal Viterbi decoding.
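For context only, the classical per-branch Fano metric that the drift-aware metric generalizes can be written as follows, with R denoting the code rate; the generalized metric derived in the paper is not reproduced here.

    \mu(x; y) = \log_2 \frac{P(y \mid x)}{P(y)} - R

Branch metrics of this form are accumulated along a path, and sequential decoders such as the stack or Fano algorithm extend only the most promising paths.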
The Internet of Things (IoT) is a network of devices that communicate with each other over the internet and provide intelligence to industry and people. These devices often run in potentially hostile environments, so the need for security is critical. Trust Management aims to ensure the reliability of the network by assigning to every node a trust value indicating its trust level. This paper presents an exhaustive survey of current Trust Management techniques for IoT, a classification based on the methods used in each work, and a discussion of the open challenges and future research directions.
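For illustration only, a simple weighted trust update of the kind frequently encountered in this literature is sketched below; the weighting scheme and values are assumptions, not a technique proposed by any particular surveyed work.

    def update_trust(direct, recommendations, alpha=0.7, beta=0.3):
        """Combine a node's direct observations with neighbours' recommendations.

        direct: trust in [0, 1] from the node's own interactions
        recommendations: list of trust values in [0, 1] reported by neighbours
        """
        indirect = sum(recommendations) / len(recommendations) if recommendations else 0.5
        return alpha * direct + beta * indirect

    # e.g. a node trusted 0.9 directly but reported as ~0.4 by peers
    print(update_trust(0.9, [0.4, 0.5, 0.3]))  # -> 0.75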
In real-world applications, speaker recognition models often face various domain-mismatch challenges, leading to a significant drop in performance. Although numerous domain adaptation techniques have been developed to address this issue, almost all existing methods focus on a simple configuration where the model is trained in one domain and deployed in another. However, real-world environments are often complex and may contain multiple domains, making methods designed for one-to-one adaptation suboptimal. In this paper, we propose a self-supervised learning method to tackle this multi-domain adaptation problem. Building upon the basic self-supervised adaptation algorithm, we design three strategies to make it suitable for multi-domain adaptation: an in-domain negative sampling strategy, a MoCo-like memory bank scheme, and CORAL-like distribution alignment. We conducted experiments using VoxCeleb2 as the source-domain dataset and CN-Celeb1 as the multi-domain target dataset. The results demonstrate that our method clearly outperforms the basic self-supervised adaptation method, which simply treats the data of CN-Celeb1 as a single domain. Importantly, the improvement is consistent across nearly all in-domain and cross-domain tests, demonstrating the effectiveness of the proposed method.
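As a rough sketch of the third ingredient, a CORAL-style alignment loss that matches the second-order statistics of source- and target-domain speaker embeddings could look as follows; this is an assumption-laden illustration, not the authors' implementation.

    import torch

    def coral_loss(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """Align covariances of two embedding batches of shape (n, d)."""
        d = source.size(1)

        def covariance(x):
            x = x - x.mean(dim=0, keepdim=True)
            return (x.t() @ x) / (x.size(0) - 1)

        return ((covariance(source) - covariance(target)) ** 2).sum() / (4 * d * d)

    # e.g. 128 source and 128 target speaker embeddings of dimension 256
    loss = coral_loss(torch.randn(128, 256), torch.randn(128, 256))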
Deep Reinforcement Learning (RL) has shown promise in addressing complex robotic challenges. In real-world applications, RL is often accompanied by failsafe controllers as a last resort to avoid catastrophic events. While necessary for safety, these interventions can result in undesirable behaviors, such as abrupt braking or aggressive steering. This paper proposes two safety intervention reduction methods, proactive replacement and proactive projection, which modify the agent's action if it would lead to a failsafe intervention. These approaches are compared to state-of-the-art constrained RL on the OpenAI Safety Gym benchmark and a human-robot collaboration task. Our study demonstrates that combining our methods with provably safe RL leads to high-performing policies with zero safety violations and a low number of failsafe interventions. Our versatile approach can be applied to a wide range of real-world robotic tasks, effectively improving safety without sacrificing task performance.
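The proactive-projection idea can be sketched as follows for a finite candidate set of verified-safe actions; the helper names and the projection onto a discrete set are hypothetical simplifications, not the paper's formulation.

    import numpy as np

    def proactively_project(action, safe_action_set, would_trigger_failsafe):
        """Return the original action if safe, else its nearest safe alternative."""
        if not would_trigger_failsafe(action):
            return action
        # Euclidean projection onto a finite set of candidate safe actions
        distances = [np.linalg.norm(action - a) for a in safe_action_set]
        return safe_action_set[int(np.argmin(distances))]

Proactive replacement would instead substitute a separately chosen safe action rather than the nearest one.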
Digital Twins (DTs) optimize operations and monitor performance in Smart Critical Systems (SCS) domains such as smart grids and manufacturing. DT-based cybersecurity solutions are in their infancy and lack a unified strategy to overcome the challenges of the next three to five decades. These challenges include reliable data accessibility from Cyber-Physical Systems (CPS) operating in unpredictable environments. Reliable data sources are pivotal for intelligent cybersecurity operations, aided by underlying modeling capabilities across the SCS lifecycle, necessitating a DT. To address these challenges, we propose Security Digital Twins (SDTs) that collect real-time data from CPS and adopt the Shift Left and Shift Right (SLSR) design paradigm to implement both design-time and runtime cybersecurity operations. By incorporating virtual CPS components (VCs) in the Cloud/Edge, data fusion into SDT models is enabled with high reliability, providing threat insights and enhancing cyber resilience. The VC-enabled SDT ensures accurate data feeds for security monitoring at both design time and runtime. This design paradigm shift fosters innovative SDT modeling and analytics for securing future critical systems. This vision paper outlines intelligent SDT design through innovative techniques, exploring hybrid intelligence with data-driven and rule-based semantic SDT models. Various operational use cases are discussed for securing smart critical systems through the underlying modeling and analytics capabilities.
WebAssembly (Wasm) is a bytecode format originally serving as a compilation target for Web applications. It has recently been used increasingly on the server side, e.g., providing a safer, faster, and more portable alternative to Linux containers. With the popularity of server-side Wasm applications, it is essential to study performance issues (i.e., abnormal latency) in Wasm runtimes, as they may significantly impact server-side applications. However, there is still a lack of attention to performance issues in server-side Wasm runtimes. In this paper, we design a novel differential testing approach, WarpDiff, to identify performance issues in server-side Wasm runtimes. The key insight is that in normal cases, the execution times of the same test case on different Wasm runtimes should follow an oracle ratio. We identify abnormal cases where the execution time ratio significantly deviates from the oracle ratio and subsequently locate the Wasm runtimes that cause the performance issues. We apply WarpDiff to test five popular server-side Wasm runtimes using 123 test cases from the LLVM test suite and demonstrate the top 10 abnormal cases we identified. We further conduct an in-depth analysis of these abnormal cases and summarize seven performance issues, all of which have been confirmed by the developers. We hope our work can inspire future investigation into improving Wasm runtime implementations and thus promote the development of server-side Wasm applications.
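A simplified sketch of the differential-timing idea is shown below; the runtime commands, the pairwise ratio, and the deviation threshold are assumptions for illustration, not WarpDiff's actual configuration.

    import subprocess
    import time

    RUNTIMES = {"wasmtime": ["wasmtime"], "wasmer": ["wasmer", "run"]}  # assumed CLIs

    def measure(cmd, wasm_file):
        """Wall-clock time of running one compiled test case on one runtime."""
        start = time.perf_counter()
        subprocess.run(cmd + [wasm_file], check=True, capture_output=True)
        return time.perf_counter() - start

    def deviates(wasm_file, oracle_ratio, threshold=2.0):
        """Flag a test case whose cross-runtime time ratio strays from the oracle."""
        times = {name: measure(cmd, wasm_file) for name, cmd in RUNTIMES.items()}
        ratio = times["wasmtime"] / times["wasmer"]
        return max(ratio / oracle_ratio, oracle_ratio / ratio) > threshold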
High-quality conversational datasets are integral to the successful development of Intelligent Tutoring Systems (ITS) that employ a Large Language Model (LLM) backend. These datasets, when used to fine-tune the LLM backend, significantly enhance the quality of interactions between students and the ITS. A common strategy for developing these datasets involves generating synthetic student-teacher dialogues using advanced GPT-4 models. However, challenges arise when these dialogues demand complex calculations, common in subjects like physics. Despite its advanced capabilities, GPT-4's performance falls short in reliably handling even simple multiplication tasks, marking a significant limitation in its utility for these subjects. To address these challenges, this paper introduces an innovative stateful prompt design. Our approach generates a mock conversation between a student and a tutorbot, both roles simulated by GPT-4. Each student response triggers a soliloquy (an inner monologue) in the GPT-tutorbot, which assesses whether its response would necessitate calculations. If so, it proceeds to script the required code in Python and then uses the resulting output to construct its response to the student. Our approach notably enhances the quality of synthetic conversation datasets, especially for subjects that are calculation-intensive. Our findings show that our Higgs model -- a LLaMA model fine-tuned on datasets generated through our novel stateful prompt design -- proficiently utilizes Python for computations. Consequently, fine-tuning on our datasets enriched with code soliloquies enhances not only the accuracy but also the computational reliability of Higgs' responses.
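The code-soliloquy loop can be sketched as follows; ask_gpt4 and the JSON fields are hypothetical placeholders standing in for the actual GPT-4 prompting setup described in the paper.

    import json
    import subprocess

    def tutorbot_turn(student_message, history, ask_gpt4):
        # 1. Soliloquy: the tutorbot privately decides whether it needs to compute.
        soliloquy = json.loads(ask_gpt4(history + [student_message], mode="soliloquy"))
        if soliloquy["use_python"]:
            # 2. Run the generated script and capture its output.
            result = subprocess.run(["python", "-c", soliloquy["python_code"]],
                                    capture_output=True, text=True, timeout=10)
            history.append({"role": "system", "content": f"Output: {result.stdout}"})
        # 3. Compose the visible reply, grounded in the computed output if any.
        return ask_gpt4(history + [student_message], mode="reply")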
WikiHow is an open-domain repository of instructional articles for a variety of tasks, which can be revised by users. In this paper, we extract pairs consisting of an instruction before and after a revision was made. Starting from a noisy dataset of revision histories, we specifically extract and analyze edits that address vagueness in instructions. We further investigate the ability of a neural model to distinguish between the two versions of an instruction in our data by adopting a pairwise ranking task from previous work, showing improvements over existing baselines.
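A minimal sketch of such a pairwise ranking setup is given below; the scorer architecture, embedding size, and margin are assumptions, not the paper's exact model.

    import torch
    import torch.nn as nn

    # small scorer over sentence embeddings of the two instruction versions
    scorer = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
    ranking_loss = nn.MarginRankingLoss(margin=1.0)

    def pairwise_loss(revised_emb, original_emb):
        """revised_emb, original_emb: (batch, 768) embeddings of the two versions."""
        s_rev, s_orig = scorer(revised_emb), scorer(original_emb)
        target = torch.ones_like(s_rev)  # the revised version should rank higher
        return ranking_loss(s_rev, s_orig, target)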
Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking, and intelligent education. Most existing works decouple this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, completely ignoring the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) for real-world scenarios: a unified end-to-end trainable framework for simultaneous text detection, recognition, and information extraction that takes a single document image as input and outputs the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues that contribute to the optimization of text spotting. Moreover, given the shortage of public benchmarks, we construct a fully annotated dataset called EPHOIE (//github.com/HCIILAB/EPHOIE), the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, comprising a total of 15,771 Chinese handwritten or printed text instances. Compared with state-of-the-art methods, our VIES shows significantly superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario.
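As a loose illustration of multimodal feature fusion for an information extraction branch, a concatenation-based fusion head might look like the following; the dimensions and fusion scheme are assumptions and do not reflect the actual VIES architecture.

    import torch
    import torch.nn as nn

    class FusionHead(nn.Module):
        def __init__(self, visual_dim=256, semantic_dim=256, num_classes=10):
            super().__init__()
            self.fuse = nn.Linear(visual_dim + semantic_dim, 256)
            self.classify = nn.Linear(256, num_classes)  # entity category per text instance

        def forward(self, visual_feat, semantic_feat):
            # concatenate per-instance visual and semantic features, then classify
            fused = torch.relu(self.fuse(torch.cat([visual_feat, semantic_feat], dim=-1)))
            return self.classify(fused)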
Neural machine translation (NMT) is a deep-learning-based approach to machine translation that yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation, which leverages both out-of-domain parallel corpora and monolingual corpora for in-domain translation, is therefore very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.
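One common strategy covered by such surveys, continued training (fine-tuning) of an out-of-domain NMT model on a small in-domain corpus, is sketched below; the model interface, data loader, and learning rate are hypothetical placeholders.

    import torch

    def finetune_in_domain(nmt_model, in_domain_batches, lr=1e-5, epochs=3):
        """Continue training a pretrained NMT model on in-domain parallel data."""
        optimizer = torch.optim.Adam(nmt_model.parameters(), lr=lr)  # small LR to limit forgetting
        for _ in range(epochs):
            for src, tgt in in_domain_batches:
                optimizer.zero_grad()
                loss = nmt_model(src, tgt)  # assumed to return the cross-entropy loss
                loss.backward()
                optimizer.step()
        return nmt_model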