This paper presents a control framework for magnetically actuated micron-scale robots ($\mu$bots) designed to mitigate disturbances and improve trajectory tracking. To address the challenges posed by unmodeled dynamics and environmental variability, we combine data-driven modeling with model-based control to accurately track desired trajectories using a relatively small amount of data. The system is represented with a simple linear model, and Gaussian Processes (GP) are employed to capture and estimate disturbances. This disturbance-enhanced model is then integrated into a Model Predictive Controller (MPC). Our approach demonstrates promising performance in both simulation and experimental setups, showcasing its potential for precise and reliable microrobot control in complex environments.
We formulate three generalized Bayesian models for analyzing interrater and intrarater reliability in the presence of multilevel data. Stan implementations of these models provide new estimates of interrater and intrarater reliability. We also derive formulas for calculating marginal correlations under each of the three models. Comparisons of the kappa estimates and marginal correlations across the different models are presented from two real-world datasets. Simulations demonstrate properties of the different measures of agreement under different model assumptions.
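For orientation, the classical two-rater Cohen's kappa can be computed as in the sketch below. This is only the textbook chance-corrected agreement statistic, not one of the three Bayesian multilevel models described above; the function name `cohen_kappa` and the example ratings are illustrative assumptions.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Classical Cohen's kappa for two raters scoring the same items.

    Illustrative only: the abstract's models are Bayesian and multilevel;
    this is the plain chance-corrected agreement statistic.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.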
Accurately segmenting thin tubular structures, such as vessels, nerves, roads or concrete cracks, is a crucial task in computer vision. Standard deep learning-based segmentation loss functions, such as Dice or Cross-Entropy, focus on volumetric overlap, often at the expense of preserving structural connectivity or topology. This can lead to segmentation errors that adversely affect downstream tasks, including flow calculation, navigation, and structural inspection. Although current topology-focused losses mark an improvement, they introduce significant computational and memory overheads. This is particularly relevant for 3D data, rendering these losses infeasible for larger volumes as well as increasingly important multi-class segmentation problems. To mitigate this, we propose a novel Skeleton Recall Loss, which effectively addresses these challenges by circumventing intensive GPU-based calculations with inexpensive CPU operations. It demonstrates overall superior performance to current state-of-the-art approaches on five public datasets for topology-preserving segmentation, while substantially reducing computational overheads by more than 90%. In doing so, we introduce the first multi-class capable loss function for thin structure segmentation, excelling in both efficiency and efficacy for topology-preservation.
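The abstract does not spell out the loss formula, so the following is a minimal sketch under a stated assumption: that the loss is one minus a soft recall of the predicted foreground probabilities on a precomputed ground-truth skeleton mask. The function name, the flat-list data layout, and the smoothing constant `eps` are all hypothetical.

```python
def skeleton_recall_loss(pred_probs, skeleton_mask, eps=1e-8):
    """One minus the soft recall of predictions on a skeleton mask.

    Assumption (not confirmed by the abstract): the skeleton is extracted
    from the ground truth ahead of time, e.g. on the CPU, and passed in as
    a binary mask flattened to the same length as the predictions.
    """
    if len(pred_probs) != len(skeleton_mask):
        raise ValueError("prediction and skeleton must have the same length")
    # Probability mass that the prediction places on skeleton voxels.
    hit = sum(p * s for p, s in zip(pred_probs, skeleton_mask))
    total = sum(skeleton_mask)
    recall = (hit + eps) / (total + eps)
    return 1.0 - recall


if __name__ == "__main__":
    skeleton = [1, 1, 0, 0]
    print(skeleton_recall_loss([0.9, 0.8, 0.1, 0.0], skeleton))
```

Because only skeleton voxels contribute, a prediction that breaks a thin structure is penalized even when its volumetric overlap remains high.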
Rate splitting multiple access (RSMA) is one of the most promising techniques for ultra-reliable and low-latency communications (URLLC), which impose stringent requirements on the delay and reliability of multiple access. To fully explore the delay performance enhancement that uplink RSMA brings to URLLC, we evaluate the performance of two-user uplink RSMA and formulate the corresponding blocklength minimization problem. We analyze the impact of finite blocklength (FBL) codes on the achievable rate region and the effective throughput of uplink RSMA. On this basis, we formulate the problem of minimizing the blocklength for uplink RSMA with power allocation under constraints on reliability and effective throughput. We then present an alternating optimization method to solve this non-convex problem. Simulation results show that, unlike in the infinite blocklength (IBL) regime, the achievable rate region of uplink RSMA is not always larger than that of uplink non-orthogonal multiple access (NOMA) in the FBL regime. However, with our proposed blocklength minimization scheme, uplink RSMA can achieve the same rate with a smaller blocklength than uplink NOMA, frequency division multiple access (FDMA), and time division multiple access (TDMA), without the need for time sharing in the FBL regime, showing the potential of uplink RSMA to achieve low delay for URLLC.
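To make the finite-blocklength penalty concrete, the sketch below evaluates the widely used normal approximation for a single-user AWGN link, R ≈ C − sqrt(V/n)·Q⁻¹(ε), and searches for the smallest blocklength that meets a target rate. It is a point-to-point illustration only, not the two-user RSMA formulation or the alternating optimization described above; the function names and the linear search are assumptions made for this example.

```python
import math
from statistics import NormalDist


def fbl_rate(snr, blocklength, error_prob):
    """Normal approximation of the achievable rate (bits per channel use).

    Single-user AWGN link with linear-scale SNR; higher-order terms of the
    approximation are omitted.
    """
    if not 0.0 < error_prob < 0.5:
        raise ValueError("error_prob must lie in (0, 0.5)")
    capacity = math.log2(1.0 + snr)
    # Channel dispersion of the AWGN channel, in bits^2 per channel use.
    dispersion = (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2
    # Inverse Gaussian Q-function: Q^{-1}(eps) = Phi^{-1}(1 - eps).
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)
    return capacity - math.sqrt(dispersion / blocklength) * q_inv


def min_blocklength(snr, target_rate, error_prob, n_max=100_000):
    """Smallest n <= n_max whose approximate rate reaches target_rate."""
    for n in range(1, n_max + 1):
        if fbl_rate(snr, n, error_prob) >= target_rate:
            return n
    return None  # target not reachable within n_max


if __name__ == "__main__":
    print(min_blocklength(snr=10.0, target_rate=3.0, error_prob=1e-5))
```

The back-off term shrinks as 1/sqrt(n), which is why tightening the reliability target or raising the rate target forces a longer block.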
We address the problem of representing context-specific causal models based on both observational and experimental data collected under general (e.g. hard or soft) interventions by introducing a new family of context-specific conditional independence models called CStrees. This family is defined via a novel factorization criterion that allows for a generalization of the factorization property defining general interventional DAG models. We derive a graphical characterization of model equivalence for observational CStrees that extends the Verma and Pearl criterion for DAGs. This characterization is then extended to CStree models under general, context-specific interventions. To obtain these results, we formalize a notion of context-specific intervention that can be incorporated into concise graphical representations of CStree models. We relate CStrees to other context-specific models, showing that the families of DAGs, CStrees, labeled DAGs and staged trees form a strict chain of inclusions. We end with an application of interventional CStree models to a real data set, revealing the context-specific nature of the data dependence structure and the soft, interventional perturbations.
Panoptic and instance segmentation networks are often trained with specialized object detection modules, complex loss functions, and ad-hoc post-processing steps to manage the permutation-invariance of the instance masks. This work builds upon Stable Diffusion and proposes a latent diffusion approach for panoptic segmentation, resulting in a simple architecture that omits these complexities. Our training consists of two steps: (1) training a shallow autoencoder to project the segmentation masks to latent space; (2) training a diffusion model to allow image-conditioned sampling in latent space. This generative approach unlocks the exploration of mask completion or inpainting. The experimental validation on COCO and ADE20k yields strong segmentation results. Finally, we demonstrate our model's adaptability to multi-tasking by introducing learnable task embeddings.
The paper gives a detailed presentation of a framework, embedded into simply typed higher-order logic, that supports sound and structured reasoning about various properties of models of imperative programs with interleaved computations. As a case study, a model of Peterson's mutual exclusion algorithm is scrutinised in the course of the paper, illustrating the applicability of the framework.
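As an informal illustration of the case study, the sketch below models Peterson's algorithm for two processes as a small transition system and checks mutual exclusion by exhaustively exploring every interleaving. This is explicit-state exploration in Python, not the higher-order logic framework of the paper; the program-counter encoding and the assumption that each step is atomic are choices made for this example.

```python
from collections import deque

# Program counters (hypothetical encoding):
# 0 = set own flag, 1 = give way via turn, 2 = busy-wait, 3 = critical section


def successors(state):
    """Yield every state reachable by one atomic step of either process."""
    pcs, flags, turn = state
    for i in (0, 1):
        j = 1 - i
        new_pcs, new_flags, new_turn = list(pcs), list(flags), turn
        if pcs[i] == 0:
            new_flags[i] = True
            new_pcs[i] = 1
        elif pcs[i] == 1:
            new_turn = j
            new_pcs[i] = 2
        elif pcs[i] == 2:
            if flags[j] and turn == j:
                continue  # still waiting: this process cannot move
            new_pcs[i] = 3
        else:
            # Leave the critical section and start over.
            new_flags[i] = False
            new_pcs[i] = 0
        yield (tuple(new_pcs), tuple(new_flags), new_turn)


def reachable_states(initial=((0, 0), (False, False), 0)):
    """Breadth-first exploration of all interleavings from the initial state."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def mutual_exclusion_holds():
    """True if no reachable state has both processes in the critical section."""
    return all(pcs != (3, 3) for pcs, _, _ in reachable_states())


if __name__ == "__main__":
    print(mutual_exclusion_holds())
```

Exhaustive exploration works here only because the state space is tiny; a logical framework such as the one described above targets properties and models where enumeration is not an option.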
Many reaction-diffusion systems in various applications exhibit traveling wave solutions that evolve on multiple spatio-temporal scales. These traveling wave solutions are crucial for understanding the underlying dynamics of the system. In this work, we present sixth-order weighted essentially non-oscillatory (WENO) methods within the finite difference framework to solve reaction-diffusion systems. The WENO method allows us to use fewer grid points and larger time steps than classical finite difference methods. Our focus is on solving reaction-diffusion systems for traveling wave solutions with sharp fronts. Although the WENO method is best known for hyperbolic conservation laws, especially for problems with discontinuities, it can be adapted to equations of parabolic type, such as reaction-diffusion systems, to handle sharp wave fronts effectively. We therefore employ WENO methods specifically developed for equations of parabolic type. We consider various reaction-diffusion equations, including the Fisher, Zeldovich, Newell-Whitehead-Segel, and bistable equations and the Lotka-Volterra competition-diffusion system, all of which yield traveling wave solutions with sharp wave fronts. Numerical examples in this work demonstrate that the central WENO method is considerably more accurate and efficient than the commonly used finite difference method. We also provide an analysis of the numerical speed of the sharp propagating front in the Newell-Whitehead-Segel equation. The overall results confirm that the central WENO method is highly efficient and is recommended for solving reaction-diffusion equations with sharp wave fronts.
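For reference, the kind of classical baseline that the WENO schemes are compared against can be sketched as follows: an explicit, second-order central finite difference scheme for Fisher's equation u_t = D u_xx + r u(1 − u). This is not the sixth-order WENO method of the paper; the grid sizes, step sizes, boundary treatment, and function names are assumptions chosen for illustration.

```python
def fisher_step(u, dx, dt, diffusion=1.0, rate=1.0):
    """One forward-Euler step with second-order central differences.

    Stable only if diffusion * dt / dx**2 <= 0.5 (explicit scheme).
    """
    new = u[:]
    for i in range(1, len(u) - 1):
        laplacian = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx ** 2
        reaction = rate * u[i] * (1.0 - u[i])
        new[i] = u[i] + dt * (diffusion * laplacian + reaction)
    # Zero-flux (Neumann) boundaries, approximated to first order.
    new[0] = new[1]
    new[-1] = new[-2]
    return new


def front_position(u, dx, level=0.5):
    """Position of the first grid point where u drops below `level`."""
    for i, value in enumerate(u):
        if value < level:
            return i * dx
    return len(u) * dx


def simulate(n_points=200, dx=0.5, dt=0.05, n_steps=200):
    """Evolve a step-shaped initial profile and return the final state."""
    u = [1.0 if i * dx < 10.0 else 0.0 for i in range(n_points)]
    for _ in range(n_steps):
        u = fisher_step(u, dx, dt)
    return u


if __name__ == "__main__":
    final = simulate()
    print(front_position(final, dx=0.5))
```

With a low-order scheme like this one, resolving a sharp front accurately requires a fine grid and a correspondingly small time step, which is the cost that a higher-order method aims to reduce.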
Quantum communication systems support unique applications in the form of distributed quantum computing, distributed quantum sensing, and several cryptographic protocols. The main enabler of these communication systems is an efficient infrastructure that is capable of transporting unknown quantum states with high rate and fidelity. This feat requires a new approach to communication system design that efficiently exploits the available physical layer resources while respecting the limitations and principles of quantum information. Despite the fundamental differences between the classical and quantum worlds, there exist universal communication concepts that may prove beneficial in quantum communication systems as well. In this survey, the distinctive aspects of physical layer quantum communications are highlighted in an attempt to identify commonalities and divergences between classical and quantum communications. More specifically, we begin with an overview of quantum channels and use cases over diverse optical propagation media, shedding light on the concepts of crosstalk and interference. Subsequently, we survey quantum sources, detectors, channels and modulation techniques. More importantly, we discuss and analyze spatial multiplexing techniques, such as coherent control, multiplexing, diversity and MIMO. Finally, we identify synergies between the two communication technologies and grand open challenges that can be pivotal in the development of next-generation quantum communication systems.
Object detection is a fundamental task in computer vision and image processing. Current deep learning based object detectors have been highly successful when abundant labeled data is available. In practice, however, there is no guarantee that every object category has enough labeled samples for training, and these large object detectors are prone to overfitting when the training data is limited. It is therefore necessary to introduce few-shot learning and zero-shot learning into object detection, which together can be termed low-shot object detection. Low-Shot Object Detection (LSOD) aims to detect objects from few or even zero labeled samples, and can accordingly be categorized into few-shot object detection (FSOD) and zero-shot object detection (ZSD). This paper conducts a comprehensive survey of deep learning based FSOD and ZSD. First, this survey classifies methods for FSOD and ZSD into different categories and discusses their pros and cons. Second, this survey reviews dataset settings and evaluation metrics for FSOD and ZSD, then analyzes the performance of different methods on these benchmarks. Finally, this survey discusses future challenges and promising directions for FSOD and ZSD.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems where data are collected from non-Euclidean domains and represented as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data poses significant challenges to existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is presented. Specifically, several classical paradigms of GNN structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, the main issues and some research trends concerning the applications of GNNs in power systems are discussed.