Electric vehicles (EVs) play critical roles in autonomous mobility-on-demand (AMoD) systems, but their unique charging patterns increase the model uncertainties in AMoD systems (e.g., in state transition probabilities). Since there usually exists a mismatch between the training and test/true environments, incorporating model uncertainty into system design is of critical importance in real-world applications. However, existing literature on EV AMoD system rebalancing has not yet considered model uncertainties explicitly, and the coexistence of model uncertainties and constraints that the decision should satisfy makes the problem even more challenging. In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with state transition kernel uncertainty for EV AMoD systems. We then propose a robust and constrained MARL algorithm (ROCOMA) with robust natural policy gradients (RNPG) that trains a robust EV rebalancing policy to balance the supply-demand ratio and the charging utilization rate across the city under model uncertainty. Experiments show that ROCOMA learns an effective and robust rebalancing policy and outperforms non-robust MARL methods in the presence of model uncertainties, increasing system fairness by 19.6% and decreasing rebalancing costs by 75.8%.
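To illustrate the max-min structure behind robust policy gradients, the sketch below performs one policy-gradient step on a tabular MDP whose transition kernel is only known to lie in an uncertainty set, here approximated by a finite collection of kernels; the gradient is taken under the worst-performing kernel. The finite kernel set and the finite-difference gradient are simplifying assumptions for illustration, not the paper's RNPG algorithm.

```python
# Hypothetical sketch: one robust policy-gradient step on a tabular MDP.
import numpy as np

def policy(theta):
    """Softmax policy over actions, one row of logits per state."""
    z = np.exp(theta - theta.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def value(pi, P, R, gamma=0.95):
    """Exact policy evaluation: solve (I - gamma * P_pi) V = R_pi."""
    P_pi = np.einsum("sa,sap->sp", pi, P)   # state-to-state kernel under pi
    R_pi = np.einsum("sa,sa->s", pi, R)     # expected reward per state
    return np.linalg.solve(np.eye(R.shape[0]) - gamma * P_pi, R_pi)

def robust_pg_step(theta, kernels, R, lr=0.1, eps=1e-4):
    pi = policy(theta)
    # Inner minimization: worst-case kernel in the (finite) uncertainty set.
    worst = min(kernels, key=lambda P: value(pi, P, R).mean())
    # Outer ascent: finite-difference gradient of the worst-case return.
    base = value(pi, worst, R).mean()
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        t = theta.copy()
        t[idx] += eps
        grad[idx] = (value(policy(t), worst, R).mean() - base) / eps
    return theta + lr * grad

rng = np.random.default_rng(0)
S, A = 3, 2
R = rng.random((S, A))
kernels = [rng.dirichlet(np.ones(S), size=(S, A)) for _ in range(4)]
theta = robust_pg_step(np.zeros((S, A)), kernels, R)
```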
The current body of research on terahertz (THz) wireless communications predominantly focuses on single-user backhaul/fronthaul connectivity at sub-THz frequencies. In this work, we first develop a generalized statistical model for signal propagation at THz frequencies encompassing physical-layer impairments, including random path loss with a Gamma-distributed molecular absorption coefficient, short-term fading characterized by the $\alpha$-$\eta$-$\kappa$-$\mu$ distribution, antenna misalignment errors, and transceiver hardware impairments. Next, we propose random access protocols for a cell-free wireless network, ensuring successful transmission for multiple users with limited delay and energy loss while accounting for the combined effect of random atmospheric absorption, non-linearity of fading, hardware impairments, and antenna misalignment errors. We consider two schemes: a fixed transmission probability (FTP) scheme, where the transmission probability (TP) of each user is set at the beginning of the data transmission, and an adaptive transmission probability (ATP) scheme, where the TP is updated with each successful reception of the data. We analyze the performance of both protocols in terms of delay, energy consumption, and outage probability, deriving scaling laws for the transmission of a data frame consisting of a single packet from users at a predefined quality of service (QoS).
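The FTP/ATP distinction can be made concrete with a toy slotted random-access simulation: N backlogged users each hold one packet, and a slot succeeds when exactly one user transmits and its link is not in outage. The parameter values and the Bernoulli outage model below are illustrative assumptions, not the paper's analytical channel model.

```python
# Hypothetical toy simulation contrasting the FTP and ATP schemes.
import random

def simulate(n_users, p_outage=0.1, adaptive=False, max_slots=10_000):
    backlog, slots, energy = n_users, 0, 0
    tp = 1.0 / n_users                      # initial transmission probability
    while backlog > 0 and slots < max_slots:
        slots += 1
        tx = sum(random.random() < tp for _ in range(backlog))
        energy += tx                        # one unit per transmission attempt
        if tx == 1 and random.random() > p_outage:
            backlog -= 1                    # successful reception
            if adaptive and backlog > 0:
                tp = 1.0 / backlog          # ATP: update TP after each success
    return slots, energy

print("FTP (slots, energy):", simulate(50, adaptive=False))
print("ATP (slots, energy):", simulate(50, adaptive=True))
```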
Virtual reality (VR) renderers driving CAVEs and similar immersive environments use the off-axis stereo camera model so that a tracked user can move freely in front of the projection plane. Geometrically, off-axis projection results in asymmetric viewing frusta and generalizes the ubiquitous perspective camera model to support positioning off the center of the projection plane. VR renderers often integrate with larger visualization systems that rely on libraries for position tracking and pose estimates, for ray tracing-based rendering, and for user interaction. We demonstrate different strategies to implement off-axis stereo projection within the constraints of given VR applications and ray tracing libraries, aiming for minimal or no adjustments to the internal camera representation of such libraries. We include host and shader code with the article that can be directly integrated into custom applications.
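For readers unfamiliar with the geometry, the sketch below derives an asymmetric frustum from three corners of the projection plane and the tracked eye position, following Kooima's well-known generalized perspective projection. The corner and eye values are illustrative; the code accompanying the article may use different conventions.

```python
# Minimal sketch of off-axis (generalized perspective) projection.
import numpy as np

def off_axis_projection(pa, pb, pc, pe, near, far):
    """pa, pb, pc: lower-left, lower-right, upper-left screen corners; pe: eye."""
    vr = pb - pa; vr /= np.linalg.norm(vr)            # screen right axis
    vu = pc - pa; vu /= np.linalg.norm(vu)            # screen up axis
    vn = np.cross(vr, vu); vn /= np.linalg.norm(vn)   # screen normal
    va, vb, vc = pa - pe, pb - pe, pc - pe            # eye-to-corner vectors
    d = -np.dot(va, vn)                               # eye-to-plane distance
    # Asymmetric frustum extents at the near plane.
    l = np.dot(vr, va) * near / d
    r = np.dot(vr, vb) * near / d
    b = np.dot(vu, va) * near / d
    t = np.dot(vu, vc) * near / d
    P = np.array([[2*near/(r-l), 0, (r+l)/(r-l), 0],
                  [0, 2*near/(t-b), (t+b)/(t-b), 0],
                  [0, 0, -(far+near)/(far-near), -2*far*near/(far-near)],
                  [0, 0, -1, 0]])
    # Rotate into screen space and translate the eye to the origin.
    M = np.eye(4); M[:3, 0], M[:3, 1], M[:3, 2] = vr, vu, vn
    T = np.eye(4); T[:3, 3] = -pe
    return P @ M.T @ T

# Example: 2 m x 1.25 m power wall in the z=0 plane, eye slightly off-center.
pa = np.array([-1.0, -0.625, 0.0])
pb = np.array([ 1.0, -0.625, 0.0])
pc = np.array([-1.0,  0.625, 0.0])
print(off_axis_projection(pa, pb, pc, np.array([0.2, 0.1, 1.0]), 0.1, 100.0))
```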
Current approaches for collision avoidance and space traffic management face many challenges, mainly due to the continuous increase in the number of objects in orbit and the lack of scalable and automated solutions. To avoid catastrophic incidents, satellite owners/operators must be aware of their assets' collision risk to decide whether a collision avoidance manoeuvre needs to be performed. This process is typically executed through the use of warnings issued in the form of conjunction data messages (CDMs), which contain information about the event, such as the expected time of closest approach (TCA) and the probability of collision. Our previous work presented a statistical learning model that allowed us to answer two important questions: (1) Will any new conjunctions be issued in the next specified time interval? (2) When and with what uncertainty will the next CDM arrive? However, the model was based on an empirical Bayes homogeneous Poisson process, which assumes that the arrival rate of CDMs is constant over time. In fact, the rate at which the CDMs are issued depends on the behaviour of the objects as well as on the screening process performed by third parties. Thus, in this work, we extend the previous study and propose a Bayesian non-homogeneous Poisson process, implemented with high precision using a Probabilistic Programming Language, to fully describe the underlying phenomena. We compare the proposed solution with a baseline model to demonstrate the added value of our approach. The results show that this problem can be modelled by our Bayesian non-homogeneous Poisson process with greater accuracy, contributing to the development of automated collision avoidance systems and helping operators react in a timely but sparing manner with satellite manoeuvres.
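As a simplified stand-in for the fully Bayesian PPL implementation, the sketch below fits a non-homogeneous Poisson process with a log-linear intensity $\lambda(t) = \exp(\alpha + \beta t)$ to event times by maximizing the NHPP log-likelihood $\sum_i \log \lambda(t_i) - \int_0^T \lambda(t)\,dt$ plus Gaussian log-priors (a MAP fit, not full posterior inference). The intensity form and priors are illustrative assumptions.

```python
# Hypothetical MAP fit of a log-linear NHPP intensity to CDM arrival times.
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(params, times, horizon):
    alpha, beta = params
    log_lik = np.sum(alpha + beta * times)          # sum_i log lambda(t_i)
    # Closed-form integral of exp(alpha + beta*t) over [0, horizon].
    integral = (np.exp(alpha) * (np.exp(beta * horizon) - 1.0) / beta
                if abs(beta) > 1e-9 else np.exp(alpha) * horizon)
    log_prior = -0.5 * (alpha**2 + beta**2)         # standard normal priors
    return -(log_lik - integral + log_prior)

times = np.sort(np.random.uniform(0, 10, size=40))  # placeholder CDM times
fit = minimize(neg_log_posterior, x0=[0.0, 0.0], args=(times, 10.0))
alpha_hat, beta_hat = fit.x
print(f"fitted intensity: exp({alpha_hat:.2f} + {beta_hat:.2f} t)")
```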
Internet of Vehicles (IoV) is a general concept referring to, e.g., autonomous driving-based vehicle-to-everything (V2X) communications or moving relays. Here, high rate and reliability demands call for advanced multi-antenna techniques and millimeter-wave (mmw)-based communications. However, the sensitivity of mmw signals to blockage may limit system performance, especially in highways/rural areas with limited building reflectors/base station deployments and high-speed devices. To avoid blockage, various techniques have been proposed, among which the reconfigurable intelligent surface (RIS) is a candidate. RIS, however, has mainly been of interest in stationary/low-mobility scenarios, due to the associated channel state information acquisition and beam management overhead as well as imperfect reflection. In this article, we study the potential and challenges of RIS-assisted dynamic blockage avoidance in IoV networks. In particular, by designing region-based RIS pre-selection as well as blockage prediction schemes, we show that RIS-assisted communication has the potential to boost the performance of IoV networks. However, several issues must still be solved before RIS can be practically deployed in IoV networks.
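A region-based pre-selection scheme can be pictured as a simple lookup: the road is partitioned into regions, each mapped to a candidate RIS in advance, and a vehicle switches to its region's RIS when blockage of the direct link is predicted. The function names, region layout, and threshold below are assumptions for illustration, not the article's exact scheme.

```python
# Illustrative heuristic for region-based RIS pre-selection.
from bisect import bisect_right

def preselect_ris(position_m, region_edges_m, region_ris_ids):
    """Map a vehicle position along the road to the RIS pre-assigned to its region."""
    return region_ris_ids[bisect_right(region_edges_m, position_m)]

def choose_link(position_m, predicted_blockage_prob, region_edges_m,
                region_ris_ids, threshold=0.5):
    if predicted_blockage_prob > threshold:   # blockage likely: reflect via RIS
        return ("RIS", preselect_ris(position_m, region_edges_m, region_ris_ids))
    return ("direct", None)

# Road split at 100 m and 200 m into three regions served by RIS 0..2.
print(choose_link(150.0, 0.8, [100.0, 200.0], [0, 1, 2]))   # ('RIS', 1)
```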
The sixth-generation (6G) wireless technology recognizes the potential of reconfigurable intelligent surfaces (RIS) as an effective technique for intelligently manipulating channel paths through reflection to serve desired users. Full-duplex (FD) systems, enabling simultaneous transmission and reception at a base station (BS), offer the theoretical advantage of doubled spectrum efficiency. However, the strong self-interference (SI) in FD systems significantly degrades performance, which can be mitigated by leveraging the capabilities of RIS. Moreover, accurately obtaining channel state information (CSI) for RIS poses a critical challenge. Our objective is to maximize downlink (DL) user data rates while ensuring quality-of-service (QoS) for uplink (UL) users under imperfect CSI of the reflected channels. To this end, we propose a robust active BS and passive RIS beamforming (RAPB) scheme for RIS-FD systems, accounting for both SI and imperfect CSI. RAPB incorporates distributionally robust design, conditional value-at-risk (CVaR), and penalty convex-concave programming (PCCP) techniques. Simulation results demonstrate that UL/DL rate improvements are achieved under different levels of CSI imperfection, and the proposed RAPB scheme is effective across different RIS deployments and RIS/BS configurations. Benefiting from robust beamforming, RAPB outperforms baselines based on non-robust design, deployment without RIS, conventional approximation, and half-duplex operation.
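For intuition on the risk measure involved, the sketch below computes an empirical CVaR over sampled losses (e.g., QoS violations under CSI-error realizations) using the Rockafellar-Uryasev form, which averages the worst $(1-\alpha)$ tail. It illustrates the risk measure only, not the paper's full RAPB beamforming optimization; the loss samples are placeholders.

```python
# Empirical conditional value-at-risk (CVaR) over sampled losses.
import numpy as np

def cvar(losses, alpha=0.95):
    # Rockafellar-Uryasev form evaluated at t = VaR_alpha.
    var = np.quantile(losses, alpha)
    return var + np.mean(np.maximum(losses - var, 0.0)) / (1.0 - alpha)

rng = np.random.default_rng(0)
qos_gap = rng.normal(0.0, 1.0, 10_000)    # placeholder loss samples
print(f"VaR_95 = {np.quantile(qos_gap, 0.95):.3f}, CVaR_95 = {cvar(qos_gap):.3f}")
```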
Natural Language Processing (NLP) is a key technique for developing Medical Artificial Intelligence (AI) systems that leverage Electronic Health Record (EHR) data to build diagnostic and prognostic models. NLP enables the conversion of unstructured clinical text into structured data that can be fed into AI algorithms. The emergence of the transformer architecture and large language models (LLMs) has led to remarkable advances in NLP for various healthcare tasks, such as entity recognition, relation extraction, sentence similarity, text summarization, and question answering. In this article, we review the major technical innovations that underpin modern NLP models and present state-of-the-art NLP applications that employ LLMs in radiation oncology research. However, these LLMs are prone to many errors, such as hallucinations, biases, and ethical violations, which necessitate rigorous evaluation and validation before clinical deployment. As such, we propose a comprehensive framework for assessing NLP models based on their purpose and clinical fit, technical performance, bias and trust, legal and ethical implications, and quality assurance, prior to implementation in clinical radiation oncology. Our article aims to provide guidance and insights for researchers and clinicians who are interested in developing and using NLP models in clinical radiation oncology.
The widespread adoption of commercial autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) may largely depend on their acceptance by society, for which their perceived trustworthiness and interpretability to riders are crucial. In general, this task is challenging because modern autonomous systems software relies heavily on black-box artificial intelligence models. Towards this goal, this paper introduces Rank2Tell, a novel multi-modal ego-centric dataset for Ranking the importance level of objects and Telling the reason for their importance. Using closed and open-ended visual question answering, the dataset provides dense annotations of various semantic, spatial, temporal, and relational attributes of important objects in complex traffic scenarios. These dense annotations and unique attributes make the dataset a valuable resource for researchers working on visual scene understanding and related fields. Furthermore, we introduce a joint model for importance-level ranking and natural-language caption generation to benchmark our dataset, and demonstrate its performance with quantitative evaluations.
With the extremely rapid advances in remote sensing (RS) technology, a great quantity of Earth observation (EO) data featuring considerable and complicated heterogeneity is readily available nowadays, affording researchers an opportunity to tackle current geoscience applications in a fresh way. With the joint utilization of EO data, much research on multimodal RS data fusion has made tremendous progress in recent years, yet traditional algorithms inevitably hit a performance bottleneck because they lack the ability to comprehensively analyse and interpret such strongly heterogeneous data. This limitation further arouses an intense demand for an alternative tool with powerful processing competence. Deep learning (DL), as a cutting-edge technology, has achieved remarkable breakthroughs in numerous computer vision tasks owing to its impressive ability in data representation and reconstruction. Naturally, it has been successfully applied to the field of multimodal RS data fusion, yielding great improvement compared with traditional methods. This survey aims to present a systematic overview of DL-based multimodal RS data fusion. More specifically, some essential knowledge about this topic is first given. Subsequently, a literature survey is conducted to analyse the trends of this field. Some prevalent sub-fields in multimodal RS data fusion are then reviewed in terms of the to-be-fused data modalities, i.e., spatiospectral, spatiotemporal, light detection and ranging-optical, synthetic aperture radar-optical, and RS-Geospatial Big Data fusion. Furthermore, we collect and summarize some valuable resources to support the development of multimodal RS data fusion. Finally, the remaining challenges and potential future directions are highlighted.
Image-level weakly supervised semantic segmentation (WSSS) is a fundamental yet challenging computer vision task facilitating scene understanding and autonomous driving. Most existing methods resort to classification-based Class Activation Maps (CAMs) to serve as the initial pseudo-labels, which tend to focus on the discriminative image regions and lack characteristics customized for the segmentation task. To alleviate this issue, we propose a novel activation modulation and recalibration (AMR) scheme, which leverages a spotlight branch and a compensation branch to obtain weighted CAMs that can provide recalibration supervision and task-specific concepts. Specifically, an attention modulation module (AMM) is employed to rearrange the distribution of feature importance from the channel-spatial sequential perspective, which helps to explicitly model channel-wise interdependencies and spatial encodings to adaptively modulate segmentation-oriented activation responses. Furthermore, we introduce cross pseudo supervision for the dual branches, which can be regarded as a semantic-similarity regularization that mutually refines the two branches. Extensive experiments show that AMR establishes a new state-of-the-art performance on the PASCAL VOC 2012 dataset, surpassing not only current methods trained with image-level supervision but also some methods relying on stronger supervision, such as saliency labels. Experiments also reveal that our scheme is plug-and-play and can be incorporated with other approaches to boost their performance.
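To make the "channel-spatial sequential" modulation idea concrete, the sketch below implements a CBAM-style block in PyTorch that first reweights channels and then reweights spatial locations. It is a generic illustration of this attention pattern, not the paper's AMM or its wiring into the spotlight/compensation branches.

```python
# Hypothetical channel-then-spatial attention block (CBAM-style).
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, _, _ = x.shape
        # Channel attention: squeeze spatial dims, excite per channel.
        w = torch.sigmoid(self.channel_mlp(x.mean(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * w
        # Spatial attention: pool over channels, weight each location.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))

feats = torch.randn(2, 64, 32, 32)
print(ChannelSpatialAttention(64)(feats).shape)  # torch.Size([2, 64, 32, 32])
```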
Few-shot Knowledge Graph (KG) completion is a focus of current research, in which each task aims to query unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings and thus render more expressive representations, which are more predictive for knowledge acquisition in the few-shot scenario. Evaluation on link prediction over two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.
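The query-aware aggregation idea can be illustrated with a minimal sketch: each few-shot reference-pair embedding receives an attention weight from its scaled dot-product similarity with the query, so references contribute unequally to the aggregate. This mirrors the adaptive-reference idea generically; the embedding dimensions and data are placeholders, not the paper's exact architecture.

```python
# Illustrative query-aware attention over few-shot references.
import numpy as np

def aggregate_references(query, references):
    """query: (d,); references: (k, d) reference-pair embeddings."""
    scores = references @ query / np.sqrt(query.shape[0])  # scaled dot product
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                               # softmax over refs
    return weights @ references                            # weighted aggregate

rng = np.random.default_rng(1)
refs = rng.normal(size=(3, 8))        # 3-shot references, 8-dim embeddings
q = rng.normal(size=8)
print(aggregate_references(q, refs).shape)                 # (8,)
```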