The bistatic integrated sensing and communication (ISAC) system model avoids the strong self-interference of a monostatic ISAC system by employing a pair of physically separated sensing transceivers, while maintaining the merit of co-designing radar sensing and communications on shared spectrum and hardware. Inspired by the appealing benefits of bistatic radar, we study bistatic ISAC, where a transmitter sends a message to a communication receiver and a sensing receiver at another location carries out a decoding-and-estimation (DnE) operation to obtain the state of the communication receiver. In this paper, both communication and sensing channels are modelled as state-dependent memoryless channels with independent and identically distributed time-varying state sequences. We take the rate of reliable communication of the message at the communication receiver as the communication metric. The objective is to characterize the capacity-distortion region, i.e., the set of all achievable rates that simultaneously allow the sensing receiver to sense the state sequence within a given distortion threshold. According to the degree to which the sensing receiver decodes the message, we propose three achievable DnE strategies: blind estimation, partial-decoding-based estimation, and full-decoding-based estimation. Based on these strategies, we derive three achievable rate-distortion regions. In addition, under the constraint of the degraded broadcast channel, i.e., the communication receiver is statistically stronger than the sensing receiver, and with partial-decoding-based estimation, we characterize the capacity region. Examples in both non-degraded and degraded cases are provided to compare the achievable rate-distortion regions under the three DnE strategies and to demonstrate the advantages of ISAC over independent communication and sensing.
Modern commercial Heating, Ventilation, and Air Conditioning (HVAC) devices form a complex and interconnected thermodynamic system with the building and outside weather conditions, and current setpoint control policies are not fully optimized for minimizing energy use and carbon emissions. Given a suitable training environment, a Reinforcement Learning (RL) model is able to improve upon these policies, but training such a model, especially in a way that scales to thousands of buildings, presents many real-world challenges. We propose a novel simulation-based approach, where a customized simulator is used to train the agent for each building. Our open-source simulator (available online: //github.com/google/sbsim) is lightweight and calibrated via telemetry from the building to reach a higher level of fidelity. On a two-story, 68,000 square foot building with 127 devices, we were able to calibrate our simulator to have just over half a degree of drift from the real world over a six-hour interval. This approach is an important step toward a real-world RL control system that can be scaled to many buildings, allowing for greater efficiency and resulting in reduced energy consumption and carbon emissions.
Since the Radon transform (RT) is a line integral, modeling assumptions must be made about the Computed Tomography (CT) system, which makes analytical image reconstruction methods, such as Filtered Backprojection (FBP), sensitive to artifacts and noise. On the other hand, a new integral transform, called the Scale Space Radon Transform (SSRT), has recently been introduced, of which the RT is a particular case. Thanks to its interesting properties, such as good scale space behavior, the SSRT has found a number of new applications. In this paper, with the aim of improving the reconstructed image quality of these methods, we propose to model the X-ray beam with the SSRT, where the assumptions made on the physical dimensions of the CT system elements better reflect reality. After describing the basic properties and the inversion of the SSRT, the FBP algorithm is used to reconstruct the image from the SSRT sinogram, where the RT spectrum used in FBP is replaced by the SSRT and the Gaussian kernel, both expressed in the frequency domain. PSNR and SSIM are used as quality measures to compare RT-based and SSRT-based image reconstruction on Shepp-Logan head and anthropomorphic abdominal phantoms. The first findings show that the SSRT-based method outperforms the RT-based methods, especially when the number of projections is reduced, making it more appropriate for applications requiring low-dose radiation, such as medical X-ray CT. While SSRT-FBP and RT-FBP have almost the same runtime, the experiments show that SSRT-FBP is more robust to the Poisson-Gaussian noise corrupting CT data.
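As a point of reference for the PSNR quality measure mentioned above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). The function name `psnr`, the nested-list image format, and the default `peak=1.0` are assumptions made for illustration; the abstract does not state how the authors computed the measure.

```python
import math


def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Images are given as nested lists of floats (an assumption for this sketch).
    `peak` is the maximum possible pixel value, e.g. 1.0 for normalized images.
    """
    ref_flat = [p for row in reference for p in row]
    rec_flat = [p for row in reconstruction for p in row]
    if len(ref_flat) != len(rec_flat) or not ref_flat:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(ref_flat, rec_flat)) / len(ref_flat)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value indicates a reconstruction closer to the reference phantom; identical images give an infinite PSNR.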
The nonparametric view of Bayesian inference has transformed statistics and many of its applications. The canonical Dirichlet process and other more general families of nonparametric priors have served as a gateway to solve frontier uncertainty quantification problems of large, or infinite, nature. This success is largely due to the available constructions and representations of such distributions; the two most useful constructions are the one based on normalization of homogeneous completely random measures and the one based on stick-breaking processes. Hence, understanding their distributional features and how different random probability measures compare among themselves is a key ingredient for their proper application. In this paper, we analyse the discrepancy among some nonparametric priors employed in the literature. Initially, we compute the mean and variance of the random Kullback-Leibler divergence between the Dirichlet process and the geometric process. Subsequently, we extend our analysis to encompass a broader class of exchangeable stick-breaking processes, which includes the Dirichlet and geometric processes as extreme cases. Our results establish quantitative conditions under which all the aforementioned priors are close in total variation distance. In such instances, Occam's razor favors the simpler process.
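To make the two extreme cases concrete, the sketch below draws truncated weight sequences from the standard stick-breaking representation of the Dirichlet process (independent Beta(1, alpha) stick lengths) and from a geometric process (a single stick length `lam` reused at every step). The function names, the truncation level `n_atoms`, and the parameter values are assumptions for illustration only and are not taken from the paper.

```python
import random


def dirichlet_stick_breaking(alpha, n_atoms, rng):
    """First n_atoms weights of a Dirichlet process via stick-breaking.

    Each stick length v_i is drawn independently from Beta(1, alpha), and
    w_i = v_i * prod_{j < i} (1 - v_j).
    """
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def geometric_weights(lam, n_atoms):
    """First n_atoms weights of a geometric process: w_i = lam * (1 - lam)^(i-1).

    This is stick-breaking with every stick length equal to the same lam.
    """
    return [lam * (1.0 - lam) ** k for k in range(n_atoms)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dp_weights = dirichlet_stick_breaking(alpha=2.0, n_atoms=50, rng=rng)
geo_weights = geometric_weights(lam=0.3, n_atoms=50)
```

Because both sequences are truncated, each sums to slightly less than one; the geometric weights are always decreasing, whereas the Dirichlet process weights are decreasing only in expectation.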
The success of ChatGPT validates the potential of large language models (LLMs) in artificial general intelligence (AGI). Subsequently, the release of LLMs has sparked the open-source community's interest in instruction-tuning, which is deemed to accelerate ChatGPT's replication process. However, research on instruction-tuning LLMs in Chinese, the world's most spoken language, is still in its early stages. Therefore, this paper presents an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook that provides valuable findings for effectively customizing LLMs that can better respond to Chinese instructions. Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types, which are the three most important elements for instruction-tuning. In addition, we conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment. We hope that this empirical study can make a modest contribution to the open Chinese version of ChatGPT. This paper will release a powerful Chinese LLM that is comparable to ChatGLM. The code and data are available at //github.com/PhoebusSi/Alpaca-CoT.
Automatic parsing of human anatomies at the instance level from 3D computed tomography (CT) scans is a prerequisite step for many clinical applications. The presence of pathologies, broken structures, or a limited field-of-view (FOV) can all make anatomy parsing algorithms vulnerable. In this work, we explore how to exploit the successful detection-then-segmentation paradigm in 3D medical data, and propose a steerable, robust, and efficient computing framework for detection, identification, and segmentation of anatomies in CT scans. Considering the complicated shapes, sizes, and orientations of anatomies, and without loss of generality, we present a nine degrees-of-freedom (9-DoF) pose estimation solution in full 3D space using a novel single-stage, non-hierarchical forward representation. Our whole framework is executed in a steerable manner where any anatomy of interest can be directly retrieved to further boost the inference efficiency. We have validated the proposed method on three medical imaging parsing tasks: ribs, spine, and abdominal organs. For rib parsing, CT scans have been annotated at the rib instance level for quantitative evaluation, and similarly for spine vertebrae and abdominal organs. Extensive experiments on 9-DoF box detection and rib instance segmentation demonstrate the effectiveness of our framework (with an identification rate of 97.0% and a segmentation Dice score of 90.9%) at high efficiency, comparing favorably against several strong baselines (e.g., CenterNet, FCOS, and nnU-Net). For spine identification and segmentation, our method achieves a new state-of-the-art result on the public CTSpine1K dataset. Last, we report highly competitive results in multi-organ segmentation at the FLARE22 competition. Our annotations, code, and models will be made publicly available at: //github.com/alibaba-damo-academy/Med_Query.
Large language models (LLMs) fine-tuned with reinforcement learning from human feedback (RLHF) have been used in some of the most widely deployed AI models to date, such as OpenAI's ChatGPT, Anthropic's Claude, or Meta's LLaMA-2. While there has been significant work developing these methods, our understanding of the benefits and downsides of each stage in RLHF is still limited. To fill this gap, we present an extensive analysis of how each stage of the process (i.e., supervised fine-tuning (SFT), reward modelling, and RLHF) affects two key properties: out-of-distribution (OOD) generalisation and output diversity. OOD generalisation is crucial given the wide range of real-world scenarios in which these models are being used, while output diversity refers to the model's ability to generate varied outputs and is important for a variety of use cases. We perform our analysis across two base models on both summarisation and instruction following tasks, the latter being highly relevant for current LLM use cases. We find that RLHF generalises better than SFT to new inputs, particularly as the distribution shift between train and test becomes larger. However, RLHF significantly reduces output diversity compared to SFT across a variety of measures, implying a trade-off in current LLM fine-tuning methods between generalisation and diversity. Our results provide guidance on which fine-tuning method should be used depending on the application, and show that more research is needed to improve the trade-off between generalisation and diversity.
We study the optimal order (or sequence) in which to contract a tensor network at minimal computational cost. We consider two versions of this optimal sequence: one that minimizes the number of operations (OMS) and one that minimizes the time complexity (CMS). Existing results show only that OMS is NP-hard; no conclusion is known for the CMS problem. In this work, we first reduce CMS to CMS-0, a sub-problem of CMS with no free indices. Then we prove that CMS is easier than OMS, both in general and in tree cases. Last but not least, we prove that CMS is still NP-hard. Based on these results, we establish relationships between the hardness of different tensor network contraction problems.
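The difference between the two objectives can be illustrated on a toy network. The sketch below evaluates a given pairwise contraction sequence and reports both the total number of multiplications (the quantity OMS minimizes) and the cost of the most expensive single step, which is assumed here to be what governs the time complexity targeted by CMS. The helper names, the three-tensor chain `A`, `B`, `C`, and the index sizes are hypothetical, and the sketch assumes every index is shared by at most two tensors.

```python
from math import prod


def contraction_cost(dims, left, right):
    """Multiplications for one pairwise contraction: product of all index sizes involved."""
    return prod(dims[i] for i in set(left) | set(right))


def result_indices(left, right):
    """Indices that survive a pairwise contraction (those not shared by both operands)."""
    return tuple(sorted(set(left) ^ set(right)))


def evaluate_sequence(dims, tensors, order):
    """Return (total operations, most expensive single step) for a contraction order.

    `order` is a list of name pairs; the result of contracting a and b is stored as a + b.
    """
    live = dict(tensors)
    total, peak = 0, 0
    for a, b in order:
        cost = contraction_cost(dims, live[a], live[b])
        total += cost
        peak = max(peak, cost)
        live[a + b] = result_indices(live.pop(a), live.pop(b))
    return total, peak


# Hypothetical chain of three matrices: A[i,j] B[j,k] C[k,l].
dims = {"i": 10, "j": 100, "k": 5, "l": 50}
tensors = {"A": ("i", "j"), "B": ("j", "k"), "C": ("k", "l")}

left_first = evaluate_sequence(dims, tensors, [("A", "B"), ("AB", "C")])   # (7500, 5000)
right_first = evaluate_sequence(dims, tensors, [("B", "C"), ("A", "BC")])  # (75000, 50000)
```

In this small example the same order is best under both criteria; in general the two objectives can select different sequences, which is why they are treated as separate problems.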
The formulation of Mean Field Games (MFG) typically requires continuous differentiability of the Hamiltonian in order to determine the advective term in the Kolmogorov--Fokker--Planck equation for the density of players. However, in many cases of practical interest, the underlying optimal control problem may exhibit bang-bang controls, which typically lead to nondifferentiable Hamiltonians. We develop the analysis and numerical analysis of stationary MFG for the general case of convex, Lipschitz, but possibly nondifferentiable Hamiltonians. In particular, we propose a generalization of the MFG system as a Partial Differential Inclusion (PDI) based on interpreting the derivative of the Hamiltonian in terms of subdifferentials of convex functions. We establish existence of a weak solution to the MFG PDI system, and we further prove uniqueness under a similar monotonicity condition to the one considered by Lasry and Lions. We then propose a monotone finite element discretization of the problem, and we prove strong $H^1$-norm convergence of the approximations to the value function and strong $L^q$-norm convergence of the approximations of the density function. We illustrate the performance of the numerical method in numerical experiments featuring nonsmooth solutions.
Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering. General neural architectures that jointly learn representations and transformations of text are very data-inefficient, and it is hard to analyse their reasoning process. These issues are addressed by end-to-end differentiable reasoning systems such as Neural Theorem Provers (NTPs), although they can only be used with small-scale symbolic KBs. In this paper we first propose Greedy NTPs (GNTPs), an extension to NTPs addressing their complexity and scalability limitations, thus making them applicable to real-world datasets. This result is achieved by dynamically constructing the computation graph of NTPs and including only the most promising proof paths during inference, thus obtaining orders of magnitude more efficient models. Then, we propose a novel approach for jointly reasoning over KBs and textual mentions, by embedding logic facts and natural language sentences in a shared embedding space. We show that GNTPs perform on par with NTPs at a fraction of their cost while achieving competitive link prediction results on large datasets, providing explanations for predictions, and inducing interpretable models. Source code, datasets, and supplementary material are available online at //github.com/uclnlp/gntp.
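The idea of keeping only the most promising proof paths can be sketched as a top-k nearest-neighbour selection in embedding space: instead of attempting to unify a goal with every fact in the KB, only the k facts whose embeddings are most similar to the goal are expanded. The code below is an illustrative sketch, not the GNTP implementation; the function names, the RBF similarity, the `width` parameter, and the toy embeddings are all assumptions.

```python
import heapq
import math


def rbf_similarity(u, v, width=1.0):
    """Gaussian (RBF) similarity between two embedding vectors, in (0, 1]."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def top_k_facts(goal, facts, k):
    """Names of the k facts whose embeddings are most similar to the goal embedding.

    `facts` maps a fact name to its embedding; only these k facts would be
    expanded further, and all other proof paths are pruned.
    """
    return heapq.nlargest(k, facts, key=lambda name: rbf_similarity(goal, facts[name]))


# Toy 2D embeddings (hypothetical values).
facts = {"a": [0.0, 0.1], "b": [5.0, 5.0], "c": [0.2, 0.0]}
selected = top_k_facts([0.0, 0.0], facts, k=2)  # ['a', 'c']
```

A linear scan is used here for clarity; an approximate nearest-neighbour index would be the natural choice for a large KB.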
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.