We propose a novel inverse rendering method that enables the transformation of existing indoor panoramas with new indoor furniture layouts under natural illumination. To achieve this, we captured indoor HDR panoramas along with real-time outdoor hemispherical HDR photographs. Indoor and outdoor HDR images were linearly calibrated with measured absolute luminance values for accurate scene relighting. Our method consists of three key components: (1) panoramic furniture detection and removal, (2) automatic floor layout design, and (3) global rendering with scene geometry, new furniture objects, and a real-time outdoor photograph. We demonstrate the effectiveness of our workflow in rendering indoor scenes under different outdoor illumination conditions. Additionally, we contribute a new calibrated HDR (Cali-HDR) dataset that consists of 137 calibrated indoor panoramas and their associated outdoor photographs.
Functional data present as functions or curves possessing a spatial or temporal component. These components by nature have a fixed observational domain. Consequently, any asymptotic investigation requires modelling the increased correlation among observations as density increases due to this fixed domain constraint. One such appropriate stochastic process is the Ornstein-Uhlenbeck process. Utilizing this spatial autoregressive process, we demonstrate that parameter estimators for a simple linear regression model display inconsistency in an infill asymptotic domain. Such results are contrary to those expected under the customary increasing domain asymptotics. Although none of these estimator variances approach zero, they do display a pattern of diminishing return regarding decreasing estimator variance as sample size increases. This may prove invaluable to a practitioner as this indicates perhaps an optimal sample size to cease data collection. This in turn reduces time and data collection cost because little information is gained in sampling beyond a certain sample size.
Mittag-Leffler correlated noise (M-L noise) plays a crucial role in the dynamics of complex systems, yet the scientific community has lacked tools for its direct generation. Addressing this gap, our work introduces GenML, a Python library specifically designed for generating M-L noise. We detail the architecture and functionalities of GenML and its underlying algorithmic approach, which enables the precise simulation of M-L noise. The effectiveness of GenML is validated through quantitative analyses of autocorrelation functions and diffusion behaviors, showcasing its capability to accurately replicate theoretical noise properties. Our contribution with GenML enables the effective application of M-L noise data in numerical simulation and data-driven methods for describing complex systems, moving beyond mere theoretical modeling.
Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., in allocating food donations to food banks, course seats to students, and, more recently, even for traffic congestion management. Yet the applicability of these mechanisms remains limited in repeated auction settings, as it is challenging for users to learn how to bid an artificial currency that has no value outside the auctions. Indeed, users must jointly learn the value of the currency in addition to how to spend it optimally. In this work, we study the problem of learning to bid in two prominent classes of artificial currency auctions: those in which currency, which users spend to obtain public resources, is only issued at the beginning of a finite period; and those where, in addition to the initial currency endowment, currency payments are redistributed to users at each time step. In the latter class, the currency has been referred to as karma, since users do not only spend karma to obtain public resources but also gain karma for yielding them. In both classes, we propose a simple learning strategy, called adaptive karma pacing, and show that this strategy a) is asymptotically optimal for a single user bidding against competing bids drawn from a stationary distribution; b) leads to convergent learning dynamics when all users adopt it; and c) constitutes an approximate Nash equilibrium as the number of users grows. Our results require a novel analysis in comparison to adaptive pacing strategies in monetary auctions, since we depart from the classical assumption that the currency has known value outside the auctions, and moreover consider that the currency is both spent and gained in the class of auctions with redistribution.
This paper presents an efficient method for updating particles in a particle filter (PF) to address the position estimation problem when dealing with sharp-peaked likelihood functions derived from multiple observations. Sharp-peaked likelihood functions commonly arise from millimeter-accurate distance observations of carrier phases in the global navigation satellite system (GNSS). However, when such likelihood functions are used for particle weight updates, the absence of particles within the peaks leads to all particle weights becoming zero. To overcome this problem, in this study, a straightforward and effective approach is introduced for updating particles when dealing with sharp-peaked likelihood functions obtained from multiple observations. The proposed method, termed as the multiple update PF, leverages prior knowledge regarding the spread of distribution for each likelihood function and conducts weight updates and resampling iteratively in the particle update process, prioritizing the likelihood function spreads. Experimental results demonstrate the efficacy of our proposed method, particularly when applied to position estimation utilizing GNSS pseudorange and carrier phase observations. The multiple update PF exhibits faster convergence with fewer particles when compared to the conventional PF. Moreover, vehicle position estimation experiments conducted in urban environments reveal that the proposed method outperforms conventional GNSS positioning techniques, yielding more accurate position estimates.
Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment. Distilling Step-by-Step (DSS), a novel method utilizing chain-of-thought (CoT) distillation, has demonstrated promise by imbuing smaller models with the superior reasoning capabilities of their larger counterparts. In DSS, the distilled model acquires the ability to generate rationales and predict labels concurrently through a multi-task learning framework. However, DSS overlooks the intrinsic relationship between the two training tasks, leading to ineffective integration of CoT knowledge with the task of label prediction. To this end, we investigate the mutual relationship of the two tasks from Information Bottleneck perspective and formulate it as maximizing the mutual information of the representation features of the two tasks. We propose a variational approach to solve this optimization problem using a learning-based method. Our experimental results across four datasets demonstrate that our method outperforms the state-of-the-art DSS. Our findings offer insightful guidance for future research on language model distillation as well as applications involving CoT. Code and models will be released soon.
This paper introduces a novel neuro-symbolic architecture for relation classification (RC) that combines rule-based methods with contemporary deep learning techniques. This approach capitalizes on the strengths of both paradigms: the adaptability of rule-based systems and the generalization power of neural networks. Our architecture consists of two components: a declarative rule-based model for transparent classification and a neural component to enhance rule generalizability through semantic text matching. Notably, our semantic matcher is trained in an unsupervised domain-agnostic way, solely with synthetic data. Further, these components are loosely coupled, allowing for rule modifications without retraining the semantic matcher. In our evaluation, we focused on two few-shot relation classification datasets: Few-Shot TACRED and a Few-Shot version of NYT29. We show that our proposed method outperforms previous state-of-the-art models in three out of four settings, despite not seeing any human-annotated training data. Further, we show that our approach remains modular and pliable, i.e., the corresponding rules can be locally modified to improve the overall model. Human interventions to the rules for the TACRED relation \texttt{org:parents} boost the performance on that relation by as much as 26\% relative improvement, without negatively impacting the other relations, and without retraining the semantic matching component.
With the continuous growth in the number of parameters of transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have demonstrated remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to specific downstream tasks, especially in environments with limited computational resources. Parameter Efficient Fine-Tuning (PEFT) offers an effective solution by reducing the number of fine-tuning parameters and memory usage while achieving comparable performance to full fine-tuning. The demands for fine-tuning PLMs, especially LLMs, have led to a surge in the development of PEFT methods, as depicted in Fig. 1. In this paper, we present a comprehensive and systematic review of PEFT methods for PLMs. We summarize these PEFT methods, discuss their applications, and outline future directions. Furthermore, we conduct experiments using several representative PEFT methods to better understand their effectiveness in parameter efficiency and memory efficiency. By offering insights into the latest advancements and practical applications, this survey serves as an invaluable resource for researchers and practitioners seeking to navigate the challenges and opportunities presented by PEFT in the context of PLMs.
This article presents the affordances that Generative Artificial Intelligence can have in disinformation context, one of the major threats to our digitalized society. We present a research framework to generate customized agent-based social networks for disinformation simulations that would enable understanding and evaluation of the phenomena whilst discussing open challenges.
Substantial efforts have been devoted more recently to presenting various methods for object detection in optical remote sensing images. However, the current survey of datasets and deep learning based methods for object detection in optical remote sensing images is not adequate. Moreover, most of the existing datasets have some shortcomings, for example, the numbers of images and object categories are small scale, and the image diversity and variations are insufficient. These limitations greatly affect the development of deep learning based object detection methods. In the paper, we provide a comprehensive review of the recent deep learning based object detection progress in both the computer vision and earth observation communities. Then, we propose a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, which we name as DIOR. The dataset contains 23463 images and 192472 instances, covering 20 object classes. The proposed DIOR dataset 1) is large-scale on the object categories, on the object instance number, and on the total image number; 2) has a large range of object size variations, not only in terms of spatial resolutions, but also in the aspect of inter- and intra-class size variability across objects; 3) holds big variations as the images are obtained with different imaging conditions, weathers, seasons, and image quality; and 4) has high inter-class similarity and intra-class diversity. The proposed benchmark can help the researchers to develop and validate their data-driven methods. Finally, we evaluate several state-of-the-art approaches on our DIOR dataset to establish a baseline for future research.
Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.