We study a system in which two-state Markov sources send status updates to a common receiver over a slotted ALOHA random access channel. We characterize the performance of the system in terms of state estimation entropy (SEE), which measures the uncertainty at the receiver about the sources' state. Two channel access strategies are considered: a reactive policy that depends on the source behavior and a random one that is independent of it. We prove that the considered policies can be studied using two different hidden Markov models (HMMs) and show through density evolution (DE) analysis that the reactive strategy outperforms the random one in terms of SEE, while the opposite is true for age of information (AoI). Furthermore, we characterize the probability of error in the state estimation at the receiver, considering a maximum a posteriori (MAP) estimator and a low-complexity (decode-and-hold) estimator. Our study provides useful insights into the design trade-offs that emerge when different performance metrics, such as SEE, AoI, or state estimation error probability, are adopted. Moreover, we show how the source statistics significantly impact system performance.
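A minimal simulation sketch of the random access strategy with a decode-and-hold estimator may help make the setup concrete; all parameter values here (number of sources, transition probabilities, transmission probability) are hypothetical and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 10, 100_000        # sources and slots (hypothetical values)
q01 = q10 = 0.05          # two-state Markov transition probabilities
p_tx = 1.0 / N            # random-access transmission probability

state = np.zeros(N, dtype=int)   # true source states
est = np.zeros(N, dtype=int)     # receiver's decode-and-hold estimates
errors = 0

for _ in range(T):
    # Each source evolves as a two-state Markov chain
    flip = rng.random(N) < np.where(state == 0, q01, q10)
    state = np.where(flip, 1 - state, state)

    # Slotted ALOHA: each source transmits independently with prob p_tx;
    # a slot is useful only if exactly one source transmits (no collision)
    tx = rng.random(N) < p_tx
    if tx.sum() == 1:
        i = np.flatnonzero(tx)[0]
        est[i] = state[i]        # receiver decodes and holds this state

    errors += np.count_nonzero(est != state)

print("decode-and-hold error probability:", errors / (N * T))
```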
This paper proposes a method for assessing differential item functioning (DIF) in item response theory (IRT) models. The method does not require pre-specification of anchor items, which is its main virtue. It is developed in two main steps: first, DIF is re-formulated as a problem of outlier detection in IRT-based scaling; then, the latter problem is tackled using established methods from robust statistics. The proposal is a redescending M-estimator of IRT scaling parameters that is tuned to flag items with DIF at the desired asymptotic Type I error rate. One way of quantifying the robustness of the estimator is in terms of its finite-sample breakdown point, which is shown to equal 1/2 (i.e., the estimator remains bounded whenever fewer than 1/2 of the items on an assessment exhibit DIF). This theoretical result is complemented by simulation studies that illustrate the performance of the estimator and its associated test of DIF. The simulation studies show that the proposed method compares favorably to currently available approaches, and a real data example illustrates its application in a research context where pre-specification of anchor items is infeasible. The focus of the paper is the two-parameter logistic model in two independent groups, with extensions to other settings considered in the conclusion.
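To illustrate the general idea (not the paper's exact estimator), here is a sketch of a redescending M-estimate of a single scaling shift between two groups' item difficulties, computed by iteratively reweighted least squares with Tukey's biweight; items receiving zero weight are flagged as DIF candidates. The data and tuning constant are hypothetical:

```python
import numpy as np

def tukey_weight(r, c=4.685):
    """Tukey biweight: redescending, so weights vanish for |r| > c."""
    w = (1 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) <= c, w, 0.0)

def robust_shift(b_ref, b_foc, c=4.685, tol=1e-8, max_iter=100):
    """IRLS estimate of the scaling shift between two groups' item
    difficulties; DIF items appear as downweighted outliers."""
    d = b_foc - b_ref
    mu = np.median(d)                                   # robust start
    scale = 1.4826 * np.median(np.abs(d - mu)) + 1e-12  # MAD scale
    for _ in range(max_iter):
        w = tukey_weight((d - mu) / scale, c)
        mu_new = np.sum(w * d) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu, w   # w == 0 flags candidate DIF items

rng = np.random.default_rng(0)
b_ref = np.linspace(-2, 2, 20)
b_foc = b_ref + 0.5 + 0.05 * rng.standard_normal(20)  # true shift 0.5
b_foc[-1] += 2.0                                      # one DIF item
mu, w = robust_shift(b_ref, b_foc)
print(round(mu, 3), np.flatnonzero(w == 0))           # shift ~0.5, item 19
```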
We consider sequential state and parameter learning in state-space models with intractable state transition and observation processes. By exploiting low-rank tensor-train (TT) decompositions, we propose new sequential learning methods for joint parameter and state estimation under the Bayesian framework. Our key innovation is the introduction of scalable function approximation tools such as TT for recursively learning the sequentially updated posterior distributions. The function approximation perspective of our methods offers tractable error analysis and potentially alleviates the particle degeneracy faced by many particle-based methods. In addition to the new insights into algorithmic design, our methods complement conventional particle-based methods. Our TT-based approximations naturally define conditional Knothe--Rosenblatt (KR) rearrangements that lead to filtering, smoothing, and path estimation accompanying our sequential learning algorithms, opening the door to removing potential approximation bias. We also explore several preconditioning techniques based on either linear or nonlinear KR rearrangements to enhance the approximation power of TT for practical problems. We demonstrate the efficacy and efficiency of our proposed methods on several state-space models, in which our methods achieve state-of-the-art estimation accuracy and computational performance.
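As background for readers unfamiliar with the TT format, here is a compact sketch of the standard TT-SVD sweep (not the paper's algorithm), which factors a dense tensor into a chain of three-way cores; the test tensor and truncation threshold are arbitrary:

```python
import numpy as np

def tt_svd(A, eps=1e-10):
    """Standard TT-SVD: sequential SVDs turn a d-way tensor into a
    chain of cores G_k with shape (r_{k-1}, n_k, r_k)."""
    dims, cores = A.shape, []
    C, r = A, 1
    for k in range(len(dims) - 1):
        C = C.reshape(r * dims[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        rank = max(1, int(np.sum(s > eps)))   # truncate negligible modes
        cores.append(U[:, :rank].reshape(r, dims[k], rank))
        C = s[:rank, None] * Vt[:rank]
        r = rank
    cores.append(C.reshape(r, dims[-1], 1))
    return cores

# Low-rank test tensor: an outer product, so all TT ranks are 1
x = np.arange(4.0)
T = np.multiply.outer(np.multiply.outer(np.sin(x), np.cos(x)), x + 1.0)
cores = tt_svd(T)

# Contract the cores back together and check the reconstruction
R = cores[0]
for G in cores[1:]:
    R = np.tensordot(R, G, axes=([-1], [0]))
print(np.allclose(R.squeeze(axis=(0, -1)), T))   # True
```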
It is challenging to quantify numerical preferences over different objectives in a multi-objective decision-making problem. However, demonstrations from a user are often available. We propose an algorithm to infer linear preference weights from either optimal or near-optimal demonstrations. The algorithm is evaluated in three environments against two baseline methods. Empirical results demonstrate significant improvements over the baseline algorithms, in terms of both time requirements and accuracy of the inferred preferences. In future work, we plan to evaluate the algorithm's effectiveness in a multi-agent system, where one agent infers the preferences of an opponent using our preference inference algorithm.
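To make the problem concrete, here is a minimal sketch (not the paper's algorithm) that recovers two-objective linear preference weights by searching a simplex grid for the weights under which the demonstrated returns dominate a set of alternative returns by the largest margin; all returns are hypothetical:

```python
import numpy as np

def infer_weights(demo_returns, alt_returns, grid=101):
    """Pick simplex weights under which the demonstrations look
    closest to optimal (largest worst-case margin over alternatives)."""
    best_w, best_margin = None, -np.inf
    for a in np.linspace(0, 1, grid):     # 2-objective simplex grid
        w = np.array([a, 1 - a])
        margin = (demo_returns @ w).min() - (alt_returns @ w).max()
        if margin > best_margin:
            best_margin, best_w = margin, w
    return best_w

# Hypothetical vector returns: demos trade off objective 0 vs. objective 1
demos = np.array([[8.0, 2.0], [7.5, 2.5]])
alts = np.array([[2.0, 8.0], [5.0, 5.0]])
print(infer_weights(demos, alts))   # weights favoring objective 0
```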
Convolutional neural networks have demonstrated dermatologist-level performance in the classification of melanoma from skin lesion images, but prediction irregularities due to biases within the training data are an issue that should be addressed before widespread deployment is possible. In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipeline using two leading bias unlearning techniques. We show that the biases introduced by surgical markings and rulers, identified in previous studies, can be reasonably mitigated using these bias removal methods. We also demonstrate the generalisation benefits of unlearning spurious variation relating to the imaging instrument used to capture lesion images. Our experimental results provide evidence that the effects of each of the aforementioned biases are notably reduced, with different debiasing techniques excelling at different tasks.
Total correlation (TC) is a crucial index for measuring the dependence among the marginal distributions of a multidimensional random variable, and it is frequently applied as an inductive bias in representation learning. Previous research has shown that the TC value can be estimated through decomposition using mutual information (MI) bounds. However, we find through theoretical derivation and qualitative experiments that, due to the use of importance sampling in the decomposition process, the bias of TC estimates based on MI bounds is amplified when the proposal distribution in the sampling differs significantly from the target distribution. To reduce this estimation bias, we propose a TC estimation correction model based on supervised learning, which takes the training iteration loss sequence of the MI-bound-based TC estimator as input features and outputs the true TC value. Experiments show that our proposed method can improve the accuracy of TC estimation and eliminate the variance generated by the TC estimation process.
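For reference, TC has a closed form for a multivariate Gaussian, $\mathrm{TC} = \frac{1}{2}(\sum_i \log \Sigma_{ii} - \log\det\Sigma)$, which is a common way to obtain ground-truth values against which sample-based TC estimators can be checked; the covariance below is an arbitrary example:

```python
import numpy as np

def gaussian_tc(cov):
    """Closed-form total correlation of a multivariate Gaussian:
    TC = 0.5 * (sum_i log cov_ii - log det cov), in nats."""
    return 0.5 * (np.sum(np.log(np.diag(cov))) - np.linalg.slogdet(cov)[1])

# Ground truth for a correlated 3-D Gaussian; an MI-bound estimator
# trained on samples from this distribution can be checked against it
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
print(gaussian_tc(cov))   # ~0.367 nats
```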
This paper presents an accelerated proximal gradient method for multiobjective optimization, in which each objective function is the sum of a continuously differentiable, convex function and a closed, proper, convex function. Extending first-order methods to multiobjective problems without scalarization has been widely studied, but providing accelerated methods with rigorous convergence-rate proofs remains an open problem. Our proposed method is a multiobjective generalization of the accelerated proximal gradient method, also known as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), for scalar optimization. The key to this successful extension is solving a subproblem with terms exclusive to the multiobjective case. This approach allows us to establish the global convergence rate of the proposed method ($O(1 / k^2)$), using a merit function to measure complexity. Furthermore, we present an efficient way to solve the subproblem via its dual representation, and we confirm the validity of the proposed method through numerical experiments.
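For context, the scalar FISTA that the paper generalizes can be sketched in a few lines; the instance below solves a lasso problem, $\min_x \frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$, with hypothetical data, and attains the same $O(1/k^2)$ rate in the scalar setting:

```python
import numpy as np

def fista(A, b, lam, L, iters=500):
    """FISTA for min 0.5||Ax - b||^2 + lam * ||x||_1 with step 1/L,
    where L >= ||A||_2^2 (Lipschitz constant of the smooth part)."""
    x = y = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        g = y - A.T @ (A @ y - b) / L                # gradient step on y
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)  # l1 prox
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
b = A @ (rng.standard_normal(100) * (rng.random(100) < 0.1))  # sparse truth
L = np.linalg.norm(A, 2) ** 2
x = fista(A, b, lam=0.1, L=L)
print(np.count_nonzero(np.abs(x) > 1e-6))   # recovered support size
```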
Conventional supervised learning methods typically assume i.i.d. samples and are found to be sensitive to out-of-distribution (OOD) data. We propose Generative Causal Representation Learning (GCRL), which leverages causality to facilitate knowledge transfer under distribution shifts. While we evaluate the effectiveness of our proposed method on human trajectory prediction models, GCRL can be applied to other domains as well. First, we propose a novel causal model that explains the generative factors in motion forecasting datasets using features that are common across all environments and features that are specific to each environment. Selection variables are used to determine which parts of the model can be directly transferred to a new environment without fine-tuning. Second, we propose an end-to-end variational learning paradigm to learn the causal mechanisms that generate observations from features. GCRL is supported by strong theoretical results that imply identifiability of the causal model under certain assumptions. Experimental results on synthetic and real-world motion forecasting datasets show the robustness and effectiveness of our proposed method for knowledge transfer under zero-shot and low-shot settings, substantially outperforming prior motion forecasting models on out-of-distribution prediction. Our code is available at //github.com/sshirahmad/GCRL.
With the rapid development of facial forgery techniques, forgery detection has attracted increasing attention due to security concerns. Existing approaches attempt to use frequency information to mine subtle artifacts in high-quality forged faces. However, their exploitation of frequency information is coarse-grained, and more importantly, their vanilla learning process struggles to extract fine-grained forgery traces. To address this issue, we propose a progressive enhancement learning framework that exploits both RGB and fine-grained frequency clues. Specifically, we perform a fine-grained decomposition of RGB images to completely decouple the real and fake traces in the frequency space. Subsequently, we build a two-branch network combined with self-enhancement and mutual-enhancement modules. The self-enhancement module captures the traces in different input spaces based on spatial noise enhancement and channel attention. The mutual-enhancement module concurrently enhances RGB and frequency features by communicating in the shared spatial dimension. The progressive enhancement process facilitates the learning of discriminative features with fine-grained face forgery clues. Extensive experiments on several datasets show that our method outperforms state-of-the-art face forgery detection methods.
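As a generic illustration of fine-grained frequency decomposition (not the paper's specific scheme), the sketch below splits a grayscale image into low/mid/high DCT bands that sum back to the original; the band boundaries are arbitrary:

```python
import numpy as np
from scipy.fft import dctn, idctn

def frequency_bands(img, splits=(0.15, 0.5)):
    """Split a grayscale image into low/mid/high DCT frequency bands;
    the masks partition the plane, so the bands sum to the original."""
    h, w = img.shape
    coeff = dctn(img, norm="ortho")
    u, v = np.meshgrid(np.arange(w) / w, np.arange(h) / h)
    radius = u + v                    # simple diagonal frequency measure
    edges = (0.0, *splits, np.inf)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (radius >= lo) & (radius < hi)
        bands.append(idctn(coeff * mask, norm="ortho"))
    return bands

img = np.random.default_rng(0).random((64, 64))
low, mid, high = frequency_bands(img)
print(np.allclose(low + mid + high, img))   # True
```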
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over control to humans when it detects unusual scenes or objects that it has never seen before and cannot make a safe decision about. This problem first emerged in 2017 and has since received increasing attention from the research community, leading to a plethora of methods, ranging from classification-based to density-based to distance-based approaches. Meanwhile, several other problems are closely related to OOD detection in terms of motivation and methodology. These include anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). Despite having different definitions and problem settings, these problems often confuse readers and practitioners, and as a result, some existing studies misuse terms. In this survey, we first present a generic framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, the five problems can be seen as special cases or sub-tasks and are easier to distinguish. We then conduct a thorough review of each of the five areas, summarizing their recent technical developments. We conclude the survey with open challenges and potential research directions.
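As a concrete example of a classification-based method, the maximum softmax probability (MSP) baseline introduced in 2017 scores an input by the confidence of the classifier's predicted class and flags low-confidence inputs as OOD; the threshold and logits below are hypothetical:

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability (MSP): the classic classification-based
    OOD score; a low MSP suggests the input is out-of-distribution."""
    z = logits - logits.max(axis=-1, keepdims=True)   # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

def detect_ood(logits, threshold=0.5):
    return msp_score(logits) < threshold   # True -> flag as OOD

# Confident in-distribution logits vs. diffuse OOD-like logits
print(detect_ood(np.array([[8.0, 0.5, 0.2],
                           [1.1, 1.0, 0.9]])))   # [False  True]
```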
Object tracking is the cornerstone of many visual analytics systems. While considerable progress has been made in this area in recent years, robust, efficient, and accurate tracking in real-world video remains a challenge. In this paper, we present a hybrid tracker that leverages motion information from the compressed video stream and a general-purpose semantic object detector acting on decoded frames to construct a fast and efficient tracking engine suitable for a number of visual analytics applications. The proposed approach is compared with several well-known recent trackers on the OTB tracking dataset. The results indicate advantages of the proposed method in terms of speed and/or accuracy. Another advantage of the proposed method over most existing trackers is its simplicity and deployment efficiency, which stem from the fact that it reuses and re-purposes resources and information that may already exist in the system for other purposes.
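One plausible building block of such a hybrid (an illustrative sketch, not the paper's exact design) is to propagate a tracked box between detector refreshes using the motion vectors already present in the compressed stream; the block positions and vectors below are hypothetical:

```python
import numpy as np

def propagate_box(box, motion_vectors, mv_positions):
    """Shift a tracked box by the median compressed-stream motion vector
    whose block center falls inside the box (no full decoding needed)."""
    x1, y1, x2, y2 = box
    inside = ((mv_positions[:, 0] >= x1) & (mv_positions[:, 0] <= x2) &
              (mv_positions[:, 1] >= y1) & (mv_positions[:, 1] <= y2))
    if not inside.any():
        return box                       # no evidence: hold position
    dx, dy = np.median(motion_vectors[inside], axis=0)
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)

# Macroblock centers and their motion vectors (hypothetical values)
pos = np.array([[24.0, 24.0], [40.0, 24.0], [200.0, 200.0]])
mv = np.array([[3.0, -1.0], [3.0, -1.0], [0.0, 0.0]])
print(propagate_box((16, 16, 48, 48), mv, pos))   # (19, 15, 51, 47)
```

Between keyframes the box is moved almost for free this way, and a detector pass on a decoded frame periodically re-anchors it, which is one way the speed/accuracy trade-off described above can arise.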