Disentanglement aims to recover meaningful latent ground-truth factors solely from the observed distribution, and is formalized through the theory of identifiability. The identifiability of independent latent factors has been proven impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.
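To fix ideas, here is a hedged sketch of the setting in illustrative notation (the paper's exact definitions may differ): the observation is a diffeomorphic image of the latent vector, and each factor is quantized by the jump points of its own marginal density,
\[
x = f(z), \qquad f : \mathcal{Z} \subseteq \mathbb{R}^d \to \mathcal{X} \ \text{a diffeomorphism}, \qquad z = (z_1,\dots,z_d) \sim p,
\]
\[
D_i = \{\, t \in \mathbb{R} : p_i \text{ is discontinuous at } t \,\}, \qquad q_i(z_i) = \#\{\, t \in D_i : t \le z_i \,\}, \qquad i = 1,\dots,d,
\]
and the claim is that the quantized factors $(q_1(z_1),\dots,q_d(z_d))$ can be recovered from the distribution of $x$ alone, without assuming the $z_i$ are statistically independent.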
The development of nonlinear optimization algorithms capable of performing reliably in the presence of noise has garnered considerable attention lately. This paper advocates for strategies to create noise-tolerant nonlinear optimization algorithms by adapting classical deterministic methods. These adaptations follow certain design guidelines described here, which make use of estimates of the noise level in the problem. The application of our methodology is illustrated by the development of a line search gradient projection method, which is tested on an engineering design problem. It is shown that a new self-calibrated line search and noise-aware finite-difference techniques are effective even in the high noise regime. Numerical experiments investigate the resiliency of key algorithmic components. A convergence analysis of the line search gradient projection method establishes convergence to a neighborhood of the solution.
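As a rough illustration of the kind of design guideline referred to above, the sketch below shows a noise-aware forward-difference gradient (difference interval scaled by an estimate of the noise level) and an Armijo-type backtracking test relaxed by the noise level. The constants, names, and the relaxation factor are illustrative assumptions, not the paper's exact rules.

```python
import numpy as np

def fd_gradient(f, x, eps_f):
    """Forward-difference gradient with a noise-aware difference interval.

    A common guideline scales the interval with the square root of the
    noise level eps_f; the factor 2.0 is an illustrative choice.
    """
    h = 2.0 * np.sqrt(eps_f)                 # noise-aware difference interval
    fx = f(x)
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g, fx

def relaxed_armijo(f, x, g, d, eps_f, c1=1e-4, tau=0.5, alpha=1.0):
    """Backtracking line search whose sufficient-decrease condition is
    relaxed by a multiple of the noise level, so that noise in the
    function evaluations cannot reject otherwise acceptable steps."""
    fx = f(x)
    while alpha > 1e-12:
        if f(x + alpha * d) <= fx + c1 * alpha * float(np.dot(g, d)) + 2.0 * eps_f:
            return alpha
        alpha *= tau                          # shrink the step and retry
    return alpha
```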
We use Markov categories to develop generalizations of the theory of Markov chains and hidden Markov models in an abstract setting. This comprises characterizations of hidden Markov models in terms of local and global conditional independences, as well as algorithms for Bayesian filtering and smoothing applicable in all Markov categories with conditionals. We show that these algorithms specialize to existing ones such as the Kalman filter, the forward-backward algorithm, and the Rauch-Tung-Striebel smoother when instantiated in appropriate Markov categories. Under slightly stronger assumptions, we also prove that the sequence of outputs of the Bayes filter is itself a Markov chain, with a concrete formula for its transition maps. There are two main features of this categorical framework. The first is its generality: it can be used in any Markov category with conditionals. In particular, it provides a systematic, unified account of hidden Markov models and algorithms for filtering and smoothing in discrete probability, Gaussian probability, measure-theoretic probability, possibilistic nondeterminism, and others at the same time. The second feature is the intuitive visual representation of information flow in these algorithms in terms of string diagrams.
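To make the specialization concrete, here is a minimal sketch of the filtering recursion instantiated in the Markov category of finite probability (FinStoch), where it reduces to the classical discrete Bayes filter, i.e., the forward pass of the forward-backward algorithm. Variable names and the explicit normalization are illustrative choices, not notation from the paper.

```python
import numpy as np

def bayes_filter(prior, T, O, observations):
    """Discrete Bayes filter: the filtering recursion in FinStoch.

    prior: initial distribution over hidden states, shape (S,)
    T:     transition matrix, T[i, j] = P(next = j | current = i)
    O:     observation matrix, O[i, k] = P(obs = k | state = i)
    Returns the filtered distributions P(state_t | obs_1..t) for each t.
    """
    belief = prior.copy()
    filtered = []
    for y in observations:
        belief = belief @ T          # predict: push forward along the transition
        belief = belief * O[:, y]    # update: weight by the observation likelihood
        belief /= belief.sum()       # normalize (Bayesian inversion)
        filtered.append(belief.copy())
    return filtered
```

The same abstract recursion, run in the Markov category of Gaussian probability, becomes the Kalman filter.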
Biometric authentication has prospered because of its convenience and security. Early generations of biometric mechanisms suffered from spoofing attacks. Recently, unobservable physiological signals (e.g., Electroencephalogram, Photoplethysmogram, Electrocardiogram) used as biometrics have offered a potential remedy to this problem. In particular, the Photoplethysmogram (PPG) measures the change in blood flow of the human body by an optical method. Clinically, researchers commonly use PPG signals to obtain patients' blood oxygen saturation, heart rate, and other information to assist in diagnosing heart-related diseases. Since PPG signals contain a wealth of individual cardiac information, researchers have begun to explore their potential in cyber security applications. The unique advantages of the PPG signal (simple acquisition, resistance to theft, and liveness detection) allow it to improve the security and usability of authentication in various respects. However, research on PPG-based authentication is still in its infancy, and the lack of systematization hinders new work in this field. We conduct a comprehensive study of PPG-based authentication and discuss these applications' limitations before pointing out future research directions.
Correlation clustering is a well-known unsupervised learning setting that deals with positive and negative pairwise similarities. In this paper, we study the case where the pairwise similarities are not given in advance and must be queried in a cost-efficient way. To this end, we develop a generic active learning framework for this task that offers several advantages, e.g., flexibility in the type of feedback that a user/annotator can provide, adaptation to any correlation clustering algorithm and query strategy, and robustness to noise. In addition, we propose and analyze a number of novel query strategies suited to this setting. We demonstrate the effectiveness of our framework and the proposed query strategies via several experimental studies.
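The overall shape of such a framework can be sketched as a query-then-cluster loop in which both the clustering algorithm and the query strategy are pluggable. The skeleton below is an illustration of that shape, not the paper's algorithm, and the placeholder strategy is deliberately trivial.

```python
import random
import numpy as np

def active_correlation_clustering(n, oracle, cluster_fn, query_strategy, budget):
    """Skeleton of an active correlation-clustering loop (illustrative only).

    oracle(i, j)                          -> noisy +/-1 similarity for a queried pair
    cluster_fn(S)                         -> cluster labels from the partial similarity matrix
    query_strategy(S, labels, unqueried)  -> next pair to query
    """
    S = np.zeros((n, n))                       # 0 marks "not yet queried"
    labels = np.zeros(n, dtype=int)
    unqueried = {(i, j) for i in range(n) for j in range(i + 1, n)}
    for _ in range(budget):
        if not unqueried:
            break
        i, j = query_strategy(S, labels, unqueried)
        unqueried.discard((i, j))
        S[i, j] = S[j, i] = oracle(i, j)       # possibly noisy user/annotator feedback
        labels = cluster_fn(S)                 # plug in any correlation clustering algorithm
    return labels, S

def random_query(S, labels, unqueried):
    """Trivial placeholder strategy (uniform sampling); the paper's query
    strategies are more informative."""
    return random.choice(tuple(unqueried))
```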
Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity. However, existing work often produces topic hierarchies with low affinity, rationality, and diversity, which hampers document understanding. To overcome these challenges, we propose the Transport Plan and Context-aware Hierarchical Topic Model (TraCo). Instead of the simple topic dependencies used in earlier work, we propose a transport plan dependency method. It constrains dependencies to ensure their sparsity and balance, and also regularizes topic hierarchy building with them. This improves the affinity and diversity of hierarchies. We further propose a context-aware disentangled decoder. Rather than the entangled decoding of previous work, it distributes different semantic granularities to topics at different levels through disentangled decoding. This facilitates the rationality of hierarchies. Experiments on benchmark datasets demonstrate that our method surpasses state-of-the-art baselines, effectively improving the affinity, rationality, and diversity of hierarchical topic modeling with better performance on downstream tasks.
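The abstract does not spell out the construction, but a transport-plan dependency can be illustrated with entropically regularized optimal transport: a Sinkhorn-style plan between lower-level and higher-level topic embeddings yields dependency weights whose marginals are balanced by construction, with the regularization strength controlling how concentrated (sparse) the plan is. Everything below (cost, marginals, regularizer) is an assumption for illustration, not the paper's exact formulation.

```python
import numpy as np

def transport_plan_dependencies(child_emb, parent_emb, eps=0.1, n_iter=200):
    """Sketch: dependency matrix between child and parent topics as an
    entropically regularized optimal transport plan (Sinkhorn iterations)."""
    # Pairwise cost between child and parent topic embeddings
    C = np.linalg.norm(child_emb[:, None, :] - parent_emb[None, :, :], axis=-1)
    K = np.exp(-C / eps)                                          # Gibbs kernel
    a = np.full(child_emb.shape[0], 1.0 / child_emb.shape[0])     # child marginal
    b = np.full(parent_emb.shape[0], 1.0 / parent_emb.shape[0])   # parent marginal
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):                                       # Sinkhorn scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                               # balanced transport plan
    return P / P.sum(axis=1, keepdims=True)                       # rows: child-to-parent weights
```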
Recently, the adaption problem of Information-Based Complexity (IBC) for linear problems in the randomized setting was solved in Heinrich (J. Complexity 82, 2024, 101821). Several papers treating further aspects of this problem followed. However, all examples obtained so far were vector-valued. In this paper we settle the scalar-valued case. We study the complexity of mean computation in finite dimensional sequence spaces with mixed $L_p^N$ norms. We determine the $n$-th minimal errors in the randomized adaptive and non-adaptive settings. It turns out that among the problems considered there are examples where the adaptive and non-adaptive $n$-th minimal errors deviate by a power of $n$. The gap can be (up to logarithmic factors) of the order $n^{1/4}$. We also show how to turn such results into infinite dimensional examples with suitable deviation for all $n$ simultaneously.
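For readers outside IBC, the objects can be sketched as follows; the notation is only indicative, and the exact parameter ranges, normalizations, and problem definition are those of the paper. For $x = (x_{ij})_{i,j=1}^{N}$ the mixed norm is
\[
\|x\|_{L_{p_1}^{N}(L_{p_2}^{N})} = \Bigl( \sum_{i=1}^{N} \Bigl( \sum_{j=1}^{N} |x_{ij}|^{p_2} \Bigr)^{p_1/p_2} \Bigr)^{1/p_1},
\]
and the $n$-th minimal errors compare the best randomized algorithms using $n$ function values,
\[
e_n^{\mathrm{ran,ada}}(S) = \inf_{A_n \ \text{adaptive}} \ \sup_{\|x\| \le 1} \ \mathbb{E}\,\bigl| S(x) - A_n(x) \bigr|, \qquad e_n^{\mathrm{ran,non}}(S) \ \text{defined analogously over non-adaptive } A_n,
\]
where $S$ denotes the (scalar-valued) mean computation functional on the unit ball of this space. The abstract's gap statement says that $e_n^{\mathrm{ran,non}}(S) / e_n^{\mathrm{ran,ada}}(S)$ can be of order $n^{1/4}$ up to logarithmic factors.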
Different scheduling mechanisms in Time Sensitive Networking (TSN) can be integrated to design and support complex architectures with enhanced capabilities for mixed-criticality networks. Integrating Frame Preemption (FP) with the Credit-Based Shaper (CBS) and the Gate Control List (GCL) opens up different modes and configuration choices, requiring evaluation of the many resulting combinations and their impact on Quality of Service (QoS). In this paper, we implement and quantify the integration of preemptive CBS with GCL by incorporating FP into the architecture. Our experiments show that the end-to-end delay of Audio Video Bridging (AVB) flows shaped by CBS decreases significantly (by up to 40\%) when AVB flows are assigned to the preemptable class. We further show that the jitter of Time Triggered (TT) traffic remains unaffected in the "with Hold/Release" mode. Furthermore, we propose introducing a Guardband (GB) in the "without Hold/Release" mode to reduce the jitter of the TT flows. We compare all the different integration modes, starting with CBS combined with GCL and extending it further with FP. We evaluate all feasible combinations in both synthetic and realistic scenarios and offer recommendations for practical configuration methods.
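To illustrate why preemption interacts with the guardband, the sketch below sizes a guardband as the worst-case transmission time of whatever frame (or, with FP, fragment) could still occupy the link ahead of a protected TT window. The byte values and the overhead figure are illustrative assumptions, not values taken from the paper or from the standards.

```python
def guardband_duration(max_blocking_bytes, link_rate_bps, overhead_bytes=20):
    """Illustrative guardband sizing before a protected TT window in the
    'without Hold/Release' case: reserve the worst-case transmission time
    of the largest frame (or non-preemptable fragment, when FP is enabled)
    that could still start ahead of the window."""
    bits = (max_blocking_bytes + overhead_bytes) * 8   # frame + assumed preamble/IPG overhead
    return bits / link_rate_bps                        # guardband length in seconds

# Example on a 1 Gbit/s link: a full 1522-byte AVB frame vs. an assumed
# 128-byte non-preemptable fragment (illustrative value only).
gb_without_fp = guardband_duration(1522, 1e9)
gb_with_fp = guardband_duration(128, 1e9)
```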
Recent years have witnessed a rapid development of deep generative models for creating synthetic media, such as images and videos. While the practical applications of these models in everyday tasks are enticing, it is crucial to assess the inherent risks regarding their fairness. In this work, we introduce a comprehensive framework for benchmarking the performance and fairness of conditional generative models. We develop a set of metrics, inspired by their supervised fairness counterparts, to evaluate the models on their fairness and diversity. Focusing on the specific application of image upsampling, we create a benchmark covering a wide variety of modern upsampling methods. As part of the benchmark, we introduce UnfairFace, a subset of FairFace that replicates the racial distribution of common large-scale face datasets. Our empirical study highlights the importance of using an unbiased training set and reveals variations in how the algorithms respond to dataset imbalances. Alarmingly, we find that none of the considered methods produces statistically fair and diverse results.
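In the spirit of the abstract's notion of statistically fair and diverse outputs, the sketch below tests whether the group distribution of a model's outputs matches a reference distribution and measures diversity as normalized entropy. These are illustrative stand-ins, not necessarily the benchmark's actual metrics.

```python
import numpy as np
from scipy.stats import chisquare

def group_fairness_test(pred_groups, reference_probs, alpha=0.05):
    """Chi-square test of whether the group distribution of model outputs
    matches a reference distribution (illustrative fairness check)."""
    groups = sorted(reference_probs)
    preds = np.asarray(pred_groups)
    counts = np.array([np.sum(preds == g) for g in groups])
    expected = np.array([reference_probs[g] for g in groups]) * counts.sum()
    _, p_value = chisquare(counts, expected)
    return p_value, p_value >= alpha      # failing to reject = consistent with fairness

def group_diversity(pred_groups):
    """Normalized entropy of the output group distribution (1 = uniform)."""
    _, counts = np.unique(pred_groups, return_counts=True)
    p = counts / counts.sum()
    if len(p) < 2:
        return 0.0
    return float(-(p * np.log(p)).sum() / np.log(len(p)))
```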
A new method of detecting adversarial attacks is proposed for an ensemble of Deep Neural Networks (DNNs) solving two-class pattern recognition problems. The ensemble is combined using Walsh coefficients, which are capable of approximating Boolean functions and thereby controlling the complexity of the ensemble decision boundary. The hypothesis in this paper is that decision boundaries with high curvature allow adversarial perturbations to be found, but that these perturbations in turn change the curvature of the decision boundary, which the Walsh coefficients then approximate differently than for clean images. By observing the difference in Walsh-coefficient approximation between clean and adversarial images, it is shown experimentally that the transferability of attacks may be used for detection. Furthermore, approximating the decision boundary may aid in understanding the learning and transferability properties of DNNs. While the experiments here use images, the proposed approach of modelling two-class ensemble decision boundaries could in principle be applied to any application area. Code for approximating Boolean functions using Walsh coefficients: //doi.org/10.24433/CO.3695905.v1
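As a hedged illustration of combining an ensemble with Walsh coefficients, the sketch below estimates low-order coefficients of the ensemble decision function from ±1 base-classifier votes and recombines them into a score; comparing how well the truncated approximation fits on clean versus candidate-adversarial inputs is the kind of cue the abstract describes. The estimator and truncation order are generic choices, not necessarily the paper's exact procedure (the linked code is authoritative).

```python
import numpy as np
from itertools import combinations

def walsh_coefficients(votes, labels, max_order=2):
    """Estimate low-order Walsh coefficients of the ensemble decision function.

    votes:  (n_samples, n_classifiers) array of +/-1 base-classifier outputs
    labels: (n_samples,) array of +/-1 targets
    The coefficient for a subset S is the empirical correlation between the
    label and the parity (product) of the votes in S; truncating at low order
    bounds the complexity of the induced decision boundary.
    """
    n_clf = votes.shape[1]
    coeffs = {(): float(np.mean(labels))}              # zeroth-order coefficient
    for order in range(1, max_order + 1):
        for S in combinations(range(n_clf), order):
            parity = np.prod(votes[:, list(S)], axis=1)   # Walsh basis function chi_S
            coeffs[S] = float(np.mean(labels * parity))
    return coeffs

def walsh_score(votes, coeffs):
    """Combine new votes with estimated coefficients into an ensemble score."""
    score = np.full(votes.shape[0], coeffs[()])
    for S, c in coeffs.items():
        if S:
            score += c * np.prod(votes[:, list(S)], axis=1)
    return score
```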
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.