We present Lilac, a separation logic for reasoning about probabilistic programs where separating conjunction captures probabilistic independence. Inspired by an analogy with mutable state where sampling corresponds to dynamic allocation, we show how probability spaces over a fixed, ambient sample space appear to be the natural analogue of heap fragments, and present a new combining operation on them such that probability spaces behave like heaps and measurability of random variables behaves like ownership. This combining operation forms the basis for our model of separation, and produces a logic with many pleasant properties. In particular, Lilac has a frame rule identical to the ordinary one, and naturally accommodates advanced features like continuous random variables and reasoning about quantitative properties of programs. Then we propose a new modality based on disintegration theory for reasoning about conditional probability. We show how the resulting modal logic validates examples from prior work, and give a formal verification of an intricate weighted sampling algorithm whose correctness depends crucially on conditional independence structure.
Probabilistic couplings are the foundation for many probabilistic relational program logics and arise when relating random sampling statements across two programs. In relational program logics, this manifests as dedicated coupling rules that, e.g., say we may reason as if two sampling statements return the same value. However, this approach fundamentally requires aligning or "synchronizing" the sampling statements of the two programs which is not always possible. In this paper, we develop Clutch, a higher-order probabilistic relational separation logic that addresses this issue by supporting asynchronous probabilistic couplings. We use Clutch to develop a logical step-indexed logical relational to reason about contextual refinement and equivalence of higher-order programs written in a rich language with higher-order local state and impredicative polymorphism. Finally, we demonstrate the usefulness of our approach on a number of case studies. All the results that appear in the paper have been formalized in the Coq proof assistant using the Coquelicot library and the Iris separation logic framework.
We explore the methodology and theory of reward-directed generation via conditional diffusion models. Directed generation aims to generate samples with desired properties as measured by a reward function, which has broad applications in generative AI, reinforcement learning, and computational biology. We consider the common learning scenario where the data set consists of unlabeled data along with a smaller set of data with noisy reward labels. Our approach leverages a learned reward function on the smaller data set as a pseudolabeler. From a theoretical standpoint, we show that this directed generator can effectively learn and sample from the reward-conditioned data distribution. Additionally, our model is capable of recovering the latent subspace representation of data. Moreover, we establish that the model generates a new population that moves closer to a user-specified target reward value, where the optimality gap aligns with the off-policy bandit regret in the feature subspace. The improvement in rewards obtained is influenced by the interplay between the strength of the reward signal, the distribution shift, and the cost of off-support extrapolation. We provide empirical results to validate our theory and highlight the relationship between the strength of extrapolation and the quality of generated samples.
Lane detection plays a pivotal role in the field of autonomous vehicles and advanced driving assistant systems (ADAS). Over the years, numerous algorithms have emerged, spanning from rudimentary image processing techniques to sophisticated deep neural networks. The performance of deep learning-based models is highly dependent on the quality of their training data. Consequently, these models often experience a decline in performance when confronted with challenging scenarios such as extreme lighting conditions, partially visible lane markings, and sparse lane markings like Botts' dots. To address this, we present an end-to-end lane detection and classification system based on deep learning methodologies. In our study, we introduce a unique dataset meticulously curated to encompass scenarios that pose significant challenges for state-of-the-art (SOTA) models. Through fine-tuning selected models, we aim to achieve enhanced localization accuracy. Moreover, we propose a CNN-based classification branch, seamlessly integrated with the detector, facilitating the identification of distinct lane types. This architecture enables informed lane-changing decisions and empowers more resilient ADAS capabilities. We also investigate the effect of using mixed precision training and testing on different models and batch sizes. Experimental evaluations conducted on the widely-used TuSimple dataset, Caltech lane dataset, and our LVLane dataset demonstrate the effectiveness of our model in accurately detecting and classifying lanes amidst challenging scenarios. Our method achieves state-of-the-art classification results on the TuSimple dataset. The code of the work will be published upon the acceptance of the paper.
We introduce a new method of estimation of parameters in semiparametric and nonparametric models. The method is based on estimating equations that are $U$-statistics in the observations. The $U$-statistics are based on higher order influence functions that extend ordinary linear influence functions of the parameter of interest, and represent higher derivatives of this parameter. For parameters for which the representation cannot be perfect the method leads to a bias-variance trade-off, and results in estimators that converge at a slower than $\sqrt n$-rate. In a number of examples the resulting rate can be shown to be optimal. We are particularly interested in estimating parameters in models with a nuisance parameter of high dimension or low regularity, where the parameter of interest cannot be estimated at $\sqrt n$-rate, but we also consider efficient $\sqrt n$-estimation using novel nonlinear estimators. The general approach is applied in detail to the example of estimating a mean response when the response is not always observed.
Parametricity is a property of the syntax of type theory implying e.g. that there is only one function having the type of the polymorphic identity function. Parametricity is usually proven externally, and does not hold internally. Internalising it is difficult because once there is a term witnessing parametricity, it also has to be parametric itself and this results in the appearance of higher dimensional cubes. In previous theories with internal parametricity, either an explicit syntax for higher cubes is present or the theory is extended with a new sort for the interval. In this paper we present a type theory with internal parametricity which is a simple extension of Martin-L\"of type thoery: there are a few new type formers, term formers and equations. Geometry is not explicit in this syntax, but emergent: the new operations and equations only refer to objects up to dimension 3. We show that this theory is modelled by presheaves over the BCH cube category. Fibrancy conditions are not needed because we use span-based rather than relational parametricity. We define a gluing model for this theory implying that external parametricity and canonicity hold. The theory can be seen as a special case of a new kind of modal type theory, and it is the simplest setting in which the computational properties of higher observational type theory can be demonstrated.
Standard recognition approaches are unable to deal with novel categories at test time. Their overconfidence on the known classes makes the predictions unreliable for safety-critical applications such as healthcare or autonomous driving. Out-Of-Distribution (OOD) detection methods provide a solution by identifying semantic novelty. Most of these methods leverage a learning stage on the known data, which means training (or fine-tuning) a model to capture the concept of normality. This process is clearly sensitive to the amount of available samples and might be computationally expensive for on-board systems. A viable alternative is that of evaluating similarities in the embedding space produced by large pre-trained models without any further learning effort. We focus exactly on such a fine-tuning-free OOD detection setting. This works presents an in-depth analysis of the recently introduced relational reasoning pre-training and investigates the properties of the learned embedding, highlighting the existence of a correlation between the inter-class feature distance and the OOD detection accuracy. As the class separation depends on the chosen pre-training objective, we propose an alternative loss function to control the inter-class margin, and we show its advantage with thorough experiments.
In this work, we propose a novel approach called Operational Support Estimator Networks (OSENs) for the support estimation task. Support Estimation (SE) is defined as finding the locations of non-zero elements in a sparse signal. By its very nature, the mapping between the measurement and sparse signal is a non-linear operation. Traditional support estimators rely on computationally expensive iterative signal recovery techniques to achieve such non-linearity. Contrary to the convolution layers, the proposed OSEN approach consists of operational layers that can learn such complex non-linearities without the need for deep networks. In this way, the performance of the non-iterative support estimation is greatly improved. Moreover, the operational layers comprise so-called generative \textit{super neurons} with non-local kernels. The kernel location for each neuron/feature map is optimized jointly for the SE task during the training. We evaluate the OSENs in three different applications: i. support estimation from Compressive Sensing (CS) measurements, ii. representation-based classification, and iii. learning-aided CS reconstruction where the output of OSENs is used as prior knowledge to the CS algorithm for an enhanced reconstruction. Experimental results show that the proposed approach achieves computational efficiency and outperforms competing methods, especially at low measurement rates by a significant margin. The software implementation is publicly shared at //github.com/meteahishali/OSEN.
The number of modes in a probability density function is representative of the model's complexity and can also be viewed as the number of existing subpopulations. Despite its relevance, little research has been devoted to its estimation. Focusing on the univariate setting, we propose a novel approach targeting prediction accuracy inspired by some overlooked aspects of the problem. We argue for the need for structure in the solutions, the subjective and uncertain nature of modes, and the convenience of a holistic view blending global and local density properties. Our method builds upon a combination of flexible kernel estimators and parsimonious compositional splines. Feature exploration, model selection and mode testing are implemented in the Bayesian inference paradigm, providing soft solutions and allowing to incorporate expert judgement in the process. The usefulness of our proposal is illustrated through a case study in sports analytics, showcasing multiple companion visualisation tools. A thorough simulation study demonstrates that traditional modality-driven approaches paradoxically struggle to provide accurate results. In this context, our method emerges as a top-tier alternative offering innovative solutions for analysts.
In Gaussian graphical models, the likelihood equations must typically be solved iteratively. We investigate two algorithms: A version of iterative proportional scaling which avoids inversion of large matrices, resulting in increased speed when graphs are sparse and we compare this to an algorithm based on convex duality and operating on the covariance matrix by neighbourhood coordinate descent, essentially corresponding to the graphical lasso with zero penalty. For large, sparse graphs, this version of the iterative proportional scaling algorithm appears feasible and has simple convergence properties. The algorithm based on neighbourhood coordinate descent is extremely fast and less dependent on sparsity, but needs a positive definite starting value to converge, which may be difficult to achieve when the number of variables exceeds the number of observations.
Spatio-temporal forecasting is challenging attributing to the high nonlinearity in temporal dynamics as well as complex location-characterized patterns in spatial domains, especially in fields like weather forecasting. Graph convolutions are usually used for modeling the spatial dependency in meteorology to handle the irregular distribution of sensors' spatial location. In this work, a novel graph-based convolution for imitating the meteorological flows is proposed to capture the local spatial patterns. Based on the assumption of smoothness of location-characterized patterns, we propose conditional local convolution whose shared kernel on nodes' local space is approximated by feedforward networks, with local representations of coordinate obtained by horizon maps into cylindrical-tangent space as its input. The established united standard of local coordinate system preserves the orientation on geography. We further propose the distance and orientation scaling terms to reduce the impacts of irregular spatial distribution. The convolution is embedded in a Recurrent Neural Network architecture to model the temporal dynamics, leading to the Conditional Local Convolution Recurrent Network (CLCRN). Our model is evaluated on real-world weather benchmark datasets, achieving state-of-the-art performance with obvious improvements. We conduct further analysis on local pattern visualization, model's framework choice, advantages of horizon maps and etc.