In this work, we show that text-to-image generative models can be 'inverted' to assess their own text-image understanding capabilities in a completely automated manner. Our method, called SelfEval, uses the generative model to compute the likelihood of real images given text prompts, making the generative model directly applicable to discriminative tasks. Using SelfEval, we repurpose standard datasets created for evaluating multimodal text-image discriminative models to evaluate generative models in a fine-grained manner: assessing their performance on attribute binding, color recognition, counting, shape recognition, spatial understanding. To the best of our knowledge SelfEval is the first automated metric to show a high degree of agreement for measuring text-faithfulness with the gold-standard human evaluations across multiple models and benchmarks. Moreover, SelfEval enables us to evaluate generative models on challenging tasks such as Winoground image-score where they demonstrate competitive performance to discriminative models. We also show severe drawbacks of standard automated metrics such as CLIP-score to measure text faithfulness on benchmarks such as DrawBench, and how SelfEval sidesteps these issues. We hope SelfEval enables easy and reliable automated evaluation for diffusion models.
Evidence Networks can enable Bayesian model comparison when state-of-the-art methods (e.g. nested sampling) fail and even when likelihoods or priors are intractable or unknown. Bayesian model comparison, i.e. the computation of Bayes factors or evidence ratios, can be cast as an optimization problem. Though the Bayesian interpretation of optimal classification is well-known, here we change perspective and present classes of loss functions that result in fast, amortized neural estimators that directly estimate convenient functions of the Bayes factor. This mitigates numerical inaccuracies associated with estimating individual model probabilities. We introduce the leaky parity-odd power (l-POP) transform, leading to the novel ``l-POP-Exponential'' loss function. We explore neural density estimation for data probability in different models, showing it to be less accurate and scalable than Evidence Networks. Multiple real-world and synthetic examples illustrate that Evidence Networks are explicitly independent of dimensionality of the parameter space and scale mildly with the complexity of the posterior probability density function. This simple yet powerful approach has broad implications for model inference tasks. As an application of Evidence Networks to real-world data we compute the Bayes factor for two models with gravitational lensing data of the Dark Energy Survey. We briefly discuss applications of our methods to other, related problems of model comparison and evaluation in implicit inference settings.
In this work, we provide a simulation algorithm to simulate from a (multivariate) characteristic function, which is only accessible in a black-box format. We construct a generative neural network, whose loss function exploits a specific representation of the Maximum-Mean-Discrepancy metric to directly incorporate the targeted characteristic function. The construction is universal in the sense that it is independent of the dimension and that it does not require any assumptions on the given characteristic function. Furthermore, finite sample guarantees on the approximation quality in terms of the Maximum-Mean Discrepancy metric are derived. The method is illustrated in a short simulation study.
To handle the large scale of whole slide images in computational pathology, most approaches first tessellate the images into smaller patches, extract features from these patches, and finally aggregate the feature vectors with weakly-supervised learning. The performance of this workflow strongly depends on the quality of the extracted features. Recently, foundation models in computer vision showed that leveraging huge amounts of data through supervised or self-supervised learning improves feature quality and generalizability for a variety of tasks. In this study, we benchmark the most popular vision foundation models as feature extractors for histopathology data. We evaluate the models in two settings: slide-level classification and patch-level classification. We show that foundation models are a strong baseline. Our experiments demonstrate that by finetuning a foundation model on a single GPU for only two hours or three days depending on the dataset, we can match or outperform state-of-the-art feature extractors for computational pathology. These findings imply that even with little resources one can finetune a feature extractor tailored towards a specific downstream task and dataset. This is a considerable shift from the current state, where only few institutions with large amounts of resources and datasets are able to train a feature extractor. We publish all code used for training and evaluation as well as the finetuned models.
In this work we investigate a 1D evolution equation involving a divergence form operator where the diffusion coefficient inside the divergence is changing sign, as in models for metamaterials.We focus on the construction of a fundamental solution for the evolution equation,which does not proceed as in the case of standard parabolic PDE's, since the associatedsecond order operator is not elliptic. We show that a spectral representation of the semigroup associated to the equation can be derived, which leads to a first expression of the fundamental solution. We also derive a probabilistic representation in terms of a pseudo Skew Brownian Motion (SBM).This construction generalizes that derived from the killed SBM when the diffusion coefficientis piecewise constant but remains positive.We show that the pseudo SBM can be approached by a rescaled pseudo asymmetric random walk,which allows us to derive several numerical schemes for the resolution of the PDEand we report the associated numerical test results.
In this paper, we propose R\'enyi information generating function (RIGF) and discuss its various properties. The relation between the RIGF and Shannon entropy of order q > 0 is established. Several bounds are obtained. The RIGF of escort distribution is derived. Furthermore, we introduce R\'enyi divergence information generating function (RDIGF) and show its effect under monotone transformations. Finally, we propose Jensen-R\'enyi information generating function (JRIGF) and introduce its several properties.
In this work, we introduce Regularity Structures B-series which are used for describing solutions of singular stochastic partial differential equations (SPDEs). We define composition and substitutions of these B-series and as in the context of B-series for ordinary differential equations, these operations can be rewritten via products and Hopf algebras which have been used for building up renormalised models. These models provide a suitable topology for solving singular SPDEs. This new construction sheds a new light on these products and open interesting perspectives for the study of singular SPDEs in connection with B-series.
Within Bayesian nonparametrics, dependent Dirichlet process mixture models provide a highly flexible approach for conducting inference about the conditional density function. However, several formulations of this class make either rather restrictive modelling assumptions or involve intricate algorithms for posterior inference, thus preventing their widespread use. In response to these challenges, we present a flexible, versatile, and computationally tractable model for density regression based on a single-weights dependent Dirichlet process mixture of normal distributions model for univariate continuous responses. We assume an additive structure for the mean of each mixture component and incorporate the effects of continuous covariates through smooth nonlinear functions. The key components of our modelling approach are penalised B-splines and their bivariate tensor product extension. Our proposed method also seamlessly accommodates parametric effects of categorical covariates, linear effects of continuous covariates, interactions between categorical and/or continuous covariates, varying coefficient terms, and random effects, which is why we refer our model as a Dirichlet process mixture of normal structured additive regression models. A noteworthy feature of our method is its efficiency in posterior simulation through Gibbs sampling, as closed-form full conditional distributions for all model parameters are available. Results from a simulation study demonstrate that our approach successfully recovers true conditional densities and other regression functionals in various challenging scenarios. Applications to a toxicology, disease diagnosis, and agricultural study are provided and further underpin the broad applicability of our modelling framework. An R package, \texttt{DDPstar}, implementing the proposed method is publicly available at \url{//bitbucket.org/mxrodriguez/ddpstar}.
In a topology optimization setting, design-dependent fluidic pressure loads pose several challenges as their direction, magnitude, and location alter with topology evolution. This paper offers a compact 100-line MATLAB code, TOPress, for topology optimization of structures subjected to fluidic pressure loads using the method of moving asymptotes. The code is intended for pedagogical purposes and aims to ease the beginners' and students' learning toward topology optimization with design-dependent fluidic pressure loads. TOPress is developed per the approach first reported in Kumar et al. (Struct Multidisc Optim 61(4):1637-1655, 2020). The Darcy law, in conjunction with the drainage term, is used to model the applied pressure load. The consistent nodal loads are determined from the obtained pressure field. The employed approach facilitates inexpensive computation of the load sensitivities using the adjoint-variable method. Compliance minimization subject to volume constraint optimization problems are solved. The success and efficacy of the code are demonstrated by solving benchmark numerical examples involving pressure loads, wherein the importance of load sensitivities is also demonstrated. TOPress contains six main parts, is described in detail, and is extended to solve different problems. Steps to include a projection filter are provided to achieve loadbearing designs close to~0-1. The code is provided in Appendix~B and can also be downloaded along with its extensions from \url{//github.com/PrabhatIn/TOPress}.
This work, in a pioneering approach, attempts to build a biometric system that works purely based on the fluid mechanics governing exhaled breath. We test the hypothesis that the structure of turbulence in exhaled human breath can be exploited to build biometric algorithms. This work relies on the idea that the extrathoracic airway is unique for every individual, making the exhaled breath a biomarker. Methods including classical multi-dimensional hypothesis testing approach and machine learning models are employed in building user authentication algorithms, namely user confirmation and user identification. A user confirmation algorithm tries to verify whether a user is the person they claim to be. A user identification algorithm tries to identify a user's identity with no prior information available. A dataset of exhaled breath time series samples from 94 human subjects was used to evaluate the performance of these algorithms. The user confirmation algorithms performed exceedingly well for the given dataset with over $97\%$ true confirmation rate. The machine learning based algorithm achieved a good true confirmation rate, reiterating our understanding of why machine learning based algorithms typically outperform classical hypothesis test based algorithms. The user identification algorithm performs reasonably well with the provided dataset with over $50\%$ of the users identified as being within two possible suspects. We show surprisingly unique turbulent signatures in the exhaled breath that have not been discovered before. In addition to discussions on a novel biometric system, we make arguments to utilise this idea as a tool to gain insights into the morphometric variation of extrathoracic airway across individuals. Such tools are expected to have future potential in the area of personalised medicines.
Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets.