Bayesian inference is often utilized for uncertainty quantification tasks. A recent analysis by Xu and Raginsky 2022 rigorously decomposed the predictive uncertainty in Bayesian inference into two uncertainties, called aleatoric and epistemic uncertainties, which represent the inherent randomness in the data-generating process and the variability due to insufficient data, respectively. They analyzed those uncertainties in an information-theoretic way, assuming that the model is well-specified and treating the model's parameters as latent variables. However, the existing information-theoretic analysis of uncertainty cannot explain the widely believed property of uncertainty, known as the sensitivity between the test and training data. It implies that when test data are similar to training data in some sense, the epistemic uncertainty should become small. In this work, we study such uncertainty sensitivity using our novel decomposition method for the predictive uncertainty. Our analysis successfully defines such sensitivity using information-theoretic quantities. Furthermore, we extend the existing analysis of Bayesian meta-learning and show the novel sensitivities among tasks for the first time.
Content Security Policy (CSP) is an effective security mechanism that prevents the exploitation of Cross-Site Scripting (XSS) vulnerabilities on websites by specifying the sources from which their web pages can load resources, such as scripts and styles. CSP nonces enable websites to allow the execution of specific inline scripts and styles without relying on a whitelist. In this study, we measure and analyze the use of CSP nonces in the wild, specifically looking for nonce reuse, short nonces, and invalid nonces. We find that, of the 2271 sites that deploy a nonce-based policy, 598 of them reuse the same nonce value in more than one response, potentially enabling attackers to bypass protection offered by the CSP against XSS attacks. We analyze the causes of the nonce reuses to identify whether they are introduced by the server-side code or if the nonces are being cached by web caches. Moreover, we investigate whether nonces are only reused within the same session or for different sessions, as this impacts the effectiveness of CSP in preventing XSS attacks. Finally, we discuss the possibilities for attackers to bypass the CSP and achieve XSS in different nonce reuse scenarios.
The concept of updating a probability distribution in the light of new evidence lies at the heart of statistics and machine learning. Pearl's and Jeffrey's rule are two natural update mechanisms which lead to different outcomes, yet the similarities and differences remain mysterious. This paper clarifies their relationship in several ways: via separate descriptions of the two update mechanisms in terms of probabilistic programs and sampling semantics, and via different notions of likelihood (for Pearl and for Jeffrey). Moreover, it is shown that Jeffrey's update rule arises via variational inference. In terms of categorical probability theory, this amounts to an analysis of the situation in terms of the behaviour of the multiset functor, extended to the Kleisli category of the distribution monad.
In this paper, we develop a domain-decomposition method for the generalized Poisson-Boltzmann equation based on a solvent-excluded surface which is widely used in computational chemistry. The solver requires to solve a generalized screened Poisson (GSP) equation defined in $\mathbb{R}^3$ with a space-dependent dielectric permittivity and an ion-exclusion function that accounts for Steric effects. Potential theory arguments transform the GSP equation into two-coupled equations defined in a bounded domain. Then, the Schwarz decomposition method is used to formulate local problems by decomposing the cavity into overlapping balls and only solving a set of coupled sub-equations in each ball in which, the spherical harmonics and the Legendre polynomials are used as basis functions in the angular and radial directions. A series of numerical experiments are presented to test the method.
Perfect paradefinite algebras are De Morgan algebras expanded with a perfection (or classicality) operation. They form a variety that is term-equivalent to the variety of involutive Stone algebras. Their associated multiple-conclusion (Set-Set) and single-conclusion (Set-Fmla) order-preserving logics are non-algebraizable self-extensional logics of formal inconsistency and undeterminedness determined by a six-valued matrix, studied in depth by Gomes et al. (2022) from both the algebraic and the proof-theoretical perspectives. We continue hereby that study by investigating directions for conservatively expanding these logics with an implication connective (essentially, one that admits the deduction-detachment theorem). We first consider logics given by very simple and manageable non-deterministic semantics whose implication (in isolation) is classical. These, nevertheless, fail to be self-extensional. We then consider the implication realized by the relative pseudo-complement over the six-valued perfect paradefinite algebra. Our strategy is to expand such algebra with this connective and study the (self-extensional) Set-Set and Set-Fmla order-preserving logics, as well as the T-assertional logics of the variety induced by the new algebra. We provide axiomatizations for such new variety and for such logics, drawing parallels with the class of symmetric Heyting algebras and with Moisil's `symmetric modal logic'. For the Set-Set logic, in particular, the axiomatization we obtain is analytic. We close by studying interpolation properties for these logics and concluding that the new variety has the Maehara amalgamation property.
Identifiability of discrete statistical models with latent variables is known to be challenging to study, yet crucial to a model's interpretability and reliability. This work presents a general algebraic technique to investigate identifiability of complicated discrete models with latent and graphical components. Specifically, motivated by diagnostic tests collecting multivariate categorical data, we focus on discrete models with multiple binary latent variables. In the considered model, the latent variables can have arbitrary dependencies among themselves while the latent-to-observed measurement graph takes a "star-forest" shape. We establish necessary and sufficient graphical criteria for identifiability, and reveal an interesting and perhaps surprising phenomenon of blessing-of-dependence geometry: under the minimal conditions for generic identifiability, the parameters are identifiable if and only if the latent variables are not statistically independent. Thanks to this theory, we can perform formal hypothesis tests of identifiability in the boundary case by testing certain marginal independence of the observed variables. Our results give new understanding of statistical properties of graphical models with latent variables. They also entail useful implications for designing diagnostic tests or surveys that measure binary latent traits.
Affective polarization is more than mere antagonism as it appears when negative interactions happen mostly across political divisions. Research in polarization usually assumes a given definition of political divisions or conflates polarization and disagreement as the same phenomenon. Leveraging on novel data sources of positive and negative online interactions, we present a method to computationally discover the fault lines of an online community with minimal assumptions on the dividing issues. This enables us to unpack two factors of polarization: Antagonism, which is the general prevalence of hostility in online interaction, and Alignment, which captures how negative relations exist across groups (divisiveness) while positive interactions are contained within (cohesiveness). We apply our approach to Birdwatch, a US-based Twitter fact-checking community, and to the discussion forums of DerStandard, an Austrian online newspaper. Our results reveal that both communities are divided into two large groups and that their separation follows political identities and topics. We can identify issues across various combinations of antagonism and alignment in DerStandard, evidencing that these two metrics are not equivalent. Our methods provide a time-resolved picture that illustrates the separate contribution of cohesiveness and divisiveness and the role of controversial elections and events in the dynamics of alignment.
This paper investigates the properties of Quasi Maximum Likelihood estimation of an approximate factor model for an $n$-dimensional vector of stationary time series. We prove that the factor loadings estimated by Quasi Maximum Likelihood are asymptotically equivalent, as $n\to\infty$, to those estimated via Principal Components. Both estimators are, in turn, also asymptotically equivalent, as $n\to\infty$, to the unfeasible Ordinary Least Squares estimator we would have if the factors were observed. We also show that the usual sandwich form of the asymptotic covariance matrix of the Quasi Maximum Likelihood estimator is asymptotically equivalent to the simpler asymptotic covariance matrix of the unfeasible Ordinary Least Squares. These results hold in the general case in which the idiosyncratic components are cross-sectionally heteroskedastic, as well as serially and cross-sectionally weakly correlated. This paper provides a simple solution to computing the Quasi Maximum Likelihood estimator and its asymptotic confidence intervals without the need of running any iterated algorithm, whose convergence properties are unclear, and estimating the Hessian and Fisher information matrices, whose expressions are very complex.
The efficacy of numerical methods like integral estimates via Gaussian quadrature formulas depends on the localization of the zeros of the associated family of orthogonal polynomials. In this regard, following the renewed interest in quadrature formulas on the unit circle, and $R_{II}$-type polynomials, which include the complementary Romanovski-Routh polynomials, in this work we present a collection of properties of their zeros. Our results include extreme bounds, convexity, and density, alongside the connection of such polynomials to classical orthogonal polynomials via asymptotic formulas.
Online machine learning (ML) is often used in self-adaptive systems to strengthen the adaptation mechanism and improve the system utility. Despite such benefits, applying online ML for self-adaptation can be challenging, and not many papers report its limitations. Recently, we experimented with applying online ML for self-adaptation of a smart farming scenario and we had faced several unexpected difficulties -- traps -- that, to our knowledge, are not discussed enough in the community. In this paper, we report our experience with these traps. Specifically, we discuss several traps that relate to the specification and online training of the ML-based estimators, their impact on self-adaptation, and the approach used to evaluate the estimators. Our overview of these traps provides a list of lessons learned, which can serve as guidance for other researchers and practitioners when applying online ML for self-adaptation.
Large language models (LLMs) like ChatGPT have revealed amazing intelligence. How to evaluate the question-solving abilities of LLMs and their degrees of intelligence is a hot-spot but challenging issue. First, the question-solving abilities are interlaced with different ability branches like understanding and massive knowledge categories like mathematics. Second, the inputs of questions are multimodal that may involve text and images. Third, the response format of LLMs is diverse and thus poses great challenges for result extraction and evaluation. In this paper, we propose AGIBench -- a multi-granularity, multimodal, human-referenced, and auto-scoring benchmarking methodology for LLMs. Instead of a collection of blended questions, AGIBench focuses on three typical ability branches and adopts a four-tuple <ability branch, knowledge, difficulty, modal> to label the attributes of each question. First, it supports multi-granularity benchmarking, e.g., per-question, per-ability branch, per-knowledge, per-modal, per-dataset, and per-difficulty level granularities. Second, it contains multimodal input, including text and images. Third, it classifies all the questions into five degrees of difficulty according to the average accuracy rate of abundant educated humans (human-referenced). Fourth, it adopts zero-shot learning to avoid introducing additional unpredictability and provides an auto-scoring method to extract and judge the result. Finally, it defines multi-dimensional metrics, including accuracy under the average, worst, best, and majority voting cases, and repeatability. AGIBench is publically available from \url{//www.benchcouncil.org/agibench}.