Given a Boolean formula $\Phi(X, Y, Z)$, the Max\#SAT problem asks to find a partial model over the set of variables X that maximizes the number of its models projected onto the set of variables Y. We investigate a strict generalization of Max\#SAT allowing dependencies for variables in X, effectively turning it into a synthesis problem. We show that this new problem, called DQMax\#SAT, subsumes both the DQBF and DSSAT problems. We provide a general resolution method, based on a reduction to Max\#SAT, together with two improvements for dealing with its inherent complexity. We further discuss a concrete application of DQMax\#SAT for the symbolic synthesis of adaptive attackers in the field of program security. Finally, we report preliminary results obtained on the resolution of benchmark problems using a prototype DQMax\#SAT solver implementation.
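A brute-force reference semantics for the underlying Max\#SAT problem (without the dependencies that DQMax\#SAT adds) can be sketched as follows; `phi` is an arbitrary Python predicate standing in for the formula $\Phi$, and all names are illustrative, not taken from the paper's solver:

```python
from itertools import product

def max_count_sat(phi, X, Y, Z):
    """Brute-force Max#SAT: find an assignment to the variables X that
    maximizes the number of Y-assignments for which some Z-assignment
    satisfies phi.  Exponential enumeration -- a specification, not a solver."""
    best_x, best_count = None, -1
    for xs in product([False, True], repeat=len(X)):
        a = dict(zip(X, xs))
        count = 0
        for ys in product([False, True], repeat=len(Y)):
            a.update(zip(Y, ys))
            # Z is projected out existentially.
            if any(phi({**a, **dict(zip(Z, zs))})
                   for zs in product([False, True], repeat=len(Z))):
                count += 1
        if count > best_count:
            best_x, best_count = dict(zip(X, xs)), count
    return best_x, best_count

# Phi = (x <-> (y1 or z)): choosing x = True admits all four Y-assignments.
best, cnt = max_count_sat(lambda a: a['x'] == (a['y1'] or a['z']),
                          ['x'], ['y1', 'y2'], ['z'])
```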
Given a graph $G=(V,E)$ and an integer $k$, the Cluster Editing problem asks whether we can transform $G$ into a union of vertex-disjoint cliques by at most $k$ modifications (edge deletions or insertions). In this paper, we study the following variant of Cluster Editing. We are given a graph $G=(V,E)$, a packing $\cal H$ of modification-disjoint induced $P_3$s (no pair of $P_3$s in $\cal H$ share an edge or non-edge) and an integer $\ell$. The task is to decide whether $G$ can be transformed into a union of vertex-disjoint cliques by at most $\ell+|{\cal H}|$ modifications (edge deletions or insertions). We show that this problem is NP-hard even when $\ell=0$ (in which case the problem asks to turn $G$ into a disjoint union of cliques by performing exactly one edge deletion or insertion per element of $\cal H$) and when each vertex is in at most 23 $P_3$s of the packing. This answers negatively a question of van Bevern, Froese, and Komusiewicz (CSR 2016, ToCS 2018), repeated by C. Komusiewicz at Shonan meeting no. 144 in March 2019. We then initiate the study of finding the largest integer $c$ such that the problem remains tractable when restricting to packings such that each vertex is in at most $c$ packed $P_3$s. Here packed $P_3$s are those belonging to the packing $\cal H$. Van Bevern et al. showed that the case $c = 1$ is fixed-parameter tractable with respect to $\ell$ and we show that the case $c = 2$ is solvable in $|V|^{2\ell + O(1)}$ time.
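The target property, being a disjoint union of cliques, is exactly $P_3$-freeness: a graph is a cluster graph iff no two adjacent edges $uv$, $vw$ have $uw$ missing. A minimal check of this characterization (illustrative code, not from the paper):

```python
from itertools import combinations

def is_cluster_graph(n, edges):
    """A graph is a disjoint union of cliques iff it contains no induced P3,
    i.e., no pair of adjacent edges uv, vw with uw a non-edge.  O(n^3) check."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for v in range(n):
        for u, w in combinations(adj[v], 2):
            if w not in adj[u]:      # u-v-w is an induced P3
                return False
    return True
```

For instance, a path on three vertices fails the check, while a triangle plus a disjoint edge passes.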
Generative diffusion models have achieved spectacular performance in many areas of generative modeling. While the fundamental ideas behind these models come from non-equilibrium physics, in this paper we show that many aspects of these models can be understood using the tools of equilibrium statistical mechanics. Using this reformulation, we show that generative diffusion models undergo second-order phase transitions corresponding to symmetry breaking phenomena. We argue that this leads to a form of instability that lies at the heart of their generative capabilities and that can be described by a set of mean field critical exponents. We conclude by analyzing recent work connecting diffusion models and associative memory networks in light of this thermodynamic formulation.
We address speech enhancement based on variational autoencoders, which involves learning a speech prior distribution in the time-frequency (TF) domain. A zero-mean complex-valued Gaussian distribution is usually assumed for the generative model, where the speech information is encoded in the variance as a function of a latent variable. In contrast to this commonly used approach, we propose a weighted variance generative model, where the contribution of each spectrogram time-frame in parameter learning is weighted. We impose a Gamma prior distribution on the weights, which effectively leads to a Student's t-distribution instead of a Gaussian for speech generative modeling. We develop efficient training and speech enhancement algorithms based on the proposed generative model. Our experimental results on spectrogram auto-encoding and speech enhancement demonstrate the effectiveness and robustness of the proposed approach compared to the standard unweighted variance model.
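The Gamma-weight construction can be checked numerically: integrating a Gamma-distributed precision weight out of a Gaussian yields a heavier-tailed Student's t marginal. A small NumPy sketch (parameter values are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
nu, sigma, n = 8.0, 1.0, 200_000

# Precision weights w ~ Gamma(nu/2, rate nu/2); given w, x ~ N(0, sigma^2 / w).
w = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)
x = rng.normal(0.0, sigma / np.sqrt(w))

# Marginally x/sigma follows a Student's t with nu degrees of freedom,
# whose variance is nu/(nu-2); the plain Gaussian would give variance 1.
print(x.var(), nu / (nu - 2))
```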
We present a streamlined and simplified exponential lower bound on the length of proofs in intuitionistic implicational logic, adapted to Gordeev and Haeusler's dag-like natural deduction.
A common approach to evaluating the significance of a collection of $p$-values combines them with a pooling function, in particular when the original data are not available. These pooled $p$-values convert a sample of $p$-values into a single number which behaves like a univariate $p$-value. To clarify discussion of these functions, a telescoping series of alternative hypotheses is introduced that communicates the strength and prevalence of non-null evidence in the $p$-values before general pooling formulae are discussed. A pattern noticed in the UMP pooled $p$-value for a particular alternative motivates the definition and discussion of central and marginal rejection levels at $\alpha$. It is proven that central rejection is always greater than or equal to marginal rejection, motivating a quotient to measure the balance between the two for pooled $p$-values. A combining function based on the $\chi^2_{\kappa}$ quantile transformation is proposed to control this quotient and shown to be robust to mis-specified parameters relative to the UMP. Different powers for different parameter settings motivate a map of plausible alternatives based on where this pooled $p$-value is minimized.
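A minimal sketch of quantile-transformation pooling, assuming each $p$-value is mapped through the $\chi^2_{\kappa}$ quantile function and the sum is referred to its null $\chi^2$ distribution; with $\kappa = 2$ this recovers Fisher's method. The function name and interface are illustrative, not the paper's:

```python
from scipy.stats import chi2

def chi2_pool(pvals, kappa=2.0):
    """Pool p-values by mapping each through the chi-squared(kappa) quantile
    function and summing.  Under the global null, each transformed value is
    chi-squared(kappa), so the sum is chi-squared with len(pvals)*kappa
    degrees of freedom.  kappa=2 gives Fisher's method, since
    chi2.ppf(1-p, 2) == -2*log(p)."""
    stat = sum(chi2.ppf(1 - p, kappa) for p in pvals)
    return chi2.sf(stat, kappa * len(pvals))
```

A single $p$-value passes through unchanged, and smaller input $p$-values pool to a smaller result.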
We consider the fast in-place computation of the Euclidean polynomial modular remainder $R(X) \equiv A(X) \mod B(X)$ with $A$ and $B$ of respective degrees $n$ and $m \le n$. If the multiplication of two polynomials of degree $k$ can be performed with $M(k)$ operations and $O(k)$ extra space, then standard algorithms for the remainder require $O(n/m\, M(m))$ arithmetic operations and, apart from the space of $A$ and $B$, at least $O(n-m)$ extra memory. This extra space is notably usually used to store the whole quotient $Q(X)$ such that $A = BQ + R$ with $\deg R < \deg B$. We avoid the storage of the whole of this quotient, and propose an algorithm still using $O(n/m\, M(m))$ arithmetic operations but only $O(m)$ extra space. When the divisor $B$ is sparse with a constant number of non-zero terms, the arithmetic complexity bound reduces to $O(n)$. When it is allowed to use the input space of $A$ or $B$ for intermediate computations, but putting $A$ and $B$ back to their initial states after the completion of the remainder computation, we further propose an in-place algorithm (that is, with its extra required space reduced to $O(1)$ only) using at most $O(n/m\, M(m) \log(m))$ arithmetic operations if $M(m)$ is quasi-linear and $O(n/m\, M(m))$ otherwise. We also propose variants that compute -- still in-place and with the same complexity bounds -- the over-place remainder $A(X) \equiv A(X) \mod B(X)$ and the accumulated remainder $R(X) \mathrel{+{\equiv}} A(X) \mod B(X)$. To achieve this, we develop techniques for Toeplitz matrix operations whose output is also part of the input. In-place accumulating versions are obtained for the latter and for polynomial remaindering. This is realized via further reductions to accumulated polynomial multiplication, for which in-place fast algorithms have recently been developed.
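The space-saving idea of discarding the quotient as it is produced can be illustrated with a schoolbook ($O(nm)$ operation) sketch; this is an illustration of the $O(m)$-extra-space principle only, not the paper's fast $O(n/m\, M(m))$ algorithm:

```python
def poly_rem(A, B):
    """Remainder of A(X) mod B(X) over a field, coefficients listed from
    low to high degree.  Only a length-m window of working coefficients is
    kept, so extra space is O(m) even though the quotient has n-m+1 terms:
    each quotient coefficient q is used once and immediately discarded."""
    n, m = len(A) - 1, len(B) - 1
    if n < m:
        return list(A)
    W = list(A[n - m + 1:])           # coefficients of degrees n-m+1 .. n
    for i in range(n, m - 1, -1):     # eliminate the degree-i coefficient
        q = W[-1] / B[m]              # next quotient coefficient (not stored)
        # subtract q * B(X) * X^(i-m) and slide the window down one degree
        W = [A[i - m] - q * B[0]] + [W[j] - q * B[j + 1] for j in range(m - 1)]
    return W                          # coefficients of R, degrees 0 .. m-1

# (X^3 + 1) mod (X^2 + 1) = 1 - X
print(poly_rem([1, 0, 0, 1], [1, 0, 1]))
```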
Brown and Walker (1997) showed that GMRES determines a least squares solution of $ A x = b $ where $ A \in {\bf R}^{n \times n} $ without breakdown for arbitrary $ b, x_0 \in {\bf R}^n $ if and only if $A$ is range-symmetric, i.e. $ {\cal R} (A^{\rm T}) = {\cal R} (A) $, where $ A $ may be singular and $ b $ may not be in the range space ${\cal R}(A)$ of $A$. In this paper, we propose applying GMRES to $ A C A^{\rm T} z = b $, where $ C \in {\bf R}^{n \times n} $ is symmetric positive definite. This determines a least squares solution $ x = CA^{\rm T} z $ of $ A x = b $ without breakdown for arbitrary (singular) matrix $A \in {\bf R}^{n \times n}$ and $ b \in {\bf R}^n $. To make the method numerically stable, we propose using the pseudoinverse with an appropriate threshold parameter to suppress the influence of tiny singular values when solving the severely ill-conditioned Hessenberg systems which arise in the Arnoldi process of GMRES when solving inconsistent range-symmetric systems. Numerical experiments show that the method taking $C$ to be the identity matrix gives the least squares solution even when $A$ is not range-symmetric, including the case when $ {\rm index}(A) >1$.
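The algebraic core of the proposal can be checked on a tiny example: solving $ACA^{\rm T} z = b$ and setting $x = CA^{\rm T} z$ yields a least squares solution of $Ax = b$ even for a singular, non-range-symmetric $A$ with $b \notin {\cal R}(A)$. Here a direct pseudoinverse stands in for the GMRES-plus-thresholded-pseudoinverse solve (illustration only, not the paper's implementation):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 0.0]])      # singular and not range-symmetric
b = np.array([1.0, 1.0])        # b is not in the range of A

C = np.eye(2)                   # the paper's choice C = I
# Solve A C A^T z = b (pinv stands in for GMRES with a thresholded
# pseudoinverse on the Hessenberg systems), then recover x.
z = np.linalg.pinv(A @ C @ A.T) @ b
x = C @ A.T @ z

# x satisfies the normal equations A^T (A x - b) = 0, so it is a least
# squares solution; it matches the SVD-based minimum-norm solution.
print(np.allclose(A.T @ (A @ x - b), 0))
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))
```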
We describe Bayes factor functions based on z, t, $\chi^2$, and F statistics and the prior distributions used to define alternative hypotheses. The non-local alternative prior distributions are centered on standardized effects, which index the Bayes factor function. The prior densities include a dispersion parameter that models the variation of effect sizes across replicated experiments. We examine the convergence rates of Bayes factor functions under true null and true alternative hypotheses. Several examples illustrate the application of the Bayes factor functions to replicated experimental designs and compare the conclusions from these analyses to other default Bayes factor methods.
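The idea of a Bayes factor indexed by a standardized effect can be illustrated in its simplest form with a point alternative for a z statistic; this is a deliberate simplification of the paper's non-local priors with a dispersion parameter (names and setup are illustrative):

```python
import math

def bf10_z(z, lam):
    """Bayes factor for an observed z statistic under H1: z ~ N(lam, 1)
    versus H0: z ~ N(0, 1).  The ratio of normal densities simplifies to
    exp(z*lam - lam^2/2).  Varying the standardized effect lam traces out
    a Bayes factor *function* -- a point-mass stand-in for the non-local
    priors of the paper."""
    return math.exp(z * lam - lam ** 2 / 2)

# At lam = 0 the hypotheses coincide, so the Bayes factor is 1.
print(bf10_z(0.0, 0.0), bf10_z(2.0, 2.0))
```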
Classical inequality curves and inequality measures are defined for distributions with finite mean value. Moreover, their empirical counterparts are not resistant to outliers. For these reasons, quantile versions of known inequality curves such as the Lorenz, Bonferroni, Zenga and $D$ curves, and quantile versions of inequality measures such as the Gini, Bonferroni, Zenga and $D$ indices have been proposed in the literature. We propose various nonparametric estimators of quantile versions of inequality curves and inequality measures, prove their consistency, and compare their accuracy in a simulation study. We also give examples of the use of quantile versions of inequality measures in real data analysis.
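For context, the classical (mean-based) empirical Gini index that the quantile versions aim to robustify can be computed from order statistics; this baseline is standard and is not one of the paper's proposed quantile estimators:

```python
def gini(sample):
    """Classical empirical Gini index via the order-statistic formula
    G = sum_i (2i - n - 1) * x_(i) / (n * sum(x)), i = 1..n.
    Requires a finite, positive sample mean and is sensitive to outliers --
    the two limitations motivating the quantile-based versions."""
    x = sorted(sample)
    n, s = len(x), sum(x)
    return sum((2 * (i + 1) - n - 1) * xi for i, xi in enumerate(x)) / (n * s)

# Perfect equality gives 0; maximal concentration on one of n units
# gives (n - 1)/n.
print(gini([1, 1, 1, 1]), gini([0, 0, 0, 1]))
```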
Let ${\mathcal P}$ be a family of probability measures on a measurable space $(S,{\mathcal A}).$ Given a Banach space $E,$ a functional $f:E\mapsto {\mathbb R}$ and a mapping $\theta: {\mathcal P}\mapsto E,$ our goal is to estimate $f(\theta(P))$ based on i.i.d. observations $X_1,\dots, X_n\sim P, P\in {\mathcal P}.$ In particular, if ${\mathcal P}=\{P_{\theta}: \theta\in \Theta\}$ is an identifiable statistical model with parameter set $\Theta\subset E,$ one can consider the mapping $\theta(P)=\theta$ for $P\in {\mathcal P}, P=P_{\theta},$ resulting in a problem of estimation of $f(\theta)$ based on i.i.d. observations $X_1,\dots, X_n\sim P_{\theta}, \theta\in \Theta.$ Given a smooth functional $f$ and estimators $\hat \theta_n(X_1,\dots, X_n), n\geq 1$ of $\theta(P),$ we use these estimators, sample splitting and a Taylor expansion of $f(\theta(P))$ of a suitable order to construct estimators $T_f(X_1,\dots, X_n)$ of $f(\theta(P)).$ For these estimators and for a functional $f$ of smoothness $s\geq 1,$ we prove upper bounds on the $L_p$-errors of the estimator $T_f(X_1,\dots, X_n)$ under certain moment assumptions on the base estimators $\hat \theta_n.$ We study the performance of estimators $T_f(X_1,\dots, X_n)$ in several concrete problems, showing their minimax optimality and asymptotic efficiency. In particular, this includes functional estimation in high-dimensional models with many low dimensional components, functional estimation in high-dimensional exponential families and estimation of functionals of covariance operators in infinite-dimensional subgaussian models.
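A toy numerical illustration of why sample splitting helps when estimating a smooth functional: for $f(\theta)=\theta^2$ with $\theta = {\mathbb E}X$, the plug-in estimator $\bar X^2$ carries a bias of ${\rm Var}(\bar X)$, while the product of two independent half-sample means is exactly unbiased. This is only a first flavor of the paper's higher-order Taylor construction (parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 40, 50_000

plug, split = [], []
for _ in range(reps):
    x = rng.normal(theta, 1.0, size=n)
    plug.append(np.mean(x) ** 2)                           # plug-in f(theta_hat)
    split.append(np.mean(x[:n // 2]) * np.mean(x[n // 2:]))  # split-based, unbiased

# Plug-in bias is Var(mean) = 1/n = 0.025 here; the split estimator has none.
print(np.mean(plug) - theta ** 2, np.mean(split) - theta ** 2)
```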