The problem of binary hypothesis testing between two probability measures is considered. New sharp bounds are derived for the best achievable error probability of such tests based on independent and identically distributed observations. Specifically, the asymmetric version of the problem is examined, where different requirements are placed on the two error probabilities. Accurate nonasymptotic expansions with explicit constants are obtained for the error probability, using tools from large deviations and Gaussian approximation. Examples are shown indicating that, in the asymmetric regime, the approximations suggested by the new bounds are significantly more accurate than the approximations provided by either of the two main earlier approaches -- normal approximation and error exponents.
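For orientation, the two earlier approaches mentioned above admit a compact summary (standard results stated here for context, not the paper's new bounds; $\beta_n(\epsilon)$ denotes the smallest type-II error probability achievable from $n$ i.i.d. observations with type-I error at most $\epsilon$, $D(P\|Q)$ the relative entropy, $V(P\|Q)$ the relative entropy variance, and $\Phi$ the standard normal distribution function):
\[
\log \beta_n(\epsilon) = -n\,D(P\|Q) + o(n) \qquad \text{(error exponents; Stein's lemma)},
\]
\[
\log \beta_n(\epsilon) = -n\,D(P\|Q) + \sqrt{n\,V(P\|Q)}\,\Phi^{-1}(\epsilon) + O(\log n) \qquad \text{(normal approximation)}.
\]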
Error control by means of a posteriori error estimators or indicators and adaptive discretizations, such as adaptive mesh refinement, emerged in the late 1970s. Since then, numerous theoretical developments and improvements have been made, as well as the first attempts to introduce these techniques into real-life industrial applications. The present introductory chapter provides an overview of the subject, highlights some of the achievements to date, and discusses possible perspectives.
Many well-known logical identities are naturally written as equivalences between contextual formulas. A simple example is the Boole-Shannon expansion $c[p] \equiv (p \wedge c[\mathrm{true}] ) \vee (\neg\, p \wedge c[\mathrm{false}] )$, where $c$ denotes an arbitrary formula with possibly multiple occurrences of a "hole", called a context, and $c[\varphi]$ denotes the result of "filling" all holes of $c$ with the formula $\varphi$. Another example is the unfolding rule $\mu X.\, c[X] \equiv c[\mu X.\, c[X]]$ of the modal $\mu$-calculus. We consider the modal $\mu$-calculus as an overarching temporal logic and, as usual, reduce the problem of whether $\varphi_1 \equiv \varphi_2$ holds for contextual formulas $\varphi_1, \varphi_2$ to the problem of whether $\varphi_1 \leftrightarrow \varphi_2$ is valid. We show that the problem of whether a contextual formula of the $\mu$-calculus is valid for all contexts can be reduced to validity of ordinary formulas. Our first result constructs a canonical context such that a formula is valid for all contexts iff it is valid for this particular one. However, the resulting ordinary formula is exponential in the nesting depth of the context variables. Our second result removes this exponential blow-up, proving that validity of contextual formulas is EXP-complete, as it is for ordinary equivalences. We also prove that both results hold for CTL and LTL as well. We conclude the paper with some experimental results. In particular, we use our implementation to automatically prove the correctness of a set of six contextual equivalences of LTL recently introduced by Esparza et al. for the normalization of LTL formulas. While Esparza et al. need several pages of manual proof, our tool needs only milliseconds to do the job and to compute counterexamples for incorrect variants of the equivalences.
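As a small worked instance of the expansion (our own illustration), take the context $c = (\ast \to q) \wedge \ast$, writing $\ast$ for the hole. Then $c[\mathrm{true}] \equiv (\mathrm{true} \to q) \wedge \mathrm{true} \equiv q$ and $c[\mathrm{false}] \equiv (\mathrm{false} \to q) \wedge \mathrm{false} \equiv \mathrm{false}$, so the Boole-Shannon expansion yields $c[p] \equiv (p \wedge q) \vee (\neg\, p \wedge \mathrm{false}) \equiv p \wedge q$, which indeed agrees with $(p \to q) \wedge p$.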
We consider the following generalization of the bin packing problem. We are given a set of items, each associated with a rational size in the interval [0,1], and a monotone non-decreasing, non-negative cost function f defined over the cardinalities of the subsets of items. A feasible solution is a partition of the set of items into bins subject to the constraint that the total size of the items in every bin is at most 1. Unlike in bin packing, the objective is to minimize the total cost of the bins, where the cost of a bin is the value of f applied to the cardinality of the subset of items packed into it. We present an APTAS for this strongly NP-hard problem. We also provide a complete complexity classification of the problem with respect to the choice of f.
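To make the objective concrete, here is a minimal sketch that evaluates the cost of a candidate packing; the particular cost function f(k) = sqrt(k) is an illustrative choice, not one from the paper.

```python
import math

def packing_cost(bins, f, eps=1e-9):
    """Total cost of a candidate packing.  Each bin is a list of item
    sizes in [0, 1]; f is a monotone non-decreasing, non-negative cost
    function of the bin's cardinality."""
    for b in bins:
        if sum(b) > 1 + eps:
            raise ValueError("infeasible bin: total size exceeds 1")
    return sum(f(len(b)) for b in bins)

# Illustrative concave cost: packing many items into one bin is cheaper
# per item than opening new bins.
bins = [[0.6, 0.3], [0.5, 0.2, 0.2]]
print(packing_cost(bins, lambda k: math.sqrt(k)))  # sqrt(2) + sqrt(3)
```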
A new approach based on censoring and a moment criterion is introduced for parameter estimation of count distributions whose probability generating function is available, even when the probability mass function has no closed form and/or finite moments do not exist.
Mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased. Misclassification is especially difficult to deal with when it is differential and when there are no gold standard labels available. Previous work has addressed this problem using a sensitivity analysis framework or by assuming that misclassification rates are known. We leverage a variable related to the misclassification mechanism to recover unbiased parameter estimates without using gold standard labels. The proposed methods require the reasonable assumption that the sum of the sensitivity and specificity is greater than 1. Three correction methods are presented: (1) an ordinary least squares correction for Normal outcome models, (2) a multi-step predictive value weighting method, and (3) a seamless expectation-maximization algorithm. We apply our misclassification correction strategies to investigate the mediating role of gestational hypertension on the association between maternal age and pre-term birth.
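For intuition about the role of sensitivity and specificity, the sketch below shows the classical known-rates baseline correction (the Rogan-Gladen estimator) for a misclassified binary variable; this illustrates the assumption that previous work relies on and that the methods above are designed to remove. The function name and values are our own illustration, not the paper's code.

```python
def corrected_prevalence(p_obs, sensitivity, specificity):
    """Rogan-Gladen correction: recover P(M = 1) from the observed
    proportion of a misclassified surrogate M*, given known
    sensitivity Se = P(M* = 1 | M = 1) and specificity
    Sp = P(M* = 0 | M = 0).  Requires Se + Sp > 1, the same
    identifiability condition cited in the abstract."""
    se, sp = sensitivity, specificity
    if se + sp <= 1:
        raise ValueError("need sensitivity + specificity > 1")
    return (p_obs + sp - 1) / (se + sp - 1)

# Example: 40% observed positives with Se = 0.9, Sp = 0.85
print(corrected_prevalence(0.40, 0.9, 0.85))  # ~0.333
```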
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, we propose in this paper a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume that the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
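To illustrate the isotonic ingredient only (not the full shape-constrained DRL procedure), a minimal sketch: a noisy plug-in estimate of a density ratio over a totally ordered scalar covariate is projected onto nondecreasing functions via scikit-learn's isotonic regression. The monotone ratio $w(x) = e^x$ is an assumed toy example.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
w_true = np.exp(x)                          # an assumed monotone density ratio
w_noisy = w_true + rng.normal(0, 0.3, x.size)  # a noisy plug-in estimate

iso = IsotonicRegression(increasing=True, y_min=0.0)  # ratios are nonnegative
w_iso = iso.fit_transform(x, w_noisy)       # project onto monotone functions

print(np.mean((w_noisy - w_true) ** 2))     # error of the raw estimate
print(np.mean((w_iso - w_true) ** 2))       # typically smaller after projection
```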
A number of recent studies have proposed that linear representations are appropriate for solving nonlinear dynamical systems on quantum computers, which fundamentally act linearly on a wave function in a Hilbert space. Linear representations, such as the Koopman representation and Koopman-von Neumann mechanics, have regained attention from the dynamical-systems research community. Here, we aim to present a unified theoretical framework, currently missing in the literature, with which one can compare and relate existing methods, their conceptual basis, and their representations. We also aim to show that, although quantum simulation of nonlinear classical systems may be possible with such linear representations, the necessary projection into a feasible finite-dimensional space will in practice eventually induce numerical artifacts that can be hard to eliminate or even control. As a result, a practical, reliable, and accurate way to use quantum computation for solving general nonlinear dynamical systems is still an open problem.
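A minimal sketch of the Koopman-lifting idea (our own illustration, not a method from the paper): the classic polynomial system below admits an exactly finite-dimensional Koopman-invariant subspace, so the lifted linear dynamics reproduce the nonlinear flow with no truncation error; for generic nonlinear systems no such finite subspace exists, which is precisely the source of the projection artifacts discussed above.

```python
import numpy as np
from scipy.linalg import expm

# Nonlinear system: dx1/dt = mu*x1,  dx2/dt = lam*(x2 - x1**2).
# In the lifted observables y = (x1, x2, x1**2) the dynamics are linear.
mu, lam = -0.1, -1.0
K = np.array([[mu,  0.0,  0.0],
              [0.0, lam, -lam],
              [0.0, 0.0,  2 * mu]])   # generator of the lifted linear dynamics

x0 = np.array([1.0, 0.5])
y0 = np.array([x0[0], x0[1], x0[0] ** 2])    # lift the initial condition

t = 5.0
y_t = expm(K * t) @ y0                       # evolve linearly in lifted space
x1_exact = x0[0] * np.exp(mu * t)            # exact nonlinear solution for x1
print(y_t[0], x1_exact)                      # agree up to round-off
print(y_t[2], x1_exact ** 2)                 # the lifted coordinate x1**2 too
```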
Discrete choice models with non-monotonic response functions are important in many areas of application, especially political science and marketing. This paper describes a novel unfolding model for binary data that allows for heavy-tailed shocks to the underlying utilities. One of our key contributions is a Markov chain Monte Carlo algorithm that requires little or no parameter tuning, fully explores the support of the posterior distribution, and can be used to fit various extensions of our core model that involve (Bayesian) hypothesis testing on the latent construct. Our empirical evaluations of the model and the associated algorithm suggest that they provide a better complexity-adjusted fit to voting data from the United States House of Representatives.
We investigate the set of invariant idempotent probabilities for countable idempotent iterated function systems (IFSs) defined on compact metric spaces. We demonstrate that, with constant weights, there exists a unique invariant idempotent probability. Utilizing Secelean's approach to countable IFSs, we introduce partially finite idempotent IFSs and prove that the sequence of invariant idempotent measures of these systems converges to the invariant measure of the original countable IFS. We then apply these results to approximate such measures with discrete systems, producing, in the one-dimensional case, data series whose Higuchi fractal dimension can be calculated. Finally, we provide numerical approximations for two-dimensional cases and discuss the application of generalized Higuchi dimensions in these scenarios.
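For reference, a minimal sketch of the standard Higuchi fractal dimension algorithm for a one-dimensional data series (our own illustrative implementation, not the paper's code): the curve length $L(k)$ is computed at subsampling scales $k$, and the dimension is the slope of $\log L(k)$ against $\log(1/k)$.

```python
import numpy as np

def higuchi_fd(x, k_max=10):
    """Higuchi fractal dimension of a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_k, log_L = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)               # indices m, m+k, m+2k, ...
            if len(idx) < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()   # curve length at scale k
            norm = (n - 1) / ((len(idx) - 1) * k)  # Higuchi's normalization
            lengths.append(diff * norm / k)
        log_k.append(np.log(1.0 / k))
        log_L.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(log_k, log_L, 1)         # D = slope of log L vs log 1/k
    return slope

# Sanity checks: white noise gives D near 2, a smooth curve D near 1.
rng = np.random.default_rng(0)
print(higuchi_fd(rng.standard_normal(2000)))          # ~2
print(higuchi_fd(np.sin(np.linspace(0, 10, 2000))))   # ~1
```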
We propose an extremely versatile approach to address a large family of matrix nearness problems, possibly with additional linear constraints. Our method is based on splitting a matrix nearness problem into two nested optimization problems, of which the inner one can be solved either exactly or cheaply, while the outer one can be recast as an unconstrained optimization task over a smooth real Riemannian manifold. We observe that this paradigm applies to many matrix nearness problems of practical interest appearing in the literature, thus revealing that they are equivalent in this sense to a Riemannian optimization problem. We also show that the objective function to be minimized on the Riemannian manifold can be discontinuous, thus requiring regularization techniques, and we give conditions for this to happen. Finally, we demonstrate the practical applicability of our method by implementing it for a number of matrix nearness problems that are relevant for applications and are currently considered very demanding in practice. Extensive numerical experiments show that our method often greatly outperforms its predecessors, including algorithms specifically designed for those particular problems.
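A didactic instance of the two-level paradigm (our own illustration, not the paper's implementation): to find the nearest matrix to $A$ in Frobenius norm with a prescribed eigenvalue $\lambda$, the inner problem is solved exactly (for a fixed unit vector $v$, the smallest perturbation with $(A+\Delta)v = \lambda v$ is the rank-one matrix $\Delta = (\lambda v - Av)v^\top$, of norm $\|(A-\lambda I)v\|$), while the outer problem minimizes that norm over the unit sphere, a smooth Riemannian manifold.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
lam = 0.0                                    # prescribed eigenvalue
M = (A - lam * np.eye(n)).T @ (A - lam * np.eye(n))

# Outer problem: minimize f(v) = v^T M v over the unit sphere by
# Riemannian gradient descent (project the gradient, step, renormalize).
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
step = 0.25 / np.linalg.norm(M, 2)           # conservative step size
for _ in range(2000):
    g = 2 * M @ v                            # Euclidean gradient
    rg = g - (v @ g) * v                     # project onto tangent space
    v -= step * rg
    v /= np.linalg.norm(v)                   # retraction back to the sphere

# Inner problem, solved exactly at the outer optimum:
Delta = np.outer(lam * v - A @ v, v)
print(np.linalg.norm(Delta))                                     # nested scheme
print(np.linalg.svd(A - lam * np.eye(n), compute_uv=False)[-1])  # ground truth
```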