亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In settings where interference between units is possible, we define the prevalance of indirect effects to be the number of units who are affected by the treatment of others. This quantity does not fully identify an indirect effect, but may be used to show whether such effects are widely prevalent. Given a randomized experiment with binary-valued outcomes, methods are presented for conservative point estimation and one-sided interval estimation. No assumptions beyond randomization of treatment are required, allowing for usage in settings where models or assumptions on interference might be questionable. To show asymptotic coverage of our intervals in settings not covered by existing results, we provide a central limit theorem that combines local dependence and sampling without replacement. Consistency and minimax properties of the point estimator are shown as well. The approach is demonstrated on an experiment in which students were treated for a highly transmissible parasitic infection, for which we find that a significant fraction of students were affected by the treatment of schools other than their own.

相關內容

Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of $m$ new points, giving rise to $m$ conformal $p$-values. {While classical results only concern their marginal distribution, we show that their joint distribution follows a P\'olya urn model, and establish a concentration inequality for their empirical distribution function.} The results hold for arbitrary exchangeable scores, including {\it adaptive} ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification.

Threshold selection is a fundamental problem in any threshold-based extreme value analysis. While models are asymptotically motivated, selecting an appropriate threshold for finite samples can be difficult through standard methods. Inference can also be highly sensitive to the choice of threshold. Too low a threshold choice leads to bias in the fit of the extreme value model, while too high a choice leads to unnecessary additional uncertainty in the estimation of model parameters. In this paper, we develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in this threshold choice and propagate this uncertainty through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation. We apply our method to the well-known, troublesome example of the River Nidd dataset.

Difference-in-differences (DID) is a popular approach to identify the causal effects of treatments and policies in the presence of unmeasured confounding. DID identifies the sample average treatment effect in the treated (SATT). However, a goal of such research is often to inform decision-making in target populations outside the treated sample. Transportability methods have been developed to extend inferences from study samples to external target populations; these methods have primarily been developed and applied in settings where identification is based on conditional independence between the treatment and potential outcomes, such as in a randomized trial. This paper develops identification and estimators for effects in a target population, based on DID conducted in a study sample that differs from the target population. We present a range of assumptions under which one may identify causal effects in the target population and employ causal diagrams to illustrate these assumptions. In most realistic settings, results depend critically on the assumption that any unmeasured confounders are not effect measure modifiers on the scale of the effect of interest. We develop several estimators of transported effects, including a doubly robust estimator based on the efficient influence function. Simulation results support theoretical properties of the proposed estimators. We discuss the potential application of our approach to a study of the effects of a US federal smoke-free housing policy, where the original study was conducted in New York City alone and the goal is extend inferences to other US cities.

Generative diffusion models have achieved spectacular performance in many areas of generative modeling. While the fundamental ideas behind these models come from non-equilibrium physics, in this paper we show that many aspects of these models can be understood using the tools of equilibrium statistical mechanics. Using this reformulation, we show that generative diffusion models undergo second-order phase transitions corresponding to symmetry breaking phenomena. We argue that this lead to a form of instability that lies at the heart of their generative capabilities and that can be described by a set of mean field critical exponents. We conclude by analyzing recent work connecting diffusion models and associative memory networks in view of the thermodynamic formulations.

Bayesian optimal design of experiments is a well-established approach to planning experiments. Briefly, a probability distribution, known as a statistical model, for the responses is assumed which is dependent on a vector of unknown parameters. A utility function is then specified which gives the gain in information for estimating the true value of the parameters using the Bayesian posterior distribution. A Bayesian optimal design is given by maximising the expectation of the utility with respect to the joint distribution given by the statistical model and prior distribution for the true parameter values. The approach takes account of the experimental aim via specification of the utility and of all assumed sources of uncertainty via the expected utility. However, it is predicated on the specification of the statistical model. Recently, a new type of statistical inference, known as Gibbs (or General Bayesian) inference, has been advanced. This is Bayesian-like, in that uncertainty on unknown quantities is represented by a posterior distribution, but does not necessarily rely on specification of a statistical model. Thus the resulting inference should be less sensitive to misspecification of the statistical model. The purpose of this paper is to propose Gibbs optimal design: a framework for optimal design of experiments for Gibbs inference. The concept behind the framework is introduced along with a computational approach to find Gibbs optimal designs in practice. The framework is demonstrated on exemplars including linear models, and experiments with count and time-to-event responses.

Importance sampling is a popular technique in Bayesian inference: by reweighting samples drawn from a proposal distribution we are able to obtain samples and moment estimates from a Bayesian posterior over some $n$ latent variables. Recent work, however, indicates that importance sampling scales poorly -- in order to accurately approximate the true posterior, the required number of importance samples grows is exponential in the number of latent variables [Chatterjee and Diaconis, 2018]. Massively parallel importance sampling works around this issue by drawing $K$ samples for each of the $n$ latent variables and reasoning about all $K^n$ combinations of latent samples. In principle, we can reason efficiently over $K^n$ combinations of samples by exploiting conditional independencies in the generative model. However, in practice this requires complex algorithms that traverse backwards through the graphical model, and we need separate backward traversals for each computation (posterior expectations, marginals and samples). Our contribution is to exploit the source term trick from physics to entirely avoid the need to hand-write backward traversals. Instead, we demonstrate how to simply and easily compute all the required quantities -- posterior expectations, marginals and samples -- by differentiating through a slightly modified marginal likelihood estimator.

When multiple self-adaptive systems share the same environment and have common goals, they may coordinate their adaptations at runtime to avoid conflicts and to satisfy their goals. There are two approaches to coordination. (1) Logically centralized, where a supervisor has complete control over the individual self-adaptive systems. Such approach is infeasible when the systems have different owners or administrative domains. (2) Logically decentralized, where coordination is achieved through direct interactions. Because the individual systems have control over the information they share, decentralized coordination accommodates multiple administrative domains. However, existing techniques do not account simultaneously for both local concerns, e.g., preferences, and shared concerns, e.g., conflicts, which may lead to goals not being achieved as expected. Our idea to address this shortcoming is to express both types of concerns within the same constraint optimization problem. We propose CoADAPT, a decentralized coordination technique introducing two types of constraints: preference constraints, expressing local concerns, and consistency constraints, expressing shared concerns. At runtime, the problem is solved in a decentralized way using distributed constraint optimization algorithms implemented by each self-adaptive system. As a first step in realizing CoADAPT, we focus in this work on the coordination of adaptation planning strategies, traditionally addressed only with centralized techniques. We show the feasibility of CoADAPT in an exemplar from cloud computing and analyze experimentally its scalability.

The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that effect, Chazel and Divol (2019) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it -- called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.

Classical inequality curves and inequality measures are defined for distributions with finite mean value. Moreover, their empirical counterparts are not resistant to outliers. For these reasons, quantile versions of known inequality curves such as the Lorenz, Bonferroni, Zenga and $D$ curves, and quantile versions of inequality measures such as the Gini, Bonferroni, Zenga and $D$ indices have been proposed in the literature. We propose various nonparametric estimators of quantile versions of inequality curves and inequality measures, prove their consistency, and compare their accuracy in a~simulation study. We also give examples of the use of quantile versions of inequality measures in real data analysis.

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enable each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.

北京阿比特科技有限公司