亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider the efficient estimation of total causal effects in the presence of unmeasured confounding using conditional instrumental sets. Specifically, we consider the two-stage least squares estimator in the setting of a linear structural equation model with correlated errors that is compatible with a known acyclic directed mixed graph. To set the stage for our results, we characterize the class of linearly valid conditional instrumental sets that yield consistent two-stage least squares estimators for the target total effect and derive a new asymptotic variance formula for these estimators. Equipped with these results, we provide three graphical tools for selecting more efficient linearly valid conditional instrumental sets. First, a graphical criterion that for certain pairs of linearly valid conditional instrumental sets identifies which of the two corresponding estimators has the smaller asymptotic variance. Second, an algorithm that greedily adds covariates that reduce the asymptotic variance to a given linearly valid conditional instrumental set. Third, a linearly valid conditional instrumental set for which the corresponding estimator has the smallest asymptotic variance that can be ensured with a graphical criterion.

相關內容

Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $d\mu_\phi \propto e^{-\phi} \mathrm{dvol}_g $ on a compact Riemannian manifold. Two estimators of linear functionals of $\mu_\phi $ based on the discretized Markov process are considered: a time-averaging estimator based on a single trajectory and an ensemble-averaging estimator based on multiple independent trajectories. Imposing no restrictions beyond a nominal level of smoothness on $\phi$, first-order error bounds, in discretization step size, on the bias and variances of both estimators are derived. The order of error matches the optimal rate in Euclidean and flat spaces, and leads to a first-order bound on distance between the invariant measure $\mu_\phi$ and a stationary measure of the discretized Markov process. Generality of the proof techniques, which exploit links between two partial differential equations and the semigroup of operators corresponding to the Langevin diffusion, renders them amenable for the study of a more general class of sampling algorithms related to the Langevin diffusion. Conditions for extending analysis to the case of non-compact manifolds are discussed. Numerical illustrations with distributions, log-concave and otherwise, on the manifolds of positive and negative curvature elucidate on the derived bounds and demonstrate practical utility of the sampling algorithm.

Bias correction can often improve the finite sample performance of estimators. We show that the choice of bias correction method has no effect on the higher-order variance of semiparametrically efficient parametric estimators, so long as the estimate of the bias is asymptotically linear. It is also shown that bootstrap, jackknife, and analytical bias estimates are asymptotically linear for estimators with higher-order expansions of a standard form. In particular, we find that for a variety of estimators the straightforward bootstrap bias correction gives the same higher-order variance as more complicated analytical or jackknife bias corrections. In contrast, bias corrections that do not estimate the bias at the parametric rate, such as the split-sample jackknife, result in larger higher-order variances in the i.i.d. setting we focus on. For both a cross-sectional MLE and a panel model with individual fixed effects, we show that the split-sample jackknife has a higher-order variance term that is twice as large as that of the `leave-one-out' jackknife.

Many interesting physical problems described by systems of hyperbolic conservation laws are stiff, and thus impose a very small time-step because of the restrictive CFL stability condition. In this case, one can exploit the superior stability properties of implicit time integration which allows to choose the time-step only from accuracy requirements, and thus avoid the use of small time-steps. We discuss an efficient framework to devise high order implicit schemes for stiff hyperbolic systems without tailoring it to a specific problem. The nonlinearity of high order schemes, due to space- and time-limiting procedures which control nonphysical oscillations, makes the implicit time integration difficult, e.g.~because the discrete system is nonlinear also on linear problems. This nonlinearity of the scheme is circumvented as proposed in (Puppo et al., Comm.~Appl.~Math.~\& Comput., 2023) for scalar conservation laws, where a first order implicit predictor is computed to freeze the nonlinear coefficients of the essentially non-oscillatory space reconstruction, and also to assist limiting in time. In addition, we propose a novel conservative flux-centered a-posteriori time-limiting procedure using numerical entropy indicators to detect troubled cells. The numerical tests involve classical and artificially devised stiff problems using the Euler's system of gas-dynamics.

The complexity of a well-quasi-order (wqo) can be measured through three classical ordinal invariants: the width as a measure of antichains, the height as a measure of chains, and the maximal order type as a measure of bad sequences. This article considers the "finitary powerset" construction: the collection Pf(X) of finite subsets of a wqo X ordered with the Hoare embedding relation remains a wqo. The width, height and maximal order type of Pf(X) cannot be expressed as a function of the invariants of X, and we provide tight upper and lower bounds for the three invariants. The article also identifies an algebra of well-behaved wqos, that include finitary powersets as well as other more classical constructions, and for which the ordinal invariants can be computed compositionnally. This relies on a new ordinal invariant called the approximated maximal order type.

We present new results on average causal effects in settings with unmeasured exposure-outcome confounding. Our results are motivated by a class of estimands, e.g., frequently of interest in medicine and public health, that are currently not targeted by standard approaches for average causal effects. We recognize these estimands as queries about the average causal effect of an intervening variable. We anchor our introduction of these estimands in an investigation of the role of chronic pain and opioid prescription patterns in the opioid epidemic, and illustrate how conventional approaches will lead unreplicable estimates with ambiguous policy implications. We argue that our altenative effects are replicable and have clear policy implications, and furthermore are non-parametrically identified by the classical frontdoor formula. As an independent contribution, we derive a new semiparametric efficient estimator of the frontdoor formula with a uniform sample boundedness guarantee. This property is unique among previously-described estimators in its class, and we demonstrate superior performance in finite-sample settings. Theoretical results are applied with data from the National Health and Nutrition Examination Survey.

Deep neural networks (DNNs) often fail silently with over-confident predictions on out-of-distribution (OOD) samples, posing risks in real-world deployments. Existing techniques predominantly emphasize either the feature representation space or the gradient norms computed with respect to DNN parameters, yet they overlook the intricate gradient distribution and the topology of classification regions. To address this gap, we introduce GRadient-aware Out-Of-Distribution detection in interpolated manifolds (GROOD), a novel framework that relies on the discriminative power of gradient space to distinguish between in-distribution (ID) and OOD samples. To build this space, GROOD relies on class prototypes together with a prototype that specifically captures OOD characteristics. Uniquely, our approach incorporates a targeted mix-up operation at an early intermediate layer of the DNN to refine the separation of gradient spaces between ID and OOD samples. We quantify OOD detection efficacy using the distance to the nearest neighbor gradients derived from the training set, yielding a robust OOD score. Experimental evaluations substantiate that the introduction of targeted input mix-upamplifies the separation between ID and OOD in the gradient space, yielding impressive results across diverse datasets. Notably, when benchmarked against ImageNet-1k, GROOD surpasses the established robustness of state-of-the-art baselines. Through this work, we establish the utility of leveraging gradient spaces and class prototypes for enhanced OOD detection for DNN in image classification.

Conventional computing paradigm struggles to fulfill the rapidly growing demands from emerging applications, especially those for machine intelligence, because much of the power and energy is consumed by constant data transfers between logic and memory modules. A new paradigm, called "computational random-access memory (CRAM)" has emerged to address this fundamental limitation. CRAM performs logic operations directly using the memory cells themselves, without having the data ever leave the memory. The energy and performance benefits of CRAM for both conventional and emerging applications have been well established by prior numerical studies. However, there lacks an experimental demonstration and study of CRAM to evaluate its computation accuracy, which is a realistic and application-critical metrics for its technological feasibility and competitiveness. In this work, a CRAM array based on magnetic tunnel junctions (MTJs) is experimentally demonstrated. First, basic memory operations as well as 2-, 3-, and 5-input logic operations are studied. Then, a 1-bit full adder with two different designs is demonstrated. Based on the experimental results, a suite of modeling has been developed to characterize the accuracy of CRAM computation. Further analysis of scalar addition, multiplication, and matrix multiplication shows promising results. These results are then applied to a complete application: a neural network based handwritten digit classifier, as an example to show the connection between the application performance and further MTJ development. The classifier achieved almost-perfect classification accuracy, with reasonable projections of future MTJ development. With the confirmation of MTJ-based CRAM's accuracy, there is a strong case that this technology will have a significant impact on power- and energy-demanding applications of machine intelligence.

May's Theorem [K. O. May, Econometrica 20 (1952) 680-684] characterizes majority voting on two alternatives as the unique preferential voting method satisfying several simple axioms. Here we show that by adding some desirable axioms to May's axioms, we can uniquely determine how to vote on three alternatives. In particular, we add two axioms stating that the voting method should mitigate spoiler effects and avoid the so-called strong no show paradox. We prove a theorem stating that any preferential voting method satisfying our enlarged set of axioms, which includes some weak homogeneity and preservation axioms, agrees with Minimax voting in all three-alternative elections, except perhaps in some improbable knife-edged elections in which ties may arise and be broken in different ways.

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.

Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests, and found almost three times as many bugs as users without it.

北京阿比特科技有限公司