亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We propose a generalization of the standard matched pairs design in which experimental units (often geographic regions or geos) may be combined into larger units/regions called "supergeos" in order to improve the average matching quality. Unlike optimal matched pairs design which can be found in polynomial time (Lu et al. 2011), this generalized matching problem is NP-hard. We formulate it as a mixed-integer program (MIP) and show that experimental design obtained by solving this MIP can often provide a significant improvement over the standard design regardless of whether the treatment effects are homogeneous or heterogeneous. Furthermore, we present the conditions under which trimming techniques that often improve performance in the case of homogeneous effects (Chen and Au, 2022), may lead to biased estimates and show that the proposed design does not introduce such bias. We use empirical studies based on real-world advertising data to illustrate these findings.

相關內容

設計是對(dui)現有狀的(de)一種重(zhong)新(xin)認識和打破(po)重(zhong)組(zu)的(de)過(guo)程,設計讓一切變得(de)更美。

Storage-efficient privacy-preserving learning is crucial due to increasing amounts of sensitive user data required for modern learning tasks. We propose a framework for reducing the storage cost of user data while at the same time providing privacy guarantees, without essential loss in the utility of the data for learning. Our method comprises noise injection followed by lossy compression. We show that, when appropriately matching the lossy compression to the distribution of the added noise, the compressed examples converge, in distribution, to that of the noise-free training data as the sample size of the training data (or the dimension of the training data) increases. In this sense, the utility of the data for learning is essentially maintained, while reducing storage and privacy leakage by quantifiable amounts. We present experimental results on the CelebA dataset for gender classification and find that our suggested pipeline delivers in practice on the promise of the theory: the individuals in the images are unrecognizable (or less recognizable, depending on the noise level), overall storage of the data is substantially reduced, with no essential loss (and in some cases a slight boost) to the classification accuracy. As an added bonus, our experiments suggest that our method yields a substantial boost to robustness in the face of adversarial test data.

The concepts of sparsity, and regularised estimation, have proven useful in many high-dimensional statistical applications. Dynamic factor models (DFMs) provide a parsimonious approach to modelling high-dimensional time series, however, it is often hard to interpret the meaning of the latent factors. This paper formally introduces a class of sparse DFMs whereby the loading matrices are constrained to have few non-zero entries, thus increasing interpretability of factors. We present a regularised M-estimator for the model parameters, and construct an efficient expectation maximisation algorithm to enable estimation. Synthetic experiments demonstrate consistency in terms of estimating the loading structure, and superior predictive performance where a low-rank factor structure may be appropriate. The utility of the method is further illustrated in an application forecasting electricity consumption across a large set of smart meters.

Semantically coherent out-of-distribution (SCOOD) detection aims to discern outliers from the intended data distribution with access to unlabeled extra set. The coexistence of in-distribution and out-of-distribution samples will exacerbate the model overfitting when no distinction is made. To address this problem, we propose a novel uncertainty-aware optimal transport scheme. Our scheme consists of an energy-based transport (ET) mechanism that estimates the fluctuating cost of uncertainty to promote the assignment of semantic-agnostic representation, and an inter-cluster extension strategy that enhances the discrimination of semantic property among different clusters by widening the corresponding margin distance. Furthermore, a T-energy score is presented to mitigate the magnitude gap between the parallel transport and classifier branches. Extensive experiments on two standard SCOOD benchmarks demonstrate the above-par OOD detection performance, outperforming the state-of-the-art methods by a margin of 27.69% and 34.4% on FPR@95, respectively.

This paper proposes a methodology for discovering meaningful properties in data by exploring the latent space of unsupervised deep generative models. We combine manipulation of individual latent variables to extreme values outside the training range with methods inspired by causal inference into an approach we call causal disentanglement with extreme values (CDEV) and show that this approach yields insights for model interpretability. Using this technique, we can infer what properties of unknown data the model encodes as meaningful. We apply the methodology to test what is meaningful in the communication system of sperm whales, one of the most intriguing and understudied animal communication systems. We train a network that has been shown to learn meaningful representations of speech and test whether we can leverage such unsupervised learning to decipher the properties of another vocal communication system for which we have no ground truth. The proposed technique suggests that sperm whales encode information using the number of clicks in a sequence, the regularity of their timing, and audio properties such as the spectral mean and the acoustic regularity of the sequences. Some of these findings are consistent with existing hypotheses, while others are proposed for the first time. We also argue that our models uncover rules that govern the structure of communication units in the sperm whale communication system and apply them while generating innovative data not shown during training. This paper suggests that an interpretation of the outputs of deep neural networks with causal methodology can be a viable strategy for approaching data about which little is known and presents another case of how deep learning can limit the hypothesis space. Finally, the proposed approach combining latent space manipulation and causal inference can be extended to other architectures and arbitrary datasets.

The numerical solution of partial differential equations (PDEs) is difficult, having led to a century of research so far. Recently, there have been pushes to build neural--numerical hybrid solvers, which piggy-backs the modern trend towards fully end-to-end learned systems. Most works so far can only generalize over a subset of properties to which a generic solver would be faced, including: resolution, topology, geometry, boundary conditions, domain discretization regularity, dimensionality, etc. In this work, we build a solver, satisfying these properties, where all the components are based on neural message passing, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators. We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes. In order to encourage stability in training autoregressive models, we put forward a method that is based on the principle of zero-stability, posing stability as a domain adaptation problem. We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.

Many physical problems involving heterogeneous spatial scales, such as the flow through fractured porous media, the study of fiber-reinforced materials, or the modeling of the small circulation in living tissues -- just to mention a few examples -- can be described as coupled partial differential equations defined in domains of heterogeneous dimensions that are embedded into each other. This formulation is a consequence of geometric model reduction techniques that transform the original problems defined in complex three-dimensional domains into more tractable ones. The definition and the approximation of coupling operators suitable for this class of problems is still a challenge. We develop a general mathematical framework for the analysis and the approximation of partial differential equations coupled by non-matching constraints across different dimensions, focusing on their enforcement using Lagrange multipliers. In this context, we address in abstract and general terms the well-posedness, stability, and robustness of the problem with respect to the smallest characteristic length of the embedded domain. We also address the numerical approximation of the problem and we discuss the inf-sup stability of the proposed numerical scheme for some representative configuration of the embedded domain. The main message of this work is twofold: from the standpoint of the theory of mixed-dimensional problems, we provide general and abstract mathematical tools to formulate coupled problems across dimensions. From the practical standpoint of the numerical approximation, we show the interplay between the mesh characteristic size, the dimension of the Lagrange multiplier space, and the size of the inclusion in representative configurations interesting for applications. The latter analysis is complemented with illustrative numerical examples.

This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by non-ignorable cluster sizes we mean that "large'' clusters and "small'' clusters may be heterogeneous, and, in particular, the effects of the treatment may vary across clusters of differing sizes. In order to permit this sort of flexibility, we consider a sampling framework in which cluster sizes themselves are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using a covariate-adaptive stratified randomization procedure. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study and empirical demonstration show the practical relevance of our theoretical results.

Selection of covariates is crucial in the estimation of average treatment effects given observational data with high or even ultra-high dimensional pretreatment variables. Existing methods for this problem typically assume sparse linear models for both outcome and univariate treatment, and cannot handle situations with ultra-high dimensional covariates. In this paper, we propose a new covariate selection strategy called double screening prior adaptive lasso (DSPAL) to select confounders and predictors of the outcome for multivariate treatments, which combines the adaptive lasso method with the marginal conditional (in)dependence prior information to select target covariates, in order to eliminate confounding bias and improve statistical efficiency. The distinctive features of our proposal are that it can be applied to high-dimensional or even ultra-high dimensional covariates for multivariate treatments, and can deal with the cases of both parametric and nonparametric outcome models, which makes it more robust compared to other methods. Our theoretical analyses show that the proposed procedure enjoys the sure screening property, the ranking consistency property and the variable selection consistency. Through a simulation study, we demonstrate that the proposed approach selects all confounders and predictors consistently and estimates the multivariate treatment effects with smaller bias and mean squared error compared to several alternatives under various scenarios. In real data analysis, the method is applied to estimate the causal effect of a three-dimensional continuous environmental treatment on cholesterol level and enlightening results are obtained.

Gradient-based learning in multi-layer neural networks displays a number of striking features. In particular, the decrease rate of empirical risk is non-monotone even after averaging over large batches. Long plateaus in which one observes barely any progress alternate with intervals of rapid decrease. These successive phases of learning often take place on very different time scales. Finally, models learnt in an early phase are typically `simpler' or `easier to learn' although in a way that is difficult to formalize. Although theoretical explanations of these phenomena have been put forward, each of them captures at best certain specific regimes. In this paper, we study the gradient flow dynamics of a wide two-layer neural network in high-dimension, when data are distributed according to a single-index model (i.e., the target function depends on a one-dimensional projection of the covariates). Based on a mixture of new rigorous results, non-rigorous mathematical derivations, and numerical simulations, we propose a scenario for the learning dynamics in this setting. In particular, the proposed evolution exhibits separation of timescales and intermittency. These behaviors arise naturally because the population gradient flow can be recast as a singularly perturbed dynamical system.

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

北京阿比特科技有限公司