
With continuous outcomes, the average causal effect is typically defined using a contrast of expected potential outcomes. However, in the presence of skewed outcome data, the expectation may no longer be meaningful and the definition of the causal effect should be considered more closely. When faced with this challenge in practice, the typical approach is to either "ignore or transform" - ignore the skewness in the data entirely or transform the outcome to obtain a more symmetric distribution for which the expectation is interpretable as the central value. However, neither approach is entirely satisfactory. An appealing alternative is to define the causal effect using a contrast of median potential outcomes, although there is limited discussion or availability of confounding-adjustment methods to estimate this parameter. In this study, we described and compared confounding-adjustment methods for estimating the causal difference in medians, addressing this gap. The methods considered were multivariable quantile regression, an inverse probability weighted (IPW) estimator, weighted quantile regression and two little-known implementations of g-computation. The methods were evaluated in a simulation study under varying degrees of skewness in the outcome and applied to an empirical study. Results indicated that the IPW estimator, weighted quantile regression and the g-computation implementations minimised bias across all simulation settings, provided the corresponding model was correctly specified, with g-computation additionally minimising the variance of the estimates. The methods presented in this paper provide appealing alternatives to the common "ignore or transform" approach, enhancing our capability to obtain meaningful causal effect estimates with skewed outcome data.
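To make the contrast concrete, the sketch below illustrates two of the approaches mentioned above - multivariable median (0.5-quantile) regression and an IPW-weighted quantile estimator - in Python. The column names (`y`, `a`, `x1`, `x2`), the data frame `df` and the use of statsmodels are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: two estimators of a causal difference in medians under
# assumed column names (y = outcome, a = binary treatment, x1/x2 = confounders).
import numpy as np
import statsmodels.formula.api as smf

def weighted_median(values, weights):
    """Median of `values` under non-negative `weights`."""
    order = np.argsort(values)
    v, w = np.asarray(values)[order], np.asarray(weights)[order]
    cum = np.cumsum(w) / np.sum(w)
    return v[np.searchsorted(cum, 0.5)]

def difference_in_medians(df):
    # (i) Median regression adjusting for confounders; the coefficient on the
    #     treatment indicator `a` is the adjusted median difference.
    qr = smf.quantreg("y ~ a + x1 + x2", df).fit(q=0.5)
    qr_effect = qr.params["a"]

    # (ii) IPW: model treatment assignment, weight each arm by the inverse of
    #      its estimated probability, then contrast weighted medians.
    ps = smf.logit("a ~ x1 + x2", df).fit(disp=0).predict(df)
    w = np.where(df["a"] == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    treated, control = df["a"] == 1, df["a"] == 0
    ipw_effect = (weighted_median(df.loc[treated, "y"], w[treated])
                  - weighted_median(df.loc[control, "y"], w[control]))
    return qr_effect, ipw_effect
```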

Related content

Online optimization is a well-established optimization paradigm that aims to make a sequence of correct decisions given knowledge of the correct answers to previous decision tasks. Bilevel programming involves a hierarchical optimization problem in which the feasible region of the so-called outer problem is restricted by the graph of the solution-set mapping of the inner problem. This paper brings these two ideas together and studies an online bilevel optimization setting in which a sequence of time-varying bilevel problems is revealed one after the other. We extend the known regret bounds for single-level online algorithms to the bilevel setting. Specifically, we introduce new notions of bilevel regret, develop an online alternating time-averaged gradient method that is capable of leveraging smoothness, and provide regret bounds in terms of the path-length of the inner and outer minimizer sequences.
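As a rough illustration of the alternating idea, the sketch below takes one inner gradient step and one outer gradient step per round as each new bilevel problem is revealed. The gradient oracles, step sizes, and the simple alternating loop are assumptions for illustration; this is not the paper's exact time-averaged method.

```python
# Schematic online alternating gradient scheme for a stream of bilevel
# problems: each round reveals an outer loss f_t(x, y) and inner loss g_t(x, y),
# supplied here as gradient oracles grad_f_x and grad_g_y.
import numpy as np

def online_alternating_bilevel(problems, x0, y0, alpha=0.1, beta=0.1):
    """`problems` yields (grad_f_x, grad_g_y) callables, one pair per round."""
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    iterates = []
    for grad_f_x, grad_g_y in problems:
        y = y - beta * grad_g_y(x, y)    # inner update: track the inner minimizer
        x = x - alpha * grad_f_x(x, y)   # outer update: descend the outer loss
        iterates.append((x.copy(), y.copy()))
    return iterates

# Example round: outer f_t(x, y) = ||x - y||^2 with grad_f_x = 2(x - y),
# inner g_t(x, y) = ||y - c_t||^2 with grad_g_y = 2(y - c_t).
```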

We study the widely known Cubic-Newton method in the stochastic setting and propose a general framework for using variance reduction, which we call the helper framework. In all previous work, these methods were proposed with very large batches (of both gradients and Hessians) and under various, often strong, assumptions. In this work, we investigate the possibility of using such methods without large batches and rely on very simple assumptions that are sufficient for all our methods to work. In addition, we study these methods applied to gradient-dominated functions. In the general case, we show improved convergence (compared to first-order methods) to an approximate local minimum, and for gradient-dominated functions, we show convergence to approximate global minima.
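For reference, one cubic-regularized Newton iteration builds the model m(s) = g^T s + (1/2) s^T H s + (M/6)||s||^3 around the current point and moves to an approximate minimizer of that model. The sketch below uses a (possibly mini-batch) gradient and Hessian and a generic solver for the subproblem; the constant M and the use of scipy are illustrative choices, not the paper's helper framework.

```python
# Minimal sketch of one cubic-regularized Newton step from a gradient g and
# Hessian H (both possibly estimated from a mini-batch).
import numpy as np
from scipy.optimize import minimize

def cubic_newton_step(g, H, M=1.0):
    """Return a step s that approximately minimizes the cubic model."""
    def model(s):
        return g @ s + 0.5 * s @ H @ s + (M / 6.0) * np.linalg.norm(s) ** 3
    s0 = np.zeros_like(g, dtype=float)
    return minimize(model, s0).x

# Usage: x_next = x + cubic_newton_step(batch_grad(x), batch_hess(x), M=10.0)
```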

Unobserved confounding is one of the main challenges when estimating causal effects. We propose a causal reduction method that, given a causal model, replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder that takes values in the same space as the treatment variable, without changing the observational and interventional distributions the causal model entails. This allows us to estimate the causal effect in a principled way from combined observational and interventional data without relying on the common but often unrealistic assumption that all confounders have been observed. We apply our causal reduction in three different settings. In the first setting, we assume the treatment and outcome to be discrete. The causal reduction then implies bounds between the observational and interventional distributions that can be exploited for estimation purposes. In certain cases with highly unbalanced observational samples, the accuracy of the causal effect estimate can be improved by incorporating observational data. Second, for continuous variables and assuming a linear-Gaussian model, we derive equality constraints for the parameters of the observational and interventional distributions. Third, for the general continuous setting (possibly nonlinear and non-Gaussian), we parameterize the reduced causal model using normalizing flows, a flexible class of easily invertible nonlinear transformations. We perform a series of experiments on synthetic data and find that in several cases the number of interventional samples can be reduced when adding observational training samples without sacrificing accuracy.
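As background for the third setting, the toy sketch below shows the normalizing-flow idea in its simplest form: an invertible transformation with a tractable log-determinant, so that densities can be evaluated by the change-of-variables formula. It is a generic one-layer elementwise flow for illustration, not the paper's parameterization of the reduced causal model.

```python
# Toy one-layer normalizing flow: x = exp(log_scale) * z + shift, with an
# analytic inverse and log|det Jacobian|, under a standard-normal base density.
import numpy as np

class AffineFlow:
    def __init__(self, log_scale, shift):
        self.log_scale, self.shift = np.asarray(log_scale), np.asarray(shift)

    def forward(self, z):
        return np.exp(self.log_scale) * z + self.shift

    def inverse(self, x):
        return (x - self.shift) * np.exp(-self.log_scale)

    def log_prob(self, x):
        # Change of variables: log p_x(x) = log p_z(f^{-1}(x)) - sum(log_scale).
        z = self.inverse(x)
        base = -0.5 * (z ** 2 + np.log(2 * np.pi)).sum(axis=-1)
        return base - self.log_scale.sum()
```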

We study the power of uniform sampling for $k$-Median in various metric spaces. We relate the query complexity of approximating $k$-Median to a key parameter of the dataset, called the balancedness $\beta \in (0, 1]$ (with $1$ being perfectly balanced). We show that any algorithm must make $\Omega(1 / \beta)$ queries to the point set in order to achieve an $O(1)$-approximation for $k$-Median. In particular, this implies that existing constructions of coresets, a popular data reduction technique, cannot be query-efficient. On the other hand, we show that a simple uniform sample of $\mathrm{poly}(k \epsilon^{-1} \beta^{-1})$ points suffices for a $(1 + \epsilon)$-approximation for $k$-Median in various metric spaces, which nearly matches the lower bound. We conduct experiments to verify that in many real datasets the balancedness parameter is usually well bounded, and that uniform sampling performs consistently well even in cases with moderately large balancedness, which justifies uniform sampling as a viable approach for solving $k$-Median.
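The sampling recipe itself is simple enough to sketch: draw a uniform sample whose size grows polynomially in $k$, $1/\epsilon$ and $1/\beta$, solve $k$-Median on the sample, and evaluate the resulting centers on the full point set. In the sketch below, the exact sample-size formula and the stand-in solver (k-means on the sample, used only as a rough surrogate for a proper $k$-Median solver) are illustrative assumptions, not the paper's construction.

```python
# Minimal sketch of uniform sampling for k-Median in Euclidean space.
import numpy as np
from sklearn.cluster import KMeans

def kmedian_cost(points, centers):
    """Sum of Euclidean distances from each point to its nearest center."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return d.min(axis=1).sum()

def uniform_sample_kmedian(points, k, eps=0.5, beta=0.5, seed=None):
    rng = np.random.default_rng(seed)
    # Illustrative poly(k, 1/eps, 1/beta) sample size, capped at the dataset size.
    m = min(len(points), int(np.ceil(k / (eps ** 2 * beta))))
    sample = points[rng.choice(len(points), size=m, replace=False)]
    centers = KMeans(n_clusters=k, n_init=10).fit(sample).cluster_centers_
    return centers, kmedian_cost(points, centers)
```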

In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model with sufficient flexibility to describe the dynamics of various time series. To address this problem, we propose a novel model-based time series clustering method based on mixtures of linear Gaussian state space models, which offer high flexibility. The proposed method uses a new expectation-maximization algorithm for the mixture model to estimate the model parameters, and determines the number of clusters using the Bayesian information criterion. Experiments on a simulated dataset demonstrate the effectiveness of the method in clustering, parameter estimation, and model selection. The method is also applied to real datasets commonly used to evaluate time series clustering methods. Results show that the proposed method produces clustering results that are as accurate as or more accurate than those obtained using previous methods.
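One concrete piece of the pipeline above is choosing the number of clusters with the Bayesian information criterion. The sketch below shows a generic BIC selection loop; the fitting routine `fit_mixture(data, k)`, assumed to return the maximized log-likelihood and the number of free parameters, is a hypothetical stand-in for the paper's EM algorithm for mixtures of linear Gaussian state space models.

```python
# Generic BIC-based selection of the number of clusters for a mixture model.
import numpy as np

def select_n_clusters(data, fit_mixture, max_clusters=10):
    """`fit_mixture(data, k)` is assumed to return (log_likelihood, n_params)."""
    n_series = len(data)
    best = None
    for k in range(1, max_clusters + 1):
        log_lik, n_params = fit_mixture(data, k)
        bic = -2.0 * log_lik + n_params * np.log(n_series)
        if best is None or bic < best[1]:
            best = (k, bic)
    return best  # (chosen number of clusters, its BIC)
```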

This work is motivated by the need to accurately model a vector of responses related to pediatric functional status using administrative health data from inpatient rehabilitation visits. The components of the response vector have known and structured interrelationships. To make use of these relationships in modeling, we develop a two-pronged regularization approach to borrow information across the responses. The first component of our approach encourages joint selection of the effects of each variable across possibly overlapping groups of related responses, and the second component encourages shrinkage of effects towards each other for related responses. As the responses in our motivating study are not normally distributed, our approach does not rely on an assumption of multivariate normality of the responses. We show that with an adaptive version of our penalty, our approach results in the same asymptotic distribution of the estimates as if we had known in advance which variables were non-zero and which variables have the same effects across some outcomes. We demonstrate the performance of our method in extensive numerical studies and in an application to the prediction of functional status of pediatric patients using administrative health data in a population of children with neurological injury or illness at a large children's hospital.
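The schematic sketch below writes out the two penalty components described above for a coefficient matrix B (predictors by responses): a group term that encourages joint selection of a predictor's effects across a group of related responses, and a fusion term that shrinks the effects for related response pairs towards one another. The group and pair definitions and the tuning parameters are illustrative assumptions, not the paper's exact penalty.

```python
# Schematic two-pronged penalty for a multivariate regression coefficient matrix.
import numpy as np

def two_pronged_penalty(B, groups, pairs, lam1=1.0, lam2=1.0):
    """B: (p, q) coefficients; groups: lists of response indices treated jointly;
    pairs: (j, k) response index pairs considered related."""
    # Group term: L2 norm of each predictor's effects within a response group,
    # summed over predictors and groups (group-lasso-style joint selection).
    selection = sum(np.linalg.norm(B[:, g], axis=1).sum() for g in groups)
    # Fusion term: L1 distance between coefficient vectors of related responses.
    fusion = sum(np.abs(B[:, j] - B[:, k]).sum() for j, k in pairs)
    return lam1 * selection + lam2 * fusion
```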

Most existing evaluations of explainable machine learning (ML) methods rely on simplifying assumptions or proxies that do not reflect real-world use cases; the handful of more robust evaluations in real-world settings have shortcomings in their design, resulting in limited conclusions about the methods' real-world utility. In this work, we seek to bridge this gap by conducting a study that evaluates three popular explainable ML methods in a setting consistent with the intended deployment context. We build on a previous study of e-commerce fraud detection and make crucial modifications to its setup, relaxing the simplifying assumptions made in the original work that departed from the deployment context. In doing so, we draw drastically different conclusions from the earlier work and find no evidence of incremental utility of the tested methods in the task. Our results highlight how seemingly trivial experimental design choices can yield misleading conclusions, with lessons about the necessity of not only evaluating explainable ML methods using tasks, data, users, and metrics grounded in the intended deployment contexts, but also of developing methods tailored to specific applications. In addition, we believe the design of this experiment can serve as a template for future study designs evaluating explainable ML methods in other real-world contexts.

Monitoring microbiological behaviors in water is crucial for managing the public health risk from waterborne pathogens, yet quantifying the concentrations of microbiological organisms in water remains challenging because the concentrations of many pathogens in water samples are often below the quantification limit, producing censored data. To enable statistical analysis based on quantitative values, the true values of non-detected measurements must be estimated with high precision. The Tobit model is a well-known linear regression model for analyzing censored data. One drawback of the Tobit model is that only the target variable is allowed to be censored. In this study, we devised a novel extension of the classical Tobit model, called the \emph{multi-target Tobit model}, to handle multiple censored variables simultaneously by introducing multiple target variables. To fit the new model, we developed a numerically stable optimization algorithm based on carefully developed theory. Experiments conducted using several real-world water quality datasets provide evidence that estimating multiple columns jointly offers a substantial advantage over estimating them separately.
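For reference, the sketch below gives the log-likelihood of the classical (single-target) Tobit model that the multi-target extension generalizes: observed measurements contribute a Gaussian density term, while non-detects at or below the quantification limit contribute the corresponding left-tail probability. Variable names and the use of scipy are illustrative assumptions.

```python
# Log-likelihood of a classical left-censored Tobit model.
import numpy as np
from scipy.stats import norm

def tobit_loglik(beta, sigma, X, y, limit, censored):
    """X: (n, p) design matrix; y: observed values (set to `limit` for non-detects);
    censored: boolean array, True where the measurement is below the limit."""
    mu = X @ beta
    # Uncensored observations: Gaussian density around the linear predictor.
    ll_obs = norm.logpdf(y[~censored], loc=mu[~censored], scale=sigma)
    # Censored observations: probability of falling at or below the limit.
    ll_cens = norm.logcdf((limit - mu[censored]) / sigma)
    return ll_obs.sum() + ll_cens.sum()
```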

Causal discovery is a task of the utmost importance for machine learning, since causal structures can enable models to go beyond purely correlation-based inference and significantly boost their performance. However, finding causal structures from data poses a significant challenge, both in computational effort and in accuracy, and is in general impossible without interventions. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions such that it can construct an explicit causal graph. Apart from being useful for possible downstream applications, the estimated causal graph also provides an explanation of the data-generating process. We show that our algorithm estimates a good graph compared to state-of-the-art approaches, even in environments whose underlying causal structure has not been seen previously. Further, we conduct an ablation study that shows how learning interventions contributes to the overall performance of our approach. We conclude that interventions indeed help boost performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.

Estimating treatment effects plays a crucial role in causal inference and has many real-world applications, such as policy analysis and decision making. Nevertheless, estimating treatment effects in the longitudinal setting in the presence of hidden confounders remains an extremely challenging problem. Recently, a growing body of work has attempted to obtain unbiased individual treatment effect (ITE) estimates from time-dynamic observational data while ignoring the possible existence of hidden confounders. Additionally, many existing works that do handle hidden confounders are not applicable to continuous-time settings. In this paper, we extend the line of work on deconfounding in the dynamic time setting in the presence of hidden confounders. We leverage recent advances in neural differential equations to build a latent factor model using a stochastic controlled differential equation and a Lipschitz-constrained convolutional operation in order to continuously incorporate information about ongoing interventions and irregularly sampled observations. Experiments on both synthetic and real-world datasets highlight the promise of continuous-time methods for estimating treatment effects in the presence of hidden confounders.
