亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Confounding variables are a recurrent challenge for causal discovery and inference. In many situations, complex causal mechanisms only manifest themselves in extreme events, or take simpler forms in the extremes. Stimulated by data on extreme river flows and precipitation, we introduce a new causal discovery methodology for heavy-tailed variables that allows the use of a known potential confounder as a covariate and allows its effect to be almost entirely removed when the variables have comparable tails, and also decreases it sufficiently to enable correct causal inference when the confounder has a heavier tail. We introduce a new parametric estimator for the existing causal tail coefficient and a permutation test. Simulations show that the methods work well and the ideas are applied to the motivating dataset.

相關內容

Undertaking causal inference with observational data is incredibly useful across a wide range of tasks including the development of medical treatments, advertisements and marketing, and policy making. There are two significant challenges associated with undertaking causal inference using observational data: treatment assignment heterogeneity (\textit{i.e.}, differences between the treated and untreated groups), and an absence of counterfactual data (\textit{i.e.}, not knowing what would have happened if an individual who did get treatment, were instead to have not been treated). We address these two challenges by combining structured inference and targeted learning. In terms of structure, we factorize the joint distribution into risk, confounding, instrumental, and miscellaneous factors, and in terms of targeted learning, we apply a regularizer derived from the influence curve in order to reduce residual bias. An ablation study is undertaken, and an evaluation on benchmark datasets demonstrates that TVAE has competitive and state of the art performance.

PageRank is a fundamental property of graph and there have been plenty of PageRank algorithms. Generally, we consider undirected graph as a complicated directed graph. However, some properties of undirected graph, such as symmetry, are ignored when computing PageRank by existing algorithms. In this paper, we propose a parallel PageRank algorithm which is specially for undirected graph. We first demonstrate that the PageRank vector can be viewed as a linear combination of eigenvectors of probability transition matrix and the corresponding coefficients are the functions of eigenvalues. Then we introduce the Chebyshev polynomial approximation by which PageRank vector can be computed iteratively. Finally, we propose the parallel PageRank algorithm as the Chebyshev polynomial approximating algorithm(CPAA). Experimental results show that CPAA only takes 60% of iteration rounds of the power method and is at least 4 times faster than the power method.

We investigate several rank-based change-point procedures for the covariance operator in a sequence of observed functions, called FKWC change-point procedures. Our methods allow the user to test for one change-point, to test for an epidemic period, or to detect an unknown amount of change-points in the data. Our methodology combines functional data depth values with the traditional Kruskal Wallis test statistic. By taking this approach we have no need to estimate the covariance operator, which makes our methods computationally cheap. For example, our procedure can identify multiple change-points in $O(n\log n)$ time. Our procedure is fully non-parametric and is robust to outliers through the use of data depth ranks. We show that when $n$ is large, our methods have simple behaviour under the null hypothesis.We also show that the FKWC change-point procedures are $n^{-1/2}$-consistent. In addition to asymptotic results, we provide a finite sample accuracy result for our at-most-one change-point estimator. In simulation, we compare our methods against several others. We also present an application of our methods to intraday asset returns and f-MRI scans.

Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Many classical statistical models often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining comparable accuracy to the state-of-the art point forecasting methods.

This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.

Current deep learning research is dominated by benchmark evaluation. A method is regarded as favorable if it empirically performs well on the dedicated test set. This mentality is seamlessly reflected in the resurfacing area of continual learning, where consecutively arriving sets of benchmark data are investigated. The core challenge is framed as protecting previously acquired representations from being catastrophically forgotten due to the iterative parameter updates. However, comparison of individual methods is nevertheless treated in isolation from real world application and typically judged by monitoring accumulated test set performance. The closed world assumption remains predominant. It is assumed that during deployment a model is guaranteed to encounter data that stems from the same distribution as used for training. This poses a massive challenge as neural networks are well known to provide overconfident false predictions on unknown instances and break down in the face of corrupted data. In this work we argue that notable lessons from open set recognition, the identification of statistically deviating data outside of the observed dataset, and the adjacent field of active learning, where data is incrementally queried such that the expected performance gain is maximized, are frequently overlooked in the deep learning era. Based on these forgotten lessons, we propose a consolidated view to bridge continual learning, active learning and open set recognition in deep neural networks. Our results show that this not only benefits each individual paradigm, but highlights the natural synergies in a common framework. We empirically demonstrate improvements when alleviating catastrophic forgetting, querying data in active learning, selecting task orders, while exhibiting robust open world application where previously proposed methods fail.

The inaccessibility of controlled randomized trials due to inherent constraints in many fields of science has been a fundamental issue in causal inference. In this paper, we focus on distinguishing the cause from effect in the bivariate setting under limited observational data. Based on recent developments in meta learning as well as in causal inference, we introduce a novel generative model that allows distinguishing cause and effect in the small data setting. Using a learnt task variable that contains distributional information of each dataset, we propose an end-to-end algorithm that makes use of similar training datasets at test time. We demonstrate our method on various synthetic as well as real-world data and show that it is able to maintain high accuracy in detecting directions across varying dataset sizes.

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well known causal inference framework. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

Predictive models can fail to generalize from training to deployment environments because of dataset shift, posing a threat to model reliability and the safety of downstream decisions made in practice. Instead of using samples from the target distribution to reactively correct dataset shift, we use graphical knowledge of the causal mechanisms relating variables in a prediction problem to proactively remove relationships that do not generalize across environments, even when these relationships may depend on unobserved variables (violations of the "no unobserved confounders" assumption). To accomplish this, we identify variables with unstable paths of statistical influence and remove them from the model. We also augment the causal graph with latent counterfactual variables that isolate unstable paths of statistical influence, allowing us to retain stable paths that would otherwise be removed. Our experiments demonstrate that models that remove vulnerable variables and use estimates of the latent variables transfer better, often outperforming in the target domain despite some accuracy loss in the training domain.

Recommender systems (RSs) provide an effective way of alleviating the information overload problem by selecting personalized items for different users. Latent factors based collaborative filtering (CF) has become the popular approaches for RSs due to its accuracy and scalability. Recently, online social networks and user-generated content provide diverse sources for recommendation beyond ratings. Although {\em social matrix factorization} (Social MF) and {\em topic matrix factorization} (Topic MF) successfully exploit social relations and item reviews, respectively, both of them ignore some useful information. In this paper, we investigate the effective data fusion by combining the aforementioned approaches. First, we propose a novel model {\em \mbox{MR3}} to jointly model three sources of information (i.e., ratings, item reviews, and social relations) effectively for rating prediction by aligning the latent factors and hidden topics. Second, we incorporate the implicit feedback from ratings into the proposed model to enhance its capability and to demonstrate its flexibility. We achieve more accurate rating prediction on real-life datasets over various state-of-the-art methods. Furthermore, we measure the contribution from each of the three data sources and the impact of implicit feedback from ratings, followed by the sensitivity analysis of hyperparameters. Empirical studies demonstrate the effectiveness and efficacy of our proposed model and its extension.

北京阿比特科技有限公司