Zero-inflated explanatory variables are common in fields such as ecology and finance. In this paper we address the problem of an excess of zero values in explanatory variables that are subject to multi-outcome lasso-regularized variable selection. Briefly, the problem arises because lasso-type shrinkage methods cannot distinguish between a zero occurring in a regression coefficient and a zero occurring in the corresponding value of the explanatory variable. This confounding inflates the number of false positives, since non-zero regression coefficients do not necessarily represent real outcome effects. We present the adaptive LAD-lasso for multiple outcomes, which extends earlier work on the multivariate LAD-lasso with adaptive penalization. In addition to the well-known property of producing less biased regression coefficients, we show that adaptivity also improves the method's ability to recover from the influence of excess zeros measured in continuous covariates.
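
As a concrete illustration, the multivariate adaptive LAD-lasso estimate can be obtained with an off-the-shelf convex solver. The sketch below is a minimal formulation assuming the cvxpy package; the ridge-based pilot estimates used to form the adaptive weights, and all tuning values, are illustrative placeholders rather than the authors' implementation.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, p, m = 100, 10, 3                      # samples, covariates, outcomes
    X = rng.normal(size=(n, p))
    B_true = np.zeros((p, m)); B_true[:3] = 1.0
    Y = X @ B_true + rng.standard_t(df=3, size=(n, m))  # heavy-tailed noise

    # Pilot (ridge) estimate to build adaptive weights w_jk = 1/|b_jk|^gamma
    B_pilot = np.linalg.solve(X.T @ X + 0.1 * np.eye(p), X.T @ Y)
    gamma, eps = 1.0, 1e-4
    W = 1.0 / (np.abs(B_pilot) + eps) ** gamma

    lam = 1.0
    B = cp.Variable((p, m))
    loss = cp.sum(cp.abs(Y - X @ B))          # LAD loss pooled over outcomes
    penalty = lam * cp.sum(cp.multiply(W, cp.abs(B)))
    cp.Problem(cp.Minimize(loss + penalty)).solve()
    print(np.round(B.value, 2))               # rows 0-2 nonzero, rest shrunk to 0

Because the pilot coefficients of truly inactive covariates are small, their adaptive weights are large, so those coefficients are penalized more heavily than in the plain LAD-lasso.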

We propose a new, more general definition of extended probability measures. We study their properties and provide a behavioral interpretation. We use them in an inference procedure whose environment is canonically represented by the probability space $(\Omega,\mathcal{F},P)$, when both $P$ and the composition of $\Omega$ are unknown. We develop an ex ante analysis -- taking place before the statistical analysis requiring knowledge of $\Omega$ -- in which we progressively learn the true composition of $\Omega$. We describe how to update extended probabilities in this setting and introduce the concept of lower extended probabilities. We provide two examples, from ecology and opinion dynamics.

The goal of a typical adaptive sequential decision making problem is to design an interactive policy that selects a group of items sequentially, based on partial observations, to maximize the expected utility. It has been shown that the utility functions of many real-world applications, including pool-based active learning and adaptive influence maximization, satisfy the property of adaptive submodularity. However, most existing studies on adaptive submodular maximization focus on the fully adaptive setting, i.e., one must wait for the feedback from \emph{all} past selections before making the next selection. Although this approach can take full advantage of past feedback to make informed decisions, it may take much longer to complete the selection process than the non-adaptive solution, where all selections are made in advance before any observations take place. In this paper, we explore the problem of partial-adaptive submodular maximization, where one is allowed to make multiple selections in a batch simultaneously and observe their realizations together. Our approach enjoys the benefits of adaptivity while reducing the time spent waiting for observations from past selections. To the best of our knowledge, no results are known for partial-adaptive policies for the non-monotone adaptive submodular maximization problem. We study this problem under both cardinality and knapsack constraints, and develop effective and efficient solutions for both cases. We also analyze the batch query complexity, i.e., the number of batches a policy takes to complete the selection process, of our policy under some additional assumptions.
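
To make the batch structure concrete, the following sketch implements a generic partial-adaptive greedy policy under a cardinality constraint. The oracles expected_gain and observe are hypothetical placeholders, and the plain greedy rule omits the randomization the non-monotone case would require; it is meant only to show how selections and observations are grouped per batch.

    import random

    def partial_adaptive_greedy(items, k, batch_size, expected_gain, observe):
        """Select k items in batches: build a whole batch greedily on the
        current observations, then observe all its realizations at once.
        expected_gain(item, obs) estimates the conditional expected marginal
        gain; observe(batch) returns the batch's realizations."""
        selected, obs = [], {}
        while len(selected) < k:
            batch = []
            for _ in range(min(batch_size, k - len(selected))):
                remaining = [i for i in items if i not in selected and i not in batch]
                if not remaining:
                    break
                batch.append(max(remaining, key=lambda i: expected_gain(i, obs)))
            obs.update(observe(batch))      # feedback arrives per batch, not per item
            selected.extend(batch)
        return selected

    # Toy usage: realizations are random scores revealed one batch at a time.
    items = list(range(10))
    real = {i: random.random() for i in items}
    picked = partial_adaptive_greedy(
        items, k=6, batch_size=3,
        expected_gain=lambda i, obs: obs.get(i, 0.5),
        observe=lambda batch: {i: real[i] for i in batch})
    print(picked)

Setting batch_size to 1 recovers the fully adaptive policy, and batch_size equal to k recovers the non-adaptive one, which is the trade-off the batch query complexity quantifies.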

The acyclic model, often depicted as a directed acyclic graph (DAG), has been widely employed to represent directional causal relations among collected nodes. In this article, we propose an efficient method to learn a linear non-Gaussian DAG in high-dimensional cases, where the noise can follow any continuous non-Gaussian distribution. This is in sharp contrast to most existing DAG learning methods, which assume Gaussian noise with additional variance assumptions to attain exact DAG recovery. The proposed method leverages a novel concept of topological layers to facilitate DAG learning. In particular, we show that the topological layers can be exactly reconstructed in a bottom-up fashion, and that the parent-child relations among nodes in each layer can also be consistently established. More importantly, the proposed method does not require the faithfulness or parental faithfulness assumptions that are widely imposed in the DAG learning literature. Its advantage is also supported by numerical comparisons against some popular competitors in various simulated examples, as well as a real application to the global spread of COVID-19.
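
The bottom-up layer reconstruction can be sketched as a peeling loop over the remaining nodes. The layer_score oracle below is a hypothetical placeholder for the statistic that identifies the lowest layer of a linear non-Gaussian SEM; the code shows only the mechanics of the peeling, with a dummy score in the toy usage.

    import numpy as np

    def peel_layers(data, layer_score):
        """Reconstruct topological layers bottom-up.  layer_score(j, S, data)
        is a statistic that is minimized when node j belongs to the lowest
        layer among the remaining nodes S (placeholder; the paper derives a
        specific statistic for linear non-Gaussian SEMs)."""
        remaining = list(range(data.shape[1]))
        layers = []
        while remaining:
            scores = {j: layer_score(j, remaining, data) for j in remaining}
            cutoff = min(scores.values()) + 1e-8   # nodes tied at the minimum
            layer = [j for j in remaining if scores[j] <= cutoff]
            layers.append(layer)                   # bottom layer comes first
            remaining = [j for j in remaining if j not in layer]
        return layers

    # Mechanical toy: a dummy score (column variance) just to exercise the loop.
    rng = np.random.default_rng(6)
    data = rng.normal(size=(100, 5)) * np.arange(1, 6)
    print(peel_layers(data, lambda j, S, d: d[:, j].var()))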

In the era of open data, Poisson and other count regression models are increasingly important. Still, conventional Poisson regression has lingering issues with identifiability and computational efficiency. In particular, owing to an identification problem, Poisson regression can be unstable for small samples with many zeros. Motivated by this, we develop a closed-form inference for over-dispersed Poisson regression, including Poisson additive mixed models. The approach is derived via a mode-based log-Gaussian approximation. The resulting method is fast, practical, and free from the identification problem. Monte Carlo experiments demonstrate that the estimation error of the proposed method is considerably smaller than that of the closed-form alternatives and as small as that of the usual Poisson regression. For counts with many zeros, our approximation achieves better estimation accuracy than conventional Poisson regression. We obtained similar results for Poisson additive mixed models with spatial or group effects. The developed method is applied to COVID-19 data from Japan; the results suggest that the influences of pedestrian density, age, and other factors on the number of cases change over time.
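
As a rough illustration of the closed-form idea, one simple log-Gaussian approximation regresses a shifted log count on the covariates by weighted least squares. The shift and the weights below are illustrative choices, not necessarily the authors' mode-based estimator.

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 500, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    beta_true = np.array([0.5, 1.0, -0.5])
    y = rng.poisson(np.exp(X @ beta_true))

    # Closed-form log-Gaussian approximation (illustrative): regress a shifted
    # log count on X, weighting by an approximate precision y + c.
    c = 0.5                                   # shift keeps zero counts finite
    z = np.log(y + c)
    w = y + c                                 # Var[log(Y + c)] ~ 1 / (E[Y] + c)
    XtW = X.T * w
    beta_hat = np.linalg.solve(XtW @ X, XtW @ z)
    print(np.round(beta_hat, 2))              # close to beta_true, up to the
                                              # bias absorbed by the intercept

The whole fit is a single weighted least-squares solve, which is why such approximations remain stable and fast where iteratively fitted Poisson likelihoods can fail on small, zero-heavy samples.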

Acceleration of first-order methods is mainly obtained via inertial techniques \`a la Nesterov, or via nonlinear extrapolation. The latter has seen a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On many machine learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy, however: inertially accelerated versions of coordinate descent are theoretically accelerated but do not always yield practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed-ups in practice compared to inertial accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net, and logistic regression validate the approach.
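
A minimal sketch of the idea follows, assuming an Anderson-type extrapolation of the coordinate descent iterates on a plain least-squares problem; the restart rule and safeguards are simplified relative to a practical implementation.

    import numpy as np

    def cd_epoch(w, X, y, L):
        """One cycle of coordinate descent on 0.5 * ||y - Xw||^2."""
        r = y - X @ w
        for j in range(X.shape[1]):
            step = X[:, j] @ r / L[j]
            w[j] += step
            r -= step * X[:, j]
        return w

    def anderson_cd(X, y, n_epochs=100, K=5):
        L = (X ** 2).sum(axis=0)             # per-coordinate Lipschitz constants
        w = np.zeros(X.shape[1])
        hist = []
        for _ in range(n_epochs):
            w = cd_epoch(w.copy(), X, y, L)
            hist.append(w.copy())
            if len(hist) == K + 1:           # extrapolate from last K+1 iterates
                U = np.diff(np.array(hist), axis=0).T          # p x K differences
                z = np.linalg.solve(U.T @ U + 1e-10 * np.eye(K), np.ones(K))
                c = z / z.sum()
                w_acc = np.array(hist[1:]).T @ c
                # keep the extrapolated point only if it decreases the loss
                if np.linalg.norm(y - X @ w_acc) < np.linalg.norm(y - X @ w):
                    w = w_acc
                hist = []
        return w

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 50))
    w_star = rng.normal(size=50)
    y = X @ w_star
    print(np.linalg.norm(anderson_cd(X, y) - w_star))  # ~0: recovers w_star

The loss-decrease guard is the key practical safeguard: extrapolation is accepted only when it helps, so the method never does worse than plain coordinate descent.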

Graphs (networks) are an important tool for modeling data in different domains. Real-world graphs are usually directed: edges have a direction and are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, given a directed network $G$ and a vertex $r \in V(G)$, we first propose an exact algorithm to compute the betweenness score of $r$. Our algorithm pre-computes a set $\mathcal{RV}(r)$, which is used to prune a huge amount of computation that does not contribute to the betweenness score of $r$. The time complexity of our algorithm depends on $|\mathcal{RV}(r)|$; it is $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|)$ for unweighted graphs and $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|+|\mathcal{RV}(r)|\cdot|V(G)|\log |V(G)|)$ for weighted graphs with positive weights. $|\mathcal{RV}(r)|$ is bounded above by $|V(G)|-1$, and in most cases it is a small constant. Then, for cases where $\mathcal{RV}(r)$ is large, we present a simple randomized algorithm that samples from $\mathcal{RV}(r)$ and performs computations only for the sampled elements. We show that this algorithm provides an $(\epsilon,\delta)$-approximation of the betweenness score of $r$. Finally, we perform extensive experiments over several real-world datasets from different domains, for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that, for estimating the betweenness score of a single vertex, our algorithm significantly outperforms the most efficient existing randomized algorithms in terms of both running time and accuracy. They also reveal that our algorithm improves on existing algorithms when one is interested in computing the betweenness values of the vertices in a set of very small cardinality.
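
For reference, the quantity being computed can be obtained exactly with a standard Brandes-style dependency accumulation specialized to a single vertex, shown below for unweighted graphs. This baseline enumerates all sources and does none of the $\mathcal{RV}(r)$ pruning that gives the paper's algorithm its speed-up.

    from collections import deque, defaultdict

    def betweenness_of(r, adj, nodes):
        """Exact betweenness score of vertex r in an unweighted directed
        graph, accumulating shortest-path dependencies from every source."""
        score = 0.0
        for s in nodes:
            if s == r:
                continue
            # BFS from s: shortest-path counts sigma and predecessor lists
            sigma = defaultdict(float); sigma[s] = 1.0
            dist = {s: 0}
            preds = defaultdict(list)
            order, q = [], deque([s])
            while q:
                v = q.popleft()
                order.append(v)
                for w in adj.get(v, ()):
                    if w not in dist:
                        dist[w] = dist[v] + 1
                        q.append(w)
                    if dist[w] == dist[v] + 1:
                        sigma[w] += sigma[v]
                        preds[w].append(v)
            # back-propagate dependencies and collect r's share
            delta = defaultdict(float)
            for w in reversed(order):
                for v in preds[w]:
                    delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            score += delta[r]
        return score

    adj = {0: [1], 1: [2], 2: [3]}                    # path 0 -> 1 -> 2 -> 3
    print(betweenness_of(1, adj, nodes=[0, 1, 2, 3])) # 2.0: 1 lies on 0->2, 0->3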

We propose an adaptive finite element algorithm to approximate solutions of elliptic problems whose forcing data is locally defined and approximated by regularization (or mollification). We show that the energy error decay is quasi-optimal in two dimensions and sub-optimal in three dimensions. Numerical simulations are provided to confirm our findings.
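
A one-dimensional illustration of the regularization step, assuming a Dirac-type point source as the locally defined forcing: the standard compactly supported mollifier below replaces the point source by a smooth unit-mass bump. The adaptive finite element machinery itself is not sketched here.

    import numpy as np

    def mollified_source(x, x0, eps):
        """Compactly supported bump of unit mass approximating a point
        source delta(x - x0); the regularized forcing used in place of
        the locally defined data."""
        t = (x - x0) / eps
        t2 = np.minimum(t * t, 1.0 - 1e-12)      # clamp to avoid 1/0 at |t| = 1
        f = np.where(np.abs(t) < 1.0, np.exp(-1.0 / (1.0 - t2)), 0.0)
        dx = x[1] - x[0]
        return f / (f.sum() * dx)                # normalize to unit mass

    x = np.linspace(0.0, 1.0, 2001)
    for eps in (0.2, 0.1, 0.05):
        f_eps = mollified_source(x, x0=0.3, eps=eps)
        print(eps, (f_eps * (x[1] - x[0])).sum())  # mass stays ~1 as eps shrinks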

Measuring the quality of cancer care delivered by US health providers is challenging, as patients receiving oncology care vary greatly in disease presentation, among other key characteristics. In this paper we discuss a framework for institutional quality measurement that addresses the heterogeneity of patient populations. Following recent statistical developments in health outcomes research, we conceptualize the task of quality measurement as a causal inference problem, which helps to target flexible covariate profiles that can represent specific populations of interest. To our knowledge, such covariate profiles have not been used in the quality measurement literature. We use different clinically relevant covariate profiles and evaluate methods for layered case-mix adjustments that combine weighting and regression modeling in a sequential manner, in order to reduce model extrapolation and allow for provider effect modification. We appraise these methods in an extensive simulation study and highlight the practical utility of weighting methods that warn the investigator when case-mix adjustments are infeasible without some form of extrapolation beyond the support of the data. In a study of cancer-care outcomes, we assess the performance of oncology practices for different profiles corresponding to the types of patients who may receive cancer care. We describe how the methods examined may be particularly important for high-stakes quality measurement, such as public reporting or performance-based payments. These methods may also be applied to support the health care decisions of individual patients and provide a path to personalized quality measurement.
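
One concrete way to realize weighting toward a covariate profile is entropy balancing, solved through its convex dual; the sketch below is a minimal version of that idea and is not the authors' exact procedure. The profile vector and data are toy placeholders.

    import numpy as np
    from scipy.optimize import minimize

    def profile_weights(X, target):
        """Entropy-balancing weights w_i >= 0 with sum w = 1 such that the
        weighted covariate mean equals `target`.  Solved via the convex
        dual: minimize log(sum_i exp(x_i . lam)) - target . lam."""
        def dual(lam):
            s = X @ lam
            m = s.max()                       # numerical stabilization
            logZ = m + np.log(np.exp(s - m).sum())
            w = np.exp(s - logZ)
            return logZ - target @ lam, X.T @ w - target
        res = minimize(dual, np.zeros(X.shape[1]), jac=True, method="BFGS")
        s = X @ res.x
        w = np.exp(s - s.max())
        return w / w.sum()

    rng = np.random.default_rng(3)
    X = rng.normal(size=(300, 2))             # patient covariates (toy)
    profile = np.array([0.5, -0.25])          # target covariate profile (toy)
    w = profile_weights(X, profile)
    print(np.round(w @ X, 3))                 # weighted mean matches the profile
    # A provider-effect regression fit on this reweighted sample gives the
    # layered weighting-plus-regression adjustment described above.

When the target profile sits outside (or near the boundary of) the support of the data, the dual diverges and the weights concentrate on a few observations, which is exactly the diagnostic signal that case-mix adjustment requires extrapolation.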

Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and then maximize the agreement of representations between the two views. Despite the prosperous development of graph CL methods, the design of graph augmentation schemes, a crucial component of CL, remains rarely explored. We argue that data augmentation schemes should preserve the intrinsic structures and attributes of graphs, forcing the model to learn representations that are insensitive to perturbations of unimportant nodes and edges. However, most existing methods adopt uniform data augmentation schemes, such as uniformly dropping edges and uniformly shuffling features, leading to suboptimal performance. In this paper, we propose a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors on the topological and semantic aspects of the graph. Specifically, on the topology level, we design augmentation schemes based on node centrality measures to highlight important connective structures. On the node attribute level, we corrupt node features by adding more noise to unimportant features, forcing the model to recognize the underlying semantic information. We perform extensive node classification experiments on a variety of real-world datasets. The results demonstrate that our method consistently outperforms existing state-of-the-art baselines and even surpasses some supervised counterparts, which validates the effectiveness of the proposed contrastive framework with adaptive augmentation.
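
On the topology level, one simplified reading of such a scheme assigns each edge a drop probability that decreases with the centrality of its endpoints, so important connective structure tends to survive augmentation. The sketch below uses degree centrality and ad hoc probability bounds as stand-ins for the paper's choices.

    import numpy as np

    def adaptive_edge_drop_probs(edges, num_nodes, p_min=0.1, p_max=0.7):
        """Assign each edge a drop probability that is high when the edge
        touches only low-centrality nodes (degree centrality here; other
        centrality measures could be substituted)."""
        deg = np.zeros(num_nodes)
        for u, v in edges:
            deg[u] += 1; deg[v] += 1
        # edge importance: mean log-degree of its endpoints
        imp = np.array([(np.log1p(deg[u]) + np.log1p(deg[v])) / 2
                        for u, v in edges])
        imp = (imp - imp.min()) / (imp.max() - imp.min() + 1e-12)
        return p_max - (p_max - p_min) * imp   # important edges dropped rarely

    edges = [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5)]   # hub at node 1
    probs = adaptive_edge_drop_probs(edges, num_nodes=6)
    rng = np.random.default_rng(4)
    view = [e for e, p in zip(edges, probs) if rng.random() > p]  # one graph view
    print(list(zip(edges, np.round(probs, 2))), view)

Sampling two such views independently, encoding both, and maximizing their agreement yields the contrastive objective; the attribute-level counterpart masks feature dimensions with probabilities derived analogously from feature importance.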

Inferring the most likely configuration of a subset of variables of a joint distribution given the remaining ones - which we refer to as co-generation - is an important challenge that is computationally demanding for all but the simplest settings. This task has received a considerable amount of attention, particularly for classical ways of modeling distributions such as structured prediction. In contrast, almost nothing is known about this task for recently proposed techniques for modeling high-dimensional distributions, particularly generative adversarial nets (GANs). Therefore, in this paper, we study the challenges that arise in co-generation with GANs. To address them, we develop a co-generation algorithm based on annealed importance sampling and Hamiltonian Monte Carlo. The presented approach significantly outperforms classical gradient-based methods on a synthetic dataset and on the CelebA and LSUN datasets.
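
The following toy sketch conveys the flavor of annealed sampling for co-generation: a linear map stands in for the generator, simple Langevin updates stand in for the Hamiltonian Monte Carlo transitions, and the data-fit term on the observed variables is annealed in gradually. All names and settings are illustrative.

    import numpy as np

    rng = np.random.default_rng(5)
    dz, dx = 4, 8
    A = rng.normal(size=(dx, dz))             # toy linear "generator" G(z) = A z
    x_full = A @ rng.normal(size=dz)
    mask = np.zeros(dx); mask[:5] = 1.0       # first 5 coordinates are observed

    def grad_energy(z, beta):
        """Energy: 0.5*||z||^2 + beta/2 * ||mask * (G(z) - x)||^2."""
        r = mask * (A @ z - x_full)
        return z + beta * A.T @ r

    z = rng.normal(size=dz)
    step = 5e-4
    for beta in np.linspace(0.0, 30.0, 300):  # anneal the data-fit term in
        for _ in range(5):                    # Langevin steps per temperature
            noise = np.sqrt(2 * step) * rng.normal(size=dz)
            z = z - step * grad_energy(z, beta) + noise
    # observed coordinates approximately matched; unobserved ones are sampled
    print(np.abs(mask * (A @ z - x_full)).max())

Starting from the prior (beta = 0) and slowly tightening the data-fit constraint lets the sampler cross the poor local optima that defeat purely gradient-based search on the latent space.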
