亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

The change in the least squares estimator (LSE) of a vector of regression coefficients due to a case deletion is often used for investigating the influence of an observation on the LSE. A normalization of the change in the LSE using the Moore-Penrose inverse of the covariance matrix of the change in the LSE is derived. This normalization turns out to be a square of the internally studentized residual. It is shown that the numerator term of Cook's distance does not in general have a chi-squared distribution except for a single case. An elaborate explanation about the inappropriateness of the choice of a scaling matrix defining Cook's distance is given. By reflecting a distributional property of the change in the LSE due to a case deletion, a new diagnostic measure that is a scalar is suggested. Three numerical examples are given for illustration.

相關內容

We consider the problem of training a classification model with group annotated training data. Recent work has established that, if there is distribution shift across different groups, models trained using the standard empirical risk minimization (ERM) objective suffer from poor performance on minority groups and that group distributionally robust optimization (Group-DRO) objective is a better alternative. The starting point of this paper is the observation that though Group-DRO performs better than ERM on minority groups for some benchmark datasets, there are several other datasets where it performs much worse than ERM. Inspired by ideas from the closely related problem of domain generalization, this paper proposes a new and simple algorithm that explicitly encourages learning of features that are shared across various groups. The key insight behind our proposed algorithm is that while Group-DRO focuses on groups with worst regularized loss, focusing instead, on groups that enable better performance even on other groups, could lead to learning of shared/common features, thereby enhancing minority performance beyond what is achieved by Group-DRO. Empirically, we show that our proposed algorithm matches or achieves better performance compared to strong contemporary baselines including ERM and Group-DRO on standard benchmarks on both minority groups and across all groups. Theoretically, we show that the proposed algorithm is a descent method and finds first order stationary points of smooth nonconvex functions.

A directed graph is oriented if it can be obtained by orienting the edges of a simple, undirected graph. For an oriented graph $G$, let $\beta(G)$ denote the size of a minimum feedback arc set, a smallest subset of edges whose deletion leaves an acyclic subgraph. A simple consequence of a result of Berger and Shor is that any oriented graph $G$ with $m$ edges satisfies $\beta(G) = m/2 - \Omega(m^{3/4})$. We observe that if an oriented graph $G$ has a fixed forbidden subgraph $B$, the upper bound of $\beta(G) = m/2 - \Omega(m^{3/4})$ is best possible as a function of the number of edges if $B$ is not bipartite, but the exponent $3/4$ in the lower order term can be improved if $B$ is bipartite. We also show that for every rational number $r$ between $3/4$ and $1$, there is a finite collection of digraphs $\mathcal{B}$ such that every $\mathcal{B}$-free digraph $G$ with $m$ edges satisfies $\beta(G) = m/2 - \Omega(m^r)$, and this bound is best possible up to the implied constant factor. The proof uses a connection to Tur\'an numbers and a result of Bukh and Conlon. Both of our upper bounds come equipped with randomized linear-time algorithms that construct feedback arc sets achieving those bounds. Finally, we give a characterization of quasirandom directed graphs via minimum feedback arc sets.

The idea of approximating the Shapley value of an n-person game by Monte Carlo simulation was first suggested by Mann and Shapley (1960) and they also introduced four different heuristical methods to reduce the estimation error. Since 1960, several statistical methods have been developed to reduce the standard deviation of the estimate. In this paper, we develop an algorithm that uses a pair of negatively correlated samples to reduce the variance of the estimate. Although the observations generated are not independent, the sample is ergodic (obeys the strong law of large numbers), hence the name "ergodic sampling". Unlike Shapley and Mann, we do not use heuristics, the algorithm uses a small sample to learn the best ergodic transformation for a given game. We illustrate the algorithm on eight games with different characteristics to test the performance and understand how the proposed algorithm works. The experiments show that this method has at least as low variance as an independent sample, and in five test games, it significantly improves the quality of the estimation, up to 75 percent.

We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest of its own right) for an analyst that may ask several non adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.

We investigate the feature compression of high-dimensional ridge regression using the optimal subsampling technique. Specifically, based on the basic framework of random sampling algorithm on feature for ridge regression and the A-optimal design criterion, we first obtain a set of optimal subsampling probabilities. Considering that the obtained probabilities are uneconomical, we then propose the nearly optimal ones. With these probabilities, a two step iterative algorithm is established which has lower computational cost and higher accuracy. We provide theoretical analysis and numerical experiments to support the proposed methods. Numerical results demonstrate the decent performance of our methods.

Super-Resolution is the process of generating a high-resolution image from a low-resolution image. A picture may be of lower resolution due to smaller spatial resolution, poor camera quality, as a result of blurring, or due to other possible degradations. Super-Resolution is the technique to improve the quality of a low-resolution photo by boosting its plausible resolution. The computer vision community has extensively explored the area of Super-Resolution. However, the previous Super-Resolution methods require vast amounts of data for training. This becomes problematic in domains where very few low-resolution, high-resolution pairs might be available. One of such areas is statistical downscaling, where super-resolution is increasingly being used to obtain high-resolution climate information from low-resolution data. Acquiring high-resolution climate data is extremely expensive and challenging. To reduce the cost of generating high-resolution climate information, Super-Resolution algorithms should be able to train with a limited number of low-resolution, high-resolution pairs. This paper tries to solve the aforementioned problem by introducing a semi-supervised way to perform super-resolution that can generate sharp, high-resolution images with as few as 500 paired examples. The proposed semi-supervised technique can be used as a plug-and-play module with any supervised GAN-based Super-Resolution method to enhance its performance. We quantitatively and qualitatively analyze the performance of the proposed model and compare it with completely supervised methods as well as other unsupervised techniques. Comprehensive evaluations show the superiority of our method over other methods on different metrics. We also offer the applicability of our approach in statistical downscaling to obtain high-resolution climate images.

We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD) recently proposed for the approximate solution of empirical risk minimization problems (ERM). We focus on loss functions that are strongly convex and smooth with condition number $\sigma$. We additionally assume the loss function is $\eta$-H\"older continuous with respect to the data. The oracle complexity of LPI-GD is $\tilde{O}\left(\sigma m^d \log(1/\varepsilon)\right)$ for a desired accuracy $\varepsilon$, where $d$ is the dimension of the parameter space, and $m$ is the cardinality of an approximation grid. The factor $m^d$ can be shown to scale as $O((1/\varepsilon)^{d/2\eta})$. LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) for certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show an oracle complexity of $\tilde{O}\left(\sqrt{\sigma} m^d \log(1/\varepsilon)\right)$. Moreover, we provide the first empirical study on local polynomial interpolation-based gradient methods and corroborate that LPI-GD has better performance than GD and SGD in some scenarios, and the proposed methods achieve acceleration.

The inverse probability (IPW) and doubly robust (DR) estimators are often used to estimate the average causal effect (ATE), but are vulnerable to outliers. The IPW/DR median can be used for outlier-resistant estimation of the ATE, but the outlier resistance of the median is limited and it is not resistant enough for heavy contamination. We propose extensions of the IPW/DR estimators with density power weighting, which can eliminate the influence of outliers almost completely. The outlier resistance of the proposed estimators is evaluated through the unbiasedness of the estimating equations. Unlike the median-based methods, our estimators are resistant to outliers even under heavy contamination. Interestingly, the naive extension of the DR estimator requires bias correction to keep the double robustness even under the most tractable form of contamination. In addition, the proposed estimators are found to be highly resistant to outliers in more difficult settings where the contamination ratio depends on the covariates. The outlier resistance of our estimators from the viewpoint of the influence function is also favorable. Our theoretical results are verified via Monte Carlo simulations and real data analysis. The proposed methods were found to have more outlier resistance than the median-based methods and estimated the potential mean with a smaller error than the median-based methods.

Applications of machine learning in healthcare often require working with time-to-event prediction tasks including prognostication of an adverse event, re-hospitalization or death. Such outcomes are typically subject to censoring due to loss of follow up. Standard machine learning methods cannot be applied in a straightforward manner to datasets with censored outcomes. In this paper, we present auton-survival, an open-source repository of tools to streamline working with censored time-to-event or survival data. auton-survival includes tools for survival regression, adjustment in the presence of domain shift, counterfactual estimation, phenotyping for risk stratification, evaluation, as well as estimation of treatment effects. Through real world case studies employing a large subset of the SEER oncology incidence data, we demonstrate the ability of auton-survival to rapidly support data scientists in answering complex health and epidemiological questions.

Sufficient dimension reduction (SDR) is a successful tool in regression models. It is a feasible method to solve and analyze the nonlinear nature of the regression problems. This paper introduces the \textbf{itdr} R package that provides several functions based on integral transformation methods to estimate the SDR subspaces in a comprehensive and user-friendly manner. In particular, the \textbf{itdr} package includes the Fourier method (FM) and the convolution method (CM) of estimating the SDR subspaces such as the central mean subspace (CMS) and the central subspace (CS). In addition, the \textbf{itdr} package facilitates the recovery of the CMS and the CS by using the iterative Hessian transformation (IHT) method and the Fourier transformation approach for inverse dimension reduction method (invFM), respectively. Moreover, the use of the package is illustrated by three datasets. \textcolor{black}{Furthermore, this is the first package that implements integral transformation methods to estimate SDR subspaces. Hence, the \textbf{itdr} package may provide a huge contribution to research in the SDR field.

北京阿比特科技有限公司