亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

A general framework with a series of different methods is proposed to improve the estimate of convex function (or functional) values when only noisy observations of the true input are available. Technically, our methods catch the bias introduced by the convexity and remove this bias from a baseline estimate. Theoretical analysis are conducted to show that the proposed methods can strictly reduce the expected estimate error under mild conditions. When applied, the methods require no specific knowledge about the problem except the convexity and the evaluation of the function. Therefore, they can serve as off-the-shelf tools to obtain good estimate for a wide range of problems, including optimization problems with random objective functions or constraints, and functionals of probability distributions such as the entropy and the Wasserstein distance. Numerical experiments on a wide variety of problems show that our methods can significantly improve the quality of the estimate compared with the baseline method.

相關內容

Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain many important challenges: Existing guarantees (1) typically only hold for the averaged iterates rather than the more desirable last iterates, (2) lack convergence metrics that capture the scales of the variables such as Wasserstein distances, and (3) mainly apply to elementary schemes such as stochastic gradient Langevin dynamics. In this paper, we develop a new framework that lifts the above issues by harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators. Beyond existing methods, our framework also motivates more efficient schemes that enjoy the same rigorous guarantees.

Stochastic versions of proximal methods have gained much attention in statistics and machine learning. These algorithms tend to admit simple, scalable forms, and enjoy numerical stability via implicit updates. In this work, we propose and analyze a stochastic version of the recently proposed proximal distance algorithm, a class of iterative optimization methods that recover a desired constrained estimation problem as a penalty parameter $\rho \rightarrow \infty$. By uncovering connections to related stochastic proximal methods and interpreting the penalty parameter as the learning rate, we justify heuristics used in practical manifestations of the proximal distance method, establishing their convergence guarantees for the first time. Moreover, we extend recent theoretical devices to establish finite error bounds and a complete characterization of convergence rates regimes. We validate our analysis via a thorough empirical study, also showing that unsurprisingly, the proposed method outpaces batch versions on popular learning tasks.

Compared to on-policy policy gradient techniques, off-policy model-free deep reinforcement learning (RL) that uses previously gathered data can improve sampling efficiency. However, off-policy learning becomes challenging when the discrepancy between the distributions of the policy of interest and the policies that collected the data increases. Although the well-studied importance sampling and off-policy policy gradient techniques were proposed to compensate for this discrepancy, they usually require a collection of long trajectories that increases the computational complexity and induce additional problems such as vanishing/exploding gradients or discarding many useful experiences. Moreover, their generalization to continuous action domains is strictly limited as they require action probabilities, which is unsuitable for deterministic policies. To overcome these limitations, we introduce a novel policy similarity measure to mitigate the effects of such discrepancy. Our method offers an adequate single-step off-policy correction without any probability estimates, and theoretical results show that it can achieve a contraction mapping with a fixed unique point, which allows "safe" off-policy learning. An extensive set of empirical results indicate that our algorithm substantially improves the state-of-the-art and attains higher returns in fewer steps than the competing methods by efficiently scheduling the learning rate in Q-learning and policy optimization.

Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes coming from the Euclidean space $\mathbb{R}^p$. However, it is increasingly common that complex datasets are best summarized as data points in non-linear spaces. In this paper, we present a novel framework of causal effects for outcomes from the Wasserstein space of cumulative distribution functions, which in contrast to the Euclidean space, is non-linear. We develop doubly robust estimators and associated asymptotic theory for these causal effects. As an illustration, we use our framework to quantify the causal effect of marriage on physical activity patterns using wearable device data collected through the National Health and Nutrition Examination Survey.

We provide adaptive inference methods, based on $\ell_1$ regularization, for regular (semi-parametric) and non-regular (nonparametric) linear functionals of the conditional expectation function. Examples of regular functionals include average treatment effects, policy effects, and derivatives. Examples of non-regular functionals include average treatment effects, policy effects, and derivatives conditional on a covariate subvector fixed at a point. We construct a Neyman orthogonal equation for the target parameter that is approximately invariant to small perturbations of the nuisance parameters. To achieve this property, we include the Riesz representer for the functional as an additional nuisance parameter. Our analysis yields weak ``double sparsity robustness'': either the approximation to the regression or the approximation to the representer can be ``completely dense'' as long as the other is sufficiently ``sparse''. Our main results are non-asymptotic and imply asymptotic uniform validity over large classes of models, translating into honest confidence bands for both global and local parameters.

A difficulty in MSE estimation occurs because we do not specify a full distribution for the survey weights. This obfuscates the use of fully parametric bootstrap procedures. To overcome this challenge, we develop a novel MSE estimator. We estimate the leading term in the MSE, which is the MSE of the best predictor (constructed with the true parameters), using the same simulated samples used to construct the basic predictor. We then exploit the asymptotic normal distribution of the parameter estimators to estimate the second term in the MSE, which reflects variability in the estimated parameters. We incorporate a correction for the bias of the estimator of the leading term without the use of computationally intensive double-bootstrap procedures. We further develop calibrated prediction intervals that rely less on normal theory than standard prediction intervals. We empirically demonstrate the validity of the proposed procedures through extensive simulation studies. We apply the methods to predict several functions of sheet and rill erosion for Iowa counties using data from a complex agricultural survey.

Digital sensors can lead to noisy results under many circumstances. To be able to remove the undesired noise from images, proper noise modeling and an accurate noise parameter estimation is crucial. In this project, we use a Poisson-Gaussian noise model for the raw-images captured by the sensor, as it fits the physical characteristics of the sensor closely. Moreover, we limit ourselves to the case where observed (noisy), and ground-truth (noise-free) image pairs are available. Using such pairs is beneficial for the noise estimation and is not widely studied in literature. Based on this model, we derive the theoretical maximum likelihood solution, discuss its practical implementation and optimization. Further, we propose two algorithms based on variance and cumulant statistics. Finally, we compare the results of our methods with two different approaches, a CNN we trained ourselves, and another one taken from literature. The comparison between all these methods shows that our algorithms outperform the others in terms of MSE and have good additional properties.

Feature selection plays a vital role in promoting the classifier's performance. However, current methods ineffectively distinguish the complex interaction in the selected features. To further remove these hidden negative interactions, we propose a GA-like dynamic probability (GADP) method with mutual information which has a two-layer structure. The first layer applies the mutual information method to obtain a primary feature subset. The GA-like dynamic probability algorithm, as the second layer, mines more supportive features based on the former candidate features. Essentially, the GA-like method is one of the population-based algorithms so its work mechanism is similar to the GA. Different from the popular works which frequently focus on improving GA's operators for enhancing the search ability and lowering the converge time, we boldly abandon GA's operators and employ the dynamic probability that relies on the performance of each chromosome to determine feature selection in the new generation. The dynamic probability mechanism significantly reduces the parameter number in GA that making it easy to use. As each gene's probability is independent, the chromosome variety in GADP is more notable than in traditional GA, which ensures GADP has a wider search space and selects relevant features more effectively and accurately. To verify our method's superiority, we evaluate our method under multiple conditions on 15 datasets. The results demonstrate the outperformance of the proposed method. Generally, it has the best accuracy. Further, we also compare the proposed model to the popular heuristic methods like POS, FPA, and WOA. Our model still owns advantages over them.

In solving multi-modal, multi-objective optimization problems (MMOPs), the objective is not only to find a good representation of the Pareto-optimal front (PF) in the objective space but also to find all equivalent Pareto-optimal subsets (PSS) in the variable space. Such problems are practically relevant when a decision maker (DM) is interested in identifying alternative designs with similar performance. There has been significant research interest in recent years to develop efficient algorithms to deal with MMOPs. However, the existing algorithms still require prohibitive number of function evaluations (often in several thousands) to deal with problems involving as low as two objectives and two variables. The algorithms are typically embedded with sophisticated, customized mechanisms that require additional parameters to manage the diversity and convergence in the variable and the objective spaces. In this letter, we introduce a steady-state evolutionary algorithm for solving MMOPs, with a simple design and no additional userdefined parameters that need tuning compared to a standard EA. We report its performance on 21 MMOPs from various test suites that are widely used for benchmarking using a low computational budget of 1000 function evaluations. The performance of the proposed algorithm is compared with six state-of-the-art algorithms (MO Ring PSO SCD, DN-NSGAII, TriMOEA-TA&R, CPDEA, MMOEA/DC and MMEA-WI). The proposed algorithm exhibits significantly better performance than the above algorithms based on the established metrics including IGDX, PSP and IGD. We hope this study would encourage design of simple, efficient and generalized algorithms to improve its uptake for practical applications.

Image segmentation is still an open problem especially when intensities of the interested objects are overlapped due to the presence of intensity inhomogeneity (also known as bias field). To segment images with intensity inhomogeneities, a bias correction embedded level set model is proposed where Inhomogeneities are Estimated by Orthogonal Primary Functions (IEOPF). In the proposed model, the smoothly varying bias is estimated by a linear combination of a given set of orthogonal primary functions. An inhomogeneous intensity clustering energy is then defined and membership functions of the clusters described by the level set function are introduced to rewrite the energy as a data term of the proposed model. Similar to popular level set methods, a regularization term and an arc length term are also included to regularize and smooth the level set function, respectively. The proposed model is then extended to multichannel and multiphase patterns to segment colourful images and images with multiple objects, respectively. It has been extensively tested on both synthetic and real images that are widely used in the literature and public BrainWeb and IBSR datasets. Experimental results and comparison with state-of-the-art methods demonstrate that advantages of the proposed model in terms of bias correction and segmentation accuracy.

北京阿比特科技有限公司