
This work addresses inverse linear optimization where the goal is to infer the unknown cost vector of a linear program. Specifically, we consider the data-driven setting in which the available data are noisy observations of optimal solutions that correspond to different instances of the linear program. We introduce a new formulation of the problem that, compared to other existing methods, allows the recovery of a less restrictive and generally more appropriate admissible set of cost estimates. It can be shown that this inverse optimization problem yields a finite number of solutions, and we develop an exact two-phase algorithm to determine all such solutions. Moreover, we propose an efficient decomposition algorithm to solve large instances of the problem. The algorithm extends naturally to an online learning environment where it can be used to provide quick updates of the cost estimate as new data becomes available over time. For the online setting, we further develop an effective adaptive sampling strategy that guides the selection of the next samples. The efficacy of the proposed methods is demonstrated in computational experiments involving two applications, customer preference learning and cost estimation for production planning. The results show significant reductions in computation and sampling efforts.
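A minimal sketch may help fix ideas for the data-driven setting: given noisy observations of optimal solutions for several LP instances, candidate cost vectors can be scored by the total suboptimality they assign to the observations. The dimensions, grid search, and loss below are illustrative assumptions, not the paper's formulation or algorithm.

```python
# Illustrative sketch of data-driven inverse LP (not the paper's exact
# formulation): score candidate cost vectors by the total suboptimality
# of the noisy observed solutions, and keep the best-scoring candidate.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Synthetic instances: maximize c^T x over {x >= 0 : A_i x <= b_i} in R^2.
c_true = np.array([1.0, 2.0])
instances = []
for _ in range(20):
    A = rng.uniform(0.5, 1.5, size=(3, 2))
    b = rng.uniform(1.0, 2.0, size=3)
    x_opt = linprog(-c_true, A_ub=A, b_ub=b, bounds=[(0, None)] * 2).x
    instances.append((A, b, x_opt + rng.normal(scale=0.02, size=2)))

def suboptimality(c):
    """Sum over instances of max_x c^T x - c^T x_obs (>= 0 up to noise)."""
    total = 0.0
    for A, b, x_obs in instances:
        res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
        total += -res.fun - c @ x_obs
    return total

# Grid search over cost directions on the unit circle (illustration only).
angles = np.linspace(0.0, np.pi / 2, 91)
candidates = np.stack([np.cos(angles), np.sin(angles)], axis=1)
c_hat = min(candidates, key=suboptimality)
print("estimated cost direction:", c_hat)
```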

Related content

We develop a novel Markov chain Monte Carlo (MCMC) method that exploits a hierarchy of models of increasing complexity to efficiently generate samples from an unnormalized target distribution. Broadly, the method rewrites the Multilevel MCMC approach of Dodwell et al. (2015) in terms of the Delayed Acceptance (DA) MCMC of Christen & Fox (2005). In particular, DA is extended to use a hierarchy of models of arbitrary depth and to allow subchains of arbitrary length. We show that the algorithm satisfies detailed balance, hence is ergodic for the target distribution. Furthermore, a multilevel variance reduction scheme is derived that exploits the multiple levels and subchains, and an adaptive multilevel correction to coarse-level biases is developed. Three numerical examples of Bayesian inverse problems are presented that demonstrate the advantages of these novel methods. The software and examples are available in PyMC3.
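The delayed-acceptance building block that the method generalizes can be sketched in a few lines: a cheap coarse posterior screens each proposal, and the expensive fine posterior is evaluated only when the screen passes. The two log-posteriors and the random-walk proposal below are illustrative stand-ins; the paper's algorithm additionally uses deeper hierarchies, subchains, and adaptive bias corrections.

```python
# Two-level delayed acceptance Metropolis-Hastings (minimal sketch of the
# DA building block; stand-in log-posteriors, symmetric random-walk proposal).
import numpy as np

rng = np.random.default_rng(1)

def log_post_fine(x):    # expensive model (stand-in: exact Gaussian)
    return -0.5 * x**2

def log_post_coarse(x):  # cheap approximation (stand-in: biased Gaussian)
    return -0.5 * (x / 1.2)**2

def da_mh(x0, n_steps, step=1.0):
    chain, x = [x0], x0
    for _ in range(n_steps):
        y = x + step * rng.normal()
        # Stage 1: screen the proposal with the coarse model only.
        a1 = min(1.0, np.exp(log_post_coarse(y) - log_post_coarse(x)))
        if rng.uniform() < a1:
            # Stage 2: correct with the fine model; the coarse ratio is
            # divided out so detailed balance holds for the fine target.
            a2 = min(1.0, np.exp((log_post_fine(y) - log_post_fine(x))
                                 - (log_post_coarse(y) - log_post_coarse(x))))
            if rng.uniform() < a2:
                x = y
        chain.append(x)
    return np.array(chain)

samples = da_mh(0.0, 20000)
print("mean %.3f, var %.3f" % (samples.mean(), samples.var()))
```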

Dictionary learning consists of finding a sparse representation from noisy data and is a common way to encode data-driven prior knowledge on signals. Alternating minimization (AM) is standard for the underlying optimization, where gradient descent steps alternate with sparse coding procedures. The major drawback of this method is its prohibitive computational cost, which makes it impractical on large real-world data sets. This work studies an approximate formulation of dictionary learning based on unrolling and compares it to alternating minimization to find the best trade-off between speed and precision. We analyze the asymptotic behavior and convergence rate of gradient estimates in both methods. We show that unrolling performs better on the support of the inner problem's solution and during the first iterations. Finally, we apply unrolling to pattern learning in magnetoencephalography (MEG) with the help of a stochastic algorithm and compare its performance to a state-of-the-art method.
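For concreteness, a minimal AM baseline of the kind studied here might look as follows: ISTA for the sparse codes, then a projected gradient step on the dictionary. The data, penalty, and step sizes are placeholders; the unrolled variant would instead backpropagate through a fixed number of inner ISTA iterations rather than running them to convergence.

```python
# Minimal alternating minimization for dictionary learning (illustrative):
# ISTA for the sparse codes, a projected gradient step for the dictionary.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))         # data: 50-dim signals, 200 samples
D = rng.normal(size=(50, 10))
D /= np.linalg.norm(D, axis=0)         # unit-norm atoms

lam, n_outer, n_ista = 0.1, 30, 50

for _ in range(n_outer):
    # --- sparse coding: ISTA on 0.5||X - DZ||^2 + lam||Z||_1 ---
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    Z = np.zeros((10, 200))
    for _ in range(n_ista):
        G = D.T @ (D @ Z - X)
        Z = Z - G / L
        Z = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft-threshold
    # --- dictionary update: gradient step, then renormalize atoms ---
    grad_D = (D @ Z - X) @ Z.T
    D -= 0.1 / (np.linalg.norm(Z, 2) ** 2 + 1e-8) * grad_D
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12

print("final reconstruction error:", np.linalg.norm(X - D @ Z))
```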

In this paper, we propose Nesterov Accelerated Shuffling Gradient (NASG), a new algorithm for convex finite-sum minimization problems. Our method integrates the traditional Nesterov acceleration momentum with different shuffling sampling schemes. We show that our algorithm has an improved rate of $\mathcal{O}(1/T)$ using unified shuffling schemes, where $T$ is the number of epochs. This rate is better than that of any other shuffling gradient method in the convex regime. Our convergence analysis does not require a bounded-domain assumption or a bounded-gradient condition. For randomized shuffling schemes, we improve the convergence bound further. Under an additional initial condition, we show that our method converges faster in a small neighborhood of the solution. Numerical simulations demonstrate the efficiency of our algorithm.
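The two ingredients, random reshuffling per epoch plus a Nesterov-style momentum step, can be combined roughly as below. This is a sketch under assumed step sizes and momentum weights; the paper's exact NASG update may differ.

```python
# Sketch of a Nesterov-accelerated shuffling gradient loop (illustrative).
# Each epoch reshuffles the component functions and sweeps through them once;
# momentum is applied between epochs at a look-ahead point.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)                # finite sum: f_i(w) = 0.5 (a_i w - b_i)^2

def grad_i(w, i):
    return A[i] * (A[i] @ w - b[i])

w = np.zeros(d)
w_prev = w.copy()
lr, T = 0.01, 50
for t in range(1, T + 1):
    beta = (t - 1) / (t + 2)          # Nesterov-style momentum weight
    y = w + beta * (w - w_prev)       # look-ahead point, updated per epoch
    w_prev = w.copy()
    perm = rng.permutation(n)         # random reshuffling
    v = y.copy()
    for i in perm:
        v -= lr * grad_i(v, i)        # one pass of shuffled gradient steps
    w = v

print("loss:", 0.5 * np.mean((A @ w - b) ** 2))
```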

Taking advantage of available side information can boost the performance of recommender systems; nevertheless, with the rise of big data, such side information is often high-dimensional. It is therefore imperative to develop decision-making algorithms that can cope with a high-dimensional context in real time, which is especially challenging when the decision-maker has a large variety of items to recommend. In this paper, we build upon the linear contextual multi-armed bandit framework to address this problem. We develop a decision-making policy for a linear bandit problem with high-dimensional context vectors and many arms. Our policy employs Thompson sampling and feeds it with reduced context vectors, obtained by dimensionality reduction through random projection. Our proposed recommender system follows this policy to learn the item preferences of users online while keeping its runtime as low as possible. We prove a regret bound that scales with the reduced dimension instead of the original one. For numerical evaluation, we use our algorithm to build a recommender system and apply it to real-world datasets. The theoretical and numerical results demonstrate the effectiveness of our proposed algorithm compared to the state of the art in terms of computational complexity and regret performance.
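A sketch of the core mechanism: project the high-dimensional context once through a fixed random matrix, then run standard linear Thompson sampling entirely in the reduced space. The priors, noise scales, and reward model below are assumptions for illustration, not the paper's exact policy.

```python
# Linear Thompson sampling on randomly projected contexts (illustrative).
import numpy as np

rng = np.random.default_rng(0)
D, m, K, T = 100, 10, 5, 1000       # ambient dim, reduced dim, arms, rounds
P = rng.normal(size=(m, D)) / np.sqrt(m)    # fixed random projection
theta_true = rng.normal(size=(K, D))        # unknown per-arm parameters

B = [np.eye(m) for _ in range(K)]   # per-arm posterior precision (reduced)
f = [np.zeros(m) for _ in range(K)] # per-arm reward-weighted features

for t in range(T):
    x = rng.normal(size=D)          # high-dimensional context
    z = P @ x                       # reduced context fed to the policy
    scores = []
    for k in range(K):              # Thompson sampling in the reduced space
        cov = np.linalg.inv(B[k])
        theta = rng.multivariate_normal(cov @ f[k], cov)
        scores.append(theta @ z)
    a = int(np.argmax(scores))
    r = theta_true[a] @ x + rng.normal(scale=0.1)
    B[a] += np.outer(z, z)          # Bayesian linear-regression update
    f[a] += r * z
```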

A new approach to $L_2$-consistent estimation of a general density functional using $k$-nearest neighbor distances is proposed, where the functional under consideration is in the form of the expectation of some function $f$ of the densities at each point. The estimator is designed to be asymptotically unbiased, using the convergence of the normalized volume of a $k$-nearest neighbor ball to a Gamma distribution in the large-sample limit, and naturally involves the inverse Laplace transform of a scaled version of the function $f.$ Some instantiations of the proposed estimator recover existing $k$-nearest neighbor based estimators of Shannon and R\'enyi entropies and Kullback--Leibler and R\'enyi divergences, and discover new consistent estimators for many other functionals such as logarithmic entropies and divergences. The $L_2$-consistency of the proposed estimator is established for a broad class of densities for general functionals, and the convergence rate in mean squared error is established as a function of the sample size for smooth, bounded densities.
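One concrete instantiation mentioned above, the Kozachenko-Leonenko $k$-nearest neighbor estimator of Shannon entropy, can be written compactly. The sketch below uses the standard KL constants and is an illustration of that special case rather than the paper's general estimator.

```python
# Kozachenko-Leonenko k-NN estimator of Shannon (differential) entropy,
# one of the existing estimators recovered by the general framework.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(X, k=3):
    n, d = X.shape
    tree = cKDTree(X)
    # distance to the k-th neighbor (the query point itself is at index 0)
    r = tree.query(X, k=k + 1)[0][:, k]
    log_vol_unit_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return (digamma(n) - digamma(k) + log_vol_unit_ball
            + d * np.mean(np.log(r + 1e-300)))

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
print("estimate:", knn_entropy(X))
print("true Gaussian entropy:", np.log(2 * np.pi * np.e))
```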

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on parallel architectures. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, and (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at $\mathcal{O}(\log T)$ cost on parallel hardware.

We consider hyper-differential sensitivity analysis (HDSA) of nonlinear Bayesian inverse problems governed by PDEs with infinite-dimensional parameters. In previous works, HDSA has been used to assess the sensitivity of the solution of deterministic inverse problems to additional model uncertainties and to different types of measurement data. In the present work, we extend HDSA to the class of Bayesian inverse problems governed by PDEs. The focus is on assessing the sensitivity of certain key quantities derived from the posterior distribution. Specifically, we analyze the sensitivity of the maximum a posteriori (MAP) point and the Bayes risk, making full use of the information embedded in the Bayesian inverse problem. After establishing our mathematical framework for HDSA of Bayesian inverse problems, we present a detailed computational approach for computing the proposed HDSA indices. We examine the effectiveness of the proposed approach on a model inverse problem governed by a PDE for heat conduction.
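To make the MAP-point sensitivity concrete, one standard construction (the paper's precise definitions may differ) applies the implicit function theorem to the parameterized optimality condition. With $\mathcal{J}(m;\theta)$ the negative log-posterior and $\theta$ collecting the auxiliary model uncertainties,
$$
m^\star(\theta) = \operatorname*{arg\,min}_m \, \mathcal{J}(m;\theta),
\qquad
\frac{\partial m^\star}{\partial \theta}
= -\big(\nabla^2_{mm}\mathcal{J}\big)^{-1}\,\nabla^2_{m\theta}\mathcal{J}\,\Big|_{m = m^\star(\theta)},
$$
and sensitivity indices are then obtained from norms of these derivatives in directions of interest.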

In real-world settings, numerous constraints are present which are hard to specify mathematically. However, for the real-world deployment of reinforcement learning (RL), it is critical that RL agents are aware of these constraints so that they can act safely. In this work, we consider the problem of learning constraints from demonstrations of a constraint-abiding agent's behavior. Previous works in this regard have mainly been restricted to tabular (discrete) settings or specific types of constraints, or have assumed knowledge of the environment's transition dynamics. In contrast, our framework is able to learn arbitrary \textit{Markovian} constraints in high dimensions in a completely model-free setting. We experimentally validate our approach and show that our framework can successfully learn the most likely constraints that the agent respects. We further show that these learned constraints are \textit{transferable} to new agents that may have different morphologies and/or reward functions. The code can be found at: \url{//github.com/shehryar-malik/icrl}.
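A tabular caricature (far simpler than the model-free, high-dimensional setting handled above) conveys the core intuition: states that nominally optimal behavior would traverse but demonstrations consistently avoid are candidate constraints. The grid, demonstrations, and nominal path below are invented for illustration.

```python
# Tabular caricature of constraint inference (illustration only; the method
# described above is model-free and handles continuous, high-dimensional
# settings). Candidate constraints: states the nominal optimal route visits
# but the demonstrator never does.

def step(s, a):                      # 5x5 grid; a: 0 = right, 1 = down
    r, c = divmod(s, 5)
    if a == 0 and c < 4: c += 1
    if a == 1 and r < 4: r += 1
    return 5 * r + c

goal = 24
# Demonstrations (state sequences) that detour around the center cell 12.
demos = [[0, 1, 2, 7, 8, 13, 18, 19, 24],
         [0, 5, 6, 11, 16, 17, 22, 23, 24]]
demo_visits = {s for tr in demos for s in tr}

# A nominally optimal (shortest) path under the unconstrained model.
nominal_visits, s = set(), 0
while s != goal:
    s = step(s, 0 if s % 5 <= s // 5 else 1)   # greedy diagonal-ish route
    nominal_visits.add(s)

print("candidate constrained states:", sorted(nominal_visits - demo_visits))
```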

Active learning from demonstration allows a robot to query a human for specific types of input to achieve efficient learning. Existing work has explored a variety of active query strategies; however, to our knowledge, none of these strategies directly minimize the performance risk of the policy the robot is learning. Utilizing recent advances in performance bounds for inverse reinforcement learning, we propose a risk-aware active inverse reinforcement learning algorithm that focuses active queries on areas of the state space with the potential for large generalization error. We show that risk-aware active learning outperforms standard active IRL approaches on gridworld, simulated driving, and table setting tasks, while also providing a performance-based stopping criterion that allows a robot to know when it has received enough demonstrations to safely perform a task.
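A myopic sketch of a risk-aware query rule: given posterior samples over reward weights (e.g., from Bayesian IRL), query the state whose action regret has the largest $(1-\delta)$-quantile across samples. The features, posterior, and one-step regret below are stand-ins for the paper's IRL performance bounds, which account for full policy values rather than single actions.

```python
# Risk-aware query selection via a value-at-risk of per-state regret
# (illustrative stand-in for the paper's performance-bound machinery).
import numpy as np

rng = np.random.default_rng(0)
nS, nA, nSamples, delta = 20, 4, 200, 0.05
feats = rng.normal(size=(nS, nA, 3))           # state-action features
w_samples = rng.normal(size=(nSamples, 3))     # posterior over reward weights
w_map = w_samples.mean(axis=0)                 # current reward estimate

# Policy under the current estimate (myopic stand-in for the learned policy).
pi = (feats @ w_map).argmax(axis=1)

# Per-sample regret of pi's action at each state, vs. that sample's best action.
Q = feats @ w_samples.T                        # shape (nS, nA, nSamples)
regret = Q.max(axis=1) - Q[np.arange(nS), pi]  # shape (nS, nSamples)
var = np.quantile(regret, 1 - delta, axis=1)   # (1-delta)-quantile per state
query_state = int(np.argmax(var))
print("query state:", query_state, "risk:", var[query_state])
```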

In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. The optimal cost function of the aggregate problem, a nonlinear function of the features, serves as an architecture for approximation in value space of the optimal cost function or the cost functions of policies of the original problem. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with reinforcement learning based on deep neural networks, which is used to obtain the needed features. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation, than by the linear function of the features provided by deep reinforcement learning, thereby potentially leading to more effective policy improvement.
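Hard aggregation with a uniform disaggregation distribution gives a minimal concrete instance of this idea. The MDP, feature map, and disaggregation weights below are synthetic assumptions; the approach described above would additionally obtain the features from a deep network and embed this step inside policy iteration.

```python
# Minimal sketch of feature-based (hard) aggregation for a discounted MDP:
# states sharing a feature value form one aggregate state; the aggregate
# problem is solved exactly and its cost function is lifted back.
import numpy as np

rng = np.random.default_rng(0)
nS, nA, gamma = 12, 2, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, t]
g = rng.uniform(size=(nS, nA))                  # stage costs

phi = np.arange(nS) // 3                        # feature map -> 4 aggregates
nX = phi.max() + 1
Phi = np.zeros((nS, nX)); Phi[np.arange(nS), phi] = 1.0   # membership matrix
W = Phi.T / Phi.sum(axis=0, keepdims=True).T              # uniform disagg.

# Aggregate dynamics and costs: disaggregate, step, re-aggregate.
P_agg = np.einsum('xs,sat,ty->xay', W, P, Phi)
g_agg = W @ g

J = np.zeros(nX)                                # value iteration, aggregate
for _ in range(1000):
    J = (g_agg + gamma * np.einsum('xay,y->xa', P_agg, J)).min(axis=1)

J_tilde = J[phi]    # piecewise-constant approximation of the original costs
print(np.round(J_tilde, 3))
```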
