Partial MaxSAT (PMS) and Weighted Partial MaxSAT (WPMS) are both practical generalizations of the classic combinatorial problem MaxSAT. In this work, we propose an effective farsighted probabilistic sampling based local search algorithm, called FPS, for solving these two problems, denoted together as (W)PMS. FPS replaces the mechanism of flipping a single variable per iteration step, which is widely used in existing (W)PMS local search algorithms, with the proposed farsighted local search strategy, and provides higher-quality local optima. The farsighted strategy employs a probabilistic sampling technique that allows the algorithm to look ahead widely and efficiently. In this way, FPS can provide more and better search directions and improve performance without reducing efficiency. Extensive experiments on all the (W)PMS benchmarks from the incomplete track of the most recent four years of MaxSAT Evaluations demonstrate that our method significantly outperforms SATLike3.0, the state-of-the-art local search algorithm, on both PMS and WPMS. We further compare with SATLike-c, the extended solver of SATLike, which won three of the four categories (PMS and WPMS, each with two time limits) of the incomplete track in the recent MaxSAT Evaluation (MSE2021). We replace the local search component of SATLike-c with the proposed farsighted sampling local search approach, and the resulting solver, FPS-c, also outperforms SATLike-c on both PMS and WPMS.
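The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea of looking beyond single-variable flips by sampling, not the FPS algorithm itself. It assumes DIMACS-style integer literals, a fixed `hard_penalty` for each falsified hard clause, and a simple rule (sample random pairs of variables only when no single flip improves); all names and parameters are hypothetical.

```python
import random

def cost(assign, hard, soft, hard_penalty=1000):
    """Penalty of an assignment: each falsified hard clause costs hard_penalty,
    each falsified soft clause costs its own weight."""
    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)
    total = sum(hard_penalty for clause in hard if not satisfied(clause))
    total += sum(weight for clause, weight in soft if not satisfied(clause))
    return total

def flipped(assign, variables):
    """Return a copy of assign with the given variables negated."""
    new = dict(assign)
    for v in variables:
        new[v] = not new[v]
    return new

def lookahead_step(assign, hard, soft, rng, samples=10):
    """Take the best improving single flip; if none exists, sample pairs of flips."""
    base = cost(assign, hard, soft)
    variables = sorted(assign)
    best_move, best_cost = None, base
    for v in variables:
        c = cost(flipped(assign, [v]), hard, soft)
        if c < best_cost:
            best_move, best_cost = [v], c
    if best_move is None and len(variables) >= 2:
        # Assumed lookahead rule: random two-variable moves.
        for _ in range(samples):
            pair = rng.sample(variables, 2)
            c = cost(flipped(assign, pair), hard, soft)
            if c < best_cost:
                best_move, best_cost = pair, c
    if best_move is None:
        return assign, base
    return flipped(assign, best_move), best_cost

def local_search(num_vars, hard, soft, steps=100, seed=0):
    """Repeated improving steps from a random start; returns (assignment, cost)."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    best = cost(assign, hard, soft)
    for _ in range(steps):
        assign, best = lookahead_step(assign, hard, soft, rng)
    return assign, best
```

With the hard clause `[1, 2]` and soft clauses `([-1], 1)` and `([-2], 2)`, the assignment `{1: False, 2: True}` cannot be improved by any single flip, but flipping both variables together lowers the cost from 2 to 1.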

Related content

Networking · Better · Optimization · Optimizer · Maximum a posteriori

February 1, 2022

Blind Image Deconvolution Using Variational Deep Image Prior

Dong Huo,Abbas Masoumzadeh,Rafsanjany Kushol,Yee-Hong Yang

Conventional deconvolution methods utilize hand-crafted image priors to constrain the optimization. While deep-learning-based methods have simplified the optimization by end-to-end training, they fail to generalize well to blurs unseen in the training dataset. Thus, training image-specific models is important for higher generalization. Deep image prior (DIP) provides an approach to optimize the weights of a randomly initialized network with a single degraded image by maximum a posteriori (MAP), which shows that the architecture of a network can serve as the hand-crafted image prior. Unlike conventional hand-crafted image priors, which are obtained statistically, a proper network architecture is hard to find because the relationship between images and their corresponding network architectures is unclear. As a result, the network architecture cannot provide enough constraint for the latent sharp image. This paper proposes a new variational deep image prior (VDIP) for blind image deconvolution, which exploits additive hand-crafted image priors on latent sharp images and approximates a distribution for each pixel to avoid suboptimal solutions. Our mathematical analysis shows that the proposed method can better constrain the optimization. The experimental results further demonstrate that the generated images have better quality than those of the original DIP on benchmark datasets. The source code of our VDIP is available at //github.com/Dong-Huo/VDIP-Deconvolution.

Identifiable · Range · Evolutionary computation

January 31, 2022

Minimal Conditions for Beneficial Local Search

Mark G Wallace

from arxiv, 36 pages plus 19 pages of appendix

This paper investigates why it is beneficial, when solving a problem, to search in the neighbourhood of a current solution. The paper identifies properties of problems and neighbourhoods that support two novel proofs that neighbourhood search is beneficial over blind search. These are: firstly a proof that search within the neighbourhood is more likely to find an improving solution in a single search step than blind search; and secondly a proof that a local improvement, using a sequence of neighbourhood search steps, is likely to achieve a greater improvement than a sequence of blind search steps. To explore the practical impact of these properties, a range of problem sets and neighbourhoods are generated, where these properties are satisfied to different degrees. Experiments reveal that the benefits of neighbourhood search vary dramatically in consequence. Random problems of a classical combinatorial optimisation problem are analysed, in order to demonstrate that the underlying theory is reflected in practice.

INFORMS · Learning · Directed · Extensibility · Reducible

January 31, 2022

Information Directed Reward Learning for Reinforcement Learning

David Lindner,Matteo Turchetta,Sebastian Tschiatschek,Kamil Ciosek,Andreas Krause

from arxiv, Presented at Conference on Neural Information Processing Systems (NeurIPS), 2021

For many reinforcement learning (RL) applications, specifying a reward is difficult. This paper considers an RL setting where the agent obtains information about the reward only by querying an expert that can, for example, evaluate individual states or provide binary preferences over trajectories. From such expensive feedback, we aim to learn a model of the reward that allows standard RL algorithms to achieve high expected returns with as few expert queries as possible. To this end, we propose Information Directed Reward Learning (IDRL), which uses a Bayesian model of the reward and selects queries that maximize the information gain about the difference in return between plausibly optimal policies. In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types. Moreover, it achieves similar or better performance with significantly fewer queries by shifting the focus from reducing the reward approximation error to improving the policy induced by the reward model. We support our findings with extensive evaluations in multiple environments and with different query types.

Learning · Performer · Federated learning · Weight · SOFT

January 30, 2022

DearFSAC: An Approach to Optimizing Unreliable Federated Learning via Deep Reinforcement Learning

Chenghao Huang,Weilong Chen,Yuxi Chen,Shunji Yang,Yanru Zhang

In federated learning (FL), model aggregation has been widely adopted for data privacy. In recent years, assigning different weights to local models has been used to alleviate the FL performance degradation caused by differences between local datasets. However, when various defects make the FL process unreliable, most existing FL approaches show weak robustness. In this paper, we propose the DEfect-AwaRe federated soft actor-critic (DearFSAC) to dynamically assign weights to local models to improve the robustness of FL. The deep reinforcement learning algorithm soft actor-critic is adopted for near-optimal performance and stable convergence. In addition, an auto-encoder is trained to output low-dimensional embedding vectors that are further utilized to evaluate model quality. In the experiments, DearFSAC outperforms three existing approaches on four datasets for both independent and identically distributed (IID) and non-IID settings under defective scenarios.
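As a minimal sketch of what "assigning different weights to local models" means at aggregation time, the snippet below computes a weighted average of flat parameter vectors. In DearFSAC the weights come from a soft actor-critic agent; here they are simply passed in as fixed numbers, and the function name `aggregate` is hypothetical.

```python
def aggregate(local_models, weights):
    """Weighted average of equally sized parameter vectors.

    Weights are normalized by their sum, so they need not add up to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    dim = len(local_models[0])
    return [
        sum(w * model[i] for model, w in zip(local_models, weights)) / total
        for i in range(dim)
    ]
```

Giving a local model the weight 0 removes its influence entirely, which is the simplest way such a scheme can suppress a defective update.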

Optimizer · Learning · Variational distribution · Policy improvement · Reinforcement learning

January 28, 2022

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Zuxin Liu,Zhepeng Cen,Vladislav Isenbaev,Wei Liu,Zhiwei Steven Wu,Bo Li,Ding Zhao

from arxiv, 22 pages, 12 figures. Under review

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Primal-dual, a prevalent constrained optimization framework, suffers from instability issues and lacks optimality guarantees. This paper overcomes these issues from a novel probabilistic inference perspective and proposes an Expectation-Maximization style approach to learn a safe policy. We show that the safe RL problem can be decomposed into 1) a convex optimization phase with a non-parametric variational distribution and 2) a supervised learning phase. We show the unique advantages of constrained variational policy optimization by proving its optimality and policy improvement stability. A wide range of experiments on continuous robotic tasks show that the proposed method achieves significantly better performance in terms of constraint satisfaction and sample efficiency than primal-dual baselines.

MoDELS · Cluster · Processing (programming language) · INFORMS · Performer

May 8, 2021

Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval

Hongyin Tang,Xingwu Sun,Beihong Jin,Jingang Wang,Fuzheng Zhang,Wei Wu

from arxiv, 11 pages, 2 figures, Accepted by ACL 2021

Recently, retrieval models based on dense representations have been gradually applied in the first stage of document retrieval tasks, showing better performance than traditional sparse vector space models. To obtain high efficiency, the basic structure of these models is a Bi-encoder in most cases. However, this simple structure may cause serious information loss during the encoding of documents, since the encoder is agnostic to the queries. To address this problem, we design a method to mimic the queries on each of the documents by an iterative clustering process and represent the documents by multiple pseudo queries (i.e., the cluster centroids). To boost the retrieval process using an approximate nearest neighbor search library, we also optimize the matching function with a two-step score calculation procedure. Experimental results on several popular ranking and QA datasets show that our model can achieve state-of-the-art results.
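The idea of representing a document by cluster centroids can be sketched with plain k-means over its token embeddings. This is an illustration under assumptions, not the paper's procedure: the deterministic initialization (first `k` points), the fixed iteration count, and the max-dot-product `score` function are all simplifications chosen here, and the paper's two-step score calculation is not reproduced.

```python
def kmeans(points, k, iters=10):
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def score(query, pseudo_queries):
    """Simplified matching: largest dot product between the query and any centroid."""
    return max(sum(q * c for q, c in zip(query, pq)) for pq in pseudo_queries)
```

A document whose token embeddings form two groups then yields two pseudo queries, and a query aligned with either group scores highly against that document.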

Guidance · Performer · Extensibility · Path · state-of-the-art

February 8, 2021

Path Planning using Neural A* Search

Ryo Yonetani,Tatsunori Taniai,Mohammadamin Barekatain,Mai Nishimura,Asako Kanezaki

We present Neural A*, a novel data-driven search method for path planning problems. Despite the recent increasing attention to data-driven path planning, a machine learning approach to search-based planning is still challenging due to the discrete nature of search algorithms. In this work, we reformulate a canonical A* search algorithm to be differentiable and couple it with a convolutional encoder to form an end-to-end trainable neural network planner. Neural A* solves a path planning problem by encoding a problem instance to a guidance map and then performing the differentiable A* search with the guidance map. By learning to match the search results with ground-truth paths provided by experts, Neural A* can produce a path consistent with the ground truth accurately and efficiently. Our extensive experiments confirmed that Neural A* outperformed state-of-the-art data-driven planners in terms of the search optimality and efficiency trade-off, and furthermore, successfully predicted realistic human trajectories by directly performing search-based planning on natural image inputs.
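To make the role of a guidance map concrete, the sketch below runs ordinary (non-differentiable) A* on a grid in which entering a cell costs that cell's guidance value. Neural A* learns the map with a convolutional encoder and makes the search differentiable; neither is attempted here. The 4-connected neighbourhood, the strictly positive hand-written costs, and the scaled Manhattan heuristic are assumptions of this example.

```python
import heapq

def astar(guidance, start, goal):
    """A* on a 4-connected grid; entering cell (r, c) costs guidance[r][c] > 0."""
    rows, cols = len(guidance), len(guidance[0])
    min_cost = min(min(row) for row in guidance)

    def heuristic(node):
        # Manhattan distance scaled by the cheapest cell keeps the estimate admissible.
        return min_cost * (abs(node[0] - goal[0]) + abs(node[1] - goal[1]))

    open_heap = [(heuristic(start), 0.0, start)]
    best_g = {start: 0.0}
    parent = {}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], g
        if g > best_g[node]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_g = g + guidance[nr][nc]
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    parent[(nr, nc)] = node
                    heapq.heappush(
                        open_heap, (new_g + heuristic((nr, nc)), new_g, (nr, nc))
                    )
    return None, float("inf")
```

High guidance values steer the search away from a region without forbidding it: in a 3x3 map whose middle column costs 9 in the top two rows, the cheapest route from the top-left to the top-right corner detours through the bottom row.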

Triangle inequality · Marginalization · Metric learning · MoDELS · Learning

January 13, 2021

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Chen Ma,Liheng Ma,Yingxue Zhang,Ruiming Tang,Xue Liu,Mark Coates

from arxiv, Accepted by the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2020 Research Track)

Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this, we develop a distance-based recommendation model with several novel aspects: (i) each user and item are parameterized by Gaussian distributions to capture the learning uncertainties; (ii) an adaptive margin generation scheme is proposed to generate the margins regarding different training triplets; (iii) explicit user-user/item-item similarity modeling is incorporated in the objective function. The Wasserstein distance is employed to determine preferences because it obeys the triangle inequality and can measure the distance between probabilistic distributions. Via a comparison using five real-world datasets with state-of-the-art methods, the proposed model outperforms the best existing models by 4-22% in terms of recall@K on Top-K recommendation.
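For Gaussians with diagonal covariance, the 2-Wasserstein distance has a closed form: the squared distance is the squared difference of the means plus the squared difference of the standard deviations. The sketch below computes it; whether the paper uses exactly this variant and parameterization is an assumption here, and the function name is hypothetical.

```python
import math

def w2_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """2-Wasserstein distance between N(mu_a, diag(sigma_a^2)) and N(mu_b, diag(sigma_b^2)).

    sigma_a and sigma_b hold per-dimension standard deviations.
    """
    squared = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    squared += sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return math.sqrt(squared)
```

Because this equals the Euclidean distance between the concatenated vectors of means and standard deviations, it satisfies the triangle inequality, which a dot-product score from matrix factorization does not guarantee.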

Better · Reinforcement learning · Learning · Performer · Optimization

April 24, 2018

Accelerated Reinforcement Learning

K. Lakshmanan

from arxiv, The proof is not complete as it has to be shown the algorithm tracks the ODE

Policy gradient methods are widely used in reinforcement learning algorithms to search for better policies in the parameterized policy space. They perform gradient search in the policy space and are known to converge very slowly. Nesterov developed an accelerated gradient search algorithm for convex optimization problems. This has recently been extended to non-convex and also stochastic optimization. We use Nesterov's acceleration for policy gradient search in the well-known actor-critic algorithm and show convergence using the ODE method. We tested this algorithm on a scheduling problem, in which an incoming job is scheduled into one of four queues based on the queue lengths. The experimental results show that the algorithm using Nesterov's acceleration performs significantly better than the algorithm that does not use acceleration. To the best of our knowledge, this is the first time Nesterov's acceleration has been used with the actor-critic algorithm.
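The acceleration itself can be illustrated outside reinforcement learning. The sketch below compares plain gradient descent with a common momentum form of Nesterov's method on a small ill-conditioned quadratic; the objective, step size 0.03, and momentum 0.9 are assumptions chosen for the example, and nothing here reproduces the paper's actor-critic setting.

```python
def grad(p):
    """Gradient of f(x, y) = 0.5 * (x**2 + 25 * y**2)."""
    return [p[0], 25.0 * p[1]]

def loss(p):
    return 0.5 * (p[0] ** 2 + 25.0 * p[1] ** 2)

def gradient_descent(p, lr=0.03, steps=200):
    p = list(p)
    for _ in range(steps):
        g = grad(p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

def nesterov(p, lr=0.03, momentum=0.9, steps=200):
    p = list(p)
    v = [0.0] * len(p)
    for _ in range(steps):
        # Evaluate the gradient at the look-ahead point, not at the current iterate.
        lookahead = [pi + momentum * vi for pi, vi in zip(p, v)]
        g = grad(lookahead)
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        p = [pi + vi for pi, vi in zip(p, v)]
    return p
```

Starting both methods from `[1.0, 1.0]`, the accelerated iterate ends with a lower loss after the same 200 steps, because the slow direction of the quadratic contracts faster per step than under plain gradient descent.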

MoDELS · SimPLe · CC · Model evaluation · Gaussian mixture (model)

February 24, 2018

The Search Problem in Mixture Models

Avik Ray,Joe Neeman,Sujay Sanghavi,Sanjay Shakkottai

We consider the task of learning the parameters of a single component of a mixture model, for the case when we are given side information about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and lower computational complexity than existing moment-based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvement in runtime and accuracy.