An evolutionary approach for computing the winning strategy for Nim-like games is proposed in this paper. The winning strategy is computed using the Multi Expression Programming (MEP) technique, a fast and efficient variant of Genetic Programming (GP). Each play strategy is represented by a mathematical expression that contains mathematical operators (such as +, -, *, mod, div, and, or, xor, not) and operands (encoding the current game state). Several numerical experiments for computing the winning strategy for the Nim game are performed, and the computational effort needed for evolving a winning strategy is reported. The results show that the proposed evolutionary approach is well suited for computing the winning strategy for Nim-like games.
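
As a concrete illustration, here is a minimal sketch (hypothetical, not the paper's implementation) of how an MEP chromosome can encode a Nim strategy: each gene is either a terminal (a heap size) or an operator applied to the results of earlier genes, so a single linear chromosome encodes several expressions at once. The chromosome below computes the nim-sum, the classic winning-strategy kernel that such an evolutionary run can rediscover.

    OPS = {
        "+": lambda a, b: a + b,
        "xor": lambda a, b: a ^ b,
        "mod": lambda a, b: a % b if b else 0,   # protected mod
    }

    def eval_mep(chromosome, heaps):
        values = []
        for gene in chromosome:
            if isinstance(gene, int):            # terminal: index of a heap
                values.append(heaps[gene])
            else:                                # operator over earlier genes
                op, i, j = gene
                values.append(OPS[op](values[i], values[j]))
        return values[-1]                        # last gene is the program output

    # Nim-sum of three heaps: the mover loses iff the nim-sum is 0.
    nim_sum = [0, 1, 2, ("xor", 0, 1), ("xor", 3, 2)]
    print(eval_mep(nim_sum, [3, 4, 7]))          # -> 0, a losing position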

In this paper, we propose an offline-online strategy based on the Localized Orthogonal Decomposition (LOD) method for elliptic multiscale problems with a randomly perturbed diffusion coefficient. We consider a periodic deterministic coefficient with local defects that occur with probability $p$. The offline phase pre-computes entries of global LOD stiffness matrices on a single reference element (exploiting the periodicity) for a selection of defect configurations. In the online phase, given a sample of the perturbed diffusion coefficient, the corresponding LOD stiffness matrix is computed as a linear combination of the pre-computed entries. Our computable error estimates show that this yields a good coarse-scale approximation of the solution for small $p$, which is illustrated by extensive numerical experiments. This makes the proposed technique attractive even for moderate sample sizes in a Monte Carlo simulation.
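
A minimal sketch of the offline-online splitting, on a plain 1D P1 finite element problem rather than the LOD method itself (all names and the setup are illustrative): the offline phase pre-computes per-element global stiffness contributions once, and each online sample of the defect configuration only rescales and sums them.

    import numpy as np

    def offline(n):
        # per-element global stiffness matrices for a unit coefficient
        h = 1.0 / n
        local = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        mats = []
        for e in range(n):
            K_e = np.zeros((n + 1, n + 1))
            K_e[e:e + 2, e:e + 2] = local
            mats.append(K_e)
        return mats

    def online_solve(mats, p, alpha, rng):
        n = len(mats)
        coeff = np.where(rng.random(n) < p, alpha, 1.0)  # defect w.p. p
        K = sum(a * K_e for a, K_e in zip(coeff, mats))  # linear combination only
        f = np.full(n + 1, 1.0 / n)
        K, f = K[1:-1, 1:-1], f[1:-1]                    # homogeneous Dirichlet BCs
        return np.linalg.solve(K, f)

    rng = np.random.default_rng(0)
    mats = offline(64)                                   # offline, done once
    u = online_solve(mats, p=0.1, alpha=0.01, rng=rng)   # one Monte Carlo sample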

As the field of machine learning for combinatorial optimization advances, traditional problems resurface and are readdressed through this new perspective. The overwhelming majority of the literature focuses on small graph problems, while many real-world problems involve large graphs. Here, we focus on two such problems: influence estimation, a #P-hard counting problem, and influence maximization, an NP-hard problem. We develop GLIE, a Graph Neural Network (GNN) that inherently parameterizes an upper bound on influence estimation, and train it on small simulated graphs. Experiments show that GLIE provides accurate influence estimation for real graphs up to 10 times larger than the training set. More importantly, it can be used for influence maximization on considerably larger graphs, as the ranking of its predictions is not affected by the drop in accuracy. We develop a version of CELF optimization with GLIE in place of simulated influence estimation, surpassing the benchmarks for influence maximization, albeit with a computational overhead. To balance time complexity and influence quality, we propose two different approaches. The first is a Q-network that learns to choose seeds sequentially using GLIE's predictions. The second defines a provably submodular function based on GLIE's representations to rank nodes quickly while building the seed set. The latter provides the best combination of time efficiency and influence spread, outperforming SOTA benchmarks.
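
The CELF component is independent of how influence is estimated, so a learned estimator such as GLIE can be plugged in wherever simulation is used. Below is a sketch of the lazy-greedy CELF skeleton around an arbitrary influence oracle sigma; the toy one-hop coverage function merely stands in for GLIE's predictions. Submodularity of sigma makes stale heap entries valid upper bounds, which is what lets CELF skip most re-evaluations.

    import heapq

    def celf(sigma, nodes, k):
        heap = [(-sigma(frozenset([v])), v, 0) for v in nodes]
        heapq.heapify(heap)
        seeds, base = set(), 0.0
        while len(seeds) < k and heap:
            neg_gain, v, stamp = heapq.heappop(heap)
            if stamp == len(seeds):        # marginal gain is fresh: take v
                seeds.add(v)
                base = sigma(frozenset(seeds))
            else:                          # stale: re-evaluate and push back
                gain = sigma(frozenset(seeds | {v})) - base
                heapq.heappush(heap, (-gain, v, len(seeds)))
        return seeds

    graph = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {4, 5}, 4: {5}, 5: set()}
    coverage = lambda S: len(set(S).union(*[graph[v] for v in S]))
    print(celf(coverage, list(graph), k=2))   # -> {0, 3}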

Goal-achieving problems are puzzles that set up a specific situation with a clear objective. A well-studied example is the category of life-and-death (L&D) problems for Go, which helps players hone their skill of identifying region safety. Many previous methods, such as lambda search, try null moves first and then derive so-called relevance zones (RZs), outside of which the opponent does not need to search. This paper first proposes a novel RZ-based approach, called RZ-Based Search (RZS), for solving L&D problems for Go. RZS tries moves before determining whether they are null moves post hoc. This means we do not need to rely on null-move heuristics, resulting in a more elegant algorithm that can be seamlessly incorporated into AlphaZero's superhuman-level play in our solver. To repurpose AlphaZero for solving, we also propose a new training method called Faster to Life (FTL), which modifies AlphaZero to entice it to win more quickly. Using RZS and FTL, our solver solves 68 of 106 L&D problems from a professional L&D book, while a previous program solves only 11. Finally, we argue that the approach is generic in the sense that RZS is applicable to solving many other goal-achieving problems for board games.
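
For intuition, here is a toy of the classic null-move relevance-zone idea that the abstract contrasts with RZS, on Maker-Breaker tic-tac-toe (Maker wins by completing a line; Breaker can only occupy cells, which is what makes the pruning argument sound in this toy). At Breaker nodes, a null-move search first proves Maker's win assuming Breaker passes; only Breaker replies inside the resulting zone then need to be searched. This is purely illustrative and not the paper's algorithm.

    LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

    def maker_line(board):
        for line in LINES:
            if all(board[i] == 'X' for i in line):
                return frozenset(line)
        return None

    def solve(board, turn):
        """Return (maker_wins, zone); zone = cells the proof depends on."""
        line = maker_line(board)
        if line is not None:
            return True, line
        empties = [i for i, c in enumerate(board) if c == '.']
        if not empties:
            return False, frozenset()
        if turn == 'X':                    # Maker: find one winning move
            zones = set()
            for m in empties:
                win, z = solve(board[:m] + 'X' + board[m+1:], 'O')
                if win:
                    return True, z | {m}
                zones |= z | {m}
            return False, frozenset(zones)
        # Breaker: null-move search first (Breaker passes, Maker moves again).
        win, z = solve(board, 'X')
        if not win:                        # extra Maker moves only help Maker
            return False, z
        zone = set(z)
        for m in empties:
            if m not in z:                 # outside the zone: provably a null move
                continue
            win_m, z_m = solve(board[:m] + 'O' + board[m+1:], 'X')
            zone |= z_m | {m}
            if not win_m:
                return False, frozenset(zone)
        return True, frozenset(zone)

    print(solve('.' * 9, 'X'))             # Maker wins Maker-Breaker tic-tac-toe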

Petri games are a multi-player game model, based on Petri nets, for the synthesis of distributed systems with multiple concurrent processes. The processes are the players in the game, represented by the tokens of the net. The players are divided into two teams: the controllable system and the uncontrollable environment. An individual controller is synthesized for each process based only on its locally available, causality-based information. For one environment player and a bounded number of system players, the problem of solving Petri games can be reduced to that of solving Büchi games. High-level Petri games are a concise representation of ordinary Petri games. Symmetries, derived from a high-level representation, can be exploited to significantly reduce the state space in the corresponding Büchi game. We present a new construction for solving high-level Petri games. It involves the definition of a unique, canonical representation of the reduced Büchi game. This allows us to translate a strategy in the Büchi game directly into a strategy in the Petri game. An implementation, applied to six structurally different benchmark families, shows a performance increase for larger state spaces in almost all cases.
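
To see why exploiting symmetries pays off, consider a toy state-space exploration (not the paper's construction): with n interchangeable processes, global states that differ only by permuting the processes can be identified via a canonical representative, here simply the sorted tuple of local states, so the search explores multisets instead of tuples.

    def reachable(n_procs, n_local, canonical):
        norm = (lambda s: tuple(sorted(s))) if canonical else (lambda s: s)
        start = norm((0,) * n_procs)
        seen, frontier = {start}, [start]
        while frontier:
            state = frontier.pop()
            for i in range(n_procs):                # process i takes a step
                nxt = list(state)
                nxt[i] = (nxt[i] + 1) % n_local     # cyclic local transition
                nxt = norm(tuple(nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return len(seen)

    print(reachable(6, 3, canonical=False))   # 3**6 = 729 states
    print(reachable(6, 3, canonical=True))    # C(8, 2) = 28 canonical states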

We consider the question: how can you sample good negative examples for contrastive learning? We argue that, as with metric learning, learning contrastive representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor point). The key challenge in using hard negatives is that contrastive methods must remain unsupervised, making it infeasible to adopt existing negative sampling strategies that use label information. In response, we develop a new class of unsupervised methods for selecting hard negative samples in which the user can control the amount of hardness. A limiting case of this sampling results in a representation that tightly clusters each class and pushes different classes as far apart as possible. The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.
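
A minimal sketch of one way to control hardness, assuming an InfoNCE-style objective (this is a simple importance-weighting variant, not necessarily the paper's exact estimator): negatives are reweighted toward points similar to the anchor, with beta controlling the hardness and beta = 0 recovering uniform sampling.

    import torch

    def hard_negative_loss(anchor, positive, negatives, tau=0.5, beta=1.0):
        # anchor: (d,), positive: (d,), negatives: (n, d); all L2-normalized
        pos = torch.exp(anchor @ positive / tau)
        neg_sims = negatives @ anchor                    # (n,)
        weights = torch.softmax(beta * neg_sims, dim=0)  # mass on hard negatives
        neg = negatives.shape[0] * (weights * torch.exp(neg_sims / tau)).sum()
        return -torch.log(pos / (pos + neg))

    x = torch.nn.functional.normalize(torch.randn(8, 16), dim=1)
    loss = hard_negative_loss(x[0], x[1], x[2:], beta=2.0)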

We present a new method to learn video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero-shot or few-shot learning. First, we formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are shared across different modalities via distillation. Second, we introduce the concept of loss function evolution, using an evolutionary search algorithm to automatically find an optimal combination of loss functions capturing many (self-supervised) tasks and modalities. Third, we propose an unsupervised representation evaluation metric that uses distribution matching to a large unlabeled dataset as a prior constraint, based on Zipf's law. This unsupervised constraint, which is not guided by any labeling, produces results similar to weakly-supervised, task-specific ones. The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods. Notably, it is also more effective than several label-based methods (e.g., ImageNet pre-training), with the exception of those using large, fully labeled video datasets.
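
As a sketch of the loss-evolution component (the general mechanism only; the mutation and selection operators here are illustrative, and demo_fitness is a synthetic stand-in for whatever unsupervised evaluation signal is available):

    import random

    def evolve_weights(fitness, n_losses, pop_size=20, generations=30, sigma=0.1):
        population = [[random.random() for _ in range(n_losses)]
                      for _ in range(pop_size)]
        for _ in range(generations):
            ranked = sorted(population, key=fitness, reverse=True)
            elite = ranked[:pop_size // 4]                # keep the best quarter
            population = elite + [
                [max(0.0, w + random.gauss(0.0, sigma))   # Gaussian mutation
                 for w in random.choice(elite)]
                for _ in range(pop_size - len(elite))
            ]
        return max(population, key=fitness)

    # Synthetic stand-in fitness, for demonstration only.
    target = [0.7, 0.1, 0.2]
    demo_fitness = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
    print(evolve_weights(demo_fitness, n_losses=3))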

We detail a new framework for privacy-preserving deep learning and discuss its assets. The framework puts a premium on ownership and secure processing of data and introduces a valuable representation based on chains of commands and tensors. This abstraction allows one to implement complex privacy-preserving constructs such as Federated Learning, Secure Multiparty Computation, and Differential Privacy while still exposing a familiar deep learning API to the end user. We report early results on the Boston Housing and Pima Indian Diabetes datasets. While the privacy features other than Differential Privacy do not impact prediction accuracy, the current implementation of the framework introduces a significant performance overhead, which will be addressed at a later stage of development. We believe this work is an important milestone introducing the first reliable, general framework for privacy-preserving deep learning.
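
For concreteness, here is a generic federated averaging round in plain PyTorch, a sketch of one of the constructs such a framework supports rather than the framework's actual API: raw data never leaves the clients, and only model updates are aggregated centrally.

    import copy
    import torch
    import torch.nn.functional as F

    def federated_round(global_model, client_loaders, lr=0.01, epochs=1):
        states = []
        for loader in client_loaders:
            model = copy.deepcopy(global_model)   # train locally on each client
            opt = torch.optim.SGD(model.parameters(), lr=lr)
            for _ in range(epochs):
                for x, y in loader:
                    opt.zero_grad()
                    F.mse_loss(model(x), y).backward()
                    opt.step()
            states.append(model.state_dict())
        # aggregate only the (float) model parameters, never the data
        avg = {k: torch.stack([s[k] for s in states]).mean(dim=0)
               for k in states[0]}
        global_model.load_state_dict(avg)
        return global_model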

Importance sampling is one of the most widely used variance reduction strategies in Monte Carlo rendering. In this paper, we propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples. Our approach considers an existing Monte Carlo rendering algorithm as a black box. During a scene-dependent training phase, we learn to generate samples with a desired density in the primary sample space of the rendering algorithm using maximum likelihood estimation. We leverage a recent neural network architecture that was designed to represent real-valued non-volume preserving ('Real NVP') transformations in high-dimensional spaces. We use Real NVP to non-linearly warp primary sample space and obtain desired densities. In addition, Real NVP efficiently computes the determinant of the Jacobian of the warp, which is required to implement the change of integration variables implied by the warp. A main advantage of our approach is that it is agnostic to the underlying light transport effects and can be combined with many existing rendering techniques by treating them as black boxes. We show that our approach leads to effective variance reduction in several practical scenarios.
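
The core building block can be sketched as a single Real NVP affine coupling layer (a minimal version, not the paper's full warp). Half of the inputs pass through unchanged and parameterize an affine map of the other half, so the Jacobian is triangular and its log-determinant is just the sum of the log-scales, which is exactly the cheap determinant mentioned above.

    import torch
    import torch.nn as nn

    class AffineCoupling(nn.Module):
        def __init__(self, dim, hidden=64):
            super().__init__()
            self.half = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.half, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * (dim - self.half)),
            )

        def forward(self, x):
            x1, x2 = x[:, :self.half], x[:, self.half:]
            s, t = self.net(x1).chunk(2, dim=1)
            s = torch.tanh(s)                  # keep scales well-conditioned
            y2 = x2 * torch.exp(s) + t
            log_det = s.sum(dim=1)             # log|det J| of the warp
            return torch.cat([x1, y2], dim=1), log_det

    layer = AffineCoupling(dim=4)
    y, log_det = layer(torch.rand(16, 4))      # warp primary-sample-space points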

In this paper, a novel video classification methodology is presented that aims to recognize different categories of third-person videos efficiently. The idea is to keep track of motion in videos by following optical flow elements over time. To classify the resulting motion time series efficiently, we let the machine learn temporal features along the time dimension. This is done by training a multi-channel one-dimensional Convolutional Neural Network (1D-CNN). Since CNNs represent the input data hierarchically, high-level features are obtained by further processing features from lower-level layers; in the case of time series, this means long-term temporal features are extracted from short-term ones. An advantage of the proposed method over most deep-learning-based approaches is that we only learn representative temporal features along the time dimension. This significantly reduces the number of learnable parameters, which makes our method trainable even on smaller datasets. We show that the proposed method reaches state-of-the-art results on two public datasets, UCF11 and jHMDB, with the aid of a more efficient feature-vector representation.
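
An illustrative multi-channel 1D-CNN of this kind (an assumed architecture for the sketch, not the paper's exact one) can be written in a few lines; the stacked Conv1d layers turn short-term temporal features into longer-term ones, and the parameter count stays small because convolutions run along the time dimension only.

    import torch
    import torch.nn as nn

    class FlowSeriesCNN(nn.Module):
        def __init__(self, in_channels, n_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),                 # short-term temporal features
                nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),         # long-term summary
            )
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):                    # x: (batch, channels, time)
            return self.classifier(self.features(x).squeeze(-1))

    model = FlowSeriesCNN(in_channels=8, n_classes=11)  # e.g., 11 UCF11 classes
    logits = model(torch.randn(4, 8, 100))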

Like any large software system, a full-fledged DBMS offers an overwhelming number of configuration knobs. These range from static initialisation parameters like buffer sizes, degree of concurrency, or level of replication to complex runtime decisions like creating a secondary index on a particular column or reorganising the physical layout of the store. To simplify configuration, industry-grade DBMSs usually ship with various advisory tools that provide recommendations for given workloads and machines. However, in practice the actual configuration, tuning, and maintenance are usually still done by a human administrator relying on intuition and experience. Recent work on deep reinforcement learning has shown very promising results in solving problems that require such a sense of intuition. For instance, it has been applied very successfully to learning how to play complicated games with enormous search spaces. Motivated by these achievements, in this work we explore how deep reinforcement learning can be used to administer a DBMS. First, we describe how deep reinforcement learning can be used to automatically tune an arbitrary software system like a DBMS by defining a problem environment. Second, we showcase our concept, NoDBA, on the concrete example of index selection and evaluate how well it recommends indexes for given workloads.
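
To make the problem-environment idea concrete, here is a toy gym-style index-selection environment (entirely hypothetical, including the cost model; it is not NoDBA's actual setup): an action creates an index on a column, and the reward is the resulting reduction of a synthetic workload-cost estimate.

    class IndexSelectionEnv:
        def __init__(self, n_columns, workload, budget=3):
            self.n, self.workload, self.budget = n_columns, workload, budget
            self.reset()

        def reset(self):
            self.indexes = [0] * self.n
            return tuple(self.indexes)

        def _cost(self):
            # full scan (cost 1.0) unless an index covers the filtered column
            return sum(sel if self.indexes[col] else 1.0
                       for col, sel in self.workload)

        def step(self, action):
            before = self._cost()
            self.indexes[action] = 1
            reward = before - self._cost()        # immediate cost reduction
            done = sum(self.indexes) >= self.budget
            return tuple(self.indexes), reward, done

    env = IndexSelectionEnv(4, [(0, 0.05), (0, 0.10), (2, 0.50)])
    state, reward, done = env.step(0)             # indexing column 0 pays off most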
