In the setting where we ask participants multiple similar, possibly subjective, multiple-choice questions (e.g., Do you like Bulbasaur? Y/N; do you like Squirtle? Y/N), peer prediction aims to design mechanisms that encourage honest feedback without verification. A series of works has successfully designed multi-task peer prediction mechanisms in which reporting truthfully is better than any other strategy (dominantly truthful), but they require an infinite number of tasks. A recent work proposes the first multi-task peer prediction mechanism, the Determinant Mutual Information (DMI)-Mechanism, that is not only dominantly truthful but also works for a finite number of tasks (practical). However, the existence of other practical dominantly-truthful multi-task peer prediction mechanisms remains an open question. This work answers the above question by providing 1. a new family of information-monotone information measures, volume mutual information (VMI), of which DMI is a special case; 2. a new family of practical dominantly-truthful multi-task peer prediction mechanisms, VMI-Mechanisms. To illustrate the importance of VMI-Mechanisms, we also provide a tractable effort incentive optimization goal. We show that the DMI-Mechanism may not be optimal, but we can construct a sequence of VMI-Mechanisms that are approximately optimal. The main technical highlight in this paper is a novel geometric information measure, Volume Mutual Information, that is based on a simple idea: we can measure an object A's information amount by the number of objects that are less informative than A. Different densities over the objects lead to different information measures. This also gives Determinant Mutual Information a simple geometric interpretation.
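The abstract does not spell out how a determinant-based payment is computed. The sketch below assumes the pairwise form commonly described for the DMI-Mechanism: split the tasks into two disjoint halves, build a matrix of joint answer counts for a pair of participants on each half, and pay the product of the two determinants. The function names, the choice of halves, and the toy answers are illustrative assumptions, not taken from the paper.

```python
def det(m):
    """Exact determinant of a small square integer matrix (Laplace expansion)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def count_matrix(answers_a, answers_b, num_choices):
    """M[x][y] = number of tasks where A answered x and B answered y."""
    m = [[0] * num_choices for _ in range(num_choices)]
    for x, y in zip(answers_a, answers_b):
        m[x][y] += 1
    return m


def pairwise_payment(answers_a, answers_b, num_choices):
    """Assumed DMI-style payment: det(M1) * det(M2) over two disjoint task halves."""
    half = len(answers_a) // 2
    if half < num_choices:
        raise ValueError("each half needs at least num_choices tasks")
    m1 = count_matrix(answers_a[:half], answers_b[:half], num_choices)
    m2 = count_matrix(answers_a[half:], answers_b[half:], num_choices)
    return det(m1) * det(m2)


# Toy data: 8 binary tasks (0 = N, 1 = Y).
bob = [0, 1, 0, 1, 1, 0, 1, 0]
informative_alice = [0, 1, 0, 1, 1, 0, 1, 0]  # agrees with bob on every task
constant_alice = [0] * 8                      # always answers N, carries no information

paid_informative = pairwise_payment(informative_alice, bob, 2)
paid_constant = pairwise_payment(constant_alice, bob, 2)
```

Under these assumptions, a report that ignores the tasks yields a count matrix with an all-zero row, so its determinant and the resulting payment vanish, while the informative report earns a positive payment.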

Related content

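The journal Computer Information (《計算機信息》) publishes high-quality papers that broaden the scope of operations research and computing. It seeks original research papers on theory, methods, experiments, systems, and applications; novel survey and tutorial papers; and papers describing new and useful software tools.
January 12, 2022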

Self-assembly is the process in which the components of a system, whether molecules, polymers, or macroscopic particles, are organized into ordered structures as a result of local interactions between the components themselves, without exterior guidance. In this paper, we discuss the self-assembly of aperiodic tilings. Aperiodic tilings serve as a mathematical model for quasicrystals, crystals that do not have any translational symmetry. Because of the specific atomic arrangement of these crystals, the question of how they grow remains open. We state a theorem on the purely local and deterministic growth of Golden-Octagonal tilings, showing that, contrary to popular belief, local growth of aperiodic tilings is possible.

Learning problems commonly exhibit an interesting feedback mechanism wherein the population data reacts to competing decision makers' actions. This paper formulates a new game theoretic framework for this phenomenon, called multi-player performative prediction. We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game. The latter equilibria are arguably more informative, but can be found efficiently only when the game is monotone. We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and repeated (stochastic) gradient play. We then establish transparent sufficient conditions for strong monotonicity of the game and use them to develop algorithms for finding Nash equilibria. We investigate derivative free methods and adaptive gradient algorithms wherein each player alternates between learning a parametric description of their distribution and gradient steps on the empirical risk. Synthetic and semi-synthetic numerical experiments illustrate the results.
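Repeated retraining can be illustrated on a toy problem. The sketch below assumes two players with squared loss, where the mean of each player's data distribution shifts linearly with the joint decisions (the vector `mu` and matrix `A` are made-up values, and only population means are used, so there is no sampling noise). Each round, every player refits on the distribution induced by the current decisions; a fixed point of this map is a performatively stable equilibrium of the toy game.

```python
def induced_means(theta, mu, A):
    """Mean of each player's data distribution given the joint decisions theta.

    Assumed toy model: mean_i = mu[i] + sum_j A[i][j] * theta[j].
    """
    n = len(theta)
    return [mu[i] + sum(A[i][j] * theta[j] for j in range(n)) for i in range(n)]


def repeated_retraining(mu, A, steps=200):
    """Each player repeatedly minimizes squared loss on the currently induced data.

    For squared loss the minimizer is the mean of the induced distribution,
    so one retraining round is simply theta <- induced_means(theta).
    """
    theta = [0.0] * len(mu)
    for _ in range(steps):
        theta = induced_means(theta, mu, A)
    return theta


# Hypothetical problem data; the interaction matrix is a contraction,
# which is what makes the iteration converge in this toy setting.
mu = [1.0, -2.0]
A = [[0.2, 0.1],
     [-0.3, 0.4]]

theta_stable = repeated_retraining(mu, A)
# At a stable point, retraining once more leaves the decisions unchanged.
residual = max(abs(t - m) for t, m in zip(theta_stable, induced_means(theta_stable, mu, A)))
```

In this linear toy model the stable point also has the closed form obtained by solving (I - A) theta = mu, which gives a simple check on the iteration.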

Most physical processes possess structural properties such as constant energies, volumes, and other invariants over time. When learning models of such dynamical systems, it is critical to respect these invariants to ensure accurate predictions and physically meaningful behavior. Strikingly, state-of-the-art methods in Gaussian process (GP) dynamics model learning do not address this issue. On the other hand, classical numerical integrators are specifically designed to preserve these crucial properties through time. We propose to combine the advantages of GPs as function approximators with structure-preserving numerical integrators for dynamical systems, such as Runge-Kutta methods. These integrators assume access to the ground truth dynamics and require evaluations of intermediate and future time steps that are unknown in a learning-based scenario. This makes direct inference of the GP dynamics, with embedded numerical scheme, intractable. Our key technical contribution is the evaluation of the implicitly defined Runge-Kutta transition probability. In a nutshell, we introduce an implicit layer for GP regression, which is embedded into a variational inference-based model learning scheme.
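To make the role of structure-preserving integrators concrete, the sketch below compares explicit Euler with the implicit midpoint rule (a one-stage implicit Runge-Kutta method) on a harmonic oscillator with energy H = (q^2 + p^2) / 2. The oscillator, step size, and function names are assumptions chosen for illustration; this uses known dynamics and is not the GP-based scheme of the paper.

```python
def energy(q, p):
    """Energy of the unit harmonic oscillator, the invariant of the true flow."""
    return 0.5 * (q * q + p * p)


def explicit_euler_step(q, p, h):
    """One explicit Euler step for dq/dt = p, dp/dt = -q."""
    return q + h * p, p - h * q


def implicit_midpoint_step(q, p, h):
    """One implicit midpoint step, x' = x + h * f((x + x') / 2).

    For this linear system the implicit equations
        q' = q + h * (p + p') / 2,   p' = p - h * (q + q') / 2
    can be solved in closed form, which is done here.
    """
    a = h / 2.0
    d = 1.0 + a * a
    q_new = ((1.0 - a * a) * q + 2.0 * a * p) / d
    p_new = ((1.0 - a * a) * p - 2.0 * a * q) / d
    return q_new, p_new


def integrate(step, q, p, h, n_steps):
    """Apply the given one-step method n_steps times."""
    for _ in range(n_steps):
        q, p = step(q, p, h)
    return q, p


q0, p0, h, n_steps = 1.0, 0.0, 0.1, 1000
euler_energy = energy(*integrate(explicit_euler_step, q0, p0, h, n_steps))
midpoint_energy = energy(*integrate(implicit_midpoint_step, q0, p0, h, n_steps))
```

On this example the midpoint update is a pure rotation of (q, p), so the energy stays at its initial value up to rounding, whereas explicit Euler multiplies the energy by (1 + h^2) at every step and drifts away.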

We study reward design strategies for incentivizing a reinforcement learning agent to adopt a policy from a set of admissible policies. The goal of the reward designer is to modify the underlying reward function cost-efficiently while ensuring that any approximately optimal deterministic policy under the new reward function is admissible and performs well under the original reward function. This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states. Perhaps surprisingly, and in contrast to the problem of optimal reward poisoning attacks, we first show that the reward design problem for admissible policy teaching is computationally challenging, and it is NP-hard to find an approximately optimal reward modification. We then proceed by formulating a surrogate problem whose optimal solution approximates the optimal solution to the reward design problem in our setting, but is more amenable to optimization techniques and analysis. For this surrogate problem, we present characterization results that provide bounds on the value of the optimal solution. Finally, we design a local search algorithm to solve the surrogate problem and showcase its utility using simulation-based experiments.

Recent research has enabled fixed-wing unmanned aerial vehicles (UAVs) to maneuver in constrained spaces through the use of direct nonlinear model predictive control (NMPC). However, this approach has been limited to a priori known maps and ground truth state measurements. In this paper, we present a direct NMPC approach that leverages NanoMap, a light-weight point-cloud mapping framework to generate collision-free trajectories using onboard stereo vision. We first explore our approach in simulation and demonstrate that our algorithm is sufficient to enable vision-based navigation in urban environments. We then demonstrate our approach in hardware using a 42-inch fixed-wing UAV and show that our motion planning algorithm is capable of navigating around a building using a minimalistic set of goal-points. We also show that storing a point-cloud history is important for navigating these types of constrained environments.

We study the class of first-order locally-balanced Metropolis--Hastings algorithms introduced in Livingstone & Zanella (2021). To choose a specific algorithm within the class the user must select a balancing function $g:\mathbb{R} \to \mathbb{R}$ satisfying $g(t) = tg(1/t)$, and a noise distribution for the proposal increment. Popular choices within the class are the Metropolis-adjusted Langevin algorithm and the recently introduced Barker proposal. We first establish a universal limiting optimal acceptance rate of 57% and scaling of $n^{-1/3}$ as the dimension $n$ tends to infinity among all members of the class under mild smoothness assumptions on $g$ and when the target distribution for the algorithm is of the product form. In particular we obtain an explicit expression for the asymptotic efficiency of an arbitrary algorithm in the class, as measured by expected squared jumping distance. We then consider how to optimise this expression under various constraints. We derive an optimal choice of noise distribution for the Barker proposal, optimal choice of balancing function under a Gaussian noise distribution, and optimal choice of first-order locally-balanced algorithm among the entire class, which turns out to depend on the specific target distribution. Numerical simulations confirm our theoretical findings and in particular show that a bi-modal choice of noise distribution in the Barker proposal gives rise to a practical algorithm that is consistently more efficient than the original Gaussian version.
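The balancing condition $g(t) = t\,g(1/t)$ is easy to check numerically. The sketch below tests it for two functions usually associated with the algorithms named above, $g(t) = t/(1+t)$ for the Barker proposal and $g(t) = \sqrt{t}$ for the Metropolis-adjusted Langevin algorithm, and for the identity as a counterexample. The helper names and the positive test points are assumptions made for illustration.

```python
import math


def barker(t):
    """g(t) = t / (1 + t), the balancing function usually tied to the Barker proposal."""
    return t / (1.0 + t)


def sqrt_balance(t):
    """g(t) = sqrt(t), the choice usually associated with MALA."""
    return math.sqrt(t)


def identity(t):
    """g(t) = t, which is NOT balancing: t * g(1/t) = 1 for every t > 0."""
    return t


def is_balancing(g, points=(0.1, 0.5, 1.0, 2.0, 7.5), tol=1e-12):
    """Check g(t) == t * g(1/t) on a few positive test points (a spot check, not a proof)."""
    return all(abs(g(t) - t * g(1.0 / t)) <= tol for t in points)


barker_ok = is_balancing(barker)
sqrt_ok = is_balancing(sqrt_balance)
identity_ok = is_balancing(identity)
```

A spot check like this can rule out a candidate $g$ quickly, but passing it on finitely many points does not establish the identity for all $t$.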

Real economies can be seen as a sequential imperfect-information game with many heterogeneous, interacting strategic agents of various agent types, such as consumers, firms, and governments. Dynamic general equilibrium models are common economic tools to model the economic activity, interactions, and outcomes in such systems. However, existing analytical and computational methods struggle to find explicit equilibria when all agents are strategic and interact, while joint learning is unstable and challenging. Amongst others, a key reason is that the actions of one economic agent may change the reward function of another agent, e.g., a consumer's expendable income changes when firms change prices or governments change taxes. We show that multi-agent deep reinforcement learning (RL) can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types, in economic simulations with many agents, through the use of structured learning curricula and efficient GPU-only simulation and training. Conceptually, our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing, that are commonly used for analytical tractability. Our GPU implementation enables training and analyzing economies with a large number of agents within reasonable time frames, e.g., training completes within a day. We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government who taxes and redistributes. We validate the learned meta-game epsilon-Nash equilibria through approximate best-response analyses, show that RL policies align with economic intuitions, and that our approach is constructive, e.g., by explicitly learning a spectrum of meta-game epsilon-Nash equilibria in open RBC models.

Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists of exploiting lags instead of absolute positions for inference. Still, RPE is not available for the recent linear variants of the Transformer, because it requires the explicit computation of the attention matrix, which is precisely what is avoided by such methods. In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement for the classical additive (sinusoidal) PE and provably behaves like RPE. The main theoretical contribution is to make a connection between positional encoding and cross-covariance structures of correlated Gaussian processes. We illustrate the performance of our approach on the Long-Range Arena benchmark and on music generation.

There has been much recent work on training neural attention models at the sequence level, either with reinforcement learning-style methods or by optimizing the beam. In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence-to-sequence models. Our experiments show that these losses can perform surprisingly well, slightly outperforming beam search optimization in a like-for-like setup. We also report new state-of-the-art results on both IWSLT'14 German-English translation and Gigaword abstractive summarization. On the larger WMT'14 English-French translation task, sequence-level training achieves 41.5 BLEU, which is on par with the state of the art.

Networks provide a powerful formalism for modeling complex systems, by representing the underlying set of pairwise interactions. But much of the structure within these systems involves interactions that take place among more than two nodes at once; for example, communication within a group rather than person-to-person, collaboration among a team rather than a pair of co-authors, or biological interaction between a set of molecules rather than just two. We refer to these types of simultaneous interactions on sets of more than two nodes as higher-order interactions; they are ubiquitous, but the empirical study of them has lacked a general framework for evaluating higher-order models. Here we introduce such a framework, based on link prediction, a fundamental problem in network analysis. The traditional link prediction problem seeks to predict the appearance of new links in a network, and here we adapt it to predict which (larger) sets of elements will have future interactions. We study the temporal evolution of 19 datasets from a variety of domains, and use our higher-order formulation of link prediction to assess the types of structural features that are most predictive of new multi-way interactions. Among our results, we find that different domains vary considerably in their distribution of higher-order structural parameters, and that the higher-order link prediction problem exhibits some fundamental differences from traditional pairwise link prediction, with a greater role for local rather than long-range information in predicting the appearance of new interactions.
