
We revise the proof of low-rate upper bounds on the reliability function of discrete memoryless channels for ordinary and list-decoding schemes, in particular Berlekamp and Blinovsky's zero-rate bound, as well as Blahut's bound for low rates. The available proofs of the zero-rate bound devised by Berlekamp and Blinovsky are somewhat complicated in that they contain, in one form or another, some cumbersome "non-standard" procedures or computations. Here we follow Blinovsky's idea of using a Ramsey-theoretic result by Komlos, and we complement it with some missing steps to obtain a proof which is rigorous and easier to inspect. Furthermore, we show how these techniques can be used to fix an error that invalidated the proof of Blahut's low-rate bound, which we present here in an extended form covering list decoding and general channels.

Related content

Fairly dividing a set of indivisible resources among a set of agents is of utmost importance in some applications. However, after an allocation has been implemented, the preferences of agents might change and envy might arise. We study the following problem to cope with such situations: given an allocation of indivisible resources to agents with additive utility-based preferences, is it possible to socially donate some of the resources (that is, to remove them from the allocation instance) such that the resulting modified allocation is envy-free (up to one good)? We require that the number of deleted resources and/or the resulting loss in utilitarian welfare be bounded. We conduct a thorough study of the (parameterized) computational complexity of this problem, considering various natural and problem-specific parameters (e.g., the number of agents, the number of deleted resources, or the maximum number of resources assigned to an agent in the initial allocation) and different preference models, including unary and 0/1-valuations. We obtain a rich set of (parameterized) tractability and intractability results and discover several surprising contrasts, for instance between the two closely related fairness concepts envy-freeness and envy-freeness up to one good, and between the influence of the parameters bounding the number and the welfare of the deleted resources.
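To make the donation question concrete, here is a minimal brute-force sketch in Python. It checks EF1 under additive, nonnegative valuations and exhaustively searches over subsets of at most a given number of donated resources; the function names and the dictionary-based encoding are illustrative choices of ours, and the exhaustive search deliberately ignores the complexity-theoretic structure the paper actually studies.

```python
from itertools import combinations

def is_ef1(allocation, utility):
    """Check envy-freeness up to one good (EF1) for additive utilities.

    allocation: dict agent -> set of resources
    utility: dict agent -> dict resource -> nonnegative number
    """
    for i in allocation:
        for j in allocation:
            if i == j:
                continue
            own = sum(utility[i][r] for r in allocation[i])
            other = sum(utility[i][r] for r in allocation[j])
            # i must not envy j once i's most-valued good in j's bundle
            # is removed (for nonnegative valuations this is the best choice)
            best = max((utility[i][r] for r in allocation[j]), default=0)
            if own < other - best:
                return False
    return True

def donations_for_ef1(allocation, utility, max_deleted):
    """Try donating (deleting) at most max_deleted resources to reach EF1."""
    resources = [r for bundle in allocation.values() for r in bundle]
    for k in range(max_deleted + 1):
        for donated in combinations(resources, k):
            reduced = {a: bundle - set(donated)
                       for a, bundle in allocation.items()}
            if is_ef1(reduced, utility):
                return donated  # smallest donation found
    return None
```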

Many functions have approximately known upper and/or lower bounds, which can potentially aid in modeling them. In this paper, we introduce Gaussian process models for functions where such bounds are (approximately) known. More specifically, we propose the first use of such bounds to improve Gaussian process (GP) posterior sampling and Bayesian optimization (BO). That is, we transform a GP model so that it satisfies the given bounds, and then sample and weight functions from its posterior. To further exploit these bounds in BO settings, we present bounded entropy search (BES), which selects the point gaining the most information about the underlying function, as estimated by the GP samples, while satisfying the output constraints. We characterize the sample variance bounds and show that the decisions made by BES are explainable. Our proposed approach is conceptually straightforward and can be used as a plug-in extension to existing methods for GP posterior sampling and Bayesian optimization.
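As a rough illustration of the sample-and-weight idea, the sketch below draws ordinary GP posterior samples and keeps only those respecting known constant output bounds. This is plain rejection rather than the paper's transformation-and-weighting scheme, and the RBF kernel, hyperparameters, and function names are all our assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=0.5, variance=1.0):
    # Squared-exponential kernel on 1-D inputs
    d = X1[:, None] - X2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def bounded_posterior_samples(X, y, Xs, lower, upper,
                              n_samples=1000, noise=1e-4):
    """Draw GP posterior samples at test points Xs, then keep only those
    lying within [lower, upper] everywhere (a crude stand-in for the
    paper's transform-and-weight scheme)."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    Kss = rbf_kernel(Xs, Xs)
    # Standard GP posterior mean and covariance via Cholesky
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    cov = Kss - v.T @ v + 1e-10 * np.eye(len(Xs))
    samples = np.random.multivariate_normal(mu, cov, size=n_samples)
    # Rejection step: discard samples that violate the output bounds
    keep = np.all((samples >= lower) & (samples <= upper), axis=1)
    return samples[keep]
```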

We study the probability distribution of age of information (AoI) in arbitrary networks with memoryless service times. A source node generates packets following a Poisson process, and then the packets are forwarded across the network in such a way that newer updates preempt older ones. This model is equivalent to gossip networks that was recently studied by Yates, and for which he obtained a recursive formula allowing the computation for the average AoI. In this paper, we obtain a very simple characterization of the stationary distribution of AoI at every node in the network. This allows for the computation of the average of an arbitrary function of the age. In particular, we can compute age-violation probabilities. Furthermore, we show how it is possible to use insights from our simple characterization in order to substantially reduce the computation time of average AoIs in some structured networks. Finally, we describe how it is possible to use our characterization in order to obtain faster and more accurate Monte Carlo simulations estimating the average AoI, or the average of an arbitrary function of the age.
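The Monte Carlo angle can be illustrated on the simplest possible network: one Poisson source feeding one node over a memoryless preemptive link. The sketch below is a toy under our own assumptions (rate-lambda source, exponential rate-mu service, single hop), for which the mean age 1/lambda + 1/mu is known in closed form and serves as a sanity check; it does not implement the paper's characterization.

```python
import random

def simulate_aoi(lam, mu, t_end, seed=0):
    """Estimate the average AoI at a node fed by a Poisson(lam) source
    over a preemptive link with exp(mu) service."""
    rng = random.Random(seed)
    t = 0.0
    gen_time = None       # generation time of the packet in service
    last_delivered = 0.0  # generation time of the freshest delivered update
    area = 0.0            # integral of the age process
    while t < t_end:
        rate = lam + (mu if gen_time is not None else 0.0)
        t_next = min(t + rng.expovariate(rate), t_end)
        # age grows linearly between events: integrate the trapezoid
        area += (t - last_delivered) * (t_next - t) + 0.5 * (t_next - t) ** 2
        t = t_next
        if t >= t_end:
            break
        if rng.random() < lam / rate:
            gen_time = t               # new packet preempts the one in service
        else:
            last_delivered = gen_time  # service completion: update delivered
            gen_time = None
    return area / t_end

# Known closed form for this single-hop case: 1/lam + 1/mu = 1.5
print(simulate_aoi(lam=1.0, mu=2.0, t_end=1e5))
```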

The Gray-Wyner network subject to a fidelity criterion is studied. Upper and lower bounds on the trade-off between the private sum-rate and the common rate are obtained for arbitrary sources subject to mean-squared error distortion. The bounds meet exactly, yielding the rate region, when the source is jointly Gaussian; they meet partially when the sources are modeled via an additive Gaussian "channel". The bounds are inspired by the Shannon bounds on the rate-distortion problem.
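For reference, the classical Shannon bounds on the mean-squared-error rate-distortion function, which the abstract says inspired the new bounds, read as follows (for a source $X$ with differential entropy $h(X)$ and variance $\sigma_X^2$; the notation here is ours, and both inequalities hold with equality when $X$ is Gaussian):

```latex
% Gaussian upper bound and Shannon lower bound on the MSE
% rate-distortion function R(D).
\frac{1}{2}\log\frac{\sigma_X^2}{D}
\;\ge\; R(D) \;\ge\;
h(X) - \frac{1}{2}\log(2\pi e D)
```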

We consider a basic communication and sensing setup comprising a transmitter, a receiver, and a sensor. The transmitter sends an encoded sequence to the receiver through a discrete memoryless channel, and the receiver is interested in decoding the sequence. On the other hand, the sensor picks up a noisy version of the transmitted sequence through one of two possible discrete memoryless channels. The sensor knows the transmitted sequence and wishes to discriminate between the two possible channels, i.e., to identify the channel that generated the output given the input. We study the trade-off between communication and sensing in the asymptotic regime, captured in terms of the coding rate to the receiver against the discrimination error exponent at the sensor. We characterize the optimal rate-exponent trade-off for general discrete memoryless channels with an input cost constraint.
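As a hedged illustration of what such a trade-off can look like: in the Stein regime, the exponent of discriminating two channels $W_1, W_2$ with i.i.d. inputs drawn from $P$ is the conditional divergence $D(W_1\|W_2|P)$, which suggests a trade-off curve of the following form. This is our guess at the shape of the result based on standard hypothesis-testing facts (cost constraints omitted), not a transcription from the paper.

```latex
% A plausible rate-exponent trade-off in the Stein regime (our notation):
% maximize the sensing exponent over input distributions that still
% support communication at rate R over the channel W to the receiver.
E(R) \;=\; \max_{P \,:\, I(P, W) \ge R} \;
\sum_{x} P(x)\, D\!\left(W_1(\cdot\mid x)\,\middle\|\,W_2(\cdot\mid x)\right)
```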

We propose collocation and quasi-collocation methods for solving second-order boundary value problems $L_2 y=f$ in which the differential operator $L_2$ can be represented in the product formulation, aiming mostly at singular and singularly perturbed boundary value problems. Seeking an approximating Canonical Complete Chebyshev spline $s$ by the collocation method leads to the requirement that $L_2 s$ interpolate the function $f$. In the quasi-collocation method, on the other hand, we require that $L_2 s$ equal an approximation of $f$ by the Schoenberg operator. We describe the computation of both methods based on the Green's function, and we give their error bounds.
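In formulas, and in notation we are assuming rather than taking from the paper (collocation points $\tau_i$, B-spline basis $B_j$ with Schoenberg nodes $\xi_j$), the two defining conditions read:

```latex
% Collocation: the residual vanishes at the collocation points.
(L_2 s)(\tau_i) = f(\tau_i), \qquad i = 1, \dots, n,
% Quasi-collocation: L_2 s matches the Schoenberg-operator
% approximation of f.
L_2 s = Sf, \qquad (Sf)(x) = \sum_{j} f(\xi_j)\, B_j(x).
```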

A memoryless state-dependent multiple-access channel (MAC) is considered, where two transmitters wish to convey their messages to a single receiver while simultaneously sensing (estimating) the respective states via generalized feedbacks. For this channel, an improved inner bound is provided on the \emph{fundamental rate-distortions tradeoff} which characterizes the communication rates the transmitters can achieve while simultaneously ensuring that their state-estimates satisfy desired distortion criteria. The new inner bound is based on a scheme where each transmitter codes over the generalized feedback so as to improve the state estimation at the other transmitter. This is in contrast to the schemes proposed for point-to-point and broadcast channels where coding is used only for the transmission of messages and the optimal estimators operate on a symbol-by-symbol basis on the sequences of channel inputs and feedback outputs.

To accumulate knowledge and improve its behaviour policy, a reinforcement learning agent can learn `off-policy' about policies that differ from the policy used to generate its experience. This is important when learning counterfactuals, or when the experience was generated outside the agent's control. However, off-policy learning is non-trivial, and standard reinforcement-learning algorithms can be unstable and divergent. In this paper we discuss a novel family of off-policy prediction algorithms which are convergent by construction. The idea is to first learn on-policy about the data-generating behaviour, and then bootstrap an off-policy value estimate on this on-policy estimate, thereby constructing a value estimate that is partially off-policy. This process can be repeated to build a chain of value functions, each time bootstrapping a new estimate on the previous estimate in the chain. Each step in the chain is stable, and hence the complete algorithm is guaranteed to be stable. Under mild conditions, the result comes arbitrarily close to the off-policy TD solution as the length of the chain increases; hence the scheme can compute the solution even in cases where off-policy TD diverges. We prove that the proposed scheme is convergent and corresponds to an iterative decomposition of the inverse key matrix. Furthermore, it can be interpreted as estimating a novel objective -- which we call a `k-step expedition' -- of following the target policy for finitely many steps before continuing indefinitely with the behaviour policy. Empirically, we evaluate the idea on challenging MDPs such as Baird's counterexample and observe favourable results.
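A tabular sketch of the chaining idea, under our own simplifying assumptions: each link of the chain is written as its exact fixed point rather than learned by TD updates, and `P_b`, `r_b`, `P_pi`, `r_pi` denote state transition matrices and expected one-step rewards under the behaviour and target policies.

```python
import numpy as np

def chained_values(P_b, r_b, P_pi, r_pi, gamma, k):
    """Value-function chain, solved exactly in the tabular case.

    v0 is the on-policy value of the behaviour policy; each subsequent
    link bootstraps one application of the target policy's Bellman
    operator onto the previous link. The result estimates the
    'k-step expedition': follow the target policy for k steps, then
    the behaviour policy forever.
    """
    n = len(r_b)
    # v0: exact on-policy solution (I - gamma * P_b)^(-1) r_b
    v = np.linalg.solve(np.eye(n) - gamma * P_b, r_b)
    for _ in range(k):
        # one off-policy Bellman step under the target policy,
        # bootstrapping on the previous link of the chain
        v = r_pi + gamma * P_pi @ v
    return v
```

For gamma < 1, letting k grow drives this iteration to the value of the target policy, matching the abstract's claim that the chain approaches the off-policy TD solution in the limit.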

This paper addresses the problem of learning abstractions that boost robot planning performance while providing strong guarantees of reliability. Although state-of-the-art hierarchical robot planning algorithms allow robots to efficiently compute long-horizon motion plans for achieving user-desired tasks, these methods typically rely upon environment-dependent state and action abstractions that need to be hand-designed by experts. We present a new approach for bootstrapping the entire hierarchical planning process. It shows how abstract states and actions for new environments can be computed automatically using the critical regions predicted by a deep neural network with an auto-generated, robot-specific architecture. It uses the learned abstractions in a novel multi-source bi-directional hierarchical robot planning algorithm that is sound and probabilistically complete. An extensive empirical evaluation on twenty different settings using holonomic and non-holonomic robots shows that (a) the learned abstractions provide the information necessary for efficient multi-source hierarchical planning; and that (b) this approach of learning abstraction and planning outperforms state-of-the-art baselines by nearly a factor of ten in terms of planning time on test environments not seen during training.

We consider the exploration-exploitation trade-off in reinforcement learning, and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count-based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action pair and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
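The recipe in the penultimate sentence -- add a bonus to the reward, solve a Bellman equation, act with a Boltzmann policy -- can be sketched as follows. We use a discounted, tabular setting for brevity, and the bonus array and temperature tau are left as placeholders; the paper's exact bonus and temperature schedule are not reproduced here.

```python
import numpy as np

def k_values(P, r, bonus, tau, gamma, iters=2000):
    """Iterate a soft (log-sum-exp) Bellman equation on reward + bonus.

    P: transition tensor of shape (S, A, S); r, bonus: arrays of
    shape (S, A). The risk-seeking parameter tau doubles as the
    Boltzmann temperature.
    """
    S, A = r.shape
    K = np.zeros((S, A))
    for _ in range(iters):
        # soft state value v(s) = tau * log sum_a exp(K(s,a)/tau),
        # computed with the max-shift trick for numerical stability
        m = K.max(axis=1)
        v = m + tau * np.log(np.exp((K - m[:, None]) / tau).sum(axis=1))
        K = r + bonus + gamma * P.reshape(S * A, S).dot(v).reshape(S, A)
    return K

def boltzmann_policy(K, tau):
    """Exploration policy induced by the K-values at temperature tau."""
    p = np.exp((K - K.max(axis=1, keepdims=True)) / tau)
    return p / p.sum(axis=1, keepdims=True)
```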
