亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

This paper presents a case study for the application of semiring semantics for fixed-point formulae to the analysis of strategies in B\"uchi games. Semiring semantics generalizes the classical Boolean semantics by permitting multiple truth values from certain semirings. Evaluating the fixed-point formula that defines the winning region in a given game in an appropriate semiring of polynomials provides not only the Boolean information on who wins, but also tells us how they win and which strategies they might use. This is well-understood for reachability games, where the winning region is definable as a least fixed point. The case of B\"uchi games is of special interest, not only due to their practical importance, but also because it is the simplest case where the fixed-point definition involves a genuine alternation of a greatest and a least fixed point. We show that, in a precise sense, semiring semantics provide information about all absorption-dominant strategies -- strategies that win with minimal effort, and we discuss how these relate to positional and the more general persistent strategies. This information enables applications such as game synthesis or determining minimal modifications to the game needed to change its outcome. Lastly, we discuss limitations of our approach and present questions that cannot be immediately answered by semiring semantics.

相關內容

《計算機信息》雜志發表高質量的論文,擴大了運籌學和計算的范圍,尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文,以及描述新的和有用的軟件工具的論文。官網鏈接: · 優化器 · 轉移核 · 泛函 · AIM ·
2022 年 4 月 20 日

We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves in the finite-horizon setting, where the transition kernel of the underlying Markov games can be parameterized by a linear function over the current state, both players' actions and the next state. In particular, we assume that we can control both players and aim to find the Nash Equilibrium by minimizing the duality gap. We propose an algorithm Nash-UCRL based on the principle "Optimism-in-Face-of-Uncertainty". Our algorithm only needs to find a Coarse Correlated Equilibrium (CCE), which is computationally efficient. Specifically, we show that Nash-UCRL can provably achieve an $\tilde{O}(dH\sqrt{T})$ regret, where $d$ is the linear function dimension, $H$ is the length of the game and $T$ is the total number of steps in the game. To assess the optimality of our algorithm, we also prove an $\tilde{\Omega}( dH\sqrt{T})$ lower bound on the regret. Our upper bound matches the lower bound up to logarithmic factors, which suggests the optimality of our algorithm.

We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest of its own right) for an analyst that may ask several non adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.

We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that given query access to $f$, and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ many queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.

The success of large-scale models in recent years has increased the importance of statistical models with numerous parameters. Several studies have analyzed over-parameterized linear models with high-dimensional data that may not be sparse; however, existing results depend on the independent setting of samples. In this study, we analyze a linear regression model with dependent time series data under over-parameterization settings. We consider an estimator via interpolation and developed a theory for excess risk of the estimator under multiple dependence types. This theory can treat infinite-dimensional data without sparsity and handle long-memory processes in a unified manner. Moreover, we bound the risk in our theory via the integrated covariance and nondegeneracy of autocorrelation matrices. The results show that the convergence rate of risks with short-memory processes is identical to that of cases with independent data, while long-memory processes slow the convergence rate. We also present several examples of specific dependent processes that can be applied to our setting.

In this study, we examine a clustering problem in which the covariates of each individual element in a dataset are associated with an uncertainty specific to that element. More specifically, we consider a clustering approach in which a pre-processing applying a non-linear transformation to the covariates is used to capture the hidden data structure. To this end, we approximate the sets representing the propagated uncertainty for the pre-processed features empirically. To exploit the empirical uncertainty sets, we propose a greedy and optimistic clustering (GOC) algorithm that finds better feature candidates over such sets, yielding more condensed clusters. As an important application, we apply the GOC algorithm to synthetic datasets of the orbital properties of stars generated through our numerical simulation mimicking the formation process of the Milky Way. The GOC algorithm demonstrates an improved performance in finding sibling stars originating from the same dwarf galaxy. These realistic datasets have also been made publicly available.

Weighted automata are a generalization of nondeterministic automata that associate a weight drawn from a semiring $K$ with every transition and every state. Their behaviours can be formalized either as weighted language equivalence or weighted bisimulation. In this paper we explore the properties of weighted automata in the framework of coalgebras over (i) the category $\mathsf{SMod}$ of semimodules over a semiring $K$ and $K$-linear maps, and (ii) the category $\mathsf{Set}$ of sets and maps. We show that the behavioural equivalences defined by the corresponding final coalgebras in these two cases characterize weighted language equivalence and weighted bisimulation, respectively. These results extend earlier work by Bonchi et al. using the category $\mathsf{Vect}$ of vector spaces and linear maps as the underlying model for weighted automata with weights drawn from a field $K$. The key step in our work is generalizing the notions of linear relation and linear bisimulation of Boreale from vector spaces to semimodules using the concept of the kernel of a $K$-linear map in the sense of universal algebra. We also provide an abstract procedure for forward partition refinement for computing weighted language equivalence. Since for weighted automata defined over semirings the problem is undecidable in general, it is guaranteed to halt only in special cases. We provide sufficient conditions for the termination of our procedure. Although the results are similar to those of Bonchi et al., many of our proofs are new, especially those about the coalgebra in $\mathsf{SMod}$ characterizing weighted language equivalence.

Many forms of dependence manifest themselves over time, with behavior of variables in dynamical systems as a paradigmatic example. This paper studies temporal dependence in dynamical systems from a logical perspective, by extending a minimal modal base logic of static functional dependencies. We define a logic for dynamical systems with single time steps, provide a complete axiomatic proof calculus, and show the decidability of the satisfiability problem for a substantial fragment. The system comes in two guises: modal and first-order, that naturally complement each other. Next, we consider a timed semantics for our logic, as an intermediate between state spaces and temporal universes for the unfoldings of a dynamical system. We prove completeness and decidability by combining techniques from dynamic-epistemic logic and modal logic of functional dependencies with complex terms for objects. Also, we extend these results to the timed logic with functional symbols and term identity. Finally, we conclude with a brief outlook on how the system proposed here connects with richer temporal logics of system behavior, and with dynamic topological logic.

Given a matrix $A$ and vector $b$ with polynomial entries in $d$ real variables $\delta=(\delta_1,\ldots,\delta_d)$ we consider the following notion of feasibility: the pair $(A,b)$ is locally feasible if there exists an open neighborhood $U$ of $0$ such that for every $\delta\in U$ there exists $x$ satisfying $A(\delta)x\ge b(\delta)$ entry-wise. For $d=1$ we construct a polynomial time algorithm for deciding local feasibility. For $d \ge 2$ we show local feasibility is NP-hard. As an application (which was the primary motivation for this work) we give a computer-assisted proof of ergodicity of the following elementary 1D cellular automaton: given the current state $\eta_t \in \{0,1\}^{\mathbb{Z}}$ the next state $\eta_{t+1}(n)$ at each vertex $n\in \mathbb{Z}$ is obtained by $\eta_{t+1}(n)= \text{NAND}\big(\text{BSC}_\delta(\eta_t(n-1)), \text{BSC}_\delta(\eta_t(n))\big)$. Here the binary symmetric channel $\text{BSC}_\delta$ takes a bit as input and flips it with probability $\delta$ (and leaves it unchanged with probability $1-\delta$). We also consider the problem of broadcasting information on the 2D-grid of noisy binary-symmetric channels $\text{BSC}_\delta$, where each node may apply an arbitrary processing function to its input bits. We prove that there exists $\delta_0'>0$ such that for all noise levels $0<\delta<\delta_0'$ it is impossible to broadcast information for any processing function, as conjectured in Makur, Mossel, Polyanskiy (ISIT 2021).

Proactive dialogue system is able to lead the conversation to a goal topic and has advantaged potential in bargain, persuasion and negotiation. Current corpus-based learning manner limits its practical application in real-world scenarios. To this end, we contribute to advance the study of the proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to the non-cooperative user behavior -- the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining a high user satisfaction are not always converge, because the topics close to the goal and the topics user preferred may not be the same. Towards this issue, we propose a new solution named I-Pro that can learn Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.

北京阿比特科技有限公司