
One of the main problems in the prediction theory of stationary processes $X(t)$ is to describe the asymptotic behavior of the best linear mean squared prediction error in predicting $X(0)$ given $X(t)$, $-n\le t\le-1$, as $n$ goes to infinity. This behavior depends on the regularity (deterministic or non-deterministic) of the process $X(t)$. In his seminal paper {\it 'Some purely deterministic processes'} (J. of Math. and Mech., {\bf 6}(6), 801-810, 1957), M. Rosenblatt showed that for a specific spectral density having a very high order of contact with zero, the prediction error behaves like a power of $n$ as $n\to\infty$. In the paper by Babayan et al., {\it 'Extensions of Rosenblatt's results on the asymptotic behavior of the prediction error for deterministic stationary sequences'} (J. Time Ser. Anal. {\bf 42}, 622-652, 2021), Rosenblatt's result was extended to the class of spectral densities of the form $f=f_d g$, where $f_d$ is the spectral density of a deterministic process that has a very high order of contact with zero, while $g$ is a function that can have polynomial-type singularities. In this paper, we describe new extensions of the above-quoted results to the case where the function $g$ can have {\it arbitrary power-type singularities}. Examples illustrate the obtained results.
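For orientation, the quantity under study is the finite-past prediction error; in standard notation (assumed here, not copied from the paper),
$$\sigma_n^2(f)=\min_{c_1,\ldots,c_n}\mathbb{E}\Bigl|X(0)-\sum_{j=1}^{n}c_j X(-j)\Bigr|^2,$$
where $f$ is the spectral density of $X(t)$. By Szeg\H{o}'s theorem, $\sigma_n^2(f)\to 0$ precisely when $\int\log f=-\infty$ (the deterministic case), and the results above describe the rate of this decay.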

Related content

We propose the homotopic policy mirror descent (HPMD) method for solving discounted, infinite-horizon MDPs with finite state and action spaces, and study its policy convergence. We report three properties that appear to be new in the policy gradient literature: (1) the policy first converges linearly, then superlinearly with order $\gamma^{-2}$, to the set of optimal policies, after $\mathcal{O}(\log(1/\Delta^*))$ iterations, where $\Delta^*$ is defined via a gap quantity associated with the optimal state-action value function; (2) HPMD also exhibits last-iterate convergence, with the limiting policy corresponding exactly to the optimal policy of maximal entropy for every state; no regularization is added to the optimization objective, so this second observation arises solely as an algorithmic property of the homotopic policy gradient method; (3) for the stochastic HPMD method, we further demonstrate a better-than-$\mathcal{O}(|\mathcal{S}| |\mathcal{A}| / \epsilon^2)$ sample complexity for small optimality gap $\epsilon$, assuming a generative model for policy evaluation.
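To make the basic update concrete, here is a minimal sketch of a tabular policy mirror descent iteration with exact policy evaluation. The MDP instance, step-size schedule, and function names are illustrative assumptions; the homotopic perturbation schedule that distinguishes HPMD is not reproduced here.

```python
import numpy as np

# Illustrative tabular policy mirror descent (not the paper's HPMD schedule).
# P: (S, A, S) transition tensor, r: (S, A) rewards, gamma: discount factor.

def policy_q(P, r, gamma, pi):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi, then form Q."""
    S, A, _ = P.shape
    P_pi = np.einsum("sap,sa->sp", P, pi)            # state transitions under pi
    r_pi = np.einsum("sa,sa->s", r, pi)              # expected reward under pi
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return r + gamma * np.einsum("sap,p->sa", P, v)  # Q(s, a)

def pmd_step(pi, Q, eta):
    """Mirror descent in KL geometry: pi'(a|s) ~ pi(a|s) * exp(eta * Q(s, a))."""
    logits = np.log(pi) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))           # random MDP instance
r = rng.uniform(size=(S, A))
pi = np.full((S, A), 1.0 / A)                        # uniform initial policy
for k in range(200):
    pi = pmd_step(pi, policy_q(P, r, gamma, pi), eta=0.5 * (k + 1))
print(pi.round(3))  # near-deterministic policy after convergence
```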

Optimal linear prediction (aka kriging) of a random field $\{Z(x)\}_{x\in\mathcal{X}}$ indexed by a compact metric space $(\mathcal{X},d_{\mathcal{X}})$ can be obtained if the mean value function $m\colon\mathcal{X}\to\mathbb{R}$ and the covariance function $\varrho\colon\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ of $Z$ are known. We consider the problem of predicting the value of $Z(x^*)$ at some location $x^*\in\mathcal{X}$ based on observations at locations $\{x_j\}_{j=1}^n$ which accumulate at $x^*$ as $n\to\infty$ (or, more generally, predicting $\varphi(Z)$ based on $\{\varphi_j(Z)\}_{j=1}^n$ for linear functionals $\varphi,\varphi_1,\ldots,\varphi_n$). Our main result characterizes the asymptotic performance of linear predictors (as $n$ increases) based on an incorrect second order structure $(\tilde{m},\tilde{\varrho})$, without any restrictive assumptions on $\varrho,\tilde{\varrho}$ such as stationarity. For the first time, we provide necessary and sufficient conditions on $(\tilde{m},\tilde{\varrho})$ for asymptotic optimality of the corresponding linear predictor to hold uniformly with respect to $\varphi$. These general results are illustrated by weakly stationary random fields on $\mathcal{X}\subset\mathbb{R}^d$ with Mat\'ern or periodic covariance functions, and on the sphere $\mathcal{X}=\mathbb{S}^2$ for the case of two isotropic covariance functions.
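As a concrete reference point, a minimal simple-kriging sketch looks as follows; the Matérn-3/2 covariance, zero mean, and all names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def matern32(x, y, ell=0.3):
    """Matern-3/2 covariance on the real line with length scale ell."""
    d = np.abs(x[:, None] - y[None, :]) / ell
    return (1.0 + np.sqrt(3.0) * d) * np.exp(-np.sqrt(3.0) * d)

def kriging_predict(x_obs, z_obs, x_star, m, cov):
    """BLUP of Z(x*): m(x*) + k_*^T K^{-1} (z - m(x_obs))."""
    K = cov(x_obs, x_obs) + 1e-10 * np.eye(len(x_obs))  # jitter for stability
    k_star = cov(x_obs, np.atleast_1d(x_star))
    weights = np.linalg.solve(K, k_star).ravel()
    return m(x_star) + weights @ (z_obs - m(x_obs))

# Observation locations accumulating at x* = 0.5, as in the setting above.
x_star = 0.5
x_obs = x_star + 0.5 ** np.arange(1, 11)
rng = np.random.default_rng(1)
L = np.linalg.cholesky(matern32(x_obs, x_obs) + 1e-10 * np.eye(len(x_obs)))
z_obs = L @ rng.standard_normal(len(x_obs))             # zero-mean sample path values
print(kriging_predict(x_obs, z_obs, x_star, m=lambda x: 0.0 * x, cov=matern32))
```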

A theoretical, and potentially also practical, problem with stochastic gradient descent is that trajectories may escape to infinity. In this note, we investigate uniform boundedness properties of iterates and function values along the trajectories of the stochastic gradient descent algorithm and its important momentum variant. Under smoothness and $R$-dissipativity of the loss function, we show that broad families of step-sizes, including the widely used step-decay and cosine with (or without) restart step-sizes, result in uniformly bounded iterates and function values. Several important applications that satisfy these assumptions, including phase retrieval problems, Gaussian mixture models and some neural network classifiers, are discussed in detail.
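For reference, the two step-size families named above look as follows in code; the constants are illustrative assumptions.

```python
import math

def step_decay(t, eta0=0.1, drop=0.5, every=100):
    """Multiply the step size by `drop` every `every` iterations."""
    return eta0 * drop ** (t // every)

def cosine_with_restarts(t, eta0=0.1, period=100):
    """Cosine annealing from eta0 toward 0, restarting every `period` iterations."""
    return 0.5 * eta0 * (1.0 + math.cos(math.pi * (t % period) / period))

print([round(step_decay(t), 4) for t in (0, 99, 100, 250)])
print([round(cosine_with_restarts(t), 4) for t in (0, 50, 99, 100)])
```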

Software measurement is an essential management tool for developing robust and maintainable software systems. Software metrics can be used to control the inherent complexity of software design. To guarantee that the components of the software are testable, the testability attribute is used, which is a sub-characteristic of the software's maintainability as well as of its quality assurance. This study investigates the relationship between static code and test metrics on the one hand, and testability and test case effectiveness on the other. The study answers three formulated research questions. The results of the analysis show that size and complexity metrics are suitable for predicting the testability of object-oriented classes.

A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data. But far away from them, ReLU Bayesian neural networks (BNNs) can still underestimate uncertainty and thus be asymptotically overconfident. This issue arises since the output variance of a BNN with finitely many features is quadratic in the distance from the data region. Meanwhile, Bayesian linear models with ReLU features converge, in the infinite-width limit, to a particular Gaussian process (GP) with a variance that grows cubically so that no asymptotic overconfidence can occur. While this may seem of mostly theoretical interest, in this work, we show that it can be used in practice to the benefit of BNNs. We extend finite ReLU BNNs with infinite ReLU features via the GP and show that the resulting model is asymptotically maximally uncertain far away from the data while the BNNs' predictive power is unaffected near the data. Although the resulting model approximates a full GP posterior, thanks to its structure, it can be applied \emph{post-hoc} to any pre-trained ReLU BNN at a low cost.
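One concrete handle on the infinite-width ReLU limit is the first-order arc-cosine kernel of Cho and Saul (2009), the GP covariance induced by infinitely many ReLU features; the sketch below computes it, though whether it matches the exact GP construction of the paper is an assumption on our part.

```python
import numpy as np

def arccos1_kernel(X, Y):
    """k(x,y) = (1/pi) * |x||y| * (sin t + (pi - t) cos t), t = angle(x, y)."""
    nx = np.linalg.norm(X, axis=1)
    ny = np.linalg.norm(Y, axis=1)
    cos_t = np.clip((X @ Y.T) / np.outer(nx, ny), -1.0, 1.0)
    t = np.arccos(cos_t)
    return np.outer(nx, ny) * (np.sin(t) + (np.pi - t) * np.cos(t)) / np.pi

X = np.array([[1.0, 0.0], [10.0, 0.0], [100.0, 0.0]])
print(np.diag(arccos1_kernel(X, X)))  # prior variance keeps growing with distance
```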

We study synchronous Q-learning with Polyak-Ruppert averaging (a.k.a. averaged Q-learning) in a $\gamma$-discounted MDP. We establish a functional central limit theorem (FCLT) for the averaged iterate $\bar{\boldsymbol{Q}}_T$ and show that its standardized partial-sum process converges weakly to a rescaled Brownian motion. Furthermore, we show that $\bar{\boldsymbol{Q}}_T$ is in fact a regular asymptotically linear (RAL) estimator for the optimal Q-value function $\boldsymbol{Q}^*$ with the most efficient influence function. This implies that the averaged Q-learning iterate has the smallest asymptotic variance among all RAL estimators. In addition, we present a non-asymptotic analysis of the $\ell_{\infty}$ error $\mathbb{E}\|\bar{\boldsymbol{Q}}_T-\boldsymbol{Q}^*\|_{\infty}$, showing that for polynomial step sizes it matches the instance-dependent lower bound as well as the optimal minimax complexity lower bound. In short, our theoretical analysis shows that averaged Q-learning is statistically efficient.
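A minimal sketch of the procedure being analyzed, synchronous Q-learning with a running Polyak-Ruppert average under a generative model, is given below; the MDP instance and the polynomial step-size exponent are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
S, A, gamma, T = 4, 2, 0.9, 5000
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over s'
r = rng.uniform(size=(S, A))

Q = np.zeros((S, A))
Q_bar = np.zeros((S, A))
for t in range(1, T + 1):
    # Synchronous update: one sampled next state for every (s, a) pair.
    s_next = np.array([[rng.choice(S, p=P[s, a]) for a in range(A)]
                       for s in range(S)])
    target = r + gamma * Q.max(axis=1)[s_next]   # empirical Bellman target
    eta = 1.0 / t ** 0.7                          # polynomial step size
    Q += eta * (target - Q)
    Q_bar += (Q - Q_bar) / t                      # running Polyak-Ruppert average
print(Q_bar.round(3))
```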

We present and investigate a new type of implicit fractional linear multistep method of order two for fractional initial value problems. The method is obtained from the second-order superconvergence of the Gr\"unwald-Letnikov approximation of the fractional derivative at a non-integer shift point. The proposed method is consistent of order two and coincides with the backward difference method of order two for classical initial value problems when the order of the derivative is one. The weight coefficients of the proposed method are obtained from the Gr\"unwald weights and are hence cheaper to compute than those of the fractional backward difference formula of order two. The stability properties are analyzed, and it is shown that the stability region of the method is larger than those of the fractional Adams-Moulton method of order two and the fractional trapezoidal method. Numerical results and illustrations are presented to justify the analytical theory.
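The Grünwald weights mentioned above are the binomial coefficients of $(1-z)^\alpha$ and satisfy a one-term recurrence, which is what makes them cheap to compute; a sketch follows (the paper's shifted second-order scheme additionally evaluates at a non-integer shift point, which is not reproduced here).

```python
def gl_weights(alpha, n):
    """Grunwald-Letnikov weights: w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k)."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

print([round(x, 4) for x in gl_weights(0.5, 5)])
# First-order GL approximation: D^alpha f(t) ~ h^{-alpha} * sum_k w_k f(t - k*h)
```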

We study orbit-finite systems of linear equations in the setting of sets with atoms. Our principal contribution is a decision procedure for the solvability of such systems. The procedure works over every field (and even every commutative ring) under mild effectiveness assumptions, and reduces a given orbit-finite system to a number of finite ones: exponentially many in general, but polynomially many when the atom dimension of the input systems is fixed. Towards obtaining the procedure we push further the theory of vector spaces generated by orbit-finite sets, and show that every such vector space admits an orbit-finite basis. This fundamental property is a key tool in our development, but should also be of wider interest.

We analyze the orthogonal greedy algorithm when applied to dictionaries $\mathbb{D}$ whose convex hull has small entropy. We show that if the metric entropy of the convex hull of $\mathbb{D}$ decays at a rate of $O(n^{-\frac{1}{2}-\alpha})$ for $\alpha > 0$, then the orthogonal greedy algorithm converges at the same rate on the variation space of $\mathbb{D}$. This improves upon the well-known $O(n^{-\frac{1}{2}})$ convergence rate of the orthogonal greedy algorithm in many cases, most notably for dictionaries corresponding to shallow neural networks. These results hold under no additional assumptions on the dictionary beyond the decay rate of the entropy of its convex hull. In addition, they are robust to noise in the target function and can be extended to convergence rates on the interpolation spaces of the variation norm. We show empirically that the predicted rates are obtained for the dictionary corresponding to shallow neural networks with Heaviside activation function in two dimensions. Finally, we show that these improved rates are sharp and prove a negative result showing that the iterates generated by the orthogonal greedy algorithm cannot in general be bounded in the variation norm of $\mathbb{D}$.
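For concreteness, here is a minimal sketch of the orthogonal greedy algorithm (a.k.a. orthogonal matching pursuit) over a finite dictionary; the Gaussian dictionary and sparse target are illustrative assumptions.

```python
import numpy as np

def oga(D, f, n_iter):
    """Greedily pick the column most correlated with the residual, then
    re-project f orthogonally onto the span of all selected columns."""
    selected = []
    approx = np.zeros_like(f)
    for _ in range(n_iter):
        residual = f - approx
        selected.append(int(np.argmax(np.abs(D.T @ residual))))
        B = D[:, selected]
        coef, *_ = np.linalg.lstsq(B, f, rcond=None)   # orthogonal projection
        approx = B @ coef
    return approx, selected

rng = np.random.default_rng(3)
D = rng.standard_normal((100, 50))
D /= np.linalg.norm(D, axis=0)                          # normalized dictionary
f = D[:, :3] @ np.array([1.0, -2.0, 0.5])               # sparse target
approx, idx = oga(D, f, n_iter=3)
print(idx, np.linalg.norm(f - approx))
```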

Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture the user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite being extensively studied, these attentions still suffer from two limitations. First, conventional attentions mostly limit the attention field to a single user's behaviors, which is not suitable in e-commerce, where users often hunt for new demands that are irrelevant to any historical behaviors. Second, these attentions are usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle these two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that treats the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a prior, KFAtt resorts to global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both a benchmark and a 10-billion-scale real production dataset, together with an online A/B test, show that KFAtt outperforms all compared state-of-the-art methods. KFAtt has been deployed in the ranking system of a leading e-commerce website, serving the main traffic of hundreds of millions of active users every day.
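To illustrate the pooling-as-MAP idea, the sketch below computes the posterior mean of a Gaussian quantity when a global prior is combined with attention-weighted per-behavior evidence; this mirrors the mechanism described above, but the exact KFAtt formulation in the paper may differ, and all names and constants here are hypothetical.

```python
import numpy as np

def map_pooling(values, attn, mu0, sigma0_sq, sigma_sq):
    """Posterior mean = precision-weighted average of prior and evidence,
    where each behavior i contributes with effective precision attn[i]/sigma^2."""
    precision = 1.0 / sigma0_sq + attn.sum() / sigma_sq
    evidence = mu0 / sigma0_sq + (attn[:, None] * values).sum(axis=0) / sigma_sq
    return evidence / precision

values = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])  # behavior embeddings
attn = np.array([0.9, 0.8, 0.01])                        # query relevance scores
print(map_pooling(values, attn, mu0=np.zeros(2), sigma0_sq=1.0, sigma_sq=0.5))
# With near-zero attention mass, the output falls back toward the prior mu0,
# i.e., the global statistics mentioned above.
```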
