When a finite-order vector autoregressive model is fitted to VAR($\infty$) data, the asymptotic distribution of statistics obtained via smooth functions of least-squares estimates requires care. Lütkepohl and Poskitt (1991) provide a closed-form expression for the limiting distribution of (structural) impulse responses for sieve VAR models based on the Delta method. Yet numerical simulations have shown that confidence intervals built in this way appear overly conservative. In this note I argue that these results stem naturally from the limit arguments used in Lütkepohl and Poskitt (1991), that they manifest when sieve inference is improperly applied, and that they can be "remedied" either by using bootstrap resampling or, simply, by using standard (non-sieve) asymptotics.
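To make the "smooth functions of least-squares estimates" concrete, here is a minimal sketch of how reduced-form impulse responses are computed from the coefficient matrices of a fitted VAR($p$), using the standard recursion $\Phi_0 = I$, $\Phi_i = \sum_{j=1}^{\min(i,p)} \Phi_{i-j} A_j$. The function names are hypothetical and the coefficients are illustrative, not taken from the note. Structural responses would need an additional identification step (for example a Cholesky factor of the error covariance), which is omitted here.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Add two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def impulse_responses(coefs, horizon):
    """Return [Phi_0, ..., Phi_horizon] for the VAR(p)
    y_t = A_1 y_{t-1} + ... + A_p y_{t-p} + u_t,
    where coefs = [A_1, ..., A_p] (assumed already estimated)."""
    k = len(coefs[0])
    identity = [[1.0 if r == c else 0.0 for c in range(k)] for r in range(k)]
    phis = [identity]
    for i in range(1, horizon + 1):
        acc = [[0.0] * k for _ in range(k)]
        # Phi_i = sum over j = 1..min(i, p) of Phi_{i-j} A_j
        for j in range(1, min(i, len(coefs)) + 1):
            acc = mat_add(acc, mat_mul(phis[i - j], coefs[j - 1]))
        phis.append(acc)
    return phis


# Illustrative VAR(1): the responses are simply powers of A_1.
A1 = [[0.5, 0.0], [0.0, 0.25]]
print(impulse_responses([A1], 2)[2])
```

Because each $\Phi_i$ is a polynomial in the estimated $A_j$, the Delta method applies to it; the note's point concerns which limit arguments are used when $p$ grows with the sample size.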

Related content

Estimation/estimator

Estimation/estimator · Random sampling · Vectorization · Autoregressive process · State transition matrix

June 17, 2021

Minimax Estimation of Partially-Observed Vector AutoRegressions

Guillaume Dalle,Yohann de Castro

To understand the behavior of large dynamical systems like transportation networks, one must often rely on measurements transmitted by a set of sensors, for instance individual vehicles. Such measurements are likely to be incomplete and imprecise, which makes it hard to recover the underlying signal of interest. Hoping to quantify this phenomenon, we study the properties of a partially-observed state-space model. In our setting, the latent state $X$ follows a high-dimensional Vector AutoRegressive process $X_t = \theta X_{t-1} + \varepsilon_t$. Meanwhile, the observations $Y$ are given by a noise-corrupted random sample from the state $Y_t = \Pi_t X_t + \eta_t$. Several random sampling mechanisms are studied, allowing us to investigate the effect of spatial and temporal correlations in the distribution of the sampling matrices $\Pi_t$. We first prove a lower bound on the minimax estimation error for the transition matrix $\theta$. We then describe a sparse estimator based on the Dantzig selector and upper bound its non-asymptotic error, showing that it achieves the optimal convergence rate for most of our sampling mechanisms. Numerical experiments on simulated time series validate our theoretical findings, while an application to open railway data highlights the relevance of this model for public transport traffic analysis.
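The state-space model in this abstract can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: $\Pi_t$ is taken to be a diagonal 0/1 matrix whose entries are independent Bernoulli draws (one of many possible sampling mechanisms, chosen here for simplicity), the noises are Gaussian, and the function name and parameters are hypothetical.

```python
import random


def simulate_partially_observed_var(theta, steps, p_obs, sigma_eps, sigma_eta, seed=0):
    """Simulate X_t = theta X_{t-1} + eps_t and Y_t = Pi_t X_t + eta_t.

    Assumption: Pi_t is diagonal with independent Bernoulli(p_obs) entries,
    stored as a 0/1 mask per time step. The state starts at zero.
    """
    rng = random.Random(seed)
    d = len(theta)
    x = [0.0] * d
    states, observations, masks = [], [], []
    for _ in range(steps):
        # Latent VAR(1) transition with Gaussian innovation noise.
        x = [sum(theta[i][j] * x[j] for j in range(d)) + rng.gauss(0.0, sigma_eps)
             for i in range(d)]
        # Random sampling: each coordinate is observed with probability p_obs.
        mask = [1 if rng.random() < p_obs else 0 for _ in range(d)]
        # Noise-corrupted observation of the sampled coordinates.
        y = [mask[i] * x[i] + rng.gauss(0.0, sigma_eta) for i in range(d)]
        states.append(x)
        observations.append(y)
        masks.append(mask)
    return states, observations, masks


theta = [[0.5, 0.1], [0.0, 0.3]]  # illustrative transition matrix
X, Y, M = simulate_partially_observed_var(theta, 5, 0.5, 1.0, 0.1, seed=1)
for mask, y in zip(M, Y):
    print(mask, [round(v, 3) for v in y])
```

Spatially or temporally correlated sampling, which the abstract also studies, would replace the independent mask draw with a dependent one.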

MoDELS · Learning · Optimizer · Sample complexity · INTERACT

June 16, 2021

Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models

Enric Boix-Adsera,Guy Bresler,Frederic Koehler

from arxiv, 49 pages, 3 figures

We consider the problem of learning a tree-structured Ising model from data, such that subsequent predictions computed using the model are accurate. Concretely, we aim to learn a model such that posteriors $P(X_i|X_S)$ for small sets of variables $S$ are accurate. Since its introduction more than 50 years ago, the Chow-Liu algorithm, which efficiently computes the maximum likelihood tree, has been the benchmark algorithm for learning tree-structured graphical models. A bound on the sample complexity of the Chow-Liu algorithm with respect to the prediction-centric local total variation loss was shown in [BK19]. While those results demonstrated that it is possible to learn a useful model even when recovering the true underlying graph is impossible, their bound depends on the maximum strength of interactions and thus does not achieve the information-theoretic optimum. In this paper, we introduce a new algorithm that carefully combines elements of the Chow-Liu algorithm with tree metric reconstruction methods to efficiently and optimally learn tree Ising models under a prediction-centric loss. Our algorithm is robust to model misspecification and adversarial corruptions. In contrast, we show that the celebrated Chow-Liu algorithm can be arbitrarily suboptimal.
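For reference, the classical Chow-Liu algorithm that this abstract uses as its baseline can be sketched as follows: estimate the mutual information between every pair of variables, then take a maximum-weight spanning tree. This is the textbook baseline, not the Chow-Liu++ algorithm introduced in the paper; the function names and the toy dataset are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information between coordinates i and j."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    return sum((c / n) * math.log(c * n / (marg_i[a] * marg_j[b]))
               for (a, b), c in joint.items())


def chow_liu_tree(samples):
    """Maximum-weight spanning tree on pairwise mutual information (Kruskal)."""
    d = len(samples[0])
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(d), 2)), reverse=True)
    parent = list(range(d))

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


def chain_samples():
    """Toy +/-1 data from a chain X0 - X1 - X2 with flip frequency 0.1 per edge."""
    samples = []
    for x0 in (-1, 1):
        for flip1 in (False, True):
            for flip2 in (False, True):
                x1 = -x0 if flip1 else x0
                x2 = -x1 if flip2 else x1
                count = (1 if flip1 else 9) * (1 if flip2 else 9)
                samples += [(x0, x1, x2)] * count
    return samples


print(sorted(chow_liu_tree(chain_samples())))
```

On the toy chain the two true edges carry more mutual information than the pair $(X_0, X_2)$, so the spanning tree recovers the chain. The abstract's point is that when interactions are very strong or very weak this structure recovery can fail, and that good prediction is still achievable with a different algorithm.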

Markov chain · Estimation/estimator · Kernelization · Oscillation · Identifiable

June 16, 2021

Central limit theorem for kernel estimator of invariant density in bifurcating Markov chains models

S. Valère Bitseki Penda,Jean-François Delmas

from arxiv, 40 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:2012.04741; text overlap with arXiv:2106.07711

Bifurcating Markov chains (BMC) are Markov chains indexed by a full binary tree representing the evolution of a trait along a population where each individual has two children. Motivated by the functional estimation of the density of the invariant probability measure which appears as the asymptotic distribution of the trait, we prove the consistency and the Gaussian fluctuations for a kernel estimator of this density based on late generations. In this setting, it is interesting to note that the distinction between the three regimes on the ergodic rate identified in a previous work (for fluctuations of averages over large generations) disappears. This result is a first step to go beyond the threshold condition on the ergodic rate given in previous statistical papers on functional estimation.

Estimation/estimator · Minimum point · Estimation error · Reducible · Constraint

June 16, 2021

Breaking The Dimension Dependence in Sparse Distribution Estimation under Communication Constraints

Wei-Ning Chen,Peter Kairouz,Ayfer Özgür

We consider the problem of estimating a $d$-dimensional $s$-sparse discrete distribution from its samples observed under a $b$-bit communication constraint. The best-known previous result on $\ell_2$ estimation error for this problem is $O\left( \frac{s\log\left( {d}/{s}\right)}{n2^b}\right)$. Surprisingly, we show that when sample size $n$ exceeds a minimum threshold $n^*(s, d, b)$, we can achieve an $\ell_2$ estimation error of $O\left( \frac{s}{n2^b}\right)$. This implies that when $n>n^*(s, d, b)$ the convergence rate does not depend on the ambient dimension $d$ and is the same as knowing the support of the distribution beforehand. We next ask the question: ``what is the minimum $n^*(s, d, b)$ that allows dimension-free convergence?''. To upper bound $n^*(s, d, b)$, we develop novel localization schemes to accurately and efficiently localize the unknown support. For the non-interactive setting, we show that $n^*(s, d, b) = O\left( \min \left( {d^2\log^2 d}/{2^b}, {s^4\log^2 d}/{2^b}\right) \right)$. Moreover, we connect the problem with non-adaptive group testing and obtain a polynomial-time estimation scheme when $n = \tilde{\Omega}\left({s^4\log^4 d}/{2^b}\right)$. This group testing based scheme is adaptive to the sparsity parameter $s$, and hence can be applied without knowing it. For the interactive setting, we propose a novel tree-based estimation scheme and show that the minimum sample-size needed to achieve dimension-free convergence can be further reduced to $n^*(s, d, b) = \tilde{O}\left( {s^2\log^2 d}/{2^b} \right)$.
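As a rough worked comparison of the two $\ell_2$ error bounds above, and assuming the constants hidden in the two $O(\cdot)$ terms are comparable (an assumption, since the abstract does not state them), the ratio of the previous bound to the new one is

$$\frac{s\log(d/s)/(n2^b)}{s/(n2^b)} = \log\left(\frac{d}{s}\right).$$

For illustrative values $d = 10^6$ and $s = 10$ (not taken from the paper), this is $\ln(10^5) \approx 11.5$ with a natural logarithm, so the improvement is a logarithmic factor in the ambient dimension that applies only once $n > n^*(s, d, b)$.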

Estimation/estimator · Optimizer · Weight · Design · Continuity

June 16, 2021

Optimal sampling for design-based estimators of regression models

Tong Chen,Thomas Lumley

Two-phase designs measure variables of interest on a subcohort where the outcome and covariates are readily available or cheap to collect on all individuals in the cohort. Given limited resource availability, it is of interest to find an optimal design that includes more informative individuals in the final sample. We explore the optimal designs and efficiencies for analysis by design-based estimators. Generalized raking is an efficient design-based estimator that improves on the inverse-probability weighted (IPW) estimator by adjusting weights based on the auxiliary information. We derive a closed-form solution of the optimal design for estimating regression coefficients from generalized raking estimators. We compare it with the optimal design for analysis via the IPW estimator and other two-phase designs in measurement-error settings. We consider general two-phase designs where the outcome variable and variables of interest can be continuous or discrete. Our results show that the optimal designs for analysis by the two design-based estimators can be very different. The optimal design for IPW estimation is optimal for analysis via the IPW estimator and typically gives near-optimal efficiency for generalized raking, though we show there is potential improvement in some settings.

Pair · Scenario · Basis · Discrete mathematics

June 15, 2021

On Dualization over Distributive Lattices

Khaled Elbassioni

Given a partially ordered set (poset) $P$, and a pair of families of ideals $\cI$ and filters $\cF$ in $P$ such that each pair $(I,F)\in \cI\times\cF$ has a non-empty intersection, the dualization problem over $P$ is to check whether there is an ideal $X$ in $P$ which intersects every member of $\cF$ and does not contain any member of $\cI$. Equivalently, the problem is to check for a distributive lattice $L=L(P)$, given by the poset $P$ of its set of join-irreducibles, and two given antichains $\cA,\cB\subseteq L$ such that no $a\in\cA$ is dominated by any $b\in\cB$, whether $\cA$ and $\cB$ cover (by domination) the entire lattice. We show that the problem can be solved in quasi-polynomial time in the sizes of $P$, $\cA$ and $\cB$, thus answering an open question in \cite{BK17}. As an application, we show that minimal infrequent closed sets of attributes in a relational database, with respect to a given implication base of maximum premise size of one, can be enumerated in incremental quasi-polynomial time.

Estimation/estimator · Local maximum · Extensibility · Reducible · Processing (programming language)

June 15, 2021

Estimating Atmospheric Motion Winds from Satellite Image Data using Space-time Drift Models

Indranil Sahoo,Joseph Guinness,Brian J. Reich

Geostationary satellites collect high-resolution weather data comprising a series of images which can be used to estimate wind speed and direction at different altitudes. The Derived Motion Winds (DMW) Algorithm is commonly used to process these data and estimate atmospheric winds by tracking features in images taken by the GOES-R series of the NOAA geostationary meteorological satellites. However, the wind estimates from the DMW Algorithm are sparse and do not come with uncertainty measures. This motivates us to statistically model wind motions as a spatial process drifting in time. We propose a covariance function that depends on spatial and temporal lags and a drift parameter to capture the wind speed and wind direction. We estimate the parameters by local maximum likelihood. Our method allows us to compute standard errors of the estimates, enabling spatial smoothing of the estimates using a Gaussian kernel weighted by the inverses of the estimated variances. We conduct extensive simulation studies to determine the situations where our method performs well. The proposed method is applied to the GOES-15 brightness temperature data over Colorado and reduces prediction error of brightness temperature compared to the DMW Algorithm.

Estimation/estimator · Latent variable · Continuity · Reparameterization · Discretization

June 15, 2021

Coupled Gradient Estimators for Discrete Latent Variables

Zhe Dong,Andriy Mnih,George Tucker

from arxiv, Under Review

Training models with discrete latent variables is challenging due to the high variance of unbiased gradient estimators. While low-variance reparameterization gradients of a continuous relaxation can provide an effective solution, a continuous relaxation is not always available or tractable. Dong et al. (2020) and Yin et al. (2020) introduced a performant estimator that does not rely on continuous relaxations; however, it is limited to binary random variables. We introduce a novel derivation of their estimator based on importance sampling and statistical couplings, which we extend to the categorical setting. Motivated by the construction of a stick-breaking coupling, we introduce gradient estimators based on reparameterizing categorical variables as sequences of binary variables and Rao-Blackwellization. In systematic experiments, we show that our proposed categorical gradient estimators provide state-of-the-art performance, whereas even with additional Rao-Blackwellization, previous estimators (Yin et al., 2019) underperform a simpler REINFORCE with a leave-one-out-baseline estimator (Kool et al., 2019).
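The REINFORCE leave-one-out baseline estimator that this abstract uses as a point of comparison can be sketched for a single Bernoulli variable as below. This is the simple baseline attributed to Kool et al. (2019), not the coupled estimators proposed in the paper; the function name, the logit parameterization, and the test function `f` are assumptions made for illustration.

```python
import math
import random


def reinforce_loo_gradient(f, logit, num_samples, rng):
    """Estimate d/d(logit) of E[f(x)] for x ~ Bernoulli(sigmoid(logit)).

    Each sample uses the mean of f over the *other* samples as its baseline,
    which keeps the estimator unbiased while reducing variance.
    Requires num_samples >= 2.
    """
    theta = 1.0 / (1.0 + math.exp(-logit))
    xs = [1 if rng.random() < theta else 0 for _ in range(num_samples)]
    fs = [f(x) for x in xs]
    total = sum(fs)
    grad = 0.0
    for x, fx in zip(xs, fs):
        baseline = (total - fx) / (num_samples - 1)  # leave-one-out mean
        score = x - theta  # d/d(logit) of log p(x)
        grad += (fx - baseline) * score
    return grad / num_samples


rng = random.Random(0)
f = lambda x: (x - 0.45) ** 2  # illustrative objective
runs = 20000
estimate = sum(reinforce_loo_gradient(f, 0.0, 4, rng) for _ in range(runs)) / runs
exact = 0.5 * 0.5 * (f(1) - f(0))  # theta (1 - theta) (f(1) - f(0))
print(estimate, exact)
```

Averaged over many runs, the estimate should be close to the exact gradient $\theta(1-\theta)(f(1)-f(0))$, which is available in closed form only because the variable is binary.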

Rank · MoDELS · Optimizer · Singular value decomposition · Column

October 18, 2018

Testing Matrix Rank, Optimally

Maria-Florina Balcan,Yi Li,David P. Woodruff,Hongyang Zhang

from arxiv, 51 pages. To appear in SODA 2019

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.

Likelihood · Estimation/estimator · Maximum likelihood estimation · Maximum likelihood · MoDELS

September 24, 2018

Implicit Maximum Likelihood Estimation

Ke Li,Jitendra Malik

from arxiv, 21 pages, 4 figures. In the interest of promoting discussion, we make the reviews available at //people.eecs.berkeley.edu/~ke.li/papers/imle_reviews.pdf

Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.