
Online statistical inference facilitates real-time analysis of sequentially collected data, in contrast to traditional methods that rely on static datasets. This paper introduces a novel approach to online inference in high-dimensional generalized linear models, in which we update regression coefficient estimates and their standard errors upon each new data arrival. Unlike existing methods that either require access to the full dataset or the storage of large-dimensional summary statistics, our method operates in a single-pass mode, significantly reducing both time and space complexity. The core of our methodological innovation is an adaptive stochastic gradient descent algorithm tailored for dynamic objective functions, coupled with a novel online debiasing procedure. This allows us to maintain low-dimensional summary statistics while effectively controlling the optimization errors introduced by the dynamically changing loss functions. We show that our method, termed the Approximated Debiased Lasso (ADL), not only removes the need for the bounded individual probability condition but also significantly improves numerical performance. Numerical experiments demonstrate that the proposed ADL method consistently exhibits robust performance across various covariance matrix structures.
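As a rough illustration of the single-pass idea, the sketch below runs proximal stochastic gradient descent with soft thresholding for an $\ell_1$-penalized logistic regression, touching each observation exactly once. It is not the authors' ADL procedure (in particular, the adaptive step sizes and the online debiasing step are omitted); the function names and the toy data stream are purely illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def online_lasso_logistic(stream, p, lam=0.1, eta0=1.0):
    """Single-pass proximal SGD for l1-penalized logistic regression.

    Each (x, y) pair is processed once and then discarded, so only the
    current coefficient vector is kept in memory.
    """
    beta = np.zeros(p)
    for t, (x, y) in enumerate(stream, start=1):
        eta = eta0 / np.sqrt(t)                   # decaying step size
        mu = 1.0 / (1.0 + np.exp(-x @ beta))      # predicted probability
        grad = (mu - y) * x                       # per-observation gradient
        beta = soft_threshold(beta - eta * grad, eta * lam)
    return beta

# toy usage: a stream of 1000 observations with 50 covariates, 3 of them active
rng = np.random.default_rng(0)
p = 50
beta_true = np.zeros(p)
beta_true[:3] = 1.0

def toy_stream():
    for _ in range(1000):
        x = rng.normal(size=p)
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-x @ beta_true)))
        yield x, y

beta_hat = online_lasso_logistic(toy_stream(), p)
```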

Related Content

We introduce a new observational setting for Positive Unlabeled (PU) data in which the observations at prediction time are also labeled. This occurs commonly in practice -- we argue that the additional information is important for prediction, and call this task "augmented PU prediction". We allow the labeling to be feature dependent. In this scenario, the Bayes classifier and its risk are established and compared with the risk of a classifier which, for unlabeled data, is based only on the predictors. We introduce several variants of the empirical Bayes rule in this scenario and investigate their performance. We emphasise the dangers (and the ease) of applying the classical classification rule in the augmented PU scenario -- since there are no preexisting studies, an unaware researcher is prone to skewing the obtained predictions. We conclude that the variant based on a recently proposed variational autoencoder designed for the PU scenario works on par with or better than the other considered variants and yields an advantage over feature-only based methods in terms of accuracy for unlabeled samples.
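To make the augmented setting concrete, the following sketch evaluates the Bayes rule when the label indicator S is also observed at prediction time, under the usual PU convention that only positives can be labeled and with a feature-dependent labeling propensity e(x) = P(S=1 | Y=1, X=x). The posterior P(Y=1 | X) and the propensity are assumed to be given; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def augmented_pu_posterior(posterior_y, propensity, s):
    """Posterior P(Y=1 | X, S) for augmented PU prediction.

    Assumes only positives can be labeled (S=1 implies Y=1) and
    feature-dependent labeling with propensity e(x) = P(S=1 | Y=1, X=x).
    `posterior_y` holds P(Y=1 | X=x); `s` is the observed label indicator.
    """
    posterior_y = np.asarray(posterior_y, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    s = np.asarray(s)
    # labeled instances are positive with probability one
    labeled = np.ones_like(posterior_y)
    # unlabeled instances: condition on S = 0
    unlabeled = posterior_y * (1 - propensity) / (1 - posterior_y * propensity)
    return np.where(s == 1, labeled, unlabeled)

def augmented_bayes_rule(posterior_y, propensity, s, threshold=0.5):
    """Classify by thresholding the augmented posterior at 1/2."""
    return (augmented_pu_posterior(posterior_y, propensity, s) >= threshold).astype(int)
```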

In contemporary problems involving genetic or neuroimaging data, thousands of hypotheses need to be tested. Due to their high power and finite-sample guarantees on type I error under weak assumptions, Monte Carlo permutation tests are often considered the gold standard in these settings. However, the enormous computational effort required for (thousands of) permutation tests is a major burden. Recently, Fischer and Ramdas (2024) constructed a permutation test for a single hypothesis in which the permutations are drawn sequentially one-by-one and the testing process can be stopped at any point without inflating the type I error. They showed that the number of permutations can be substantially reduced (under the null and the alternative) while the power remains similar. We show how their approach can be modified to make it suitable for a broad class of multiple testing procedures. In particular, we discuss its use with the Benjamini-Hochberg procedure and illustrate the application on a large dataset.
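The following sketch illustrates the general idea of drawing permutations one by one and stopping early, here with a classical Besag-Clifford style stopping rule rather than the exact Fischer-Ramdas construction, and then feeding the resulting p-values into the Benjamini-Hochberg procedure. `stat` and `perm_stat` are assumed, user-supplied callables that compute the observed statistic and the statistic under one random permutation.

```python
import numpy as np

def sequential_perm_pvalue(stat, data, perm_stat, h=10, n_max=1000, rng=None):
    """Sequential Monte Carlo permutation p-value with early stopping.

    Permutations are drawn one by one; sampling stops as soon as `h`
    permuted statistics reach the observed one, so "null-like" hypotheses
    need far fewer permutations (Besag-Clifford style rule).
    """
    rng = np.random.default_rng(rng)
    observed = stat(data)
    exceed = 0
    for l in range(1, n_max + 1):
        if perm_stat(data, rng) >= observed:
            exceed += 1
            if exceed == h:
                return h / l                 # stopped early: valid p-value
    return (exceed + 1) / (n_max + 1)        # all permutations used

def benjamini_hochberg(pvals, alpha=0.1):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.nonzero(pvals[order] <= thresholds)[0]
    if below.size == 0:
        return np.array([], dtype=int)
    return order[: below.max() + 1]
```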

In the realm of tensor optimization, the low-rank Tucker decomposition is crucial for reducing the number of parameters and saving storage. We explore the geometry of Tucker tensor varieties -- the set of tensors with bounded Tucker rank -- which is notably more intricate than the well-explored matrix varieties. We give an explicit parametrization of the tangent cone of Tucker tensor varieties and leverage its geometry to develop provable gradient-related line-search methods for optimization on Tucker tensor varieties. To the best of our knowledge, this is the first work concerning geometry and optimization on Tucker tensor varieties. In practice, low-rank tensor optimization suffers from the difficulty of choosing a reliable rank parameter. To this end, we build on the established geometry and propose a Tucker rank-adaptive method that aims to identify an appropriate rank with guaranteed convergence. Numerical experiments on tensor completion reveal that the proposed methods compare favorably with other state-of-the-art methods in terms of recovery performance. The rank-adaptive method performs best across various rank parameter selections and is indeed able to find an appropriate rank.
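As a minimal illustration of working with bounded Tucker rank, the sketch below projects a tensor onto a Tucker tensor variety via the truncated higher-order SVD. The tangent-cone parametrization, the gradient-related line search, and the rank-adaptive scheme of the paper are not reproduced here; the code only shows what a point of multilinear rank at most `ranks` looks like.

```python
import numpy as np

def truncated_hosvd(tensor, ranks):
    """Truncated higher-order SVD: a point on the Tucker tensor variety of
    multilinear rank at most `ranks`, returned as (core, factors)."""
    factors = []
    for mode, r in enumerate(ranks):
        # mode-k unfolding of the tensor
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(u[:, :r])                 # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        # mode-k product with the transposed factor: core x_k U_k^T
        core = np.moveaxis(
            np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors
```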

Recently, data depth has been widely used to rank multivariate data. The study of the depth-based $Q$ statistic, originally proposed by Liu and Singh (1993), has become increasingly popular since it can be used as a quality index to differentiate between two samples. Building on the existing theoretical foundations, more and more variants have been developed to increase power in the two-sample test. However, the asymptotic expansion of the $Q$ statistic in the important foundational work of Zuo and He (2006) currently attains a rate of $m^{-3/4}$, slower than the target rate $m^{-1}$, which limits the higher-order expansions needed to develop more powerful tests. We revisit the existing assumptions and add two new plausible assumptions to obtain the target rate, applying a new proof method based on the Hoeffding decomposition and the Cox-Reid expansion. The aim of this paper is to rekindle interest in asymptotic data depth theory, to place $Q$-statistical inference on a firmer theoretical basis, to survey its variants in current research, to open the door to new theories for further variants requiring higher-order expansions, and to explore more of its potential applications.
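For concreteness, a minimal sketch of the empirical Liu-Singh $Q$ statistic is given below, using the Mahalanobis depth as a convenient affine-invariant depth function; any other data depth could be substituted. The function names are illustrative.

```python
import numpy as np

def mahalanobis_depth(points, sample):
    """Mahalanobis depth of `points` with respect to `sample`."""
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', points - mu, cov_inv, points - mu)
    return 1.0 / (1.0 + d2)

def q_statistic(x, y):
    """Empirical Q(F_m, G_n) = P{D(X; F) <= D(Y; F)}, X ~ F, Y ~ G.

    `x` is the reference sample (F), `y` the second sample (G); the average
    is taken over all m * n pairs.
    """
    dx = mahalanobis_depth(x, x)     # depths of the reference sample
    dy = mahalanobis_depth(y, x)     # depths of the second sample w.r.t. F
    return np.mean(dx[:, None] <= dy[None, :])
```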

An analytical characterization of thermodynamically rational agent behaviour is obtained for a simple, yet non-trivial example of a "Maxwell's demon" operating with partial information. Our results provide the first fully transparent physical understanding of a decision problem under uncertainty.

Nonparametric tests for functional data are a challenging class of tests to work with because of the potentially high-dimensional nature of the data. One of the main challenges in considering rank-based tests, like the Mann-Whitney or Wilcoxon Rank Sum tests (MWW), is that the unit of observation is typically a curve. Thus any rank-based test must consider ways of ranking curves. While several procedures, including depth-based methods, have recently been used to create scores for rank-based tests, these scores are not constructed under the null and often introduce additional variability that is not controlled for. We therefore reconsider the problem of rank-based tests for functional data and develop an alternative approach that incorporates the null hypothesis throughout. Our approach first ranks realizations from the curves at each measurement occasion, then calculates a summary statistic for the ranks of each subject, and finally re-ranks the summary statistic in a procedure we refer to as a doubly ranked test. We propose two summaries for the middle step: a sufficient statistic and the average rank. As we demonstrate, doubly ranked tests are more powerful while maintaining ideal type I error in the two-sample, MWW setting. We also extend our framework to more than two samples, developing a Kruskal-Wallis test for functional data that exhibits good test characteristics as well. Finally, we illustrate the use of doubly ranked tests in functional data contexts from material science, climatology, and public health policy.
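A minimal sketch of the "average rank" variant of the doubly ranked test described above: rank subjects at each measurement occasion, average each subject's ranks, and apply the MWW test to the resulting summaries (the second ranking). The sufficient-statistic summary proposed in the paper is not implemented here.

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu

def doubly_ranked_mww(curves_a, curves_b):
    """Doubly ranked two-sample test for functional data (average-rank summary).

    `curves_a` and `curves_b` are arrays of shape (subjects, time points),
    observed on a common grid of measurement occasions.
    """
    curves = np.vstack([curves_a, curves_b])
    # step 1: rank across subjects separately at every measurement occasion
    ranks = np.apply_along_axis(rankdata, 0, curves)
    # step 2: per-subject summary statistic (here, the average rank)
    summary = ranks.mean(axis=1)
    # step 3: MWW on the summaries, which re-ranks them
    n_a = curves_a.shape[0]
    return mannwhitneyu(summary[:n_a], summary[n_a:], alternative='two-sided')
```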

Language models achieve impressive results in tasks involving complex multistep reasoning, but scaling these capabilities further traditionally requires expensive collection of more annotated data. In this work, we explore the potential of improving the capabilities of language models without new data, using only automated feedback on the validity of their predictions in arithmetic reasoning (self-training). We find that models can substantially improve in both single-round (offline) and online self-training. In the offline setting, supervised methods are able to deliver gains comparable to preference optimization, but in online self-training, preference optimization largely outperforms supervised training thanks to its superior stability and robustness on unseen types of problems.
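As a sketch of how automated feedback could be turned into a training signal, the function below samples several candidate solutions per problem, checks them with an automatic verifier, and forms chosen/rejected pairs for preference optimization. `generate` and `check_answer` are assumed, user-supplied callables and not part of any specific library; the offline supervised variant would instead keep only the verified-correct solutions as targets.

```python
def collect_preference_pairs(generate, check_answer, prompts, n_samples=8):
    """Build preference pairs from automated correctness feedback.

    For each prompt, sample `n_samples` candidate solutions, verify each one
    automatically (e.g. by comparing the extracted numeric answer with the
    known result), and pair a correct ("chosen") with an incorrect
    ("rejected") solution for preference optimization.
    """
    pairs = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(n_samples)]
        correct = [c for c in candidates if check_answer(prompt, c)]
        wrong = [c for c in candidates if not check_answer(prompt, c)]
        if correct and wrong:
            pairs.append({"prompt": prompt,
                          "chosen": correct[0],
                          "rejected": wrong[0]})
    return pairs
```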

We consider the problem of detecting gradual changes in the sequence of mean functions from a not necessarily stationary functional time series. Our approach is based on the maximum deviation (calculated over a given time interval) between a benchmark function and the mean functions at different time points. We speak of a gradual change of size $\Delta$ if this quantity exceeds a given threshold $\Delta>0$. For example, the benchmark function could represent an average of yearly temperature curves from the pre-industrial time, and we are interested in the question of whether the yearly temperature curves afterwards deviate from the pre-industrial average by more than $\Delta = 1.5$ degrees Celsius, where the deviations are measured with respect to the sup-norm. Using Gaussian approximations for high-dimensional data, we develop a test for hypotheses of this type and estimators for the time at which a deviation of size larger than $\Delta$ appears for the first time. We prove the validity of our approach and illustrate the new methods with a simulation study and a data example, where we analyze yearly temperature curves at different stations in Australia.
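The following sketch conveys the basic quantity of interest: a plug-in estimate of the time-varying mean function (here a simple trailing moving average, an assumption made only for illustration) is compared with the benchmark curve in sup-norm, and the first time the deviation exceeds $\Delta$ is reported. The Gaussian-approximation test of the paper, which accounts for estimation error, is not reproduced.

```python
import numpy as np

def first_exceedance_time(curves, benchmark, delta=1.5, window=10):
    """First time index at which the locally averaged mean curve deviates
    from the benchmark by more than `delta` in sup-norm.

    `curves` has shape (years, grid points); `benchmark` is a curve on the
    same grid (e.g. the pre-industrial average).
    """
    curves = np.asarray(curves, dtype=float)
    n = len(curves)
    deviation = np.full(n, -np.inf)
    for t in range(window - 1, n):
        # plug-in estimate of the mean function at year t: trailing average
        local_mean = curves[t - window + 1: t + 1].mean(axis=0)
        deviation[t] = np.max(np.abs(local_mean - benchmark))
    exceed = np.nonzero(deviation > delta)[0]
    return int(exceed[0]) if exceed.size else None
```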

A common method for estimating the Hessian operator from random samples on a low-dimensional manifold involves locally fitting a quadratic polynomial. Although widely used, it has been unclear whether this estimator introduces bias, especially on complex manifolds with boundaries and nonuniform sampling. Rigorous theoretical guarantees of its asymptotic behavior have been lacking. We show that, under mild conditions, this estimator asymptotically converges to the Hessian operator, with nonuniform sampling and curvature effects proving negligible, even near boundaries. Our analysis framework simplifies the intensive computations required for direct analysis.
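A minimal sketch of the estimator discussed above, under the assumption that tangent coordinates are obtained from a local PCA of the k nearest neighbours: a quadratic polynomial is fitted to the function values by least squares and its second-order coefficients are assembled into the Hessian. The helper name and the choice of intrinsic dimension `d` are illustrative.

```python
import numpy as np

def local_quadratic_hessian(points, values, index, k=30, d=2):
    """Estimate the Hessian of a function at points[index] by a local
    quadratic fit in PCA-based tangent coordinates."""
    x0 = points[index]
    dists = np.linalg.norm(points - x0, axis=1)
    nbrs = np.argsort(dists)[:k]
    centered = points[nbrs] - points[nbrs].mean(axis=0)
    # local PCA: top-d principal directions approximate the tangent space
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    tangent = (points[nbrs] - x0) @ vt[:d].T          # (k, d) tangent coordinates
    # design matrix with constant, linear and quadratic monomials
    quad_cols = [tangent[:, i] * tangent[:, j]
                 for i in range(d) for j in range(i, d)]
    design = np.column_stack([np.ones(k), tangent] + quad_cols)
    coef, *_ = np.linalg.lstsq(design, values[nbrs], rcond=None)
    # unpack quadratic coefficients of f ~ f0 + g.t + 0.5 t'Ht into H
    hess = np.zeros((d, d))
    idx = 1 + d
    for i in range(d):
        for j in range(i, d):
            c = coef[idx]
            idx += 1
            if i == j:
                hess[i, i] = 2.0 * c
            else:
                hess[i, j] = hess[j, i] = c
    return hess
```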

We present a number of examples and counterexamples to illustrate the results on cost-efficiency in an incomplete market obtained in [BS24]. These examples and counterexamples not only illustrate the results obtained in [BS24], but also show the limitations of the results and the sharpness of the key assumptions. In particular, we make use of a simple 3-state model in which we are able to recover and illustrate all key results of the paper. This example also shows how our characterization of perfectly cost-efficient claims allows one to solve an expected utility maximization problem in a simple incomplete market (a trinomial model) and to recover results from [DS06, Chapter 3], which were there obtained using duality.
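As a toy illustration of the kind of incomplete-market example mentioned above, the sketch below maximizes expected logarithmic utility over a single risky position in a one-period trinomial model. The numbers and the choice of log utility are assumptions made for illustration and are not taken from [BS24] or [DS06].

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_utility_trinomial(s0=1.0, payoffs=(0.5, 1.0, 2.0), probs=(0.3, 0.4, 0.3),
                          wealth=1.0, rate=0.0):
    """Expected log-utility maximization in a one-period trinomial model.

    One stock with price `s0` and three possible terminal payoffs; `theta`
    is the number of shares held, the rest of the wealth earns `rate`.
    Returns the optimal position and the maximized expected log utility.
    """
    payoffs, probs = np.asarray(payoffs), np.asarray(probs)

    def neg_expected_log(theta):
        terminal = wealth * (1 + rate) + theta * (payoffs - s0 * (1 + rate))
        if np.any(terminal <= 0):          # log utility requires positive wealth
            return 1e12
        return -np.sum(probs * np.log(terminal))

    res = minimize_scalar(neg_expected_log, bounds=(-wealth, wealth), method='bounded')
    return res.x, -res.fun
```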
