The independence of noise and covariates is a standard assumption in the online linear regression and linear bandit literature. This assumption, and the analysis that follows from it, is invalid under endogeneity, i.e., when the noise and covariates are correlated. In this paper, we study the online setting of instrumental variable (IV) regression, which is widely used in economics to tackle endogeneity. Specifically, we analyse and upper bound the regret of the Two-Stage Least Squares (2SLS) approach to IV regression in the online setting. Our analysis shows that Online 2SLS (O2SLS) achieves $O(d^2 \log^2 T)$ regret after $T$ interactions, where $d$ is the dimension of the covariates. Following that, we leverage O2SLS as an oracle to design OFUL-IV, a linear bandit algorithm. OFUL-IV can tackle endogeneity and achieves $O(d \sqrt{T} \log T)$ regret. For datasets with endogeneity, we experimentally demonstrate that O2SLS and OFUL-IV incur lower regret than state-of-the-art algorithms in both the online linear regression and linear bandit settings.
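To make the endogeneity problem concrete, here is a minimal sketch of textbook 2SLS with one covariate and one instrument, computed from running sums. It is not the paper's O2SLS algorithm. The data-generating process, the function names `simulate` and `ols_and_2sls`, and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate(n, beta=2.0, theta=1.5, seed=0):
    """Generate (z, x, y) triples with endogeneity (hypothetical toy model).

    The unobserved confounder u enters both the covariate x and the
    outcome noise, so x is correlated with the noise in y.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # instrument, independent of u
        u = rng.gauss(0, 1)                      # unobserved confounder
        x = theta * z + u + rng.gauss(0, 0.5)    # first-stage relation
        y = beta * x + u + rng.gauss(0, 0.5)     # outcome; noise contains u
        data.append((z, x, y))
    return data


def ols_and_2sls(data):
    """Return (OLS estimate, 2SLS estimate) of beta from running sums."""
    szz = szx = szy = sxx = sxy = 0.0
    for z, x, y in data:                         # one pass, as in a stream
        szz += z * z
        szx += z * x
        szy += z * y
        sxx += x * x
        sxy += x * y
    ols = sxy / sxx                              # ignores endogeneity
    theta_hat = szx / szz                        # stage 1: regress x on z
    # Stage 2: regress y on the fitted covariate x_hat = theta_hat * z.
    tsls = (theta_hat * szy) / (theta_hat ** 2 * szz)
    return ols, tsls


if __name__ == "__main__":
    ols, tsls = ols_and_2sls(simulate(5000))
    print(f"true beta = 2.0, OLS = {ols:.3f}, 2SLS = {tsls:.3f}")
```

In this toy model the OLS estimate is biased upward because the covariate and the noise share the confounder `u`, while the 2SLS estimate uses only the variation in `x` that is explained by the instrument.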

Related content

Bandits / slot machines

Personalized services · Algorithms · Reinforcement learning algorithms · Online · Data-driven

April 11, 2023

Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling

Susobhan Ghosh, Raphael Kim, Prasidh Chhabria, Raaz Dwivedi, Predrag Klasjna, Peng Liao, Kelly Zhang, Susan Murphy

from arxiv, The first two authors contributed equally

There is a growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user's context (e.g., prior activity level, location, etc.). Online RL is a promising data-driven approach for this problem as it learns based on each user's historical responses and uses that knowledge to personalize these decisions. However, to decide whether the RL algorithm should be included in an "optimized" intervention for real-world deployment, we must assess the data evidence indicating that the RL algorithm is actually personalizing the treatments to its users. Due to the stochasticity in the RL algorithm, one may get a false impression that it is learning in certain states and using this learning to provide specific treatments. We use a working definition of personalization and introduce a resampling-based methodology for investigating whether the personalization exhibited by the RL algorithm is an artifact of the RL algorithm stochasticity. We illustrate our methodology with a case study by analyzing the data from a physical activity clinical trial called HeartSteps, which included the use of an online RL algorithm. We demonstrate how our approach enhances data-driven truth-in-advertising of algorithm personalization both across all users as well as within specific users in the study.

Online prediction · Online · Adversarial · Prediction algorithms · Lower bounds

April 11, 2023

BanditQ -- No-Regret Learning with Guaranteed Per-User Rewards in Adversarial Environments

Abhishek Sinha

Classic online prediction algorithms, such as Hedge, are inherently unfair by design, as they try to play the most rewarding arm as many times as possible while ignoring the sub-optimal arms to achieve sublinear regret. In this paper, we consider a fair online prediction problem in the adversarial setting with hard lower bounds on the rate of accrual of rewards for all arms. By combining elementary queueing theory with online learning, we propose a new online prediction policy, called BanditQ, that achieves the target rate constraints while achieving a regret of $O(T^{3/4})$ in the full-information setting. The design and analysis of BanditQ involve a novel use of the potential function method and are of independent interest.
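The "unfair by design" behaviour of Hedge can be seen in a small simulation. The sketch below implements the standard exponential-weights (Hedge) update in the full-information setting. It does not implement BanditQ. The reward values, the learning rate `eta`, and the function name `hedge` are assumptions made for illustration.

```python
import math


def hedge(reward_rounds, eta=0.1):
    """Run Hedge (exponential weights) on a sequence of reward vectors.

    reward_rounds: list of lists, one reward per arm per round, in [0, 1].
    Returns the list of probability vectors played in each round.
    """
    n_arms = len(reward_rounds[0])
    cumulative = [0.0] * n_arms
    history = []
    for rewards in reward_rounds:
        # Subtract the max before exponentiating for numerical stability.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        history.append([w / total for w in weights])
        # Full information: every arm's reward is observed after playing.
        for i, r in enumerate(rewards):
            cumulative[i] += r
    return history


if __name__ == "__main__":
    # Hypothetical stationary rewards: arm 0 is best, arm 2 is worst.
    rounds = [[1.0, 0.5, 0.2] for _ in range(200)]
    probs = hedge(rounds)
    print("first round:", [round(p, 3) for p in probs[0]])
    print("last round: ", [round(p, 3) for p in probs[-1]])
```

With these rewards the probability mass moves almost entirely onto the best arm, so the other arms accrue reward at a vanishing rate. That is the behaviour the per-arm rate constraints in the abstract are meant to rule out.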

Adversarial · Perturbation · Adversarial examples · Linear models · Samples

April 11, 2023

How many dimensions are required to find an adversarial example?

Charles Godfrey, Henry Kvinge, Elise Bishoff, Myles Mckay, Davis Brown, Tim Doster, Eleanor Byler

from arxiv, Comments welcome! V2: minor edits for clarity

Past work exploring adversarial vulnerability has focused on situations where an adversary can perturb all dimensions of model input. On the other hand, a range of recent works consider the case where either (i) an adversary can perturb a limited number of input parameters or (ii) a subset of modalities in a multimodal problem. In both of these cases, adversarial examples are effectively constrained to a subspace $V$ in the ambient input space $\mathcal{X}$. Motivated by this, in this work we investigate how adversarial vulnerability depends on $\dim(V)$. In particular, we show that the adversarial success of standard PGD attacks with $\ell^p$ norm constraints behaves like a monotonically increasing function of $\epsilon (\frac{\dim(V)}{\dim \mathcal{X}})^{\frac{1}{q}}$ where $\epsilon$ is the perturbation budget and $\frac{1}{p} + \frac{1}{q} =1$, provided $p > 1$ (the case $p=1$ presents additional subtleties which we analyze in some detail). This functional form can be easily derived from a simple toy linear model, and as such our results lend further credence to arguments that adversarial examples are endemic to locally linear models on high dimensional spaces.
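One way to see where the factor $(\dim(V)/\dim \mathcal{X})^{1/q}$ can come from is the following sketch. It is a reconstruction under simplifying assumptions and not necessarily the derivation used in the paper: the model is linear, $f(x) = w^\top x$ on $\mathcal{X} = \mathbb{R}^n$, the subspace $V$ is spanned by $k$ coordinate axes, all weights have equal magnitude $|w_i| = c$, and $p > 1$ so that $q$ is finite.

```latex
% Largest change in f achievable by a perturbation \delta restricted to V
% (Hoelder duality; w_V denotes w restricted to the k coordinates spanning V):
\max_{\delta \in V,\ \|\delta\|_p \le \epsilon} w^\top \delta
  \;=\; \epsilon \, \|w_V\|_q ,
  \qquad \tfrac{1}{p} + \tfrac{1}{q} = 1 .

% With |w_i| = c for every coordinate i:
\|w_V\|_q = c \, k^{1/q}, \qquad \|w\|_q = c \, n^{1/q}
\;\Longrightarrow\;
\max_{\delta \in V,\ \|\delta\|_p \le \epsilon} w^\top \delta
  \;=\; \epsilon \left( \frac{k}{n} \right)^{1/q} \|w\|_q .
```

Under these assumptions the achievable change in the output depends on $\epsilon$ and $k = \dim(V)$ only through the product $\epsilon (k/n)^{1/q}$, matching the functional form quoted in the abstract.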

Optimal sparse · Sparse regression · Optimal · Sparse · Clustering algorithms

April 10, 2023

Optimal Sparse Regression Trees

Rui Zhang, Rui Xin, Margo Seltzer, Cynthia Rudin

from arxiv, AAAI 2023, final archival version

Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering problem in one dimension over the set of labels. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.
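The one-dimensional k-Means problem mentioned above can be solved exactly, because an optimal clustering of sorted values always uses contiguous groups. The sketch below computes the optimal cost with a simple quadratic-time dynamic programme. It illustrates only the 1D k-Means objective over a set of labels, not the paper's bound or tree search, and the function name `kmeans_1d_cost` is hypothetical.

```python
def kmeans_1d_cost(values, k):
    """Minimum within-cluster sum of squared errors for 1D k-means.

    Uses the fact that optimal 1D clusters are contiguous in sorted order.
    Runs in O(k * n^2) time, which is fine for small illustrative inputs.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")

    # Prefix sums of x and x^2 give the SSE of any contiguous block in O(1).
    p1, p2 = [0.0], [0.0]
    for x in xs:
        p1.append(p1[-1] + x)
        p2.append(p2[-1] + x * x)

    def sse(i, j):
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        s = p1[j] - p1[i]
        q = p2[j] - p2[i]
        return q - s * s / (j - i)

    inf = float("inf")
    # best[c][j]: minimum cost of splitting xs[:j] into exactly c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            best[c][j] = min(best[c - 1][i] + sse(i, j) for i in range(c - 1, j))
    return best[k][n]


if __name__ == "__main__":
    labels = [1, 1, 2, 10, 10, 11]
    for k in (1, 2, 3):
        print(k, round(kmeans_1d_cost(labels, k), 4))
```

Since a regression tree with k leaves predicts at most k distinct values, the optimal k-cluster cost of the labels cannot exceed the squared error of any such tree, which is the intuition behind using it as a lower bound.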

High-dimensional · Noise · Linear regression · Fitting · Overfitting

April 8, 2023

Benign Overfitting of Non-Sparse High-Dimensional Linear Regression with Correlated Noise

Toshiki Tsuda, Masaaki Imaizumi

from arxiv, 69 pages

We investigate the high-dimensional linear regression problem in situations where there is noise correlated with Gaussian covariates. In regression models, the phenomenon of correlated noise is called endogeneity, which arises from unobserved variables, among other causes, and has been a major problem setting in causal inference and econometrics. When the covariates are high-dimensional, it has been common to assume sparsity on the true parameters and estimate them using regularization, even with the endogeneity. However, when sparsity does not hold, it has not been well understood how to control the endogeneity and high dimensionality simultaneously. In this paper, we demonstrate that an estimator without regularization can achieve consistency, i.e., benign overfitting, under certain assumptions on the covariance matrix. Specifically, we show that the error of this estimator converges to zero when covariance matrices of the correlated noise and instrumental variables satisfy a condition on their eigenvalues. We consider several extensions to relax these conditions and conduct experiments to support our theoretical findings. As a technical contribution, we utilize the convex Gaussian minimax theorem (CGMT) in our dual problem and extend the CGMT itself.

Classifiers · Generative models · Robust · Adversarial · Attacks

April 8, 2023

Exploring the Connection between Robust and Generative Models

Senad Beadini, Iacopo Masi

from arxiv, technical report, 6 pages, 6 figures

We offer a study that connects robust discriminative classifiers trained with adversarial training (AT) with generative modeling in the form of Energy-based Models (EBM). We do so by decomposing the loss of a discriminative classifier and showing that the discriminative model is also aware of the input data density. Though a common assumption is that adversarial points leave the manifold of the input data, our study finds that, surprisingly, untargeted adversarial points in the input space are very likely under the generative model hidden inside the discriminative classifier, that is, they have low energy in the EBM. We present two pieces of evidence: untargeted attacks are even more likely than the natural data, and their likelihood increases as the attack strength increases. This allows us to easily detect them and craft a novel attack called High-Energy PGD that fools the classifier yet has energy similar to the data set.

Matrix completion · Subspaces · Matrix recovery · Reconstruction error · Nuclear norm

April 7, 2023

Near-Optimal Weighted Matrix Completion

Oscar López

from arxiv, 41 pages, 2 figures

Recent work in the matrix completion literature has shown that prior knowledge of a matrix's row and column spaces can be successfully incorporated into reconstruction programs to substantially benefit matrix recovery. This paper proposes a novel methodology that exploits more general forms of known matrix structure in terms of subspaces. The work derives reconstruction error bounds that are informative in practice, providing insight into previous approaches in the literature while introducing novel programs that substantially reduce sampling complexity. The main result shows that a family of weighted nuclear norm minimization programs incorporating an $M_1 r$-dimensional subspace of $n\times n$ matrices (where $M_1\geq 1$ conveys structural properties of the subspace) allows accurate approximation of a rank $r$ matrix aligned with the subspace from a near-optimal number of observed entries (within a logarithmic factor of $M_1 r$). The result is robust, where the error is proportional to measurement noise, applies to full rank matrices, and reflects degraded output when erroneous prior information is utilized. Numerical experiments are presented that validate the theoretical behavior derived for several example weighted programs.

Feasible · Metamodel · Exponential sums · Sensitivity · Fitting

April 7, 2023

Estimating Shapley effects for moderate-to-large input dimensions

Akira Horiguchi, Matthew T. Pratola

from arxiv, 19 pages, 3 figures

Sobol' indices and Shapley effects are attractive methods of assessing how a function depends on its various inputs. The existing literature contains various estimators for these two classes of sensitivity indices, but few estimators of Sobol' indices and no estimators of Shapley effects are computationally tractable for moderate-to-large input dimensions. This article provides a Shapley-effect estimator that is computationally tractable for a moderate-to-large input dimension. The estimator uses a metamodel-based approach by first fitting a Bayesian Additive Regression Trees model which is then used to compute Shapley-effect estimates. This article also establishes posterior contraction rates on a large function class for this Shapley-effect estimator and for the analogous existing Sobol'-index estimator. Finally, this paper explores the performance of these Shapley-effect estimators on four different test functions for moderate-to-large input dimensions and number of observations.
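The tractability issue is easy to see in a brute-force computation. The sketch below evaluates the standard Shapley formula by enumerating all subsets, using the value function $v(S) = \mathrm{Var}(E[f(X) \mid X_S])$. It is not the BART-based estimator from the article. The test function is assumed to be linear with independent inputs, for which $v(S) = \sum_{i \in S} a_i^2 \sigma_i^2$ in closed form, and the names `shapley_values` and `linear_value` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, value):
    """Exact Shapley values for n players by enumerating all subsets.

    value: function mapping a frozenset of player indices to a number.
    The cost grows as O(n * 2^n) evaluations of value, which is why this
    direct approach is only usable for small n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def linear_value(coeffs, variances):
    """Value function Var(E[f | X_S]) for f(x) = sum_i a_i x_i.

    Assumes independent inputs with the given variances (toy setting).
    """
    def value(subset):
        return sum(coeffs[i] ** 2 * variances[i] for i in subset)
    return value


if __name__ == "__main__":
    coeffs = [1.0, 2.0, 3.0]
    variances = [1.0, 1.0, 0.5]
    effects = shapley_values(3, linear_value(coeffs, variances))
    total = sum(effects)
    print("Shapley effects:", [round(e, 4) for e in effects])
    print("normalized:     ", [round(e / total, 4) for e in effects])
```

For this additive toy function each input's Shapley effect reduces to its own variance contribution $a_i^2 \sigma_i^2$. For general functions each call to the value function itself requires estimating a conditional variance, which compounds the exponential cost of the enumeration.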

Robust · Robustness · High-dimensional · Loss functions · Robust estimation

April 7, 2023

Robust adaptive Lasso in high-dimensional logistic regression

Ayanendranath Basu, Abhik Ghosh, María Jaenada, Leandro Pardo

from arxiv, 27 pages

Penalized logistic regression is extremely useful for binary classification with a large number of covariates (higher than the sample size), and has several real-life applications, including genomic disease classification. However, the existing methods based on the likelihood loss function are sensitive to data contamination and other noise and, hence, robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models utilizing the popular density power divergence based loss function and the general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence function and also derive their oracle properties and asymptotic distribution. With extensive empirical illustrations, we demonstrate the significantly improved performance of our proposed estimators over the existing ones, with a particular gain in robustness. Our proposal is finally applied to analyse four different real datasets for cancer classification, obtaining robust and accurate models that simultaneously perform gene selection and patient classification.

Attacks · Inference · Inference attacks · Learning · Gradients

April 7, 2023

Label Inference Attack against Split Learning under Regression Setting

Shangyu Xie, Xin Yang, Yuanshun Yao, Tianyi Liu, Taiqing Wang, Jiankai Sun

from arxiv, 9 pages

As a crucial building block in vertical Federated Learning (vFL), Split Learning (SL) has demonstrated its practicality in two-party model training collaboration, where one party holds the features of data samples and another party holds the corresponding labels. Such a method is claimed to be private, considering that the shared information is only the embedding vectors and gradients instead of private raw data and labels. However, some recent works have shown that the private labels could be leaked by the gradients. These existing attacks only work under the classification setting, where the private labels are discrete. In this work, we go a step further and study the leakage in the scenario of the regression model, where the private labels are continuous numbers (instead of discrete labels in classification). This makes it harder for previous attacks to infer the continuous labels, due to the unbounded output range. To address this limitation, we propose a novel learning-based attack that integrates gradient information and extra learning regularization objectives in aspects of model training properties, which can infer the labels under regression settings effectively. Comprehensive experiments on various datasets and models have demonstrated the effectiveness of our proposed attack. We hope our work can pave the way for future analyses that make the vFL framework more secure.