In sparse large-scale testing problems where the false discovery proportion (FDP) is highly variable, the false discovery exceedance (FDX) provides a valuable alternative to the widely used false discovery rate (FDR). We develop an empirical Bayes approach to controlling the FDX. We show that for independent hypotheses from a two-group model and for dependent hypotheses from a Gaussian model fulfilling the exchangeability condition, an oracle decision rule based on ranking and thresholding the local false discovery rate (lfdr) is optimal in the sense that power is maximized subject to the FDX constraint. We propose a data-driven FDX procedure that emulates the oracle via carefully designed computational shortcuts. We investigate the empirical performance of the proposed method using simulations and illustrate the merits of FDX control through an application to identifying abnormal stock trading strategies.
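
As a concrete illustration of the ranking-and-thresholding idea, the following minimal sketch assumes independent hypotheses and known lfdr values, and uses a plain Monte Carlo estimate of the exceedance probability in place of the computational shortcuts developed in the paper: it keeps the largest set of smallest-lfdr hypotheses whose estimated probability of the FDP exceeding gamma stays below alpha.

```python
import numpy as np

def fdx_rule(lfdr, gamma=0.1, alpha=0.05, n_mc=5000, seed=0):
    """Rank hypotheses by lfdr and return the largest rejection set whose
    estimated P(FDP > gamma) is at most alpha. Under the two-group model
    with independent hypotheses, the number of false discoveries among the
    k smallest-lfdr hypotheses is a sum of independent Bernoulli(lfdr)
    indicators; here the exceedance probability is estimated by Monte
    Carlo (an assumption of this sketch, not the paper's shortcut)."""
    lfdr = np.asarray(lfdr, dtype=float)
    m = lfdr.size
    order = np.argsort(lfdr)
    rng = np.random.default_rng(seed)
    draws = rng.random((n_mc, m)) < lfdr[order]        # Bernoulli(lfdr) indicators
    fdp = np.cumsum(draws, axis=1) / np.arange(1, m + 1)
    exceed = (fdp > gamma).mean(axis=0)                # estimated P(FDP > gamma) for each k
    feasible = np.nonzero(exceed <= alpha)[0]
    k_star = feasible[-1] + 1 if feasible.size else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k_star]] = True
    return reject
```

With lfdr values estimated from a two-group mixture fit, `fdx_rule(lfdr_hat, gamma=0.1, alpha=0.05)` returns the indicator vector of the selected hypotheses.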

Related Content

In Hempel's paradox of the ravens, seeing a red pencil is taken as supporting evidence that all ravens are black. Also known as the Paradox of Confirmation, the paradox and its many resolutions indicate that the logical and statistical elements needed to assess evidence in support of a hypothesis should not be underestimated. Most previous analyses of the paradox are within the Bayesian framework. These analyses, and Hempel himself, generally accept the paradoxical conclusion; it supposedly feels paradoxical only because the amount of evidence is extremely small. Here I describe a non-Bayesian analysis of various statistical models with accompanying likelihood-based reasoning. The analysis shows that the paradox feels paradoxical because there are natural models in which observing a red pencil has no relevance to the color of ravens. In general, the value of the evidence depends crucially on the sampling scheme and on the assumptions about the underlying parameters of the relevant model.
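
The sampling-scheme point can be made concrete with a toy likelihood calculation (the counts below are arbitrary and purely illustrative): when an object is drawn uniformly from all objects, seeing a red pencil has the same probability whether or not all ravens are black, so the likelihood ratio is exactly one; when the draw is restricted to non-black objects, the same observation lends the hypothesis mild support.

```python
from fractions import Fraction

def prob_red_pencil(scheme, all_ravens_black, n_ravens=100,
                    n_black_if_false=90, n_red_pencils=10, n_other=890):
    """P(observed object is a red pencil | hypothesis) in a toy world with
    made-up counts; non-raven objects are assumed non-black for simplicity."""
    non_black_ravens = 0 if all_ravens_black else n_ravens - n_black_if_false
    if scheme == "uniform over all objects":
        return Fraction(n_red_pencils, n_ravens + n_red_pencils + n_other)
    if scheme == "uniform over non-black objects":
        return Fraction(n_red_pencils,
                        n_red_pencils + n_other + non_black_ravens)
    raise ValueError(scheme)

for scheme in ("uniform over all objects", "uniform over non-black objects"):
    lr = prob_red_pencil(scheme, True) / prob_red_pencil(scheme, False)
    print(f"{scheme}: likelihood ratio for 'all ravens are black' = {float(lr):.4f}")
```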

In this paper, we consider the distributed optimization problem where $n$ agents, each possessing a local cost function, collaboratively minimize the average of the local cost functions over a connected network. To solve the problem, we propose a distributed random reshuffling (D-RR) algorithm that combines the classical distributed gradient descent (DGD) method with random reshuffling (RR). We show that D-RR inherits the favorable properties of RR for both smooth strongly convex and smooth nonconvex objective functions. In particular, for smooth strongly convex objective functions, D-RR achieves an $\mathcal{O}(1/T^2)$ rate of convergence (here, $T$ counts the total number of iterations) in terms of the squared distance between the iterate and the unique minimizer. When the objective function is smooth and nonconvex with Lipschitz continuous component functions, we show that D-RR drives the squared norm of the gradient to $0$ at a rate of $\mathcal{O}(1/T^{2/3})$. These convergence results match those of centralized RR (up to constant factors).
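
The update below is a minimal sketch of how such a scheme can be organized, assuming every agent holds the same number of component functions and that `W` is a doubly stochastic mixing matrix for the network; it is meant to illustrate the DGD-plus-reshuffling structure rather than reproduce the paper's exact algorithm or step-size schedule.

```python
import numpy as np

def d_rr(component_grads, x0, W, epochs=100, step0=0.5):
    """Distributed random reshuffling sketch: each epoch, every agent
    reshuffles its local component functions; each inner step mixes the
    agents' iterates through W (gossip, as in DGD) and then takes a local
    gradient step on the next reshuffled component."""
    n = len(component_grads)                     # number of agents
    q = len(component_grads[0])                  # components per agent (assumed equal)
    x = np.tile(np.asarray(x0, dtype=float), (n, 1))
    rng = np.random.default_rng(0)
    for t in range(epochs):
        step = step0 / (t + 1)                   # diminishing step size (illustrative)
        perms = [rng.permutation(q) for _ in range(n)]
        for j in range(q):
            g = np.stack([component_grads[i][perms[i][j]](x[i]) for i in range(n)])
            x = W @ x - step * g                 # mix with neighbors, then descend
    return x.mean(axis=0)
```

For instance, with quadratic components $f_{ij}(x)=\tfrac{1}{2}\|A_{ij}x-b_{ij}\|^2$, each entry of `component_grads[i]` would be `lambda x, A=A, b=b: A.T @ (A @ x - b)`.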

There is increasing interest in identifying changes in the underlying states of brain networks. The availability of large-scale neuroimaging data creates a strong need for fast, scalable methods that detect and localize such changes in time and also identify their drivers, thus enabling neuroscientists to hypothesize about potential mechanisms. This paper presents a fast method for detecting break points in exceedingly long neuroimaging time series, based on vector autoregressive (Granger causal) models. It uses a multi-step strategy: a regularized objective function leads to fast identification of candidate break points, clustering steps then select the final set of break points, and the underlying Granger causal networks are subsequently estimated with false positive control. The latter step provides insight into the key changes in network connectivity that give rise to the break points. The proposed methodology is illustrated on synthetic data varying in length, dimensionality, number of break points, and signal strength, and is also applied to EEG data from visual tasks.
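
A drastically simplified sketch of the screening-then-clustering idea is given below: it contrasts lasso-estimated VAR(1) transition matrices on adjacent windows to score candidate break points and then keeps well-separated high-score candidates. The window contrast stands in for the paper's regularized objective and the separation rule for its clustering step; the function names and tuning constants are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def fit_var1(X, lam=0.05):
    """Row-wise lasso estimate of A in the VAR(1) model X_t ~ A X_{t-1}."""
    Y, Z = X[1:], X[:-1]
    p = X.shape[1]
    A = np.zeros((p, p))
    for j in range(p):
        A[j] = Lasso(alpha=lam, max_iter=5000).fit(Z, Y[:, j]).coef_
    return A

def screen_break_points(X, n_breaks, window=50, step=5, lam=0.05, min_gap=30):
    """Score each candidate time point by the change in the estimated VAR
    transition matrix across adjacent windows, then keep the n_breaks
    highest-scoring candidates that are at least min_gap apart."""
    T = X.shape[0]
    centers, scores = [], []
    for t in range(window, T - window, step):
        diff = fit_var1(X[t - window:t], lam) - fit_var1(X[t:t + window], lam)
        centers.append(t)
        scores.append(np.linalg.norm(diff, "fro"))
    selected = []
    for i in np.argsort(scores)[::-1]:
        if len(selected) == n_breaks:
            break
        if all(abs(centers[i] - s) >= min_gap for s in selected):
            selected.append(centers[i])
    return sorted(selected)
```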

Thanks to its fine balance between model flexibility and interpretability, the nonparametric additive model has been widely used, and variable selection for this type of model has been frequently studied. However, none of the existing solutions can control the false discovery rate (FDR) unless the sample size tends to infinity. The knockoff framework is a recent proposal that can address this issue, but few knockoff solutions are directly applicable to nonparametric models. In this article, we propose a novel kernel knockoffs selection procedure for the nonparametric additive model. We integrate three key components: knockoffs, subsampling for stability, and random feature mapping for nonparametric function approximation. We show that the proposed method is guaranteed to control the FDR for any sample size and achieves a power that approaches one as the sample size tends to infinity. We demonstrate the efficacy of our method through intensive simulations and comparisons with alternative solutions. Our proposal thus contributes to the methodology of nonparametric variable selection, FDR-based inference, and knockoffs.
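
To make the three ingredients concrete, here is a stripped-down sketch under strong simplifying assumptions: the covariates are independent standard Gaussians (so an exact knockoff copy is simply an independent Gaussian matrix), each covariate is expanded with its own random Fourier features, and feature importances come from a single lasso fit; the subsampling-for-stability step of the actual procedure is omitted.

```python
import numpy as np
from sklearn.linear_model import Lasso

def rff(x, freqs, phases):
    """Random Fourier features of a single covariate: (n,) -> (n, D)."""
    return np.sqrt(2.0 / len(freqs)) * np.cos(np.outer(x, freqs) + phases)

def kernel_knockoff_select(X, y, q=0.2, D=20, gamma=1.0, lam=0.01, seed=0):
    """Simplified kernel knockoff filter for an additive model, assuming
    independent N(0,1) covariates so that knockoffs are independent copies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xk = rng.normal(size=(n, p))                          # knockoff copies
    freqs = rng.normal(scale=np.sqrt(gamma), size=(p, D)) # per-covariate RFF parameters
    phases = rng.uniform(0, 2 * np.pi, size=(p, D))
    Z = np.hstack([rff(X[:, j], freqs[j], phases[j]) for j in range(p)] +
                  [rff(Xk[:, j], freqs[j], phases[j]) for j in range(p)])
    beta = Lasso(alpha=lam, max_iter=10000).fit(Z, y).coef_
    imp = np.abs(beta).reshape(2 * p, D).sum(axis=1)      # group-wise importances
    W = imp[:p] - imp[p:]                                 # knockoff statistics
    # knockoff+ threshold targeting FDR level q
    T = np.inf
    for t in np.sort(np.abs(W[W != 0])):
        if (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t)) <= q:
            T = t
            break
    return np.nonzero(W >= T)[0]                          # indices of selected covariates
```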

In this work, we study empirical risk minimization (ERM) within a federated learning framework, where a central server minimizes an ERM objective function using training data that is stored across $m$ clients. In this setting, the Federated Averaging (FedAve) algorithm is the staple for determining $\epsilon$-approximate solutions to the ERM problem. As with standard optimization algorithms, the convergence analysis of FedAve relies only on the smoothness of the loss function in the optimization parameter. However, loss functions are often very smooth in the training data as well. To exploit this additional smoothness, we propose the Federated Low Rank Gradient Descent (FedLRGD) algorithm. Since smoothness in the data induces an approximate low rank structure on the loss function, our method first performs a few rounds of communication between the server and clients to learn weights that the server can use to approximate clients' gradients. Then, our method solves the ERM problem at the server using inexact gradient descent. To show that FedLRGD can outperform FedAve, we introduce a notion of federated oracle complexity as a counterpart to canonical oracle complexity. Under some assumptions on the loss function, e.g., strong convexity in the parameter and $\eta$-H\"older smoothness in the data, we prove that the federated oracle complexity of FedLRGD scales like $\phi m(p/\epsilon)^{\Theta(d/\eta)}$ and that of FedAve scales like $\phi m(p/\epsilon)^{3/4}$ (neglecting sub-dominant factors), where $\phi\gg 1$ is a "communication-to-computation ratio," $p$ is the parameter dimension, and $d$ is the data dimension. We then show that when $d$ is small and the loss function is sufficiently smooth in the data, FedLRGD beats FedAve in federated oracle complexity. Finally, in the course of analyzing FedLRGD, we also establish a result on the low rank approximation of latent variable models.
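
For orientation, the baseline referred to as FedAve is plain Federated Averaging; a generic sketch is below. This is the baseline, not the proposed FedLRGD, and the client loop, step size, and loss interface are illustrative assumptions.

```python
import numpy as np

def fed_avg(client_data, loss_grad, p, rounds=100, local_steps=10, lr=0.1, seed=0):
    """Generic Federated Averaging: each round, every client runs a few
    local SGD steps starting from the current server model, and the server
    averages the returned local models."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(p)
    for _ in range(rounds):
        local_models = []
        for data in client_data:                      # one list of samples per client
            w = theta.copy()
            for _ in range(local_steps):
                x = data[rng.integers(len(data))]     # sample one local data point
                w -= lr * loss_grad(w, x)             # local SGD step
            local_models.append(w)
        theta = np.mean(local_models, axis=0)         # server-side model averaging
    return theta
```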

This paper proposes an efficient approach to learning disentangled representations with causal mechanisms, based on the difference of conditional probabilities between the original and new distributions. We approximate this difference using the models' generalization ability, so that it fits in the standard machine learning framework and can be computed efficiently. In contrast to the state-of-the-art approach, which relies on the learner's adaptation speed to the new distribution, the proposed approach only requires evaluating the model's generalization ability. We provide a theoretical explanation for the advantage of the proposed method, and our experiments show that the proposed technique is 1.9--11.0$\times$ more sample-efficient and 9.4--32.4$\times$ faster than the previous method on various tasks. The source code is available at \url{//github.com/yuanpeng16/EDCR}.

The Sliced-Wasserstein distance (SW) is increasingly used in machine learning applications as an alternative to the Wasserstein distance, offering significant computational and statistical benefits. Since it is defined as an expectation over random projections, SW is commonly approximated by Monte Carlo. We adopt a new perspective to approximate SW by making use of the concentration of measure phenomenon: under mild assumptions, one-dimensional projections of a high-dimensional random vector are approximately Gaussian. Based on this observation, we develop a simple deterministic approximation of SW. Our method does not require sampling random projections and is therefore both accurate and easy to use compared to the usual Monte Carlo approximation. We derive non-asymptotic guarantees for our approach and show that the approximation error goes to zero as the dimension increases, under a weak dependence condition on the data distribution. We validate our theoretical findings on synthetic datasets and illustrate the proposed approximation on a generative modeling problem.
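
The contrast between the two estimators can be sketched as follows. The Monte Carlo version is the standard one; the deterministic proxy is a simple Gaussian moment-matching instantiation of the concentration-of-measure idea and is not necessarily the exact approximation derived in the paper.

```python
import numpy as np

def sw2_monte_carlo(X, Y, n_proj=200, seed=0):
    """Standard Monte Carlo estimate of the squared Sliced-Wasserstein-2
    distance between the empirical distributions of X and Y (n x d arrays
    with the same n): average the squared 1D Wasserstein-2 distances over
    random projection directions."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_proj, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)   # uniform directions
    px, py = np.sort(X @ theta.T, axis=0), np.sort(Y @ theta.T, axis=0)
    return np.mean((px - py) ** 2)

def sw2_gaussian_proxy(X, Y):
    """Deterministic proxy: treat each 1D projection as approximately
    Gaussian and use the closed-form W2 between Gaussians, with projected
    means and variances replaced by their typical values over directions.
    An illustrative moment-matching variant, not the paper's estimator."""
    d = X.shape[1]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    sx = np.sqrt(np.trace(np.cov(X, rowvar=False)) / d)     # typical projected std
    sy = np.sqrt(np.trace(np.cov(Y, rowvar=False)) / d)
    return np.sum((mx - my) ** 2) / d + (sx - sy) ** 2
```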

The global financial crisis of 2007-2009 highlighted the crucial role systemic risk plays in the stability of financial markets. Accurate assessment of systemic risk would enable regulators to introduce suitable policies to mitigate the risk, and would allow individual institutions to monitor their vulnerability to market movements. One popular measure of systemic risk is the conditional value-at-risk (CoVaR), proposed by Adrian and Brunnermeier (2011). We develop a methodology to estimate CoVaR semi-parametrically within the framework of multivariate extreme value theory. By its definition, CoVaR can be viewed as a high quantile of the conditional distribution of one institution's (or the financial system's) potential loss, where the conditioning event corresponds to large losses in the financial system (or the given financial institution). We relate this conditional distribution to the tail dependence function between the system and the institution, and then use parametric modelling of the tail dependence function to address data sparsity in the joint tail regions. We prove consistency of the proposed estimator and illustrate its performance via simulation studies and a real data example.
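
As a reference point for the definition, the crude empirical estimator below computes CoVaR as a conditional quantile by restricting to observations on which the system loss exceeds its own Value-at-Risk (the exceedance-based conditioning is one common variant). At high levels p and q this leaves very few joint-tail observations, which is precisely the data sparsity the semi-parametric extreme value approach is designed to overcome.

```python
import numpy as np

def empirical_covar(system_loss, institution_loss, p=0.95, q=0.95):
    """Naive empirical CoVaR: the q-quantile of the institution's loss,
    conditional on the system's loss exceeding its p-level VaR."""
    system_loss = np.asarray(system_loss, dtype=float)
    institution_loss = np.asarray(institution_loss, dtype=float)
    var_system = np.quantile(system_loss, p)            # system VaR at level p
    tail = institution_loss[system_loss >= var_system]  # joint-tail observations
    return np.quantile(tail, q)
```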

Cellular networks are expected to be the main communication infrastructure supporting the expanding applications of Unmanned Aerial Vehicles (UAVs). Since these networks are deployed to serve ground User Equipment (UEs), several issues need to be addressed to enhance cellular service for UAVs. In this paper, we propose a realistic downlink communication model and show that the users' Quality of Service (QoS) is degraded by the number of interfering base stations (BSs) and the strength of their interference. We therefore address the joint sub-carrier and power allocation problem. Since this problem is known to be NP-hard, we introduce a solution based on game theory. First, we argue that separating UAVs and UEs onto disjoint sets of assigned sub-carriers reduces the interference experienced by the users; this separation is realized through a matching game. Moreover, to improve the resulting partition, we propose a coalitional game that starts from the outcome of the matching game and enables users to change coalitions to enhance their QoS. Furthermore, a power optimization scheme is introduced and incorporated into both games. Performance evaluations are conducted, and the results demonstrate the effectiveness of the proposed approach.
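
A toy sketch of the separation idea follows: UAVs and ground UEs draw from disjoint sub-carrier pools and are greedily matched to their best remaining sub-carrier by channel gain. This is a crude stand-in for the matching and coalitional games and ignores the power optimization step; the interface and the proportional split are illustrative assumptions.

```python
import numpy as np

def separate_and_match(gain_uav, gain_ue, n_sc):
    """gain_uav: (n_uav, n_sc) and gain_ue: (n_ue, n_sc) channel gains.
    Split the sub-carriers into two disjoint pools, one per user class,
    then greedily assign each user its best remaining sub-carrier."""
    n_uav, n_ue = gain_uav.shape[0], gain_ue.shape[0]
    n_uav_sc = max(1, round(n_sc * n_uav / (n_uav + n_ue)))   # proportional split
    pools = {"uav": set(range(n_uav_sc)), "ue": set(range(n_uav_sc, n_sc))}
    assignment = {}
    for label, gains in (("uav", gain_uav), ("ue", gain_ue)):
        pool = pools[label]
        for u in range(gains.shape[0]):
            if not pool:
                break                                          # more users than sub-carriers
            best = max(pool, key=lambda s: gains[u, s])
            assignment[(label, u)] = best
            pool.remove(best)
    return assignment
```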
