
Establishing limiting distributions of Chatterjee's rank correlation for a general, possibly non-independent, pair of random variables has been eagerly awaited by many. This paper shows that (a) Chatterjee's rank correlation is asymptotically normal as long as one variable is not a measurable function of the other, and (b) the corresponding asymptotic variance is uniformly bounded by 36. Similar results also hold for Azadkia-Chatterjee's graph-based correlation coefficient, a multivariate analogue of Chatterjee's original proposal. The proofs are given by appealing to the H\'ajek representation and Chatterjee's nearest-neighbor CLT.
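
For concreteness, here is a minimal Python sketch of the statistic itself, using the standard no-ties formula $\xi_n = 1 - 3\sum_{i=1}^{n-1}|r_{i+1}-r_i|/(n^2-1)$, where $r_i$ are the ranks of $Y$ after sorting the pairs by $X$; the asymptotic-variance results above concern this quantity, and the code makes no inferential claims:

    import numpy as np

    def chatterjee_xi(x, y):
        # Chatterjee's xi, no-ties formula: sort the pairs by x,
        # rank the y's, and sum absolute rank differences.
        n = len(x)
        r = np.argsort(np.argsort(y[np.argsort(x)])) + 1
        return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n ** 2 - 1)

    rng = np.random.default_rng(0)
    x = rng.normal(size=2000)
    print(chatterjee_xi(x, x ** 2 + 0.1 * rng.normal(size=2000)))  # near 1
    print(chatterjee_xi(x, rng.normal(size=2000)))                 # near 0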

Related content

In supervised learning with kernel methods, we often encounter a large-scale finite-sum minimization over a reproducing kernel Hilbert space (RKHS). Large-scale finite-sum problems can be solved using efficient variants of Newton's method, where the Hessian is approximated via sub-samples of the data. In an RKHS, however, the dependence of the penalty function on the kernel makes standard sub-sampling approaches inapplicable, since the Gram matrix is not readily available in low-rank form. In this paper, we observe that for this class of problems, one can naturally use kernel approximation to speed up Newton's method. Focusing on randomized features for kernel approximation, we provide a novel second-order algorithm that enjoys local superlinear convergence and global linear convergence (with high probability). We derive a theoretical lower bound on the number of random features required for the approximated Hessian to be close to the true Hessian in norm. Our numerical experiments on real-world data verify the efficiency of our method compared to several benchmarks.
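
As a toy illustration of the idea (not the paper's algorithm), the sketch below builds random Fourier features for a Gaussian kernel and runs Newton's method on regularized logistic regression in the resulting $m$-dimensional feature space, so an $m \times m$ approximated Hessian stands in for the $n \times n$ Gram matrix; all dimensions and constants are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, m, lam = 500, 5, 100, 1e-3
    X = rng.normal(size=(n, d))
    y = np.sign(X[:, 0] + 0.5 * rng.normal(size=n))   # labels in {-1, +1}

    # Random Fourier features approximating a Gaussian kernel (Rahimi-Recht).
    W = rng.normal(size=(d, m))
    b = rng.uniform(0, 2 * np.pi, size=m)
    Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)          # n x m feature map

    w = np.zeros(m)
    for _ in range(20):                               # Newton iterations
        p = 1.0 / (1.0 + np.exp(-y * (Z @ w)))        # sigmoid(y * margin)
        g = -(Z.T @ (y * (1 - p))) / n + 2 * lam * w  # gradient
        D = p * (1 - p)                               # per-sample curvature
        H = (Z.T * D) @ Z / n + 2 * lam * np.eye(m)   # m x m approx. Hessian
        w -= np.linalg.solve(H, g)                    # Newton step

    print("train accuracy:", np.mean(np.sign(Z @ w) == y))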

A method to perform offline and online speaker diarization for an unlimited number of speakers is described in this paper. End-to-end neural diarization (EEND) has achieved overlap-aware speaker diarization by formulating it as a multi-label classification problem. It has also been extended to handle a flexible number of speakers by introducing speaker-wise attractors. However, the number of output speakers of attractor-based EEND is empirically capped; it cannot deal with cases where the number of speakers appearing during inference is higher than that during training because its speaker counting is trained in a fully supervised manner. Our method, EEND-GLA, solves this problem by introducing unsupervised clustering into attractor-based EEND. In the method, the input audio is first divided into short blocks, then attractor-based diarization is performed for each block, and finally the results for each block are clustered on the basis of the similarity between locally-calculated attractors. While the number of output speakers is limited within each block, the total number of speakers estimated for the entire input can be higher than this limit. To use EEND-GLA in an online manner, our method also extends the speaker-tracing buffer, which was originally proposed to enable online inference of conventional EEND. We introduce a block-wise buffer update to make the speaker-tracing buffer compatible with EEND-GLA. Finally, to improve online diarization, our method refines the buffer update method and revisits the variable chunk-size training of EEND. The experimental results demonstrate that EEND-GLA can perform speaker diarization of an unseen number of speakers in both offline and online inference.
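
The cross-block linking step can be illustrated with a toy greedy clustering of locally-calculated attractors by cosine similarity; the actual clustering procedure and threshold in EEND-GLA are design choices of the paper, and the function name and threshold below are made up for illustration:

    import numpy as np

    def cluster_attractors(block_attractors, threshold=0.7):
        # Greedily assign each block-local attractor to the most similar
        # global speaker centroid, or open a new global speaker if no
        # centroid is similar enough. Returns per-block global ids.
        centroids, labels = [], []
        for A in block_attractors:
            A = A / np.linalg.norm(A, axis=1, keepdims=True)
            ids = []
            for a in A:
                sims = [float(a @ c) for c in centroids]
                if sims and max(sims) >= threshold:
                    ids.append(int(np.argmax(sims)))   # merge with best match
                else:
                    centroids.append(a)                # new global speaker
                    ids.append(len(centroids) - 1)
            labels.append(ids)
        return labels

    rng = np.random.default_rng(0)
    spk = rng.normal(size=(3, 16))                     # 3 true speakers
    blocks = [spk[[0, 1]] + 0.05 * rng.normal(size=(2, 16)),
              spk[[1, 2]] + 0.05 * rng.normal(size=(2, 16))]
    print(cluster_attractors(blocks))                  # e.g. [[0, 1], [1, 2]]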

We consider experiments in dynamical systems where interventions on some experimental units impact other units through a limiting constraint (such as a limited inventory). Despite their outsize practical importance, the best estimators for this 'Markovian' interference problem are largely heuristic in nature, and their bias is not well understood. We formalize the problem of inference in such experiments as one of policy evaluation. Off-policy estimators, while unbiased, apparently incur a large penalty in variance relative to state-of-the-art heuristics. We introduce an on-policy estimator: the Differences-In-Q's (DQ) estimator. We show that the DQ estimator can in general have exponentially smaller variance than off-policy evaluation. At the same time, its bias is second order in the impact of the intervention. This yields a striking bias-variance tradeoff so that the DQ estimator effectively dominates state-of-the-art alternatives. From a theoretical perspective, we introduce three separate novel techniques that are of independent interest in the theory of Reinforcement Learning (RL). Our empirical evaluation includes a set of experiments on a city-scale ride-hailing simulator.
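
In the spirit of the DQ idea (an illustrative toy only, not the paper's estimator or simulator), the sketch below runs a 50/50 randomized policy on a small inventory-like chain, estimates Q-values on-policy with an expected-SARSA-style TD(0) update, and reports the difference of Q's between treatment and control averaged over the visited states, next to the naive difference in average rewards:

    import numpy as np

    rng = np.random.default_rng(0)
    K, T, lr, gamma = 10, 100_000, 0.05, 0.99
    Q = np.zeros((K + 1, 2))                 # Q-values under the 50/50 policy
    visits = np.zeros(K + 1)
    rew = [[], []]
    s = K // 2
    for _ in range(T):
        a = rng.integers(2)                  # randomized treatment assignment
        visits[s] += 1
        # treatment sells faster but depletes shared inventory (interference)
        sold = int(s > 0 and rng.random() < (0.6 if a else 0.4))
        r = float(sold)
        s2 = min(K, s - sold + int(rng.random() < 0.5))   # restock w.p. 1/2
        # expected-SARSA-style TD(0) target for the uniform policy
        Q[s, a] += lr * (r + gamma * Q[s2].mean() - Q[s, a])
        rew[a].append(r)
        s = s2

    pi = visits / visits.sum()               # empirical state distribution
    dq = pi @ (Q[:, 1] - Q[:, 0])            # Differences-in-Q's style estimate
    print("naive:", np.mean(rew[1]) - np.mean(rew[0]), "DQ:", dq)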

We study the log-rank conjecture from the perspective of point-hyperplane incidence geometry. We formulate the following conjecture: Given a point set in $\mathbb{R}^d$ that is covered by constant-sized sets of parallel hyperplanes, there exists an affine subspace that accounts for a large (i.e., $2^{-{\operatorname{polylog}(d)}}$) fraction of the incidences. Alternatively, our conjecture may be interpreted linear-algebraically as follows: Any rank-$d$ matrix containing at most $O(1)$ distinct entries in each column contains a submatrix of fractional size $2^{-{\operatorname{polylog}(d)}}$, in which each column contains one distinct entry. We prove that our conjecture is equivalent to the log-rank conjecture. Motivated by the connections above, we revisit well-studied questions in point-hyperplane incidence geometry without structural assumptions (i.e., the existence of partitions). We give an elementary argument for the existence of complete bipartite subgraphs of density $\Omega(\epsilon^{2d}/d)$ in any $d$-dimensional configuration with incidence density $\epsilon$. We also improve an upper-bound construction of Apfelbaum and Sharir (SIAM J. Discrete Math. '07), yielding a configuration whose complete bipartite subgraphs are exponentially small and whose incidence density is $\Omega(1/\sqrt d)$. Finally, we discuss various constructions (due to others) which yield configurations with incidence density $\Omega(1)$ and bipartite subgraph density $2^{-\Omega(\sqrt d)}$. Our framework and results may help shed light on the difficulty of improving Lovett's $\tilde{O}(\sqrt{\operatorname{rank}(f)})$ bound (J. ACM '16) for the log-rank conjecture; in particular, any improvement on this bound would imply the first bipartite subgraph size bounds for parallel $3$-partitioned configurations which beat our generic bounds for unstructured configurations.
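
In symbols, the linear-algebraic form of the conjecture can be written as follows (our notation: $M \in \mathbb{R}^{m \times n}$ has rank $d$ and at most $c = O(1)$ distinct entries per column):

    \exists\, R \subseteq [m],\ C \subseteq [n]:\qquad
    \frac{|R|\,|C|}{mn} \;\ge\; 2^{-\operatorname{polylog}(d)}
    \quad\text{and}\quad
    \bigl|\{\, M_{ij} : i \in R \,\}\bigr| = 1 \ \ \text{for every } j \in C.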

In this paper, a new weighted average estimator (WAVE) is proposed to enhance the performance of the simple-averaging distributed estimator under a general loss with a high-dimensional parameter. To obtain an efficient estimator, a weighted least-squares ensemble framework with an adaptive $L_1$ penalty is proposed, in which each local estimator is computed via the adaptive lasso and its weight is inversely proportional to its variance. We prove that WAVE enjoys the same asymptotic properties as the global estimator while incurring very low communication cost, requiring each local worker to deliver only two vectors to the master. Moreover, WAVE is shown to be effective even when the samples across local workers have different means and covariances. In particular, asymptotic normality is established under such conditions, while other competitors may not enjoy this property. The effectiveness of WAVE is further illustrated by an extensive numerical study and a real data analysis.
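
The core weighting idea can be sketched in a scalar toy; the paper's setting is a high-dimensional parameter with an adaptive-lasso local step, whereas here each of K hypothetical workers ships just its local estimate and estimated variance, and the master forms the inverse-variance weighted average:

    import numpy as np

    rng = np.random.default_rng(0)
    K, theta = 10, 2.0                       # workers and the common parameter
    est, var = np.zeros(K), np.zeros(K)
    for k in range(K):
        n_k = rng.integers(50, 500)          # heterogeneous local sample sizes
        x = theta + rng.normal(scale=1.0 + k, size=n_k)  # heterogeneous noise
        est[k] = x.mean()                    # local estimate (one "vector")
        var[k] = x.var(ddof=1) / n_k         # estimated variance (the other)

    w = (1.0 / var) / (1.0 / var).sum()      # weights inversely prop. to variance
    print("simple average:", est.mean(), " inverse-variance:", w @ est)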

The synthetic control method has become a widely popular tool to estimate causal effects with observational data. Despite this, inference for synthetic control methods remains challenging. Often, inferential results rely on linear factor model data generating processes. In this paper, we characterize the conditions on the factor model primitives (the factor loadings) for which the statistical risk minimizers are synthetic controls (in the simplex). Then, we propose a Bayesian alternative to the synthetic control method that preserves the main features of the standard method and provides a new way of doing valid inference. We explore a Bernstein-von Mises-style result to link our Bayesian inference to the frequentist inference. For linear factor model frameworks we show that a maximum likelihood estimator (MLE) of the synthetic control weights can consistently estimate the predictive function of the potential outcomes for the treated unit and that our Bayes estimator is asymptotically close to the MLE in the total variation sense. Through simulations, we show that there is convergence between the Bayes and frequentist approach even in sparse settings. Finally, we apply the method to revisit the study of the economic costs of the German re-unification. The Bayesian synthetic control method is available in the bsynth R-package.
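
For reference, the frequentist point estimate being generalized here is the simplex-constrained least-squares fit. A minimal numpy sketch, using projected gradient with the standard sorting-based simplex projection of Duchi et al. (2008); the data Y0, y1 and the step size are illustrative:

    import numpy as np

    def project_simplex(v):
        # Euclidean projection onto the probability simplex.
        u = np.sort(v)[::-1]
        css = np.cumsum(u) - 1.0
        rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
        return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

    def synthetic_control(Y0, y1, iters=5000):
        # minimize ||y1 - Y0 w||^2 subject to w >= 0, sum(w) = 1
        n = Y0.shape[1]
        w = np.full(n, 1.0 / n)
        lr = 1.0 / np.linalg.norm(Y0, 2) ** 2   # 1 / Lipschitz constant
        for _ in range(iters):
            w = project_simplex(w - lr * (Y0.T @ (Y0 @ w - y1)))
        return w

    rng = np.random.default_rng(0)
    Y0 = rng.normal(size=(40, 6))               # pre-period outcomes, 6 donors
    true_w = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
    y1 = Y0 @ true_w + 0.01 * rng.normal(size=40)   # treated unit, pre-period
    print(synthetic_control(Y0, y1).round(3))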

Bonnet et al. (FOCS 2020) introduced the graph invariant twin-width and showed that many NP-hard problems are tractable for graphs of bounded twin-width, generalizing similar results for other width measures, including treewidth and clique-width. In this paper, we investigate the use of twin-width for solving the propositional satisfiability problem (SAT) and propositional model counting. We particularly focus on Bounded-ones Weighted Model Counting (BWMC), which takes as input a CNF formula $F$ along with a bound $k$ and asks for the weighted sum of all models with at most $k$ positive literals. BWMC generalizes not only SAT but also (weighted) model counting. We develop the notion of "signed" twin-width of CNF formulas and establish that BWMC is fixed-parameter tractable when parameterized by the certified signed twin-width of $F$ plus $k$. We show that this result is tight: it is neither possible to drop the bound $k$ nor to use vanilla twin-width instead if one wishes to retain fixed-parameter tractability, even for the easier problem SAT. Our theoretical results are complemented with an empirical evaluation and comparison of signed twin-width on various classes of CNF formulas.
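
To make the BWMC objective concrete, here is a brute-force reference implementation (exponential in the number of variables, whereas the paper's contribution is an FPT algorithm in certified signed twin-width plus $k$; the clause encoding and weight function are our own conventions):

    from itertools import combinations

    def bwmc(clauses, n_vars, k, weight):
        # Weighted sum over all models of the CNF that set at most k
        # variables to true. Clauses are lists of signed ints; a literal
        # l is satisfied iff (l > 0) matches (|l| assigned true).
        total = 0.0
        for ones in range(k + 1):
            for pos in combinations(range(1, n_vars + 1), ones):
                assign = set(pos)
                if all(any((l > 0) == (abs(l) in assign) for l in c)
                       for c in clauses):
                    total += weight(assign)
        return total

    # (x1 or x2) and (not x1 or x3): one model with <= 1 positive literal
    clauses = [[1, 2], [-1, 3]]
    print(bwmc(clauses, n_vars=3, k=1, weight=lambda s: 1.0))  # 1.0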

What do applications like semantic optimization, data exchange and integration, answering queries under dependencies, query reformulation with constraints, and data cleaning have in common? All these applications can be processed by the Chase, a family of algorithms for reasoning with constraints. While the theory of the Chase is well understood, existing implementations are confined to specific use cases and application scenarios, making it difficult to reuse them in other settings. ChaTEAU overcomes this limitation: It takes the logical core of the Chase, generalizes it, and provides a software library for different Chase applications in a single toolkit.
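
The logical core being generalized is easy to sketch: the loop below is a toy standard chase for TGDs with a single body atom and a single head atom, firing a dependency only when its head is not yet matched and inventing labeled nulls for existential variables (a minimal Python sketch; ChaTEAU's actual library interface is not shown here):

    import itertools

    def chase(facts, tgds, rounds=5):
        # Toy standard chase. A TGD is ((body_rel, body_vars),
        # (head_rel, head_vars)); head variables absent from the body are
        # existential and receive fresh labeled nulls. A TGD fires only
        # if no existing head atom agrees on the non-existential positions.
        fresh = itertools.count()
        facts = set(facts)
        for _ in range(rounds):
            fired = set()
            for (brel, bvars), (hrel, hvars) in tgds:
                for fact in facts:
                    if fact[0] != brel:
                        continue
                    env = dict(zip(bvars, fact[1:]))
                    fixed = [(i, env[v]) for i, v in enumerate(hvars)
                             if v in env]
                    known = facts | fired
                    if any(f[0] == hrel and
                           all(f[1 + i] == c for i, c in fixed)
                           for f in known):
                        continue
                    row = tuple(env[v] if v in env else f"_N{next(fresh)}"
                                for v in hvars)
                    fired.add((hrel,) + row)
            if not fired:
                break
            facts |= fired
        return facts

    # Emp(name, dept) -> exists m: Dept(dept, m)
    tgds = [(("Emp", ("x", "d")), ("Dept", ("d", "m")))]
    db = {("Emp", "alice", "cs"), ("Emp", "bob", "cs"), ("Emp", "carol", "ee")}
    for f in sorted(chase(db, tgds)):
        print(f)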

This paper studies an intriguing phenomenon related to the good generalization performance of estimators obtained by using large learning rates within gradient descent algorithms. We show that this phenomenon, first observed in the deep learning literature, can be precisely characterized in the context of kernel methods, even though the resulting optimization problem is convex. Specifically, we consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution on the Hessian's eigenvectors. This extends an intuition described by Nakkiran (2020) on a two-dimensional toy problem to realistic learning scenarios such as kernel ridge regression. While large learning rates may be proven beneficial as soon as there is a mismatch between the train and test objectives, we further explain why this already occurs in classification tasks without assuming any particular mismatch between train and test data distributions.
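
The quadratic case is easy to reproduce numerically: along an eigenvector of the Hessian $H$ with eigenvalue $\lambda$, $t$ steps of gradient descent at learning rate $\eta$ recover a fraction $1 - (1 - \eta\lambda)^t$ of the target component, so under early stopping the learning rate reshapes the solution's spectral profile. A small sketch (dimensions and spectrum are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    # Quadratic objective 0.5 w'Hw - b'w whose Hessian has a spread spectrum.
    lams = np.array([1.0, 0.1, 0.01])
    U = np.linalg.qr(rng.normal(size=(3, 3)))[0]
    H = U @ np.diag(lams) @ U.T
    w_star = rng.normal(size=3)
    b = H @ w_star

    def early_stopped_gd(lr, steps=50):
        w = np.zeros(3)
        for _ in range(steps):
            w -= lr * (H @ w - b)
        return w

    for lr in (0.1, 1.9):   # small vs. near-maximal stable step (2 / max eig)
        w = early_stopped_gd(lr)
        # fraction of w_star recovered along each eigenvector of H;
        # in closed form this equals 1 - (1 - lr * lam) ** steps
        print(lr, np.round((U.T @ w) / (U.T @ w_star), 3))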

Aggregating signals from a collection of noisy sources is a fundamental problem in many domains including crowd-sourcing, multi-agent planning, sensor networks, signal processing, voting, ensemble learning, and federated learning. The core question is how to aggregate signals from multiple sources (e.g. experts) in order to reveal an underlying ground truth. While a full answer depends on the type of signal, correlation of signals, and desired output, a problem common to all of these applications is that of differentiating sources based on their quality and weighting them accordingly. It is often assumed that this differentiation and aggregation is done by a single, accurate central mechanism or agent (e.g. a judge). We complicate this model in two ways. First, we investigate both the setting with a single judge and one with multiple judges. Second, given this multi-agent interaction of judges, we investigate various constraints on the judges' reporting space. We build on known results for the optimal weighting of experts and prove that an ensemble of sub-optimal mechanisms can perform optimally under certain conditions. We then show empirically that the ensemble approximates the performance of the optimal mechanism under a broader range of conditions.
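
One classical instance of the known results alluded to is the rule of Nitzan and Paroush (1982) for conditionally independent binary experts, where the optimal weights are the log-odds of the individual accuracies; the toy simulation below compares it with an unweighted majority vote (the accuracies p are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    p = np.array([0.9, 0.7, 0.6, 0.55, 0.55])  # assumed expert accuracies
    w = np.log(p / (1 - p))                    # optimal log-odds weights

    def trial():
        truth = rng.integers(2) * 2 - 1                 # truth in {-1, +1}
        votes = np.where(rng.random(p.size) < p, truth, -truth)
        maj = np.sign(votes.sum())                      # unweighted majority
        opt = np.sign(w @ votes)                        # log-odds weighting
        return maj == truth, opt == truth

    res = np.array([trial() for _ in range(20_000)])
    print("majority:", res[:, 0].mean(), "log-odds:", res[:, 1].mean())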
