
from arxiv, Appears in: Advances in Neural Information Processing Systems 35 (NeurIPS 2022). Minor modifications with respect to the NeurIPS version. 58 pages, 6 algorithms, 9 figures, 4 tables

Variational inequalities are a formalism that includes games, minimization, saddle point, and equilibrium problems as special cases. Methods for variational inequalities are therefore universal approaches for many applied tasks, including machine learning problems. This work concentrates on the decentralized setting, which is increasingly important but not well understood. In particular, we consider decentralized stochastic (sum-type) variational inequalities over fixed and time-varying networks. We present lower complexity bounds for both communication and local iterations and construct optimal algorithms that match these lower bounds. Our algorithms are the best among the available literature not only in the decentralized stochastic case, but also in the decentralized deterministic and non-distributed stochastic cases. Experimental results confirm the effectiveness of the presented algorithms.
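As a rough illustration of the kind of problem this formalism covers, the sketch below runs the classical extragradient method on a small unconstrained variational inequality (find z with F(z) = 0) that comes from a saddle point problem. It is not the decentralized stochastic algorithm of the paper. The test problem, the names `operator` and `extragradient`, the step size, and the iteration count are all assumptions chosen for illustration.

```python
import math


def operator(z):
    """Operator F(z) = (df/dx, -df/dy) for f(x, y) = x^2/2 + x*y - y^2/2.

    Illustrative test problem (an assumption, not taken from the paper).
    F is strongly monotone and its unique zero is the saddle point (0, 0).
    """
    x, y = z
    return (x + y, y - x)


def extragradient(F, z0, step=0.3, iterations=100):
    """Classical extragradient iteration for an unconstrained problem.

    Each iteration takes a trial step from z, then re-evaluates F at the
    trial point and uses that value to update z. No projection is needed
    because the feasible set is assumed to be the whole plane.
    """
    z = tuple(z0)
    for _ in range(iterations):
        g = F(z)
        z_half = tuple(zi - step * gi for zi, gi in zip(z, g))
        g_half = F(z_half)
        z = tuple(zi - step * gi for zi, gi in zip(z, g_half))
    return z


solution = extragradient(operator, (1.0, 1.0))
distance = math.hypot(*solution)  # distance from the saddle point (0, 0)
```

On this toy problem the iterates contract toward the origin at a linear rate, so `distance` ends up close to zero. A constrained problem would additionally need a projection onto the feasible set after each step.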

Related content

Variational inequalities


Statistics · Learning · Similarity · Identical · Mutually independent

May 23, 2023

Statistical Indistinguishability of Learning Algorithms

Alkis Kalavasis,Amin Karbasi,Shay Moran,Grigoris Velegkas

When two different parties use the same learning rule on their own data, how can we test whether the distributions of the two outcomes are similar? In this paper, we study the similarity of outcomes of learning rules through the lens of the Total Variation (TV) distance of distributions. We say that a learning rule is TV indistinguishable if the expected TV distance between the posterior distributions of its outputs, executed on two training data sets drawn independently from the same distribution, is small. We first investigate the learnability of hypothesis classes using TV indistinguishable learners. Our main results are information-theoretic equivalences between TV indistinguishability and existing algorithmic stability notions such as replicability and approximate differential privacy. Then, we provide statistical amplification and boosting algorithms for TV indistinguishable learners.
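To make the distance used above concrete, here is a minimal sketch of the total variation distance between two distributions over a finite set, computed as half the L1 distance between their probability vectors. The paper's notion is the expected TV distance between the output distributions of a learning rule on two independent samples; this sketch covers only the inner distance. The function name `tv_distance` and the outcome labels `h1`, `h2` are hypothetical.

```python
def tv_distance(p, q):
    """Total variation distance between two finite distributions.

    p and q map outcomes to probabilities; an outcome missing from a
    dictionary is treated as having probability 0.
    """
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


# Identical output distributions are indistinguishable.
same = tv_distance({"h1": 0.5, "h2": 0.5}, {"h1": 0.5, "h2": 0.5})

# Distributions with disjoint supports are maximally far apart.
disjoint = tv_distance({"h1": 1.0}, {"h2": 1.0})

# A partial overlap falls in between.
partial = tv_distance({"h1": 0.75, "h2": 0.25}, {"h1": 0.25, "h2": 0.75})
```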

Lipschitz · Smoothness · Online · CASES · Regularization term

May 23, 2023

Data-Dependent Bounds for Online Portfolio Selection Without Lipschitzness and Smoothness

Chung-En Tsai,Ying-Ting Lin,Yen-Huan Li

from arxiv, 34 pages

This work introduces the first small-loss and gradual-variation regret bounds for online portfolio selection, marking the first instances of data-dependent bounds for online convex optimization with non-Lipschitz, non-smooth losses. The algorithms we propose exhibit sublinear regret rates in the worst cases and achieve logarithmic regrets when the data is "easy," with per-iteration time almost linear in the number of investment alternatives. The regret bounds are derived using novel smoothness characterizations of the logarithmic loss, a local norm-based analysis of following the regularized leader (FTRL) with self-concordant regularizers, which are not necessarily barriers, and an implicit variant of optimistic FTRL with the log-barrier.
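The following sketch shows why the logarithmic loss of online portfolio selection is not Lipschitz, using the standard form of the loss, the negative log of the inner product between the portfolio and the vector of price relatives. It does not implement the paper's algorithms. The function names and the numerical values are assumptions chosen for illustration.

```python
import math


def log_loss(portfolio, returns):
    """Logarithmic loss -log(<portfolio, returns>) for one round."""
    wealth_ratio = sum(x * r for x, r in zip(portfolio, returns))
    return -math.log(wealth_ratio)


def log_loss_gradient(portfolio, returns):
    """Gradient of the logarithmic loss with respect to the portfolio."""
    wealth_ratio = sum(x * r for x, r in zip(portfolio, returns))
    return [-r / wealth_ratio for r in returns]


# With equal price relatives the wealth ratio is 1, so the loss is 0.
balanced_loss = log_loss([0.5, 0.5], [1.0, 1.0])
balanced_grad = log_loss_gradient([0.5, 0.5], [1.0, 1.0])

# Only the first asset pays off here. As the weight on it shrinks,
# the gradient magnitude grows without bound.
returns = [1.0, 0.0]
grad_far = log_loss_gradient([0.5, 0.5], returns)
grad_near = log_loss_gradient([0.001, 0.999], returns)
```

Comparing `grad_far` with `grad_near` shows the gradient growing as the portfolio approaches the boundary of the simplex, which is the behavior that rules out a global Lipschitz constant.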

Generalization theory · Inference · MoDELS · Vector space · Approximation

May 23, 2023

Federated Variational Inference: Towards Improved Personalization and Generalization

Elahe Vedadi,Joshua V. Dillon,Philip Andrew Mansfield,Karan Singhal,Arash Afkanpour,Warren Richard Morningstar

from arxiv, 16 pages, 6 figures

Conventional federated learning algorithms train a single global model by leveraging all participating clients' data. However, due to heterogeneity in client generative distributions and predictive models, these approaches may not appropriately approximate the predictive process, converge to an optimal state, or generalize to new clients. We study personalization and generalization in stateless cross-device federated learning setups assuming heterogeneity in client data distributions and predictive models. We first propose a hierarchical generative model and formalize it using Bayesian Inference. We then approximate this process using Variational Inference to train our model efficiently. We call this algorithm Federated Variational Inference (FedVI). We use PAC-Bayes analysis to provide generalization bounds for FedVI. We evaluate our model on FEMNIST and CIFAR-100 image classification and show that FedVI beats the state-of-the-art on both tasks.

Approximation · Oracle · Functional · binary · Minimizer

May 22, 2023

Approximate degree lower bounds for oracle identification problems

Mark Bun,Nadezhda Voronova

from arxiv, 39 Pages. To appear at TQC 2023. v2: This update adds the generalization of our results to the weakly unbounded error setting

The approximate degree of a Boolean function is the minimum degree of real polynomial that approximates it pointwise. For any Boolean function, its approximate degree serves as a lower bound on its quantum query complexity, and generically lifts to a quantum communication lower bound for a related function. We introduce a framework for proving approximate degree lower bounds for certain oracle identification problems, where the goal is to recover a hidden binary string $x \in \{0, 1\}^n$ given possibly non-standard oracle access to it. Our lower bounds apply to decision versions of these problems, where the goal is to compute the parity of $x$. We apply our framework to the ordered search and hidden string problems, proving nearly tight approximate degree lower bounds of $\Omega(n/\log^2 n)$ for each. These lower bounds generalize to the weakly unbounded error setting, giving a new quantum query lower bound for the hidden string problem in this regime. Our lower bounds are driven by randomized communication upper bounds for the greater-than and equality functions.

Correlation coefficient · CCC · SICOMP · Output · Decomposable

May 20, 2023

Tight Correlation Bounds for Circuits Between AC0 and TC0

Vinayak M. Kumar

from arxiv, 43 pages; improved presentation and layout, added circuit constructions matching lower bounds

We initiate the study of generalized AC0 circuits comprised of negations and arbitrary unbounded fan-in gates that only need to be constant over inputs of Hamming weight $\ge k$, which we denote GC0$(k)$. The gate set of this class includes biased LTFs like the $k$-$OR$ (output $1$ iff $\ge k$ bits are 1) and $k$-$AND$ (output $0$ iff $\ge k$ bits are 0), and thus can be seen as an interpolation between AC0 and TC0. We establish a tight multi-switching lemma for GC0$(k)$ circuits, which bounds the probability that several depth-2 GC0$(k)$ circuits do not simultaneously simplify under a random restriction. We also establish a new depth reduction lemma such that, coupled with our multi-switching lemma, we can show that many results obtained from the multi-switching lemma for depth-$d$ size-$s$ AC0 circuits lift to depth-$d$ size-$s^{.99}$ GC0$(.01\log s)$ circuits with no loss in parameters (other than hidden constants). Our result has the following applications: 1. Size-$2^{\Omega(n^{1/d})}$ depth-$d$ GC0$(\Omega(n^{1/d}))$ circuits do not correlate with parity (extending a result of Håstad (SICOMP, 2014)). 2. Size-$n^{\Omega(\log n)}$ GC0$(\Omega(\log^2 n))$ circuits with $n^{.249}$ arbitrary threshold gates or $n^{.499}$ arbitrary symmetric gates exhibit exponentially small correlation against an explicit function (extending a result of Tan and Servedio (RANDOM, 2019)). 3. There is a seed length $O((\log m)^{d-1}\log(m/\varepsilon)\log\log(m))$ pseudorandom generator against size-$m$ depth-$d$ GC0$(\log m)$ circuits, matching the AC0 lower bound of Håstad up to a $\log\log m$ factor (extending a result of Lyu (CCC, 2022)). 4. Size-$m$ GC0$(\log m)$ circuits have exponentially small Fourier tails (extending a result of Tal (CCC, 2017)).
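The two example gates named in the abstract can be written down directly from their parenthetical definitions. The sketch below is only meant to make those definitions concrete; the function names `k_or` and `k_and` and the sample inputs are hypothetical.

```python
def k_or(bits, k):
    """k-OR gate: output 1 iff at least k input bits are 1."""
    return 1 if sum(bits) >= k else 0


def k_and(bits, k):
    """k-AND gate: output 0 iff at least k input bits are 0."""
    zeros = len(bits) - sum(bits)
    return 0 if zeros >= k else 1


bits = [1, 0, 1, 0, 0]  # Hamming weight 2, three zeros

# With k = 1 the gates reduce to the ordinary OR and AND.
plain_or = k_or(bits, 1)
plain_and = k_and(bits, 1)

# With larger k the gates act as biased thresholds.
two_or = k_or(bits, 2)
three_or = k_or(bits, 3)
three_and = k_and(bits, 3)
four_and = k_and(bits, 4)
```

Setting `k = 1` recovers the AC0 gate set, while letting `k` grow moves the gates toward general thresholds, which matches the interpolation between AC0 and TC0 described above.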

Stochastic gradient descent · Loss · Noise · Analysis · SGD

May 20, 2023

Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent

Lingjiong Zhu,Mert Gurbuzbalaban,Anant Raj,Umut Simsekli

from arxiv, 47 pages

Algorithmic stability is an important notion that has proven powerful for deriving generalization bounds for practical algorithms. The last decade has witnessed an increasing number of stability bounds for different algorithms applied to different classes of loss functions. While these bounds have illuminated various properties of optimization algorithms, the analysis of each case typically required a different proof technique with significantly different mathematical tools. In this study, we make a novel connection between learning theory and applied probability and introduce a unified guideline for proving Wasserstein stability bounds for stochastic optimization algorithms. We illustrate our approach on stochastic gradient descent (SGD) and we obtain time-uniform stability bounds (i.e., the bound does not increase with the number of iterations) for strongly convex losses and non-convex losses with additive noise, where we recover similar results to the prior art or extend them to more general cases by using a single proof technique. Our approach is flexible and can be generalized to other popular optimizers, as it mainly requires developing Lyapunov functions, which are often readily available in the literature. It also illustrates that ergodicity is an important component for obtaining time-uniform bounds -- which might not be achieved for convex or non-convex losses unless additional noise is injected into the iterates. Finally, we slightly stretch our analysis technique and prove time-uniform bounds for SGD under convex and non-convex losses (without additional additive noise), which, to our knowledge, are novel.

Facebook AI Research · state-of-the-art · Agent · Reducible · Heuristic algorithm

May 19, 2023

Practical algorithms and experimentally validated incentives for equilibrium-based fair division (A-CEEI)

Eric Budish,Ruiquan Gao,Abraham Othman,Aviad Rubinstein,Qianfan Zhang

from arxiv, To appear in EC 2023

Approximate Competitive Equilibrium from Equal Incomes (A-CEEI) is an equilibrium-based solution concept for fair division of discrete items to agents with combinatorial demands. In theory, it is known that in asymptotically large markets: 1. For incentives, the A-CEEI mechanism is Envy-Free-but-for-Tie-Breaking (EF-TB), which implies that it is Strategyproof-in-the-Large (SP-L). 2. From a computational perspective, computing the equilibrium solution is unfortunately a computationally intractable problem (in the worst-case, assuming $\textsf{PPAD}\ne \textsf{FP}$). We develop a new heuristic algorithm that outperforms the previous state-of-the-art by multiple orders of magnitude. This new, faster algorithm lets us perform experiments on real-world inputs for the first time. We discover that with real-world preferences, even in a realistic implementation that satisfies the EF-TB and SP-L properties, agents may have surprisingly simple and plausible deviations from truthful reporting of preferences. To this end, we propose a novel strengthening of EF-TB, which dramatically reduces the potential for strategic deviations from truthful reporting in our experiments. A (variant of) our algorithm is now in production: on real course allocation problems it is much faster, has zero clearing error, and has stronger incentive properties than the prior state-of-the-art implementation.

Performer · Pairwise · contrastive · Conformer · Analysis

April 28, 2023

ALMERIA: Boosting pairwise molecular contrasts with scalable methods

Rafael Mena-Yedra,Juana L. Redondo,Horacio Pérez-Sánchez,Pilar M. Ortigosa

Searching for potential active compounds in large databases is a necessary step to reduce time and costs in modern drug discovery pipelines. Such virtual screening methods seek to provide predictions that allow the search space to be narrowed down. Although cheminformatics has made great progress in exploiting the potential of available big data, caution is needed to avoid introducing bias and provide useful predictions with new compounds. In this work, we propose the decision-support tool ALMERIA (Advanced Ligand Multiconformational Exploration with Robust Interpretable Artificial Intelligence) for estimating compound similarities and activity prediction based on pairwise molecular contrasts while considering their conformation variability. The methodology covers the entire pipeline from data preparation to model selection and hyperparameter optimization. It has been implemented using scalable software and methods to exploit large volumes of data (on the order of several terabytes), offering a very quick response even for a large batch of queries. The implementation and experiments have been performed in a distributed computer cluster using a benchmark, the publicly accessible DUD-E database. In addition to cross-validation, detailed data split criteria have been used to evaluate the models on different data partitions to assess their true generalization ability with new compounds. Experiments show state-of-the-art performance for molecular activity prediction (ROC AUC: $0.99$, $0.96$, $0.87$), proving that the chosen data representation and modeling have good properties to generalize. Molecular conformations -- prediction performance and sensitivity analysis -- have also been evaluated. Finally, an interpretability analysis has been performed using the SHAP method.

Matrix theory · Linear · Euclidean space · Backpropagation algorithm · AIM

January 1, 2022

Matrix Decomposition and Applications

Jun Lu

from arxiv, arXiv admin note: substantial text overlap with arXiv:2107.02579

In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition, the factorization of a matrix into the product of lower and upper triangular matrices. Today, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, given the limited scope of this discussion, we cannot cover all the useful and interesting results concerning matrix decomposition, e.g., the separate analysis of the Euclidean space, Hermitian space, Hilbert space, and the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields.
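For readers who want to see the factorization mentioned above in action, here is a minimal sketch of an LU decomposition without pivoting (the Doolittle variant, with a unit diagonal in the lower factor). It is a textbook construction, not code from the survey. The names `lu_decompose` and `matmul` and the example matrix are assumptions, and the sketch simply fails when it meets a zero pivot instead of reordering rows.

```python
def lu_decompose(a):
    """Factor a square matrix as lower @ upper, without pivoting.

    lower has ones on its diagonal and upper is upper triangular.
    Raises ValueError on a zero pivot, since this sketch never swaps rows.
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(v) for v in row] for row in a]
    for col in range(n):
        pivot = upper[col][col]
        if pivot == 0.0:
            raise ValueError("zero pivot: this sketch does not pivot")
        for row in range(col + 1, n):
            factor = upper[row][col] / pivot
            lower[row][col] = factor  # record the elimination multiplier
            for j in range(col, n):
                upper[row][j] -= factor * upper[col][j]
    return lower, upper


def matmul(x, y):
    """Plain nested-list matrix product, used only to check the factors."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


A = [[4.0, 3.0], [6.0, 3.0]]
L, U = lu_decompose(A)
reconstructed = matmul(L, U)  # should reproduce A
```

Practical implementations use partial pivoting for numerical stability, so this version is suitable only for small, well-behaved examples.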

Generalization theory · Black box · Learned · INFORMS · Supervised learning algorithm

October 4, 2021

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan,Maxim Raginsky,Greg Ver Steeg,Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.