
We provide an interior point method based on quasi-Newton iterations, which only requires first-order access to a strongly self-concordant barrier function. To achieve this, we extend the techniques of Dunagan-Harvey [STOC '07] to maintain a preconditioner, while using only first-order information. We measure the quality of this preconditioner in terms of its relative eccentricity with respect to the unknown Hessian matrix, and we generalize these techniques to convex functions with a slowly changing Hessian. We combine this with an interior point method to show that, given first-order access to an appropriate barrier function for a convex set $K$, we can solve well-conditioned linear optimization problems over $K$ to $\varepsilon$ precision in time $\widetilde{O}\left(\left(\mathcal{T}+n^{2}\right)\sqrt{n\nu}\log\left(1/\varepsilon\right)\right)$, where $\nu$ is the self-concordance parameter of the barrier function, and $\mathcal{T}$ is the time required to make a gradient query. As a consequence, we show that:
$\bullet$ Linear optimization over $n$-dimensional convex sets can be solved in time $\widetilde{O}\left(\left(\mathcal{T}n+n^{3}\right)\log\left(1/\varepsilon\right)\right)$. This parallels the running time achieved by state-of-the-art cutting plane methods, when replacing separation oracles with first-order oracles for an appropriate barrier function.
$\bullet$ We can solve semidefinite programs involving $m\geq n$ matrices in $\mathbb{R}^{n\times n}$ in time $\widetilde{O}\left(mn^{4}+m^{1.25}n^{3.5}\log\left(1/\varepsilon\right)\right)$, improving over the state-of-the-art algorithms in the case where $m=\Omega\left(n^{\frac{3.5}{\omega-1.25}}\right)$.
Along the way we develop a host of tools allowing us to control the evolution of our potential functions, using techniques from matrix analysis and Schur convexity.
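
To make the quasi-Newton flavour concrete, here is a minimal, purely illustrative sketch: a BFGS-preconditioned centering loop on the log-barrier of a box, using only gradient queries. The box domain, step rule and BFGS update are assumptions made for illustration; this is not the paper's algorithm, which maintains its preconditioner via Dunagan-Harvey-style updates and controls its eccentricity relative to the true Hessian.

```python
import numpy as np

def barrier_grad(x, c, t, l, u):
    # gradient of t * c^T x - sum log(x - l) - sum log(u - x)
    return t * c - 1.0 / (x - l) + 1.0 / (u - x)

def barrier_val(x, c, t, l, u):
    if np.any(x <= l) or np.any(x >= u):
        return np.inf
    return t * (c @ x) - np.sum(np.log(x - l)) - np.sum(np.log(u - x))

def bfgs_update(B, s, y):
    # standard BFGS update of the Hessian approximation from a step s and gradient change y
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

def center(x, c, t, l, u, iters=100):
    B = np.eye(x.size)                   # crude initial "preconditioner"
    g = barrier_grad(x, c, t, l, u)
    for _ in range(iters):
        d = -np.linalg.solve(B, g)       # quasi-Newton direction, no Hessian queries
        alpha, fx = 1.0, barrier_val(x, c, t, l, u)
        # backtracking keeps the iterate strictly inside the box
        while barrier_val(x + alpha * d, c, t, l, u) > fx + 1e-4 * alpha * (g @ d):
            alpha *= 0.5
            if alpha < 1e-12:
                break
        x_new = x + alpha * d
        g_new = barrier_grad(x_new, c, t, l, u)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-12:                # curvature condition before updating B
            B = bfgs_update(B, s, y)
        x, g = x_new, g_new
    return x

n = 5
c, l, u = np.ones(n), np.zeros(n), np.ones(n)
x = 0.5 * np.ones(n)
for t in [1.0, 10.0, 100.0, 1000.0]:     # crude outer loop increasing the barrier weight
    x = center(x, c, t, l, u)
print(np.round(x, 4))                    # approaches the vertex of the box minimizing c^T x
```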

Related content

Score-based generative models are a popular class of generative modelling techniques relying on stochastic differential equations (SDE). From their inception, it was realized that it was also possible to perform generation using ordinary differential equations (ODE) rather than SDE. This led to the introduction of the probability flow ODE approach and denoising diffusion implicit models. Flow matching methods have recently further extended these ODE-based approaches and approximate a flow between two arbitrary probability distributions. Previous work derived bounds on the approximation error of diffusion models under the stochastic sampling regime, given assumptions on the $L^2$ loss. We present error bounds for the flow matching procedure using fully deterministic sampling, assuming an $L^2$ bound on the approximation error and a certain regularity condition on the data distributions.
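
As a point of reference for the deterministic sampling regime analysed above, the following sketch integrates the probability flow ODE of a variance-preserving diffusion with a simple Euler scheme, using the exact score of a one-dimensional Gaussian data distribution so that it is self-contained. The noise schedule, the Gaussian target and the Euler discretization are illustrative assumptions, not the setting of the stated error bounds.

```python
import numpy as np

beta_min, beta_max = 0.1, 20.0
mu, s0 = 3.0, 0.5                        # hypothetical 1-D Gaussian "data" distribution

def beta(t):
    return beta_min + t * (beta_max - beta_min)

def alpha(t):                            # exp(-0.5 * integral of beta from 0 to t)
    return np.exp(-0.5 * (beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2))

def score(x, t):
    # exact score of the VP marginal when the data distribution is N(mu, s0^2)
    a = alpha(t)
    var = a ** 2 * s0 ** 2 + 1.0 - a ** 2
    return -(x - a * mu) / var

def probability_flow_sample(n_samples=10000, n_steps=500):
    x = np.random.randn(n_samples)       # start from the prior N(0, 1) at t = 1
    ts = np.linspace(1.0, 1e-3, n_steps)
    for i in range(n_steps - 1):
        t, dt = ts[i], ts[i + 1] - ts[i]               # dt < 0: integrating backwards in time
        drift = -0.5 * beta(t) * (x + score(x, t))     # probability flow ODE drift
        x = x + drift * dt                             # explicit Euler step
    return x

samples = probability_flow_sample()
print(samples.mean(), samples.std())     # should approach mu = 3.0 and s0 = 0.5
```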

In this paper, we design sublinear-space streaming algorithms for estimating three fundamental parameters -- maximum independent set, minimum dominating set, and maximum matching -- on sparse graph classes, i.e., graphs satisfying $m=O(n)$, where $m$ and $n$ denote the number of edges and vertices, respectively. Each of the three graph parameters we consider can have size $\Omega(n)$ even on sparse graph classes, and hence sublinear-space algorithms are restricted to parameter estimation instead of attempting to find a solution.
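
For intuition about what sublinear-space parameter estimation on a sparse graph stream can look like, here is a sketch of a one-pass estimator of the Caro-Wei bound $\sum_v 1/(\deg(v)+1)$, a lower bound on the maximum independent set, that stores only the degrees of $k$ pre-sampled vertices. It is given purely as an illustration of the flavour of such algorithms; it is not one of the estimators developed in the paper.

```python
import random

def estimate_caro_wei(edge_stream, n, k=1000, seed=0):
    """One-pass, O(k)-space estimate of sum_v 1/(deg(v)+1), the Caro-Wei lower
    bound on the maximum independent set. Illustrative only."""
    rng = random.Random(seed)
    sampled = rng.sample(range(n), min(k, n))
    deg = {v: 0 for v in sampled}        # degrees tracked only for the sampled vertices
    for u, v in edge_stream:             # single pass over the edge stream
        if u in deg:
            deg[u] += 1
        if v in deg:
            deg[v] += 1
    return n / len(deg) * sum(1.0 / (d + 1) for d in deg.values())

# toy usage: a path graph, a sparse graph with m = n - 1 edges
n = 10_000
edges = ((i, i + 1) for i in range(n - 1))
print(estimate_caro_wei(edges, n))       # the true Caro-Wei bound here is about n/3
```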

Cross-validation is the standard approach to tuning parameter selection in many non-parametric regression problems. However, its use is less common in change-point regression, perhaps because its prediction-error-based criterion may appear to permit small spurious changes and hence be less well suited to estimating the number and locations of change-points. We show that, in fact, the problems with cross-validation using squared error loss are more severe: it can lead to systematic under- or over-estimation of the number of change-points, and to highly suboptimal estimation of the mean function, in simple settings where changes are easily detectable. We propose two simple approaches to remedy these issues, the first involving the use of absolute error rather than squared error loss, and the second involving modifying the holdout sets used. For the latter, we provide conditions that permit consistent estimation of the number of change-points for a general change-point estimation procedure. We show these conditions are satisfied for optimal partitioning using new results on its performance when supplied with an incorrect number of change-points. Numerical experiments show that the absolute error approach in particular is competitive with common change-point methods using classical tuning parameter choices when error distributions are well specified, but can substantially outperform them in misspecified models. An implementation of our methodology is available in the R package crossvalidationCP on CRAN.
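
To illustrate the mechanics involved, the sketch below evaluates an odd/even holdout cross-validation criterion for the number of segments under both squared and absolute error loss, using a simple exact least-squares segmentation fit on the training half. The fitter, split and data are illustrative assumptions; the full methodology and its guarantees are implemented in the crossvalidationCP package.

```python
import numpy as np

def fit_segmentation(y, k):
    """Exact least-squares piecewise-constant fit with k segments (O(k m^2) dynamic program)."""
    m = len(y)
    c1 = np.concatenate(([0.0], np.cumsum(y)))
    c2 = np.concatenate(([0.0], np.cumsum(y ** 2)))
    sse = lambda i, j: c2[j] - c2[i] - (c1[j] - c1[i]) ** 2 / (j - i)
    cost = np.full((k + 1, m + 1), np.inf)
    back = np.zeros((k + 1, m + 1), dtype=int)
    cost[0, 0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, m + 1):
            for i in range(seg - 1, j):
                c = cost[seg - 1, i] + sse(i, j)
                if c < cost[seg, j]:
                    cost[seg, j], back[seg, j] = c, i
    fit, j = np.empty(m), m
    for seg in range(k, 0, -1):              # backtrack and fill each segment with its mean
        i = back[seg, j]
        fit[i:j] = y[i:j].mean()
        j = i
    return fit

def cv_error(y, k, loss):
    """Fit k segments on odd-indexed points, score the held-out even-indexed points."""
    train, test = y[1::2], y[0::2]
    fit = fit_segmentation(train, k)
    pred = fit if len(fit) == len(test) else np.append(fit, fit[-1])
    return loss(test - pred).mean()

# one clearly detectable change in mean; compare squared vs absolute error CV over k segments
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1, 100)])
for k in range(1, 5):
    print(k, cv_error(y, k, np.square), cv_error(y, k, np.abs))
```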

This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.
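
The randomized batching idea can be pictured with the following toy sketch: a multilevel Monte Carlo gradient estimator that draws a random level $J$ with probability proportional to $2^{-J}$, averages gradients over $2^{J}$ consecutive samples of a slowly mixing AR(1) Markov chain, and combines the coarse and fine batch means into a single low-bias estimate. The chain, objective and constants are hypothetical; this is not the paper's exact scheme or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, rho = 2.0, 0.95                          # stationary mean and mixing of the noise chain

def chain_step(z):
    # AR(1) Markov chain with stationary distribution N(mu, 1); rho near 1 means slow mixing
    return mu + rho * (z - mu) + np.sqrt(1 - rho ** 2) * rng.standard_normal()

def grad(x, z):
    return x - z                             # stochastic gradient of f(x; z) = 0.5 * (x - z)^2

def mlmc_gradient(x, z, max_level=5):
    # Draw a level J with P(J = j) proportional to 2^{-j}, average gradients over 2^J
    # consecutive chain samples, and combine coarse/fine batch means into one estimate.
    p = np.array([2.0 ** -j for j in range(1, max_level + 1)])
    p /= p.sum()
    J = rng.choice(np.arange(1, max_level + 1), p=p)
    g_samples = []
    for _ in range(2 ** J):
        z = chain_step(z)
        g_samples.append(grad(x, z))
    g0 = g_samples[0]                        # single-sample (level-0) estimate
    g_coarse = np.mean(g_samples[: 2 ** (J - 1)])
    g_fine = np.mean(g_samples)
    return g0 + (g_fine - g_coarse) / p[J - 1], z

x, z = 0.0, mu
for t in range(5000):
    g_est, z = mlmc_gradient(x, z)
    x -= 0.02 * g_est                        # plain SGD step with the MLMC gradient estimate
print(x)                                     # should settle near the optimum mu = 2.0
```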

We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Previous work requires the decision maker to know beforehand some specific parameters related to the degree of strict feasibility of the offline problem. Moreover, when inputs are adversarial, it requires the existence of a strictly feasible solution to the offline optimization problem at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. We propose a best-of-both-worlds primal-dual framework which circumvents both assumptions by exploiting the notion of interval regret, providing guarantees under both stochastic and adversarial inputs. Our proof techniques can be applied to both input models with minimal modifications, thereby providing a unified perspective on the two problems. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.
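
For a concrete feel of primal-dual bidding under a budget constraint, the sketch below runs standard Lagrangian pacing in repeated second-price auctions: bid $v_t/(1+\lambda_t)$ and update the dual multiplier by gradient ascent on the per-round budget violation. It omits ROI constraints and the interval-regret machinery, and the value and competition distributions are hypothetical; it illustrates the primal-dual mechanics only, not the best-of-both-worlds framework itself.

```python
import numpy as np

rng = np.random.default_rng(0)
T, B = 10_000, 1000.0                    # horizon and total budget (hypothetical numbers)
rho = B / T                              # per-round target spend
eta = 0.01                               # dual step size

lam, budget, reward = 0.0, B, 0.0
for t in range(T):
    v = rng.uniform(0, 1)                # value of the impression
    d = rng.uniform(0, 1)                # highest competing bid (second-price auction)
    bid = min(v / (1.0 + lam), budget)   # Lagrangian (pacing) bid, capped by remaining budget
    pay = d if bid > d else 0.0          # pay the second price only upon winning
    if bid > d:
        reward += v - d
        budget -= pay
    lam = max(0.0, lam + eta * (pay - rho))   # dual ascent on the budget constraint
print(reward, budget)
```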

We apply the Hierarchical Autoregressive Neural (HAN) network sampling algorithm to the two-dimensional $Q$-state Potts model and perform simulations around the phase transition at $Q=12$. We quantify the performance of the approach in the vicinity of the first-order phase transition and compare it with that of the Wolff cluster algorithm. We find a significant improvement in statistical uncertainty at a similar numerical effort. In order to efficiently train large neural networks, we introduce the technique of pre-training: it allows us to train neural networks on smaller system sizes and then employ them as starting configurations for larger system sizes. This is possible due to the recursive construction of our hierarchical approach. Our results serve as a demonstration of the performance of the hierarchical approach for systems exhibiting bimodal distributions. Additionally, we provide estimates of the free energy and entropy in the vicinity of the phase transition, with statistical uncertainties of the order of $10^{-7}$ for the former and $10^{-3}$ for the latter, based on statistics of $10^6$ configurations.

Due to the high communication overhead when training machine learning models in a distributed environment, modern algorithms invariably rely on lossy communication compression. However, when untreated, the errors caused by compression propagate and can lead to severely unstable behavior, including exponential divergence. Almost a decade ago, Seide et al. [2014] proposed an error feedback (EF) mechanism, which we refer to as EF14, as an immensely effective heuristic for mitigating this issue. However, despite steady algorithmic and theoretical advances in the EF field over the last decade, our understanding is far from complete. In this work we address one of the most pressing issues. In particular, in the canonical nonconvex setting, all known variants of EF rely on very large batch sizes to converge, which can be prohibitive in practice. We propose a surprisingly simple fix which removes this issue both theoretically and in practice: the application of Polyak's momentum to the latest incarnation of EF due to Richt\'{a}rik et al. [2021], known as EF21. Our algorithm, for which we coin the name EF21-SGDM, improves the communication and sample complexities of previous error feedback algorithms under standard smoothness and bounded variance assumptions, and does not require any further strong assumptions such as bounded gradient dissimilarity. Moreover, we propose a double momentum version of our method that improves the complexities even further. Our proof appears to be novel even when compression is removed from the method, and as such, our proof technique is of independent interest in the study of nonconvex stochastic optimization enriched with Polyak's momentum.
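
To make the mechanism concrete, here is a small simulated sketch of EF21-style error feedback with top-$k$ compression combined with a Polyak-style momentum estimate of the local gradients, on synthetic quadratic objectives. The placement of the momentum, the choice of compressor and all constants are assumptions made for illustration and may differ from the exact EF21-SGDM updates in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_workers = 50, 10

def top_k(v, k):
    # top-k sparsifier: keep the k largest-magnitude entries, zero out the rest
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

# hypothetical local objectives f_i(x) = 0.5 * ||A_i x - b_i||^2 with noisy gradient oracles
A = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_workers)]
b = [rng.standard_normal(d) for _ in range(n_workers)]

def local_grad(i, x, noise=0.01):
    return A[i].T @ (A[i] @ x - b[i]) + noise * rng.standard_normal(d)

x = np.zeros(d)
v = [np.zeros(d) for _ in range(n_workers)]                  # per-worker momentum buffers
g = [local_grad(i, x, noise=0.0) for i in range(n_workers)]  # per-worker EF21 gradient trackers
gamma, eta, k = 0.05, 0.1, 5                                 # step size, momentum, sparsity level

for t in range(3000):
    x = x - gamma * np.mean(g, axis=0)                       # server step with aggregated trackers
    for i in range(n_workers):
        v[i] = (1 - eta) * v[i] + eta * local_grad(i, x)     # momentum estimate of the local gradient
        g[i] = g[i] + top_k(v[i] - g[i], k)                  # EF21-style update: only the compressed
                                                             # difference is ever communicated

# report the norm of the exact average gradient at the final iterate
full = np.mean([local_grad(i, x, noise=0.0) for i in range(n_workers)], axis=0)
print(np.linalg.norm(full))
```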

We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this is to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation problem in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lens of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning -- including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.
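
As one well-known example of an interacting particle ensemble of the kind alluded to above, the sketch below runs Stein variational gradient descent on a two-dimensional Gaussian target: each ensemble member follows a kernel-weighted drift towards high density plus a repulsive term coming from the kernel gradient. It is offered only as an illustration of particle interactions; it is not claimed to be one of the ensembling schemes derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# target "posterior": a 2-D Gaussian with mean m and covariance C (hypothetical)
m = np.array([1.0, -1.0])
C = np.array([[1.0, 0.5], [0.5, 1.0]])
C_inv = np.linalg.inv(C)

def grad_log_p(theta):
    # gradient of log N(m, C), evaluated row-wise for an (n, d) array of particles
    return -(theta - m) @ C_inv

def rbf(theta, h=1.0):
    # RBF kernel matrix K[i, j] = k(theta_i, theta_j) and its gradient w.r.t. theta_i
    diff = theta[:, None, :] - theta[None, :, :]             # (n, n, d)
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))   # (n, n)
    gradK = -diff / h ** 2 * K[:, :, None]                   # (n, n, d)
    return K, gradK

n, step = 20, 0.5
theta = 2.0 * rng.standard_normal((n, 2))      # the ensemble of parameter "particles"

for it in range(3000):
    K, gradK = rbf(theta)
    # SVGD direction: kernel-weighted drift towards high density plus kernel repulsion
    phi = (K @ grad_log_p(theta) + gradK.sum(axis=0)) / n
    theta = theta + step * phi

print(theta.mean(axis=0))     # should be close to the target mean [1, -1]
print(np.cov(theta.T))        # should roughly reflect the target covariance C
```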

We study the problem of optimizing data storage and access costs on the cloud while ensuring that the desired performance or latency is unaffected. First, we propose an optimizer that selects the data placement tier (on the cloud) and the choice of compression schemes to apply, for given data partitions with temporal access predictions. Second, we propose a model that learns the compression performance of multiple algorithms across data partitions in different formats, generating compression performance predictions on the fly as inputs to the optimizer. Third, we propose to approach the data partitioning problem fundamentally differently from the current default in most data lakes, where partitioning is in the form of ingestion batches. We propose access-pattern-aware data partitioning and formulate an optimization problem that optimizes the size and reading costs of partitions subject to access patterns. We study the various optimization problems theoretically as well as empirically, and provide theoretical bounds as well as hardness results. We propose a unified cost-minimization pipeline, called SCOPe, that combines the different modules. We extensively compare the performance of our methods with related baselines from the literature on TPC-H data as well as enterprise datasets (ranging from GB to PB in volume) and show that SCOPe substantially improves over the baselines, with cost savings relative to platform baselines of the order of 50% to 83% on enterprise Data Lake datasets that range from terabytes to petabytes in volume.
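
The tier-placement subproblem can be pictured with the toy sketch below, which picks, per partition, the cheapest (storage tier, compression codec) pair whose read latency stays under a cap, given predicted monthly accesses. All prices, compression ratios, latency factors and the greedy per-partition rule are hypothetical illustrations; SCOPe's optimizer and its theoretical guarantees are considerably more involved.

```python
from dataclasses import dataclass

# hypothetical tier catalog: ($/GB-month storage, $/GB read, read latency factor)
TIERS = {
    "hot":     (0.020, 0.000, 1.0),
    "cool":    (0.010, 0.010, 1.5),
    "archive": (0.002, 0.050, 20.0),
}
# hypothetical codec catalog: (compressed size / raw size, decode latency factor)
CODECS = {
    "none": (1.00, 1.0),
    "lz4":  (0.60, 1.1),
    "zstd": (0.45, 1.4),
}

@dataclass
class Partition:
    size_gb: float            # raw partition size
    reads_per_month: float    # predicted accesses from the access-prediction model

def monthly_cost(p, tier, codec):
    store_price, read_price, _ = TIERS[tier]
    ratio, _ = CODECS[codec]
    stored = p.size_gb * ratio
    return store_price * stored + read_price * stored * p.reads_per_month

def latency(tier, codec):
    return TIERS[tier][2] * CODECS[codec][1]

def place(p, max_latency):
    # cheapest (tier, codec) pair whose read latency stays under the cap
    feasible = [(monthly_cost(p, t, c), t, c)
                for t in TIERS for c in CODECS if latency(t, c) <= max_latency]
    return min(feasible)

cold = Partition(size_gb=500, reads_per_month=0.1)
hot = Partition(size_gb=500, reads_per_month=200)
print(place(cold, max_latency=30.0))   # rarely read: the archive tier wins despite read charges
print(place(hot, max_latency=2.0))     # heavily read: read charges make the hot tier cheapest
```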

Hypergeometric sequences are rational-valued sequences that satisfy first-order linear recurrence relations with polynomial coefficients; that is, a hypergeometric sequence $\langle u_n \rangle_{n=0}^{\infty}$ is one that satisfies a recurrence of the form $f(n)u_n = g(n)u_{n-1}$ where $f,g \in \mathbb{Z}[x]$. In this paper, we consider the Membership Problem for hypergeometric sequences: given a hypergeometric sequence $\langle u_n \rangle_{n=0}^{\infty}$ and a target value $t\in \mathbb{Q}$, determine whether $u_n=t$ for some index $n$. We establish decidability of the Membership Problem under the assumption that either (i) $f$ and $g$ have distinct splitting fields or (ii) $f$ and $g$ are monic polynomials that both split over a quadratic extension of $\mathbb{Q}$. Our results are based on an analysis of the prime divisors of polynomial sequences $\langle f(n) \rangle_{n=1}^\infty$ and $\langle g(n) \rangle_{n=1}^\infty$ appearing in the recurrence relation.
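
The following small sketch simply unfolds the recurrence $f(n)u_n = g(n)u_{n-1}$ with exact rational arithmetic and tests $u_n = t$ up to a finite index bound. It is only a brute-force illustration of the objects involved; the decidability results above rest on analysing the prime divisors of $\langle f(n) \rangle$ and $\langle g(n) \rangle$, not on bounded search.

```python
from fractions import Fraction

def membership_up_to(f, g, u0, t, N):
    """Check whether u_n = t for some n <= N, where f(n) u_n = g(n) u_{n-1}.

    f, g are integer-coefficient polynomials given as callables; u0, t are rationals.
    Brute force up to N is an illustration only, not a decision procedure.
    """
    u = Fraction(u0)
    if u == t:
        return 0
    for n in range(1, N + 1):
        if f(n) == 0:
            raise ValueError("f vanishes at n = %d; the recurrence is undefined there" % n)
        u = u * Fraction(g(n), f(n))     # u_n = (g(n) / f(n)) * u_{n-1}
        if u == t:
            return n
    return None

# example: f(n) = n, g(n) = 4n - 2, u_0 = 1 gives the central binomial coefficients C(2n, n)
f = lambda n: n
g = lambda n: 4 * n - 2
print(membership_up_to(f, g, 1, Fraction(70), 50))   # C(8, 4) = 70 is hit at n = 4
```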
