
We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $\Sigma \preccurlyeq \sigma^2 I_d$ given $n$ samples. When $d = 1$, Catoni \cite{catoni} showed an estimator with error $(1+o(1)) \cdot \sigma \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$, with probability $1 - \delta$, matching the Gaussian error rate. For $d>1$, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a $1-\delta$ confidence radius of $\sqrt{\frac{2 d}{d+1}} \cdot \sigma \left(\sqrt{\frac{d}{n}} + \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}\right)$, incurring a $\sqrt{\frac{2d}{d+1}}$-factor loss over the Gaussian rate. When the $\sqrt{\frac{d}{n}}$ term dominates by a $\sqrt{\log \frac{1}{\delta}}$ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: is the Gaussian rate achievable in general? Or is the $\sqrt{\frac{2 d}{d+1}}$ loss \emph{necessary} when the $\sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$ term dominates? We show that the answer to both these questions is \emph{no} -- we show that \emph{some} constant-factor loss over the Gaussian rate is necessary, but construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an $\epsilon$-fraction of samples arbitrarily: in this case, we show that the above strategy of combining one-dimensional estimates and incurring the $\sqrt{\frac{2d}{d+1}}$-factor \emph{is} optimal in the infinite-sample limit.
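The two rates quoted in the abstract can be compared numerically. The following is a minimal sketch, not the paper's estimator: it only evaluates the closed-form expressions given above, and the function names `gaussian_rate` and `naive_radius` are hypothetical names chosen for illustration.

```python
import math


def gaussian_rate(d: int, n: int, sigma: float, delta: float) -> float:
    """Gaussian error rate sigma * (sqrt(d/n) + sqrt(2 log(1/delta) / n))."""
    return sigma * (math.sqrt(d / n) + math.sqrt(2 * math.log(1 / delta) / n))


def naive_radius(d: int, n: int, sigma: float, delta: float) -> float:
    """Confidence radius of the minimum-enclosing-ball estimator.

    This is the Gaussian rate inflated by the factor sqrt(2d / (d + 1)),
    as stated in the abstract.
    """
    return math.sqrt(2 * d / (d + 1)) * gaussian_rate(d, n, sigma, delta)


if __name__ == "__main__":
    # Illustrative parameter values (assumptions, not from the paper).
    n, sigma, delta = 10_000, 1.0, 0.01
    for d in (1, 2, 10, 1000):
        ratio = naive_radius(d, n, sigma, delta) / gaussian_rate(d, n, sigma, delta)
        print(f"d={d:5d}  loss factor={ratio:.4f}")
```

The loss factor equals 1 when $d = 1$ and increases toward $\sqrt{2}$ as $d$ grows, which is the gap the abstract asks about.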

Related content

Estimation / Estimator


Learning · Decomposable · Autoencoder · Vectorization · Variational autoencoder

January 19, 2024

CFASL: Composite Factor-Aligned Symmetry Learning for Disentanglement in Variational AutoEncoder

Hee-Jun Jung,Jaehyoung Jeong,Kangil Kim

from arxiv, 21 pages, 14 figures

Symmetries of input and latent vectors have provided valuable insights for disentanglement learning in VAEs. However, only a few works were proposed as an unsupervised method, and even these works require known factor information in training data. We propose a novel method, Composite Factor-Aligned Symmetry Learning (CFASL), which is integrated into VAEs for learning symmetry-based disentanglement in unsupervised learning without any knowledge of the dataset factor information. CFASL incorporates three novel features for learning symmetry-based disentanglement: 1) injecting inductive bias to align latent vector dimensions to factor-aligned symmetries within an explicit learnable symmetry codebook; 2) learning a composite symmetry to express the unknown factor changes between two random samples by learning factor-aligned symmetries within the codebook; 3) inducing a group equivariant encoder and decoder in training VAEs with the two conditions. In addition, we propose an extended evaluation metric for multi-factor changes in comparison to disentanglement evaluation in VAEs. In quantitative and in-depth qualitative analysis, CFASL demonstrates a significant improvement of disentanglement in single-factor change and multi-factor change conditions compared to state-of-the-art methods.

Loss function (machine learning) · Functional · Loss · Graph · CASES

January 18, 2024

Bounding the Interleaving Distance for Mapper Graphs with a Loss Function

Erin W. Chambers,Elizabeth Munch,Sarah Percival,Bei Wang

from arxiv, Title and focus changed since we realized that the loss function applied to a broader class of inputs than simply geometric graphs

Data consisting of a graph with a function to $\mathbb{R}^d$ arise in many data applications, encompassing structures such as Reeb graphs, geometric graphs, and knot embeddings. As such, the ability to compare and cluster such objects is required in a data analysis pipeline, leading to a need for distances or metrics between them. In this work, we study the interleaving distance on discretizations of these objects, $\mathbb{R}^d$-mapper graphs, where functor representations of the data can be compared by finding pairs of natural transformations between them. However, in many cases, computation of the interleaving distance is NP-hard. For this reason, we take inspiration from the work of Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation, called assignments. We then endow the functor images with the extra structure of a metric space and define a loss function which measures how far an assignment is from making the required diagrams of an interleaving commute. Finally we show that the computation of the loss function is polynomial. We believe this idea is both powerful and translatable, with the potential to be used for approximation and bounds on interleavings in a broad array of contexts.

Oracle · Separable · Similarity · Paper

January 18, 2024

Classical vs Quantum Advice and Proofs under Classically-Accessible Oracle

Xingjian Li,Qipeng Liu,Angelos Pelecanos,Takashi Yamakawa

from arxiv, 31 pages. Added classically-accessible classical oracle separation of QMA and QCMA and updated the abstract. v4: Fixed an issue with the proof of Claim 5.2

It is a long-standing open question to construct a classical oracle relative to which BQP/qpoly $\neq$ BQP/poly or QMA $\neq$ QCMA. In this paper, we construct classically-accessible classical oracles relative to which BQP/qpoly $\neq$ BQP/poly and QMA $\neq$ QCMA. Here, classically-accessible classical oracles are oracles that can be accessed only classically even for quantum algorithms. Based on a similar technique, we also show an alternative proof for the separation of QMA and QCMA relative to a distributional quantumly-accessible classical oracle, which was recently shown by Natarajan and Nirkhe.

Linear · Optimizer · MoDELS · Stationary · Class

January 18, 2024

Interpolatory Necessary Optimality Conditions for Reduced-order Modeling of Parametric Linear Time-invariant Systems

Petar Mlinarić,Peter Benner,Serkan Gugercin

from arxiv, 8 pages

Interpolatory necessary optimality conditions for $\mathcal{H}_2$-optimal reduced-order modeling of non-parametric linear time-invariant (LTI) systems are known and well-investigated. In this work, using the general framework of $\mathcal{L}_2$-optimal reduced-order modeling of parametric stationary problems, we derive interpolatory $\mathcal{H}_2 \otimes \mathcal{L}_2$-optimality conditions for parametric LTI systems with a general pole-residue form. We then specialize this result to recover known conditions for systems with parameter-independent poles and develop new conditions for a certain class of systems with parameter-dependent poles.

Sample · Discretization · Batch Size · Computational learning theory

January 17, 2024

Tight Group-Level DP Guarantees for DP-SGD with Sampling via Mixture of Gaussians Mechanisms

Arun Ganesh

We give a procedure for computing group-level $(\epsilon, \delta)$-DP guarantees for DP-SGD, when using Poisson sampling or fixed batch size sampling. Up to discretization errors in the implementation, the DP guarantees computed by this procedure are tight (assuming we release every intermediate iterate).

Networking · Neural Networks · Importance sampling · Sample · Mean squared error

January 17, 2024

DMIS: Dynamic Mesh-based Importance Sampling for Training Physics-Informed Neural Networks

Zijiang Yang,Zhongwei Qiu,Dongmei Fu

from arxiv, Accepted to AAAI-23

Modeling dynamics in the form of partial differential equations (PDEs) is an effectual way to understand real-world physics processes. For complex physics systems, analytical solutions are not available and numerical solutions are widely used. However, traditional numerical algorithms are computationally expensive and challenging in handling multiphysics systems. Recently, using neural networks to solve PDEs has made significant progress, called physics-informed neural networks (PINNs). PINNs encode physical laws into neural networks and learn the continuous solutions of PDEs. For the training of PINNs, existing methods suffer from the problems of inefficiency and unstable convergence, since the PDE residuals require calculating automatic differentiation. In this paper, we propose Dynamic Mesh-based Importance Sampling (DMIS) to tackle these problems. DMIS is a novel sampling scheme based on importance sampling, which constructs a dynamic triangular mesh to estimate sample weights efficiently. DMIS has broad applicability and can be easily integrated into existing methods. The evaluation of DMIS on three widely used benchmarks shows that DMIS improves the convergence speed and accuracy at the same time. Especially in solving the highly nonlinear Schr\"odinger Equation, compared with state-of-the-art methods, DMIS shows up to 46% smaller root mean square error and five times faster convergence speed. Code is available at //github.com/MatrixBrain/DMIS.

Functional · Performer · Reducible · Learning · state-of-the-art

January 16, 2024

Shabari: Delayed Decision-Making for Faster and Efficient Serverless Function

Prasoon Sinha,Kostis Kaffes,Neeraja J. Yadwadkar

from arxiv, 17 pages, 14 figures

Serverless computing relieves developers from the burden of resource management, thus providing ease-of-use to the users and the opportunity to optimize resource utilization for the providers. However, today's serverless systems lack performance guarantees for function invocations, thus limiting support for performance-critical applications: we observed severe performance variability (up to 6x). Providers lack visibility into user functions and hence find it challenging to right-size them: we observed heavy resource underutilization (up to 80%). To understand the causes behind the performance variability and underutilization, we conducted a measurement study of commonly deployed serverless functions and learned that the function performance and resource utilization depend crucially on function semantics and inputs. Our key insight is to delay making resource allocation decisions until after the function inputs are available. We introduce Shabari, a resource management framework for serverless systems that makes decisions as late as possible to right-size each invocation to meet functions' performance objectives (SLOs) and improve resource utilization. Shabari uses an online learning agent to right-size each function invocation based on the features of the function input and makes cold-start-aware scheduling decisions. For a range of serverless functions and inputs, Shabari reduces SLO violations by 11-73% while not wasting any vCPUs and reducing wasted memory by 64-94% in the median case, compared to state-of-the-art systems, including Aquatope, Parrotfish, and Cypress.

Lipschitz · Lipschitz continuity · Continuity · Rescaling · Discretization

January 16, 2024

A Continuous-Time Perspective on Global Acceleration for Monotone Equation Problems

Tianyi Lin,Michael I. Jordan

from arxiv, Accepted by Communications in Optimization Theory; 29 Pages

We propose a new framework to design and analyze accelerated methods that solve general monotone equation (ME) problems $F(x)=0$. Traditional approaches include generalized steepest descent methods and inexact Newton-type methods. If $F$ is uniformly monotone and twice differentiable, these methods achieve local convergence rates while the latter methods are globally convergent thanks to line search and hyperplane projection. However, a global rate is unknown for these methods. The variational inequality methods can be applied to yield a global rate that is expressed in terms of $\|F(x)\|$ but these results are restricted to first-order methods and a Lipschitz continuous operator. It has not been clear how to obtain global acceleration using high-order Lipschitz continuity. This paper takes a continuous-time perspective where accelerated methods are viewed as the discretization of dynamical systems. Our contribution is to propose accelerated rescaled gradient systems and prove that they are equivalent to closed-loop control systems. Based on this connection, we establish the properties of solution trajectories. Moreover, we provide a unified algorithmic framework obtained from discretization of our system, which together with two approximation subroutines yields both existing high-order methods and new first-order methods. We prove that the $p^{th}$-order method achieves a global rate of $O(k^{-p/2})$ in terms of $\|F(x)\|$ if $F$ is $p^{th}$-order Lipschitz continuous and the first-order method achieves the same rate if $F$ is $p^{th}$-order strongly Lipschitz continuous. If $F$ is strongly monotone, the restarted versions achieve local convergence with order $p$ when $p \geq 2$. Our discrete-time analysis is largely motivated by the continuous-time analysis and demonstrates the fundamental role that rescaled gradients play in global acceleration for solving ME problems.

Reducible · Decomposable · Linear · Scalar · CASE

January 16, 2024

Hypergeometric Solutions of Linear Difference Systems

Moulay Barkatou,Mark van Hoeij,Johannes Middeke,Yi Zhou

from arxiv, 24 pages

We extend Petkov\v{s}ek's algorithm for computing hypergeometric solutions of scalar difference equations to the case of difference systems $\tau(Y) = M Y$, with $M \in {\rm GL}_n(C(x))$, where $\tau$ is the shift operator. Hypergeometric solutions are solutions of the form $\gamma P$ where $P \in C(x)^n$ and $\gamma$ is a hypergeometric term over $C(x)$, i.e. ${\tau(\gamma)}/{\gamma} \in C(x)$. Our contributions concern efficient computation of a set of candidates for ${\tau(\gamma)}/{\gamma}$ which we write as $\lambda = c\frac{A}{B}$ with monic $A, B \in C[x]$, $c \in C^*$. Factors of the denominators of $M^{-1}$ and $M$ give candidates for $A$ and $B$, while another algorithm is needed for $c$. We use the super-reduction algorithm to compute candidates for $c$, as well as other ingredients to reduce the list of candidates for $A/B$. To further reduce the number of candidates $A/B$, we bound the so-called type of $A/B$ by bounding local types. Our algorithm has been implemented in Maple and experiments show that our implementation can handle systems of high dimension, which is useful for factoring operators.
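As a small illustration of the definitions above (a standard textbook instance, not an example taken from the paper), consider the scalar case $n = 1$ with $M = (x+1)$:

$$\tau(Y) = (x+1)\,Y, \qquad \gamma = \Gamma(x+1), \qquad \frac{\tau(\gamma)}{\gamma} = \frac{\Gamma(x+2)}{\Gamma(x+1)} = x+1 \in C(x).$$

Here $\gamma$ is a hypergeometric term and $Y = \gamma P$ with $P = 1$ is a hypergeometric solution. In the notation $\lambda = c\frac{A}{B}$ one has $c = 1$, $A = x+1$, $B = 1$, and $A$ coincides with the denominator of $M^{-1} = \frac{1}{x+1}$.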

Learned · Generalization theory · AIM · state-of-the-art · Reinforcement learning

October 24, 2019

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Tianhe Yu,Deirdre Quillen,Zhanpeng He,Ryan Julian,Karol Hausman,Chelsea Finn,Sergey Levine

from arxiv, CoRL 2019. Videos are here: meta-world.github.io and open-sourced codes are available at: //github.com/rlworkgroup/metaworld

Meta-reinforcement learning algorithms can enable robots to acquire new skills much more quickly, by leveraging prior experience to learn how to learn. However, much of the current research on meta-reinforcement learning focuses on task distributions that are very narrow. For example, a commonly used meta-reinforcement learning benchmark uses different running velocities for a simulated robot as different tasks. When policies are meta-trained on such narrow task distributions, they cannot possibly generalize to more quickly acquire entirely new tasks. Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors. In this paper, we propose an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. Our aim is to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks. We evaluate 6 state-of-the-art meta-reinforcement learning and multi-task learning algorithms on these tasks. Surprisingly, while each task and its variations (e.g., with different object positions) can be learned with reasonable success, these algorithms struggle to learn with multiple tasks at the same time, even with as few as ten distinct training tasks. Our analysis and open-source environments pave the way for future research in multi-task learning and meta-learning that can enable meaningful generalization, thereby unlocking the full potential of these methods.