It is well known that for general linear systems, optimal Krylov methods exist only with long recurrences. For special classes of linear systems, it is possible to find optimal Krylov methods with short recurrences. In this paper we consider the important class of linear systems with a shifted skew-symmetric coefficient matrix. We present the MRS3 solver, a minimal residual method that solves these problems using short vector recurrences. We give an overview of existing Krylov solvers that can be used to solve these problems, and compare them with the MRS3 method, both theoretically and by numerical experiments. From this comparison we argue that the MRS3 solver is the fastest and most robust of these Krylov methods for systems with a shifted skew-symmetric coefficient matrix.
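To make the problem class concrete: a shifted skew-symmetric matrix can be written as $A = \alpha I + S$ with $S^T = -S$. The following is a minimal Python sketch of that structure only, not of the MRS3 solver; the helper names and the example values are illustrative assumptions. It builds a small matrix of this form and checks that $x^T A x = \alpha\, x^T x$, since the skew-symmetric part contributes nothing to the quadratic form.

```python
def shifted_skew_symmetric(alpha, upper):
    """Build A = alpha*I + S, where S is skew-symmetric.

    `upper` maps index pairs (i, j) with i < j to the entry S[i][j];
    the mirrored entry S[j][i] is set to -S[i][j].
    (Hypothetical helper, for illustration only.)
    """
    n = 1 + max(max(i, j) for i, j in upper)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = alpha          # the shift alpha*I
    for (i, j), value in upper.items():
        a[i][j] = value          # strictly upper part of S
        a[j][i] = -value         # skew-symmetry: S^T = -S
    return a


def quadratic_form(a, x):
    """Return x^T A x for a dense matrix stored as a list of rows."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


# Illustrative values (assumptions, not taken from the paper).
alpha = 2.0
A = shifted_skew_symmetric(alpha, {(0, 1): 3.0, (0, 2): -1.0, (1, 2): 4.0})
x = [1.0, -2.0, 0.5]

lhs = quadratic_form(A, x)               # x^T A x
rhs = alpha * sum(v * v for v in x)      # alpha * x^T x
print(lhs, rhs)
```

The two printed values agree because the off-diagonal contributions cancel in pairs.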

Related content

Linear systems

Question answering · MoDELS · INFORMS · Language modeling · Analysis

May 26, 2023

An Empirical Comparison of LM-based Question and Answer Generation Methods

Asahi Ushio,Fernando Alva-Manchego,Jose Camacho-Collados

from arxiv, Accepted by ACL 2023 Findings

Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context (e.g. a paragraph). This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval and education. In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning. Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches. However, there are differences depending on the underlying generative LM. Finally, our analysis shows that QA models fine-tuned solely on generated question-answer pairs can be competitive when compared to supervised QA models trained on human-labeled data.

Estimation/estimator · Graph · Stream · Sparse · Scenario

May 26, 2023

Sublinear-Space Streaming Algorithms for Estimating Graph Parameters on Sparse Graphs

Xiuge Chen,Rajesh Chitnis,Patrick Eades,Anthony Wirth

In this paper, we design sublinear-space streaming algorithms for estimating three fundamental parameters -- maximum independent set, minimum dominating set and maximum matching -- on sparse graph classes, i.e., graphs which satisfy $m=O(n)$, where $m$ and $n$ are the numbers of edges and vertices, respectively. Each of the three graph parameters we consider can have size $\Omega(n)$ even on sparse graph classes, and hence for sublinear-space algorithms we are restricted to parameter estimation instead of attempting to find a solution.

Rank · Column · Row · Orthogonal · Projection

May 26, 2023

Kaczmarz-Type Method for Solving Matrix Equation $AXB=C$

Xing Lili,Bao Wendi,Li Weiguo

In this paper, several row and column orthogonal projection methods are proposed for solving the matrix equation $AXB=C$, where the matrices $A$ and $B$ may be full rank or rank deficient and the equation may be consistent or inconsistent. These are iterative methods that require no matrix multiplication. It is proved theoretically that these methods converge to the solution or the least-squares solution of the matrix equation. Numerical results show that these methods are more efficient than iterative methods involving matrix multiplication for high-dimensional matrices.
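For orientation, the classical Kaczmarz iteration for a vector system $Ax=b$ projects the current iterate onto one row's hyperplane at a time, using only inner products. The sketch below shows that classical scheme on a small consistent system. It is not the paper's method for $AXB=C$; the function name, the sweep count, and the test data are assumptions made for illustration.

```python
def kaczmarz(a, b, sweeps=200):
    """Classical cyclic Kaczmarz for a consistent system a x = b.

    Each step projects x orthogonally onto the hyperplane
    {x : row . x = b_i}, which needs only vector inner products.
    (Illustrative sketch; `sweeps` is an arbitrary assumed default.)
    """
    n = len(a[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, bi in zip(a, b):
            norm_sq = sum(v * v for v in row)
            if norm_sq == 0.0:
                continue  # a zero row carries no information; skip it
            residual = bi - sum(r * xi for r, xi in zip(row, x))
            step = residual / norm_sq
            x = [xi + step * r for xi, r in zip(x, row)]
    return x


# Small consistent system (assumed data): 2x + y = 3 and x + 3y = 5,
# whose exact solution is x = 0.8, y = 1.4.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = kaczmarz(A, b)
print(x)
```

No matrix-matrix or matrix-vector product appears in the loop; each update touches a single row, which is the property the abstract emphasizes for its own projection methods.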

Learning · Discriminator · Episode · Reward function · Functional

May 25, 2023

Skill-Based Reinforcement Learning with Intrinsic Reward Matching

Ademi Adeniji,Amber Xie,Pieter Abbeel

from arxiv, 16 pages

While unsupervised skill discovery has shown promise in autonomously acquiring behavioral primitives, there is still a large methodological disconnect between task-agnostic skill pretraining and downstream, task-aware finetuning. We present Intrinsic Reward Matching (IRM), which unifies these two phases of learning via the $\textit{skill discriminator}$, a pretraining model component often discarded during finetuning. Conventional approaches finetune pretrained agents directly at the policy level, often relying on expensive environment rollouts to empirically determine the optimal skill. However, often the most concise yet complete description of a task is the reward function itself, and skill learning methods learn an $\textit{intrinsic}$ reward function via the discriminator that corresponds to the skill policy. We propose to leverage the skill discriminator to $\textit{match}$ the intrinsic and downstream task rewards and determine the optimal skill for an unseen task without environment samples, consequently finetuning with greater sample-efficiency. Furthermore, we generalize IRM to sequence skills for complex, long-horizon tasks and demonstrate that IRM enables us to utilize pretrained skills far more effectively than previous skill selection methods on both the Fetch tabletop and Franka Kitchen robot manipulation benchmarks.

Continuity · Learning · MoDELS · Approximation error · INFORMS

May 25, 2023

SketchOGD: Memory-Efficient Continual Learning

Benjamin Wright,Youngjae Min,Jeremy Bernstein,Navid Azizan

When machine learning models are trained continually on a sequence of tasks, they are liable to forget what they learned on previous tasks -- a phenomenon known as catastrophic forgetting. Proposed solutions to catastrophic forgetting tend to involve storing information about past tasks, meaning that memory usage is a chief consideration in determining their practicality. This paper proposes a memory-efficient solution to catastrophic forgetting, improving upon an established algorithm known as orthogonal gradient descent (OGD). OGD utilizes prior model gradients to find weight updates that preserve performance on prior datapoints. However, since the memory cost of storing prior model gradients grows with the runtime of the algorithm, OGD is ill-suited to continual learning over arbitrarily long time horizons. To address this problem, this paper proposes SketchOGD. SketchOGD employs an online sketching algorithm to compress model gradients as they are encountered into a matrix of a fixed, user-determined size. In contrast to existing memory-efficient variants of OGD, SketchOGD runs online without the need for advance knowledge of the total number of tasks, is simple to implement, and is more amenable to analysis. We provide theoretical guarantees on the approximation error of the relevant sketches under a novel metric suited to the downstream task of OGD. Experimentally, we find that SketchOGD tends to outperform current state-of-the-art variants of OGD given a fixed memory budget.

GROUP · Robustness · Feasible · Reducible · Less

May 25, 2023

Rectifying Group Irregularities in Explanations for Distribution Shift

Adam Stein,Yinjun Wu,Eric Wong,Mayur Naik

from arxiv, 19 pages, 5 figures

It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these two distributions. However, these methods can introduce group irregularities, leading to explanations that are less feasible and robust. To address these issues, we propose Group-aware Shift Explanations (GSE), a method that produces interpretable explanations by leveraging worst-group optimization to rectify group irregularities. We demonstrate how GSE not only maintains group structures, such as demographic and hierarchical subpopulations, but also enhances feasibility and robustness in the resulting explanations in a wide range of tabular, language, and image settings.

Latent variable/hidden variable · Free energy · Maximum likelihood · Maximum likelihood estimation · Vector space

May 24, 2023

CoinEM: Tuning-Free Particle-Based Variational Inference for Latent Variable Models

Louis Sharrock,Daniel Dodd,Christopher Nemeth

We introduce two new particle-based algorithms for learning latent variable models via marginal maximum likelihood estimation, including one which is entirely tuning-free. Our methods are based on the perspective of marginal maximum likelihood estimation as an optimization problem: namely, as the minimization of a free energy functional. One way to solve this problem is to consider the discretization of a gradient flow associated with the free energy. We study one such approach, which resembles an extension of the popular Stein variational gradient descent algorithm. In particular, we establish a descent lemma for this algorithm, which guarantees that the free energy decreases at each iteration. This method, and any other obtained as the discretization of the gradient flow, will necessarily depend on a learning rate which must be carefully tuned by the practitioner in order to ensure convergence at a suitable rate. With this in mind, we also propose another algorithm for optimizing the free energy which is entirely learning rate free, based on coin betting techniques from convex optimization. We validate the performance of our algorithms across a broad range of numerical experiments, including several high-dimensional settings. Our results are competitive with existing particle-based methods, without the need for any hyperparameter tuning.

Transformation · MoDELS · Networking · Learning · Neural Networks

May 24, 2023

Can Transformers Learn to Solve Problems Recursively?

Shizhuo Dylan Zhang,Curt Tigges,Stella Biderman,Maxim Raginsky,Talia Ringer

Neural networks have in recent years shown promise for helping software engineers write programs and even formally verify them. While semantic information plays a crucial part in these processes, it remains unclear to what degree popular neural architectures like transformers are capable of modeling that information. This paper examines the behavior of neural networks learning algorithms relevant to programs and formal verification proofs through the lens of mechanistic interpretability, focusing in particular on structural recursion. Structural recursion is at the heart of tasks on which symbolic tools currently outperform neural models, like inferring semantic relations between datatypes and emulating program behavior. We evaluate the ability of transformer models to learn to emulate the behavior of structurally recursive functions from input-output examples. Our evaluation includes empirical and conceptual analyses of the limitations and capabilities of transformer models in approximating these functions, as well as reconstructions of the "shortcut" algorithms the model learns. By reconstructing these algorithms, we are able to correctly predict 91 percent of failure cases for one of the approximated functions. Our work provides a new foundation for understanding the behavior of neural networks that fail to solve the very tasks they are trained for.

Extensibility · Analysis · Linear · Paper

May 23, 2023

The Membership Problem for Hypergeometric Sequences with Quadratic Parameters

George Kenison,Klara Nosan,Mahsa Shirmohammadi,James Worrell

from arxiv, 18 pages (including appendices). Accepted at ISSAC 2023

Hypergeometric sequences are rational-valued sequences that satisfy first-order linear recurrence relations with polynomial coefficients; that is, a hypergeometric sequence $\langle u_n \rangle_{n=0}^{\infty}$ is one that satisfies a recurrence of the form $f(n)u_n = g(n)u_{n-1}$ where $f,g \in \mathbb{Z}[x]$. In this paper, we consider the Membership Problem for hypergeometric sequences: given a hypergeometric sequence $\langle u_n \rangle_{n=0}^{\infty}$ and a target value $t\in \mathbb{Q}$, determine whether $u_n=t$ for some index $n$. We establish decidability of the Membership Problem under the assumption that either (i) $f$ and $g$ have distinct splitting fields or (ii) $f$ and $g$ are monic polynomials that both split over a quadratic extension of $\mathbb{Q}$. Our results are based on an analysis of the prime divisors of polynomial sequences $\langle f(n) \rangle_{n=1}^\infty$ and $\langle g(n) \rangle_{n=1}^\infty$ appearing in the recurrence relation.
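The recurrence $f(n)u_n = g(n)u_{n-1}$ can be unrolled with exact rational arithmetic. The sketch below generates terms and searches for a target among the first `bound` terms only. It is a bounded search, not the decision procedure the paper establishes, and the function names and the example instance ($f(n)=n$, $g(n)=1$, $u_0=1$) are assumptions made for illustration.

```python
from fractions import Fraction


def hypergeometric_terms(f, g, u0, count):
    """Yield u_0, ..., u_{count-1} where f(n) * u_n = g(n) * u_{n-1}.

    Assumes f(n) is nonzero for every n >= 1 that is visited, so that
    each term is well defined. (Hypothetical helper.)
    """
    u = Fraction(u0)
    yield u
    for n in range(1, count):
        fn = f(n)
        if fn == 0:
            raise ZeroDivisionError(f"f({n}) = 0, so u_{n} is undefined")
        u = Fraction(g(n), fn) * u
        yield u


def bounded_membership(f, g, u0, target, bound):
    """Return the first index n < bound with u_n == target, else None.

    A None result says nothing about indices >= bound.
    """
    target = Fraction(target)
    for n, u in enumerate(hypergeometric_terms(f, g, u0, bound)):
        if u == target:
            return n
    return None


# Illustrative instance: f(n) = n, g(n) = 1, u_0 = 1 gives u_n = 1/n!.
f = lambda n: n
g = lambda n: 1

index = bounded_membership(f, g, 1, Fraction(1, 24), bound=10)
missing = bounded_membership(f, g, 1, Fraction(1, 5), bound=10)
print(index, missing)
```

Using `Fraction` keeps every term exact, so the equality test against the target is reliable; floating-point terms would make that comparison unsafe.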

Unsupervised · Representation learning · Loss function (machine learning) · Learned · Unlabeled

February 26, 2020

Evolving Losses for Unsupervised Video Representation Learning

AJ Piergiovanni,Anelia Angelova,Michael S. Ryoo

from arxiv, arXiv admin note: text overlap with arXiv:1906.03248

We present a new method to learn video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero- or few-shot learning. We formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are shared across different modalities via distillation. Further, we introduce the concept of loss function evolution by using an evolutionary search algorithm to automatically find an optimal combination of loss functions capturing many (self-supervised) tasks and modalities. Thirdly, we propose an unsupervised representation evaluation metric using distribution matching to a large unlabeled dataset as a prior constraint, based on Zipf's law. This unsupervised constraint, which is not guided by any labeling, produces similar results to weakly-supervised, task-specific ones. The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods. Notably, it is also more effective than several label-based methods (e.g., ImageNet), with the exception of large, fully labeled video datasets.