We study active learning methods for single index models of the form $F({\mathbf x}) = f(\langle {\mathbf w}, {\mathbf x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\mathbf x,\mathbf w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via \emph{statistical leverage score sampling} are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent ${O}(d^{2})$ bound of \cite{gajjar2023active}. Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting where $f$ is \emph{unknown}. Our results leverage tools from high dimensional probability, including Dudley's inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
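To make the sampling primitive above concrete, the following is a minimal Python sketch of statistical leverage score sampling for actively selecting labeled points; the matrix sizes, the tanh link, and the plain least-squares fit at the end are illustrative assumptions, not the paper's estimator for single index models.

\begin{verbatim}
import numpy as np

def leverage_scores(X):
    # Leverage score of row i is the squared norm of the i-th row of an
    # orthonormal basis for the column space of X.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def leverage_sample(X, y, k, rng=None):
    # Draw k rows with probability proportional to their leverage scores,
    # reweighting by 1/sqrt(k * p_i) as in standard importance sampling.
    rng = np.random.default_rng() if rng is None else rng
    tau = leverage_scores(X)
    p = tau / tau.sum()
    idx = rng.choice(X.shape[0], size=k, replace=True, p=p)
    w = 1.0 / np.sqrt(k * p[idx])
    return w[:, None] * X[idx], w * y[idx]

# Toy usage: select a small number of labels out of n = 5000 unlabeled points.
rng = np.random.default_rng(1)
X = rng.standard_normal((5000, 20))
y = np.tanh(X @ rng.standard_normal(20))   # single index model with f = tanh
Xs, ys = leverage_sample(X, y, k=200, rng=rng)
w_hat, *_ = np.linalg.lstsq(Xs, ys, rcond=None)   # placeholder fit
\end{verbatim}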
{\sc Vertex $(s, t)$-Cut} and {\sc Vertex Multiway Cut} are two fundamental graph separation problems in algorithmic graph theory. We study matroidal generalizations of these problems, where, in addition to the usual input, we are given a representation $R \in \mathbb{F}^{r \times n}$ of a linear matroid $\mathcal{M} = (V(G), \mathcal{I})$ of rank $r$, and the goal is to determine whether there exists a vertex subset $S \subseteq V(G)$ that has the required cut properties and is independent in the matroid $\mathcal{M}$. We refer to these problems as {\sc Independent Vertex $(s, t)$-cut} and {\sc Independent Multiway Cut}, respectively. We show that these problems are fixed-parameter tractable ({\sf FPT}) when parameterized by the solution size (which can be assumed to be equal to the rank of the matroid $\mathcal{M}$). These results are obtained by exploiting the recent technique of flow augmentation [Kim et al., STOC '22], combined with a dynamic programming algorithm on flow-paths \`a la [Feige and Mahdian, STOC '06] that maintains a representative family of solutions w.r.t.~the given matroid [Marx, TCS '06; Fomin et al., JACM]. As a corollary, we also obtain {\sf FPT} algorithms for the independent version of {\sc Odd Cycle Transversal}. Further, our results can be generalized to other variants of the problems, e.g., weighted versions or edge-deletion versions.
As a promising field in open-world learning, \textit{Novel Class Discovery} (NCD) is typically the task of clustering unseen novel classes in an unlabeled set based on prior knowledge of labeled data from the same domain. However, the performance of existing NCD methods can be severely compromised when the novel classes are sampled from a different distribution than the labeled ones. In this paper, we explore and establish the solvability of NCD in the cross-domain setting, under the necessary condition that style information is removed. Based on this theoretical analysis, we introduce an exclusive style removal module that extracts style information distinct from the baseline features, thereby facilitating inference. Moreover, this module is easy to integrate with other NCD methods, acting as a plug-in that improves performance on novel classes whose distribution differs from that of the seen labeled set. Additionally, recognizing the non-negligible influence of different backbones and pre-training strategies on the performance of NCD methods, we build a fair benchmark for future NCD research. Extensive experiments on three common datasets demonstrate the effectiveness of the proposed module.
We generalize the leverage score sampling sketch for $\ell_2$-subspace embeddings to accommodate sampling subsets of the transformed data, so that the sketching approach is appropriate for distributed settings. This is then used to derive an approximate coded computing approach for first-order methods, known as gradient coding, to accelerate linear regression in the presence of failures in distributed computational networks, \textit{i.e.}, stragglers. We replicate the data across the distributed network to attain the approximation guarantees through the induced sampling distribution. The significance and main contribution of this work is that it unifies randomized numerical linear algebra with approximate coded computing, while attaining an induced $\ell_2$-subspace embedding through uniform sampling. The transition to uniform sampling is done without applying a random projection, as in the case of the subsampled randomized Hadamard transform. Furthermore, by incorporating this technique into coded computing, our scheme is an iterative sketching approach to approximately solving linear regression. We also propose weighting when sketching takes place through sampling with replacement, for further compression.
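As a rough illustration of the replication idea (a simplification, not the paper's coded computing scheme): if each row is duplicated a number of times roughly proportional to its leverage score, then sampling the replicated rows uniformly, e.g., by keeping whichever blocks the fastest workers return, approximates leverage score sampling of the original data. The helper below and its parameters are hypothetical.

\begin{verbatim}
import numpy as np

def replicate_by_leverage(X, y, total_copies):
    # Duplicate row i roughly in proportion to its leverage score, so that
    # uniform sampling over the replicated rows mimics leverage score
    # sampling of the original rows.
    Q, _ = np.linalg.qr(X)
    tau = np.sum(Q**2, axis=1)
    counts = np.maximum(1, np.round(total_copies * tau / tau.sum()).astype(int))
    idx = np.repeat(np.arange(X.shape[0]), counts)
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))
y = X @ rng.standard_normal(10)
X_rep, y_rep = replicate_by_leverage(X, y, total_copies=5000)
# Uniformly chosen replicated rows (e.g., blocks from non-straggling workers)
# then induce the desired non-uniform sampling distribution.
sample = rng.choice(X_rep.shape[0], size=300, replace=False)
w_hat, *_ = np.linalg.lstsq(X_rep[sample], y_rep[sample], rcond=None)
\end{verbatim}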
We study the problems of data compression, gambling and prediction of a sequence $x^n=x_1x_2\cdots x_n$ from an alphabet ${\cal X}$, in terms of regret and expected regret (redundancy) with respect to various smooth families of probability distributions. We evaluate the regret of Bayes mixture distributions compared to maximum likelihood, under the condition that the maximum likelihood estimate is in the interior of the parameter space. For general exponential families (including the non-i.i.d.\ case) the asymptotically minimax value is achieved when variants of the Jeffreys prior are used. Interestingly, we also obtain a modification of the Jeffreys prior which has measure outside the given family of densities, in order to achieve minimax regret with respect to non-exponential-type families. This modification enlarges the family using local exponential tilting (a fiber bundle). Our conditions are confirmed for certain non-exponential families, including curved families and mixture families (where either the mixture components or their weights of combination are parameterized), as well as contamination models. Furthermore, for mixture families we show how to deal with the full simplex of parameters. These results also provide a characterization of Rissanen's stochastic complexity.
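For reference, the pointwise regret studied above and its classical asymptotic minimax value for a smooth $d$-dimensional family (attained by Jeffreys-type priors, with the maximum taken over sequences whose maximum likelihood estimate lies in the interior of the parameter space) take the standard form
\[
\mathrm{reg}(q, x^n) \;=\; \log \frac{\sup_{\theta} p_{\theta}(x^n)}{q(x^n)},
\qquad
\min_{q}\max_{x^n} \mathrm{reg}(q, x^n)
\;=\; \frac{d}{2}\log\frac{n}{2\pi} + \log \int_{\Theta}\sqrt{\det I(\theta)}\, d\theta + o(1),
\]
where $I(\theta)$ is the Fisher information matrix.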
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data while not forgetting previously learned classes. The common evaluation protocol for CIL algorithms is to measure the average test accuracy across all classes learned so far -- however, we argue that solely focusing on maximizing the test accuracy may not necessarily lead to developing a CIL algorithm that also continually learns and updates representations that can be transferred to downstream tasks. To that end, we experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning and propose new analysis methods. Our experiments show that most state-of-the-art algorithms prioritize high stability and do not significantly change the learned representation, and sometimes even learn a representation of lower quality than that of a naive baseline. However, we observe that these algorithms can still achieve high test accuracy because they enable a model to learn a classifier that closely resembles the linear classifier estimated by linear probing. Furthermore, the base model learned in the first task, which involves single-task learning, exhibits varying levels of representation quality across different algorithms, and this variance impacts the final performance of CIL algorithms. Therefore, we suggest that representation-level evaluation be considered as an additional recipe for a more diverse evaluation of CIL algorithms.
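The linear-probing protocol referred to above can be sketched as follows; the encoder, dataset, and use of scikit-learn here are illustrative assumptions rather than the paper's experimental setup.

\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

def linear_probe(encoder, X_train, y_train, X_test, y_test):
    # Freeze the learned representation and fit only a linear classifier on
    # top of it; probe accuracy measures representation quality independently
    # of the model's own classifier head.
    Z_train, Z_test = encoder(X_train), encoder(X_test)
    clf = LogisticRegression(max_iter=1000).fit(Z_train, y_train)
    return clf.score(Z_test, y_test)

# Toy usage with a fixed random feature map standing in for a trained encoder.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16))
encoder = lambda X: np.maximum(X @ W, 0.0)   # hypothetical frozen encoder
X = rng.standard_normal((500, 64))
y = (X[:, 0] > 0).astype(int)
acc = linear_probe(encoder, X[:400], y[:400], X[400:], y[400:])
\end{verbatim}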
Generally, discretization of partial differential equations (PDEs) creates a sequence of linear systems $A_k x_k = b_k$, $k = 0, 1, 2, \ldots, N$, with well-known and structured sparsity patterns. Preconditioners are often necessary to achieve fast convergence when solving these linear systems using iterative solvers. Instead of computing a preconditioner for each system from scratch, we can use preconditioner updates for closely related systems. One such preconditioner update is the sparse approximate map (SAM), which is based on the sparse approximate inverse preconditioner and uses a least squares approximation. A SAM then acts as a map from one matrix in the sequence to another nearby one for which we have an effective preconditioner. To efficiently compute an effective SAM update (i.e., one that facilitates fast convergence of the iterative solver), we seek to compute an optimal sparsity pattern. In this paper, we examine several sparsity patterns for computing the SAM update in order to characterize optimal or near-optimal sparsity patterns for linear systems arising from discretized PDEs.
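A minimal sketch of the least-squares kernel behind a SAM update follows; the map direction (mapping $A_k$ toward a reference matrix $A_0$), the dense arrays, and the diagonal-only pattern are simplifying assumptions for illustration, since practical SAM codes work with sparse matrices and problem-informed sparsity patterns.

\begin{verbatim}
import numpy as np

def sam_update(A_k, A_0, pattern):
    # Compute a sparse map N with prescribed sparsity, minimizing
    # ||A_k N - A_0||_F column by column; column j of N may only have
    # nonzeros at the row indices listed in pattern[j].
    N = np.zeros((A_k.shape[1], A_0.shape[1]))
    for j in range(A_0.shape[1]):
        J = pattern[j]
        coef, *_ = np.linalg.lstsq(A_k[:, J], A_0[:, j], rcond=None)
        N[J, j] = coef
    return N

# Toy usage: map a perturbed matrix A_k back toward A_0, for which an
# effective preconditioner is assumed to be available.
rng = np.random.default_rng(0)
A_0 = np.eye(50) + 0.1 * rng.standard_normal((50, 50))
A_k = A_0 + 0.01 * rng.standard_normal((50, 50))
pattern = [[j] for j in range(50)]   # diagonal-only pattern, for illustration
N = sam_update(A_k, A_0, pattern)    # A_k @ N is then close to A_0
\end{verbatim}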
We propose a linear-complexity method for sampling from truncated multivariate normal (TMVN) distributions with high fidelity by applying nearest-neighbor approximations to a product-of-conditionals decomposition of the TMVN density. To make the sequential sampling based on the decomposition feasible, we introduce a novel method that avoids the intractable high-dimensional TMVN distribution by sampling sequentially from $m$-dimensional TMVN distributions, where $m$ is a tuning parameter controlling the fidelity. This allows us to overcome the existing methods' crucial problem of rapidly decreasing acceptance rates for increasing dimension. Throughout our experiments with up to tens of thousands of dimensions, we can produce high-fidelity samples with $m$ in the dozens, achieving superior scalability compared to existing state-of-the-art methods. We study a tetrachloroethylene concentration dataset that has $3{,}971$ observed responses and $20{,}730$ undetected responses, together modeled as a partially censored Gaussian process, where our method enables posterior inference for the censored responses through sampling a $20{,}730$-dimensional TMVN distribution.
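A simplified sketch of the product-of-conditionals idea is given below, using univariate truncated-normal conditionals that condition only on the $m$ most correlated previously sampled coordinates; the actual method samples small $m$-dimensional TMVN blocks, so this is an illustrative approximation, and the correlation-based neighbor selection is an assumption.

\begin{verbatim}
import numpy as np
from scipy.stats import truncnorm

def nn_tmvn_sample(mu, Sigma, lower, upper, m=10, rng=None):
    # Approximate TMVN sample via sequential conditionals, each conditioning
    # on at most m previously sampled, highly correlated coordinates.
    rng = np.random.default_rng() if rng is None else rng
    d = len(mu)
    x = np.empty(d)
    for i in range(d):
        prev = np.arange(i)
        if prev.size:
            corr = np.abs(Sigma[i, prev]) / np.sqrt(Sigma[i, i] * np.diag(Sigma)[prev])
            nn = prev[np.argsort(corr)[-m:]]          # m most correlated neighbors
            k = np.linalg.solve(Sigma[np.ix_(nn, nn)], Sigma[nn, i])
            mean = mu[i] + k @ (x[nn] - mu[nn])       # Gaussian conditional mean
            var = Sigma[i, i] - k @ Sigma[nn, i]      # Gaussian conditional variance
        else:
            mean, var = mu[i], Sigma[i, i]
        sd = np.sqrt(max(var, 1e-12))
        a, b = (lower[i] - mean) / sd, (upper[i] - mean) / sd
        x[i] = truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=rng)
    return x
\end{verbatim}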
Graph matrices are a class of matrices that has played a crucial role in analyzing the sum of squares hierarchy on average-case problems. However, except for rough norm bounds, little is known about graph matrices. In this paper, we take a step towards better understanding graph matrices by determining the limiting distribution of the singular values of Z-shaped graph matrices. We then give a partial generalization of our results to $m$-layer Z-shaped graph matrices.
We consider learning in an adversarial environment, where an $\varepsilon$-fraction of samples from a distribution $P$ are arbitrarily modified (global corruptions) and the remaining perturbations have average magnitude bounded by $\rho$ (local corruptions). Given access to $n$ such corrupted samples, we seek a computationally efficient estimator $\hat{P}_n$ that minimizes the Wasserstein distance $\mathsf{W}_1(\hat{P}_n,P)$. In fact, we attack the fine-grained task of minimizing $\mathsf{W}_1(\Pi_\# \hat{P}_n, \Pi_\# P)$ for all orthogonal projections $\Pi \in \mathbb{R}^{d \times d}$, with performance scaling with $\mathrm{rank}(\Pi) = k$. This allows us to account simultaneously for mean estimation ($k=1$), distribution estimation ($k=d$), as well as the settings interpolating between these two extremes. We characterize the optimal population-limit risk for this task and then develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(d\sqrt{k}n^{-1/(k \lor 2)})$ when $P$ has bounded covariance. This guarantee holds uniformly in $k$ and is minimax optimal up to the sub-optimality of the plug-in estimator when $\rho = \varepsilon = 0$. Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator. We apply this algorithm to robust stochastic optimization, and, in the process, uncover a new method for overcoming the curse of dimensionality in Wasserstein distributionally robust optimization.
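For reference on the notation above (standard definitions rather than results of the paper): $\Pi_\# P$ denotes the pushforward of $P$ under the projection $\Pi$, and the $1$-Wasserstein distance admits the usual coupling formulation
\[
\mathsf{W}_1(P, Q) \;=\; \inf_{\gamma \in \Gamma(P, Q)} \int \|x - y\|\, d\gamma(x, y),
\]
where $\Gamma(P, Q)$ is the set of couplings of $P$ and $Q$.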
In $2014$, Gupta and Ray proved that circulant involutory matrices over the finite field $\mathbb{F}_{2^m}$ cannot be maximum distance separable (MDS). This non-existence result also extends to circulant orthogonal matrices of order $2^d \times 2^d$ over finite fields of characteristic $2$. These findings inspired many authors to generalize the circulant property for constructing lightweight MDS matrices with practical applications in mind. Recently, in $2022$, Chatterjee and Laha initiated a study of circulant matrices by considering semi-involutory and semi-orthogonal properties. Expanding on their work, this article delves into circulant matrices possessing these properties over the finite field $\mathbb{F}_{2^m}$. Notably, we establish a relationship between the trace of the associated diagonal matrices and the MDS property of the matrix. We prove that this relationship holds for semi-orthogonal matrices of even order and for semi-involutory matrices of all orders. Additionally, we provide examples showing that for circulant semi-orthogonal matrices of odd order over a finite field of characteristic $2$, the trace of the associated diagonal matrices may be non-zero.
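For context, a standard characterization of the MDS property used throughout this line of work (not a result of this article) is
\[
M \ \text{is MDS} \iff \text{every square submatrix of } M \text{ is nonsingular},
\]
equivalently, the code generated by $[\,I \mid M\,]$ is MDS.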