
We focus on constrained, $L$-smooth, nonconvex-nonconcave min-max problems either satisfying $\rho$-cohypomonotonicity or admitting a solution to the $\rho$-weakly Minty Variational Inequality (MVI), where larger values of the parameter $\rho>0$ correspond to a greater degree of nonconvexity. These problem classes include examples in two-player reinforcement learning, interaction-dominant min-max problems, and certain synthetic test problems on which classical min-max algorithms fail. It has been conjectured that first-order methods can tolerate values of $\rho$ no larger than $\frac{1}{L}$, but existing results in the literature have stagnated at the tighter requirement $\rho < \frac{1}{2L}$. With a simple argument, we obtain optimal or best-known complexity guarantees under cohypomonotonicity or weak MVI conditions for $\rho < \frac{1}{L}$. The algorithms we analyze are inexact variants of Halpern and Krasnosel'ski\u{\i}-Mann (KM) iterations. We also provide algorithms and complexity guarantees in the stochastic case with the same range on $\rho$. Our main insight for the improvements in the convergence analyses is to harness the recently proposed "conic nonexpansiveness" property of operators. As byproducts, we provide a refined analysis for inexact Halpern iteration and propose a stochastic KM iteration with a multilevel Monte Carlo estimator.
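
As a point of reference, the two iteration templates named above have a very simple exact form; a minimal Python sketch follows, treating the operator $T$ (in the paper, a resolvent that is only evaluated approximately) as a black box. This is purely illustrative and omits the inexactness and the stochastic estimators that the analysis actually handles.

```python
def halpern(T, x0, num_iters=1000):
    """Halpern iteration x_{k+1} = l_k * x0 + (1 - l_k) * T(x_k),
    with the classical anchoring weights l_k = 1/(k+2).
    x0 should support vector arithmetic (e.g., a NumPy array)."""
    x = x0.copy()
    for k in range(num_iters):
        lam = 1.0 / (k + 2)
        x = lam * x0 + (1.0 - lam) * T(x)
    return x

def krasnoselskii_mann(T, x0, beta=0.5, num_iters=1000):
    """Krasnosel'skii-Mann iteration x_{k+1} = (1 - beta) x_k + beta T(x_k)."""
    x = x0.copy()
    for _ in range(num_iters):
        x = (1.0 - beta) * x + beta * T(x)
    return x
```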

Related Content

We investigate a Bayesian $k$-armed bandit problem in the \emph{many-armed} regime, where $k \geq \sqrt{T}$ and $T$ represents the time horizon. First, in line with recent literature on many-armed bandit problems, we observe that subsampling plays a key role in designing optimal algorithms; the conventional UCB algorithm is sub-optimal, whereas a subsampled UCB (SS-UCB), which selects $\Theta(\sqrt{T})$ arms for execution under the UCB framework, achieves rate-optimality. However, despite SS-UCB's theoretical promise of optimal regret, it empirically underperforms compared to a greedy algorithm that consistently chooses the empirically best arm. Simulations with real-world data show that this observation extends to contextual settings. Our findings suggest a new form of \emph{free exploration} beneficial to greedy algorithms in the many-armed context, fundamentally linked to a tail event concerning the prior distribution of arm rewards. This mechanism differs from the notion of free exploration arising from covariate variation that has recently been discussed in the contextual bandit literature. Expanding upon these insights, we establish that the subsampled greedy approach not only achieves rate-optimality for Bernoulli bandits within the many-armed regime but also attains sublinear regret across broader distributions. Collectively, our research indicates that in the many-armed regime, practitioners might find greater value in adopting greedy algorithms.
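
To make the subsampling idea concrete, here is a minimal Python sketch of an SS-UCB-style procedure: draw $\Theta(\sqrt{T})$ arms uniformly at random and run standard UCB on them, ignoring the rest. The subsample-size constant and the exploration bonus are illustrative choices, not the paper's exact specification.

```python
import numpy as np

def ss_ucb(arms, T, rng):
    """Subsampled UCB: run UCB on Theta(sqrt(T)) uniformly sampled arms.

    arms: list of callables, each returning a stochastic reward when pulled
    (Bernoulli rewards in the paper's main analysis). Returns total reward.
    """
    m = min(int(np.ceil(np.sqrt(T))), len(arms))   # subsample size
    sub = rng.choice(len(arms), size=m, replace=False)
    counts = np.zeros(m)
    means = np.zeros(m)
    total = 0.0
    for t in range(1, T + 1):
        if t <= m:                                  # pull each arm once first
            i = t - 1
        else:                                       # standard UCB index
            i = int(np.argmax(means + np.sqrt(2.0 * np.log(t) / counts)))
        reward = arms[sub[i]]()
        total += reward
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
    return total

# Example: 10_000 Bernoulli arms with uniform priors, horizon T = 100_000.
# rng = np.random.default_rng(0)
# ps = rng.uniform(size=10_000)
# arms = [lambda p=p: float(rng.random() < p) for p in ps]
# print(ss_ucb(arms, 100_000, rng))
```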

We study the following characterization problem. Given a set $T$ of terminals and a $(2^{|T|}-2)$-dimensional vector $\pi$ whose coordinates are indexed by the nonempty proper subsets of $T$, is there a graph $G$ that contains $T$, such that for all subsets $\emptyset\subsetneq S\subsetneq T$, $\pi_S$ equals the value of the min-cut in $G$ separating $S$ from $T\setminus S$? The only known necessary conditions are submodularity and a special class of linear inequalities given by Chaudhuri, Subrahmanyam, Wagner, and Zaroliagis. Our main result is a new class of linear inequalities concerning laminar families that generalizes all previous ones. Using our new class of inequalities, we generalize Karger's approximate min-cut counting result to graphs with terminals.
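
For reference, one standard way to write out the submodularity condition on the terminal min-cut vector (a known necessary condition for realizability), with the natural convention $\pi_{\emptyset}=\pi_{T}=0$, is
\[
\pi_{S} + \pi_{S'} \;\ge\; \pi_{S \cup S'} + \pi_{S \cap S'} \qquad \text{for all } S, S' \subseteq T.
\]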

Data consisting of a graph with a function mapping into $\mathbb{R}^d$ arise in many applications, encompassing structures such as Reeb graphs, geometric graphs, and knot embeddings. As such, the ability to compare and cluster such objects is required in a data analysis pipeline, leading to a need for distances between them. In this work, we study the interleaving distance on discretizations of these objects, $\mathbb{R}^d$-mapper graphs, where functor representations of the data can be compared by finding pairs of natural transformations between them. However, in many cases, computation of the interleaving distance is NP-hard. For this reason, we take inspiration from recent work by Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation, called assignments. We then endow the functor images with the extra structure of a metric space and define a loss function that measures how far an assignment is from making the required diagrams of an interleaving commute. Finally, we show that the loss function can be computed in polynomial time for a given assignment. We believe this idea is both powerful and translatable, with the potential to provide approximations and bounds on interleavings in a broad array of contexts.

Dynamical low-rank approximation (DLRA) provides a rigorous, cost-effective mathematical framework for solving high-dimensional tensor differential equations (TDEs) on low-rank tensor manifolds. Despite their effectiveness, DLRA-based low-rank approximations lose their computational efficiency when applied to nonlinear TDEs, particularly those exhibiting non-polynomial nonlinearity. In this paper, we present a novel algorithm for the time integration of TDEs on the tensor train and Tucker tensor low-rank manifolds, which are the building blocks of many tensor network decompositions. This paper builds on our previous work (Donello et al., Proceedings of the Royal Society A, Vol. 479, 2023) on solving nonlinear matrix differential equations on low-rank matrix manifolds using CUR decompositions. The methodology we present offers multiple advantages: (i) It leverages cross algorithms based on the discrete empirical interpolation method (DEIM) to strategically sample sparse entries of the time-discrete TDEs to advance the solution in low-rank form; as a result, it offers near-optimal computational savings in terms of both memory and floating-point operations. (ii) The time integration is robust in the presence of small or zero singular values. (iii) The algorithm is remarkably easy to implement, as it requires only the evaluation of the full-order model TDE at strategically selected entries and does not use tangent-space projections, whose efficient implementation is intrusive and time-consuming. (iv) We develop high-order explicit Runge-Kutta schemes for the time integration of TDEs on low-rank manifolds. We demonstrate the efficiency of the presented algorithm for several test cases, including a 100-dimensional TDE with non-polynomial nonlinearity.
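
As an illustration of the DEIM ingredient in (i), the following Python sketch implements the classical greedy DEIM index selection, which chooses the rows (entries, in the tensor setting) at which the full-order model is evaluated. The paper's cross algorithms build on this idea in the tensor train and Tucker settings; this matrix-level routine is only a schematic.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM point selection for an orthonormal basis U (n x r).

    Returns r row indices at which the full-order model is sampled.
    """
    _, r = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, r):
        # Interpolate the next basis vector at the indices chosen so far...
        c = np.linalg.solve(U[np.ix_(p, list(range(j)))], U[p, j])
        # ...and add the row where the interpolation residual is largest.
        res = U[:, j] - U[:, :j] @ c
        p.append(int(np.argmax(np.abs(res))))
    return np.array(p)

# Example: indices for a random orthonormal basis.
# U, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((100, 5)))
# print(deim_indices(U))
```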

We prove that training neural networks on 1-D data is equivalent to solving a convex Lasso problem with a fixed, explicitly defined dictionary matrix of features. The specific dictionary depends on the activation and depth. We consider 2-layer networks with piecewise linear activations, deep narrow ReLU networks with up to 4 layers, and rectangular and tree networks with sign activation and arbitrary depth. Interestingly, in ReLU networks, a fourth layer creates features that are reflections of the training data about themselves. The Lasso representation offers insight into globally optimal networks and the solution landscape.
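
The specific dictionary matrix is defined in the paper and depends on the activation and depth; as a purely illustrative stand-in, the following Python snippet fits a Lasso over a hypothetical dictionary of ReLU ramp features with breakpoints at the training points, which conveys the flavor of the equivalence for the 2-layer case.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical 1-D example: a fixed dictionary of ReLU ramps ReLU(x - b)
# and ReLU(b - x) with breakpoints b at the training points, plus a
# constant and a linear term.
x = np.linspace(-1, 1, 50)
y = np.sin(3 * x)

B = x                                         # candidate breakpoints
feats = [np.maximum(x[:, None] - B[None, :], 0.0),
         np.maximum(B[None, :] - x[:, None], 0.0)]
A = np.hstack([np.ones((len(x), 1)), x[:, None]] + feats)

model = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50_000).fit(A, y)
print("active dictionary atoms:", int(np.sum(np.abs(model.coef_) > 1e-8)))
```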

We study the sample complexity of learning an $\epsilon$-optimal policy in an average-reward Markov decision process (MDP) under a generative model. For weakly communicating MDPs, we establish the complexity bound $\tilde{O}(SA\frac{H}{\epsilon^2})$, where $H$ is the span of the bias function of the optimal policy and $SA$ is the cardinality of the state-action space. Our result is the first that is minimax optimal (up to log factors) in all parameters $S,A,H$ and $\epsilon$, improving on existing work that either assumes uniformly bounded mixing times for all policies or has suboptimal dependence on the parameters. We further investigate sample complexity in general (non-weakly-communicating) average-reward MDPs. We argue that a new transient time parameter $B$ is necessary, establish an $\tilde{O}(SA\frac{B+H}{\epsilon^2})$ complexity bound, and prove a matching (up to log factors) minimax lower bound. Both results are based on reducing the average-reward MDP to a discounted MDP, which requires new ideas in the general setting. To establish the optimality of this reduction, we develop improved bounds for $\gamma$-discounted MDPs, showing that $\tilde{O}\left(SA\frac{H}{(1-\gamma)^2\epsilon^2}\right)$ samples suffice to learn an $\epsilon$-optimal policy in weakly communicating MDPs in the regime $\gamma\geq 1-1/H$, and that $\tilde{O}\left(SA\frac{B+H}{(1-\gamma)^2\epsilon^2}\right)$ samples suffice in general MDPs when $\gamma\geq 1-\frac{1}{B+H}$. Both of these results circumvent the well-known lower bound of $\tilde{\Omega}\left(SA\frac{1}{(1-\gamma)^3\epsilon^2}\right)$ for arbitrary $\gamma$-discounted MDPs. Our analysis develops upper bounds on certain instance-dependent variance parameters in terms of the span and transient time parameters. The weakly communicating bounds are tighter than those based on the mixing time or diameter of the MDP and may be of broader use.
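
To see how the discounted bound recovers the average-reward rate, here is the rough bookkeeping, with constants and log factors suppressed and assuming the standard comparison $\rho^*-\rho^\pi \lesssim (1-\gamma)(V_\gamma^*-V_\gamma^\pi) + (1-\gamma)H$ between the two optimality gaps. Choosing $1-\gamma=\Theta(\epsilon/H)$, an $\epsilon_\gamma$-optimal discounted policy with $\epsilon_\gamma=\Theta(\epsilon/(1-\gamma))=\Theta(H)$ is $O(\epsilon)$-optimal for the average-reward problem, and the discounted complexity becomes
\[
\tilde{O}\left(SA\frac{H}{(1-\gamma)^2\epsilon_\gamma^2}\right)
=\tilde{O}\left(SA\frac{H}{(\epsilon/H)^2\,H^2}\right)
=\tilde{O}\left(SA\frac{H}{\epsilon^2}\right).
\]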

We consider the elastic scattering problem by multiple disjoint arcs or \emph{cracks} in two spatial dimensions. A key aspect of our approach lies in the parametric description of each arc's shape, which is controlled by a potentially high-dimensional, possibly countably infinite, set of parameters. We are interested in the efficient approximation of the parameter-to-solution map employing model order reduction techniques, specifically the reduced basis method. Initially, we utilize boundary potentials to transform the boundary value problem, originally posed in an unbounded domain, into a system of boundary integral equations set on the parametrically defined open arcs. Our aim is to construct a rapid surrogate for solving this problem. To achieve this, we adopt the two-phase paradigm of the reduced basis method. In the offline phase, we compute solutions for this problem under the assumption of complete decoupling among arcs for various shapes. Leveraging these high-fidelity solutions and Proper Orthogonal Decomposition (POD), we construct a reduced-order basis tailored to the single arc problem. Subsequently, in the online phase, when computing solutions for the multiple arc problem with a new parametric input, we utilize the aforementioned basis for each individual arc. To expedite the offline phase, we employ a modified version of the Empirical Interpolation Method (EIM) to compute a precise and cost-effective affine representation of the interaction terms between arcs. Finally, we present a series of numerical experiments demonstrating the advantages of our proposed method in terms of both accuracy and computational efficiency.
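
For readers unfamiliar with the offline phase, a minimal Python sketch of POD basis construction from single-arc snapshots follows; the truncation rule and tolerance are illustrative choices, not the paper's.

```python
import numpy as np

def pod_basis(snapshots, tol=1e-8):
    """Build a reduced-order basis from high-fidelity single-arc solutions.

    snapshots: (n x m) matrix whose columns are solutions computed for m
    sampled arc shapes. Keeps the leading left singular vectors until the
    discarded singular-value energy drops below tol.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :r]
```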

In this article we prove that the minimum-degree greedy algorithm, with adversarial tie-breaking, is a $(2/3)$-approximation for the Maximum Independent Set problem on interval graphs. We show that this is tight, even on unit interval graphs of maximum degree 3. We show that on chordal graphs, the greedy algorithm is a $(1/2)$-approximation and that this is again tight. These results contrast with the known (tight) approximation ratio of $\frac{3}{\Delta+2}$ of the greedy algorithm for general graphs of maximum degree $\Delta$.
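
For concreteness, a short Python sketch of the minimum-degree greedy algorithm follows; ties are broken arbitrarily here, whereas the stated ratios hold even under adversarial tie-breaking.

```python
def greedy_mis(adj):
    """Minimum-degree greedy for Maximum Independent Set.

    adj: dict mapping each vertex to the set of its neighbors.
    Repeatedly picks a vertex of minimum current degree (ties broken
    arbitrarily) and deletes its closed neighborhood.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    independent = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        independent.append(v)
        removed = adj[v] | {v}
        for u in removed:
            adj.pop(u, None)
        for nbrs in adj.values():
            nbrs -= removed
    return independent

# Example: a path on 4 vertices; greedy returns an independent set of size 2.
# print(greedy_mis({0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}))
```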

A new $H(\textrm{divdiv})$-conforming finite element is presented, which avoids the need for super-smoothness by redistributing the degrees of freedom to edges and faces. This leads to a hybridizable mixed method with superconvergence for the biharmonic equation. Moreover, new finite element divdiv complexes are established. Finally, new weak Galerkin and $C^0$ discontinuous Galerkin methods for the biharmonic equation are derived.

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model and design two domain adaptation components, one at the image level and one at the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers at the different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our approach on multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our approach for robust object detection in various domain shift scenarios.
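
As a schematic of the adversarial domain-classifier component, here is a minimal PyTorch sketch of a gradient reversal layer feeding an image-level domain classifier; the head architecture and the weight `lam` are illustrative placeholders, not the paper's exact design.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in the
    backward pass, so the shared features are trained to fool the domain
    classifier while the classifier learns to tell domains apart."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class ImageLevelDomainClassifier(nn.Module):
    """Hypothetical image-level head: pooled backbone features pass through
    gradient reversal, then a small MLP predicts a source-vs-target logit
    (trained with BCEWithLogitsLoss against 0/1 domain labels)."""
    def __init__(self, in_dim, lam=0.1):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, feat):
        return self.net(GradReverse.apply(feat, self.lam))
```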
