亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Inspired by branch-and-bound and cutting plane proofs in mixed-integer optimization and proof complexity, we develop a general approach via Hoffman's Helly systems. This helps to distill the main ideas behind optimality and infeasibility certificates in optimization. The first part of the paper formalizes the notion of a certificate and its size in this general setting. The second part of the paper establishes lower and upper bounds on the sizes of these certificates in various different settings. We show that some important techniques existing in the literature are purely combinatorial in nature and do not depend on any underlying geometric notions.

相關內容

This paper addresses theory and applications of $\ell_p$-based Laplacian regularization in semi-supervised learning. The graph $p$-Laplacian for $p>2$ has been proposed recently as a replacement for the standard ($p=2$) graph Laplacian in semi-supervised learning problems with very few labels, where Laplacian learning is degenerate. In the first part of the paper we prove new discrete to continuum convergence results for $p$-Laplace problems on $k$-nearest neighbor ($k$-NN) graphs, which are more commonly used in practice than random geometric graphs. Our analysis shows that, on $k$-NN graphs, the $p$-Laplacian retains information about the data distribution as $p\to \infty$ and Lipschitz learning ($p=\infty$) is sensitive to the data distribution. This situation can be contrasted with random geometric graphs, where the $p$-Laplacian forgets the data distribution as $p\to \infty$. We also present a general framework for proving discrete to continuum convergence results in graph-based learning that only requires pointwise consistency and monotonicity. In the second part of the paper, we develop fast algorithms for solving the variational and game-theoretic $p$-Laplace equations on weighted graphs for $p>2$. We present several efficient and scalable algorithms for both formulations, and present numerical results on synthetic data indicating their convergence properties. Finally, we conduct extensive numerical experiments on the MNIST, FashionMNIST and EMNIST datasets that illustrate the effectiveness of the $p$-Laplacian formulation for semi-supervised learning with few labels. In particular, we find that Lipschitz learning ($p=\infty$) performs well with very few labels on $k$-NN graphs, which experimentally validates our theoretical findings that Lipschitz learning retains information about the data distribution (the unlabeled data) on $k$-NN graphs.

Fitting geometric models onto outlier contaminated data is provably intractable. Many computer vision systems rely on random sampling heuristics to solve robust fitting, which do not provide optimality guarantees and error bounds. It is therefore critical to develop novel approaches that can bridge the gap between exact solutions that are costly, and fast heuristics that offer no quality assurances. In this paper, we propose a hybrid quantum-classical algorithm for robust fitting. Our core contribution is a novel robust fitting formulation that solves a sequence of integer programs and terminates with a global solution or an error bound. The combinatorial subproblems are amenable to a quantum annealer, which helps to tighten the bound efficiently. While our usage of quantum computing does not surmount the fundamental intractability of robust fitting, by providing error bounds our algorithm is a practical improvement over randomised heuristics. Moreover, our work represents a concrete application of quantum computing in computer vision. We present results obtained using an actual quantum computer (D-Wave Advantage) and via simulation. Source code: //github.com/dadung/HQC-robust-fitting

The filtering-clustering models, including trend filtering and convex clustering, have become an important source of ideas and modeling tools in machine learning and related fields. The statistical guarantee of optimal solutions in these models has been extensively studied yet the investigations on the computational aspect have remained limited. In particular, practitioners often employ the first-order algorithms in real-world applications and are impressed by their superior performance regardless of ill-conditioned structures of difference operator matrices, thus leaving open the problem of understanding the convergence property of first-order algorithms. This paper settles this open problem and contributes to the broad interplay between statistics and optimization by identifying a \textit{global error bound} condition, which is satisfied by a large class of dual filtering-clustering problems, and designing a class of \textit{generalized dual gradient ascent} algorithm, which is \textit{optimal} first-order algorithms in deterministic, finite-sum and online settings. Our results are new and help explain why the filtering-clustering models can be efficiently solved by first-order algorithms. We also provide the detailed convergence rate analysis for the proposed algorithms in different settings, shedding light on their potential to solve the filtering-clustering models efficiently. We also conduct experiments on real datasets and the numerical results demonstrate the effectiveness of our algorithms.

Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on \emph{determinantal point processes} (DPP). By incorporating the diversity metric into best-response dynamics, we develop \emph{diverse fictitious play} and \emph{diverse policy-space response oracle} for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric guarantees to enlarge the \emph{gamescape} -- convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve much lower exploitability than state-of-the-art solvers by finding effective and diverse strategies.

As we seek to deploy machine learning models beyond virtual and controlled domains, it is critical to analyze not only the accuracy or the fact that it works most of the time, but if such a model is truly robust and reliable. This paper studies strategies to implement adversary robustly trained algorithms towards guaranteeing safety in machine learning algorithms. We provide a taxonomy to classify adversarial attacks and defenses, formulate the Robust Optimization problem in a min-max setting and divide it into 3 subcategories, namely: Adversarial (re)Training, Regularization Approach, and Certified Defenses. We survey the most recent and important results in adversarial example generation, defense mechanisms with adversarial (re)Training as their main defense against perturbations. We also survey mothods that add regularization terms that change the behavior of the gradient, making it harder for attackers to achieve their objective. Alternatively, we've surveyed methods which formally derive certificates of robustness by exactly solving the optimization problem or by approximations using upper or lower bounds. In addition, we discuss the challenges faced by most of the recent algorithms presenting future research perspectives.

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.

Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization. However, these algorithms may require careful hyperparameter tuning for each problem instance. We use a reinforcement learning agent in conjunction with a quantum-inspired algorithm to solve the Ising energy minimization problem, which is equivalent to the Maximum Cut problem. The agent controls the algorithm by tuning one of its parameters with the goal of improving recently seen solutions. We propose a new Rescaled Ranked Reward (R3) method that enables stable single-player version of self-play training that helps the agent to escape local optima. The training on any problem instance can be accelerated by applying transfer learning from an agent trained on randomly generated problems. Our approach allows sampling high-quality solutions to the Ising problem with high probability and outperforms both baseline heuristics and a black-box hyperparameter optimization approach.

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.

This paper surveys the machine learning literature and presents machine learning as optimization models. Such models can benefit from the advancement of numerical optimization techniques which have already played a distinctive role in several machine learning settings. Particularly, mathematical optimization models are presented for commonly used machine learning approaches for regression, classification, clustering, and deep neural networks as well new emerging applications in machine teaching and empirical model learning. The strengths and the shortcomings of these models are discussed and potential research directions are highlighted.

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.

北京阿比特科技有限公司