We propose a novel concise function representation for graphical models, a central theoretical framework that provides the basis for many reasoning tasks. We then show how we exploit our concise representation, based on deterministic finite state automata, within Bucket Elimination (BE), a general approach based on the concept of variable elimination that can be used to solve many inference and optimisation tasks, such as most probable explanation and constrained optimisation. We denote our version of BE as FABE. By using our concise representation within FABE, we dramatically improve the performance of BE in terms of runtime and memory requirements. Results achieved by comparing FABE with state-of-the-art approaches for most probable explanation (i.e., recursive best-first and structured message passing) and constrained optimisation (i.e., CPLEX, GUROBI, and toulbar2) following an established methodology confirm the efficacy of our concise function representation, showing runtime improvements of up to 5 orders of magnitude in our tests.
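As a rough illustration of the variable-elimination idea behind BE, the sketch below computes the value of the most probable explanation for a tiny two-variable model. It uses plain lookup tables, not the automata-based representation of FABE, and the function names (`eliminate_max`, `mpe_value`) and the toy probabilities are assumptions made for this example only.

```python
from itertools import product

def eliminate_max(factors, var, domains):
    """Multiply all factors that mention `var`, then maximise `var` out.

    A factor is a pair (scope, table): scope is a tuple of variable names and
    table maps each assignment tuple (in scope order) to a non-negative value.
    """
    bucket = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    # Scope of the new factor: every variable in the bucket except `var`.
    scope = sorted({v for s, _ in bucket for v in s if v != var})
    table = {}
    for values in product(*(domains[v] for v in scope)):
        context = dict(zip(scope, values))
        best = 0.0  # valid lower bound because all values are non-negative
        for x in domains[var]:
            context[var] = x
            value = 1.0
            for s, t in bucket:
                value *= t[tuple(context[v] for v in s)]
            best = max(best, value)
        table[values] = best
    return rest + [(tuple(scope), table)]

def mpe_value(factors, order, domains):
    """Eliminate variables in the given order and return the maximal joint value."""
    for var in order:
        factors = eliminate_max(factors, var, domains)
    result = 1.0
    for _, table in factors:
        result *= table[()]  # every remaining factor has an empty scope
    return result

# Toy model (illustrative numbers): P(A) and P(B | A) over binary variables.
domains = {"A": [0, 1], "B": [0, 1]}
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_given_a = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})

best_joint = mpe_value([p_a, p_b_given_a], order=["B", "A"], domains=domains)
```

The tables created during elimination grow exponentially with the number of variables in a bucket, which is the memory cost that a more concise function representation aims to reduce.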

Related content

Functional

Loss · Optimizer · Regularization term · Loss function (machine learning) · MoDELS

June 6, 2022

On the Convergence of Optimizing Persistent-Homology-Based Losses

Yikai Zhang,Jiachen Yao,Yusu Wang,Chao Chen

Topological loss based on persistent homology has shown promise in various applications. A topological loss forces the model to achieve a desired topological property. Despite its empirical success, little is known about the optimization behavior of the loss. In fact, the topological loss involves combinatorial configurations that may oscillate during optimization. In this paper, we introduce a general-purpose regularized topology-aware loss. We propose a novel regularization term and also modify the existing topological loss. These contributions lead to a new loss function that not only forces the model to have the desired topological behavior, but also achieves satisfactory convergence behavior. Our main theoretical result guarantees that the loss can be optimized efficiently under mild assumptions.

Learning · Controller · Episode · Linear · Optimizer

June 6, 2022

Learning to Control under Time-Varying Environment

Yuzhen Han,Ruben Solozabal,Jing Dong,Xingyu Zhou,Martin Takac,Bin Gu

This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At the cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from a nonparametric rate of regret. In this paper, we propose the first computationally tractable online algorithm with regret guarantees that avoids offline planning over the state linear feedback policies. Our algorithm is based on the optimism in the face of uncertainty (OFU) principle, in which we optimistically select the best model in a high-confidence region. Our algorithm is therefore more explorative than previous approaches. To overcome non-stationarity, we propose either a restarting strategy (R-OFU) or a sliding window (SW-OFU) strategy. With proper configuration, our algorithm attains sublinear regret $O(T^{2/3})$. These algorithms utilize data from the current phase for tracking variations in the system dynamics. We corroborate our theoretical findings with numerical experiments, which highlight the effectiveness of our methods. To the best of our knowledge, our study establishes the first model-based online algorithm with regret guarantees under LTV dynamical systems.

Graph · Performer · Kernelization · Learning · Networking

June 6, 2022

A Simple yet Effective Method for Graph Classification

Junran Wu,Shangzhe Li,Jianhao Li,Yicheng Pan,Ke Xu

from arxiv, Accepted by IJCAI2022. arXiv admin note: substantial text overlap with arXiv:2109.02027

In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models. Intuitively, given a problem, a simpler data structure comes with a simpler algorithm. Here, we investigate the feasibility of improving graph classification performance while simplifying the learning process. Inspired by structural entropy on graphs, we transform the data sample from graphs to coding trees, which is a simpler but essential structure for graph data. Furthermore, we propose a novel message passing scheme, termed hierarchical reporting, in which features are transferred from leaf nodes to root nodes by following the hierarchical structure of coding trees. We then present a tree kernel and a convolutional network to implement our scheme for graph classification. With the designed message passing scheme, the tree kernel and convolutional network have a lower runtime complexity of $O(n)$ than Weisfeiler-Lehman subtree kernel and other graph neural networks of at least $O(hm)$. We empirically validate our methods with several graph classification benchmarks and demonstrate that they achieve better performance and lower computational consumption than competing approaches.

Differentiable function · Estimation/Estimator · Lecture notes · Functional · SimPLe

June 6, 2022

A new method for estimating the real roots of real differentiable functions

Hassan Khandani,Farshid Khojasteh

from arxiv, 11 pages

We introduce a new type of Krasnoselskii's result. Using a simple differentiability condition, we relax the nonexpansive condition in Krasnoselskii's theorem. More precisely, we analyze the convergence of the sequence $x_{n+1}=\frac{x_n+g(x_n)}{2}$ based on some differentiability condition of $g$ and present some fixed point results. We introduce some iterative sequences that, for any real differentiable function $g$ and any starting point $x_0\in [a,b]$, converge monotonically to the nearest root of $g$ in $[a,b]$ that lies to the right or left side of $x_0$. Based on this approach, we present an efficient and novel method for finding the real roots of real functions. We prove that no root will be missed in our method. It is worth mentioning that our iterative method is free from derivative evaluation, which can be regarded as an advantage of this method in comparison with many other methods. Finally, we illustrate our results with some numerical examples.
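The averaged iteration $x_{n+1}=\frac{x_n+g(x_n)}{2}$ mentioned in the abstract can be tried out with a few lines of code. The sketch below only implements that basic averaging step to approximate a fixed point of $g$ (a root of $g(x)-x$); it is not the authors' root-finding method, and the function name `krasnoselskii`, the stopping rule, and the choice $g=\cos$ are assumptions made for illustration.

```python
import math

def krasnoselskii(g, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = (x_n + g(x_n)) / 2 until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = 0.5 * (x + g(x))  # average of the current point and its image
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Illustrative choice: g(x) = cos(x), whose fixed point satisfies cos(x) = x.
fixed_point = krasnoselskii(math.cos, x0=1.0)
residual = abs(fixed_point - math.cos(fixed_point))
```

Note that the loop evaluates only $g$ itself and never its derivative, which matches the derivative-free property highlighted in the abstract.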

Markov · Linear · Optimizer · Processing (programming language) · Agent

June 3, 2022

Algorithm for Constrained Markov Decision Process with Linear Convergence

Egor Gladin,Maksim Lavrik-Karmazin,Karina Zainullina,Varvara Rudenko,Alexander Gasnikov,Martin Takáč

from arxiv, 26 pages, 2 figures, 2 tables

The problem of constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual approach is proposed with the integration of two ingredients: entropy regularized policy optimizer and Vaidya's dual optimizer, both of which are critical to achieve faster convergence. The finite-time error bound of the proposed approach is provided. Despite the challenge of the nonconcave objective subject to nonconcave constraints, the proposed approach is shown to converge (with linear rate) to the global optimum. The complexity expressed in terms of the optimality gap and the constraint violation significantly improves upon the existing primal-dual approaches.

Learning · Kernelization · Kernel ridge regression · Early stopping · Ridge regression

June 3, 2022

On the Benefits of Large Learning Rates for Kernel Methods

Gaspard Beugnot,Julien Mairal,Alessandro Rudi

from arxiv, Accepted paper at Conference COLT 2022. To be published to Proceedings of Machine Learning Research (PMLR)

This paper studies an intriguing phenomenon related to the good generalization performance of estimators obtained by using large learning rates within gradient descent algorithms. We show that this phenomenon, first observed in the deep learning literature, can be precisely characterized in the context of kernel methods, even though the resulting optimization problem is convex. Specifically, we consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution on the Hessian's eigenvectors. This extends an intuition described by Nakkiran (2020) on a two-dimensional toy problem to realistic learning scenarios such as kernel ridge regression. While large learning rates may be proven beneficial as soon as there is a mismatch between the train and test objectives, we further explain why this already occurs in classification tasks without assuming any particular mismatch between train and test data distributions.
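The effect of the learning rate on the spectral components of an early-stopped solution can be seen on a two-dimensional quadratic. The sketch below assumes a diagonal Hessian, so the eigenvectors are the coordinate axes; the eigenvalues, target, step counts, and the name `gd_quadratic` are illustrative assumptions and are not taken from the paper. Starting from zero, gradient descent with step size $\eta$ reaches $w_i^\star\,(1-(1-\eta\lambda_i)^t)$ along eigenvalue $\lambda_i$ after $t$ steps.

```python
def gd_quadratic(eigvals, target, lr, steps):
    """Gradient descent from zero on f(w) = 0.5 * sum_i eigvals[i] * (w[i] - target[i])**2."""
    w = [0.0] * len(eigvals)
    for _ in range(steps):
        # The gradient along coordinate i is eigvals[i] * (w[i] - target[i]).
        w = [wi - lr * lam * (wi - ti) for wi, lam, ti in zip(w, eigvals, target)]
    return w

eigvals = [1.0, 0.01]   # one large and one small Hessian eigenvalue (assumed values)
target = [1.0, 1.0]     # minimiser of the quadratic
steps = 20              # early stopping after a fixed number of iterations

small = gd_quadratic(eigvals, target, lr=0.1, steps=steps)
large = gd_quadratic(eigvals, target, lr=1.9, steps=steps)

print("small learning rate:", small)
print("large learning rate:", large)
```

In this toy setting both runs recover the large-eigenvalue direction to about the same degree, while the large learning rate recovers far more of the small-eigenvalue direction within the same number of steps.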

Nonconvex · CC · Optimizer · Regularization term · Learning

June 3, 2022

A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-level Optimization

Ziyi Chen,Bhavya Kailkhura,Yi Zhou

from arxiv, 20 pages, 1 figure, 1 table

Many important machine learning applications involve regularized nonconvex bi-level optimization. However, the existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from high computational complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit differentiation (AID) scheme for nonconvex bi-level optimization with possibly nonconvex and nonsmooth regularizers. In particular, the algorithm applies Nesterov's momentum to accelerate the computation of the implicit gradient involved in AID. We provide a comprehensive analysis of the global convergence properties of this algorithm by identifying its intrinsic potential function. In particular, we formally establish the convergence of the model parameters to a critical point of the bi-level problem, and obtain an improved computational complexity of $\mathcal{O}(\kappa^{3.5}\epsilon^{-2})$ over the state-of-the-art result. Moreover, we analyze the asymptotic convergence rates of this algorithm under a class of local nonconvex geometries characterized by a Łojasiewicz-type gradient inequality. Experiments on hyper-parameter optimization demonstrate the effectiveness of our algorithm.

Graph · Graphics processing unit · Neural Networks · MoDELS · Networking

June 3, 2022

Instant Graph Neural Networks for Dynamic Graphs

Yanping Zheng,Hanzhi Wang,Zhewei Wei,Jiajun Liu,Sibo Wang

Graph Neural Networks (GNNs) have been widely used for modeling graph-structured data. With the development of numerous GNN variants, recent years have witnessed groundbreaking results in improving the scalability of GNNs to work on static graphs with millions of nodes. However, how to instantly represent continuous changes of large-scale dynamic graphs with GNNs is still an open problem. Existing dynamic GNNs focus on modeling the periodic evolution of graphs, often on a snapshot basis. Such methods suffer from two drawbacks: first, there is a substantial delay for the changes in the graph to be reflected in the graph representations, resulting in losses on the model's accuracy; second, repeatedly calculating the representation matrix on the entire graph in each snapshot is predominantly time-consuming and severely limits the scalability. In this paper, we propose Instant Graph Neural Network (InstantGNN), an incremental computation approach for the graph representation matrix of dynamic graphs. Set to work with dynamic graphs with the edge-arrival model, our method avoids time-consuming, repetitive computations and allows instant updates on the representation and instant predictions. Graphs with dynamic structures and dynamic attributes are both supported. The upper bounds of time complexity of those updates are also provided. Furthermore, our method provides an adaptive training strategy, which guides the model to retrain at moments when it can make the greatest performance gains. We conduct extensive experiments on several real-world and synthetic datasets. Empirical results demonstrate that our model achieves state-of-the-art accuracy while having orders-of-magnitude higher efficiency than existing methods.

Tractable · Graph · Lecture notes · contrastive · Normalized

June 2, 2022

On the Parallel Parameterized Complexity of MaxSAT Variants

Max Bannach,Malte Skambath,Till Tantau

from arxiv, SAT 2022

In the maximum satisfiability problem (MAX-SAT) we are given a propositional formula in conjunctive normal form and have to find an assignment that satisfies as many clauses as possible. We study the parallel parameterized complexity of various versions of MAX-SAT and provide the first constant-time algorithms parameterized either by the solution size or by the allowed excess relative to some guarantee ("above guarantee" versions). For the dual parameterized version where the parameter is the number of clauses we are allowed to leave unsatisfied, we present the first parallel algorithm for MAX-2SAT (known as ALMOST-2SAT). The difficulty in solving ALMOST-2SAT in parallel comes from the fact that the iterative compression method, originally developed to prove that the problem is fixed-parameter tractable at all, is inherently sequential. We observe that a graph flow whose value is a parameter can be computed in parallel and use this fact to develop a parallel algorithm for the vertex cover problem parameterized above the size of a given matching. Finally, we study the parallel complexity of MAX-SAT parameterized by the vertex cover number, the treedepth, the feedback vertex set number, and the treewidth of the input's incidence graph. While MAX-SAT is fixed-parameter tractable for all of these parameters, we show that they allow different degrees of possible parallelization. For all four we develop dedicated parallel algorithms that are constructive, meaning that they output an optimal assignment - in contrast to results that can be obtained by parallel meta-theorems, which often only solve the decision version.
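To make the MAX-SAT problem statement concrete, the sketch below solves a tiny instance by exhaustive search. This is only a sequential brute-force baseline for illustration and has nothing to do with the parallel parameterized algorithms of the paper; the DIMACS-style clause encoding (positive integer for a variable, negative for its negation) and the function names are assumptions of this example.

```python
from itertools import product

def count_satisfied(clauses, assignment):
    """Count clauses with at least one true literal under the assignment."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def max_sat_brute_force(clauses, num_vars):
    """Try all 2**num_vars assignments and return (best count, best assignment)."""
    best_count, best_assignment = -1, None
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        satisfied = count_satisfied(clauses, assignment)
        if satisfied > best_count:
            best_count, best_assignment = satisfied, assignment
    return best_count, best_assignment

# All four 2-clauses over x1 and x2: no assignment satisfies more than three.
clauses = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
best_count, best_assignment = max_sat_brute_force(clauses, num_vars=2)
```

Because the function returns the assignment as well as the count, it is constructive in the sense used in the abstract: it outputs an optimal assignment and not only the answer to the decision version.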

Cluster · Graph · SC · Graphics processing unit · Pooling

June 3, 2020

Spectral Clustering with Graph Neural Networks for Graph Pooling

Filippo Maria Bianchi,Daniele Grattarola,Cesare Alippi

Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require computing the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.