
We give an $\widetilde{O}(m^{3/2 - 1/762} \log (U+W))$ time algorithm for minimum cost flow with capacities bounded by $U$ and costs bounded by $W$. For sparse graphs with general capacities, this is the first algorithm to improve over the $\widetilde{O}(m^{3/2} \log^{O(1)} (U+W))$ running time obtained by an appropriate instantiation of an interior point method [Daitch-Spielman, 2008]. Our approach extends the framework put forth in [Gao-Liu-Peng, 2021] for computing the maximum flow in graphs with large capacities and, in particular, demonstrates how to reduce the problem of computing an electrical flow with general demands to the same problem on a sublinear-sized set of vertices -- even if the demand is supported on the entire graph. Along the way, we develop new machinery to assess the importance of the graph's edges at each phase of the interior point method's optimization process. This capability relies on establishing new connections between the electrical flows arising inside that optimization process and vertex distances in the corresponding effective resistance metric.
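For intuition about the primitive at the heart of this approach, the following is a minimal numpy sketch, not the paper's algorithm: it computes an electrical flow for a given demand via a single Laplacian solve and evaluates the effective resistance across each edge, the distance notion underlying the edge-importance machinery. The example graph, demand, and all variable names are illustrative.

```python
import numpy as np

# Tiny illustrative graph: edges as (u, v, resistance).
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 1.0), (2, 3, 1.0)]
n, m = 4, len(edges)

# Signed edge-vertex incidence matrix B (m x n) and edge conductances.
B = np.zeros((m, n))
cond = np.zeros(m)
for i, (u, v, r) in enumerate(edges):
    B[i, u], B[i, v] = 1.0, -1.0
    cond[i] = 1.0 / r

L = B.T @ (cond[:, None] * B)              # weighted Laplacian L = B^T C B
d = np.array([1.0, 0.0, 0.0, -1.0])        # demand: one unit from vertex 0 to 3

phi = np.linalg.lstsq(L, d, rcond=None)[0]  # vertex potentials phi = L^+ d
f = cond * (B @ phi)                        # electrical flow f = C B L^+ d

# Effective resistance across each edge, the distance used to gauge
# edge importance in the resistance metric.
reff = np.einsum('ij,jk,ik->i', B, np.linalg.pinv(L), B)
print(f, reff)
```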

Related content

We propose the homotopic policy mirror descent (HPMD) method for solving discounted, infinite horizon MDPs with finite state and action spaces, and study its policy convergence. We report three properties that seem to be new in the literature on policy gradient methods: (1) The policy first converges linearly, then superlinearly with order $\gamma^{-2}$ to the set of optimal policies, after $\mathcal{O}(\log(1/\Delta^*))$ iterations, where $\Delta^*$ is defined via a gap quantity associated with the optimal state-action value function; (2) HPMD also exhibits last-iterate convergence, with the limiting policy corresponding exactly to the optimal policy with maximal entropy for every state. No regularization is added to the optimization objective, so the second observation arises solely as an algorithmic property of the homotopic policy gradient method; (3) For the stochastic HPMD method, we demonstrate a better-than-$\mathcal{O}(|\mathcal{S}| |\mathcal{A}| / \epsilon^2)$ sample complexity for small optimality gap $\epsilon$, assuming a generative model for policy evaluation.
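The core update behind policy mirror descent is easy to state concretely. Below is a tabular sketch under assumed toy dynamics: exact policy evaluation followed by a KL (softmax) mirror descent step, with a geometrically increasing stepsize standing in for the paper's homotopy schedule. The MDP, constants, and schedule are illustrative, not the paper's.

```python
import numpy as np

def policy_eval(P, R, pi, gamma, iters=300):
    """Evaluate Q^pi on a tabular MDP by iterating the Bellman equation."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        V = (pi * Q).sum(axis=1)                     # V(s) = E_{a~pi} Q(s,a)
        Q = R + gamma * np.einsum('sap,p->sa', P, V)
    return Q

def pmd_step(pi, Q, eta):
    """KL mirror descent: pi'(a|s) proportional to pi(a|s) * exp(eta * Q(s,a))."""
    logits = np.log(pi + 1e-300) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    new = np.exp(logits)
    return new / new.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
S, A, gamma = 4, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))           # P[s, a] is a dist over s'
R = rng.random((S, A))

pi, eta = np.full((S, A), 1.0 / A), 1.0
for k in range(30):
    Q = policy_eval(P, R, pi, gamma)
    pi = pmd_step(pi, Q, eta)
    eta /= gamma             # geometrically increasing stepsizes (homotopy)
print(pi.round(3))           # concentrates on the greedy actions
```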

In decentralized learning, a network of nodes cooperates to minimize an overall objective function that is usually the finite sum of their local objectives, augmented with a non-smooth regularization term for better generalization. The decentralized stochastic proximal gradient (DSPG) method is commonly used to train such models, but its convergence rate is slowed by the variance of the stochastic gradients. In this paper, we propose a novel algorithm, DPSVRG, that accelerates decentralized training by leveraging variance reduction. The basic idea is to introduce an estimator in each node that periodically tracks the local full gradient and uses it to correct the stochastic gradient at each iteration. By transforming our decentralized algorithm into a centralized inexact proximal gradient algorithm with variance reduction, and controlling the bounds of the error sequences, we prove that DPSVRG converges at the rate $O(1/T)$ for general convex objectives plus a non-smooth term, where $T$ is the number of iterations, while DSPG converges at the rate $O(\frac{1}{\sqrt{T}})$. Our experiments on different applications, network topologies, and learning models demonstrate that DPSVRG converges much faster than DSPG, and that the loss function of DPSVRG decreases smoothly over the training epochs.
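To make the snapshot-correction idea concrete, here is a toy numpy sketch of a DPSVRG-style loop on decentralized lasso, under assumptions not taken from the paper: a ring gossip matrix, local least-squares objectives, and an $\ell_1$ prox. Stepsizes and the snapshot period are illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    """Prox of t * ||x||_1 (the non-smooth regularizer)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Toy decentralized least squares + l1: node i holds data (A[i], b[i]).
rng = np.random.default_rng(1)
n_nodes, n_samples, dim = 4, 20, 5
A = rng.normal(size=(n_nodes, n_samples, dim))
b = rng.normal(size=(n_nodes, n_samples))
lam, step = 0.01, 0.05

# Doubly stochastic mixing matrix for a ring topology (illustrative).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

x = np.zeros((n_nodes, dim))
for epoch in range(10):
    # Periodic snapshot: each node tracks its local full gradient.
    snap = x.copy()
    resid = np.einsum('nsd,nd->ns', A, snap) - b
    full_grad = np.einsum('nsd,ns->nd', A, resid) / n_samples
    for _ in range(n_samples):
        j = rng.integers(n_samples)
        # Variance-reduced gradient: stochastic grad corrected by the snapshot.
        g = np.stack([A[i, j] * (A[i, j] @ x[i] - b[i, j])
                      - A[i, j] * (A[i, j] @ snap[i] - b[i, j])
                      + full_grad[i]
                      for i in range(n_nodes)])
        x = soft_threshold(W @ x - step * g, step * lam)  # gossip + prox step
print(x.round(3))
```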

Running machine learning algorithms on large and rapidly growing volumes of data is often computationally expensive. One common trick to reduce the size of a data set, and thus the computational cost of machine learning algorithms, is \emph{probability sampling}: it creates a sampled data set by including each data point from the original data set with a known probability. Although the benefit of running machine learning algorithms on the reduced data set is obvious, one major concern is that the performance of the solution obtained from samples might be much worse than that of the optimal solution on the full data set. In this paper, we examine the performance loss caused by probability sampling in the context of adaptive submodular maximization. We consider a simple probability sampling method which selects each data point with probability at least $r\in[0,1]$. If we set $r=1$, our problem reduces to finding a solution based on the original full data set. We define the sampling gap as the largest ratio between the optimal solution obtained from the full data set and the optimal solution obtained from the samples, over independence systems. Our main contribution is to show that if the sampling probability of each data point is at least $r$ and the utility function is policywise submodular, then the sampling gap is both upper bounded and lower bounded by $1/r$. We show that policywise submodularity holds in a wide range of real-world applications, including pool-based active learning and adaptive viral marketing.
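A minimal sketch of the sampling-then-optimize pipeline studied here, with the adaptive policy simplified to non-adaptive greedy selection for brevity (the paper's adaptive setting is richer): each point survives with probability $r$, and greedy runs on the surviving ground set. The coverage utility and all names are illustrative.

```python
import random

def sample_then_greedy(points, utility, k, r, seed=0):
    """Keep each point independently with probability r, then greedily
    select k items on the sampled ground set."""
    rng = random.Random(seed)
    sampled = [p for p in points if rng.random() < r]
    chosen = []
    for _ in range(min(k, len(sampled))):
        best = max((p for p in sampled if p not in chosen),
                   key=lambda p: utility(chosen + [p]) - utility(chosen))
        chosen.append(best)
    return chosen

# Toy coverage utility (monotone submodular): each point covers a set.
cover = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4, 5}, 4: {6}}

def f(S):
    return len(set().union(*(cover[p] for p in S))) if S else 0

print(sample_then_greedy(list(cover), f, k=2, r=0.8))
```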

Graph-based multi-robot path planning (MRPP) is NP-hard to solve optimally. In this work, we propose the first low-polynomial-time algorithm for MRPP achieving 1--1.5 asymptotic optimality guarantees on solution makespan for random instances under very high robot density. Specifically, on an $m_1\times m_2$ grid, $m_1 \ge m_2$, our RTH (Rubik Table with Highways) algorithm computes solutions for routing up to $\frac{m_1m_2}{3}$ robots with uniformly randomly distributed start and goal configurations with a makespan of $m_1 + 2m_2 + o(m_1)$, with high probability. Because the minimum makespan for such instances is $m_1 + m_2 - o(m_1)$, also with high probability, RTH guarantees $\frac{m_1+2m_2}{m_1+m_2}$ optimality, a ratio in $(1, 1.5]$, as $m_1 \to \infty$ for random instances with up to $\frac{1}{3}$ robot density, with high probability. Alongside this key result, we also establish: (1) for completely filled grids, i.e., $m_1m_2$ robots, any MRPP instance may be solved in polynomial time with a makespan of $7m_1 + 14m_2$; (2) for $\frac{m_1m_2}{3}$ robots, RTH solves arbitrary MRPP instances with a makespan of $3m_1+4m_2 + o(m_1)$; (3) for $\frac{m_1m_2}{2}$ robots, a variation of RTH solves a random MRPP instance with the same 1--1.5 optimality guarantee; and (4) the same $\frac{m_1+2m_2}{m_1+m_2}$ optimality guarantee holds for regularly distributed obstacles at $\frac{1}{9}$ density together with $\frac{2m_1m_2}{9}$ randomly distributed robots; such settings map directly to real-world parcel sorting scenarios. In extensive numerical evaluations, RTH and its variants demonstrate exceptional scalability compared with methods including ECBS and DDM, scaling to over $450 \times 300$ grids with $45{,}000$ robots, and consistently achieve makespans around $1.5\times$ optimal or better, as predicted by our theoretical analysis.
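The headline guarantees reduce to a simple leading-term calculation; the sketch below evaluates the stated makespan bound and optimality ratio for a few grid shapes, dropping the $o(m_1)$ terms. This is arithmetic on the stated bounds, not an implementation of RTH.

```python
def rth_bounds(m1, m2):
    """Leading-term makespan guarantees (the o(m1) terms are dropped)."""
    assert m1 >= m2
    upper = m1 + 2 * m2      # RTH makespan on random instances, 1/3 density
    lower = m1 + m2          # minimum-makespan lower bound, w.h.p.
    return upper, lower, upper / lower

for m1, m2 in [(300, 300), (450, 300), (1000, 100)]:
    up, lo, ratio = rth_bounds(m1, m2)
    print(f"{m1}x{m2}: RTH <= {up}, OPT >= {lo}, ratio {ratio:.3f}")
# ratios: 1.500, 1.400, 1.091 -- all within the stated (1, 1.5] range
```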

This paper investigates the recovery of a spectrally sparse signal from its partially revealed noisy entries within the framework of spectral compressive sensing. Nonconvex optimization approaches have recently been proposed based on low-rank Hankel matrix completion and projected gradient descent (PGD). PGD, however, involves unknown tuning parameters, and its theoretical analysis is available only in the absence of noise. In this paper, we propose a hyperparameter-free, vanilla gradient descent (VGD) algorithm and prove that VGD enables robust recovery of an $N$-dimensional $K$-spectrally-sparse signal from order $K^2 \log^2 N$ noisy samples under coherence and other mild conditions. This sample complexity increases by a factor of $\log N$ compared with PGD without noise. Numerical simulations are provided that corroborate our analysis and show the advantageous performance of VGD.
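The structural fact driving this line of work is that a $K$-spectrally-sparse signal has a rank-$K$ Hankel lift; VGD runs gradient descent on a factorization of this lift. The sketch below only verifies the rank property on a synthetic signal; the signal parameters are illustrative and the recovery algorithm itself is not reproduced.

```python
import numpy as np

def hankel(x, p):
    """Hankel lift H(x) of size p x (N - p + 1); rank K for K-sparse x."""
    N = len(x)
    return np.array([x[i:i + N - p + 1] for i in range(p)])

# K-spectrally-sparse signal: a sum of K complex sinusoids.
rng = np.random.default_rng(2)
N, K = 64, 3
freqs = rng.random(K)
n = np.arange(N)
x = sum(np.exp(2j * np.pi * f * n) for f in freqs)

H = hankel(x, N // 2)
print(np.linalg.matrix_rank(H, tol=1e-8))   # -> K (here, 3)
```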

We analyze the orthogonal greedy algorithm when applied to dictionaries $\mathbb{D}$ whose convex hull has small entropy. We show that if the metric entropy of the convex hull of $\mathbb{D}$ decays at a rate of $O(n^{-\frac{1}{2}-\alpha})$ for $\alpha > 0$, then the orthogonal greedy algorithm converges at the same rate on the variation space of $\mathbb{D}$. This improves upon the well-known $O(n^{-\frac{1}{2}})$ convergence rate of the orthogonal greedy algorithm in many cases, most notably for dictionaries corresponding to shallow neural networks. These results hold under no additional assumptions on the dictionary beyond the decay rate of the entropy of its convex hull. In addition, they are robust to noise in the target function and can be extended to convergence rates on the interpolation spaces of the variation norm. We show empirically that the predicted rates are obtained for the dictionary corresponding to shallow neural networks with the Heaviside activation function in two dimensions. Finally, we show that these improved rates are sharp and prove a negative result showing that the iterates generated by the orthogonal greedy algorithm cannot in general be bounded in the variation norm of $\mathbb{D}$.
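For concreteness, here is a small numpy sketch of the orthogonal greedy algorithm itself: greedily select the dictionary element most correlated with the residual, then re-fit by least squares. The dictionary below uses 1D Heaviside ridge functions as a simplified stand-in for the 2D shallow-network dictionary in the experiments; the target and sizes are illustrative.

```python
import numpy as np

def orthogonal_greedy(D, f, steps):
    """Orthogonal greedy algorithm: at each step, pick the dictionary element
    most correlated with the residual, then re-fit by least squares."""
    chosen, residual = [], f.copy()
    for _ in range(steps):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in chosen:
            chosen.append(k)
        coef, *_ = np.linalg.lstsq(D[:, chosen], f, rcond=None)
        residual = f - D[:, chosen] @ coef
    return chosen, residual

# Dictionary of 1D Heaviside step functions sampled on a grid.
grid = np.linspace(0, 1, 200)
D = np.array([(grid >= t).astype(float) for t in np.linspace(0, 1, 50)]).T
D /= np.linalg.norm(D, axis=0) + 1e-12
f = np.sin(2 * np.pi * grid)
for n_steps in [5, 10, 20]:
    _, r = orthogonal_greedy(D, f, n_steps)
    print(n_steps, np.linalg.norm(r) / np.sqrt(len(grid)))  # error decays
```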

Tilt-series alignment is crucial to obtaining high-resolution reconstructions in cryo-electron tomography. Beam-induced local deformation of the sample is hard to estimate from the low-contrast sample alone, and often requires fiducial gold bead markers. The state-of-the-art approach for deformation estimation uses (semi-)manually labelled marker locations in projection data to fit the parameters of a polynomial deformation model. Manually labelled marker locations are difficult to obtain when data are noisy or markers overlap in projection data. We propose an alternative mathematical approach for simultaneous marker localization and deformation estimation by extending a grid-free super-resolution algorithm first proposed in the context of single-molecule localization microscopy. Our approach does not require labelled marker locations; instead, we use an image-based loss in which we compare the forward projection of markers with the observed data. We equip this marker localization scheme with an additional deformation estimation component and solve for a reduced number of deformation parameters. Using extensive numerical studies on marker-only samples, we show that our approach automatically finds markers and reliably estimates sample deformation without labelled marker data. We further demonstrate the applicability of our approach for a broad range of model mismatch scenarios, including experimental electron tomography data of gold markers on ice.
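The image-based loss can be illustrated in a 1D toy analogue, with assumptions not from the paper: markers are point sources, the forward projection is a sum of Gaussian blobs, and plain gradient descent on the squared residual recovers marker positions without labels. The deformation-estimation component is omitted.

```python
import numpy as np

# Toy 1D analogue: markers are point sources and the "forward projection"
# is a sum of Gaussian blobs at the marker positions.
grid = np.linspace(0.0, 1.0, 256)
sigma = 0.02

def project(pos):
    return sum(np.exp(-(grid - p) ** 2 / (2 * sigma ** 2)) for p in pos)

true_pos = np.array([0.25, 0.60, 0.80])
data = project(true_pos)                  # "observed" projection, no labels

pos = np.array([0.27, 0.58, 0.82])        # rough initialization
lr = 1e-4
for _ in range(500):
    resid = project(pos) - data
    # Analytic gradient of 0.5 * ||project(pos) - data||^2 in each position.
    grad = np.array([np.sum(resid
                            * np.exp(-(grid - p) ** 2 / (2 * sigma ** 2))
                            * (grid - p) / sigma ** 2)
                     for p in pos])
    pos -= lr * grad
print(pos)                                # drifts back toward true_pos
```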

For a map (function) $F:\mathbb{F}_2^n\rightarrow\mathbb{F}_2^n$ and a given $y$ in the image of $F$, the problem of \emph{local inversion} of $F$ is to find all inverse images $x$ in $\mathbb{F}_2^n$ such that $y=F(x)$. In cryptology, this problem arises in the cryptanalysis of one-way functions (OWFs). The well-known time-memory trade-off (TMTO) attack is a probabilistic algorithm that computes one solution of local inversion using computation of order $O(\sqrt{N})$ both offline and online, where $N=2^n$. This paper proposes a complete algorithm for solving the local inversion problem, whose cost is governed by the linear complexity of the sequence generated by $F$ from $y$ when a unique solution lies on a periodic orbit. The algorithm requires an offline computation to solve a hard problem (possibly requiring exponential computation) and an online computation, dependent on $y$, consisting of repeated forward evaluations $F(x)$ at points $x$ in $\mathbb{F}_{2^n}$, each of which takes polynomial time. The forward evaluation is repeated at most as many times as the linear complexity of the sequence $\{y,F(y),\ldots\}$ to obtain one solution when this sequence is periodic. All other solutions are obtained in chains $\{e,F(e),\ldots\}$ for points $e$ in the Garden of Eden (GOE) of the map $F$. Hence a solution $x$ exists iff either the former sequence is periodic or a solution occurs in a chain starting from a point in the GOE. The online computation is thus polynomial time, $O(L^k)$ in the linear complexity $L$ of the sequence, to compute one solution in a periodic orbit, or $O(l)$ in the chain length $l$ for fixed $n$. Hence this is a complete algorithm for finding all rational solutions $x$ of the equation $F(x)=y$ for a given $y$ and a map $F$ over $\mathbb{F}_{2^n}$.
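The periodic-orbit case admits a very short sketch: if iterating $F$ from $y$ returns to $y$, the point visited just before that return is an inverse image. The sketch below uses orbit detection by direct iteration rather than the paper's linear-complexity machinery (for which Berlekamp-Massey is the standard tool), and an affine permutation of $\mathbb{Z}_{16}$ stands in for a map on $\mathbb{F}_{2^n}$.

```python
def local_invert_periodic(F, y, max_iters=1 << 16):
    """If y lies on a periodic orbit of F, the point visited just before
    the orbit returns to y is an inverse image: F(x) = y."""
    prev, cur = y, F(y)
    for _ in range(max_iters):
        if cur == y:
            return prev           # F(prev) == y
        prev, cur = cur, F(cur)
    return None                   # y not detected on a periodic orbit

# An affine permutation of Z_16 stands in for a map on F_{2^n};
# real field arithmetic over GF(2^n) is omitted for brevity.
F = lambda x: (x * 5 + 3) % 16    # gcd(5, 16) = 1, so F is a permutation
y = 7
x = local_invert_periodic(F, y)
print(x, F(x) == y)               # -> 4 True
```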

Spectral clustering (SC) is a popular clustering technique for finding strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require computing the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.
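The relaxed objective is compact enough to write down directly. Below is a numpy sketch of the two loss terms, a normalized-cut term and an orthogonality regularizer, evaluated on a hand-built graph of two cliques; the GNN that would produce the assignment matrix $S$ is replaced by a hard-coded good assignment. The toy graph and sizes are illustrative.

```python
import numpy as np

def mincut_losses(A, S):
    """Continuous relaxation of normalized minCUT plus the orthogonality
    regularizer that the GNN minimizes when predicting assignments S."""
    D = np.diag(A.sum(axis=1))
    cut = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)
    K = S.shape[1]
    SS = S.T @ S
    ortho = np.linalg.norm(SS / np.linalg.norm(SS) - np.eye(K) / np.sqrt(K))
    return cut, ortho

# Two 3-cliques joined by a single edge; a good S assigns one cluster each.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
S_good = np.array([[1, 0]] * 3 + [[0, 1]] * 3, dtype=float)
print(mincut_losses(A, S_good))   # cut loss near -1, near-zero ortho penalty
```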

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
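A continuous-depth block is just an ODE solve whose right-hand side is a small network. The sketch below shows the forward pass only, with a fixed-step Euler integrator standing in for the paper's black-box adaptive solvers, and omits the adjoint-based backward pass; the weights and sizes are illustrative.

```python
import numpy as np

def odeint_euler(f, h0, t0, t1, steps=50):
    """Fixed-step Euler integration of dh/dt = f(h, t); a stand-in for
    the adaptive black-box solvers used in the paper."""
    h, t = h0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        h = h + dt * f(h, t)
        t += dt
    return h

# A tiny "ODE block": the derivative of the hidden state is a neural net.
rng = np.random.default_rng(3)
W1 = rng.normal(size=(8, 2)) * 0.5
W2 = rng.normal(size=(2, 8)) * 0.5

def dynamics(h, t):
    return np.tanh(h @ W1.T) @ W2.T   # 2 -> 8 -> 2 MLP, no discrete layers

h0 = np.array([1.0, -1.0])
h1 = odeint_euler(dynamics, h0, 0.0, 1.0)
print(h1)                             # output of the continuous-depth block
```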
