
A common technique to speed up shortest-path queries in graphs is bidirectional search, i.e., performing a forward search from the start and a backward search from the destination until a common vertex on a shortest path is found. In practice, this has a tremendous impact on performance on some real-world networks, while it only seems to save a constant factor on other types of networks. Even though finding shortest paths is a ubiquitous problem, there are only a few studies attempting to understand the apparently asymptotic speedups on some networks, using average-case analysis on certain models of real-world networks. In this paper we give a new perspective on this by analyzing deterministic properties that permit theoretical analysis and that can easily be checked on any particular instance. We prove that these parameters imply sublinear running time for the bidirectional breadth-first search in several regimes, some of which are tight. Moreover, we perform experiments on a large set of real-world networks showing that our parameters capture the concept of practical running time well.
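
To make the technique concrete, here is a minimal Python sketch of bidirectional BFS on an unweighted, undirected graph (a generic textbook version, not the paper's analysis): the two searches alternate by expanding the smaller frontier and stop once they meet on a common vertex.

```python
from collections import deque

def bidirectional_bfs(adj, s, t):
    """Length of a shortest s-t path in an unweighted, undirected graph, or None.

    adj maps each node to an iterable of neighbours. A minimal sketch: the smaller
    frontier is expanded first, and the search stops once the two frontiers meet.
    """
    if s == t:
        return 0
    dist_s, dist_t = {s: 0}, {t: 0}
    frontier_s, frontier_t = deque([s]), deque([t])
    best = None
    while frontier_s and frontier_t:
        # Expand the smaller frontier; this balancing is what makes the search fast
        # on networks where frontiers grow quickly.
        if len(frontier_s) <= len(frontier_t):
            frontier, dist, other = frontier_s, dist_s, dist_t
        else:
            frontier, dist, other = frontier_t, dist_t, dist_s
        next_frontier = deque()
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
                if v in other:
                    d = dist[u] + 1 + other[v]
                    best = d if best is None else min(best, d)
        if best is not None:
            return best
        if dist is dist_s:
            frontier_s = next_frontier
        else:
            frontier_t = next_frontier
    return None

# Example: a path graph 0-1-2-3-4-5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
print(bidirectional_bfs(path, 0, 5))   # 5
```

On networks where frontiers grow rapidly, each search only explores a small ball around its endpoint before the frontiers meet, which is the behaviour the paper's deterministic parameters are designed to capture.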

Related content

Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy-gradient method as the actor and any value-based method as the critic. The critic is usually trained by minimizing the TD error, an objective that is potentially disconnected from the true goal of achieving a high reward with the actor. We address this mismatch by designing a joint objective for training the actor and critic in a decision-aware fashion. We use the proposed objective to design a generic AC algorithm that can easily handle any function approximation. We explicitly characterize the conditions under which the resulting algorithm guarantees monotonic policy improvement, regardless of the choice of the policy and critic parameterization. Instantiating the generic algorithm results in an actor that maximizes a sequence of surrogate functions (similar to TRPO and PPO) and a critic that minimizes a closely connected objective. Using simple bandit examples, we provably establish the benefit of the proposed critic objective over the standard squared error. Finally, we empirically demonstrate the benefit of our decision-aware actor-critic framework on simple RL problems.
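
For context, the following toy sketch shows the standard (not decision-aware) tabular actor-critic setup on a two-armed bandit: the critic is trained on the squared TD error while the actor takes a policy-gradient step, which is exactly the actor/critic mismatch the abstract refers to. The bandit, learning rates, and reward means are hypothetical illustration choices, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.7])        # hypothetical arm rewards
theta = np.zeros(2)                      # actor: softmax logits over the two arms
v = 0.0                                  # critic: value of the single bandit state
alpha_actor, alpha_critic = 0.1, 0.1

for step in range(2000):
    pi = np.exp(theta - theta.max()); pi /= pi.sum()
    a = rng.choice(2, p=pi)
    r = rng.normal(true_means[a], 0.1)
    td_error = r - v                     # one-step TD error with the reward as target
    v += alpha_critic * td_error         # critic: gradient step on the squared TD error
    grad_log_pi = -pi; grad_log_pi[a] += 1.0
    theta += alpha_actor * td_error * grad_log_pi   # actor: policy gradient with critic baseline

print("learned policy:", np.round(pi, 3), "critic value:", round(v, 3))
```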

The efficient representation of random fields on geometrically complex domains is crucial for Bayesian modelling in engineering and machine learning. Today's prevalent random field representations are restricted to unbounded domains or are too restrictive in terms of possible field properties. As a result, new techniques leveraging the historically established link between stochastic PDEs (SPDEs) and random fields are especially appealing for engineering applications with complex geometries that already have a finite element discretisation for solving the physical conservation equations. Unlike the dense covariance matrix of a random field, its inverse, the precision matrix, is usually sparse and equal to the stiffness matrix of a Helmholtz-like SPDE. In this paper, we use the SPDE representation to develop a scalable framework for large-scale statistical finite element analysis (statFEM) and Gaussian process (GP) regression on geometrically complex domains. We use the SPDE formulation to obtain the relevant prior probability densities with a sparse precision matrix. The properties of the priors are governed by the parameters, and possibly the fractional order, of the Helmholtz-like SPDE, so that we can model anisotropic, non-homogeneous random fields with arbitrary smoothness on bounded domains and manifolds. We assemble the sparse precision matrix using the same finite element mesh used for solving the physical conservation equations. The observation models for statFEM and GP regression are such that the posterior probability densities are Gaussians with closed-form mean and precision. The expressions for the mean vector and the precision matrix can be evaluated using only sparse matrix operations. We demonstrate the versatility of the proposed framework and its convergence properties with one- and two-dimensional Poisson and thin-shell examples.
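
As a rough illustration of the SPDE-to-precision-matrix link, the sketch below assembles the mass and stiffness matrices of linear elements on a uniform 1D mesh and forms the sparse precision Q = kappa^2 M + K of the lowest-order Helmholtz-like operator, then draws a prior sample via a Cholesky factor. This is only a minimal sketch under simplifying assumptions; the paper's framework handles fractional orders, anisotropy, and complex 2D/3D geometries, none of which are reproduced here, and boundary effects are ignored.

```python
import numpy as np
import scipy.sparse as sp

n, kappa = 200, 10.0
h = 1.0 / (n - 1)

# Mass (M) and stiffness (K) matrices of linear elements on a uniform 1D mesh.
main_M = np.full(n, 2 * h / 3); main_M[[0, -1]] = h / 3
M = sp.diags([np.full(n - 1, h / 6), main_M, np.full(n - 1, h / 6)], [-1, 0, 1], format="csc")
main_K = np.full(n, 2 / h); main_K[[0, -1]] = 1 / h
K = sp.diags([np.full(n - 1, -1 / h), main_K, np.full(n - 1, -1 / h)], [-1, 0, 1], format="csc")

Q = (kappa ** 2) * M + K                  # sparse precision matrix of the prior field

# Draw a sample x with precision Q: if Q = L L^T (Cholesky), then x = L^{-T} z with
# z ~ N(0, I) has covariance Q^{-1}. A dense Cholesky is used here only for brevity.
L = np.linalg.cholesky(Q.toarray())
z = np.random.default_rng(1).standard_normal(n)
x = np.linalg.solve(L.T, z)
print("prior sample, first five values:", np.round(x[:5], 4))
```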

This paper advances the theory and practice of Domain Generalization (DG) in machine learning. We consider the typical DG setting where the hypothesis is composed of a representation mapping followed by a labeling function. Within this setting, the majority of popular DG methods aim to jointly learn the representation and the labeling functions by minimizing a well-known upper bound for the classification risk in the unseen domain. In practice, however, methods based on this theoretical upper bound ignore a term that cannot be directly optimized due to its dual dependence on both the representation mapping and the unknown optimal labeling function in the unseen domain. To bridge this gap between theory and practice, we introduce a new upper bound that is free of terms having such dual dependence, resulting in a fully optimizable risk upper bound for the unseen domain. Our derivation leverages classical and recent transport inequalities that link optimal transport metrics with information-theoretic measures. Compared to previous bounds, our bound introduces two new terms: (i) the Wasserstein-2 barycenter term that aligns distributions between domains, and (ii) the reconstruction loss term that assesses the quality of representation in reconstructing the original data. Based on this new upper bound, we propose a novel DG algorithm named Wasserstein Barycenter Auto-Encoder (WBAE) that simultaneously minimizes the classification loss, the barycenter loss, and the reconstruction loss. Numerical results demonstrate that the proposed method outperforms current state-of-the-art DG algorithms on several datasets.
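
A minimal numpy sketch of the three-term objective described above is given below, with 1-dimensional representations so that the Wasserstein-2 barycenter and distances have closed forms via sorted samples (quantiles). The toy per-domain representations, loss placeholders, and trade-off weights are hypothetical illustration choices, not the WBAE architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def w2_squared_1d(a, b):
    """Squared 2-Wasserstein distance between two equal-size 1D samples."""
    return np.mean((np.sort(a) - np.sort(b)) ** 2)

# Toy per-domain representations z_d (stand-ins for encoder outputs).
domains = [rng.normal(mu, 1.0, size=256) for mu in (0.0, 0.5, 1.0)]

# In 1D, the W2 barycenter of empirical measures is the average of sorted samples.
barycenter = np.mean([np.sort(z) for z in domains], axis=0)
barycenter_loss = np.mean([w2_squared_1d(z, barycenter) for z in domains])

classification_loss = 0.40        # placeholder: cross-entropy of the label head
reconstruction_loss = 0.25        # placeholder: decoder MSE on the inputs
lam_bary, lam_rec = 1.0, 0.1      # hypothetical trade-off weights

total = classification_loss + lam_bary * barycenter_loss + lam_rec * reconstruction_loss
print("barycenter alignment term:", round(barycenter_loss, 4), "total objective:", round(total, 4))
```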

This paper interprets the stabilized finite element method via residual minimization as a variational multiscale method. We approximate the solution to the partial differential equations using two discrete spaces that we build on a triangulation of the domain; we denote these spaces as the coarse and enriched spaces. Building on the adaptive stabilized finite element method via residual minimization, we find a coarse-scale approximation in a continuous space by minimizing the residual in a dual discontinuous Galerkin norm; this process allows us to compute a robust error estimate to construct an on-the-fly adaptive method. We reinterpret the residual projection using the variational multiscale framework to derive a fine-scale approximation. As a result, on each mesh of the adaptive process, we obtain stable coarse- and fine-scale solutions derived from a symmetric saddle-point formulation and an a posteriori error indicator to guide automatic adaptivity. We test our framework in several challenging scenarios for linear and nonlinear convection-dominated diffusion problems to demonstrate its ability to provide stable solutions with optimal convergence rates in the asymptotic regime and robust performance in the pre-asymptotic regime. Lastly, we introduce a heuristic dual-term contribution in the variational form to improve the full-scale approximation for symmetric formulations (e.g., the diffusion problem).
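
The following small linear-algebra sketch shows the symmetric saddle-point structure that residual minimization in a discrete dual norm produces, together with the norm of the residual representative that serves as the a posteriori error indicator. The matrices below are random placeholders rather than an actual PDE discretization, so this is only a structural illustration.

```python
import numpy as np

# Residual minimization in a discrete dual norm leads to the symmetric saddle point
#   [ G   B ] [ eps ]   [ f ]
#   [ B^T 0 ] [  u  ] = [ 0 ],
# where G is the Gram matrix of the (discontinuous Galerkin) test inner product,
# B the discretized bilinear form, u the coarse-scale solution, and eps the residual
# representative whose norm ||eps||_G drives the adaptive refinement.
rng = np.random.default_rng(0)
m, n = 12, 6                                 # test-space and trial-space dimensions
A = rng.standard_normal((m, m)); G = A @ A.T + m * np.eye(m)   # SPD Gram matrix
B = rng.standard_normal((m, n))              # placeholder for the discrete bilinear form
f = rng.standard_normal(m)                   # placeholder load functional

S = np.block([[G, B], [B.T, np.zeros((n, n))]])
rhs = np.concatenate([f, np.zeros(n)])
sol = np.linalg.solve(S, rhs)
eps, u = sol[:m], sol[m:]

error_indicator = np.sqrt(eps @ G @ eps)     # a posteriori estimate used for adaptivity
print("coarse solution u:", np.round(u, 3), "\nerror indicator:", round(error_indicator, 4))
```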

Deep learning models can be vulnerable to recovery attacks, raising privacy concerns for users, and widespread algorithms such as empirical risk minimization (ERM) often do not directly enforce safety guarantees. In this paper, we study the safety of ERM-trained models against a family of powerful black-box attacks. Our analysis quantifies this safety via two separate terms: (i) the model stability with respect to individual training samples, and (ii) the feature alignment between the attacker query and the original data. While the first term is well established in learning theory and connected to the generalization error in classical work, the second one is, to the best of our knowledge, novel. Our key technical result provides a precise characterization of the feature alignment for the two prototypical settings of random features (RF) and neural tangent kernel (NTK) regression. This proves that privacy strengthens with an increase in generalization capability, also unveiling the role of the activation function. Numerical experiments show behavior in agreement with our theory not only for the RF and NTK models, but also for deep neural networks trained on standard datasets (MNIST, CIFAR-10).
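
As an illustration of the random features setting, the sketch below builds a ReLU random-feature map and computes one natural notion of alignment between an attacker query and a training point, namely the cosine similarity of their RF embeddings. This is only a stand-in; the paper's exact alignment quantity is defined in the text and not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 20, 500                                   # input dimension, number of random features
W = rng.standard_normal((p, d)) / np.sqrt(d)     # fixed random first-layer weights

def features(x):
    """ReLU random-feature map phi(x) = max(Wx, 0) / sqrt(p)."""
    return np.maximum(W @ x, 0.0) / np.sqrt(p)

target = rng.standard_normal(d)                      # a training sample the attacker probes
query = target + 0.5 * rng.standard_normal(d)        # attacker query correlated with it

phi_q, phi_t = features(query), features(target)
alignment = phi_q @ phi_t / (np.linalg.norm(phi_q) * np.linalg.norm(phi_t))
print("feature alignment (cosine in RF space):", round(alignment, 4))
```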

Recently, several studies have considered the stochastic optimization problem in a heavy-tailed noise regime, i.e., the difference between the stochastic gradient and the true gradient is assumed to have a finite $p$-th moment (say, upper bounded by $\sigma^{p}$ for some $\sigma\geq0$) with $p\in(1,2]$, which not only generalizes the traditional finite-variance assumption ($p=2$) but has also been observed in practice for several different tasks. Under this challenging assumption, much new progress has been made for both convex and nonconvex problems; however, most of it considers only smooth objectives. In contrast, the problem is not yet fully explored or well understood when the functions are nonsmooth. This paper aims to fill this crucial gap by providing a comprehensive analysis of stochastic nonsmooth convex optimization with heavy-tailed noise. We revisit a simple clipping-based algorithm which, so far, has only been proved to converge in expectation and only under an additional strong convexity assumption. Under appropriate choices of parameters, for both convex and strongly convex functions, we not only establish the first high-probability rates but also give refined in-expectation bounds compared with existing works. Remarkably, all of our results are optimal (or nearly optimal up to logarithmic factors) with respect to the time horizon $T$, even when $T$ is unknown in advance. Additionally, we show how to make the algorithm parameter-free with respect to $\sigma$; in other words, the algorithm can still guarantee convergence without any prior knowledge of $\sigma$. Furthermore, an initial-distance-adaptive convergence rate is provided if $\sigma$ is assumed to be known.
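
A minimal sketch of a clipping-based method on a toy nonsmooth convex objective f(x) = |x - 1| with heavy-tailed gradient noise is shown below; the clipping threshold and step-size schedule are generic choices for illustration, not the parameter settings analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x, T = 10.0, 20000
for t in range(1, T + 1):
    g_true = np.sign(x - 1.0)                      # subgradient of f(x) = |x - 1|
    g = g_true + rng.standard_t(df=1.5)            # heavy-tailed noise: finite p-th moment only for p < 1.5
    tau = 10.0 * np.sqrt(t)                        # growing clipping threshold (generic schedule)
    g_clipped = g if abs(g) <= tau else tau * np.sign(g)
    x -= g_clipped / np.sqrt(t)                    # step size ~ 1 / sqrt(t)
print("final iterate:", round(x, 3), "(minimizer at 1.0)")
```

Without clipping, a single extreme noise draw can throw the iterate arbitrarily far; clipping caps the damage of any one step, which is what makes high-probability guarantees possible in this regime.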

Estimating the rigid transformation between two LiDAR scans through putative 3D correspondences is a typical point cloud registration paradigm. Current 3D feature matching approaches commonly lead to numerous outlier correspondences, making outlier-robust registration techniques indispensable. Many recent studies have adopted the branch and bound (BnB) optimization framework to solve the correspondence-based point cloud registration problem globally and deterministically. Nonetheless, BnB-based methods are time-consuming when searching the entire 6-dimensional parameter space, since their computational complexity is exponential in the dimension of the solution domain. To enhance algorithm efficiency, existing works attempt to decouple the original problem with 6 degrees of freedom (DOF) into two 3-DOF sub-problems, thereby reducing the dimension of the parameter space. In contrast, our proposed approach introduces a novel pose decoupling strategy based on residual projections, effectively decomposing the raw problem into three 2-DOF rotation search sub-problems. Subsequently, we employ a novel BnB-based search method to solve these sub-problems, achieving efficient and deterministic registration. Furthermore, our method can be adapted to address the challenging problem of simultaneous pose and correspondence registration (SPCR). Through extensive experiments conducted on synthetic and real-world datasets, we demonstrate that our proposed method outperforms state-of-the-art methods in terms of efficiency, while simultaneously ensuring robustness.
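
To illustrate the BnB paradigm itself (not the paper's residual-projection decoupling), the toy sketch below runs an interval branch-and-bound over a single 2D rotation angle, maximizing the number of inlier correspondences with a provable per-interval upper bound. Point counts, noise levels, and the inlier threshold are hypothetical.

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)
theta_true, eps = 0.8, 0.05
src = rng.standard_normal((40, 2))                        # source points
R_true = np.array([[np.cos(theta_true), -np.sin(theta_true)],
                   [np.sin(theta_true),  np.cos(theta_true)]])
dst = src @ R_true.T                                      # putative correspondences
dst[:10] += rng.normal(0.0, 1.0, (10, 2))                 # 25% outlier correspondences

def rot(theta):
    return np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])

def inliers(theta):
    return int(np.sum(np.linalg.norm(src @ rot(theta).T - dst, axis=1) < eps))

def upper_bound(lo, hi):
    # Rotating by at most half the interval width moves a point x by <= ||x|| * width / 2,
    # so relaxing the threshold by that slack bounds the inlier count over the interval.
    c, half = 0.5 * (lo + hi), 0.5 * (hi - lo)
    slack = np.linalg.norm(src, axis=1) * half
    return int(np.sum(np.linalg.norm(src @ rot(c).T - dst, axis=1) < eps + slack))

best_theta, best_count = 0.0, -1
queue = [(-upper_bound(-np.pi, np.pi), -np.pi, np.pi)]
while queue:
    neg_ub, lo, hi = heapq.heappop(queue)
    if -neg_ub <= best_count:
        break                                             # no remaining interval can do better
    c = 0.5 * (lo + hi)
    count = inliers(c)
    if count > best_count:
        best_count, best_theta = count, c
    if hi - lo > 1e-4:
        for a, b in ((lo, c), (c, hi)):
            heapq.heappush(queue, (-upper_bound(a, b), a, b))

print("estimated angle:", round(best_theta, 4), "true angle:", theta_true, "inliers:", best_count)
```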

We develop deterministic algorithms for the problems of consensus, gossiping and checkpointing with nodes prone to failures. Distributed systems are modeled as synchronous complete networks. Failures are represented either as crashes or authenticated Byzantine faults. The algorithmic goal is to have both linear running time and a linear amount of communication for as large an upper bound $t$ on the number of faults as possible, with respect to the number of nodes $n$. For crash failures, these bounds of optimality are $t=\mathcal{O}(\frac{n}{\log n})$ for consensus and $t=\mathcal{O}(\frac{n}{\log^2 n})$ for gossiping and checkpointing, while the running time of each algorithm is $\Theta(t+\log n)$. For the authenticated Byzantine model of failures, we show how to accomplish both linear running time and communication for $t=\mathcal{O}(\sqrt{n})$. We show how to implement the algorithms in the single-port model, in which a node may choose only one other node to send/receive a message to/from in a round, so as to preserve the range of running time and communication optimality. We prove lower bounds to show the optimality of some performance bounds.
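
For context, the sketch below simulates the classical flooding consensus algorithm in the synchronous complete-network model with crash failures: t + 1 rounds of broadcasting the set of values seen so far, after which all surviving nodes decide the minimum. This baseline uses far more communication than the paper's algorithms and only illustrates the failure model; the inputs and crash schedule are hypothetical.

```python
import random

def flooding_consensus(inputs, crash_round, t):
    """Synchronous flooding consensus tolerating at most t crash failures."""
    n = len(inputs)
    known = [{v} for v in inputs]                   # values each node has seen so far
    alive = [True] * n
    for rnd in range(1, t + 2):                     # t + 1 rounds tolerate t crashes
        messages = []
        for i in range(n):
            if not alive[i]:
                continue
            if crash_round.get(i) == rnd:           # node i crashes mid-broadcast:
                receivers = random.sample(range(n), random.randint(0, n))
                alive[i] = False                    # only a subset receives its message
            else:
                receivers = range(n)
            messages.append((set(known[i]), receivers))
        for vals, receivers in messages:            # synchronous delivery at round end
            for j in receivers:
                known[j] |= vals
    return {i: min(known[i]) for i in range(n) if alive[i]}   # decide the minimum value seen

random.seed(0)
print(flooding_consensus(inputs=[3, 1, 4, 1, 5, 9], crash_round={0: 1, 2: 2}, t=2))
# Every surviving node decides the same value (agreement): with at most t crashes,
# at least one of the t + 1 rounds is crash-free, after which all sets coincide.
```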

In order to overcome the expressive limitations of graph neural networks (GNNs), we propose the first method that exploits vector flows over graphs to develop globally consistent directional and asymmetric aggregation functions. We show that our directional graph networks (DGNs) generalize convolutional neural networks (CNNs) when applied on a grid. Whereas recent theoretical works focus on understanding local neighbourhoods, local structures and local isomorphism with no global information flow, our novel theoretical framework allows directional convolutional kernels in any graph. First, by defining a vector field on the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages into the field. Then we propose the use of the Laplacian eigenvectors as such a vector field, and we show that the method generalizes CNNs on an n-dimensional grid and is provably more discriminative than standard GNNs with respect to the 1-Weisfeiler-Lehman (1-WL) test. Finally, we bring the power of CNN data augmentation to graphs by providing a means of doing reflection, rotation and distortion on the underlying directional field. We evaluate our method on different standard benchmarks and see a relative error reduction of 8% on the CIFAR10 graph dataset and of 11% to 32% on the molecular ZINC dataset. An important outcome of this work is that it enables the translation of any physical or biological problem with intrinsic directional axes into a graph-network formalism with an embedded directional field.
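
A small numpy sketch of the directional-aggregation idea is given below: an eigenvector of the graph Laplacian (the Fiedler vector) defines a field on the edges, and neighbour messages are weighted by that field, yielding an asymmetric, direction-aware aggregator (a directional derivative). Normalization and other details of the actual DGN layers are omitted; the path graph and features are illustrative.

```python
import numpy as np

# Path graph on 6 nodes with one scalar feature per node.
n = 6
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
D = np.diag(A.sum(axis=1))
L = D - A                                          # combinatorial Laplacian

eigvals, eigvecs = np.linalg.eigh(L)
phi = eigvecs[:, 1]                                # Fiedler vector: a smooth "direction" on the graph

# Per-edge field: F[i, j] = phi[j] - phi[i] on edges, 0 elsewhere.
F = A * (phi[None, :] - phi[:, None])

x = np.arange(n, dtype=float)                      # node features
row_norm = np.abs(F).sum(axis=1, keepdims=True) + 1e-12
W = F / row_norm                                   # normalise the field per node

directional_derivative = W @ x - W.sum(axis=1) * x # sum_j W_ij * (x_j - x_i)
print("directional aggregation per node:", np.round(directional_derivative, 3))
```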

Graph Neural Networks (GNNs) are an emerging approach for learning on non-Euclidean data. Recently, there has been increased interest in designing GNNs that scale to large graphs. Most existing methods use "graph sampling" or "layer-wise sampling" techniques to reduce training time. However, these methods still suffer from degrading performance and scalability problems when applied to graphs with billions of edges. This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes. Theoretical analysis shows that GBP is the first method that achieves sub-linear time complexity for both the precomputation and the training phases. An extensive empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time. Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.
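
For reference, the dense computation that such propagation schemes target is a generalized PageRank propagation of the feature matrix, Z = sum_k w_k (D^{-1/2} A D^{-1/2})^k X with decaying weights; the sketch below computes it exactly on a toy graph. GBP's contribution is approximating this quantity with localized, bidirectional (feature-side and node-side) propagation in sub-linear time, which is not reproduced here; graph size, weights, and truncation depth are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, K, alpha = 8, 4, 6, 0.2
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                     # random undirected toy graph
deg = np.clip(A.sum(axis=1), 1.0, None)
P = A / np.sqrt(deg[:, None] * deg[None, :])       # symmetrically normalized adjacency

X = rng.standard_normal((n, d))                    # raw node features
Z, power = np.zeros_like(X), X.copy()
for k in range(K + 1):
    Z += alpha * (1 - alpha) ** k * power          # PPR-style decaying weights w_k
    power = P @ power                              # next power of the propagation matrix
print("propagated feature matrix Z:\n", np.round(Z, 3))
```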
