亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider a standard distributed optimisation setting where $N$ machines, each holding a $d$-dimensional function $f_i$, aim to jointly minimise the sum of the functions $\sum_{i = 1}^N f_i (x)$. This problem arises naturally in large-scale distributed optimisation, where a standard solution is to apply variants of (stochastic) gradient descent. We focus on the communication complexity of this problem: our main result provides the first fully unconditional bounds on total number of bits which need to be sent and received by the $N$ machines to solve this problem under point-to-point communication, within a given error-tolerance. Specifically, we show that $\Omega( Nd \log d / N\varepsilon)$ total bits need to be communicated between the machines to find an additive $\epsilon$-approximation to the minimum of $\sum_{i = 1}^N f_i (x)$. The result holds for both deterministic and randomised algorithms, and, importantly, requires no assumptions on the algorithm structure. The lower bound is tight under certain restrictions on parameter values, and is matched within constant factors for quadratic objectives by a new variant of quantised gradient descent, which we describe and analyse. Our results bring over tools from communication complexity to distributed optimisation, which has potential for further applications.

相關內容

For the problem of maximizing a monotone, submodular function with respect to a cardinality constraint $k$ on a ground set of size $n$, we provide an algorithm that achieves the state-of-the-art in both its empirical performance and its theoretical properties, in terms of adaptive complexity, query complexity, and approximation ratio; that is, it obtains, with high probability, query complexity of $O(n)$ in expectation, adaptivity of $O(\log(n))$, and approximation ratio of nearly $1-1/e$. The main algorithm is assembled from two components which may be of independent interest. The first component of our algorithm, LINEARSEQ, is useful as a preprocessing algorithm to improve the query complexity of many algorithms. Moreover, a variant of LINEARSEQ is shown to have adaptive complexity of $O( \log (n / k) )$ which is smaller than that of any previous algorithm in the literature. The second component is a parallelizable thresholding procedure THRESHOLDSEQ for adding elements with gain above a constant threshold. Finally, we demonstrate that our main algorithm empirically outperforms, in terms of runtime, adaptive rounds, total queries, and objective values, the previous state-of-the-art algorithm FAST in a comprehensive evaluation with six submodular objective functions.

One of the main reasons for query model's prominence in quantum complexity is the presence of concrete lower bounding techniques: polynomial method and adversary method. There have been considerable efforts to not just give lower bounds using these methods but even to compare and relate them. We explore the value of these bounds on quantum query complexity for the class of symmetric functions, arguably one of the most natural and basic set of Boolean functions. We show that the recently introduced measure of spectral sensitivity give the same value as both these bounds (positive adversary and approximate degree) for every total symmetric Boolean function. We also look at the quantum query complexity of Gap Majority, a partial symmetric function. It has gained importance recently in regard to understanding the composition of randomized query complexity. We characterize the quantum query complexity of Gap Majority and show a lower bound on noisy randomized query complexity (Ben-David and Blais, FOCS 2020) in terms of quantum query complexity. In addition, we study how large certificate complexity and block sensitivity can be as compared to sensitivity (even up to constant factors) for symmetric functions. We show tight separations, i.e., give upper bound on possible separations and construct functions achieving the same.

In this paper, we introduce reduced-bias estimators for the estimation of the tail index of a Pareto-type distribution. This is achieved through the use of a regularised weighted least squares with an exponential regression model for log-spacings of top order statistics. The asymptotic properties of the proposed estimators are investigated analytically and found to be asymptotically unbiased, consistent and normally distributed. Also, the finite sample behaviour of the estimators are studied through a simulations theory. The proposed estimators were found to yield low bias and MSE. In addition, the proposed estimators are illustrated through the estimation of the tail index of the underlying distribution of claims from the insurance industry.

The age of Information (AoI) has been introduced to capture the notion of freshness in real-time monitoring applications. However, this metric falls short in many scenarios, especially when quantifying the mismatch between the current and the estimated states. To circumvent this issue, in this paper, we adopt the age of incorrect of information metric (AoII) that considers the quantified mismatch between the source and the knowledge at the destination. We consider for that a problem where a central entity pulls the information from remote sources that evolve according to a Markovian Process. It selects at each time slot which sources should send their updates. As the scheduler does not know the real state of the remote sources, it estimates at each time the value of AoII based on the Markovian sources' parameters. Its goal is to keep the time average of AoII function as small as possible. We develop a scheduling scheme based on Whittle's index policy for that purpose. To that extent, we proceed by using the Lagrangian Relaxation Approach and establish that the dual problem has an optimal threshold policy. Building on that, we compute the expressions of Whittle's indices. Finally, we provide some numerical results to highlight the performance of our derived policy compared to the classical AoI metric.

We study distributed algorithms built around minor-based vertex sparsifiers, and give the first algorithm in the CONGEST model for solving linear systems in graph Laplacian matrices to high accuracy. Our Laplacian solver has a round complexity of $O(n^{o(1)}(\sqrt{n}+D))$, and thus almost matches the lower bound of $\widetilde{\Omega}(\sqrt{n}+D)$, where $n$ is the number of nodes in the network and $D$ is its diameter. We show that our distributed solver yields new sublinear round algorithms for several cornerstone problems in combinatorial optimization. This is achieved by leveraging the powerful algorithmic framework of Interior Point Methods (IPMs) and the Laplacian paradigm in the context of distributed graph algorithms, which entails numerically solving optimization problems on graphs via a series of Laplacian systems. Problems that benefit from our distributed algorithmic paradigm include exact mincost flow, negative weight shortest paths, maxflow, and bipartite matching on sparse directed graphs. For the maxflow problem, this is the first exact distributed algorithm that applies to directed graphs, while the previous work by [Ghaffari et al. SICOMP'18] considered the approximate setting and works only for undirected graphs. For the mincost flow and the negative weight shortest path problems, our results constitute the first exact distributed algorithms running in a sublinear number of rounds. Given that the hybrid between IPMs and the Laplacian paradigm has proven useful for tackling numerous optimization problems in the centralized setting, we believe that our distributed solver will find future applications.

Minimax problems have gained tremendous attentions across the optimization and machine learning community recently. In this paper, we introduce a new quasi-Newton method for minimax problems, which we call $J$-symmetric quasi-Newton method. The method is obtained by exploiting the $J$-symmetric structure of the second-order derivative of the objective function in minimax problem. We show that the Hessian estimation (as well as its inverse) can be updated by a rank-2 operation, and it turns out that the update rule is a natural generalization of the classic Powell symmetric Broyden (PSB) method from minimization problems to minimax problems. In theory, we show that our proposed quasi-Newton algorithm enjoys local Q-superlinear convergence to a desirable solution under standard regularity conditions. Furthermore, we introduce a trust-region variant of the algorithm that enjoys global R-superlinear convergence. Finally, we present numerical experiments that verify our theory and show the effectiveness of our proposed algorithms compared to Broyden's method and the extragradient method on three classes of minimax problems.

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on parallel architecture. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, as well as by (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at a $\mathcal{O}(\log(T))$ cost on parallel hardware.

In this paper, we propose new geometrically unfitted space-time Finite Element methods for partial differential equations posed on moving domains of higher order accuracy in space and time. As a model problem, the convection-diffusion problem on a moving domain is studied. For geometrically higher order accuracy, we apply a parametric mapping on a background space-time tensor-product mesh. Concerning discretisation in time, we consider discontinuous Galerkin, as well as related continuous (Petrov-)Galerkin and Galerkin collocation methods. For stabilisation with respect to bad cut configurations and as an extension mechanism that is required for the latter two schemes, a ghost penalty stabilisation is employed. The article puts an emphasis on the techniques that allow to achieve a robust but higher order geometry handling for smooth domains. We investigate the computational properties of the respective methods in a series of numerical experiments. These include studies in different dimensions for different polynomial degrees in space and time, validating the higher order accuracy in both variables.

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.

北京阿比特科技有限公司