
We formalize the definition of a stable algorithm that is (i) adapted to the use of multiple and variable precision arithmetic, (ii) sufficiently close to the actual practice of computing to be useful, and (iii) sufficiently robust from a mathematical point of view to allow for the rigorous proof of theorems. This allows us to state some widely satisfied hypotheses, depending only on two functions $f$ and $g$, under which the composition of a stable algorithm for $f$ and a stable algorithm for $g$ is a stable algorithm for the composition $f \circ g$.

Related content

Continuous DR-submodular functions are a class of generally non-convex/non-concave functions that satisfy the Diminishing Returns (DR) property, which implies that they are concave along non-negative directions. Existing work has studied monotone continuous DR-submodular maximization subject to a convex constraint and provided efficient algorithms with approximation guarantees. In many applications, such as computing the stability number of a graph, the monotone DR-submodular objective function has the additional property of being strongly concave along non-negative directions (i.e., strongly DR-submodular). In this paper, we consider a subclass of $L$-smooth monotone DR-submodular functions that are strongly DR-submodular and have a bounded curvature, and we show how to exploit such additional structure to obtain faster algorithms with stronger guarantees for the maximization problem. We propose a new algorithm that matches the provably optimal $1-\frac{c}{e}$ approximation ratio after only $\lceil\frac{L}{\mu}\rceil$ iterations, where $c\in[0,1]$ and $\mu\geq 0$ are the curvature and the strong DR-submodularity parameter. Furthermore, we study the Projected Gradient Ascent (PGA) method for this problem, and provide a refined analysis of the algorithm with an improved $\frac{1}{1+c}$ approximation ratio (compared to $\frac{1}{2}$ in prior works) and a linear convergence rate. Experimental results illustrate and validate the efficiency and effectiveness of our proposed algorithms.

Majority-SAT is the problem of determining whether an input $n$-variable formula in conjunctive normal form (CNF) has at least $2^{n-1}$ satisfying assignments. Majority-SAT and related problems have been studied extensively in various AI communities interested in the complexity of probabilistic planning and inference. Although Majority-SAT has been known to be PP-complete for over 40 years, the complexity of a natural variant has remained open: Majority-$k$SAT, where the input CNF formula is restricted to have clause width at most $k$. We prove that for every $k$, Majority-$k$SAT is in P. In fact, for any positive integer $k$ and rational $\rho \in (0,1)$ with bounded denominator, we give an algorithm that can determine whether a given $k$-CNF has at least $\rho \cdot 2^n$ satisfying assignments, in deterministic linear time (whereas the previous best-known algorithm ran in exponential time). Our algorithms have interesting positive implications for counting complexity and the complexity of inference, significantly reducing the known complexities of related problems such as E-MAJ-$k$SAT and MAJ-MAJ-$k$SAT. At the heart of our approach is an efficient method for solving threshold counting problems by extracting sunflowers found in the corresponding set system of a $k$-CNF. We also show that the tractability of Majority-$k$SAT is somewhat fragile. For the closely related GtMajority-SAT problem (where we ask whether a given formula has greater than $2^{n-1}$ satisfying assignments) which is known to be PP-complete, we show that GtMajority-$k$SAT is in P for $k\le 3$, but becomes NP-complete for $k\geq 4$. These results are counterintuitive, because the ``natural'' classifications of these problems would have been PP-completeness, and because there is a stark difference in the complexity of GtMajority-$k$SAT and Majority-$k$SAT for all $k\ge 4$.
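To make the two threshold questions concrete, here is a minimal brute-force sketch of the problem definitions. It is not the linear-time algorithm described in the abstract: it enumerates all $2^n$ assignments and is only meant to show the difference between "at least $2^{n-1}$" and "greater than $2^{n-1}$". The function names and the DIMACS-style clause encoding are assumptions made for illustration.

```python
from itertools import product


def count_satisfying(clauses, n):
    """Count assignments of n Boolean variables that satisfy a CNF formula.

    clauses: list of clauses, each a list of nonzero ints. Literal v means
    variable v is true, -v means it is false (variables numbered 1..n).
    This encoding is an assumption chosen for the example.
    """
    count = 0
    for bits in product([False, True], repeat=n):
        # A clause holds if at least one of its literals is satisfied.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count


def majority_sat(clauses, n):
    # Majority-SAT: at least 2^(n-1) satisfying assignments.
    return count_satisfying(clauses, n) >= 2 ** (n - 1)


def gt_majority_sat(clauses, n):
    # GtMajority-SAT: strictly more than 2^(n-1) satisfying assignments.
    return count_satisfying(clauses, n) > 2 ** (n - 1)
```

For the single-clause formula `[[1]]` over two variables, exactly half of the assignments are satisfying, so the two predicates disagree; this boundary case is where the two problems differ.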

Statistical machine learning models trained with stochastic gradient algorithms are increasingly being deployed in critical scientific applications. However, computing the stochastic gradient in several such applications is highly expensive or even impossible at times. In such cases, derivative-free or zeroth-order algorithms are used. An important question which has thus far not been addressed sufficiently in the statistical machine learning literature is that of equipping stochastic zeroth-order algorithms with practical yet rigorous inferential capabilities, so that we not only have point estimates or predictions but can also quantify the associated uncertainty via confidence intervals or sets. Towards this, in this work, we first establish a central limit theorem for the Polyak-Ruppert averaged stochastic zeroth-order gradient algorithm. We then provide online estimators of the asymptotic covariance matrix appearing in the central limit theorem, thereby providing a practical procedure for constructing asymptotically valid confidence sets (or intervals) for parameter estimation (or prediction) in the zeroth-order setting.
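As a rough illustration of the kind of iteration the abstract refers to, the sketch below combines a two-point zeroth-order gradient estimate along a random Gaussian direction with Polyak-Ruppert averaging of the iterates. The estimator form, the step-size schedule `eta0 * t ** (-decay)`, and all function and parameter names are assumptions made for this example; the abstract does not specify the authors' exact algorithm, and the covariance estimators it mentions are not shown.

```python
import random


def zo_gradient(f, x, h, rng):
    """Two-point zeroth-order gradient estimate using function values only."""
    u = [rng.gauss(0.0, 1.0) for _ in x]               # random direction
    x_plus = [xi + h * ui for xi, ui in zip(x, u)]
    x_minus = [xi - h * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * h)       # directional derivative
    return [slope * ui for ui in u]


def averaged_zo_sgd(f, x0, steps, seed=0, h=1e-3, eta0=0.1, decay=0.6):
    """Zeroth-order descent; returns the running average of all iterates."""
    rng = random.Random(seed)
    x = list(x0)
    avg = list(x0)                                     # average includes x0
    for t in range(1, steps + 1):
        g = zo_gradient(f, x, h, rng)
        eta = eta0 * t ** (-decay)                     # decaying step size
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        # Running mean over the t + 1 iterates seen so far.
        avg = [a + (xi - a) / (t + 1) for a, xi in zip(avg, x)]
    return avg


def quadratic(x):
    # Hypothetical test objective with minimizer (3, -1).
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


estimate = averaged_zo_sgd(quadratic, [0.0, 0.0], steps=5000)
```

The averaged iterate, rather than the last one, is the quantity for which the abstract states a central limit theorem.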

The purpose of this paper is to perform an error analysis of the variational integrators of mechanical systems subject to external forcing. Essentially, we prove that when discretizations of contact order $r$ of the Lagrangian and the force are used, the integrator has the same contact order. Our analysis is performed first for discrete forced mechanical systems defined over $TQ$, where we study the existence of flows, the construction and properties of discrete exact systems and the contact order of the flows (variational integrators) in terms of the contact order of the original systems. Then we use those results to derive the corresponding analysis for the analogous forced systems defined over $Q\times Q$.

We consider sensitivity of a generic stochastic optimization problem to model uncertainty. We take a non-parametric approach and capture model uncertainty using Wasserstein balls around the postulated model. We provide explicit formulae for the first order correction to both the value function and the optimizer and further extend our results to optimization under linear constraints. We present applications to statistics, machine learning, mathematical finance and uncertainty quantification. In particular, we provide explicit first-order approximation for square-root LASSO regression coefficients and deduce coefficient shrinkage compared to the ordinary least squares regression. We consider robustness of call option pricing and deduce a new Black-Scholes sensitivity, a non-parametric version of the so-called Vega. We also compute sensitivities of optimized certainty equivalents in finance and propose measures to quantify robustness of neural networks to adversarial examples.

In this paper, we study an initial-boundary value problem of Kirchhoff type involving a memory term for non-homogeneous materials. The purpose of this research is threefold. First, we prove the existence and uniqueness of weak solutions to the problem using the Galerkin method. Second, to obtain numerical solutions efficiently, we develop an L1-type backward Euler-Galerkin FEM, which is $O(h+k^{2-\alpha})$ accurate, where $\alpha$ ($0<\alpha<1$) is the order of the fractional time derivative, and $h$ and $k$ are the discretization parameters for the space and time directions, respectively. Third, to achieve the optimal rate of convergence in time, we propose a fractional Crank-Nicolson-Galerkin FEM based on the L2-1$_{\sigma}$ scheme. We prove that the numerical solutions of this scheme converge to the exact solution with accuracy $O(h+k^{2})$. We also derive a priori bounds on numerical solutions for the proposed schemes. Finally, some numerical experiments are conducted to validate our theoretical claims.
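The $O(k^{2-\alpha})$ temporal rate quoted above is characteristic of the standard L1 approximation of a Caputo derivative on a uniform time grid. The sketch below implements only that textbook time-discretization formula, assuming a Caputo derivative of order $\alpha \in (0,1)$ and step size $k$; it is not the authors' full Euler-Galerkin scheme, and the function name is hypothetical.

```python
from math import gamma


def l1_caputo(u_vals, k, alpha):
    """L1 approximation of the Caputo derivative of order alpha in (0, 1).

    u_vals = [u(t_0), ..., u(t_n)] on the uniform grid t_i = i * k.
    Returns the approximation at the last grid point t_n.
    """
    n = len(u_vals) - 1
    total = 0.0
    for j in range(n):
        # Weight from integrating the kernel over one time step.
        b_j = (j + 1) ** (1 - alpha) - j ** (1 - alpha)
        total += b_j * (u_vals[n - j] - u_vals[n - j - 1])
    return total * k ** (-alpha) / gamma(2 - alpha)


# u(t) = t: the L1 formula interpolates u linearly, so it reproduces the
# exact Caputo derivative t^(1 - alpha) / Gamma(2 - alpha) up to rounding.
k, n, alpha = 0.1, 10, 0.5
linear_approx = l1_caputo([i * k for i in range(n + 1)], k, alpha)
linear_exact = (n * k) ** (1 - alpha) / gamma(2 - alpha)
```

For smooth but non-linear $u$, the approximation error shrinks as the step size $k$ is refined.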

Influence maximization is the task of selecting a small number of seed nodes in a social network to maximize the spread of the influence from these seeds, and it has been widely investigated in the past two decades. In the canonical setting, the whole social network as well as its diffusion parameters is given as input. In this paper, we consider the more realistic sampling setting where the network is unknown and we only have a set of passively observed cascades that record the set of activated nodes at each diffusion step. We study the task of influence maximization from these cascade samples (IMS), and present constant approximation algorithms for this task under mild conditions on the seed set distribution. To achieve the optimization goal, we also provide a novel solution to the network inference problem, that is, learning diffusion parameters and the network structure from the cascade data. Compared with prior solutions, our network inference algorithm requires weaker assumptions and does not rely on maximum-likelihood estimation and convex programming. Our IMS algorithms enhance the learning-and-then-optimization approach by allowing a constant approximation ratio even when the diffusion parameters are hard to learn, and we do not need any assumption related to the network structure or diffusion parameters.

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m}f_i(\mathbf{x})$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, and improvement of the condition numbers.
