
In this paper, we propose two novel inertial-like algorithms for solving the split common null point problem (SCNPP) with respect to set-valued maximal monotone operators. The presented algorithms have two main features: they use a new inertial structure (i.e., the design of the new inertial-like method neither requires computing the norm of the difference between $x_n$ and $x_{n-1}$ in advance, nor requires a special choice of the inertial parameter $\theta_n$ to make the condition $\sum_{n=1}^\infty \theta_n\|x_n-x_{n-1}\|^2<\infty$ hold), and the step sizes are selected without prior knowledge of the operator norms. Numerical experiments are presented to illustrate the performance of the algorithms.
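As a rough illustration of the two ingredients highlighted above (inertial extrapolation and an operator-norm-free step size), the following minimal Python sketch runs a generic inertial-like iteration on the special case in which both set-valued operators are subdifferentials of indicator functions of convex sets, so that their resolvents reduce to projections. The sets, the adaptive step-size rule, and the fixed inertial parameter `theta` are illustrative assumptions, not the authors' exact scheme.

```python
import numpy as np

# Minimal sketch (not the authors' exact scheme): an inertial-like iteration for
# the split problem "find x in C with Ax in Q", where both resolvents reduce to
# projections onto hypothetical sets C (a box) and Q (a ball).  The step size is
# chosen adaptively from computable residual quantities, so no prior knowledge of
# the operator norm ||A|| is required.

def project_box(x, lo, hi):                     # resolvent of the indicator of a box
    return np.clip(x, lo, hi)

def project_ball(y, center, radius):            # resolvent of the indicator of a ball
    d = y - center
    n = np.linalg.norm(d)
    return y if n <= radius else center + radius * d / n

def inertial_split_iteration(A, x0, lo, hi, center, radius, theta=0.3, n_iter=200):
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iter):
        w = x + theta * (x - x_prev)            # inertial extrapolation: w_n = x_n + theta*(x_n - x_{n-1})
        Aw = A @ w
        residual = Aw - project_ball(Aw, center, radius)
        grad = A.T @ residual                   # gradient of 0.5*||(I - P_Q) A w||^2
        g2 = grad @ grad
        tau = (residual @ residual) / g2 if g2 > 0 else 0.0   # self-adaptive step size
        x_prev, x = x, project_box(w - tau * grad, lo, hi)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
x = inertial_split_iteration(A, rng.standard_normal(3), lo=-1.0, hi=1.0,
                             center=np.zeros(5), radius=0.5)
print(np.linalg.norm(A @ x - project_ball(A @ x, np.zeros(5), 0.5)))  # split residual
```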

Related content

We advance the state of the art in Mixed-Integer Linear Programming (MILP) formulations for Guillotine 2D Cutting Problems by (i) adapting a previously known reduction to our preprocessing phase and (ii) enhancing a previous formulation by cutting down its size and symmetries. Our focus is the Guillotine 2D Knapsack Problem with orthogonal and unrestricted cuts, constrained demand, unlimited stages, and no rotation -- however, the formulation may be adapted to many related problems. The code is available. On the set of 59 instances used to benchmark the original formulation, summing the statistics over all generated models, the enhanced formulation has only a small fraction of the variables and constraints of the original model (3.07% and 8.35%, respectively). The enhanced formulation also takes about 4 hours to solve all instances, while the original formulation takes 12 hours to solve 53 of them (the other six runs each hit a three-hour time limit). We integrate into both formulations a pricing framework proposed for the original formulation; the enhanced formulation keeps a significant advantage in this setting. Finally, on a recently proposed set of 80 harder instances, the enhanced formulation (with and without the pricing framework) found: 22 optimal solutions for the unrestricted problem (5 already known, 17 new); 22 optimal solutions for the restricted problem (all new, and not the same 22 instances as the optimal unrestricted solutions); better lower bounds for 25 instances; and better upper bounds for 58 instances.

We study approximation properties of multivariate periodic functions from weighted Wiener spaces by sparse grid methods constructed with the help of quasi-interpolation operators. The class of such operators includes classical interpolation and sampling operators, Kantorovich-type operators, scaling expansions associated with wavelet constructions, and others. We obtain the rate of convergence of the corresponding sparse grid methods in weighted Wiener norms, as well as analogues of Littlewood-Paley-type characterizations in terms of families of quasi-interpolation operators.

Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in many optimization scenarios, including convex optimization instances and non-convex neural network training. Yet, in the stochastic setting, momentum interferes with gradient noise, often forcing specific step size and momentum choices in order to guarantee convergence, let alone acceleration. Proximal point methods, on the other hand, have gained much attention due to their numerical stability and robustness to imperfect tuning. Their stochastic accelerated variants, however, have received limited attention: how momentum interacts with the stability of (stochastic) proximal point methods remains largely unstudied. To address this, we focus on the convergence and stability of the stochastic proximal point algorithm with momentum (SPPAM), and show that, under proper hyperparameter tuning, SPPAM enjoys a faster linear convergence rate than the stochastic proximal point algorithm (SPPA), with a better contraction factor. In terms of stability, we show that SPPAM depends on problem constants more favorably than SGDM, allowing a wider range of step sizes and momentum values that lead to convergence.
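To make the update concrete, here is a minimal sketch of one plausible SPPAM step on a stochastic least-squares objective, where the proximal map of a single data term has a closed form. The extrapolation-then-prox ordering, the constant step size `eta`, and the momentum `beta` are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

# Minimal sketch of a stochastic proximal point step with heavy-ball momentum
# (one plausible form of SPPAM; the paper's exact update may differ).  For a
# least-squares term f_i(x) = 0.5*(a_i.x - b_i)^2 the proximal map is closed form.

def prox_least_squares(v, a, b, eta):
    # argmin_x 0.5*(a.x - b)^2 + (1/(2*eta)) * ||x - v||^2
    return v - eta * (a @ v - b) / (1.0 + eta * (a @ a)) * a

def sppam(A, b, eta=0.5, beta=0.5, n_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = x = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)                 # sample one data point
        y = x + beta * (x - x_prev)         # momentum extrapolation
        x_prev, x = x, prox_least_squares(y, A[i], b[i], eta)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
print(np.linalg.norm(sppam(A, b) - x_true))  # distance to the true solution
```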

We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured. The key to our analysis is a new technique called implicit finite-horizon approximation, which approximates the SSP model by a finite-horizon counterpart only in the analysis, without explicit implementation. Using this template, we develop two new algorithms: the first one is model-free (to our knowledge, the first in the literature) and minimax optimal under strictly positive costs; the second one is model-based and minimax optimal even with zero-cost state-action pairs, matching the best existing result from [Tarbouriech et al., 2021b]. Importantly, both algorithms admit highly sparse updates, making them computationally more efficient than all existing algorithms. Moreover, both can be made completely parameter-free.

In this paper, the Divide-and-Conquer method is applied to, and assessed on, three integer optimization problems: the Multidimensional Knapsack Problem (d-KP), the Bin Packing Problem (BPP), and the Travelling Salesman Problem (TSP). For each case, the method is introduced together with the design of numerical experiments, in order to empirically establish its performance in terms of both computational time and numerical accuracy.
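As a concrete instance of a divide-and-conquer strategy on one of these problems, the sketch below applies the classical meet-in-the-middle split to a toy 0/1 knapsack. It only illustrates the divide/solve/combine pattern and is not the particular scheme assessed in the paper.

```python
from itertools import combinations
import bisect

# One classical divide-and-conquer strategy ("meet in the middle") for a toy 0/1
# knapsack: split the items into two halves, enumerate all subsets of each half,
# then combine the halves.

def enumerate_half(items):
    """Return all (weight, value) pairs over subsets of `items` (list of (w, v))."""
    pairs = []
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            pairs.append((sum(w for w, _ in combo), sum(v for _, v in combo)))
    return pairs

def knapsack_meet_in_middle(items, capacity):
    half = len(items) // 2
    left = enumerate_half(items[:half])
    right = sorted(enumerate_half(items[half:]))           # sort right half by weight
    weights, best, running = [], [], 0                      # prefix maxima of values
    for w, v in right:
        running = max(running, v)
        weights.append(w)
        best.append(running)
    answer = 0
    for wl, vl in left:
        if wl > capacity:
            continue
        idx = bisect.bisect_right(weights, capacity - wl)   # heaviest right subset that still fits
        answer = max(answer, vl + (best[idx - 1] if idx else 0))
    return answer

items = [(3, 4), (4, 5), (7, 10), (8, 11), (9, 13)]         # (weight, value)
print(knapsack_meet_in_middle(items, capacity=17))           # 24 on this toy instance
```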

Efficiently approximating local curvature information of the loss function is a key tool for optimization and compression of deep neural networks. Yet, most existing methods to approximate second-order information have high computational or storage costs, which can limit their practicality. In this work, we investigate matrix-free, linear-time approaches for estimating Inverse-Hessian Vector Products (IHVPs) for the case when the Hessian can be approximated as a sum of rank-one matrices, as in the classic approximation of the Hessian by the empirical Fisher matrix. We propose two new algorithms as part of a framework called M-FAC: the first algorithm is tailored towards network compression and can compute the IHVP for dimension $d$, if the Hessian is given as a sum of $m$ rank-one matrices, using $O(dm^2)$ precomputation, $O(dm)$ cost for computing the IHVP, and query cost $O(m)$ for any single element of the inverse Hessian. The second algorithm targets an optimization setting, where we wish to compute the product between the inverse Hessian, estimated over a sliding window of optimization steps, and a given gradient direction, as required for preconditioned SGD. We give an algorithm with cost $O(dm + m^2)$ for computing the IHVP and $O(dm + m^3)$ for adding or removing any gradient from the sliding window. These two algorithms yield state-of-the-art results for network pruning and optimization with lower computational overhead relative to existing second-order methods. Implementations are available at [9] and [17].
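For the rank-one structure underlying the first setting, a direct (unoptimized) way to form an IHVP is the Woodbury identity. The sketch below uses it for a damped empirical Fisher $H \approx \lambda I + \frac{1}{m}\sum_i g_i g_i^{\top}$ at $O(dm^2 + m^3)$ cost; this is not the authors' $O(dm)$-per-query M-FAC algorithm, but it shows what the query computes.

```python
import numpy as np

# Minimal sketch: inverse-Hessian-vector product when the Hessian is approximated
# by a damped empirical Fisher, H = lam*I + (1/m) * sum_i g_i g_i^T, via the
# Woodbury identity.  Only the m x m system is solved explicitly.

def ihvp_woodbury(G, v, lam=1e-3):
    """G: (m, d) matrix whose rows are the gradients g_i; returns H^{-1} v."""
    m = G.shape[0]
    S = m * lam * np.eye(m) + G @ G.T          # small m x m system
    return (v - G.T @ np.linalg.solve(S, G @ v)) / lam

# quick self-check against an explicit inverse on a tiny problem
rng = np.random.default_rng(0)
m, d = 8, 20
G = rng.standard_normal((m, d))
v = rng.standard_normal(d)
H = 1e-3 * np.eye(d) + (G.T @ G) / m
assert np.allclose(ihvp_woodbury(G, v), np.linalg.solve(H, v))
```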

We propose a deterministic Kaczmarz algorithm for solving linear systems $A\mathbf{x}=\mathbf{b}$. Unlike previous Kaczmarz algorithms, ours uses reflections in each step of the iteration. This generates a series of points distributed in patterns on a sphere centered at a solution. Firstly, we prove that taking the average of $O(\eta/\epsilon)$ points leads to an effective approximation of the solution up to relative error $\epsilon$, where $\eta$ is a parameter depending on $A$ that can be bounded above by the square of the condition number. We also show how to select these points efficiently. In numerical tests, our Kaczmarz algorithm usually converges more quickly than the (block) randomized Kaczmarz algorithms. Secondly, when the linear system is consistent, the Kaczmarz algorithm returns the solution that has the minimal distance to the initial vector; this gives a method for solving the least-norm problem. Finally, we prove that our Kaczmarz algorithm indeed solves the linear system $A^TW^{-1}A\mathbf{x} = A^TW^{-1}\mathbf{b}$, where $W$ is the lower-triangular matrix such that $W+W^T=2AA^T$. The relationship between this linear system and the original one is studied.
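The reflection step itself is simple to state: replace the usual Kaczmarz projection onto the hyperplane $a_i^{\top}x=b_i$ by the reflection across it, and average the resulting iterates. The sketch below does this with a plain cyclic sweep and a running average; the sweep order and averaging window are illustrative choices, not necessarily those analyzed in the paper.

```python
import numpy as np

# Minimal sketch of a reflection-based Kaczmarz sweep: instead of projecting onto
# each hyperplane a_i . x = b_i, reflect across it, then average the iterates.

def reflective_kaczmarz(A, b, n_sweeps=200, x0=None):
    m, d = A.shape
    x = np.zeros(d) if x0 is None else x0.copy()
    running_sum, count = np.zeros(d), 0
    row_norm2 = np.einsum('ij,ij->i', A, A)
    for _ in range(n_sweeps):
        for i in range(m):                          # deterministic cyclic order
            r = A[i] @ x - b[i]
            x = x - 2.0 * r / row_norm2[i] * A[i]   # reflection (factor 2, not 1)
            running_sum += x
            count += 1
    return running_sum / count                      # average of the iterates

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = rng.standard_normal(10)
b = A @ x_true                                      # consistent system
print(np.linalg.norm(reflective_kaczmarz(A, b) - x_true))
```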

In this work, we propose three novel block-structured multigrid relaxation schemes based on distributive relaxation, Braess-Sarazin relaxation, and Uzawa relaxation, for solving the Stokes equations discretized by the marker-and-cell (MAC) scheme. In our earlier work \cite{he2018local}, we discussed these three types of relaxation schemes, where a weighted Jacobi iteration is used to approximate the inverse of the Laplacian involved in the Stokes equations. In \cite{he2018local}, we showed that the optimal smoothing factor is $\frac{3}{5}$ for distributive weighted-Jacobi relaxation and inexact Braess-Sarazin relaxation, and $\sqrt{\frac{3}{5}}$ for $\sigma$-Uzawa relaxation. Here, we propose mass-based approximations within these three relaxations, where the mass matrix $Q$ obtained from the bilinear finite element method is used directly to approximate the inverse of the scalar Laplacian operator, instead of a Jacobi iteration. Using local Fourier analysis, we theoretically derive the optimal smoothing factors of the resulting three relaxation schemes. Specifically, mass-based distributive relaxation, mass-based Braess-Sarazin relaxation, and mass-based $\sigma$-Uzawa relaxation have optimal smoothing factors $\frac{1}{3}$, $\frac{1}{3}$, and $\sqrt{\frac{1}{3}}$, respectively. The mass-based relaxation schemes cost no more than the original Jacobi-based ones, and they avoid computing the inverse of a matrix, which makes these new relaxation schemes appealing.
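To illustrate the local Fourier analysis machinery on the simplest possible target, the sketch below computes the LFA smoothing factor of weighted Jacobi for the scalar 5-point Laplacian, recovering the classical value $3/5$ at $\omega = 4/5$. The coupled symbols of the block relaxations for the MAC Stokes system are not reproduced here.

```python
import numpy as np

# Minimal sketch of a local Fourier analysis (LFA) smoothing-factor computation,
# for weighted Jacobi applied to the scalar 5-point Laplacian only: the smoothing
# factor is the maximum of the error-propagation symbol over the high frequencies.

def jacobi_smoothing_factor(omega, n_samples=400):
    theta = np.linspace(-np.pi, np.pi, n_samples, endpoint=False)
    T1, T2 = np.meshgrid(theta, theta)
    symbol = 1.0 - omega * (4.0 - 2.0 * np.cos(T1) - 2.0 * np.cos(T2)) / 4.0
    high = (np.abs(T1) >= np.pi / 2) | (np.abs(T2) >= np.pi / 2)   # high frequencies
    return np.max(np.abs(symbol[high]))

# the classical optimum omega = 4/5 gives smoothing factor 3/5
print(jacobi_smoothing_factor(0.8))   # ~0.6
```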

Many force-gradient explicit symplectic integration algorithms have been designed for the Hamiltonian $H=T(\mathbf{p})+V(\mathbf{q})$ with kinetic energy $T(\mathbf{p})=\mathbf{p}^2/2$ in the existing literature. When the force-gradient operator is appropriately adjusted as a new operator, they remain suitable for a class of Hamiltonian problems $H=K(\mathbf{p},\mathbf{q})+V(\mathbf{q})$ with \emph{integrable} part $K(\mathbf{p},\mathbf{q}) = \sum_{i=1}^{n} \sum_{j=1}^{n}a_{ij}p_ip_j+\sum_{i=1}^{n} b_ip_i$, where $a_{ij}=a_{ij}(\mathbf{q})$ and $b_i=b_i(\mathbf{q})$ are functions of the coordinates $\mathbf{q}$. The newly adjusted operator is not a force-gradient operator but is similar to the momentum-version operator associated with the potential $V$. The newly extended (or adjusted) algorithms are no longer solvers of the original Hamiltonian, but of slightly modified Hamiltonians. They are explicit symplectic integrators with symmetry or time reversibility. Numerical tests show that the standard symplectic integrators without the new operator are generally inferior, in computational accuracy and efficiency, to the corresponding extended methods with the new operator. The optimized methods have better accuracy than their non-optimized counterparts. Among the tested symplectic methods, the two extended optimized seven-stage fourth-order methods of Omelyan, Mryglod and Folk exhibit the best numerical performance. As a result, one of the two optimized algorithms is used to study the orbital dynamics of a modified H\'{e}non-Heiles system and a spring pendulum. These extended integrators also allow for integrations of Hamiltonian problems such as the spiral structure in self-consistent models of rotating galaxies and the spiral arms in galaxies.
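For reference, the basic building block of such explicit symplectic schemes is the kick-drift-kick (velocity-Verlet) map for $H=T(\mathbf{p})+V(\mathbf{q})$. The sketch below applies it to the classical (unmodified) Hénon-Heiles potential and checks the energy drift; the higher-order force-gradient and extended methods discussed above are compositions of such maps with an additional gradient-dependent kick and are not reproduced here.

```python
import numpy as np

# Minimal sketch: a second-order explicit symplectic (velocity-Verlet) step for a
# separable Hamiltonian H = p^2/2 + V(q), shown on the classical Henon-Heiles
# potential V(x, y) = 0.5*(x^2 + y^2) + x^2*y - y^3/3.

def henon_heiles_force(q):
    x, y = q
    return -np.array([x + 2.0 * x * y, y + x * x - y * y])   # -grad V

def verlet_step(q, p, h, force):
    p = p + 0.5 * h * force(q)   # half kick
    q = q + h * p                # drift
    p = p + 0.5 * h * force(q)   # half kick
    return q, p

def energy(q, p):
    x, y = q
    return 0.5 * (p @ p) + 0.5 * (x * x + y * y) + x * x * y - y ** 3 / 3.0

q, p = np.array([0.0, 0.1]), np.array([0.5, 0.0])
e0 = energy(q, p)
for _ in range(10000):
    q, p = verlet_step(q, p, 0.01, henon_heiles_force)
print("relative energy drift:", abs(energy(q, p) - e0) / abs(e0))
```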

We introduce a new neural architecture to learn the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existing approaches such as sequence-to-sequence models and Neural Turing Machines, because the number of target classes in each step of the output depends on the length of the input, which is variable. Problems such as sorting variable-sized sequences, and various combinatorial optimization problems, belong to this class. Our model solves the problem of variable-size output dictionaries using a recently proposed mechanism of neural attention. It differs from previous attention attempts in that, instead of using attention to blend hidden units of an encoder into a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output. We call this architecture a Pointer Net (Ptr-Net). We show Ptr-Nets can be used to learn approximate solutions to three challenging geometric problems -- finding planar convex hulls, computing Delaunay triangulations, and the planar Travelling Salesman Problem -- using training examples alone. Ptr-Nets not only improve over sequence-to-sequence with input attention, but also allow us to generalize to variable-size output dictionaries. We show that the learnt models generalize beyond the maximum lengths they were trained on. We hope our results on these tasks will encourage a broader exploration of neural learning for discrete problems.
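The core pointing mechanism can be written in a few lines: attention scores over the encoder states are used directly, after a softmax, as the output distribution over input positions. The following NumPy sketch shows that step with illustrative shapes and randomly initialized parameters `W1`, `W2`, `v` (the surrounding encoder/decoder and training loop are omitted).

```python
import numpy as np

# Minimal sketch of the pointer attention step: instead of mixing encoder states
# into a context vector, the attention scores themselves become the output
# distribution over input positions (u_j = v . tanh(W1 e_j + W2 d_i), p = softmax(u)).

def pointer_attention(enc_states, dec_state, W1, W2, v):
    """enc_states: (n, h) encoder hidden states; dec_state: (h,) decoder state.
    Returns a probability distribution over the n input positions."""
    scores = np.tanh(enc_states @ W1.T + dec_state @ W2.T) @ v
    scores -= scores.max()                       # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

rng = np.random.default_rng(0)
n, h = 6, 16                                     # 6 input positions, hidden size 16
enc = rng.standard_normal((n, h))
dec = rng.standard_normal(h)
W1, W2, v = rng.standard_normal((h, h)), rng.standard_normal((h, h)), rng.standard_normal(h)
p = pointer_attention(enc, dec, W1, W2, v)
print(p, p.argmax())                             # the "pointer": most probable input position
```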
