
Standard perturbation theory for eigenvalue problems approximates the eigenmodes of a Hamiltonian in the neighborhood of one whose eigenmodes are known. However, when the eigenmodes of several nearby Hamiltonians are known, standard perturbation theory cannot use all of this knowledge simultaneously to obtain a better approximation. We derive a formula that enables such an approximation and provide numerical examples in which this method is more competitive than standard perturbation theory.
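For reference, the single-Hamiltonian expansion that such a multi-point formula would generalize is the standard first-order (Rayleigh-Schrödinger) correction; the multi-point formula of the paper is not reproduced here:

$$
H = H_0 + V, \qquad
|\psi_n\rangle \approx |\psi_n^{(0)}\rangle
+ \sum_{m \neq n} \frac{\langle \psi_m^{(0)} | V | \psi_n^{(0)} \rangle}{E_n^{(0)} - E_m^{(0)}}\, |\psi_m^{(0)}\rangle .
$$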

Related Content

In this work, we revisit the problem of solving large-scale semidefinite programs using randomized first-order methods and stochastic smoothing. We introduce two oblivious stochastic mirror descent algorithms based on a complementary composite setting. One algorithm is designed for non-smooth objectives, while an accelerated version is tailored for smooth objectives. Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function. For the non-smooth case with $\mathcal{M}$-bounded oracles, we prove a convergence rate of $O(\mathcal{M}/\sqrt{T})$. For the $L$-smooth case with a feasible set bounded by $D$, we derive a convergence rate of $O\big(L^2 D^2/(T^{2}\sqrt{T}) + (D_0^2+\sigma^2)/\sqrt{T}\big)$, where $D_0$ is the starting distance to an optimal solution and $\sigma^2$ is the variance of the stochastic oracle. So far, these rates had only been obtained by assuming prior knowledge of either the Lipschitz constant or the starting distance to an optimal solution. We further show how to extend our framework to the relative-scale setting and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
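For orientation, a minimal (non-oblivious) entropic mirror descent step on the spectrahedron $\{X \succeq 0,\ \mathrm{tr}\,X = 1\}$, the standard mirror map for semidefinite programs, might look as follows; the oblivious step-size scheme and the stochastic oracles of the paper are not reproduced here, and the function below is only an illustrative sketch.

```python
import numpy as np

def entropic_md_step(X, grad, eta):
    """One matrix entropic mirror descent step on {X >= 0, tr X = 1}.

    Generic sketch: X is assumed symmetric positive definite (e.g. initialized
    as I/d), grad is a symmetric (sub)gradient estimate, eta is the step size.
    """
    # Mirror step in the dual: log X - eta * grad, via the eigendecomposition.
    w, V = np.linalg.eigh(X)
    Y = (V * np.log(np.maximum(w, 1e-300))) @ V.T - eta * grad
    # Map back with the matrix exponential, normalized to unit trace.
    mu, U = np.linalg.eigh((Y + Y.T) / 2)
    mu = np.exp(mu - mu.max())               # stabilize before exponentiating
    X_new = (U * mu) @ U.T
    return X_new / np.trace(X_new)
```

Initializing at $X_0 = I/d$ gives the maximum-entropy starting point typically used with this mirror map.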

Gradient methods have become mainstream techniques for Bi-Level Optimization (BLO) in learning fields. The validity of existing works relies heavily on either a restrictive Lower-Level Strong Convexity (LLSC) condition, on solving a series of approximation subproblems with high accuracy, or both. In this work, by averaging the upper- and lower-level objectives, we propose a single-loop Bi-level Averaged Method of Multipliers (sl-BAMM) for BLO that is simple yet efficient for large-scale BLO and removes the limiting LLSC restriction. We further provide a non-asymptotic convergence analysis of sl-BAMM towards KKT stationary points, and the comparative advantage of our analysis lies in the absence of the strong gradient boundedness assumption, which other analyses always require. Thus, our theory safely captures a wider variety of applications in deep learning, especially those where the upper-level objective is quadratic w.r.t. the lower-level variable. Experimental results demonstrate the superiority of our method.
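For context, a bare-bones single-loop scheme alternates one lower-level gradient step with one upper-level step per iteration, as in the sketch below; this is only a generic baseline and omits the averaged objectives and multiplier updates that define sl-BAMM.

```python
import numpy as np

def single_loop_bilevel(grad_f_x, grad_g_y, x0, y0,
                        alpha=1e-2, beta=1e-2, iters=1000):
    """Minimal single-loop bilevel baseline (NOT sl-BAMM).

    grad_f_x: gradient of the upper-level objective f w.r.t. x,
    grad_g_y: gradient of the lower-level objective g w.r.t. y;
    both take (x, y).  The implicit-gradient correction is ignored here.
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(iters):
        y = y - alpha * grad_g_y(x, y)   # one lower-level descent step
        x = x - beta * grad_f_x(x, y)    # one (approximate) upper-level step
    return x, y
```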

Singularly perturbed problems present inherent difficulty due to the presence of a thin boundary layer in their solutions. To overcome this difficulty, we propose using deep operator networks (DeepONets), a method previously shown to be effective in approximating nonlinear operators between infinite-dimensional Banach spaces. In this paper, we demonstrate for the first time the application of DeepONets to one-dimensional singularly perturbed problems, achieving promising results that suggest their potential as a robust tool for solving this class of problems. We consider the convergence rate of the approximation error incurred by the operator networks in approximating the solution operator, and examine the generalization gap and empirical risk, all of which are shown to converge uniformly with respect to the perturbation parameter. By using Shishkin mesh points as the locations at which the loss function is evaluated, we conduct several numerical experiments that provide further support for the effectiveness of operator networks in capturing the singular boundary layer behavior.
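A standard piecewise-uniform Shishkin mesh of the kind used as loss-evaluation points can be generated as follows. This is the generic construction for a boundary layer at $x=1$; the transition-parameter constant `sigma` and the layer location used in the paper's experiments may differ.

```python
import numpy as np

def shishkin_mesh(N, eps, sigma=2.0):
    """Piecewise-uniform Shishkin mesh on [0, 1] for a boundary layer at x = 1.

    N must be even: half of the mesh intervals are placed uniformly on the
    layer region [1 - tau, 1], the other half on [0, 1 - tau].
    """
    assert N % 2 == 0, "N must be even"
    tau = min(0.5, sigma * eps * np.log(N))          # transition point
    coarse = np.linspace(0.0, 1.0 - tau, N // 2 + 1)  # outside the layer
    fine = np.linspace(1.0 - tau, 1.0, N // 2 + 1)    # inside the layer
    return np.concatenate([coarse, fine[1:]])
```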

We consider the problem of approximating a $d \times d$ covariance matrix $M$ with a rank-$k$ matrix under $(\varepsilon,\delta)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and show that the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$ is bounded by roughly $\tilde{O}(\sqrt{kd})$, whenever there is an appropriately large gap between the $k$-th and the $(k+1)$-th eigenvalues of $M$. This improves on previous work, which requires the gap between every pair of the top-$k$ eigenvalues of $M$ to be at least $\sqrt{d}$ for a similar bound. Our analysis leverages the fact that the eigenvalues of complex matrix Brownian motion repel more than in the real case, and uses Dyson's stochastic differential equations governing the evolution of its eigenvalues to show that the eigenvalues of the matrix $M$ perturbed by complex Gaussian noise have large gaps with high probability. Our results contribute to the analysis of low-rank approximations under average-case perturbations and to an understanding of eigenvalue gaps for random matrices, which may be of independent interest.
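The shape of such a mechanism, if not its exact noise calibration, is: perturb $M$ with a Hermitian complex Gaussian matrix and return the best rank-$k$ approximation of the result. In the sketch below, the noise scale `sigma` is a placeholder and would have to be calibrated to the sensitivity and to $(\varepsilon,\delta)$ as in the paper.

```python
import numpy as np

def noisy_rank_k(M, k, sigma, rng=None):
    """Sketch: complex Gaussian perturbation followed by rank-k truncation.

    sigma is a placeholder noise scale; the (eps, delta)-DP calibration from
    the paper is not reproduced here.  M is assumed real symmetric.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = M.shape[0]
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    E = sigma * (G + G.conj().T) / 2            # Hermitian complex Gaussian noise
    w, V = np.linalg.eigh(M + E)                # spectrum of the perturbed matrix
    top = np.argsort(np.abs(w))[::-1][:k]       # k eigenvalues largest in magnitude
    return (V[:, top] * w[top]) @ V[:, top].conj().T
```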

It is well known that the Euler method for approximating the solutions of a random ordinary differential equation $\mathrm{d}X_t/\mathrm{d}t = f(t, X_t, Y_t)$ driven by a stochastic process $\{Y_t\}_t$ with $\theta$-H\"older sample paths is estimated to be of strong order $\theta$ with respect to the time step, provided $f=f(t, x, y)$ is sufficiently regular and satisfies suitable bounds. Here, it is proved that, in many typical cases, further conditions on the noise can be exploited so that the strong convergence is actually of order 1, regardless of the H\"older regularity of the sample paths. This applies for instance to additive or multiplicative It\^o process noises (such as Wiener, Ornstein-Uhlenbeck, and geometric Brownian motion processes); to point-process noises (such as Poisson point processes and Hawkes self-exciting processes, which even have jump-type discontinuities); and to transport-type processes with sample paths of bounded variation. The result is based on a novel approach, estimating the global error as an iterated integral over both large and small mesh scales, and switching the order of integration to move the critical regularity to the large scale. The work is complemented with numerical simulations illustrating the strong order 1 convergence in those cases, and with an example with fractional Brownian motion noise with Hurst parameter $0 < H < 1/2$ for which the order of convergence is $H + 1/2$, hence lower than the attained order 1 in the examples above, but still higher than the order $H$ of convergence expected from previous works.
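The scheme being analyzed is the classical Euler method applied pathwise; a minimal sketch, with the driving noise supplied as a pre-sampled path on the time grid, is shown below.

```python
import numpy as np

def euler_rode(f, x0, y_path, t):
    """Euler scheme for dX/dt = f(t, X, Y_t).

    t is the time grid, y_path holds samples of the driving process on that
    grid, and f(t, x, y) is the vector field (scalar state for simplicity).
    """
    x = np.empty(len(t))
    x[0] = x0
    for n in range(len(t) - 1):
        dt = t[n + 1] - t[n]
        x[n + 1] = x[n] + dt * f(t[n], x[n], y_path[n])
    return x
```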

Ferrers diagram rank-metric codes were introduced by Etzion and Silberstein in 2009. In their work, they proposed a conjecture on the largest dimension of a space of matrices over a finite field whose nonzero elements are supported on a given Ferrers diagram and all have rank bounded below by a fixed positive integer $d$. Since it was stated, the Etzion-Silberstein conjecture has been verified in a number of cases, often under additional constraints on the field size or on the minimum rank $d$ depending on the corresponding Ferrers diagram. As of today, the conjecture remains widely open. Using modular methods, we give a constructive proof of the Etzion-Silberstein conjecture for the class of strictly monotone Ferrers diagrams, which does not depend on the minimum rank $d$ and holds over every finite field. In addition, we leverage the latter result to also prove the conjecture for the class of MDS-constructible Ferrers diagrams, without requiring any restriction on the field size.

In this paper, we are concerned with efficiently solving the sequences of regularized linear least squares problems that arise when Tikhonov-type regularization is employed with regularization operators designed to enforce edge recovery. An optimal regularization parameter, which balances fidelity to the data against the edge-enforcing constraint term, is typically not known a priori. This adds to the total number of regularized linear least squares problems that must be solved before the final image can be recovered. Therefore, in this paper, we determine effective multigrid preconditioners for these sequences of systems. We focus our approach on the sequences that arise from the edge-preserving method introduced in [6], where we can exploit an interpretation of the regularization term as a diffusion operator; however, our methods are also applicable in other edge-preserving settings, such as iteratively reweighted least squares problems. Particular attention is paid to the selection of the components of the multigrid preconditioner in order to achieve robustness across different ranges of the regularization parameter. In addition, we present a parameter culling approach that, when used with the L-curve heuristic, reduces the total number of solves required. We demonstrate our preconditioning and parameter culling routines on examples in computed tomography and image deblurring.
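Concretely, each member of such a sequence is a system of regularized normal equations $(A^\top A + \lambda L^\top L)x = A^\top b$. A minimal preconditioned-CG sketch is given below, with the preconditioner left as a user-supplied operator (for example, one multigrid V-cycle); the specific multigrid components constructed in the paper are not reproduced here.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def solve_tikhonov(A, L, b, lam, precond=None):
    """Solve (A^T A + lam * L^T L) x = A^T b with (preconditioned) CG.

    precond, if given, is a callable applying an approximate inverse of the
    system (e.g. a multigrid V-cycle on the diffusion-like regularization
    term); it is a placeholder, not the paper's preconditioner.
    """
    n = A.shape[1]
    matvec = lambda x: A.T @ (A @ x) + lam * (L.T @ (L @ x))
    op = LinearOperator((n, n), matvec=matvec)
    M = LinearOperator((n, n), matvec=precond) if precond is not None else None
    x, info = cg(op, A.T @ b, M=M)
    return x
```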

We prove the first polynomial separation between randomized and deterministic time-space tradeoffs of multi-output functions. In particular, we present a total function that, on an input of $n$ elements in $[n]$, outputs $O(n)$ elements, such that: (1) There exists a randomized oblivious algorithm with space $O(\log n)$, time $O(n\log n)$ and one-way access to randomness, that computes the function with probability $1-O(1/n)$; (2) Any deterministic oblivious branching program with space $S$ and time $T$ that computes the function must satisfy $T^2S\geq\Omega(n^{2.5}/\log n)$. This implies that logspace randomized algorithms for multi-output functions cannot be black-box derandomized without an $\widetilde{\Omega}(n^{1/4})$ overhead in time. Since all previous polynomial time-space tradeoffs for multi-output functions were proved via the Borodin-Cook method, which is a probabilistic method that inherently gives the same lower bound for randomized and deterministic branching programs, our lower bound proof is intrinsically different from previous works. We also examine other natural candidates for proving such separations, and show that any polynomial separation for these problems would resolve the long-standing open problem of proving an $n^{1+\Omega(1)}$ time lower bound for decision problems with $\mathrm{polylog}(n)$ space.

We revisit the problem of spurious modes that are sometimes encountered in partial differential equation discretizations. It is generally suspected that one cause of spurious modes is the treatment of boundary conditions, and we use this as the starting point of our investigation. By regarding boundary conditions as algebraic constraints on a differential equation, we point out that any differential equation with homogeneous boundary conditions also admits a typically infinite number of hidden or implicit boundary conditions. In most discretization schemes these additional implicit boundary conditions are violated, and we argue that this is what leads to the emergence of spurious modes. These observations motivate two definitions of the quality of computed eigenvalues, one based on violations of derivatives of the boundary conditions and the other on the Grassmann distance between subspaces associated with computed eigenspaces. Both tests rely on a standardized treatment of boundary conditions and do not require a priori knowledge of eigenvalue locations. Their effectiveness is demonstrated on several examples known to have spurious modes. In addition, these quality tests show that in most problems about half of the computed spectrum of a differential operator is of low quality. The tests also specifically identify the low-accuracy modes, which can then be projected out as a type of model reduction scheme.
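The subspace-based quality test relies on the Grassmann distance between computed eigenspaces, which can be obtained from principal angles; a minimal generic implementation, possibly differing from the paper's exact normalization, is:

```python
import numpy as np
from scipy.linalg import subspace_angles

def grassmann_distance(U, V):
    """Geodesic Grassmann distance between the column spans of U and V.

    Computed as the Euclidean norm of the vector of principal angles
    between the two subspaces (columns of U and V spanning each subspace).
    """
    theta = subspace_angles(U, V)
    return np.linalg.norm(theta)
```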

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and the optimal solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
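The core device is the regularized max operator; with the negative-entropy regularizer it reduces to a temperature-scaled log-sum-exp whose gradient is the softmax, as in the sketch below (the smoothed Viterbi recursion replaces each max by this operator, and smoothed DTW uses the corresponding soft-min).

```python
import numpy as np
from scipy.special import logsumexp, softmax

def smoothed_max(x, gamma=1.0):
    """Negentropy-smoothed max: gamma * logsumexp(x / gamma).

    Recovers the hard max as gamma -> 0; its gradient is the softmax below,
    which is what makes DP recursions built from it differentiable.
    """
    return gamma * logsumexp(np.asarray(x) / gamma)

def smoothed_argmax(x, gamma=1.0):
    # Gradient of smoothed_max: a probability vector over the arguments.
    return softmax(np.asarray(x) / gamma)
```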
