
We study the structural and statistical properties of $\mathcal{R}$-norm minimizing interpolants of datasets labeled by specific target functions. The $\mathcal{R}$-norm is the basis of an inductive bias for two-layer neural networks, recently introduced to capture the functional effect of controlling the size of network weights, independently of the network width. We find that these interpolants are intrinsically multivariate functions, even when there are ridge functions that fit the data, and also that the $\mathcal{R}$-norm inductive bias is not sufficient for achieving statistically optimal generalization for certain learning problems. Altogether, these results shed new light on an inductive bias that is connected to practical neural network training.
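For context, a common variational characterization of the $\mathcal{R}$-norm for two-layer ReLU networks is the following (this display is our addition, following the usual infinite-width formulation; the precise normalization may differ from the paper's):

\[
\|f\|_{\mathcal{R}} \;=\; \inf\Big\{\, \|\mu\|_{\mathrm{TV}} \;:\; f(x) = \int_{\mathbb{S}^{d-1}\times\mathbb{R}} \big[\langle w, x\rangle - b\big]_+ \, d\mu(w,b) \;+\; \langle v, x\rangle + c \,\Big\}.
\]

Since $a\,[\langle w, x\rangle - b]_+ = a\,\|w\|_2\,\big[\langle w/\|w\|_2, x\rangle - b/\|w\|_2\big]_+$, a finite-width network contributes total-variation mass $\sum_i |a_i|\,\|w_i\|_2$, so the $\mathcal{R}$-norm bounds the total weight size independently of the network width, as described above.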

Related content

To enhance solution accuracy and training efficiency in neural network approximations to partial differential equations, partitioned neural networks can be used as a solution surrogate instead of a single large, deep neural network defined on the whole problem domain. In such a partitioned neural network approach, suitable interface or subdomain boundary conditions are combined to obtain a convergent approximate solution. However, there has been no rigorous study of the convergence of the partitioned neural network approach or of how to enhance its parallel computation. In this paper, iterative algorithms are proposed to address these issues. Our algorithms are based on classical additive Schwarz domain decomposition methods, as sketched below. Numerical results are included to show the performance of the proposed iterative algorithms.
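Since the paper's algorithms are stated to build on classical additive Schwarz methods, the following minimal sketch illustrates the classical (damped) additive Schwarz iteration on a 1D Poisson finite difference problem. The subdomain split, overlap, and damping factor are illustrative choices of ours, not the paper's setup, which replaces local solvers with subdomain neural networks:

```python
import numpy as np

n = 99                      # interior grid points for -u'' = f on (0,1), u(0)=u(1)=0
h = 1.0 / (n + 1)
xg = np.linspace(h, 1.0 - h, n)
f = np.ones(n)              # right-hand side f = 1; exact solution u = x(1-x)/2
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

dom1 = np.arange(0, 60)     # two overlapping subdomains (overlap: indices 40..59)
dom2 = np.arange(40, n)

u = np.zeros(n)
for it in range(200):
    r = f - A @ u                              # global residual
    du = np.zeros(n)
    for dom in (dom1, dom2):
        Adom = A[np.ix_(dom, dom)]             # local Dirichlet problem on the subdomain
        du[dom] += np.linalg.solve(Adom, r[dom])
    u += 0.5 * du                              # damped additive update (damping ensures convergence)

print("max error:", np.max(np.abs(u - xg * (1.0 - xg) / 2.0)))
```

The additive structure lets both subdomain solves run in parallel within each outer iteration, which is the parallel-computing benefit the paper targets.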

Stencil composition uses the idea of function composition: two stencils with arbitrary orders of derivative are composed to obtain a stencil whose derivative order equals the sum of the orders of the composing stencils. In this paper, we show how stencil composition can be applied to form finite difference stencils in order to numerically solve partial differential equations (PDEs). We present various properties of stencil composition and investigate the relationship between the order of accuracy of the composed stencil and that of the composing stencils. We also compare the stability restrictions of composed stencils for higher-order PDEs with those of their compact counterparts, and report numerical experiments in which we verify the order of accuracy by convergence tests. To demonstrate an application to PDEs, a boundary value problem involving the two-dimensional biharmonic equation is solved numerically using stencil composition, and the order of accuracy is verified by a convergence test. The method is then applied to the Cahn-Hilliard phase-field model. In addition to sample results in 2D and 3D for this benchmark problem, the scalability, spectral properties, and sparsity are explored.
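The core idea can be illustrated in a few lines: applying one finite difference stencil to the output of another amounts to discretely convolving their coefficient arrays, and the derivative orders add. A minimal sketch (our own illustration, not code from the paper), composing the standard second-derivative stencil with itself to obtain the classical fourth-derivative stencil:

```python
import numpy as np

d2 = np.array([1.0, -2.0, 1.0])   # 2nd-order stencil for d^2/dx^2 (scaled by 1/h^2 on use)
d4 = np.convolve(d2, d2)          # composition = convolution; orders add: 2 + 2 = 4
print(d4)                         # [ 1. -4.  6. -4.  1.] -- the classical 5-point stencil

# convergence check on u(x) = sin(x), whose 4th derivative is sin(x)
x0 = 1.0
for h in (0.1, 0.05, 0.025):
    x = x0 + np.arange(-2, 3) * h          # 5 symmetric points around x0
    approx = d4 @ np.sin(x) / h**4
    print(h, abs(approx - np.sin(x0)))     # error shrinks ~4x per halving of h (2nd order)
```

The same construction extends to 2D via a 2D convolution, e.g. composing two discrete Laplacian stencils to obtain a 13-point biharmonic stencil.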

We develop the no-propagate algorithm for sampling the linear response of random dynamical systems, that is, non-uniformly hyperbolic deterministic systems perturbed by noise with smooth density. We first derive a Monte Carlo-type formula and then the algorithm, which differs from ensemble (stochastic gradient) algorithms, finite-element algorithms, and fast-response algorithms: it does not involve the propagation of vectors or covectors, and only the density of the noise is differentiated, so the formula is not cursed by gradient explosion, dimensionality, or non-hyperbolicity. We demonstrate our algorithm on a tent map perturbed by noise and on a chaotic neural network with 51 layers $\times$ 9 neurons. By itself, this algorithm approximates the linear response of non-hyperbolic deterministic systems, with an additional error proportional to the noise. We also discuss its potential use as part of a larger algorithm with smaller error.
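To convey the flavor of differentiating only the noise density, here is a minimal likelihood-ratio-style sketch for a noisy tent map. The map, Gaussian noise, observable, truncation window, and all parameter values are illustrative assumptions of ours, not the paper's exact formula; note that no tangent vectors are propagated anywhere:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, gamma, W, N = 0.05, 1.9, 12, 10**6   # noise std, map slope, window, samples

def T(x, g):
    return g * np.minimum(x, 1.0 - x)        # tent map with slope parameter g

def dT_dgamma(x):
    return np.minimum(x, 1.0 - x)            # derivative of the map w.r.t. gamma

# simulate one long noisy orbit, recording the injected Gaussian noise
x = np.empty(N + 1); x[0] = 0.3
xi = sigma * rng.standard_normal(N)
for n in range(N):
    x[n + 1] = (T(x[n], gamma) + xi[n]) % 1.0  # wraps are rare at this sigma and ignored below

# per-step score: d/dgamma log p(x_{n+1} - T_gamma(x_n)) for Gaussian density p
score = dT_dgamma(x[:-1]) * xi / sigma**2

Phi = x[1:]                  # observable Phi(x) = x along the orbit
Phi = Phi - Phi.mean()       # score has zero mean, so centering only reduces variance

# truncated estimator: d/dgamma E[Phi] ~ sum over the last W steps of E[Phi * score]
resp = 0.0
for k in range(W):
    resp += np.mean(Phi[k:] * score[:N - k])
print("estimated linear response of E[x] w.r.t. gamma:", resp)
```

A sanity check is to compare the estimate against a finite difference of the stationary average over two nearby values of $\gamma$.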

In this work we construct novel $H(\mathrm{sym} \mathrm{Curl})$-conforming finite elements for the recently introduced relaxed micromorphic sequence, which can be regarded as the completion of the $\mathrm{div} \mathrm{Div}$-sequence with respect to the $H(\mathrm{sym} \mathrm{Curl})$-space. The elements respect $H(\mathrm{Curl})$-regularity, and their lowest-order versions converge optimally for $[H(\mathrm{sym} \mathrm{Curl}) \setminus H(\mathrm{Curl})]$-fields. This work presents a detailed construction, proofs of the linear independence and conformity of the basis, and numerical examples. Further, we demonstrate an application to the computation of metamaterials with the relaxed micromorphic model.

We consider the problem of estimating the trace of a matrix function $f(A)$. In certain situations, in particular if $f(A)$ cannot be well approximated by a low-rank matrix, combining probing methods based on graph colorings with stochastic trace estimation techniques can yield accurate approximations at moderate cost. So far, however, such methods have not been thoroughly analyzed; rather, they have been used by practitioners as efficient heuristics. In this manuscript, we perform a detailed analysis of stochastic probing methods and, in particular, expose conditions under which the expected approximation error in the stochastic probing method scales more favorably with the dimension of the matrix than the error in non-stochastic probing. Extending results from [E. Aune, D. P. Simpson, J. Eidsvik, Parameter estimation in high dimensional Gaussian distributions, Stat. Comput., 24, pp. 247--263, 2014], we also characterize situations in which using just one stochastic vector is always -- not only in expectation -- better than the deterministic probing method. Several numerical experiments illustrate our theory and compare it with existing methods.
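To make the method being analyzed concrete, here is a minimal stochastic probing sketch of our own: a distance-$d$ coloring of the sparsity graph partitions the indices, and each color class receives an independent Rademacher probing vector. The matrix, coloring, and parameters are illustrative assumptions; a practical code would evaluate $f(A)z$ via Krylov methods instead of forming $f(A)$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 200
# tridiagonal 1D Laplacian: its sparsity graph is a path
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
fA = expm(-A)                        # f(A) formed densely only for this small demo

# distance-d coloring of the path graph: nodes i, j with |i - j| <= d get
# different colors; i mod (d+1) achieves this with d+1 colors
d = 4
colors = np.arange(n) % (d + 1)

def stochastic_probing_estimate(num_samples):
    # average of the colored-Rademacher probing estimator over num_samples draws
    est = 0.0
    for _ in range(num_samples):
        for c in range(d + 1):
            z = np.zeros(n)
            idx = np.flatnonzero(colors == c)
            z[idx] = rng.choice([-1.0, 1.0], size=idx.size)  # Rademacher on one color class
            est += z @ fA @ z
    return est / num_samples

print("exact trace:       ", np.trace(fA))
print("stochastic probing:", stochastic_probing_estimate(10))
```

The coloring suppresses the error contributions from entries of $f(A)$ near the diagonal, which is why this works well precisely when $f(A)$ has off-diagonal decay rather than low rank.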

This paper presents a novel approach to the construction of the lowest order $H(\mathrm{curl})$ and $H(\mathrm{div})$ exponentially-fitted finite element spaces ${\mathcal{S}_{1^-}^{k}}~(k=1,2)$ on 3D simplicial mesh for corresponding convection-diffusion problems. It is noteworthy that this method not only facilitates the construction of the functions themselves but also provides corresponding discrete fluxes simultaneously. Utilizing this approach, we successfully establish a discrete convection-diffusion complex and employ a specialized weighted interpolation to establish a bridge between the continuous complex and the discrete complex, resulting in a coherent framework. Furthermore, we demonstrate the commutativity of the framework when the convection field is locally constant, along with the exactness of the discrete convection-diffusion complex. Consequently, these types of spaces can be directly employed to devise the corresponding discrete scheme through a Petrov-Galerkin method.

The problem of finding a solution to the linear system $Ax = b$ with certain minimization properties arises in numerous scientific and engineering areas. In the era of big data, stochastic optimization algorithms have become increasingly significant due to their scalability to problems of unprecedented size. This paper focuses on the problem of minimizing a strongly convex function subject to linear constraints. We consider the dual formulation of this problem and adopt stochastic coordinate descent to solve it. The proposed algorithmic framework, called fast stochastic dual coordinate descent, utilizes sampling matrices drawn from user-defined distributions to extract gradient information. Moreover, it employs Polyak's heavy ball momentum with adaptive parameters learned through the iterations, overcoming the heavy ball method's limitation of requiring prior knowledge of certain quantities, such as the singular values of a matrix. With these extensions, the framework recovers many well-known methods, including the randomized sparse Kaczmarz method, the randomized regularized Kaczmarz method, the linearized Bregman iteration, and a variant of the conjugate gradient (CG) method. We prove that, for a strongly admissible objective function, the proposed method converges linearly in expectation. Numerical experiments are provided to confirm our results.
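One of the recovered special cases, the randomized sparse Kaczmarz method, is simple enough to sketch. The version below is the plain method without the paper's adaptive heavy ball acceleration (whose parameter rule we do not reproduce here); it targets $\min_x \lambda\|x\|_1 + \tfrac{1}{2}\|x\|_2^2$ subject to $Ax = b$ via stochastic coordinate descent on the dual:

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def randomized_sparse_kaczmarz(A, b, lam=1.0, iters=50000, seed=0):
    # dual stochastic coordinate descent; each step touches a single row of A
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.sum(A * A, axis=1)
    probs = row_norms / row_norms.sum()        # sample rows proportional to ||a_i||^2
    z = np.zeros(n)                            # dual-driven auxiliary iterate
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        z -= (A[i] @ x - b[i]) / row_norms[i] * A[i]   # Kaczmarz projection step on z
        x = soft_threshold(z, lam)                     # Bregman/shrinkage map to the primal
    return x

# illustrative run on a synthetic sparse-recovery instance
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 300))
x_true = np.zeros(300)
x_true[rng.choice(300, size=10, replace=False)] = rng.standard_normal(10)
b = A @ x_true
x_hat = randomized_sparse_kaczmarz(A, b)
print("residual norm:", np.linalg.norm(A @ x_hat - b))
```

Setting $\lambda = 0$ reduces this to the classical randomized Kaczmarz iteration, which illustrates how one framework can subsume several of the methods listed above.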

Given samples of a real- or complex-valued function on a set of distinct nodes, the traditional linear Chebyshev approximation problem is to compute the best minimax approximation from a prescribed linear functional space. Lawson's iteration is a classical and well-known method for this task. However, Lawson's iteration converges only linearly, and in many cases the convergence is very slow. In this paper, using the duality theory of linear programming, we first provide an elementary and self-contained proof of the well-known Alternation Theorem in the real case. Relying upon Lagrange duality, we further establish an $L_q$-weighted dual program for the linear Chebyshev approximation. In this framework, we revisit the convergence of Lawson's iteration and, moreover, propose a Newton-type iteration, an interior-point method, to solve the $L_2$-weighted dual program. Numerical experiments demonstrate its fast convergence and its ability to find the reference points that characterize the unique minimax approximation.
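For reference, the classical Lawson iteration discussed above is iteratively reweighted least squares with a multiplicative weight update. A minimal sketch on a discrete polynomial minimax problem (our own illustrative example, not the paper's accelerated method):

```python
import numpy as np

def lawson(x, f, degree, iters=200):
    # Lawson's iteration: the weights are driven toward the distribution under
    # which the weighted least-squares fit coincides with the minimax fit
    V = np.vander(x, degree + 1)           # polynomial basis on the nodes
    w = np.full(x.size, 1.0 / x.size)      # initial uniform weights
    for _ in range(iters):
        sw = np.sqrt(w)
        c, *_ = np.linalg.lstsq(sw[:, None] * V, sw * f, rcond=None)
        r = np.abs(f - V @ c)
        w *= r                              # Lawson's update: w_i <- w_i * |r_i|
        w /= w.sum()
    return c

x = np.linspace(-1.0, 1.0, 500)
f = np.abs(x)                               # approximate |x| by a degree-4 polynomial
c = lawson(x, f, degree=4)
print("max error:", np.max(np.abs(f - np.vander(x, 5) @ c)))
```

The final weights concentrate on a small set of nodes; these are the reference points characterizing the minimax solution, the same structure the weighted dual program exploits.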

Kleene's computability theory based on the S1-S9 computation schemes constitutes a model for computing with objects of any finite type and extends Turing's 'machine model', which formalises computing with real numbers. A fundamental distinction in Kleene's framework is between normal and non-normal functionals, where the former compute the associated Kleene quantifier $\exists^n$ and the latter do not. Historically, the focus was on normal functionals, but recently new non-normal functionals have been studied, based on well-known theorems, the weakest among which seems to be the uncountability of the reals. These new non-normal functionals are fundamentally different from historical examples like Tait's fan functional: the latter is computable from $\exists^2$, while the former are computable in $\exists^3$ but not in weaker oracles. Of course, a great divide or abyss separates $\exists^2$ and $\exists^3$, and we identify slight variations of our new non-normal functionals that are again computable in $\exists^2$, i.e. they fall on different sides of this abyss. Our examples are based on mainstream mathematical notions from real analysis, such as quasi-continuity, Baire classes, bounded variation, and semi-continuity.
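For readers outside the area, the Kleene quantifiers can be stated concretely (this display is our addition; the exact output convention varies across the literature):

\[
\exists^2(f) \;=\;
\begin{cases}
1 & \text{if } (\exists n \in \mathbb{N})\, f(n) = 0,\\
0 & \text{otherwise,}
\end{cases}
\qquad f \in \mathbb{N}^{\mathbb{N}},
\]

while $\exists^3$ is the analogous quantifier one type level up, deciding $(\exists F \in \mathbb{N}^{\mathbb{N}})\, Y(F) = 0$ for type-two inputs $Y$. This is the sense in which the two oracles sit on opposite sides of the divide mentioned above.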

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
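As background on the "existing information-theoretic bounds" referenced above, a representative classical bound, due to Xu and Raginsky, controls the expected generalization gap of an algorithm with output $W$ trained on an $n$-sample dataset $S$, for a $\sigma$-sub-Gaussian loss, by

\[
\big|\mathbb{E}[\mathrm{gen}(W, S)]\big| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W; S)}.
\]

For a deterministic algorithm with continuous outputs, $I(W;S)$ is typically infinite and the bound is vacuous; measuring the information contained in the predictions rather than in $W$ is what addresses challenge (a) above.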
