
We revisit the general framework introduced by Fazlyab et al. (SIAM J. Optim. 28, 2018) to construct Lyapunov functions for optimization algorithms in discrete and continuous time. For smooth, strongly convex objective functions, we relax the requirements necessary for such a construction. As a result we are able to prove, for Polyak's ordinary differential equation and for a two-parameter family of Nesterov algorithms, rates of convergence that improve on those available in the literature. We analyse the interpretation of Nesterov algorithms as discretizations of the Polyak equation. We show that the algorithms are instances of Additive Runge-Kutta integrators and discuss the reasons why most discretizations of the differential equation do not result in optimization algorithms with acceleration. We also introduce a modification of Polyak's equation and study its convergence properties. Finally, we extend the general framework to the stochastic scenario and consider an application to random algorithms with acceleration for overparameterized models; again we are able to prove convergence rates that improve on those in the literature.
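
To make the objects discussed above concrete, here is a minimal Python sketch of the constant-momentum Nesterov iteration for an $L$-smooth, $m$-strongly convex quadratic. The test function, the values of $m$ and $L$, and the step size $1/L$ are illustrative assumptions on our part, not choices taken from the paper.

```python
import numpy as np

# Minimal sketch (not the paper's analysis): Nesterov's constant-momentum method
# for an L-smooth, m-strongly convex quadratic f(x) = 0.5 * x^T A x.
# The matrix A and the iteration count are arbitrary illustrative choices.

A = np.diag([1.0, 10.0, 100.0])      # eigenvalues give m = 1, L = 100
grad = lambda x: A @ x
m, L = 1.0, 100.0
beta = (np.sqrt(L) - np.sqrt(m)) / (np.sqrt(L) + np.sqrt(m))   # momentum coefficient

x = y = np.array([1.0, 1.0, 1.0])
for k in range(200):
    x_new = y - grad(y) / L           # gradient step at the extrapolated point
    y = x_new + beta * (x_new - x)    # momentum (extrapolation) step
    x = x_new

print("||x_200|| =", np.linalg.norm(x))   # small: the iterates converge at an accelerated linear rate
```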

Related content

The solution approximation for partial differential equations (PDEs) can be substantially improved using smooth basis functions. The recently introduced mollified basis functions are constructed through mollification, or convolution, of cell-wise defined piecewise polynomials with a smooth mollifier of certain characteristics. The properties of the mollified basis functions are governed by the order of the piecewise polynomials and the smoothness of the mollifier. In this work, we exploit the high-order and high-smoothness properties of the mollified basis functions for solving PDEs through the point collocation method. The basis functions are evaluated at a set of collocation points in the domain. In addition, boundary conditions are imposed at a set of boundary collocation points distributed over the domain boundaries. To ensure the stability of the resulting linear system of equations, the number of collocation points is set larger than the total number of basis functions. The resulting linear system is overdetermined and is solved using the least-squares technique. The presented numerical examples confirm the convergence of the proposed approximation scheme for Poisson, linear elasticity, and biharmonic problems. In particular, we study the influence of the mollifier and of the spatial distribution of the collocation points.
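
As a sketch of the overdetermined collocation idea, the Python snippet below collocates a 1D Poisson problem at more points than there are basis functions and solves the rectangular system by least squares. A plain monomial basis and a manufactured solution stand in for the mollified basis functions and the paper's test cases.

```python
import numpy as np

# Minimal 1D illustration of overdetermined point collocation + least squares for
# -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.  Monomials replace the mollified basis;
# the basis size and number of collocation points are arbitrary choices.

n_basis, n_colloc = 8, 40
exact = lambda x: np.sin(np.pi * x)            # manufactured solution
f = lambda x: np.pi**2 * np.sin(np.pi * x)     # corresponding right-hand side

xs = np.linspace(0.0, 1.0, n_colloc)           # interior collocation points
rows, rhs = [], []
for x in xs:                                    # collocate the PDE: -u'' = f
    rows.append([-(j * (j - 1)) * x**(j - 2) if j >= 2 else 0.0 for j in range(n_basis)])
    rhs.append(f(x))
for xb in (0.0, 1.0):                           # boundary collocation points: u = 0
    rows.append([xb**j for j in range(n_basis)])
    rhs.append(0.0)

coeff, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
xt = np.linspace(0, 1, 200)
u = sum(c * xt**j for j, c in enumerate(coeff))
print("max error:", np.abs(u - exact(xt)).max())
```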

We are motivated by a study that seeks to better understand the dynamic relationship between muscle activation and paw position during locomotion. For each gait cycle in this experiment, activation in the biceps and triceps is measured continuously and in parallel with paw position as a mouse trots on a treadmill. We propose an innovative general regression method that draws from both ordinary differential equations and functional data analysis to model the relationship between these functional inputs and responses as a dynamical system that evolves over time. Specifically, our model addresses gaps in both literatures and borrows strength across curves by estimating ODE parameters across all curves simultaneously rather than separately modeling each functional observation. Our approach compares favorably to related functional data methods in simulations and in cross-validated predictive accuracy of paw position in the gait data. In the analysis of the gait cycles, we find that paw speed and position are dynamically influenced by inputs from the biceps and triceps muscles, and that the effect of muscle activation persists beyond the activation itself.
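
The following Python sketch illustrates the borrowing-strength idea on simulated data with a deliberately simple stand-in model: a single linear first-order ODE linking an input curve to a response curve, with the two ODE parameters estimated once from all curves jointly. It is not the regression model proposed in the paper.

```python
import numpy as np

# Minimal sketch (not the paper's model): estimate shared parameters (a, b) of
# y'(t) = a*y(t) + b*u(t), relating a functional input u (activation) to a functional
# response y (position), from all curves at once via least squares on finite differences.

rng = np.random.default_rng(0)
a_true, b_true, dt = -2.0, 1.5, 0.01
t = np.arange(0.0, 1.0, dt)

rows, rhs = [], []
for _ in range(20):                              # 20 simulated gait cycles
    u = np.sin(2 * np.pi * t + rng.uniform(0, 2 * np.pi))   # input curve
    y = np.zeros_like(t)
    for i in range(len(t) - 1):                  # forward-Euler simulation of the ODE
        y[i + 1] = y[i] + dt * (a_true * y[i] + b_true * u[i])
    y += 0.01 * rng.standard_normal(len(t))      # observation noise
    dy = np.gradient(y, dt)                      # finite-difference derivative
    rows.append(np.column_stack([y, u]))
    rhs.append(dy)

X, z = np.vstack(rows), np.concatenate(rhs)
a_hat, b_hat = np.linalg.lstsq(X, z, rcond=None)[0]   # one fit, all curves simultaneously
print(f"a = {a_hat:.2f} (true {a_true}), b = {b_hat:.2f} (true {b_true})")
```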

When the target of inference is a real-valued function of probability parameters in the k-sample multinomial problem, variance estimation may be challenging. In small samples, methods like the nonparametric bootstrap or delta method may perform poorly. We propose a novel general method in this setting for computing exact p-values and confidence intervals, meaning that type I error rates are correctly bounded and confidence intervals have at least nominal coverage at all sample sizes. Our method is applicable to any real-valued function of multinomial probabilities, accommodating an arbitrary number of samples with varying category counts. We describe the method and provide an implementation of it in R, with some computational optimization to ensure broad applicability. Simulations demonstrate our method's ability to maintain correct coverage rates in settings where the nonparametric bootstrap fails.
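
To illustrate what "exact" means here, the sketch below (in Python rather than R, and not the paper's algorithm) computes a p-value for a simple one-sample trinomial example by enumerating every possible outcome and taking a supremum over a grid of null parameter values, so that the type I error rate is controlled by construction up to the grid resolution and the restriction of the grid to the null boundary.

```python
import numpy as np
from itertools import product
from scipy.stats import multinomial

# Minimal sketch, not the paper's method: exact p-value by full enumeration of multinomial
# outcomes with a supremum over a grid of null parameter values.  Setting: one multinomial
# with 3 categories, H0: f(p) = p1 - p2 <= 0, statistic T = phat1 - phat2.

n = 15
obs = np.array([9, 3, 3])
t_obs = (obs[0] - obs[1]) / n

# all outcomes (n1, n2, n3) with n1 + n2 + n3 = n
outcomes = np.array([(i, j, n - i - j) for i, j in product(range(n + 1), repeat=2) if i + j <= n])
extreme = (outcomes[:, 0] - outcomes[:, 1]) / n >= t_obs   # outcomes at least as extreme

p_value = 0.0
for q in np.linspace(1e-6, 0.5 - 1e-6, 50):   # grid on the null boundary p1 = p2 = q
    probs = multinomial.pmf(outcomes, n, [q, q, 1 - 2 * q])
    p_value = max(p_value, probs[extreme].sum())

print("exact p-value (sup over grid):", round(p_value, 4))
```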

An annealing-based approach to solving partial differential equations rests on solving generalized eigenvalue problems. When a partial differential equation is discretized, it leads to a system of linear equations (SLE). Solving an SLE can be expressed as a generalized eigenvalue problem, which can in turn be converted into an optimization problem whose objective function is a generalized Rayleigh quotient. The proposed algorithm allows the computation of eigenvectors at arbitrary precision without increasing the number of variables on an Ising machine. Simple examples solved using this method, together with theoretical analysis, provide a guideline for appropriate parameter settings.
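
One standard way to see the reduction from an SLE to an eigenvalue/Rayleigh-quotient problem (a generic reformulation, not necessarily the exact construction used in the paper) is that, for a consistent system $Ax=b$, the augmented vector $[x; 1]$ spans the null space of a positive semidefinite matrix, so minimizing the corresponding Rayleigh quotient recovers the solution. A short Python check:

```python
import numpy as np

# Sketch of a standard reformulation (not necessarily the paper's): for a consistent
# system A x = b, the vector z = [x; 1] satisfies M z = 0 with
#   M = [[A^T A, -A^T b], [-b^T A, b^T b]],
# since z^T M z = ||A x - b||^2 for z = [x; 1].  Minimizing the Rayleigh quotient
# z^T M z / z^T z (the eigenvector of the smallest eigenvalue) recovers the solution.

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)
b = rng.standard_normal(4)

M = np.block([[A.T @ A, -(A.T @ b)[:, None]],
              [-(b @ A)[None, :], np.array([[b @ b]])]])
w, V = np.linalg.eigh(M)                 # smallest eigenvalue is 0 for a consistent system
z = V[:, 0]
x_eig = z[:-1] / z[-1]                   # rescale so the last component equals 1

print(np.allclose(x_eig, np.linalg.solve(A, b)))   # True (up to numerical precision)
```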

Factor models are widely used for dimension reduction in the analysis of multivariate data. This is achieved through decomposition of a p x p covariance matrix into the sum of two components. Through a latent factor representation, these components can be interpreted as a diagonal matrix of idiosyncratic variances and a shared variation matrix, that is, the product of a p x k factor loadings matrix and its transpose. If k << p, this defines a parsimonious factorisation of the covariance matrix. Historically, little attention has been paid to incorporating prior information in Bayesian analyses using factor models where, at best, the prior for the factor loadings is order invariant. In this work, a class of structured priors is developed that can encode ideas of dependence structure about the shared variation matrix. The construction allows data-informed shrinkage towards sensible parametric structures while also facilitating inference over the number of factors. Using an unconstrained reparameterisation of stationary vector autoregressions, the methodology is extended to stationary dynamic factor models. For computational inference, parameter-expanded Markov chain Monte Carlo samplers are proposed, including an efficient adaptive Gibbs sampler. Two substantive applications showcase the scope of the methodology and its inferential benefits.
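
The covariance decomposition at the heart of the model is $\Sigma = \Lambda\Lambda^{T} + \Psi$, with $\Lambda$ the p x k loadings matrix and $\Psi$ diagonal. The Python sketch below simulates data from such a model and fits it by maximum likelihood with scikit-learn, purely to illustrate the decomposition; the structured Bayesian priors and MCMC samplers of the paper are not part of this sketch.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Sketch of the covariance decomposition Sigma = Lambda @ Lambda.T + diag(psi),
# with Lambda the p x k loadings and psi the idiosyncratic variances.
# Simulated data, maximum-likelihood fit: illustration only, not the paper's method.

rng = np.random.default_rng(0)
p, k, n = 12, 2, 5000
Lambda = rng.standard_normal((p, k))
psi = rng.uniform(0.3, 1.0, size=p)
factors = rng.standard_normal((n, k))
X = factors @ Lambda.T + np.sqrt(psi) * rng.standard_normal((n, p))

fa = FactorAnalysis(n_components=k).fit(X)
Sigma_hat = fa.components_.T @ fa.components_ + np.diag(fa.noise_variance_)
Sigma_true = Lambda @ Lambda.T + np.diag(psi)
print("max |Sigma_hat - Sigma_true|:", np.abs(Sigma_hat - Sigma_true).max())
```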

This paper presents a multivariate normal integral expression for the joint survival function of the cumulated components of any multinomial random vector. This result can be viewed as a multivariate analog of Equation (7) from Carter & Pollard (2004), who improved Tusnády's inequality. Our findings are based on a crucial relationship between the joint survival function of the cumulated components of any multinomial random vector and the joint cumulative distribution function of a corresponding Dirichlet distribution. We offer two distinct proofs: the first expands the logarithm of the Dirichlet density, while the second employs Laplace's method applied to the Dirichlet integral.
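
The two-category special case of the multinomial-Dirichlet relationship is the familiar binomial-beta identity, which the short Python check below verifies numerically. We add it here only for intuition; the paper's result is the general multivariate statement.

```python
from scipy.stats import binom, beta

# Numerical check of the univariate special case of the multinomial-Dirichlet relationship:
# for a Binomial(n, p) count X,  P(X >= k) = P(Beta(k, n - k + 1) <= p).
# (With two categories the multinomial is a binomial and the Dirichlet is a beta.)

n, k, p = 20, 7, 0.3
lhs = binom.sf(k - 1, n, p)          # P(X >= k)
rhs = beta.cdf(p, k, n - k + 1)      # P(Beta(k, n-k+1) <= p)
print(lhs, rhs)                      # the two values agree
```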

We develop novel LASSO-based methods for coefficient testing and confidence interval construction in the Gaussian linear model with $n\ge d$. Our methods have finite-sample guarantees identical to those of their ubiquitous ordinary-least-squares $t$-test-based analogues, yet substantially higher power when the true coefficient vector is sparse. In particular, our coefficient test, which we call the $\ell$-test, performs like the one-sided $t$-test (despite not being given any information about the sign) under sparsity, and the corresponding confidence intervals are more than 10% shorter than the standard $t$-test-based intervals. The nature of the $\ell$-test directly provides a novel exact adjustment conditional on LASSO selection for post-selection inference, allowing for the construction of post-selection p-values and confidence intervals. None of our methods require resampling or Monte Carlo estimation. We perform a variety of simulations and a real data analysis on an HIV drug resistance data set to demonstrate the benefits of the $\ell$-test. We end with a discussion of how the $\ell$-test may asymptotically apply to a much more general class of parametric models.

Boundary value problems based on the convection-diffusion equation arise naturally in models of fluid flow across a variety of engineering applications and design feasibility studies. Naturally, their efficient numerical solution has continued to be an interesting and active topic of research for decades. In the context of finite-element discretization of these boundary value problems, the Streamline Upwind Petrov-Galerkin (SUPG) technique yields accurate discretizations in the singularly perturbed regime. In this paper, we propose efficient multigrid iterative solution methods for the resulting linear systems. In particular, we show that techniques from standard multigrid for anisotropic problems can be adapted to these discretizations on both tensor-product and semi-structured meshes. The resulting methods are demonstrated to be robust preconditioners for several standard flow benchmarks.
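
As a small illustration of why stabilization matters in the singularly perturbed regime (a 1D finite-difference stand-in of our own, not the SUPG finite-element discretization or the multigrid solvers studied in the paper), the following Python snippet compares central and upwind differencing of a convection-dominated problem.

```python
import numpy as np

# 1D stand-in for the singularly perturbed setting:  -eps*u'' + u' = 1 on (0,1),
# u(0) = u(1) = 0, with eps small so the mesh Peclet number exceeds 1.
# Upwinding the convective term plays a role analogous to SUPG stabilization;
# this sketch is a direct dense solve, not a multigrid method.

eps, N = 1e-3, 64
h = 1.0 / N

def solve(upwind):
    A = np.zeros((N - 1, N - 1))
    b = np.ones(N - 1)
    for i in range(N - 1):
        A[i, i] = 2 * eps / h**2 + (1 / h if upwind else 0.0)
        if i > 0:
            A[i, i - 1] = -eps / h**2 - (1 / h if upwind else 1 / (2 * h))
        if i < N - 2:
            A[i, i + 1] = -eps / h**2 + (0.0 if upwind else 1 / (2 * h))
    return np.linalg.solve(A, b)

u_central, u_upwind = solve(False), solve(True)
print("max of central-difference solution:", u_central.max())  # overshoots 1: spurious oscillations
print("max of upwind solution:", u_upwind.max())                # stays below 1: no oscillations
```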

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
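
For context, one classical input-output bound of this type is the result of Xu and Raginsky (2017): if the loss is $\sigma$-sub-Gaussian under the data distribution, then the expected generalization gap of a training algorithm with output $W$ on an $n$-sample training set $S$ satisfies $|\mathbb{E}[L_\mu(W) - L_S(W)]| \le \sqrt{2\sigma^2 I(W;S)/n}$, where $L_S$ and $L_\mu$ denote the empirical and population risks. We state it here only as background, and it is our assumption that this is among the existing bounds being compared against; for a deterministic algorithm the mutual information $I(W;S)$ can be infinite, which illustrates challenge (a) above.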

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis and infinite-width analysis.
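
As a concrete, if arbitrary, illustration of the first two ingredients above (careful initialization, normalization, and a generic optimizer), here is a short PyTorch sketch; the architecture, data, and hyperparameters are placeholder choices and not drawn from the article.

```python
import torch
import torch.nn as nn

# Minimal sketch: Kaiming/He initialization, batch normalization, and SGD with momentum
# on a small MLP with synthetic data.  Illustration only; sizes and rates are arbitrary.

model = nn.Sequential(
    nn.Linear(20, 64), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Linear(64, 64), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Linear(64, 1),
)
for m in model:
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")   # He initialization
        nn.init.zeros_(m.bias)

opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.MSELoss()

x, y = torch.randn(256, 20), torch.randn(256, 1)   # synthetic data
for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
print("final training loss:", loss.item())
```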
