
We consider the iterative solution of large linear systems of equations in which the coefficient matrix is the sum of two terms, a sparse matrix $A$ and a possibly dense, rank-deficient matrix of the form $\gamma UU^T$, where $\gamma > 0$ is a parameter which in some applications may be taken to be 1. The matrix $A$ itself can be singular, but we assume that the symmetric part of $A$ is positive semidefinite and that $A+\gamma UU^T$ is nonsingular. Linear systems of this form arise frequently in fields such as optimization, fluid mechanics, and computational statistics. We investigate preconditioning strategies based on an alternating splitting approach combined with the use of the Sherman-Morrison-Woodbury matrix identity. The potential of the proposed approach is demonstrated by means of numerical experiments on linear systems from different application areas.
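
A minimal sketch of how the Sherman-Morrison-Woodbury identity can be used to apply the inverse of $A+\gamma UU^T$ cheaply, assuming (purely for illustration, unlike the abstract's more general setting) that $A$ itself is nonsingular and easy to factor; matrix sizes and densities below are arbitrary.

```python
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import splu

# Woodbury identity (illustrative sketch, not the paper's preconditioner):
#   (A + gamma U U^T)^{-1}
#     = A^{-1} - gamma A^{-1} U (I + gamma U^T A^{-1} U)^{-1} U^T A^{-1}
n, k, gamma = 500, 5, 1.0
rng = np.random.default_rng(0)
A = (sparse_random(n, n, density=0.01, random_state=0) + 10.0 * identity(n)).tocsc()
U = rng.standard_normal((n, k))
b = rng.standard_normal(n)

A_lu = splu(A)                                   # factor the sparse term once
AinvU = A_lu.solve(U)                            # A^{-1} U  (n x k)
capacitance = np.eye(k) + gamma * U.T @ AinvU    # small k x k "capacitance" matrix

def apply_inverse(v):
    """Apply (A + gamma U U^T)^{-1} to v via the Woodbury identity."""
    Ainv_v = A_lu.solve(v)
    return Ainv_v - gamma * AinvU @ np.linalg.solve(capacitance, U.T @ Ainv_v)

# sanity check against a dense direct solve
x_ref = np.linalg.solve(A.toarray() + gamma * (U @ U.T), b)
print("max abs error:", np.max(np.abs(apply_inverse(b) - x_ref)))
```

In a preconditioning context, `apply_inverse` would typically be wrapped as a linear operator and handed to a Krylov solver; only the small $k \times k$ system changes when $U$ is updated.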

Related content

We consider the problem of constructing minimax rate-optimal estimators for a doubly robust nonparametric functional that has witnessed applications across the causal inference and conditional independence testing literature. Minimax rate-optimal estimators for such functionals are typically constructed through higher-order bias corrections of plug-in and one-step type estimators and, in turn, depend on estimators of nuisance functions. In this paper, we consider a parallel question of interest regarding the optimality and/or sub-optimality of plug-in and one-step bias-corrected estimators for the specific doubly robust functional of interest. Specifically, we verify that by using undersmoothing and sample splitting techniques when constructing nuisance function estimators, one can achieve minimax rates of convergence in all H\"older smoothness classes of the nuisance functions (i.e. the propensity score and outcome regression) provided that the marginal density of the covariates is sufficiently regular. Additionally, by establishing suitable lower bounds for these classes of estimators, we demonstrate the necessity of undersmoothing the nuisance function estimators to obtain minimax optimal rates of convergence.
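
A minimal sketch of the sample-splitting (cross-fitting) idea for a doubly robust estimator, shown here for the simpler AIPW estimator of a counterfactual mean; the off-the-shelf parametric nuisance fits below stand in for the undersmoothed nonparametric estimators analyzed in the paper, and all variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

# Cross-fitted AIPW sketch for E[Y(1)] on simulated data (illustrative only).
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))           # true propensity score
A = rng.binomial(1, propensity)                        # treatment indicator
Y = X @ np.array([1.0, 0.5, -0.5]) + A + rng.normal(size=n)

folds = np.array_split(rng.permutation(n), 2)
psi = []
for test, train in [(folds[0], folds[1]), (folds[1], folds[0])]:
    # nuisance estimators fit on the other fold only (sample splitting)
    e_hat = LogisticRegression().fit(X[train], A[train]).predict_proba(X[test])[:, 1]
    mu_hat = LinearRegression().fit(X[train][A[train] == 1], Y[train][A[train] == 1]).predict(X[test])
    # doubly robust (AIPW) score evaluated on the held-out fold
    psi.append(mu_hat + A[test] * (Y[test] - mu_hat) / e_hat)
print("cross-fitted AIPW estimate of E[Y(1)]:", np.concatenate(psi).mean())
```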

In this paper we generalize the polynomial time integration framework to additively partitioned initial value problems. The framework we present is general and enables the construction of many new families of additive integrators with arbitrary order-of-accuracy and varying degrees of implicitness. In this first work, we focus on a new class of implicit-explicit polynomial block methods that are based on fully-implicit Runge-Kutta methods with Radau nodes, and possess high stage order. We show that the new fully-implicit-explicit (FIMEX) integrators have improved stability compared to existing IMEX Runge-Kutta methods, while also being more computationally efficient due to recent developments in preconditioning techniques for solving the associated systems of nonlinear equations. For PDEs on periodic domains where the implicit component is trivial to invert, we show how parallelization of the right-hand-side evaluations can be exploited to obtain significant speedup compared to existing serial IMEX Runge-Kutta methods. For parallel (in space) finite-element discretizations, the new methods can achieve orders of magnitude better accuracy than existing IMEX Runge-Kutta methods, and/or achieve a given accuracy several times faster in terms of computational runtime.
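
To make the additive partitioning concrete, here is a first-order IMEX Euler step for a problem $y' = f_E(y) + f_I(y)$, treating a stiff linear term implicitly and the nonstiff forcing explicitly; this is a toy baseline, not the paper's FIMEX polynomial block methods, and the coefficients are arbitrary.

```python
import numpy as np

# IMEX Euler sketch for y' = f_E(t, y) + lam * y with stiff lam (illustrative).
lam = -1000.0                          # stiff linear coefficient (implicit part)

def f_explicit(t, y):
    return np.sin(t)                   # nonstiff forcing (explicit part)

def imex_euler(y0, t0, t1, steps):
    t, y = t0, y0
    h = (t1 - t0) / steps
    for _ in range(steps):
        # explicit Euler on f_E, implicit Euler on the linear stiff term:
        #   y_{n+1} = y_n + h * f_E(t_n, y_n) + h * lam * y_{n+1}
        y = (y + h * f_explicit(t, y)) / (1.0 - h * lam)
        t += h
    return y

print(imex_euler(1.0, 0.0, 1.0, 100))  # remains stable despite h * |lam| = 10
```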

The rise of internet-based services and products in the late 1990s brought about an unprecedented opportunity for online businesses to engage in large-scale data-driven decision making. Over the past two decades, organizations such as Airbnb, Alibaba, Amazon, Baidu, Booking, Alphabet's Google, LinkedIn, Lyft, Meta's Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have invested tremendous resources in online controlled experiments (OCEs) to assess the impact of innovation on their customers and businesses. Running OCEs at scale has presented a host of challenges requiring solutions from many domains. In this paper we review challenges that require new statistical methodologies to address them. In particular, we discuss the practice and culture of online experimentation, as well as its statistics literature, placing the current methodologies within their relevant statistical lineages and providing illustrative examples of OCE applications. Our goal is to raise academic statisticians' awareness of these new research opportunities to increase collaboration between academia and the online industry.
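
As a minimal illustration of the kind of analysis an OCE involves (not taken from the paper), the sketch below runs a two-proportion z-test on simulated conversion data for a control and a treatment group; sample sizes and rates are invented.

```python
import numpy as np
from scipy import stats

# Simulated A/B test: two-proportion z-test on conversion counts (illustrative).
rng = np.random.default_rng(0)
n_control, n_treatment = 100_000, 100_000
conv_control = rng.binomial(n_control, 0.050)       # baseline conversion rate
conv_treatment = rng.binomial(n_treatment, 0.052)   # +0.2pp treatment effect

p_c, p_t = conv_control / n_control, conv_treatment / n_treatment
p_pool = (conv_control + conv_treatment) / (n_control + n_treatment)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treatment))
z = (p_t - p_c) / se
p_value = 2 * stats.norm.sf(abs(z))
print(f"lift = {p_t - p_c:+.4f}, z = {z:.2f}, p = {p_value:.4f}")
```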

Partial differential equations (PDEs) are important tools to model physical systems, and incorporating them into machine learning models is an important way of embedding physical knowledge. Given any system of linear PDEs with constant coefficients, we propose a family of Gaussian process (GP) priors, which we call EPGP, such that all realizations are exact solutions of this system. We apply the Ehrenpreis-Palamodov fundamental principle, which works like a non-linear Fourier transform, to construct GP kernels mirroring standard spectral methods for GPs. Our approach can infer probable solutions of linear PDE systems from any data such as noisy measurements, or initial and boundary conditions. Constructing EPGP-priors is algorithmic, generally applicable, and comes with a sparse version (S-EPGP) that learns the relevant spectral frequencies and works better for big data sets. We demonstrate our approach on three families of PDE systems: the heat equation, the wave equation, and Maxwell's equations, where we improve upon the state of the art in computation time and precision, in some experiments by several orders of magnitude.
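
To convey the spirit of priors whose realizations solve the PDE exactly (this is only an illustration, not the EPGP construction itself), one can superpose exponential modes of the 1D heat equation $u_t = u_{xx}$: each mode $e^{-\xi^2 t}\cos(\xi x + \phi)$ solves the equation, so any random weighted sum does too. Frequencies, weights, and the check point below are arbitrary.

```python
import numpy as np

# Random superposition of exact heat-equation modes (illustrative sketch).
rng = np.random.default_rng(0)
n_modes = 50
xi = rng.normal(scale=3.0, size=n_modes)          # spectral frequencies
phase = rng.uniform(0, 2 * np.pi, n_modes)
weight = rng.normal(size=n_modes) / np.sqrt(n_modes)

def u(x, t):
    """One random draw; solves u_t = u_xx exactly, term by term."""
    return np.sum(weight * np.exp(-xi**2 * t) * np.cos(xi * x + phase))

# finite-difference check of the PDE residual at an arbitrary point
h, x0, t0 = 1e-3, 0.3, 0.1
u_t = (u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
u_xx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / h**2
print("finite-difference PDE residual (small up to discretization error):", u_t - u_xx)
```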

We present a new perspective on the use of weighted essentially nonoscillatory (WENO) reconstructions in high-order methods for scalar hyperbolic conservation laws. The main focus of this work is on nonlinear stabilization of continuous Galerkin (CG) approximations. The proposed methodology also provides an interesting alternative to WENO-based limiters for discontinuous Galerkin (DG) methods. Unlike Runge--Kutta DG schemes that overwrite finite element solutions with WENO reconstructions, our approach uses a reconstruction-based smoothness sensor to blend the numerical viscosity operators of high- and low-order stabilization terms. The so-defined WENO approximation introduces low-order nonlinear diffusion in the vicinity of shocks, while preserving the high-order accuracy of a linearly stable baseline discretization in regions where the exact solution is sufficiently smooth. The underlying reconstruction procedure performs Hermite interpolation on stencils consisting of a mesh cell and its neighbors. The amount of numerical dissipation depends on the relative differences between partial derivatives of reconstructed candidate polynomials and those of the underlying finite element approximation. All derivatives are taken into account by the employed smoothness sensor. To assess the accuracy of our CG-WENO scheme, we derive error estimates and perform numerical experiments. In particular, we prove that the consistency error of the nonlinear stabilization is of the order $p+1/2$, where $p$ is the polynomial degree. This estimate is optimal for general meshes. For uniform meshes and smooth exact solutions, the experimentally observed rate of convergence is as high as $p+1$.
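
As background for the blending idea, the sketch below evaluates the classical Jiang-Shu smoothness indicators of a fifth-order WENO reconstruction on smooth and discontinuous data; they stay small (of order $\Delta x^2$) where the data are smooth and remain $O(1)$ across a jump, which is the type of information a smoothness sensor can use to switch between high- and low-order dissipation. This is an illustration, not the paper's CG-WENO scheme or its Hermite-based sensor.

```python
import numpy as np

# Jiang-Shu smoothness indicators for the three candidate stencils of WENO5.
def smoothness_indicators(u):
    """beta_k, k = 0,1,2, for each interior cell (stencils {i-2..i}, {i-1..i+1}, {i..i+2})."""
    um2, um1, u0, up1, up2 = u[:-4], u[1:-3], u[2:-2], u[3:-1], u[4:]
    b0 = 13/12*(um2 - 2*um1 + u0)**2 + 1/4*(um2 - 4*um1 + 3*u0)**2
    b1 = 13/12*(um1 - 2*u0 + up1)**2 + 1/4*(um1 - up1)**2
    b2 = 13/12*(u0 - 2*up1 + up2)**2 + 1/4*(3*u0 - 4*up1 + up2)**2
    return np.stack([b0, b1, b2])

x = np.linspace(0, 1, 201)
print("smooth data, max beta:", smoothness_indicators(np.sin(2 * np.pi * x)).max())
print("shock data,  max beta:", smoothness_indicators(np.where(x < 0.5, 1.0, 0.0)).max())
```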

Offline reinforcement learning (RL) concerns pursuing an optimal policy for sequential decision-making from a pre-collected dataset, without further interaction with the environment. Recent theoretical progress has focused on developing sample-efficient offline RL algorithms with various relaxed assumptions on data coverage and function approximators, especially to handle the case with excessively large state-action spaces. Among them, the framework based on the linear-programming (LP) reformulation of Markov decision processes has shown promise: it enables sample-efficient offline RL with function approximation, under only partial data coverage and realizability assumptions on the function classes, with favorable computational tractability. In this work, we revisit the LP framework for offline RL, and advance the existing results in several aspects, relaxing certain assumptions and achieving optimal statistical rates in terms of sample size. Our key enabler is the introduction of proper constraints in the reformulation, instead of the regularization used in the literature, in some cases combined with careful choices of the function classes and initial state distributions. We hope our insights further advocate the study of the LP framework, as well as the induced primal-dual minimax optimization, in offline RL.
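
For readers less familiar with the LP reformulation, here is the occupancy-measure LP of a tiny tabular MDP solved directly with a generic LP solver; it is a textbook illustration of the reformulation, not the paper's offline, function-approximation setting, and all sizes and rewards are made up.

```python
import numpy as np
from scipy.optimize import linprog

# Occupancy-measure LP: maximize sum_{s,a} d(s,a) r(s,a) subject to the flow constraints
#   sum_a d(s',a) = (1-gamma) mu0(s') + gamma * sum_{s,a} P(s'|s,a) d(s,a),  d >= 0.
S, A, gamma = 3, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))           # P[s, a, s']
r = rng.uniform(size=(S, A))
mu0 = np.ones(S) / S                                  # initial state distribution

A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = (1.0 if s == sp else 0.0) - gamma * P[s, a, sp]
b_eq = (1 - gamma) * mu0

res = linprog(c=-r.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (S * A))
d = res.x.reshape(S, A)
policy = d / d.sum(axis=1, keepdims=True)             # optimal policy recovered from d
print("optimal discounted return:", -res.fun)
print("optimal policy:\n", policy.round(3))
```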

Machine learning models trained by different optimization algorithms under different data distributions can exhibit distinct generalization behaviors. In this paper, we analyze the generalization of models trained by noisy iterative algorithms. We derive distribution-dependent generalization bounds by connecting noisy iterative algorithms to additive noise channels found in communication and information theory. Our generalization bounds shed light on several applications, including differentially private stochastic gradient descent (DP-SGD), federated learning, and stochastic gradient Langevin dynamics (SGLD). We demonstrate our bounds through numerical experiments, showing that they can help understand recent empirical observations of the generalization phenomena of neural networks.
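
One of the noisy iterative algorithms mentioned above, SGLD, is easy to write down: each iterate is a stochastic gradient step plus additive Gaussian noise, which is exactly the additive-noise-channel structure the bounds exploit. The sketch below is a minimal example on a least-squares objective with invented hyperparameters, not a reproduction of the paper's experiments.

```python
import numpy as np

# Stochastic gradient Langevin dynamics (SGLD) on a toy least-squares problem.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
step, temperature, batch = 1e-2, 1e-3, 32
for t in range(2000):
    idx = rng.choice(n, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch                # stochastic gradient
    w = w - step * grad + np.sqrt(2 * step * temperature) * rng.normal(size=d)  # Langevin noise
print("final training loss:", 0.5 * np.mean((X @ w - y) ** 2))
```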

A well-established approach for inferring full displacement and stress fields from possibly sparse data is to calibrate the parameters of a given constitutive model using a Bayesian update. After calibration, a (stochastic) forward simulation is conducted with the identified model parameters to resolve physical fields in regions that were not accessible to the measurement device. A shortcoming of model calibration is that the model is deemed to best represent reality, which is only sometimes the case, especially in the context of the aging of structures and materials. While this issue is often addressed with repeated model calibration, a different approach is followed in the recently proposed statistical Finite Element Method (statFEM). Instead of using Bayes' theorem to update model parameters, the displacement is chosen as the stochastic prior and updated to fit the measurement data more closely. For this purpose, the statFEM framework introduces a so-called model-reality mismatch, parametrized by only three hyperparameters. This makes the inference of full-field data computationally efficient in an online stage: If the stochastic prior can be computed offline, solving the underlying partial differential equation (PDE) online is unnecessary. Compared to solving a PDE, identifying only three hyperparameters and conditioning the state on the sensor data requires much fewer computational resources. This paper presents two contributions to the existing statFEM approach: First, we use a non-intrusive polynomial chaos method to compute the prior, enabling the use of complex mechanical models in deterministic formulations. Second, we examine the influence of prior material models (linear elastic and St. Venant-Kirchhoff material with uncertain Young's modulus) on the updated solution. We present statFEM results for 1D and 2D examples, while an extension to 3D is straightforward.
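
The conditioning step at the heart of this workflow is plain Gaussian conditioning of a displacement prior on noisy sensor readings. The sketch below shows that step only, with an assumed squared-exponential prior covariance, a hand-picked set of sensor nodes, and a fixed noise level; it omits the model-reality mismatch hyperparameters and the polynomial chaos construction of the prior.

```python
import numpy as np

# Condition a Gaussian prior on the nodal displacement, u ~ N(m, C), on sensor data.
rng = np.random.default_rng(0)
n_nodes, sensors, sigma = 50, [10, 25, 40], 0.01

x = np.linspace(0, 1, n_nodes)
m = np.sin(np.pi * x)                                     # prior mean (e.g. from offline runs)
C = 0.05 * np.exp(-(x[:, None] - x[None, :])**2 / 0.1)    # assumed prior covariance
H = np.zeros((len(sensors), n_nodes))                     # observation operator (sensor nodes)
H[np.arange(len(sensors)), sensors] = 1.0
y = m[sensors] + 0.1 + sigma * rng.normal(size=len(sensors))  # data; prior is deliberately biased

S = H @ C @ H.T + sigma**2 * np.eye(len(sensors))          # innovation covariance
K = C @ H.T @ np.linalg.inv(S)                             # gain
m_post = m + K @ (y - H @ m)                               # updated displacement mean
C_post = C - K @ H @ C                                     # updated covariance
print("prior vs posterior at a sensor node:", m[25], m_post[25], "data:", y[1])
```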

This PhD thesis contains several contributions to the field of statistical causal modeling. Statistical causal models are statistical models embedded with causal assumptions that allow for the inference and reasoning about the behavior of stochastic systems affected by external manipulation (interventions). This thesis contributes to the research areas concerning the estimation of causal effects, causal structure learning, and distributionally robust (out-of-distribution generalizing) prediction methods. We present novel and consistent linear and non-linear causal effects estimators in instrumental variable settings that employ data-dependent mean squared prediction error regularization. Our proposed estimators show, in certain settings, mean squared error improvements compared to both canonical and state-of-the-art estimators. We show that recent research on distributionally robust prediction methods has connections to well-studied estimators from econometrics. This connection leads us to prove that general K-class estimators possess distributional robustness properties. Furthermore, we propose a general framework for distributional robustness with respect to intervention-induced distributions. In this framework, we derive sufficient conditions for the identifiability of distributionally robust prediction methods and present impossibility results that show the necessity of several of these conditions. We present a new structure learning method applicable in additive noise models with directed trees as causal graphs. We prove consistency in a vanishing identifiability setup and provide a method for testing substructure hypotheses with asymptotic family-wise error control that remains valid post-selection. Finally, we present heuristic ideas for learning summary graphs of nonlinear time-series models.
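
As a point of reference for the instrumental-variable and K-class material, here is the canonical two-stage least squares estimator (the K-class estimator with kappa = 1) on simulated confounded data; it is the standard baseline such estimators are compared against, not the thesis's regularized proposals, and the data-generating coefficients are invented.

```python
import numpy as np

# Two-stage least squares vs OLS under hidden confounding (illustrative).
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=n)                          # instrument
U = rng.normal(size=n)                          # hidden confounder
X = 0.8 * Z + U + rng.normal(size=n)            # endogenous regressor
Y = 2.0 * X + U + rng.normal(size=n)            # true causal effect = 2

X1 = np.column_stack([np.ones(n), X])
Z1 = np.column_stack([np.ones(n), Z])
P = Z1 @ np.linalg.solve(Z1.T @ Z1, Z1.T)       # projection onto the instrument space

beta_ols = np.linalg.solve(X1.T @ X1, X1.T @ Y)            # biased by confounding
beta_2sls = np.linalg.solve(X1.T @ P @ X1, X1.T @ P @ Y)   # consistent
print("OLS estimate :", beta_ols[1])
print("2SLS estimate:", beta_2sls[1])
```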

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to present a useful snapshot of current research in quantization for Neural Networks and to provide a structure that eases the evaluation of future research in this area.
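
A minimal sketch of the uniform affine quantization that underlies many of the surveyed schemes: a weight tensor is mapped to b-bit integers via a scale and zero-point and then dequantized to measure the round-trip error. Practical methods add per-channel scales, clipping calibration, and quantization-aware training, none of which is shown here; the tensor shape and bit width are arbitrary.

```python
import numpy as np

# Uniform affine quantization of a weight tensor to b bits (illustrative).
def quantize(w, bits=4):
    qmin, qmax = 0, 2**bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = np.round(-w.min() / scale)
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(256, 256)).astype(np.float32)
q, s, z = quantize(w, bits=4)
w_hat = dequantize(q, s, z)
print("4-bit quantization RMSE:", np.sqrt(np.mean((w - w_hat) ** 2)))
```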
