亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We describe a Lanczos-based algorithm for approximating the product of a rational matrix function with a vector. This algorithm, which we call the Lanczos method for optimal rational matrix function approximation (Lanczos-OR), returns the optimal approximation from a given Krylov subspace in a norm depending on the rational function's denominator, and can be computed using the information from a slightly larger Krylov subspace. We also provide a low-memory implementation which only requires storing a number of vectors proportional to the denominator degree of the rational function. Finally, we show that Lanczos-OR can be used to derive algorithms for computing other matrix functions, including the matrix sign function and quadrature based rational function approximations. In many cases, it improves on the approximation quality of prior approaches, including the standard Lanczos method, with little additional computational overhead.

相關內容

We present an algorithm for the solution of Sylvester equations with right-hand side of low rank. The method is based on projection onto a block rational Krylov subspace, with two key contributions with respect to the state-of-the-art. First, we show how to maintain the last pole equal to infinity throughout the iteration, by means of pole reodering. This allows for a cheap evaluation of the true residual at every step. Second, we extend the convergence analysis in [Beckermann B., An error analysis for rational Galerkin projection applied to the Sylvester equation, SINUM, 2011] to the block case. This extension allows to link the convergence with the problem of minimizing the norm of a small rational matrix over the spectra or field-of-values of the involved matrices. This is in contrast with the non-block case, where the minimum problem is scalar, instead of matrix-valued. Replacing the norm of the objective function with an easier to evaluate function yields several adaptive pole selection strategies, providing a theoretical analysis for known heuristics, as well as effective novel techniques.

The problem of low rank approximation is ubiquitous in science. Traditionally this problem is solved in unitary invariant norms such as Frobenius or spectral norm due to existence of efficient methods for building approximations. However, recent results reveal the potential of low rank approximations in Chebyshev norm, which naturally arises in many applications. In this paper we tackle the problem of building optimal rank-1 approximations in the Chebyshev norm. We investigate the properties of alternating minimization algorithm for building the low rank approximations and demonstrate how to use it to construct optimal rank-1 approximation. As a result we propose an algorithm that is capable of building optimal rank-1 approximations in Chebyshev norm for small matrices.

This paper is devoted to the statistical and numerical properties of the geometric median, and its applications to the problem of robust mean estimation via the median of means principle. Our main theoretical results include (a) an upper bound for the distance between the mean and the median for general absolutely continuous distributions in R^d, and examples of specific classes of distributions for which these bounds do not depend on the ambient dimension d; (b) exponential deviation inequalities for the distance between the sample and the population versions of the geometric median, which again depend only on the trace-type quantities and not on the ambient dimension. As a corollary, we deduce improved bounds for the (geometric) median of means estimator that hold for large classes of heavy-tailed distributions. Finally, we address the error of numerical approximation, which is an important practical aspect of any statistical estimation procedure. We demonstrate that the objective function minimized by the geometric median satisfies a "local quadratic growth" condition that allows one to translate suboptimality bounds for the objective function to the corresponding bounds for the numerical approximation to the median itself, and propose a simple stopping rule applicable to any optimization method which yields explicit error guarantees. We conclude with the numerical experiments including the application to estimation of mean values of log-returns for S&P 500 data.

In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.

Since their introduction in Abadie and Gardeazabal (2003), Synthetic Control (SC) methods have quickly become one of the leading methods for estimating causal effects in observational studies in settings with panel data. Formal discussions often motivate SC methods by the assumption that the potential outcomes were generated by a factor model. Here we study SC methods from a design-based perspective, assuming a model for the selection of the treated unit(s) and period(s). We show that the standard SC estimator is generally biased under random assignment. We propose a Modified Unbiased Synthetic Control (MUSC) estimator that guarantees unbiasedness under random assignment and derive its exact, randomization-based, finite-sample variance. We also propose an unbiased estimator for this variance. We document in settings with real data that under random assignment, SC-type estimators can have root mean-squared errors that are substantially lower than that of other common estimators. We show that such an improvement is weakly guaranteed if the treated period is similar to the other periods, for example, if the treated period was randomly selected. While our results only directly apply in settings where treatment is assigned randomly, we believe that they can complement model-based approaches even for observational studies.

We consider a high-dimensional sparse normal means model where the goal is to estimate the mean vector assuming the proportion of non-zero means is unknown. We model the mean vector by a one-group global-local shrinkage prior belonging to a broad class of such priors that includes the horseshoe prior. We address some questions related to asymptotic properties of the resulting posterior distribution of the mean vector for the said class priors. We consider two ways to model the global parameter in this paper. Firstly by considering this as an unknown fixed parameter and then by an empirical Bayes estimate of it. In the second approach, we do a hierarchical Bayes treatment by assigning a suitable non-degenerate prior distribution to it. We first show that for the class of priors under study, the posterior distribution of the mean vector contracts around the true parameter at a near minimax rate when the empirical Bayes approach is used. Next, we prove that in the hierarchical Bayes approach, the corresponding Bayes estimate attains the minimax risk asymptotically under the squared error loss function. We also show that the posterior contracts around the true parameter at a near minimax rate. These results generalize those of van der Pas et al. (2014) \cite{van2014horseshoe}, (2017) \cite{van2017adaptive}, proved for the horseshoe prior. We have also studied in this work the asymptotic Bayes optimality of global-local shrinkage priors where the number of non-null hypotheses is unknown. Here our target is to propose some conditions on the prior density of the global parameter such that the Bayes risk induced by the decision rule attains Optimal Bayes risk, up to some multiplicative constant. Using our proposed condition, under the asymptotic framework of Bogdan et al. (2011) \cite{bogdan2011asymptotic}, we are able to provide an affirmative answer to satisfy our hunch.

There are multiple cluster randomised trial designs that vary in when the clusters cross between control and intervention states, when observations are made within clusters, and how many observations are made at that time point. Identifying the most efficient study design is complex though, owing to the correlation between observations within clusters and over time. In this article, we present a review of statistical and computational methods for identifying optimal cluster randomised trial designs. We also adapt methods from the experimental design literature for experimental designs with correlated observations to the cluster trial context. We identify three broad classes of methods: using exact formulae for the treatment effect estimator variance for specific models to derive algorithms or weights for cluster sequences; generalised methods for estimating weights for experimental units; and, combinatorial optimisation algorithms to select an optimal subset of experimental units. We also discuss methods for rounding weights to whole numbers of clusters and extensions to non-Gaussian models. We present results from multiple cluster trial examples that compare the different methods, including problems involving determining optimal allocation of clusters across a set of cluster sequences, and selecting the optimal number of single observations to make in each cluster-period for both Gaussian and non-Gaussian models, and including exchangeable and exponential decay covariance structures.

In single-particle cryo-electron microscopy (cryo-EM), the efficient determination of orientation parameters for 2D projection images poses a significant challenge yet is crucial for reconstructing 3D structures. This task is complicated by the high noise levels present in the cryo-EM datasets, which often include outliers, necessitating several time-consuming 2D clean-up processes. Recently, solutions based on deep learning have emerged, offering a more streamlined approach to the traditionally laborious task of orientation estimation. These solutions often employ amortized inference, eliminating the need to estimate parameters individually for each image. However, these methods frequently overlook the presence of outliers and may not adequately concentrate on the components used within the network. This paper introduces a novel approach that uses a 10-dimensional feature vector to represent the orientation and applies a Quadratically-Constrained Quadratic Program to derive the predicted orientation as a unit quaternion, supplemented by an uncertainty metric. Furthermore, we propose a unique loss function that considers the pairwise distances between orientations, thereby enhancing the accuracy of our method. Finally, we also comprehensively evaluate the design choices involved in constructing the encoder network, a topic that has not received sufficient attention in the literature. Our numerical analysis demonstrates that our methodology effectively recovers orientations from 2D cryo-EM images in an end-to-end manner. Importantly, the inclusion of uncertainty quantification allows for direct clean-up of the dataset at the 3D level. Lastly, we package our proposed methods into a user-friendly software suite named cryo-forum, designed for easy accessibility by the developers.

The practical utility of causality in decision-making is widespread and brought about by the intertwining of causal discovery and causal inference. Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Through the implementation of a distribution-level evaluation, we offer valuable and unique insights into the efficacy of these causal discovery methods for treatment effect estimation, considering both synthetic and real-world scenarios, as well as low-data scenarios. The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes, while some tend to learn many low-probability modes which impacts the (unrelaxed) recall and precision.

We investigate a fundamental vertex-deletion problem called (Induced) Subgraph Hitting: given a graph $G$ and a set $\mathcal{F}$ of forbidden graphs, the goal is to compute a minimum-sized set $S$ of vertices of $G$ such that $G-S$ does not contain any graph in $\mathcal{F}$ as an (induced) subgraph. This is a generic problem that encompasses many well-known problems that were extensively studied on their own, particularly (but not only) from the perspectives of both approximation and parameterization. In this paper, we study the approximability of the problem on a large variety of graph classes. Our first result is a linear-time $(1+\varepsilon)$-approximation reduction from (Induced) Subgraph Hitting on any graph class $\mathcal{G}$ of bounded expansion to the same problem on bounded degree graphs within $\mathcal{G}$. This directly yields linear-size $(1+\varepsilon)$-approximation lossy kernels for the problems on any bounded-expansion graph classes. Our second result is a linear-time approximation scheme for (Induced) Subgraph Hitting on any graph class $\mathcal{G}$ of polynomial expansion, based on the local-search framework of Har-Peled and Quanrud [SICOMP 2017]. This approximation scheme can be applied to a more general family of problems that aim to hit all subgraphs satisfying a certain property $\pi$ that is efficiently testable and has bounded diameter. Both of our results have applications to Subgraph Hitting (not induced) on wide classes of geometric intersection graphs, resulting in linear-size lossy kernels and (near-)linear time approximation schemes for the problem.

北京阿比特科技有限公司