
We show that the max entropy algorithm can be derandomized (with respect to a particular objective function) to give a deterministic $3/2-\epsilon$ approximation algorithm for metric TSP for some $\epsilon > 10^{-36}$. To obtain our result, we apply the method of conditional expectation to an objective function constructed in prior work which was used to certify that the expected cost of the algorithm is at most $3/2-\epsilon$ times the cost of an optimal solution to the subtour elimination LP. The proof in this work involves showing that the expected value of this objective function can be computed in polynomial time (at all stages of the algorithm's execution).
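
The paper's certifying objective is far more intricate, but the underlying derandomization tool is the classical method of conditional expectations. A minimal sketch of that method on a simpler problem (MAX-CUT with a uniformly random assignment), purely for illustration and not the paper's TSP construction:

```python
def conditional_expected_cut(edges, side):
    """Expected cut size when vertices in `side` are fixed and the remaining
    vertices are assigned to either side independently with probability 1/2."""
    total = 0.0
    for u, v in edges:
        if u in side and v in side:
            total += 1.0 if side[u] != side[v] else 0.0
        else:
            total += 0.5  # at least one endpoint is still random
    return total

def derandomized_max_cut(n, edges):
    """Method of conditional expectations: fix vertices one by one, always
    keeping the conditional expected cut at least |E|/2."""
    side = {}
    for v in range(n):
        side[v] = 0
        e0 = conditional_expected_cut(edges, side)
        side[v] = 1
        e1 = conditional_expected_cut(edges, side)
        side[v] = 0 if e0 >= e1 else 1
    return side

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(derandomized_max_cut(4, edges))
```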

Related Content

Given an input x, the model produces an output f(x). This output f(x) may or may not agree with the true value y. To quantify how well the model fits the data, we use a function that measures the quality of the fit; this function is called the loss function (or cost function).
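
For concreteness, here is a minimal sketch of one common choice, the squared (mean squared error) loss; the linear model f(x) = 2x below is just an illustrative stand-in for a fitted model:

```python
import numpy as np

def squared_loss(y_true, y_pred):
    """Mean squared error: the average squared gap between f(x) and y."""
    return np.mean((y_true - y_pred) ** 2)

# Toy example: measure how well the model f(x) = 2x matches noisy targets.
x = np.linspace(0, 1, 20)
y = 2 * x + 0.1 * np.random.randn(20)   # "true" values with noise
f_x = 2 * x                              # model predictions
print(squared_loss(y, f_x))
```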

We propose to approximate a (possibly discontinuous) multivariate function $f(x)$ on a compact set by the partial minimizer $\arg\min_y p(x, y)$ of an appropriate polynomial $p$ whose construction can be cast in a univariate sum of squares (SOS) framework, resulting in a highly structured convex semidefinite program. In a number of non-trivial cases (e.g. when $f$ is a piecewise polynomial) we prove that the approximation is exact with a low-degree polynomial $p$. Our approach has three distinguishing features: (i) It is mesh-free and does not require the knowledge of the discontinuity locations. (ii) It is model-free in the sense that we only assume that the function to be approximated is available through samples (point evaluations). (iii) The size of the semidefinite program is independent of the ambient dimension and depends linearly on the number of samples. We also analyze the sample complexity of the approach, proving a generalization error bound in a probabilistic setting. This allows for a comparison with machine learning approaches.
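
This is not the authors' SOS/semidefinite construction, but a toy numerical check of the underlying idea that a discontinuous function can be an exact partial minimizer of a low-degree polynomial: on $[-1,1]$, $\mathrm{sign}(x)$ equals $\arg\min_{y\in[-1,1]} (-xy)$.

```python
import numpy as np

# Toy illustration (not the paper's SOS construction): the discontinuous step
# function f(x) = sign(x) on [-1, 1] is recovered exactly as the partial
# minimizer over y of the degree-2 polynomial p(x, y) = -x * y.
def p(x, y):
    return -x * y

xs = np.linspace(-1, 1, 200)    # grid avoiding x = 0, where the minimizer is not unique
ys = np.linspace(-1, 1, 2001)   # grid over the y-domain [-1, 1]
approx = np.array([ys[np.argmin(p(x, ys))] for x in xs])

print(np.max(np.abs(approx - np.sign(xs))))   # 0.0: exact recovery
```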

2-Opt is probably the most basic local search heuristic for the TSP. This heuristic achieves amazingly good results on real-world Euclidean instances with respect to both running time and approximation ratio. There are numerous experimental studies on the performance of 2-Opt. However, the theoretical knowledge about this heuristic is still very limited. Not even its worst-case running time on 2-dimensional Euclidean instances was known before this work. We clarify this issue by presenting, for every $p\in\mathbb{N}$, a family of $L_p$ instances on which 2-Opt can take an exponential number of steps. Previous probabilistic analyses were restricted to instances in which $n$ points are placed uniformly at random in the unit square $[0,1]^2$. We consider a more advanced model in which the points can be placed independently according to general distributions on $[0,1]^d$, for an arbitrary $d\ge 2$. In particular, we allow different distributions for different points. We study the expected number of local improvements in terms of the number $n$ of points and the maximal density $\phi$ of the probability distributions. We show an upper bound on the expected length of any 2-Opt improvement path of $\tilde{O}(n^{4+1/3}\cdot\phi^{8/3})$. When starting with an initial tour computed by an insertion heuristic, the upper bound on the expected number of steps improves even to $\tilde{O}(n^{4+1/3-1/d}\cdot\phi^{8/3})$. If the distances are measured according to the Manhattan metric, then the expected number of steps is bounded by $\tilde{O}(n^{4-1/d}\cdot\phi)$. In addition, we prove an upper bound of $O(\sqrt[d]{\phi})$ on the expected approximation factor with respect to all $L_p$ metrics. Let us remark that our probabilistic analysis covers as special cases the uniform input model with $\phi=1$ and a smoothed analysis with Gaussian perturbations of standard deviation $\sigma$ with $\phi\sim1/\sigma^d$.
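
For reference, a plain 2-Opt local search on random points in the unit square; this is a straightforward implementation for illustration, not one tuned for the running-time bounds analyzed above:

```python
import math
import random

def tour_length(points, tour):
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Plain 2-Opt: repeatedly reverse a segment while it shortens the tour."""
    improved = True
    while improved:
        improved = False
        n = len(tour)
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # would reverse the whole tour (no length change)
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                if math.dist(a, c) + math.dist(b, d) < math.dist(a, b) + math.dist(c, d) - 1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(50)]
tour = list(range(len(pts)))
print(tour_length(pts, tour), tour_length(pts, two_opt(pts, tour[:])))
```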

The twin support vector machine and its extensions have achieved strong results on binary classification problems. However, they still struggle with efficient multi-class classification and fast model selection. This work develops a fast regularization-parameter tuning algorithm for the twin multi-class support vector machine. Specifically, a novel partition strategy for the sample data set is first adopted, which forms the basis of the model construction. Then, using systems of linear equations and block-matrix theory, the Lagrange multipliers are shown to be piecewise linear in the regularization parameters, so the entire regularization path can be traced by solving only at the break points. Next, the Lagrange multipliers are shown to equal 1 as the regularization parameter tends to infinity, which yields a simple yet effective initialization algorithm. Finally, eight kinds of events are defined to identify the starting event of the next iteration. Extensive experiments on nine UCI data sets show that the proposed method achieves comparable classification performance without solving any quadratic programming problem.
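
The path-following above is specific to the twin multi-class SVM; as a loose analogy only, the sketch below traces the lasso solution path, which is likewise piecewise linear in the regularization parameter and is computed exactly at its break points (via scikit-learn's LARS implementation):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import lars_path

# Analogy only (not the twin-SVM path of the paper): the lasso solution path is
# also piecewise linear in the regularization parameter, so it can be traced
# exactly by visiting the break points via LARS.
X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)
alphas, _, coefs = lars_path(X, y, method="lasso")

# `alphas` are the break points; between consecutive break points every
# coefficient changes linearly in the regularization parameter.
print(len(alphas), coefs.shape)
```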

Differential machine learning (DML) is a recently proposed technique that uses samplewise state derivatives to regularize least square fits to learn conditional expectations of functionals of stochastic processes as functions of state variables. Exploiting the derivative information leads to fewer samples than a vanilla ML approach for the same level of precision. This paper extends the methodology to parametric problems where the processes and functionals also depend on model and contract parameters, respectively. In addition, we propose adaptive parameter sampling to improve relative accuracy when the functionals have different magnitudes for different parameter sets. For calibration, we construct pricing surrogates for calibration instruments and optimize over them globally. We discuss strategies for robust calibration. We demonstrate the usefulness of our methodology on one-factor Cheyette models with a benchmark-rate volatility specification and an extra stochastic volatility factor, applied to (two-curve) caplet prices at different strikes and maturities, first for parametric pricing, and then by calibrating to a given caplet volatility surface. To allow convenient and efficient simulation of processes and functionals and in particular the corresponding computation of samplewise derivatives, we propose to specify the processes and functionals in a low-code way close to mathematical notation which is then used to generate efficient computation of the functionals and derivatives in TensorFlow.
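
A toy sketch of the core idea, fitting a model to sample values and samplewise derivatives simultaneously; here the model is a degree-5 polynomial and the target is sin(x) with derivative cos(x), all illustrative choices rather than the paper's Cheyette/TensorFlow setup:

```python
import numpy as np

# Toy sketch of derivative-regularized least squares: fit to values and
# pathwise derivatives at once by stacking both into one linear system.
rng = np.random.default_rng(0)
x = rng.uniform(-np.pi, np.pi, 40)
y = np.sin(x) + 0.05 * rng.standard_normal(40)     # noisy values
dy = np.cos(x) + 0.05 * rng.standard_normal(40)    # noisy samplewise derivatives

deg, lam = 5, 1.0
powers = np.arange(deg + 1)
V = x[:, None] ** powers                                   # value design matrix
dV = powers * x[:, None] ** np.clip(powers - 1, 0, None)   # derivative design matrix

A = np.vstack([V, np.sqrt(lam) * dV])
b = np.concatenate([y, np.sqrt(lam) * dy])
coef, *_ = np.linalg.lstsq(A, b, rcond=None)

xt = np.linspace(-np.pi, np.pi, 200)
print(np.max(np.abs(xt[:, None] ** powers @ coef - np.sin(xt))))   # fit error
```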

The change-plane Cox model is a popular tool for the subgroup analysis of survival data. Despite the rich literature on this model, there has been limited investigation into the asymptotic properties of the estimators of the finite-dimensional parameter. In particular, the convergence rate, let alone the asymptotic distribution, has not been fully characterized for the general model in which classification is based on multiple covariates. To bridge this theoretical gap, this study proposes a maximum smoothed partial likelihood estimator and establishes the following asymptotic properties. First, it shows that the convergence rate for the classification parameter can be arbitrarily close to $1/n$ up to a logarithmic factor under a certain condition on the covariates and the choice of tuning parameter. Given this convergence rate result, it also establishes the asymptotic normality for the regression parameter.
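
A sketch of the smoothing device under an assumed model form (not the paper's exact estimator): the change-plane indicator $\mathbf{1}\{w^\top\gamma > 0\}$ is replaced by the smooth surrogate $\Phi(w^\top\gamma/h)$ inside a Cox partial likelihood, making the objective differentiable in $\gamma$.

```python
import numpy as np
from scipy.stats import norm

def smoothed_partial_loglik(beta, alpha, gamma, h, T, delta, x, z, w):
    """Smoothed change-plane Cox partial log-likelihood (assumed form):
    the subgroup indicator 1{w'gamma > 0} is replaced by Phi(w'gamma / h)."""
    eta = x @ beta + alpha * z * norm.cdf((w @ gamma) / h)
    loglik = 0.0
    for i in np.where(delta == 1)[0]:
        risk_set = T >= T[i]                        # subjects still at risk at T[i]
        loglik += eta[i] - np.log(np.sum(np.exp(eta[risk_set])))
    return loglik

# Synthetic data just to exercise the function.
rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal((n, 2))
z = rng.standard_normal(n)
w = rng.standard_normal((n, 2))
T = rng.exponential(1.0, n)
delta = rng.integers(0, 2, n)
print(smoothed_partial_loglik(np.ones(2), 0.5, np.array([1.0, -1.0]), 0.1,
                              T, delta, x, z, w))
```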

In this paper we establish accuracy bounds of Prony's method (PM) for recovery of sparse measures from incomplete and noisy frequency measurements, or the so-called problem of super-resolution, when the minimal separation between the points in the support of the measure may be much smaller than the Rayleigh limit. In particular, we show that PM is optimal with respect to the previously established min-max bound for the problem, in the setting when the measurement bandwidth is constant, with the minimal separation going to zero. Our main technical contribution is an accurate analysis of the inter-relations between the different errors in each step of PM, resulting in previously unnoticed cancellations. We also prove that PM is numerically stable in finite-precision arithmetic. We believe our analysis will pave the way to providing accurate analysis of known algorithms for the super-resolution problem in full generality.
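
For reference, a bare-bones implementation of classical Prony's method on noiseless data; the paper's contribution is the accuracy and stability analysis, not the algorithm itself, and the node/amplitude values below are arbitrary illustrative choices:

```python
import numpy as np

def prony(m, s):
    """Classical Prony's method: recover s nodes z_j and amplitudes a_j from
    samples m_k = sum_j a_j * z_j**k, k = 0, ..., len(m)-1 (len(m) >= 2s)."""
    N = len(m)
    # Step 1: fit the annihilating (Prony) polynomial via a Hankel least-squares system.
    H = np.array([m[k:k + s] for k in range(N - s)])
    rhs = -np.array(m[s:N])
    c, *_ = np.linalg.lstsq(H, rhs, rcond=None)      # coefficients c_0, ..., c_{s-1}
    # Step 2: the nodes are the roots of z^s + c_{s-1} z^{s-1} + ... + c_0.
    z = np.roots(np.concatenate(([1.0], c[::-1])))
    # Step 3: amplitudes from the Vandermonde least-squares system.
    V = np.vander(z, N, increasing=True).T           # row k holds z_j**k
    a, *_ = np.linalg.lstsq(V, m, rcond=None)
    return z, a

# Exact data: two closely spaced nodes on the unit circle, no noise.
z_true = np.exp(1j * np.array([0.0, 0.1]))
a_true = np.array([1.0, 2.0])
m = np.array([np.sum(a_true * z_true**k) for k in range(8)])
print(prony(m, 2))
```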

We study exact algorithms for Euclidean TSP in $\mathbb{R}^d$. In the early 1990s algorithms with $n^{O(\sqrt{n})}$ running time were presented for the planar case, and some years later an algorithm with $n^{O(n^{1-1/d})}$ running time was presented for any $d\geq 2$. Despite significant interest in subexponential exact algorithms over the past decade, there has been no progress on Euclidean TSP, except for a lower bound stating that the problem admits no $2^{O(n^{1-1/d-\epsilon})}$ algorithm unless ETH fails. Up to constant factors in the exponent, we settle the complexity of Euclidean TSP by giving a $2^{O(n^{1-1/d})}$ algorithm and by showing that a $2^{o(n^{1-1/d})}$ algorithm does not exist unless ETH fails.
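
For contrast, the classical Held-Karp dynamic program solves TSP exactly in $O(2^n n^2)$ time, exponentially slower than the $2^{O(n^{1-1/d})}$ algorithm above; a compact sketch of that baseline:

```python
import itertools
import math

def held_karp(points):
    """Classical Held-Karp dynamic program: exact TSP in O(2^n * n^2) time.
    (Baseline only; the paper's algorithm runs in 2^{O(n^{1-1/d})} time.)"""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # dp[(S, j)]: shortest path from vertex 0 visiting exactly the set S, ending at j.
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in itertools.combinations(range(1, n), size):
            S = sum(1 << j for j in subset)
            for j in subset:
                prev = S ^ (1 << j)
                dp[(S, j)] = min(dp[(prev, k)] + dist[k][j] for k in subset if k != j)
    full = (1 << n) - 2  # all vertices except the start vertex 0
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

pts = [(0, 0), (0, 1), (2, 1), (3, 0), (1, -1)]
print(held_karp(pts))
```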

In this paper, we study error bounds for {\em Bayesian quadrature} (BQ), with an emphasis on noisy settings, randomized algorithms, and average-case performance measures. We seek to approximate the integral of functions in a {\em Reproducing Kernel Hilbert Space} (RKHS), particularly focusing on the Mat\'ern-$\nu$ and squared exponential (SE) kernels, with samples from the function potentially being corrupted by Gaussian noise. We provide a two-step meta-algorithm that serves as a general tool for relating the average-case quadrature error with the $L^2$-function approximation error. When specialized to the Mat\'ern kernel, we recover an existing near-optimal error rate while avoiding the existing method of repeatedly sampling points. When specialized to other settings, we obtain new average-case results for settings including the SE kernel with noise and the Mat\'ern kernel with misspecification. Finally, we present algorithm-independent lower bounds that have greater generality and/or give distinct proofs compared to existing ones.
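
A minimal Bayesian quadrature sketch for the SE kernel on [0, 1] with noisy samples; the lengthscale, noise level, and test integrand are arbitrary illustrative choices, not taken from the paper:

```python
import numpy as np
from scipy.special import erf

def se_kernel(a, b, ell):
    """Squared exponential (SE) kernel matrix."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

def se_kernel_mean(x, ell):
    """z_i = int_0^1 k(t, x_i) dt for the SE kernel (closed form via erf)."""
    return ell * np.sqrt(np.pi / 2) * (erf((1 - x) / (np.sqrt(2) * ell))
                                       - erf((0 - x) / (np.sqrt(2) * ell)))

rng = np.random.default_rng(0)
ell, sigma = 0.2, 0.05
f = lambda t: np.sin(2 * np.pi * t) + t ** 2      # exact integral on [0, 1] is 1/3
x = rng.uniform(0, 1, 30)
y = f(x) + sigma * rng.standard_normal(30)        # noisy function values

K = se_kernel(x, x, ell) + sigma ** 2 * np.eye(len(x))   # noisy Gram matrix
z = se_kernel_mean(x, ell)                               # kernel mean embedding
weights = np.linalg.solve(K, z)                          # BQ quadrature weights
print(weights @ y, 1 / 3)                                # estimate vs. exact integral
```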

In the usual Bayesian setting, a full probabilistic model is required to link the data and parameters, and the form of this model and the inference and prediction mechanisms are specified via de Finetti's representation. In general, such a formulation is not robust to model mis-specification of its component parts. An alternative approach is to draw inference based on loss functions, where the quantity of interest is defined as a minimizer of some expected loss, and to construct posterior distributions based on the loss-based formulation; this strategy underpins the construction of the Gibbs posterior. We develop a Bayesian non-parametric approach; specifically, we generalize the Bayesian bootstrap, and specify a Dirichlet process model for the distribution of the observables. We implement this using direct prior-to-posterior calculations, but also using predictive sampling. We also study the assessment of posterior validity for non-standard Bayesian calculations, and provide an efficient way to calibrate the scaling parameter in the Gibbs posterior so that it can achieve the desired coverage rate. We show that the developed non-standard Bayesian updating procedures yield valid posterior distributions in terms of consistency and asymptotic normality under model mis-specification. Simulation studies show that the proposed methods can recover the true value of the parameter efficiently and achieve frequentist coverage even when the sample size is small. Finally, we apply our methods to evaluate the causal impact of speed cameras on traffic collisions in England.
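
A minimal loss-based Bayesian bootstrap sketch (the noninformative limit of the Dirichlet-process formulation, with the squared loss, so each posterior draw is a Dirichlet-weighted mean); the data-generating choices are purely illustrative:

```python
import numpy as np

# Loss-based Bayesian bootstrap: the parameter is the minimizer of an expected
# loss (squared loss here, so the minimizer is the mean), and posterior draws
# are obtained by minimizing the loss reweighted by Dirichlet(1, ..., 1) weights.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=2.0, size=100)

draws = []
for _ in range(5000):
    w = rng.dirichlet(np.ones(len(data)))    # Bayesian bootstrap weights
    draws.append(np.sum(w * data))           # argmin_theta sum_i w_i (x_i - theta)^2
draws = np.array(draws)

print(draws.mean(), np.quantile(draws, [0.025, 0.975]))
```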

In this work, we present a deterministic algorithm for computing the entire weight distribution of polar codes. As the first step, we derive an efficient recursive procedure to compute the weight distribution that arises in successive cancellation decoding of polar codes along any decoding path. This solves the open problem recently posed by Polyanskaya, Davletshin, and Polyanskii. Using this recursive procedure, at code length $n$, we can compute the weight distribution of any polar coset in time $O(n^2)$. We show that any polar code can be represented as a disjoint union of such polar cosets; moreover, this representation extends to polar codes with dynamically frozen bits. However, the number of polar cosets in such a representation scales exponentially with a parameter introduced herein, which we call the mixing factor. To upper bound the complexity of our algorithm for polar codes that are decreasing monomial codes, we study the range of their mixing factors. We prove that among all decreasing monomial codes with rates at most 1/2, self-dual Reed-Muller codes have the largest mixing factors. To further reduce the complexity of our algorithm, we make use of the fact that, as decreasing monomial codes, polar codes have a large automorphism group. That automorphism group includes the block lower-triangular affine group (BLTA), which in turn contains the lower-triangular affine group (LTA). We prove that a subgroup of LTA acts transitively on certain subsets of decreasing monomial codes, thereby drastically reducing the number of polar cosets that we need to evaluate. This complexity reduction makes it possible to compute the weight distribution of polar codes at length $n = 128$.
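
At toy scale one can sanity-check a weight distribution by brute-force enumeration, which is exponential in the code dimension and exactly what the algorithm above avoids; the length-8 information set {3, 5, 6, 7} below is an illustrative choice (it yields the self-dual Reed-Muller code RM(1, 3)):

```python
import itertools
import numpy as np

# Brute-force baseline: enumerate all codewords of a toy length-8 polar code
# and tally Hamming weights. The information set {3, 5, 6, 7} gives RM(1, 3).
F = np.array([[1, 0], [1, 1]])
G8 = np.kron(np.kron(F, F), F) % 2        # polarization transform F^{(x)3}
info_rows = [3, 5, 6, 7]                  # unfrozen (information) positions
G = G8[info_rows, :]

weights = np.zeros(9, dtype=int)
for u in itertools.product([0, 1], repeat=len(info_rows)):
    cw = np.array(u) @ G % 2
    weights[int(cw.sum())] += 1

print(weights)   # index = Hamming weight; expect 1, 14, 1 at weights 0, 4, 8
```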
