
Consider the setting where a $\rho$-sparse Rademacher vector is planted in a random $d$-dimensional subspace of $\mathbb{R}^n$. A classical question is how to recover this planted vector given a random basis of the subspace. A recent result of [ZSWB21] showed that the lattice basis reduction algorithm can recover the planted vector when $n\geq d+1$. Although the algorithm is not expected to tolerate an inverse-polynomial amount of noise, the result is surprising because it was previously shown that recovery cannot be achieved by low-degree polynomials when $n\ll \rho^2 d^{2}$ [MW21]. A natural question is whether we can derive a Statistical Query (SQ) lower bound matching the low-degree lower bound of [MW21]. Such a bound would (i) imply that the SQ lower bound can be surpassed by lattice-based algorithms, and (ii) predict the computational hardness when the planted vector is perturbed by an inverse-polynomial amount of noise. In this paper, we prove such an SQ lower bound. In particular, we show that a super-polynomial number of VSTAT queries is needed to solve the easier statistical testing problem when $n\ll \rho^2 d^{2}$ and $\rho\gg \frac{1}{\sqrt{d}}$. The main technique we use to derive the SQ lower bound is the near-equivalence between SQ lower bounds and low-degree lower bounds [BBH+20, MW21].
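To make the planted-vector setting concrete, the following is a minimal sketch of how one instance could be sampled. It is not taken from the paper. The function names, the unnormalized $\pm 1$ entries, and the Gaussian choice for the remaining spanning vectors and for the mixing matrix are all assumptions made for illustration.

```python
import random


def sparse_rademacher(n, rho, rng):
    """Sample a rho-sparse Rademacher vector (assumed unnormalized):
    each entry is 0 with probability 1 - rho, otherwise +1 or -1
    with equal probability."""
    return [rng.choice((-1.0, 1.0)) if rng.random() < rho else 0.0
            for _ in range(n)]


def planted_instance(n, d, rho, seed=0):
    """Return (v, basis): the planted vector v and a random basis
    (d rows of length n) of a subspace that contains v.

    Assumption: the subspace is spanned by v and d - 1 standard
    Gaussian vectors, and the basis is hidden by a Gaussian mixing
    matrix, which is invertible with probability one.
    """
    rng = random.Random(seed)
    v = sparse_rademacher(n, rho, rng)
    span = [v] + [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(d - 1)]
    mix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    # Each basis row is a random linear combination of the spanning
    # vectors, so v is no longer visible as a single row.
    basis = [[sum(mix[i][k] * span[k][j] for k in range(d))
              for j in range(n)]
             for i in range(d)]
    return v, basis


if __name__ == "__main__":
    v, basis = planted_instance(n=12, d=4, rho=0.3, seed=1)
    print("planted vector:", v)
    print("basis shape:", len(basis), "x", len(basis[0]))
```

The recovery problem asks for `v` (up to sign) given only `basis`. The testing problem mentioned above asks instead to distinguish such a basis from one of a fully random subspace.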

Related content

Learning precise surrogate models of complex computer simulations and physical machines often requires long-lasting or expensive experiments. Furthermore, the modeled physical dependencies exhibit nonlinear and nonstationary behavior. Machine learning methods used to produce the surrogate model should therefore keep the number of queries small, e.g. by using active learning, and be able to capture the nonlinear and nonstationary properties of the system. One way of modeling the nonstationarity is to induce input-partitioning, a principle that has proven to be advantageous in active learning for Gaussian processes. However, these methods either assume a known partitioning, need to introduce complex sampling schemes, or rely on very simple geometries. In this work, we present a simple yet powerful kernel family that incorporates a partitioning that: i) is learnable via gradient-based methods, ii) uses a geometry that is more flexible than previous ones, while still being applicable in the low data regime. Thus, it provides a good prior for active learning procedures. We empirically demonstrate excellent performance on various active learning tasks.

Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Consider a convex body $K$ of diameter $\Delta$ in $\textbf{R}^d$ for fixed $d$. The objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. It is known from classical results of Dudley (1974) and Bronshteyn and Ivanov (1976) that $\Theta((\Delta/\varepsilon)^{(d-1)/2})$ vertices (alternatively, facets) are both necessary and sufficient. While this bound is tight in the worst case, that of Euclidean balls, it is far from optimal for skinny convex bodies. A natural way to characterize a convex object's skinniness is in terms of its relationship to the Euclidean ball. Given a convex body $K$, define its \emph{volume diameter} $\Delta_d$ to be the diameter of a Euclidean ball of the same volume as $K$, and define its \emph{surface diameter} $\Delta_{d-1}$ analogously for surface area. It follows from generalizations of the isoperimetric inequality that $\Delta \geq \Delta_{d-1} \geq \Delta_d$. Arya, da Fonseca, and Mount (SoCG 2012) demonstrated that the diameter-based bound could be made surface-area sensitive, improving the above bound to $O((\Delta_{d-1}/\varepsilon)^{(d-1)/2})$. In this paper, we strengthen this by proving the existence of an approximation with $O((\Delta_d/\varepsilon)^{(d-1)/2})$ facets.

The support vector machine (SVM) is a powerful classifier used for binary classification to improve the prediction accuracy. However, the non-differentiability of the SVM hinge loss function can lead to computational difficulties in high dimensional settings. To overcome this problem, we rely on Bernstein polynomials and propose a new smoothed version of the SVM hinge loss called the Bernstein support vector machine (BernSVM), which is suitable for the high dimension $p \gg n$ regime. As the BernSVM objective loss function is of the class $C^2$, we propose two efficient algorithms for computing the solution of the penalized BernSVM. The first algorithm is based on coordinate descent with the maximization-majorization (MM) principle and the second one is an IRLS-type (iterative re-weighted least squares) algorithm. Under standard assumptions, we derive a cone condition and a restricted strong convexity to establish an upper bound for the weighted Lasso BernSVM estimator. Using a local linear approximation, we extend the latter result to penalized BernSVM with the non-convex penalties SCAD and MCP. Our bound holds with high probability and achieves a rate of order $\sqrt{s\log(p)/n}$, where $s$ is the number of active features. Simulation studies are considered to illustrate the prediction accuracy of BernSVM relative to its competitors and also to compare the performance of the two algorithms in terms of computational timing and error estimation. The use of the proposed method is illustrated through analysis of three large-scale real data examples.

This paper is concerned with low-rank matrix optimization, which has found a wide range of applications in machine learning. This problem in the special case of matrix sensing has been studied extensively through the notion of Restricted Isometry Property (RIP), leading to a wealth of results on the geometric landscape of the problem and the convergence rate of common algorithms. However, the existing results can handle the problem in the case with a general objective function subject to noisy data only when the RIP constant is close to 0. In this paper, we develop a new mathematical framework to solve the above-mentioned problem with a far less restrictive RIP constant. We prove that as long as the RIP constant of the noiseless objective is less than $1/3$, any spurious local solution of the noisy optimization problem must be close to the ground truth solution. By working through the strict saddle property, we also show that an approximate solution can be found in polynomial time. We characterize the geometry of the spurious local minima of the problem in a local region around the ground truth in the case when the RIP constant is greater than $1/3$. Compared to the existing results in the literature, this paper offers the strongest RIP bound and provides a complete theoretical analysis on the global and local optimization landscapes of general low-rank optimization problems under random corruptions from any finite-variance family.

We investigate the problem of bandits with expert advice when the experts are fixed and known distributions over the actions. Improving on previous analyses, we show that the regret in this setting is controlled by information-theoretic quantities that measure the similarity between experts. In some natural special cases, this allows us to obtain the first regret bound for EXP4 that can get arbitrarily close to zero if the experts are similar enough. For a different algorithm, we provide another bound that describes the similarity between the experts in terms of the KL-divergence, and we show that this bound can be smaller than the one for EXP4 in some cases. Additionally, we provide lower bounds for certain classes of experts showing that the algorithms we analyzed are nearly optimal in some cases.

In the literature, the cost of a partitioned fluid-structure interaction scheme is typically assessed by the number of coupling iterations required per time step, while ignoring the internal iterations within the nonlinear subproblems. In this work, we demonstrate that these internal iterations have a significant influence on the computational cost of the coupled simulation. Particular attention is paid to how limiting the number of iterations within each solver call can shorten the overall run time, as it avoids polishing the subproblem solution using unconverged coupling data. Based on systematic parameter studies, we investigate the optimal number of subproblem iterations per coupling step. Lastly, this work proposes a new convergence criterion for coupled systems that is based on the residuals of the subproblems and therefore does not require any additional convergence tolerance for the coupling loop.

Following the breakthrough work of Tardos in the bit-complexity model, Vavasis and Ye gave the first exact algorithm for linear programming in the real model of computation with running time depending only on the constraint matrix. For solving a linear program (LP) $\max\, c^\top x,\: Ax = b,\: x \geq 0,\: A \in \mathbb{R}^{m \times n}$, Vavasis and Ye developed a primal-dual interior point method using a 'layered least squares' (LLS) step, and showed that $O(n^{3.5} \log (\bar{\chi}_A+n))$ iterations suffice to solve (LP) exactly, where $\bar{\chi}_A$ is a condition measure controlling the size of solutions to linear systems related to $A$. Monteiro and Tsuchiya, noting that the central path is invariant under rescalings of the columns of $A$ and $c$, asked whether there exists an LP algorithm depending instead on the measure $\bar{\chi}^*_A$, defined as the minimum $\bar{\chi}_{AD}$ value achievable by a column rescaling $AD$ of $A$, and gave strong evidence that this should be the case. We resolve this open question affirmatively. Our first main contribution is an $O(m^2 n^2 + n^3)$ time algorithm which works on the linear matroid of $A$ to compute a nearly optimal diagonal rescaling $D$ satisfying $\bar{\chi}_{AD} \leq n(\bar{\chi}^*)^3$. This algorithm also allows us to approximate the value of $\bar{\chi}_A$ up to a factor $n (\bar{\chi}^*)^2$. As our second main contribution, we develop a scaling invariant LLS algorithm, together with a refined potential function based analysis for LLS algorithms in general. With this analysis, we derive an improved $O(n^{2.5} \log n\log (\bar{\chi}^*_A+n))$ iteration bound for optimally solving (LP) using our algorithm. The same argument also yields a factor $n/\log n$ improvement on the iteration complexity bound of the original Vavasis-Ye algorithm.

Our research deals with the optimization version of the set partition problem, where the objective is to minimize the absolute difference between the sums of the two disjoint partitions. Although this problem is known to be NP-hard and requires exponential time to solve, we propose a less demanding version of this problem where the goal is to find a locally optimal solution. In our approach, we consider local optimality with respect to any movement of at most two elements. To accomplish this, we developed an algorithm that can generate a locally optimal solution in at most $O(N^2)$ time and $O(N)$ space. Our algorithm can handle arbitrary input precisions and does not require positive or integer inputs. Hence, it can be applied in various problem scenarios with ease.
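The notion of local optimality under moves of at most two elements can be illustrated with a naive local search. This sketch is not the authors' algorithm and does not attain their $O(N^2)$ time bound, since every improvement pass alone already inspects $O(N^2)$ candidate moves. The function name and the starting assignment (all elements on one side) are assumptions chosen for brevity.

```python
from itertools import combinations


def local_search_partition(values):
    """Naive local search for two-way partitioning.

    signs[i] is +1 if values[i] is on side A and -1 if on side B, so
    diff = sum(signs[i] * values[i]) is (sum of A) - (sum of B).
    A move transfers one or two elements to the opposite side, i.e.
    flips one or two signs. The search stops when no such move
    strictly reduces |diff|.
    """
    n = len(values)
    signs = [1] * n              # assumed start: everything on side A
    diff = sum(values)
    moves = [(i,) for i in range(n)] + list(combinations(range(n), 2))
    improved = True
    while improved:
        improved = False
        for move in moves:
            # Flipping element i changes diff by -2 * signs[i] * values[i].
            new_diff = diff - 2 * sum(signs[i] * values[i] for i in move)
            if abs(new_diff) < abs(diff):
                for i in move:
                    signs[i] = -signs[i]
                diff = new_diff
                improved = True
                break            # rescan all moves from the new state
    return signs, diff


if __name__ == "__main__":
    vals = [3, 1, 1, 2, 2, 1]
    signs, diff = local_search_partition(vals)
    side_a = [v for s, v in zip(signs, vals) if s == 1]
    side_b = [v for s, v in zip(signs, vals) if s == -1]
    print(side_a, side_b, "difference:", abs(diff))
```

Because every accepted move strictly decreases $|\text{diff}|$ and there are finitely many partitions, the loop terminates. The result is locally optimal in the sense above, but it need not be a global optimum.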

In this article, we propose a class of $L_q$-norm based U-statistics for a family of global testing problems related to high-dimensional data. This includes testing of mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing of component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive asymptotic normality and independence between $L_q$-norm based U-statistics for several $q$s under mild moment and cumulant conditions. A simple combination of two studentized $L_q$-based test statistics via their $p$-values is proposed and is shown to attain great power against alternatives of different sparsity. Our work is a substantial extension of He et al. (2021), which is mostly focused on mean and covariance testing, and we manage to provide a general treatment of asymptotic independence of $L_q$-norm based U-statistics for a wide class of kernels. To alleviate the computation burden, we introduce a variant of the proposed U-statistics by using the monotone indices in the summation, resulting in a U-statistic with asymmetric kernel. A dynamic programming method is introduced to reduce the computational cost from $O(n^{qr})$, which is required for the calculation of the full U-statistic, to $O(n^r)$ where $r$ is the order of the kernel. Numerical studies further corroborate the advantage of the proposed adaptive test as compared to some existing competitors.

The Chebyshev or $\ell_{\infty}$ estimator is an unconventional alternative to ordinary least squares for solving linear regressions. It is defined as the minimizer of the $\ell_{\infty}$ objective function \begin{align*} \hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}} \|\boldsymbol{Y} - \mathbf{X}\boldsymbol{\beta}\|_{\infty}. \end{align*} The asymptotic distribution of the Chebyshev estimator under a fixed number of covariates was recently studied (Knight, 2020), yet finite sample guarantees and generalizations to high-dimensional settings remain open. In this paper, we develop non-asymptotic upper bounds on the estimation error $\|\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}^*\|_2$ for a Chebyshev estimator $\hat{\boldsymbol{\beta}}$, in a regression setting with uniformly distributed noise $\varepsilon_i\sim U([-a,a])$ where $a$ is either known or unknown. With relatively mild assumptions on the (random) design matrix $\mathbf{X}$, we can bound the error rate by $\frac{C_p}{n}$ with high probability, for some constant $C_p$ depending on the dimension $p$ and the law of the design. Furthermore, we illustrate that there exist designs for which the Chebyshev estimator is (nearly) minimax optimal. On the other hand, we also argue that there exist designs for which this estimator behaves sub-optimally in terms of the constant $C_p$'s dependence on $p$. In addition, we show that "Chebyshev's LASSO" has advantages over the regular LASSO in high dimensional situations, provided that the noise is uniform. Specifically, we argue that it achieves a much faster rate of estimation under certain assumptions on the growth rate of the sparsity level and the ambient dimension with respect to the sample size.
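As a small illustration of the $\ell_{\infty}$ objective, consider the simplest special case, in which $\mathbf{X}$ is a single column of ones. The minimizer then has a closed form, the midrange $(\min_i Y_i + \max_i Y_i)/2$. The sketch below covers only this intercept-only case and is not the general estimator studied in the paper, which requires solving a linear program. The function names and the simulation parameters are assumptions.

```python
import random


def chebyshev_location(y):
    """l_inf estimator for the intercept-only model y_i = beta + eps_i:
    the midrange minimizes max_i |y_i - beta|."""
    return (min(y) + max(y)) / 2.0


def sup_norm_residual(y, beta):
    """The l_inf objective max_i |y_i - beta|."""
    return max(abs(v - beta) for v in y)


def simulate(n, beta_star=2.0, a=1.0, seed=0):
    """Draw y_i = beta_star + U([-a, a]) noise and return the errors of
    the Chebyshev (midrange) and least-squares (mean) estimates."""
    rng = random.Random(seed)
    y = [beta_star + rng.uniform(-a, a) for _ in range(n)]
    midrange_error = abs(chebyshev_location(y) - beta_star)
    mean_error = abs(sum(y) / n - beta_star)
    return midrange_error, mean_error


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        mid_err, mean_err = simulate(n)
        print(f"n={n}: midrange error {mid_err:.5f}, mean error {mean_err:.5f}")
```

Under uniform noise, the midrange error in this toy case is driven by how close the sample extremes come to the endpoints $\pm a$, which is the intuition behind a rate of order $1/n$ rather than the usual $1/\sqrt{n}$.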
