List-decoding and list-recovery are important generalizations of unique decoding that have received considerable attention over the years. However, in both problems the optimal trade-off among the list-decoding (resp. list-recovery) radius, the list size, and the code rate is not fully understood. This paper takes a step in this direction when the list size is a given constant and the alphabet size is large (as a function of the code length). We prove a new Singleton-type upper bound for list-decodable codes, which improves upon the previously known bound by roughly a factor of $1/L$, where $L$ is the list size. We also prove a Singleton-type upper bound for list-recoverable codes, which is, to the best of our knowledge, the first such bound for list-recovery. We apply these results to obtain new lower bounds on the list size, optimal up to a multiplicative constant, for list-decodable and list-recoverable codes with rates approaching capacity. Moreover, we show that list-decodable \emph{nonlinear} codes can strictly outperform list-decodable linear codes. More precisely, we show that for a wide range of parameters there is a gap, which grows quickly with the alphabet size, between the size of the largest list-decodable nonlinear code and that of the largest list-decodable linear code. This is achieved by a novel connection between list-decoding and the notion of sparse hypergraphs in extremal combinatorics. We remark that no such gap is known to exist in the problem of unique decoding. Lastly, we show that list-decodability or list-recoverability of a code implies, in a certain sense, good unique-decodability.
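
For context, the following LaTeX fragment recalls the classical Singleton bound and the rate-radius trade-off it yields for unique decoding (list size $L=1$). This is standard background only; the $L$-dependent Singleton-type bounds announced above are the paper's contribution and are not reproduced here.

```latex
% Classical Singleton bound for a code C over a q-ary alphabet, of length n,
% minimum distance d, and rate R = \log_q|C| / n:
\[
  |C| \;\le\; q^{\,n-d+1},
  \qquad\text{equivalently}\qquad
  d \;\le\; n - \log_q |C| + 1 .
\]
% Unique decoding (list size L = 1) corrects up to \lfloor (d-1)/2 \rfloor errors,
% so the normalized unique-decoding radius obeys
\[
  \rho \;\le\; \frac{d-1}{2n} \;\le\; \frac{1 - R}{2}.
\]
% Singleton-type bounds for list-decoding replace the factor 1/2 by an
% L-dependent factor; the improved constants are the subject of the abstract above.
```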

Related Content

We provide the first generalization error analysis for black-box learning through derivative-free optimization. Under the assumption of a Lipschitz and smooth unknown loss, we consider the Zeroth-order Stochastic Search (ZoSS) algorithm, which updates a $d$-dimensional model by replacing stochastic gradient directions with stochastic differences of $K+1$ perturbed loss evaluations per dataset (example) query. For both unbounded and bounded, possibly nonconvex, losses, we present the first generalization bounds for the ZoSS algorithm. These bounds coincide with those for SGD and, rather surprisingly, are independent of $d$, $K$ and the batch size $m$, under appropriate choices of a slightly decreased learning rate. For bounded nonconvex losses and a batch size $m=1$, we additionally show that both the generalization error and the learning rate are independent of $d$ and $K$, and remain essentially the same as for SGD, even with only two function evaluations. Our results extensively extend, and consistently recover, established results for SGD in prior work, on both generalization bounds and the corresponding learning rates. If additionally $m=n$, where $n$ is the dataset size, we derive generalization guarantees for full-batch GD as well.
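
A minimal sketch of a zeroth-order update of the kind described above (illustrative only; the perturbation scheme, scaling, and learning rate below are assumptions, not the paper's exact ZoSS specification), replacing a stochastic gradient with differences of $K+1$ loss evaluations per queried example:

```python
import numpy as np

def zoss_step(w, loss_fn, example, K=4, mu=1e-3, lr=1e-2, rng=np.random.default_rng(0)):
    """One zeroth-order update: estimate a descent direction from K+1 loss
    evaluations (one at w, K at randomly perturbed points) instead of a gradient."""
    d = w.shape[0]
    base = loss_fn(w, example)                       # 1st evaluation
    grad_est = np.zeros(d)
    for _ in range(K):                               # K additional evaluations
        u = rng.standard_normal(d)
        grad_est += (loss_fn(w + mu * u, example) - base) / mu * u
    grad_est /= K
    return w - lr * grad_est                         # SGD-style step with the surrogate direction

# toy usage: least-squares loss on a single (x, y) example
loss = lambda w, ex: 0.5 * (ex[0] @ w - ex[1]) ** 2
w = np.zeros(5)
x, y = np.ones(5), 2.0
for _ in range(200):
    w = zoss_step(w, loss, (x, y))
```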

Recently, codes in the sum-rank metric have attracted attention due to several applications, e.g., in multishot network coding, distributed storage and quantum-resistant cryptography. The sum-rank analogs of Reed-Solomon and Gabidulin codes are linearized Reed-Solomon codes. We show how to construct $h$-folded linearized Reed-Solomon (FLRS) codes and derive an interpolation-based decoding scheme that is capable of correcting sum-rank errors beyond the unique decoding radius. The presented decoder can be used for either list or probabilistic unique decoding and requires at most $\mathcal{O}(sn^2)$ operations in $\mathbb{F}_{q^m}$, where $s \leq h$ is an interpolation parameter and $n$ denotes the length of the unfolded code. We derive a heuristic upper bound on the failure probability of the probabilistic unique decoder and verify the results via Monte Carlo simulations.
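
Since the abstract builds on $h$-folding, here is a minimal sketch of the folding step alone (the grouping convention below is an assumption and may differ from the paper's definition; the interpolation-based decoder itself is not sketched): a length-$n$ codeword is rearranged into $n/h$ columns of $h$ symbols, each column treated as one symbol of the folded code.

```python
import numpy as np

def fold(codeword, h):
    """Group h consecutive symbols into one column: column j holds
    c[j*h], ..., c[j*h + h - 1], giving an h x (n/h) folded representation."""
    n = len(codeword)
    assert n % h == 0, "the folding parameter h must divide the code length n"
    return np.asarray(codeword).reshape(n // h, h).T

c = list(range(12))     # stand-in for a codeword of length n = 12 (symbols shown as integers)
print(fold(c, h=3))     # 3 x 4 array: each column is one symbol of the folded code
```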

Sparse binary matrices are of great interest in the field of compressed sensing. This class of matrices makes it possible to perform signal recovery with lower storage costs and faster decoding algorithms. In particular, random matrices formed by i.i.d. Bernoulli $p$ random variables are of practical relevance in the context of nonnegative sparse recovery. In this work, we investigate the robust nullspace property of sparse Bernoulli $p$ matrices. Previous results in the literature establish that such matrices can accurately recover $n$-dimensional $s$-sparse vectors with $m=O\left (\frac{s}{c(p)}\log\frac{en}{s}\right )$ measurements, where $c(p) \le p$ is a constant that depends only on the parameter $p$. These results suggest that, when $p$ vanishes, the sparse Bernoulli matrix requires considerably more measurements than the minimum achieved by standard isotropic subgaussian designs. We show that this is not true. Our main result characterizes, for a wide range of sparsity levels $s$, the smallest $p$ such that sparse recovery is possible with the minimal number of measurements. We also provide matching lower bounds to establish the optimality of our results.
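
A hedged sketch of the measurement model (the constant factor, the proxy $c(p) \approx p$, and the NNLS recovery routine below are illustrative choices, not taken from the paper): draw an $m \times n$ Bernoulli $p$ matrix with $m$ scaled as $\frac{s}{c(p)}\log\frac{en}{s}$ and recover a nonnegative $s$-sparse vector.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n, s, p = 1000, 3, 0.1
m = int(np.ceil((s / p) * np.log(np.e * n / s)))    # proxy: c(p) ~ p, constant factor 1

A = (rng.random((m, n)) < p).astype(float)          # i.i.d. Bernoulli(p) entries
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.random(s) + 0.5   # nonnegative s-sparse signal
y = A @ x

x_hat, _ = nnls(A, y)                               # nonnegative least-squares recovery
print("relative error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```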

In this paper, we study the convergence properties of off-policy policy improvement algorithms with state-action density ratio correction in the function approximation setting, where the objective function is formulated as a max-max-min optimization problem. We characterize the bias of the learning objective and present two strategies with finite-time convergence guarantees. In our first strategy, we present algorithm P-SREDA with convergence rate $O(\epsilon^{-3})$, whose dependency on $\epsilon$ is optimal. In our second strategy, we propose a new off-policy actor-critic style algorithm named O-SPIM. We prove that O-SPIM converges to a stationary point with total complexity $O(\epsilon^{-4})$, which matches the convergence rate of some recent actor-critic algorithms in the on-policy setting.
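
To make the "state-action density ratio correction" ingredient concrete, here is a generic, self-normalized reweighting sketch (purely illustrative; it is not P-SREDA or O-SPIM, and the ratios are assumed to be given rather than learned):

```python
import numpy as np

def corrected_value_estimate(rewards, ratios):
    """Off-policy value estimate: reweight observed rewards by an (estimated)
    state-action density ratio w(s, a) = d_target(s, a) / d_behavior(s, a),
    then self-normalize."""
    ratios = np.asarray(ratios, dtype=float)
    return float(np.sum(ratios * np.asarray(rewards, dtype=float)) / np.sum(ratios))

# toy usage with hypothetical off-policy data and ratio estimates
print(corrected_value_estimate(rewards=[1.0, 0.0, 0.5], ratios=[2.0, 0.5, 1.0]))
```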

We use results from communication complexity, both old and new, to prove lower bounds for unambiguous finite automata (UFAs). We show three results. $\textit{Complement:}$ There is a language $L$ recognised by an $n$-state UFA such that the complement language $\overline{L}$ requires NFAs with $n^{\tilde{\Omega}(\log n)}$ states. This improves on a lower bound by Raskin. $\textit{Union:}$ There are languages $L_1$, $L_2$ recognised by $n$-state UFAs such that the union $L_1\cup L_2$ requires UFAs with $n^{\tilde{\Omega}(\log n)}$ states. $\textit{Separation:}$ There is a language $L$ such that both $L$ and $\overline{L}$ are recognised by $n$-state NFAs but such that $L$ requires UFAs with $n^{\Omega(\log n)}$ states. This refutes a conjecture by Colcombet.

Benign overfitting demonstrates that overparameterized models can perform well on test data while fitting noisy training data. However, existing analyses consider only the final min-norm solution in linear regression, ignoring the algorithm and the corresponding training procedure. In this paper, we generalize the idea of benign overfitting from the min-norm solution to the whole training trajectory and derive a time-variant bound based on trajectory analysis. Starting from the time-variant bound, we further derive a time interval that suffices to guarantee a consistent generalization error for a given feature covariance. Unlike existing approaches, the newly proposed generalization bound is characterized by a time-variant effective dimension of the feature covariance. By introducing the time factor, we relax the strict assumption on the feature covariance matrix required by previous benign-overfitting analyses in the regime of overparameterized linear regression with gradient descent. This paper extends the scope of benign overfitting, and experimental results indicate that the proposed bound accords better with empirical evidence.
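
Since the contrast above is between the min-norm solution and the whole gradient-descent trajectory, the following numpy sketch (a textbook overparameterized linear-regression setup, not the paper's bound) shows GD started at zero, whose iterates approach the min-norm interpolant while the distance to the ground-truth parameter changes along the trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 500                                    # overparameterized: d >> n
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)     # noisy labels

w_min_norm = np.linalg.pinv(X) @ y                # min-norm interpolant
w, lr = np.zeros(d), 1.0 / np.linalg.norm(X, 2) ** 2
for t in range(1, 2001):
    w -= lr * X.T @ (X @ w - y)                   # full-batch gradient descent
    if t in (10, 100, 1000, 2000):
        # distance to the min-norm solution vs. distance to the ground truth
        print(t, np.linalg.norm(w - w_min_norm), np.linalg.norm(w - w_star))
```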

We propose a metric -- Projection Norm -- to predict a model's performance on out-of-distribution (OOD) data without access to ground truth labels. Projection Norm first uses model predictions to pseudo-label test samples and then trains a new model on the pseudo-labels. The more the new model's parameters differ from an in-distribution model, the greater the predicted OOD error. Empirically, our approach outperforms existing methods on both image and text classification tasks and across different network architectures. Theoretically, we connect our approach to a bound on the test error for overparameterized linear models. Furthermore, we find that Projection Norm is the only approach that achieves non-trivial detection performance on adversarial examples. Our code is available at //github.com/yaodongyu/ProjNorm.
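
A minimal sketch of the Projection Norm procedure as described above (the authors' implementation is in the linked repository; the model class, training from scratch rather than fine-tuning, and the plain Euclidean distance below are simplifying assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def projection_norm(model, X_test):
    """Projection-Norm-style score: pseudo-label the unlabeled test set, train a
    fresh model on the pseudo-labels, and measure the parameter shift; a larger
    shift predicts a larger OOD error."""
    pseudo = model.predict(X_test)                                   # pseudo-labels
    new = LogisticRegression(max_iter=1000).fit(X_test, pseudo)      # model trained on pseudo-labels
    return float(np.linalg.norm(new.coef_ - model.coef_))            # distance to the ID model

# toy usage (illustrative): score an unlabeled, shifted test set
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20)); y = (X[:, 0] > 0).astype(int)
model = LogisticRegression(max_iter=1000).fit(X, y)
print(projection_norm(model, rng.standard_normal((500, 20)) + 1.5))
```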

The theory of reinforcement learning currently suffers from a mismatch between its empirical performance and the theoretical characterization of its performance, with consequences for, e.g., the understanding of sample efficiency, safety, and robustness. The linear quadratic regulator with unknown dynamics is a fundamental reinforcement learning setting with significant structure in its dynamics and cost function, yet even in this setting there is a gap between the best known regret lower bound of $\Omega_p(\sqrt{T})$ and the best known upper bound of $O_p(\sqrt{T}\,\text{polylog}(T))$. The contribution of this paper is to close that gap by establishing a novel regret upper bound of $O_p(\sqrt{T})$. Our proof is constructive in that it analyzes the regret of a concrete algorithm, and simultaneously establishes an estimation error bound on the dynamics of $O_p(T^{-1/4})$, which is also the first to match the rate of a known lower bound. The two keys to our improved proof technique are (1) more precise upper and lower bounds on the system Gram matrix and (2) a self-bounding argument for the expected estimation error of the optimal controller.
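
As a generic illustration of the dynamics-estimation step referred to above (ordinary least-squares system identification with an assumed exploratory input, not the paper's algorithm or its $O_p(T^{-1/4})$ analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])             # dynamics, unknown to the learner
B = np.array([[0.0], [1.0]])

x, X, U, Xn = np.zeros(2), [], [], []
for t in range(2000):
    u = rng.standard_normal(1)                      # exploratory input
    x_next = A @ x + B @ u + 0.1 * rng.standard_normal(2)   # process noise
    X.append(x); U.append(u); Xn.append(x_next)
    x = x_next

Z = np.hstack([np.array(X), np.array(U)])           # regressors [x_t, u_t]
Theta, *_ = np.linalg.lstsq(Z, np.array(Xn), rcond=None)    # Xn ~ Z @ [A B]^T
A_hat, B_hat = Theta.T[:, :2], Theta.T[:, 2:]
print("estimation error:", np.linalg.norm(np.hstack([A_hat - A, B_hat - B])))
```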

This paper is concerned with list decoding of $2$-interleaved binary alternant codes. The principle of the proposed algorithm is based on a combination of a list decoding algorithm for (interleaved) Reed-Solomon codes and an algorithm for (non-interleaved) alternant codes. A new upper bound on the decoding radius is derived and the list size is shown to scale polynomially in the code parameters. While it remains an open problem whether this upper bound is achievable, the provided simulation results show that a decoding radius exceeding the binary Johnson radius can be achieved with a high probability of decoding success by the proposed algorithm.
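
For reference, a short LaTeX reminder of the binary Johnson radius mentioned above (standard background; the new decoding-radius bound derived in the paper is not restated here):

```latex
% Binary Johnson radius: any binary code of relative minimum distance \delta is
% combinatorially list-decodable with polynomial list size up to the radius
\[
  \tau_J(\delta) \;=\; \tfrac{1}{2}\left(1 - \sqrt{1 - 2\delta}\right),
\]
% which the proposed interleaved decoder is observed, in simulations, to exceed
% with high probability of decoding success.
```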

We consider the well-studied problem of learning intersections of halfspaces under the Gaussian distribution in the challenging \emph{agnostic learning} model. Recent work of Diakonikolas et al. (2021) shows that any Statistical Query (SQ) algorithm for agnostically learning the class of intersections of $k$ halfspaces over $\mathbb{R}^n$ to constant excess error either must make queries of tolerance at most $n^{-\tilde{\Omega}(\sqrt{\log k})}$ or must make $2^{n^{\Omega(1)}}$ queries. We strengthen this result by improving the tolerance requirement to $n^{-\tilde{\Omega}(\log k)}$. This lower bound is essentially best possible since an SQ algorithm of Klivans et al. (2008) agnostically learns this class to any constant excess error using $n^{O(\log k)}$ queries of tolerance $n^{-O(\log k)}$. We prove two variants of our lower bound, each of which combines ingredients from Diakonikolas et al. (2021) with (an extension of) a different earlier approach for agnostic SQ lower bounds for the Boolean setting due to Dachman-Soled et al. (2014). Our approach also yields lower bounds for agnostically SQ learning the class of "convex subspace juntas" (studied by Vempala, 2010) and the class of sets with bounded Gaussian surface area; all of these lower bounds are nearly optimal since they essentially match known upper bounds from Klivans et al. (2008).
