Polynomial-based approaches, such as the Mat-Dot and entangled polynomial (EP) codes, have been used extensively within coded matrix computations to obtain schemes with good thresholds. However, these schemes are well known to suffer from poor numerical stability in decoding. Moreover, the encoding process in these schemes involves linearly combining a large number of input submatrices, i.e., the encoding weight is high. For the practically relevant case of sparse input matrices, this can have the undesirable effect of significantly increasing the worker node computation time. In this work, we propose a generalization of the EP scheme that combines the idea of gradient coding with the basic EP encoding. Our scheme allows us to reduce the weight of the encoding and arrive at schemes that exhibit much better numerical stability; this is achieved at the expense of a worse threshold. By appropriately setting parameters in our scheme, we recover several well-known schemes in the literature. Simulation results show that our scheme provides excellent numerical stability and fast computation speed (for sparse input matrices) compared with EP and Mat-Dot codes.
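To make the encoding weight and the interpolation-based decoding concrete, here is a minimal sketch of the classical Mat-Dot baseline that the abstract compares against; it is not the proposed generalization. The helper names (`matdot_multiply`, `lagrange_coeff`, and so on) are hypothetical, the evaluation points are arbitrary, and exact rational arithmetic is an assumption made here to sidestep the floating-point instability the abstract mentions.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin_comb(coeffs, mats):
    """Entrywise linear combination sum_k coeffs[k] * mats[k]."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[r][s] for c, M in zip(coeffs, mats)) for s in range(cols)]
            for r in range(rows)]

def split_columns(A, m):
    """Split A into m equal-width column blocks A_0, ..., A_{m-1}."""
    w = len(A[0]) // m
    return [[row[i * w:(i + 1) * w] for row in A] for i in range(m)]

def split_rows(B, m):
    """Split B into m equal-height row blocks B_0, ..., B_{m-1}."""
    h = len(B) // m
    return [B[i * h:(i + 1) * h] for i in range(m)]

def lagrange_coeff(points, k, power):
    """Coefficient of x**power in the k-th Lagrange basis polynomial."""
    poly = [Fraction(1)]  # coefficients, lowest degree first
    denom = Fraction(1)
    for j, xj in enumerate(points):
        if j == k:
            continue
        # Multiply the running polynomial by (x - xj).
        shifted = [Fraction(0)] + poly
        scaled = [xj * c for c in poly] + [Fraction(0)]
        poly = [s - t for s, t in zip(shifted, scaled)]
        denom *= points[k] - xj
    return poly[power] / denom

def matdot_multiply(A, B, m, points):
    """Recover A @ B from the 2m-1 workers whose evaluation points are given."""
    assert len(points) == 2 * m - 1 and len(set(points)) == len(points)
    A_blocks, B_blocks = split_columns(A, m), split_rows(B, m)
    results = []
    for x in points:
        # Encoding weight m: every encoded matrix mixes all m blocks.
        enc_A = lin_comb([x ** i for i in range(m)], A_blocks)
        enc_B = lin_comb([x ** (m - 1 - j) for j in range(m)], B_blocks)
        results.append(matmul(enc_A, enc_B))  # the work done by one worker
    # A @ B is the coefficient of x^(m-1) in the degree-(2m-2) product polynomial.
    weights = [lagrange_coeff(points, k, m - 1) for k in range(len(points))]
    return lin_comb(weights, results)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
# Five workers could hold points 1..5; any three responses suffice when m = 2.
assert matdot_multiply(A, B, 2, [1, 3, 5]) == matmul(A, B)
```

Each encoded matrix is a combination of all `m` blocks, so sparse inputs become denser at the workers, and decoding amounts to polynomial interpolation, which is the step that becomes ill-conditioned in floating point as `m` grows.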

Related content

Consider the unsupervised classification problem in random hypergraphs under the non-uniform \emph{Hypergraph Stochastic Block Model} (HSBM) with two equal-sized communities ($n/2$), where each edge appears independently with some probability depending only on the labels of its vertices. In this paper, an \emph{information-theoretical} threshold for strong consistency is established. Below the threshold, every algorithm would misclassify at least two vertices with high probability, and the expected \emph{mismatch ratio} of the eigenvector estimator is upper bounded by $n$ to the power of minus the threshold. On the other hand, when above the threshold, despite the information loss induced by tensor contraction, one-stage spectral algorithms assign every vertex correctly with high probability when only given the contracted adjacency matrix, even if \emph{semidefinite programming} (SDP) fails in some scenarios. Moreover, strong consistency is achievable by aggregating information from all uniform layers, even if it is impossible when each layer is considered alone. Our conclusions are supported by both theoretical analysis and numerical experiments.

We develop an algorithmic framework that finds an optimal solution by enumerating a set of feasible solutions whose number is bounded by a specially derived Variable Parameter (VP) with favorable asymptotic behavior. We build a VP algorithm for a strongly $\mathsf{NP}$-hard single-machine scheduling problem. The target VP $\nu$ is the number of jobs with some special properties, the so-called emerging jobs. In phase 1, a partial solution including $n-\nu$ non-emerging jobs is constructed in low-degree polynomial time. In phase 2, fewer than $\nu!$ permutations of the $\nu$ emerging jobs are considered, each of them being incorporated into the partial schedule of phase 1. In an earlier experimental study, $\nu/n$ varied in practice from $1/4$ for small problem instances to $1/10$ for the largest tested instances. We illustrate how the proposed method can be used to build a polynomial-time approximation scheme (PTAS) with worst-case time complexity $O(\kappa!\kappa k n \log n)$, where $\kappa$, $\kappa<\nu< n$, is a VP and the corresponding approximation factor is $1+1/k$, with $k\kappa<k$. This is better than the time complexity of the earlier known approximation schemes. Using an intuitive probabilistic model, we give more realistic bounds on the running time of the VP algorithm and the PTAS, which are far below the worst-case bounds $\nu!$ and $\kappa!$.

We present a new approach to semiparametric inference using corrected posterior distributions. The method allows us to leverage the adaptivity, regularization and predictive power of nonparametric Bayesian procedures to estimate low-dimensional functionals of interest without being restricted by the holistic Bayesian formalism. Starting from a conventional nonparametric posterior, we target the functional of interest by transforming the entire distribution with a Bayesian bootstrap correction. We provide conditions for the resulting $\textit{one-step posterior}$ to possess calibrated frequentist properties and specialize the results for several canonical examples: the integrated squared density, the mean of a missing-at-random outcome, and the average causal treatment effect on the treated. The procedure is computationally attractive, requiring only a simple, efficient post-processing step that can be attached onto any arbitrary posterior sampling algorithm. Using the ACIC 2016 causal data analysis competition, we illustrate that our approach can outperform the existing state-of-the-art through the propagation of Bayesian uncertainty.

Matrix valued data has become increasingly prevalent in many applications. Most of the existing clustering methods for this type of data are tailored to the mean model and do not account for the dependence structure of the features, which can be very informative, especially in high-dimensional settings. To extract the information from the dependence structure for clustering, we propose a new latent variable model for the features arranged in matrix form, with some unknown membership matrices representing the clusters for the rows and columns. Under this model, we further propose a class of hierarchical clustering algorithms using the difference of a weighted covariance matrix as the dissimilarity measure. Theoretically, we show that under mild conditions, our algorithm attains clustering consistency in the high-dimensional setting. While this consistency result holds for our algorithm with a broad class of weighted covariance matrices, the conditions for this result depend on the choice of the weight. To investigate how the weight affects the theoretical performance of our algorithm, we establish the minimax lower bound for clustering under our latent variable model. Given these results, we identify the optimal weight in the sense that using this weight guarantees our algorithm to be minimax rate-optimal in terms of the magnitude of some cluster separation metric. The practical implementation of our algorithm with the optimal weight is also discussed. Finally, we conduct simulation studies to evaluate the finite sample performance of our algorithm and apply the method to a genomic dataset.

Hamilton and Moitra (2021) showed that, in certain regimes, it is not possible to accelerate Riemannian gradient descent in the hyperbolic plane if we restrict ourselves to algorithms which make queries in a (large) bounded domain and which receive gradients and function values corrupted by a (small) amount of noise. We show that acceleration remains unachievable for any deterministic algorithm which receives exact gradient and function-value information (unbounded queries, no noise). Our results hold for the classes of strongly and nonstrongly geodesically convex functions, and for a large class of Hadamard manifolds including hyperbolic spaces and the symmetric space $\mathrm{SL}(n) / \mathrm{SO}(n)$ of positive definite $n \times n$ matrices of determinant one. This cements a surprising gap between the complexity of convex optimization and geodesically convex optimization: for hyperbolic spaces, Riemannian gradient descent is optimal on the class of smooth and strongly geodesically convex functions, in the regime where the condition number scales with the radius of the optimization domain. The key idea for proving the lower bound consists of perturbing the hard functions of Hamilton and Moitra (2021) with sums of bump functions chosen by a resisting oracle.

We approximate the d complex zeros of a univariate polynomial p(x) of degree d, or those zeros that lie in a fixed region of interest on the complex plane such as a disc or a square. Our divide-and-conquer algorithm of STOC 1995 solves this problem in optimal Boolean time (up to a poly-logarithmic factor), that is, it runs nearly as fast as one can access the coefficients of p with the precision necessary to support the required accuracy of the output. That record complexity has not yet been matched by any other algorithm, but our root-finder of 1995 is quite involved and has never been implemented. We present alternative nearly optimal root-finders based on our novel variants of the classical subdivision iterations. Unlike their predecessor of 1995, they require randomization of Las Vegas type, which allows us to detect any output error at a dominated computational cost, but they are much simpler to implement. According to the results of extensive tests with standard test polynomials for their preliminary version, which incorporates only a part of our novel techniques, the new root-finders compete with and, for a large class of inputs, significantly outperform MPSolve, the package of root-finding subroutines that has been the users' package of choice for decades. Unlike their predecessor of 1995 and all known fast algorithms for the cited tasks of polynomial root-finding, our new algorithms can also be applied to a polynomial given by a black-box oracle for its evaluation rather than by its coefficients. This makes our root-finders particularly efficient for polynomials p(x) that can be evaluated fast, such as the Mandelbrot polynomials or those given by the sum of a small number of shifted monomials. Our algorithm can be readily extended to fast approximation of the eigenvalues of a matrix or a matrix polynomial.

The proliferation of automated data collection schemes and the advances in sensorics are increasing the amount of data we are able to monitor in real-time. However, given the high annotation costs and the time required by quality inspections, data is often available in an unlabeled form. This is fostering the use of active learning for the development of soft sensors and predictive models. In production, instead of performing random inspections to obtain product information, labels are collected by evaluating the information content of the unlabeled data. Several query strategy frameworks for regression have been proposed in the literature but most of the focus has been dedicated to the static pool-based scenario. In this work, we propose a new strategy for the stream-based scenario, where instances are sequentially offered to the learner, which must instantaneously decide whether to perform the quality check to obtain the label or discard the instance. The approach is inspired by the optimal experimental design theory and the iterative aspect of the decision-making process is tackled by setting a threshold on the informativeness of the unlabeled data points. The proposed approach is evaluated using numerical simulations and the Tennessee Eastman Process simulator. The results confirm that selecting the examples suggested by the proposed algorithm allows for a faster reduction in the prediction error.
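The thresholding idea can be illustrated with a minimal sketch, assuming a linear model and a D-optimality-style informativeness score; the abstract does not give the paper's actual criterion or threshold rule, so everything below (the score `x^T M^{-1} x`, the ridge term, the function names) is an assumption for illustration only. By the matrix determinant lemma, labelling a point `x` multiplies the determinant of the information matrix `M` by `1 + x^T M^{-1} x`, so this score measures how much a new label would add.

```python
def mat_vec(P, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

def informativeness(P, x):
    """Score x^T P x, where P is the inverse information matrix M^{-1}."""
    return sum(a * b for a, b in zip(x, mat_vec(P, x)))

def sherman_morrison(P, x):
    """Inverse of (M + x x^T) given P = M^{-1}, for symmetric M."""
    Px = mat_vec(P, x)
    denom = 1.0 + sum(a * b for a, b in zip(x, Px))
    d = len(x)
    return [[P[i][j] - Px[i] * Px[j] / denom for j in range(d)] for i in range(d)]

def stream_select(stream, threshold, ridge=1.0):
    """Decide, one instance at a time, whether to request its label.

    Returns the indices of the instances that were queried.
    """
    d = len(stream[0])
    # Start from M = ridge * I so that the inverse exists (an assumption).
    P = [[(1.0 / ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    queried = []
    for t, x in enumerate(stream):
        if informativeness(P, x) >= threshold:
            queried.append(t)           # perform the quality check
            P = sherman_morrison(P, x)  # the design now includes x
        # otherwise the instance is discarded and never revisited
    return queried

stream = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
print(stream_select(stream, threshold=0.6))
```

In this toy stream, a direction that has already been labelled once scores lower than an unseen direction, so repeated inputs fall below the threshold and are skipped.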

Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions. However, existing constraints on MBC architectures lead to models with limited expressive power. Additionally, prior work has not addressed how to deal with large sets during training when the full set gradient is required. To address these issues, we propose a Universally MBC (UMBC) class of set functions which can be used in conjunction with arbitrary non-MBC components while still satisfying MBC, enabling a wider range of function classes to be used in MBC settings. Furthermore, we propose an efficient MBC training algorithm which gives an unbiased approximation of the full set gradient and has a constant memory overhead for any set size for both train- and test-time. We conduct extensive experiments including image completion, text classification, unsupervised clustering, and cancer detection on high-resolution images to verify the efficiency and efficacy of our scalable set encoding framework. Our code is available at github.com/jeffwillette/umbc
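The mini-batch consistency property itself can be shown with a small sketch. This is plain sum pooling, i.e. the kind of restricted MBC architecture the abstract contrasts with, not the UMBC construction; the feature map `phi` and all function names are hypothetical. Each chunk is reduced to a mergeable state, and the final output is identical for every partition of the set.

```python
from fractions import Fraction

def phi(x):
    """Hypothetical per-element feature map."""
    return (x, x * x)

def encode_chunk(chunk):
    """Summarise one chunk as (feature sums, element count)."""
    feats = [phi(x) for x in chunk]
    sums = tuple(sum(col) for col in zip(*feats)) if feats else (0, 0)
    return sums, len(chunk)

def merge(state_a, state_b):
    """Combine two chunk summaries; associative and commutative."""
    (sums_a, n_a), (sums_b, n_b) = state_a, state_b
    return tuple(a + b for a, b in zip(sums_a, sums_b)), n_a + n_b

def decode(state):
    """Map the aggregated state to the output (here: mean features)."""
    sums, n = state
    return tuple(Fraction(s, n) for s in sums)

def set_function(chunks):
    """Process chunks sequentially while holding only one running state."""
    state = ((0, 0), 0)
    for chunk in chunks:
        state = merge(state, encode_chunk(chunk))
    return decode(state)

data = [3, 1, 4, 1, 5, 9]
whole = set_function([data])
# Any partition and any chunk order gives the same output.
assert whole == set_function([[3, 1], [4, 1, 5], [9]])
assert whole == set_function([[9], [5, 1, 4], [1, 3]])
```

Because only the running state is kept, memory does not grow with the set size; the price is that the output can depend on the set only through the pooled sums, which is the expressiveness limitation the abstract refers to.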

The design of codes for feedback-enabled communications has been a long-standing open problem. Recent research on non-linear, deep learning-based coding schemes has demonstrated significant improvements in communication reliability over linear codes, but these schemes are still vulnerable to the presence of forward and feedback noise over the channel. In this paper, we develop a new family of non-linear feedback codes that greatly enhance robustness to channel noise. Our autoencoder-based architecture is designed to learn codes based on consecutive blocks of bits, which obtains de-noising advantages over bit-by-bit processing to help overcome the physical separation between the encoder and decoder over a noisy channel. Moreover, we develop a power control layer at the encoder to explicitly incorporate hardware constraints into the learning optimization, and prove that the resulting average power constraint is satisfied asymptotically. Numerical experiments demonstrate that our scheme outperforms state-of-the-art feedback codes by wide margins over practical forward and feedback noise regimes, and provide information-theoretic insights on the behavior of our non-linear codes. Moreover, we observe that, in a long blocklength regime, canonical error correction codes are still preferable to feedback codes when the feedback noise becomes high.

A mainstream type of current self-supervised learning methods pursues a general-purpose representation that can be well transferred to downstream tasks, typically by optimizing on a given pretext task such as instance discrimination. In this work, we argue that existing pretext tasks inevitably introduce biases into the learned representation, which in turn leads to biased transfer performance on various downstream tasks. To cope with this issue, we propose Maximum Entropy Coding (MEC), a more principled objective that explicitly optimizes on the structure of the representation, so that the learned representation is less biased and thus generalizes better to unseen downstream tasks. Inspired by the principle of maximum entropy in information theory, we hypothesize that a generalizable representation should be the one that admits the maximum entropy among all plausible representations. To make the objective end-to-end trainable, we propose to leverage the minimal coding length in lossy data coding as a computationally tractable surrogate for the entropy, and further derive a scalable reformulation of the objective that allows fast computation. Extensive experiments demonstrate that MEC learns a more generalizable representation than previous methods based on specific pretext tasks. It achieves state-of-the-art performance consistently on various downstream tasks, including not only ImageNet linear probe, but also semi-supervised classification, object detection, instance segmentation, and object tracking. Interestingly, we show that existing batch-wise and feature-wise self-supervised objectives could be seen equivalent to low-order approximations of MEC. Code and pre-trained models are available at //github.com/xinliu20/MEC.
