Formal software verification techniques are widely used to specify and prove the functional correctness of programs. However, reasoning about nonfunctional properties such as time complexity is usually still carried out with pen and paper. Code that is inefficient in terms of time complexity can cause severe performance problems in large-scale, complex systems. We present a proof of concept for using the Dafny verification tool to specify and verify the worst-case time complexity of binary search. The approach can also serve an academic purpose as a new way to teach algorithms and complexity.
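
A minimal, runnable sketch of the property being formalized (this is plain Python, not the paper's Dafny proof): binary search instrumented with an iteration counter, checked against the worst-case bound floor(log2 n) + 1 that a Dafny specification could state as a ghost variable and postcondition. All names below are illustrative.

```python
import math

def binary_search(a, key):
    """Return (index of key in sorted list a or -1, number of loop iterations)."""
    lo, hi = 0, len(a) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid, iterations
        elif a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, iterations

if __name__ == "__main__":
    n = 1000
    a = list(range(n))
    worst = 0
    for key in range(-1, n + 1):          # includes keys that are absent
        _, its = binary_search(a, key)
        worst = max(worst, its)
    bound = math.floor(math.log2(n)) + 1  # the worst-case bound a verifier would establish
    assert worst <= bound
    print(f"n={n}, worst-case iterations={worst}, bound={bound}")
```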

Related content

We study communication systems over band-limited Additive White Gaussian Noise (AWGN) channels in which the transmitter output is constrained to be symmetric binary (bi-polar). In this work we improve on the original Ozarov-Wyner-Ziv (OWZ) lower bound on capacity by introducing new achievability schemes with two advantages over the OWZ scheme, which is based on peak-power-constrained pulse-amplitude modulation: our schemes achieve a moderately improved information rate, and they do so with far fewer sign transitions of the binary signal. The gap between the known upper bound, based on spectral constraints of bi-polar signals, and our achievable lower bound is reduced to 0.93 bits per Nyquist interval at high SNR.
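
As a simple reference point only (not the paper's improved schemes), the snippet below numerically evaluates the information rate of the most basic bi-polar strategy, equiprobable BPSK over a discrete-time AWGN channel, via I(X;Y) = h(Y) - h(Y|X) with the output mixture density integrated on a grid. Parameter names and the grid settings are illustrative assumptions.

```python
import numpy as np

def bpsk_awgn_rate(snr_db, grid=20001, span=12.0):
    """Mutual information (bits/channel use) of equiprobable +/-1 inputs over AWGN."""
    snr = 10 ** (snr_db / 10.0)
    sigma = 1.0 / np.sqrt(snr)                      # unit signal power, noise variance 1/SNR
    y = np.linspace(-span, span, grid)
    dy = y[1] - y[0]
    gauss = lambda m: np.exp(-(y - m) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    p_y = 0.5 * gauss(+1.0) + 0.5 * gauss(-1.0)     # output density
    h_y = -np.sum(p_y * np.log2(p_y + 1e-300)) * dy
    h_y_given_x = 0.5 * np.log2(2 * np.pi * np.e * sigma ** 2)
    return h_y - h_y_given_x

for snr_db in (0, 5, 10, 20):
    print(f"SNR={snr_db:>2} dB: I(X;Y) = {bpsk_awgn_rate(snr_db):.3f} bit/use")
```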

We present a simple $O(n^4)$-time algorithm for computing optimal search trees with two-way comparisons. The only previous solution to this problem, by Anderson et al., has the same running time, but is significantly more complicated and is restricted to the variant where only successful queries are allowed. Our algorithm extends directly to solve the standard full variant of the problem, which also allows unsuccessful queries and for which no polynomial-time algorithm was previously known. The correctness proof of our algorithm relies on a new structural theorem for two-way-comparison search trees.
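
For background only: the sketch below is the textbook O(n^3) interval dynamic program for optimal binary search trees under three-way comparisons and successful queries. It is not the paper's O(n^4) algorithm for two-way comparisons, but it illustrates the interval-DP style that such algorithms build on.

```python
def optimal_bst_cost(weights):
    """weights[i] = access weight of key i (successful queries only)."""
    n = len(weights)
    pref = [0.0] * (n + 1)                 # prefix sums for O(1) interval weight
    for i, w in enumerate(weights):
        pref[i + 1] = pref[i] + w
    cost = [[0.0] * (n + 1) for _ in range(n + 1)]   # cost[i][j]: keys i..j-1
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            w = pref[j] - pref[i]
            best = min(cost[i][r] + cost[r + 1][j] for r in range(i, j))
            cost[i][j] = best + w
    return cost[0][n]

print(optimal_bst_cost([0.2, 0.1, 0.4, 0.3]))
```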

This article shows that any type of binary data can be described as a collection of variable-length codewords. This makes it possible to define an injective and surjective (i.e., bijective) mapping from the suggested codewords to the required codewords. Therefore, by substituting the new codewords, the binary data is transformed into another binary representation suited to the intended goal. One such goal is reducing data size: the original codewords of the binary data are replaced by Huffman codewords in order to shrink it. One feature of this method is that compression is non-negative for any type of binary data; that is, regardless of the size of the code table, the difference between the original data size and the compressed data size is greater than or equal to zero. Another important and practical feature is the use of symmetric codewords in place of the suggested codewords, which provides symmetry, reversibility, and error resistance through two-way decoding.
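
A minimal sketch of the codeword-replacement idea using byte-level Huffman codes; the article's specific guarantees (non-negative compression including the code table, symmetric codewords, two-way decoding) are not reproduced here, and the helper names are assumptions.

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict:
    """Map each byte value occurring in data to a Huffman codeword (bit string)."""
    freq = Counter(data)
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)                        # unique tiebreaker so dicts are never compared
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

data = b"this is an example of replacing fixed 8-bit codewords by huffman codewords"
code = huffman_code(data)
encoded = "".join(code[b] for b in data)
print(len(data) * 8, "bits originally ->", len(encoded), "bits after replacement")
```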

Unlike its intercept, a linear classifier's weight vector cannot be tuned by a simple grid search. Hence, this paper proposes weight vector tuning of a generic binary linear classifier through the parameterization of a decomposition of the discriminant by a scalar which controls the trade-off between conflicting informative and noisy terms. By varying this parameter, the original weight vector is modified in a meaningful way. Applying this method to a number of linear classifiers under a variety of data dimensionality and sample size settings reveals that the classification performance loss due to non-optimal native hyperparameters can be compensated for by weight vector tuning. This yields computational savings as the proposed tuning method reduces to tuning a scalar compared to tuning the native hyperparameter, which may involve repeated weight vector generation along with its burden of optimization, dimensionality reduction, etc., depending on the classifier. It is also found that weight vector tuning significantly improves the performance of Linear Discriminant Analysis (LDA) under high estimation noise. Proceeding from this second finding, an asymptotic study of the misclassification probability of the parameterized LDA classifier in the growth regime where the data dimensionality and sample size are comparable is conducted. Using random matrix theory, the misclassification probability is shown to converge to a quantity that is a function of the true statistics of the data. Additionally, an estimator of the misclassification probability is derived. Finally, computationally efficient tuning of the parameter using this estimator is demonstrated on real data.
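
The following is an illustrative analogue only, not the paper's decomposition: it grid-searches a single scalar that reshapes an LDA-style weight vector, here by interpolating between the (noisy) sample covariance and the identity before inverting, and picks the scalar by accuracy. All variable names and the particular parameterization are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 200
mu0, mu1 = np.zeros(d), np.full(d, 0.4)
X = np.vstack([rng.normal(mu0, 1.0, (n, d)), rng.normal(mu1, 1.0, (n, d))])
y = np.array([0] * n + [1] * n)

m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
S = 0.5 * np.cov(X[y == 0].T) + 0.5 * np.cov(X[y == 1].T)   # pooled covariance estimate

def weight(alpha):
    # the scalar alpha trades off the noisy covariance estimate against the identity
    return np.linalg.solve(alpha * np.eye(d) + (1 - alpha) * S, m1 - m0)

def accuracy(w):
    b = -0.5 * w @ (m0 + m1)            # midpoint intercept
    return np.mean((X @ w + b > 0) == (y == 1))

best = max(np.linspace(0.0, 1.0, 21), key=lambda a: accuracy(weight(a)))
print("best alpha:", best, "accuracy:", accuracy(weight(best)))
```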

We present and analyze a momentum-based gradient method for training linear classifiers with an exponentially-tailed loss (e.g., the exponential or logistic loss), which maximizes the classification margin on separable data at a rate of $\widetilde{\mathcal{O}}(1/t^2)$. This contrasts with a rate of $\mathcal{O}(1/\log(t))$ for standard gradient descent, and $\mathcal{O}(1/t)$ for normalized gradient descent. The momentum-based method is derived via the convex dual of the maximum-margin problem, and specifically by applying Nesterov acceleration to this dual, which yields a simple and intuitive method in the primal. This dual view can also be used to derive a stochastic variant, which performs adaptive non-uniform sampling via the dual variables.
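
A small illustration of the quantity at stake (this is plain gradient descent, not the accelerated dual method of the abstract): on a separable toy problem, the normalized margin min_i y_i <w, x_i> / ||w|| grows only slowly under standard gradient descent on the logistic loss, consistent with the O(1/log t) rate mentioned above. The data construction below is an assumption for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 2
A = rng.normal(size=(n, d))
A[:, 0] = np.abs(A[:, 0]) + 0.5          # label-folded points y_i * x_i; separable with margin >= 0.5 along e_1

def normalized_margin(w):
    return np.min(A @ w) / (np.linalg.norm(w) + 1e-12)

w = np.zeros(d)
lr = 1.0
for t in range(1, 20001):
    z = np.clip(A @ w, -50.0, 50.0)
    s = 1.0 / (1.0 + np.exp(z))                    # sigmoid(-y_i <w, x_i>)
    w -= lr * (-(A * s[:, None]).mean(0))          # gradient step on the mean logistic loss
    if t in (10, 100, 1000, 20000):
        print(f"t={t:>5}  normalized margin = {normalized_margin(w):.4f}")
```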

The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at //github.com/BayesWatch/nas-without-training.
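
A sketch of an activation-overlap score of the kind described, under our own simplifying assumptions: for an untrained ReLU network on a small batch, binarize each datapoint's activation pattern, form the pairwise agreement kernel, and take its log-determinant as the score. The architecture, initialization, and scoring details here are illustrative, not the paper's exact procedure.

```python
import numpy as np

def overlap_score(widths, X, rng):
    """Untrained fully-connected ReLU net with given hidden widths; returns log|K|."""
    codes = []
    H = X
    for w_in, w_out in zip([X.shape[1]] + widths[:-1], widths):
        W = rng.normal(0, np.sqrt(2.0 / w_in), size=(w_in, w_out))   # He-style init
        H = np.maximum(H @ W, 0.0)                                    # ReLU
        codes.append(H > 0.0)
    C = np.concatenate(codes, axis=1).astype(float)                   # binary activation codes
    # K[i, j] = number of units on which datapoints i and j agree (both on or both off)
    K = C @ C.T + (1 - C) @ (1 - C).T
    sign, logdet = np.linalg.slogdet(K)
    return logdet if sign > 0 else -np.inf

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 64))                      # a small random stand-in "minibatch"
for widths in ([32, 32], [256, 256], [512, 512, 512]):
    print(widths, "score:", round(overlap_score(widths, X, rng), 2))
```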

Training large deep neural networks on massive datasets is computationally very challenging. There has been a recent surge of interest in using large-batch stochastic optimization methods to tackle this issue. The most prominent algorithm in this line of research is LARS, which, by employing layerwise adaptive learning rates, trains ResNet on ImageNet in a few minutes. However, LARS performs poorly for attention models like BERT, indicating that its performance gains are not consistent across tasks. In this paper, we first study a principled layerwise adaptation strategy to accelerate training of deep neural networks using large mini-batches. Using this strategy, we develop a new layerwise adaptive large batch optimization technique called LAMB; we then provide convergence analysis of LAMB as well as LARS, showing convergence to a stationary point in general nonconvex settings. Our empirical results demonstrate the superior performance of LAMB across various tasks such as BERT and ResNet-50 training with very little hyperparameter tuning. In particular, for BERT training, our optimizer enables use of very large batch sizes of 32868 without any degradation of performance. By increasing the batch size to the memory limit of a TPUv3 Pod, BERT training time can be reduced from 3 days to just 76 minutes (Table 1).
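
A simplified sketch of the layerwise adaptation idea behind such optimizers: Adam-style moment updates per layer, with each layer's step rescaled by the ratio of the weight norm to the update norm. Bias correction and the exact trust-ratio clipping used in practice are omitted, and the function and state names are assumptions.

```python
import numpy as np

def lamb_like_step(params, grads, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-6, wd=0.01):
    """params, grads: lists of arrays (one per layer); state: dict of moment lists."""
    for i, (w, g) in enumerate(zip(params, grads)):
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * g * g
        update = state["m"][i] / (np.sqrt(state["v"][i]) + eps) + wd * w
        trust = np.linalg.norm(w) / (np.linalg.norm(update) + eps)   # layerwise trust ratio
        if np.linalg.norm(w) == 0.0:      # zero-initialized layer: fall back to plain step
            trust = 1.0
        w -= lr * trust * update          # in-place layerwise update

# toy usage with two "layers"
rng = np.random.default_rng(0)
params = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]
state = {"m": [np.zeros_like(p) for p in params], "v": [np.zeros_like(p) for p in params]}
for _ in range(5):
    grads = [2 * p for p in params]       # gradient of a sum-of-squares stand-in objective
    lamb_like_step(params, grads, state)
print("param norms after 5 steps:", [round(float(np.linalg.norm(p)), 3) for p in params])
```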

Deep Learning has enabled remarkable progress over the last years on a variety of tasks, such as image recognition, speech recognition, and machine translation. One crucial aspect of this progress is the development of novel neural architectures. Currently employed architectures have mostly been designed manually by human experts, which is a time-consuming and error-prone process. Because of this, there is growing interest in automated neural architecture search methods. We provide an overview of existing work in this field of research and categorize it according to three dimensions: search space, search strategy, and performance estimation strategy.

We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than those of solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and improved computational complexity compared with existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.

Robust estimation is much more challenging in high dimensions than it is in one dimension: most techniques either lead to intractable optimization problems or to estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial-time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as by giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and make high-dimensional robust estimation a realistic possibility.
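
A hedged sketch of a filtering-style robust mean estimator of the kind this line of work studies: repeatedly project onto the top eigenvector of the empirical covariance and trim the points that deviate most in that direction. The thresholds, stopping rule, and contamination model below are simplifying assumptions, not the exact algorithms analyzed in the paper.

```python
import numpy as np

def filtered_mean(X, eps, rounds=10):
    X = X.copy()
    for _ in range(rounds):
        mu = X.mean(0)
        cov = np.cov(X.T)
        evals, evecs = np.linalg.eigh(cov)
        if evals[-1] < 1.5:                            # covariance close to identity: stop
            break
        v = evecs[:, -1]                               # direction of largest variance
        scores = np.abs((X - mu) @ v)
        keep = scores <= np.quantile(scores, 1 - eps)  # drop the most deviant eps fraction
        X = X[keep]
    return X.mean(0)

rng = np.random.default_rng(0)
d, n, eps = 50, 2000, 0.1
inliers = rng.normal(0.0, 1.0, size=(int(n * (1 - eps)), d))
outliers = rng.normal(0.0, 1.0, size=(int(n * eps), d)) + 5.0 * np.eye(d)[0]  # shifted along e_1
X = np.vstack([inliers, outliers])
print("error of plain mean:   ", np.linalg.norm(X.mean(0)))
print("error of filtered mean:", np.linalg.norm(filtered_mean(X, eps)))
```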
