In this paper we discuss the application of Artificial Intelligence (AI) to the exemplary industrial use case of the two-dimensional commissioning problem in a high-bay storage, which can essentially be phrased as an instance of the Traveling Salesperson Problem (TSP). We investigate the mlrose library, which provides a TSP optimizer based on various heuristic optimization techniques. Our focus is on two methods provided by mlrose, namely the Genetic Algorithm (GA) and Hill Climbing (HC). We present improvements for both methods that yield shorter tour lengths while only moderately exploiting the problem structure of TSP; consequently, the proposed improvements have a generic character and are not limited to TSP.
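As a brief taste of the mlrose workflow under study, the sketch below sets up a small TSP instance and solves it with both GA and HC. The city coordinates are invented for illustration, and the hyper-parameters shown are library defaults rather than the tuned values or improvements proposed in this paper.

```python
# A minimal mlrose TSP sketch (illustrative coordinates, default
# hyper-parameters; not the improved variants proposed in the paper).
import mlrose

coords = [(0, 0), (3, 0), (3, 2), (2, 4), (1, 3), (0, 2)]
fitness = mlrose.TravellingSales(coords=coords)
problem = mlrose.TSPOpt(length=len(coords), fitness_fn=fitness, maximize=False)

# Genetic Algorithm: evolves a population of candidate tours.
ga_tour, ga_len = mlrose.genetic_alg(problem, pop_size=200, mutation_prob=0.1,
                                     max_attempts=100, random_state=42)

# Hill Climbing: repeatedly moves to a better neighboring tour.
hc_tour, hc_len = mlrose.hill_climb(problem, restarts=10, random_state=42)

print("GA tour length:", ga_len)
print("HC tour length:", hc_len)
```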
The $k$-Opt heuristic is a simple improvement heuristic for the Traveling Salesman Problem. It starts with an arbitrary tour and then repeatedly replaces $k$ edges of the tour by $k$ other edges, as long as this yields a shorter tour. We will prove that for 2-dimensional Euclidean Traveling Salesman Problems with $n$ cities the approximation ratio of the $k$-Opt heuristic is $\Theta(\log n / \log \log n)$. This improves the upper bound of $O(\log n)$ given by Chandra, Karloff, and Tovey in 1999 and provides for the first time a non-trivial lower bound for the case $k\ge 3$. Our results not only hold for the Euclidean norm but extend to arbitrary $p$-norms with $1 \le p < \infty$.
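To make the heuristic concrete, the following sketch implements its simplest case, 2-Opt, for points in the plane: exchanging the two edges $(a,b)$ and $(c,d)$ for $(a,c)$ and $(b,d)$ amounts to reversing the tour segment between them, and the exchange is kept whenever it shortens the tour. This is an illustrative implementation, not code from the paper.

```python
# An illustrative 2-Opt local search for the Euclidean TSP.
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def two_opt(points, tour):
    """Repeatedly exchange two tour edges while this shortens the tour."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # Skip j values whose edge is adjacent to edge i.
            for j in range(i + 2, n - (1 if i == 0 else 0)):
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Replacing edges (a,b),(c,d) by (a,c),(b,d) reverses
                # the segment tour[i+1..j].
                if dist(a, c) + dist(b, d) < dist(a, b) + dist(c, d) - 1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour
```

For $k \ge 3$, $k$-Opt generalizes this move by exchanging $k$ edges at a time, which is exactly the family of heuristics whose approximation ratio is analyzed above.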
Perception is a process that requires a great deal of mental processing; it provides the means by which one's concept of the environment is created and helps one learn about and interact with that environment. Studies compiled throughout history have led to the conclusion that auditory performance improves when combined with visual stimuli, and vice versa. Taking this into account, in the present work the two sensory pathways (vision and hearing) were used to carry out a series of multisensory training exercises, which were presented in different instances with the purpose of introducing sound as a signal-detection tool. A website was also developed to allow the execution of the designed training; it is still under development due to difficulties that arose and that exceed the scope of this final work. The work described in this report gave rise to a future doctoral thesis, supported by a CONICET scholarship, in which the development of new training exercises and the continued development of the website that will allow their execution are proposed.
Artistic pieces can be studied from several perspectives, one example being their reception among readers over time. In the present work, we approach this topic from the standpoint of literary works, particularly assessing the task of predicting whether a book will become a best seller. Unlike previous approaches, we focused on the full content of books and considered both visualization and classification tasks. We employed visualization for the preliminary exploration of the data structure and properties, involving SemAxis and linear discriminant analyses. Then, to obtain quantitative and more objective results, we employed various classifiers. These approaches were used along with a dataset containing (i) books published from 1895 to 1924 and recognized as best sellers by the Publishers Weekly Bestseller Lists and (ii) literary works published in the same period but not mentioned in those lists. Our comparison of methods revealed that the best result, obtained by combining a bag-of-words representation with a logistic regression classifier, was an average accuracy of 0.75 for both leave-one-out and 10-fold cross-validation. This outcome suggests that it is infeasible to predict the success of books with high accuracy using only the full content of the texts. Nevertheless, our findings provide insights into the factors leading to the relative success of a literary work.
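The best-performing pipeline described above is straightforward to reproduce in spirit; the sketch below combines a bag-of-words representation with logistic regression and evaluates it with 10-fold cross-validation using scikit-learn. The `texts` and `labels` values are toy stand-ins, not the book corpus used in the paper.

```python
# A sketch of the bag-of-words + logistic regression pipeline with 10-fold
# cross-validation; `texts` and `labels` are toy stand-ins for the corpus
# of full book contents described above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

texts = ["a tale of love and fortune", "an obscure treatise on moss"] * 10
labels = [1, 0] * 10  # 1 = best seller, 0 = not

pipeline = make_pipeline(
    CountVectorizer(),                  # bag-of-words term counts
    LogisticRegression(max_iter=1000),  # linear classifier on the counts
)
scores = cross_val_score(pipeline, texts, labels, cv=10, scoring="accuracy")
print("mean accuracy:", scores.mean())
```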
A salient characteristic of large pre-trained language models (PTLMs) is a remarkable improvement in their generalization capability and the emergence of new capabilities with increasing model capacity and pre-training dataset size. Consequently, we are witnessing the development of enormous models pushing the state-of-the-art. It is, however, imperative to realize that this inevitably leads to prohibitively long training times, extortionate computing costs, and a detrimental environmental impact. Significant efforts are underway to make PTLM training more efficient through innovations in model architectures, training pipelines, and loss function design, with scant attention being paid to optimizing the utility of training data. The key question we ask is whether it is possible to train PTLMs by employing only highly informative subsets of the training data while maintaining downstream performance. Building upon recent progress in informative data subset selection, we show how submodular optimization can be employed to select highly representative subsets of the training corpora. Our results demonstrate that the proposed framework can be applied to efficiently train multiple PTLMs (BERT, BioBERT, GPT-2) using only a fraction of the data while retaining up to $\sim99\%$ of the performance of the fully-trained models.
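Submodular subset selection of this kind can be illustrated with the classic greedy algorithm for the facility-location function, which at each step picks the example that most improves coverage of the full corpus in an embedding space. The sketch below is a generic illustration under the assumption of precomputed example embeddings, not the paper's exact framework.

```python
# A generic greedy sketch of submodular (facility-location) subset
# selection over precomputed example embeddings; illustrative only,
# not the paper's exact framework.
import numpy as np

def greedy_facility_location(embeddings, budget):
    """Greedily pick `budget` rows maximizing sum_i max_{j in S} sim(i, j)."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = X @ X.T                      # cosine similarity matrix
    n = sim.shape[0]
    selected, cover = [], np.zeros(n)  # cover[i] = max sim to chosen set
    for _ in range(budget):
        # Marginal gain of each candidate j: improvement in total coverage.
        gains = np.maximum(sim, cover[:, None]).sum(axis=0) - cover.sum()
        gains[selected] = -np.inf      # never re-pick a chosen example
        j = int(np.argmax(gains))
        selected.append(j)
        cover = np.maximum(cover, sim[:, j])
    return selected

rng = np.random.default_rng(0)
subset = greedy_facility_location(rng.normal(size=(1000, 64)), budget=100)
```

The facility-location function is monotone submodular, so this greedy rule enjoys the classical $(1 - 1/e)$ approximation guarantee; the selected subset then stands in for the full corpus during pre-training.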
A well-known problem when learning from user clicks is the presence of inherent biases in the data, such as position or trust bias. Click models are a common method for extracting information from user clicks, such as document relevance in web search, or for estimating click biases for downstream applications such as counterfactual learning-to-rank, ad placement, or fair ranking. Recent work shows that the current evaluation practices in the community fail to guarantee that a well-performing click model generalizes well to downstream tasks in which the ranking distribution differs from the training distribution, i.e., under covariate shift. In this work, we propose an evaluation metric based on conditional independence testing to detect a lack of robustness to covariate shift in click models. We introduce the concept of debiasedness and a metric for measuring it. We prove that debiasedness is a necessary condition for recovering unbiased and consistent relevance scores and for the invariance of click prediction under covariate shift. In extensive semi-synthetic experiments, we show that our proposed metric helps to predict the downstream performance of click models under covariate shift and is useful in an off-policy model selection setting.
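Conditional independence testing of this general flavor can be illustrated with a simple partial-correlation check: given true relevance, one asks whether a model's relevance estimates still carry information about the logging policy's ranking signal. The sketch below uses synthetic data and a linear partial-correlation test; it is a generic illustration only, not the paper's debiasedness metric.

```python
# A generic partial-correlation sketch of conditional independence
# testing on synthetic data; illustrative only, not the paper's
# debiasedness metric.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5000
relevance = rng.normal(size=n)                 # (unobserved) true relevance
position = relevance + rng.normal(size=n)      # logging-policy rank signal
estimate = relevance + 0.3 * position + rng.normal(size=n)  # biased model

def residualize(y, x):
    """Remove the linear effect of x from y."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Partial correlation of estimate and position given relevance:
r, p = stats.pearsonr(residualize(estimate, relevance),
                      residualize(position, relevance))
print(f"partial correlation={r:.3f}, p-value={p:.3g}")
# A significantly nonzero partial correlation flags residual bias.
```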
In view of recent developments on extended formulations (EFs) (e.g., Fiorini, S., S. Massar, S. Pokutta, H. R. Tiwary, and R. de Wolf [2015], "Exponential Lower Bounds for Polytopes in Combinatorial Optimization," Journal of the ACM 62:2), we focus in this paper on the question of whether it is possible to model an NP-complete problem as a polynomial-sized linear program. For simplicity of exposition, the discussion is focused on the TSP. We show that a finding that there exists no polynomial-sized extended formulation of "the TSP polytope" does not necessarily imply that it is "impossible" for a polynomial-sized linear program to solve the TSP optimization problem. We show that, under appropriate conditions, the TSP optimization problem can be solved without recourse to the traditional city-to-city ("travel leg") variables, thereby side-stepping "the TSP polytope" and, hence, the extended-formulation barriers. Some illustrative examples are discussed.
We obtain bounds to quantify the distributional approximation in the delta method for vector statistics (the sample mean of $n$ independent random vectors) for normal and non-normal limits, measured using smooth test functions. For normal limits, we obtain bounds with the optimal order $n^{-1/2}$ convergence rate, but for a wide class of non-normal limits, which includes quadratic forms amongst others, we achieve bounds with a faster order $n^{-1}$ convergence rate. We apply our general bounds to derive explicit bounds quantifying distributional approximations of an estimator of the Bernoulli variance, several statistics based on sample moments, order $n^{-1}$ bounds for the chi-square approximation of a family of rank-based statistics, and we also provide an efficient independent derivation of an order $n^{-1}$ bound for the chi-square approximation of Pearson's statistic. In establishing our general results, we generalise recent results on Stein's method for functions of multivariate normal random vectors to vector-valued functions and sums of independent random vectors whose components may be dependent. These bounds are widely applicable and are of independent interest.
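For orientation, the classical first-order delta method statement whose error is being quantified is the following (standard background, not a result of this paper): let $W_1, \dots, W_n$ be i.i.d. random $d$-vectors with mean $\mu$ and covariance $\Sigma$, and let $g : \mathbb{R}^d \to \mathbb{R}$ be differentiable at $\mu$. Then
\[
\sqrt{n}\,\bigl(g(\bar{W}_n) - g(\mu)\bigr) \xrightarrow{d} \nabla g(\mu)^\top Z, \qquad Z \sim \mathcal{N}_d(0, \Sigma),
\]
and non-normal limits such as quadratic forms arise when the first-order term vanishes (e.g., when $\nabla g(\mu) = 0$), which is the regime in which the faster order $n^{-1}$ rate becomes attainable.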
Code-based Language Models (LMs) have shown very promising results in the field of software engineering, with applications such as code refinement, code completion, and generation. However, the task of classifying time and space complexity from code has not been extensively explored due to a lack of datasets, with prior endeavors being limited to Java. In this project, we aim to address these gaps by creating a labelled dataset of code snippets spanning multiple languages (Python and C++ datasets currently, with C, C#, and JavaScript datasets being released shortly). We find that existing time complexity calculation libraries and tools apply only to a limited number of use-cases. The lack of a well-defined rule-based system motivates the application of several recently proposed code-based LMs. We demonstrate the effectiveness of dead code elimination and of increasing the maximum sequence length of LMs. In addition to time complexity, we propose to use LMs to find space complexities from code, and to the best of our knowledge, this is the first attempt to do so. Furthermore, we introduce a novel code comprehension task, called cross-language transfer, where we fine-tune the LM on one language and run inference on another. Finally, we visualize the activation of the attention-fed classification head of our LMs using Non-negative Matrix Factorization (NMF) to interpret our results.
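A pipeline of the kind described, fine-tuning a pre-trained code LM for complexity classification, can be sketched with the HuggingFace `transformers` API; the model name, label set, and data handling below are illustrative placeholders, not the project's exact configuration.

```python
# A sketch of fine-tuning a code LM for complexity classification with
# HuggingFace transformers; model, labels, and data are placeholders.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["O(1)", "O(log n)", "O(n)", "O(n log n)", "O(n^2)"]  # placeholder
tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=len(LABELS))

def encode(snippets):
    # A longer max_length lets the LM see more of each snippet (found
    # helpful above); 512 is this model's usual upper limit.
    return tokenizer(snippets, truncation=True, padding="max_length",
                     max_length=512, return_tensors="pt")

args = TrainingArguments(output_dir="complexity-clf", num_train_epochs=3,
                         per_device_train_batch_size=8)
# trainer = Trainer(model=model, args=args, train_dataset=...)  # labelled data
# trainer.train()
```

Cross-language transfer, as described above, would correspond to fine-tuning on snippets from one language and calling the trained model on snippets from another.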
High-complexity models are notorious in machine learning for overfitting, a phenomenon in which models represent data well but fail to generalize the underlying data-generating process. A typical procedure for circumventing overfitting computes the empirical risk on a holdout set and halts once the risk begins to increase (or flags that, or when, it has). Such practice often helps in outputting a well-generalizing model, but the justification for why it works is primarily heuristic. We discuss the overfitting problem and explain why standard asymptotic and concentration results do not hold for evaluation with training data. We then introduce and argue for a hypothesis test by means of which model performance may be evaluated using training data, and overfitting quantitatively defined and detected. We rely on concentration bounds, which guarantee that empirical means approximate their true mean with high probability, to conclude that two such empirical means should approximate each other. We stipulate conditions under which this test is valid, describe how the test may be used for identifying overfitting, articulate a further nuance according to which distributional shift may be flagged, and highlight an alternative notion of learning which usefully captures generalization in the absence of uniform PAC guarantees.
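The flavor of test described can be illustrated with Hoeffding's inequality: under the stipulated conditions in which both the training and holdout empirical risks concentrate around the same true risk, their gap should, with high probability, stay below the sum of the two deviation bounds, and a larger gap flags overfitting. The sketch below is a generic illustration of that logic, not the paper's exact test.

```python
# A generic Hoeffding-style sketch for flagging overfitting; illustrative,
# not the paper's exact hypothesis test. Losses are assumed in [0, 1].
import math

def overfitting_flagged(train_losses, holdout_losses, delta=0.05):
    """Flag if |train risk - holdout risk| exceeds the sum of Hoeffding
    deviation bounds, each holding with probability at least 1 - delta."""
    def eps(m):
        # Hoeffding: P(|mean - mu| >= eps) <= 2 * exp(-2 * m * eps**2).
        return math.sqrt(math.log(2 / delta) / (2 * m))
    gap = abs(sum(train_losses) / len(train_losses)
              - sum(holdout_losses) / len(holdout_losses))
    return gap > eps(len(train_losses)) + eps(len(holdout_losses))
```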
Since deep neural networks were developed, they have made huge contributions to everyday life. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks remain challenging and unpredictable procedures. To lower the technical threshold for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics in HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods for defining their value ranges. The paper then focuses on the major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. The study next reviews the major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, compatibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with the problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
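As a concrete taste of the kind of toolkit such a review covers, the sketch below tunes two common hyper-parameters with Optuna, one of several widely used HPO libraries; the objective shown is a stand-in for an actual train-and-validate loop.

```python
# A minimal HPO sketch using the Optuna toolkit; the objective below is a
# stand-in for a real train-and-validate loop.
import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)  # learning rate
    layers = trial.suggest_int("num_layers", 1, 8)        # model depth
    # Placeholder validation loss; in practice, train a model with these
    # hyper-parameters and return its validation metric.
    return (lr - 1e-3) ** 2 + 0.01 * layers

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print("best hyper-parameters:", study.best_params)
```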