
The Traveling Salesman Problem (TSP) is one of the most extensively researched and widely applied combinatorial optimization problems. It is NP-hard even in the symmetric and metric case. Building upon elaborate research, state-of-the-art exact solvers such as CONCORDE can solve TSP instances with tens of thousands of vertices. A key ingredient of these integer programming approaches is a fast heuristic for finding a good initial solution, in particular the Lin-Kernighan-Helsgaun (LKH) heuristic. For instances with a few hundred vertices, heuristics like LKH often find an optimal solution. In this work we develop variations of LKH that perform significantly better on large instances. LKH repeatedly improves an initially random tour by exchanging edges along alternating circles. In doing so, it respects several criteria designed to quickly find alternating circles that give a feasible improvement of the tour. Among those criteria, the positive gain criterion has stayed mostly untouched in previous research. It requires that, while constructing an alternating circle, the total gain be positive after each pair of edges. We carefully relax this criterion, which leads to improvement steps hitherto undiscovered by LKH. We confirm this improvement experimentally via extensive simulations on various benchmark libraries for TSP. Our computational study shows that for large instances our method is on average 13% faster than the latest version of LKH.
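
The positive gain criterion mentioned in the abstract can be illustrated with a check on partial sums. This is only a minimal sketch, not the LKH implementation: the function names and the representation of a move as a list of (removed length, added length) pairs are assumptions made for illustration, and the feasibility of the resulting tour is not checked.

```python
from itertools import accumulate


def satisfies_positive_gain(edge_pairs):
    """Return True if the gain is strictly positive after each pair of edges.

    edge_pairs is a list of (removed_length, added_length) tuples, one per
    step of the alternating circle (a hypothetical representation).
    """
    gains = (removed - added for removed, added in edge_pairs)
    return all(partial > 0 for partial in accumulate(gains))


def total_gain(edge_pairs):
    """Tour-length reduction if every exchange in the move is applied."""
    return sum(removed - added for removed, added in edge_pairs)


# The partial gain dips below zero at the second step, but the move as a
# whole still shortens the tour (assuming it closes up to a feasible tour).
move = [(5, 4), (2, 6), (9, 3)]
print(satisfies_positive_gain(move))
print(total_gain(move))
```

The example move is rejected by the strict criterion although its total gain is positive, which is the kind of improvement step a relaxed criterion could still consider.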

Related content

Criteria


Large language models · Interpretability · MoDELS · Scaling · Performer

March 11, 2024

Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena

Leonie Weissweiler, Abdullatif Köksal, Hinrich Schütze

Argument Structure Constructions (ASCs) are one of the most well-studied construction groups, providing a unique opportunity to demonstrate the usefulness of Construction Grammar (CxG). For example, the caused-motion construction (CMC, "She sneezed the foam off her cappuccino") demonstrates that constructions must carry meaning, otherwise the fact that "sneeze" in this context causes movement cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art Large Language Models (LLMs), for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at a statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and GPT-3.5 can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate GPT, Gemini, Llama2 and Mistral models for their understanding of the CMC using the newly collected corpus. We find that all models struggle with understanding the motion component that the CMC adds to a sentence.

Optimizer · Variance reduction · Loop · Variance · Motivation

March 11, 2024

Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction

Yury Demidovich, Grigory Malinovsky, Peter Richtárik

In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. Riemannian variance-reduced methods usually involve a double-loop structure, computing a full gradient at the start of each loop. Determining the optimal inner loop length is challenging in practice, as it depends on strong convexity or smoothness constants, which are often unknown or hard to estimate. Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees. Using R-PAGE as a framework for non-convex Riemannian optimization, we demonstrate its applicability to various important settings. For example, we derive Riemannian MARINA (R-MARINA) for distributed settings with communication compression, providing the best theoretical communication complexity guarantees for non-convex distributed optimization over Riemannian manifolds. Experimental results support our theoretical findings.
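
The coin-flip mechanism described in the abstract can be sketched in the plain Euclidean setting. This is an illustrative sketch of loopless SVRG for a scalar variable, not the R-LSVRG method itself: on a manifold the update step and the comparison of gradients taken at different points would need manifold-specific operations, which are omitted here. The function name `lsvrg`, its parameters, and the toy objective are all assumptions.

```python
import random


def lsvrg(grad_i, n, x0, step, p, iters, seed=0):
    """Euclidean loopless SVRG sketch for f(x) = (1/n) * sum_i f_i(x).

    grad_i(i, x) returns the gradient of f_i at x (x is a scalar here).
    With probability p per iteration the reference point w and its full
    gradient are refreshed; this coin flip replaces the outer loop.
    """
    rng = random.Random(seed)

    def full_grad(x):
        return sum(grad_i(i, x) for i in range(n)) / n

    x, w = x0, x0
    gw = full_grad(w)
    for _ in range(iters):
        i = rng.randrange(n)
        g = grad_i(i, x) - grad_i(i, w) + gw  # variance-reduced estimator
        x = x - step * g
        if rng.random() < p:  # coin flip: refresh the reference point
            w = x
            gw = full_grad(w)
    return x


# Toy problem (assumed): f_i(x) = 0.5 * (x - a_i)^2, minimized at mean(a).
a = [1.0, 2.0, 3.0, 6.0]
x_star = lsvrg(lambda i, x: x - a[i], n=len(a), x0=0.0, step=0.5, p=0.25, iters=200)
print(x_star)
```

The only tuning knob replacing the inner loop length is the refresh probability `p`, which is the simplification the abstract points to.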

Estimation/Estimator · Performer · Round · Analysis · Controller

March 9, 2024

Analyzing the Influence of Processor Speed and Clock Speed on Remaining Useful Life Estimation of Software Systems

M. Rubyet Islam, Peter Sandborn

from arxiv, I do not wish to put this in arXiv anymore. I need a total re-work on this paper, for which I do not have any time. I realize that this data/paper should not be on this archive anymore

Prognostics and Health Management (PHM) is a discipline focused on predicting the point at which systems or components will cease to perform as intended, typically measured as Remaining Useful Life (RUL). RUL serves as a vital decision-making tool for contingency planning, guiding the timing and nature of system maintenance. Historically, PHM has primarily been applied to hardware systems, with its application to software only recently explored. In a recent study we introduced a methodology and demonstrated how changes in software can impact the RUL of software. However, in practical software development, real-time performance is also influenced by various environmental attributes, including operating systems, clock speed, processor performance, RAM, machine core count and others. This research extends the analysis to assess how changes in environmental attributes, such as operating system and clock speed, affect RUL estimation in software. Findings are rigorously validated using real performance data from controlled test beds and compared with predictive model-generated data. Statistical validation, including regression analysis, supports the credibility of the results. The controlled test bed environment replicates and validates faults from real applications, ensuring a standardized assessment platform. This exploration yields actionable knowledge for software maintenance and optimization strategies, addressing a significant gap in the field of software health management.

Factorized · MoDELS · Gaussian kernel · Gaussian kernel function · Covariance matrix

March 8, 2024

Applying Non-negative Matrix Factorization with Covariates to the Longitudinal Data as Growth Curve Model

Kenichi Satoh

from arxiv, 21 pages, 7 figures, R package: nmfkc published in GitHub, //github.com/ksatohds/nmfkc

Using Non-negative Matrix Factorization (NMF), the observed matrix can be approximated by the product of the basis and coefficient matrices. Moreover, if the coefficient vectors are explained by the covariates for each individual, the coefficient matrix can be written as the product of the parameter matrix and the covariate matrix, and additionally described in the framework of Non-negative Matrix tri-Factorization (tri-NMF) with covariates. Consequently, this is equal to the mean structure of the Growth Curve Model (GCM). The difference is that the basis matrix for GCM is given by the analyst, whereas that for NMF with covariates is unknown and optimized. In this study, we applied NMF with covariates to longitudinal data and compared it with GCM. We have also published an R package that implements this method, and we show how to use it through examples of data analyses including longitudinal measurement, spatiotemporal data and text data. In particular, we demonstrate the usefulness of Gaussian kernel functions as covariates.
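
The factorization described in the abstract can be written compactly as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper or the R package: $Y$ is the observed $P \times N$ matrix, $X$ the $P \times Q$ basis matrix, $B$ the $Q \times N$ coefficient matrix, $\Theta$ the $Q \times R$ parameter matrix, and $A$ the $R \times N$ covariate matrix.

```latex
\begin{align*}
Y &\approx X B, \qquad X \ge 0,\ B \ge 0
  && \text{(NMF)} \\
B &= \Theta A, \qquad \Theta \ge 0
  && \text{(coefficients explained by covariates)} \\
Y &\approx X \Theta A
  && \text{(tri-factorization with covariates)}
\end{align*}
```

The last line has the same form as the mean structure of a growth curve model, $\mathrm{E}[Y] = X \Theta A$; the difference noted in the abstract is that in the growth curve model $X$ is fixed by the analyst, while here it is estimated together with $\Theta$.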

MoDELS · Microsoft Surface · Design · Performer · Transformation

March 8, 2024

Inverse Design of Photonic Crystal Surface Emitting Lasers is a Sequence Modeling Problem

Ceyao Zhang, Renjie Li, Cheng Zhang, Zhaoyu Zhang, Feng Yin

from arxiv, accepted by AAAI workshop AI2ASE (2024), //ai-2-ase.github.io/papers/29%5cCameraReady%5cPIT__PSCEL_inverse_design_transformer.pdf

The inverse design of Photonic Crystal Surface Emitting Lasers (PCSELs) demands expert knowledge in physics, materials science, and quantum mechanics, which makes it prohibitively labor-intensive. Advanced AI technologies, especially reinforcement learning (RL), have emerged as a powerful tool to augment and accelerate this inverse design process. By modeling the inverse design of PCSELs as a sequential decision-making problem, RL approaches can construct a satisfactory PCSEL structure from scratch. However, the data inefficiency resulting from online interactions with precise and expensive simulation environments impedes the broader applicability of RL approaches. Recently, sequential models, especially the Transformer architecture, have exhibited compelling performance in sequential decision-making problems due to their simplicity and scalability to large language models. In this paper, we introduce a novel framework named PCSEL Inverse Design Transformer (PiT) that abstracts the inverse design of PCSELs as a sequence modeling problem. The central part of our PiT is a Transformer-based structure that leverages the past trajectories and current states to predict the current actions. Compared with the traditional RL approaches, PiT can output the optimal actions and achieve target PCSEL designs by leveraging offline data and conditioning on the desired return. Results demonstrate that PiT achieves superior performance and data efficiency compared to baselines.
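
One common way to condition an offline sequence model on a desired return is to feed it the return-to-go at every step, interleaved with states and actions. The sketch below shows only that data preparation step. Whether PiT tokenizes trajectories this way is not stated in the abstract, so the function names, the token labels, and the toy trajectory are all assumptions.

```python
def returns_to_go(rewards):
    """Suffix sums: rtg[t] is the sum of rewards from step t to the end."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return rtg[::-1]


def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) tokens per time step."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq.extend([("rtg", g), ("state", s), ("action", a)])
    return seq


# Hypothetical three-step design trajectory with placeholder labels.
tokens = build_sequence(
    states=["s0", "s1", "s2"],
    actions=["a0", "a1", "a2"],
    rewards=[1.0, 0.0, 2.0],
)
print(tokens[:3])
```

At inference time, a model trained on such sequences would be prompted with the target return in place of the first return-to-go token and asked to predict the next action.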

Earth · Lecture notes · Similarity measure · Branch · Minimum point

March 7, 2024

Fine-Grained Complexity of Earth Mover's Distance under Translation

Karl Bringmann, Frank Staals, Karol Węgrzycki, Geert van Wordragen

from arxiv, Full version of the paper "Fine-Grained Complexity of Earth Mover's Distance under Translation" accepted for SoCG 2024

The Earth Mover's Distance is a popular similarity measure in several branches of computer science. It measures the minimum total edge length of a perfect matching between two point sets. The Earth Mover's Distance under Translation ($\mathrm{EMDuT}$) is a translation-invariant version thereof. It minimizes the Earth Mover's Distance over all translations of one point set. For $\mathrm{EMDuT}$ in $\mathbb{R}^1$, we present an $\widetilde{\mathcal{O}}(n^2)$-time algorithm. We also show that this algorithm is nearly optimal by presenting a matching conditional lower bound based on the Orthogonal Vectors Hypothesis. For $\mathrm{EMDuT}$ in $\mathbb{R}^d$, we present an $\widetilde{\mathcal{O}}(n^{2d+2})$-time algorithm for the $L_1$ and $L_\infty$ metric. We show that this dependence on $d$ is asymptotically tight, as an $n^{o(d)}$-time algorithm for $L_1$ or $L_\infty$ would contradict the Exponential Time Hypothesis (ETH). Prior to our work, only approximation algorithms were known for these problems.
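
To make the definitions concrete, here is a sketch for the simplest special case: two unweighted point sets of equal size on the real line. In that case matching the points in sorted order is optimal, a translation does not change the sorted order, and the best translation is a median of the pairwise differences. This is not the algorithm from the paper, which addresses a more general setting; the function names are assumptions.

```python
from statistics import median


def emd_1d(a, b):
    """EMD between equal-size point sets on the line (sorted matching)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size point sets")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b)))


def emd_under_translation_1d(a, b):
    """Minimize emd_1d over translations t of a; returns (t, cost)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size point sets")
    diffs = [y - x for x, y in zip(sorted(a), sorted(b))]
    t = median(diffs)  # a median minimizes the sum of |d - t|
    return t, sum(abs(d - t) for d in diffs)


a = [0, 1, 5]
b = [10, 12, 14]
t, cost = emd_under_translation_1d(a, b)
print(t, cost)
print(emd_1d([x + t for x in a], b))  # same cost, computed directly
```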

GNN · GPU · Neural Networks · Networking · Optimizer

November 11, 2022

A Comprehensive Survey on Distributed Training of Graph Neural Networks

Haiyang Lin, Mingyu Yan, Xiaochun Ye, Dongrui Fan, Shirui Pan, Wenguang Chen, Yuan Xie

from arxiv, 30 pages, double column, 10 figures, 10 tables

Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training, which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training are still only preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work, are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.

Generalization theory · Learned · Deep learning · Examples · Datasets

March 18, 2022

On the Generalization Mystery in Deep Learning

Satrajit Chatterjee, Piotr Zielinski

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy to compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization. Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue that the proposed approach is the most viable one on this basis.
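
The idea of measuring how well per-example gradients align can be sketched with a simple ratio: the squared norm of the mean gradient divided by the mean squared norm of the individual gradients. The abstract does not spell out the metric, so this particular ratio and the function name `coherence` are assumptions chosen as one easy-to-compute alignment measure.

```python
def coherence(per_example_grads):
    """Ratio ||mean(g)||^2 / mean(||g||^2) over per-example gradients.

    per_example_grads is a non-empty list of equal-length lists of floats.
    The value is 1 when all gradients are identical, 1/m for m mutually
    orthogonal gradients of equal norm, and 0 when they cancel out.
    """
    m = len(per_example_grads)
    dim = len(per_example_grads[0])
    mean = [sum(g[k] for g in per_example_grads) / m for k in range(dim)]
    mean_sq_norm = sum(sum(v * v for v in g) for g in per_example_grads) / m
    if mean_sq_norm == 0:
        return 0.0
    return sum(v * v for v in mean) / mean_sq_norm


print(coherence([[1.0, 2.0], [1.0, 2.0]]))   # identical gradients
print(coherence([[1.0, 0.0], [0.0, 1.0]]))   # orthogonal gradients
print(coherence([[1.0, 0.0], [-1.0, 0.0]]))  # opposing gradients
```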

Skip connections · Neural Networks · Optimizer · Linear · Graph

May 10, 2021

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.

Entity · Link prediction · Performer · Graph · Knowledge graph

September 26, 2019

Representation Learning with Ordered Relation Paths for Knowledge Graph Completion

Yao Zhu, Hongzhi Liu, Zhonghai Wu, Yang Song, Tao Zhang

Incompleteness is a common problem for existing knowledge graphs (KGs), and KG completion, which aims to predict links between entities, is challenging. Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths, which contain useful information for link prediction. Recently, a few methods have taken relation paths into consideration but pay less attention to the order of relations in paths, which is important for reasoning. In addition, these path-based models always ignore nonlinear contributions of path features for link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to guarantee the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.