亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

The Traveling Salesman Problem (TSP) is one of the most extensively researched and widely applied combinatorial optimization problems. It is NP-hard even in the symmetric and metric case. Building upon elaborate research, state-of-the-art exact solvers such as CONCORDE can solve TSP instances with several ten thousand vertices. A key ingredient for these integer programming approaches are fast heuristics to find a good initial solution, in particular the Lin-Kernighan-Helsgaun (LKH) heuristic. For instances with few hundred vertices heuristics like LKH often find an optimal solution. In this work we develop variations of LKH that perform significantly better on large instances. LKH repeatedly improves an initially random tour by exchanging edges along alternating circles. Thereby, it respects several criteria designed to quickly find alternating circles that give a feasible improvement of the tour. Among those criteria, the positive gain criterion stayed mostly untouched in previous research. It requires that, while constructing an alternating circle, the total gain has to be positive after each pair of edges. We relax this criterion carefully leading to improvement steps hitherto undiscovered by LKH. We confirm this improvement experimentally via extensive simulations on various benchmark libraries for TSP. Our computational study shows that for large instances our method is on average 13% faster than the latest version of LKH.

相關內容

Argument Structure Constructions (ASCs) are one of the most well-studied construction groups, providing a unique opportunity to demonstrate the usefulness of Construction Grammar (CxG). For example, the caused-motion construction (CMC, ``She sneezed the foam off her cappuccino'') demonstrates that constructions must carry meaning, otherwise the fact that ``sneeze'' in this context causes movement cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art Large Language Models (LLMs), for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and GPT-3.5 can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate GPT, Gemini, Llama2 and Mistral models for their understanding of the CMC using the newly collected corpus. We find that all models struggle with understanding the motion component that the CMC adds to a sentence.

In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. Riemannian variance-reduced methods usually involve a double-loop structure, computing a full gradient at the start of each loop. Determining the optimal inner loop length is challenging in practice, as it depends on strong convexity or smoothness constants, which are often unknown or hard to estimate. Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees. Using R-PAGE as a framework for non-convex Riemannian optimization, we demonstrate its applicability to various important settings. For example, we derive Riemannian MARINA (R-MARINA) for distributed settings with communication compression, providing the best theoretical communication complexity guarantees for non-convex distributed optimization over Riemannian manifolds. Experimental results support our theoretical findings.

Prognostics and Health Management (PHM) is a discipline focused on predicting the point at which systems or components will cease to perform as intended, typically measured as Remaining Useful Life (RUL). RUL serves as a vital decision-making tool for contingency planning, guiding the timing and nature of system maintenance. Historically, PHM has primarily been applied to hardware systems, with its application to software only recently explored. In a recent study we introduced a methodology and demonstrated how changes in software can impact the RUL of software. However, in practical software development, real-time performance is also influenced by various environmental attributes, including operating systems, clock speed, processor performance, RAM, machine core count and others. This research extends the analysis to assess how changes in environmental attributes, such as operating system and clock speed, affect RUL estimation in software. Findings are rigorously validated using real performance data from controlled test beds and compared with predictive model-generated data. Statistical validation, including regression analysis, supports the credibility of the results. The controlled test bed environment replicates and validates faults from real applications, ensuring a standardized assessment platform. This exploration yields actionable knowledge for software maintenance and optimization strategies, addressing a significant gap in the field of software health management.

Using Non-negative Matrix Factorization (NMF), the observed matrix can be approximated by the product of the basis and coefficient matrices. Moreover, if the coefficient vectors are explained by the covariates for each individual, the coefficient matrix can be written as the product of the parameter matrix and the covariate matrix, and additionally described in the framework of Non-negative Matrix tri-Factorization (tri-NMF) with covariates. Consequently, this is equal to the mean structure of the Growth Curve Model (GCM). The difference is that the basis matrix for GCM is given by the analyst, whereas that for NMF with covariates is unknown and optimized. In this study, we applied NMF with covariance to longitudinal data and compared it with GCM. We have also published an R package that implements this method, and we show how to use it through examples of data analyses including longitudinal measurement, spatiotemporal data and text data. In particular, we demonstrate the usefulness of Gaussian kernel functions as covariates.

Photonic Crystal Surface Emitting Lasers (PCSEL)'s inverse design demands expert knowledge in physics, materials science, and quantum mechanics which is prohibitively labor-intensive. Advanced AI technologies, especially reinforcement learning (RL), have emerged as a powerful tool to augment and accelerate this inverse design process. By modeling the inverse design of PCSEL as a sequential decision-making problem, RL approaches can construct a satisfactory PCSEL structure from scratch. However, the data inefficiency resulting from online interactions with precise and expensive simulation environments impedes the broader applicability of RL approaches. Recently, sequential models, especially the Transformer architecture, have exhibited compelling performance in sequential decision-making problems due to their simplicity and scalability to large language models. In this paper, we introduce a novel framework named PCSEL Inverse Design Transformer (PiT) that abstracts the inverse design of PCSEL as a sequence modeling problem. The central part of our PiT is a Transformer-based structure that leverages the past trajectories and current states to predict the current actions. Compared with the traditional RL approaches, PiT can output the optimal actions and achieve target PCSEL designs by leveraging offline data and conditioning on the desired return. Results demonstrate that PiT achieves superior performance and data efficiency compared to baselines.

The Earth Mover's Distance is a popular similarity measure in several branches of computer science. It measures the minimum total edge length of a perfect matching between two point sets. The Earth Mover's Distance under Translation ($\mathrm{EMDuT}$) is a translation-invariant version thereof. It minimizes the Earth Mover's Distance over all translations of one point set. For $\mathrm{EMDuT}$ in $\mathbb{R}^1$, we present an $\widetilde{\mathcal{O}}(n^2)$-time algorithm. We also show that this algorithm is nearly optimal by presenting a matching conditional lower bound based on the Orthogonal Vectors Hypothesis. For $\mathrm{EMDuT}$ in $\mathbb{R}^d$, we present an $\widetilde{\mathcal{O}}(n^{2d+2})$-time algorithm for the $L_1$ and $L_\infty$ metric. We show that this dependence on $d$ is asymptotically tight, as an $n^{o(d)}$-time algorithm for $L_1$ or $L_\infty$ would contradict the Exponential Time Hypothesis (ETH). Prior to our work, only approximation algorithms were known for these problems.

Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy to compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization. Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue that the proposed approach is the most viable one on this basis.

Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.

Incompleteness is a common problem for existing knowledge graphs (KGs), and the completion of KG which aims to predict links between entities is challenging. Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths which contain useful information for link prediction. Recently, a few methods take relation paths into consideration but pay less attention to the order of relations in paths which is important for reasoning. In addition, these path-based models always ignore nonlinear contributions of path features for link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to guarantee the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.

北京阿比特科技有限公司