
We abstract and study \emph{reachability preservers}, a graph-theoretic primitive that has been implicit in prior work on network design. Given a directed graph $G = (V, E)$ and a set of \emph{demand pairs} $P \subseteq V \times V$, a reachability preserver is a sparse subgraph $H$ that preserves reachability between all demand pairs. Our first contribution is a series of extremal bounds on the size of reachability preservers. Our main result states that, for an $n$-node graph and demand pairs of the form $P \subseteq S \times V$ for a small node subset $S$, there is always a reachability preserver on $O(n+\sqrt{n |P| |S|})$ edges. We additionally give a lower bound construction demonstrating that this upper bound characterizes the settings in which $O(n)$ size reachability preservers are generally possible, in a large range of parameters. The second contribution of this paper is a new connection between extremal graph sparsification results and classical Steiner Network Design problems. Surprisingly, prior to this work, the osmosis of techniques between these two fields had been superficial. This connection allows us to improve the state-of-the-art approximation algorithm for the most basic Steiner-type problem in directed graphs from the $O(n^{0.6+\varepsilon})$ of Chlamtáč, Dinitz, Kortsarz, and Laekhanukit (SODA'17) to $O(n^{4/7+\varepsilon})$.
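To make the definition concrete, the following is a minimal sketch of a naive baseline, not the construction from the paper: it keeps one breadth-first path per reachable demand pair and returns the union of those paths. The function names `bfs_path` and `naive_preserver` and the toy graph are illustrative assumptions. The result preserves reachability for every demand pair, but it makes no attempt to meet the $O(n+\sqrt{n |P| |S|})$ edge bound.

```python
from collections import deque


def bfs_path(adj, source, target):
    """Return one source->target path as a list of edges, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            break
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if target not in parent:
        return None
    # Walk parent pointers back from the target to recover the path.
    path = []
    node = target
    while parent[node] is not None:
        path.append((parent[node], node))
        node = parent[node]
    return path[::-1]


def naive_preserver(edges, demand_pairs):
    """Union of one BFS path per demand pair (hypothetical baseline)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    kept = set()
    for s, t in demand_pairs:
        path = bfs_path(adj, s, t)
        if path:  # skip unreachable pairs and trivial pairs with s == t
            kept.update(path)
    return kept


# Toy directed graph; the single demand pair (0, 3) needs only two edges.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 0)]
print(sorted(naive_preserver(toy_edges, [(0, 3)])))
```

In the worst case this baseline keeps up to $n-1$ edges per demand pair, which is the kind of cost the extremal bounds above improve on.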

Related content

In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managing computational costs when scaling up model parameters. However, conventional MoE architectures like GShard, which activate the top-$K$ out of $N$ experts, face challenges in ensuring expert specialization, i.e., that each expert acquires non-overlapping and focused knowledge. In response, we propose the DeepSeekMoE architecture towards ultimate expert specialization. It involves two principal strategies: (1) finely segmenting the experts into $mN$ ones and activating $mK$ from them, allowing for a more flexible combination of activated experts; (2) isolating $K_s$ experts as shared ones, aiming at capturing common knowledge and mitigating redundancy in routed experts. Starting from a modest scale with 2B parameters, we demonstrate that DeepSeekMoE 2B achieves comparable performance with GShard 2.9B, which has 1.5 times the expert parameters and computation. In addition, DeepSeekMoE 2B nearly approaches the performance of its dense counterpart with the same number of total parameters, which sets the upper bound of MoE models. Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations. Further, our preliminary efforts to scale up DeepSeekMoE to 145B parameters consistently validate its substantial advantages over the GShard architecture, and show its performance comparable with DeepSeek 67B, using only 28.5% (maybe even 18.2%) of computations.
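The claim that fine-grained segmentation allows "a more flexible combination of activated experts" can be illustrated by counting the expert subsets a router can choose. The sketch below compares $\binom{N}{K}$ with $\binom{mN}{mK}$. The helper name `expert_combinations` and the values $N=16$, $K=2$, $m=4$ are illustrative assumptions, and the count ignores the $K_s$ shared experts.

```python
from math import comb


def expert_combinations(num_experts, num_active, m=1):
    """Number of distinct expert subsets when each expert is split into m
    finer ones and m times as many are activated (hypothetical helper)."""
    return comb(m * num_experts, m * num_active)


# Illustrative setting: N = 16 experts, K = 2 active per token.
print(expert_combinations(16, 2))        # conventional top-K routing: 120
print(expert_combinations(16, 2, m=4))   # 64 finer experts, 8 active: 4426165368
```

Because the experts shrink as $m$ grows while the number activated grows in proportion, the count of possible combinations rises sharply without activating more expert parameters per token.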

Due to the sheer size of software logs, developers rely on automated log analysis. Log parsing, which converts semi-structured logs into a structured format, is a prerequisite of automated log analysis. However, existing log parsers are unsatisfactory when applied in practice because 1) they ignore categories of variables, and 2) they generalize poorly. To address these limitations, we propose LogPTR, the first end-to-end variable-aware log parser that can extract the static and dynamic parts of logs and further identify the categories of variables. The key idea of LogPTR is to use a pointer network to copy words from the log message. We have performed extensive experiments on 16 public log datasets, and the results show that LogPTR outperforms state-of-the-art log parsers both on general log parsing, which extracts the log template, and on variable-aware log parsing, which further identifies the category of variables.

In this paper, we study the problem of estimating the normalizing constant $\int e^{-\lambda f(x)}dx$ through queries to the black-box function $f$, where $f$ belongs to a reproducing kernel Hilbert space (RKHS), and $\lambda$ is a problem parameter. We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $\lambda$: When $\lambda$ approaches zero, the problem is similar to Bayesian quadrature (BQ), while when $\lambda$ approaches infinity, the problem is similar to Bayesian optimization (BO). More generally, the problem varies between BQ and BO. We find that this pattern holds true even when the function evaluations are noisy, bringing new aspects to this topic. Our findings are supported by both algorithm-independent lower bounds and algorithmic upper bounds, as well as simulation studies conducted on a variety of benchmark functions.

We present Classy Ensemble, a novel ensemble-generation algorithm for classification tasks, which aggregates models through a weighted combination of per-class accuracy. In tests over 153 machine learning datasets, we demonstrate that Classy Ensemble outperforms two other well-known aggregation algorithms -- order-based pruning and clustering-based pruning -- as well as the recently introduced lexigarden ensemble generator. We then present three enhancements: 1) Classy Cluster Ensemble, which combines Classy Ensemble and cluster-based pruning; 2) Deep Learning experiments, showing the merits of Classy Ensemble over four image datasets: Fashion MNIST, CIFAR10, CIFAR100, and ImageNet; and 3) Classy Evolutionary Ensemble, wherein an evolutionary algorithm is used to select the set of models which Classy Ensemble picks from. The latter, combining learning and evolution, resulted in improved performance on the hardest dataset.

The past decade has witnessed a plethora of works that leverage the power of visualization (VIS) to interpret machine learning (ML) models. The corresponding research topic, VIS4ML, keeps growing at a fast pace. To better organize this large body of work and shed light on the developing trend of VIS4ML, we provide a systematic review of these works through this survey. Since data quality greatly impacts the performance of ML models, our survey focuses specifically on summarizing VIS4ML works from the data perspective. First, we categorize the common data handled by ML models into five types, explain the unique features of each type, and highlight the corresponding ML models that are good at learning from them. Second, from the large number of VIS4ML works, we tease out six tasks that operate on these types of data (i.e., data-centric tasks) at different stages of the ML pipeline to understand, diagnose, and refine ML models. Lastly, by studying the distribution of 143 surveyed papers across the five data types, six data-centric tasks, and their intersections, we analyze the prospective research directions and envision future research trends.

Graphs are used widely to model complex systems, and detecting anomalies in a graph is an important task in the analysis of complex systems. Graph anomalies are patterns in a graph that do not conform to normal patterns expected of the attributes and/or structures of the graph. In recent years, graph neural networks (GNNs) have been studied extensively and have successfully performed difficult machine learning tasks in node classification, link prediction, and graph classification thanks to the highly expressive capability via message passing in effectively learning graph representations. To solve the graph anomaly detection problem, GNN-based methods leverage information about the graph attributes (or features) and/or structures to learn to score anomalies appropriately. In this survey, we review the recent advances made in detecting graph anomalies using GNN models. Specifically, we summarize GNN-based methods according to the graph type (i.e., static and dynamic), the anomaly type (i.e., node, edge, subgraph, and whole graph), and the network architecture (e.g., graph autoencoder, graph convolutional network). To the best of our knowledge, this survey is the first comprehensive review of graph anomaly detection methods based on GNNs.

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This allows one to reason about the effects of changes to this process (i.e., interventions) and what would have happened in hindsight (i.e., counterfactuals). We categorize work in CausalML into five groups according to the problems they tackle: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, (5) causal reinforcement learning. For each category, we systematically compare its methods and point out open problems. Further, we review modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.

The study of network robustness is a critical tool in the characterization and sense-making of complex interconnected systems such as infrastructure, communication and social networks. While significant research has been conducted in all of these areas, gaps in the surveying literature still exist. Answers to key questions are currently scattered across multiple scientific fields and numerous papers. In this survey, we distill key findings across numerous domains and provide researchers crucial access to important information by (1) summarizing and comparing recent and classical graph robustness measures; (2) exploring which robustness measures are most applicable to different categories of networks (e.g., social, infrastructure); (3) reviewing common network attack strategies, and summarizing which attacks are most effective across different network topologies; and (4) discussing at length how to select defense techniques to mitigate attacks across a variety of networks. This survey guides researchers and practitioners in navigating the expansive field of network robustness, while summarizing answers to key questions. We conclude by highlighting current research directions and open problems.

The design of deep graph models remains to be investigated, and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of the network. The proposed graph convolutional network, called AdaGCN (AdaBoosting Graph Convolutional Network), can efficiently extract knowledge from high-order neighbors and integrate knowledge from different hops of neighbors into the network in an AdaBoost way. We also present the architectural difference between AdaGCN and existing graph convolutional methods to show the benefits of our proposal. Finally, extensive experiments demonstrate the state-of-the-art prediction performance and the computational advantage of our approach AdaGCN.

Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with its success in many other application domains, deep learning has also become popular in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.
