
Given an undirected graph $G$ and a conductance parameter $\alpha$, the problem of testing whether $G$ has conductance at least $\alpha$ or is far from having conductance $\Omega(\alpha^2)$ has been extensively studied for bounded-degree graphs in the classic property testing model. In the last few years, the same problem has also been addressed in non-sequential models of computing such as MPC and distributed CONGEST. However, all the algorithms in these models, like their classic counterparts, apply an aggregate function over statistics of a set of random walks on $G$ as the test criterion. The only distributed CONGEST algorithm for the problem, by~\cite{VasudevDistributed}, tests the conductance of the underlying network in the unbounded-degree graph model. Their algorithm builds a rooted spanning tree of the underlying network to collect information at the root and then applies an aggregate function to this information. We ask whether the parallelism offered by distributed computing can be exploited to avoid such information collection, and we answer this question in the affirmative. We propose a new algorithm that also performs a set of random walks on $G$ but does not collect any statistic at a central node. In fact, we show that for an appropriate statistic, each node has sufficient information to decide on its own whether to accept. Given an $n$-vertex, $m$-edge, undirected, unweighted graph $G$, a conductance parameter $\alpha$, and a distance parameter $\epsilon$, our distributed conductance tester accepts $G$ if $G$ has conductance at least $\alpha$, rejects $G$ if $G$ is $\epsilon$-far from having conductance $\Omega(\alpha^2)$, and does so in $O(\log n)$ rounds of communication. Unlike the algorithm of \cite{VasudevDistributed}, our algorithm does not rely on the wasteful construction of a spanning tree and the accumulation of information at its root.
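For intuition, here is a minimal centralized simulation of the kind of random-walk statistic such testers aggregate, namely counting collisions among walk endpoints; this is an illustrative sketch of the classical test criterion, not the distributed CONGEST procedure proposed in the paper, and all function names are assumptions.

```python
import random
from collections import Counter

def lazy_walk(adj, start, length):
    """One lazy random walk: stay put with prob. 1/2, else move to a uniform neighbor."""
    v = start
    for _ in range(length):
        if random.random() < 0.5 and adj[v]:
            v = random.choice(adj[v])
    return v

def endpoint_collision_stat(adj, start, walk_len, num_walks):
    """Fraction of colliding endpoint pairs among num_walks lazy walks from `start`.
    If the graph has conductance >= alpha, walks of length ~ log(n)/alpha^2 mix and
    the collision rate stays close to the uniform baseline; on graphs far from having
    conductance Omega(alpha^2), many start vertices yield noticeably more collisions."""
    ends = Counter(lazy_walk(adj, start, walk_len) for _ in range(num_walks))
    pairs = sum(c * (c - 1) // 2 for c in ends.values())
    return pairs / (num_walks * (num_walks - 1) / 2)
```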

Related Content

July 11, 2023

Given a directed edge-weighted graph $G=(V, E)$ with beer vertices $B\subseteq V$, a beer path between two vertices $u$ and $v$ is a path between $u$ and $v$ that visits at least one beer vertex in $B$, and the beer distance between two vertices is the shortest length of a beer path between them. We consider \emph{indexing problems} on beer paths: a graph is given a priori, and we construct data structures (called indexes) for the graph; later, given two vertices, we find the beer distance or a beer path between them using these data structures. For such a scheme, efficient algorithms using indexes for beer distance and beer path queries have been proposed for outerplanar graphs and interval graphs. For example, Bacic et al. (2021) present indexes of size $O(n)$ for outerplanar graphs and an algorithm using them that answers the beer distance between two given vertices in $O(\alpha(n))$ time, where $\alpha(\cdot)$ is the inverse Ackermann function; this performance is shown to be optimal. This paper proposes indexing data structures and algorithms for beer path queries on general graphs based on two types of graph decomposition: the tree decomposition and the triconnected component decomposition. We propose indexes of size $O(m+nr^2)$ based on the triconnected component decomposition, where $r$ is the size of the largest triconnected component. For a given query $u,v\in V$, our algorithm using the indexes outputs the beer distance in query time $O(\alpha(m))$. In particular, our indexing data structures and algorithms achieve the optimal performance (space and query time) for series-parallel graphs, a wider class containing the outerplanar graphs.
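As a point of reference for what the indexes must beat, a naive per-query baseline follows directly from the definition: the beer distance from $u$ to $v$ equals $\min_{b\in B}\, d(u,b)+d(b,v)$, computable with one Dijkstra run from $u$ and one run towards $v$ on the reversed graph. This is a generic sketch, not the paper's index-based method.

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest paths; adj[u] = list of (v, w) with w >= 0."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def beer_distance(adj, radj, beer, u, v):
    """min over beer vertices b of dist(u, b) + dist(b, v); radj is the reversed adjacency."""
    from_u = dijkstra(adj, u)
    to_v = dijkstra(radj, v)          # distances *into* v along original edge directions
    best = float("inf")
    for b in beer:
        if b in from_u and b in to_v:
            best = min(best, from_u[b] + to_v[b])
    return best
```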

This paper considers improving wireless communication and computation efficiency in federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. The goal is to jointly determine the bitwidths employed for local FL model quantization and the set of devices participating in FL training at each iteration. We pose this as an optimization problem that aims to minimize the training loss of quantized FL under a per-iteration device sampling budget and delay requirement. However, the formulated problem is difficult to solve without (i) a concrete understanding of how quantization impacts global ML performance and (ii) the ability of the server to construct estimates of this process efficiently. To address the first challenge, we analytically characterize how limited wireless resources and induced quantization errors affect the performance of the proposed FL method. Our results quantify how the improvement of FL training loss between two consecutive iterations depends on the device selection and quantization scheme as well as on several parameters inherent to the model being learned. Then, we show that the FL training process can be described as a Markov decision process and propose a model-based reinforcement learning (RL) method to optimize action selection over iterations. Compared to model-free RL, this model-based RL approach leverages the derived mathematical characterization of the FL training process to discover an effective device selection and quantization scheme without imposing additional device communication overhead. Simulation results show that the proposed FL algorithm can reduce the convergence time.
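To make the quantize-and-aggregate step concrete, below is a toy sketch of uniform stochastic quantization of local model parameters to a chosen bitwidth followed by FedAvg-style server aggregation; the device-selection policy, delay model, and RL component are omitted, and the specific quantizer and averaging weights are illustrative assumptions rather than the paper's exact scheme.

```python
import numpy as np

def quantize(w, bits):
    """Uniform stochastic (unbiased) quantization of a parameter vector to 2**bits levels."""
    lo, hi = float(w.min()), float(w.max())
    levels = 2 ** bits - 1
    scaled = (w - lo) / max(hi - lo, 1e-12) * levels
    floor = np.floor(scaled)
    # round up with probability equal to the fractional part (keeps the estimate unbiased)
    q = floor + (np.random.rand(*w.shape) < (scaled - floor))
    return lo + q / levels * (hi - lo)

def aggregate(local_models, bitwidths, weights):
    """Server-side aggregation of quantized local models into the global model."""
    quantized = [quantize(w, b) for w, b in zip(local_models, bitwidths)]
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    return sum(wt * q for wt, q in zip(weights, quantized))
```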

In the spanning tree congestion problem, given a connected graph $G$, the objective is to compute a spanning tree $T$ in $G$ that minimizes its maximum edge congestion, where the congestion of an edge $e$ of $T$ is the number of edges in $G$ for which the unique path in $T$ between their endpoints traverses $e$. The problem is known to be $\mathbb{NP}$-hard, but its approximability is still poorly understood. In the decision version of this problem, denoted $K-\textsf{STC}$, we need to determine if $G$ has a spanning tree with congestion at most $K$. It is known that $K-\textsf{STC}$ is $\mathbb{NP}$-complete for $K\ge 8$. On the other hand, $3-\textsf{STC}$ can be solved in polynomial time, with the complexity status of this problem for $K\in \{4,5,6,7\}$ remaining an open problem. We substantially improve the earlier hardness results by proving that $K-\textsf{STC}$ is $\mathbb{NP}$-complete for $K\ge 5$. This leaves only the case $K=4$ open, and improves the lower bound on the approximation ratio to $1.2$. Motivated by evidence that minimizing congestion is hard even for graphs of small constant radius, we consider $K-\textsf{STC}$ restricted to graphs of radius $2$, and we prove that this variant is $\mathbb{NP}$-complete for all $K\ge 6$. Exploring further in this direction, we also examine the variant, denoted $K-\textsf{STC}D$, where the objective is to determine if the graph has a depth-$D$ spanning tree of congestion at most $K$. We prove that $6-\textsf{STC}2$ is $\mathbb{NP}$-complete even for bipartite graphs. For bipartite graphs we establish a tight bound by also proving that $5-\textsf{STC}2$ is polynomial-time solvable. Additionally, we complement this result with polynomial-time algorithms for two special cases that involve bipartite graphs and restrictions on vertex degrees.
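The quantity being minimized is easy to check by brute force. The sketch below, which simply instantiates the definition (and is a checker, not a solver for $K-\textsf{STC}$), computes the congestion of every edge of a given spanning tree by walking the unique tree path between the endpoints of each graph edge.

```python
def tree_congestion(n, graph_edges, parent, depth):
    """Congestion of each tree edge.  The spanning tree is given by parent[] pointers
    and depth[] values (parent[root] = root, depth[root] = 0).  Returns a dict mapping
    a tree edge (child, parent[child]) to the number of graph edges whose tree path uses it."""
    cong = {(v, parent[v]): 0 for v in range(n) if parent[v] != v}

    def bump(u, v):
        # climb from the deeper endpoint until the two walks meet, counting tree edges
        while u != v:
            if depth[u] < depth[v]:
                u, v = v, u
            cong[(u, parent[u])] += 1
            u = parent[u]

    for a, b in graph_edges:   # every edge of G, tree edges included
        bump(a, b)
    return cong
```

The spanning tree congestion of $T$ is then `max(tree_congestion(...).values())`.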

This paper presents a decentralized algorithm for solving distributed convex optimization problems in dynamic networks with time-varying objectives. The unique feature of the algorithm lies in its ability to accommodate a wide range of communication systems, including previously unsupported ones, by abstractly modeling the information exchange in the network. Specifically, it supports a novel communication protocol based on the "over-the-air" function computation (OTA-C) technology, that is designed for an efficient and truly decentralized implementation of the consensus step of the algorithm. Unlike existing OTA-C protocols, the proposed protocol does not require the knowledge of network graph structure or channel state information, making it particularly suitable for decentralized implementation over ultra-dense wireless networks with time-varying topologies and fading channels. Furthermore, the proposed algorithm synergizes with the "superiorization" methodology, allowing the development of new distributed algorithms with enhanced performance for the intended applications. The theoretical analysis establishes sufficient conditions for almost sure convergence of the algorithm to a common time-invariant solution for all agents, assuming such a solution exists. Our algorithm is applied to a real-world distributed random field estimation problem, showcasing its efficacy in terms of convergence speed, scalability, and spectral efficiency. Furthermore, we present a superiorized version of our algorithm that achieves faster convergence with significantly reduced energy consumption compared to the unsuperiorized algorithm.
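For orientation, here is a bare-bones decentralized consensus-plus-gradient iteration of the kind such algorithms build on; the consensus step below uses an explicit mixing matrix, whereas the paper replaces this exchange with an OTA-C protocol that needs neither the graph structure nor channel state information. Everything in the snippet (names, objective, weights) is a generic illustrative assumption.

```python
import numpy as np

def decentralized_step(X, W, grads, step_size):
    """One round: each agent averages neighbors' iterates (consensus) and then takes a
    local gradient step.  X is (num_agents, dim); W is a doubly stochastic mixing matrix
    matching the communication graph; grads[i] is agent i's local gradient oracle."""
    X_cons = W @ X                                    # consensus / information exchange
    G = np.stack([g(x) for g, x in zip(grads, X_cons)])
    return X_cons - step_size * G

# toy usage: 3 agents jointly minimize the average of ||x - c_i||^2
targets = [np.array([0.0]), np.array([2.0]), np.array([4.0])]
grads = [lambda x, c=c: 2 * (x - c) for c in targets]
W = np.full((3, 3), 1.0 / 3)                          # complete graph, uniform weights
X = np.zeros((3, 1))
for t in range(200):
    X = decentralized_step(X, W, grads, 0.05)
# all agents end up near the common minimizer, the mean of the c_i (here 2.0)
```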

In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ using limited instances $S=\{(\mathbf{x}^{(i)},y^{(i)})\}$, where $\mathbf{x}^{(i)} \in D$ and $D$ represents the domain of interest, is a common objective. We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$. Consequently, the model's generalization ability is evaluated on a separate set $T=\{\mathbf{x}^{(j)}\} \subset D$, where $T \neq S$, frequently with $T \cap S = \emptyset$, to assess its performance beyond the training set. However, certain applications require accurate approximation not only within the original domain $D$ but also in an extended domain $D'$ that encompasses $D$. This becomes particularly relevant in scenarios involving the design of new structures, where minimizing errors in approximations is crucial. For example, when developing new materials through data-driven approaches, the AI/ML system can provide valuable insights to guide the design process by serving as a surrogate function. Consequently, the learned model can be employed to facilitate the design of new laboratory experiments. In this paper, we propose a method for multivariate regression based on iterative fitting of a continued fraction, incorporating additive spline models. We compare the performance of our method with established techniques, including AdaBoost, Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression, Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent, and XGBoost. To evaluate these methods, we focus on an important problem in the field: predicting the critical temperature of superconductors based on physical-chemical characteristics.
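To make the model class concrete, here is a hedged sketch of evaluating (not fitting) a truncated continued-fraction regressor in which every level carries a simple per-level term; the paper's iterative fitting procedure and its additive spline terms are not reproduced here, and the specific functional form below (unit numerators, linear level terms) is an assumption for illustration only.

```python
import numpy as np

def eval_continued_fraction(x, terms, eps=1e-8):
    """Evaluate f(x) = g0(x) + 1 / (g1(x) + 1 / (g2(x) + ...)) bottom-up.
    `terms` is a list of callables [g0, g1, ..., gd]; each maps a feature
    vector x to a scalar (e.g. a small additive or spline model)."""
    value = terms[-1](x)
    for g in reversed(terms[:-1]):
        value = g(x) + 1.0 / (value if abs(value) > eps else eps)
    return value

# toy usage with linear level terms g_k(x) = w_k . x + b_k
rng = np.random.default_rng(0)
terms = [lambda x, w=rng.normal(size=3), b=rng.normal(): float(w @ x + b)
         for _ in range(4)]
print(eval_continued_fraction(rng.normal(size=3), terms))
```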

This article examines whether a multivariate distribution differs from a specified distribution, and it also tests the equality of two multivariate distributions. In the course of this study, a graphical tool-kit using well-known half-space depth based information criteria is proposed, which is a two-dimensional plot regardless of the dimension of the data, and it is useful even for comparing high-dimensional distributions. The simple interpretability of the proposed graphical tool-kit motivates us to formulate test statistics for the corresponding hypothesis testing problems. It is established that the proposed tests based on the same information criteria are consistent, and moreover, the asymptotic distributions of the test statistics under contiguous/local alternatives are derived, which enable us to compute the asymptotic power of these tests. Furthermore, the computations associated with the proposed tests are not burdensome. Moreover, these tests perform better than many other tests available in the literature when data are generated from various distributions such as heavy-tailed distributions, which indicates that the proposed methodology is robust as well. Finally, the usefulness of the proposed graphical tool-kit and tests is demonstrated on two benchmark real data sets.
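As a rough illustration of the ingredient underlying the tool-kit, the half-space (Tukey) depth of a point can be approximated by random projections: project the data onto many directions and take the smallest one-sided empirical probability. This is a generic Monte-Carlo approximation, not the authors' exact procedure.

```python
import numpy as np

def approx_halfspace_depth(x, data, num_dirs=500, rng=None):
    """Monte-Carlo approximation of the half-space depth of x w.r.t. `data` (shape (n, d)):
    minimize, over random unit directions u, the fraction of points falling in the
    closed half-space {y : u.(y - x) >= 0}."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = data.shape
    dirs = rng.normal(size=(num_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = (data - x) @ dirs.T          # shape (n, num_dirs)
    frac = (proj >= 0).mean(axis=0)     # one-sided empirical mass per direction
    return float(frac.min())
```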

We propose an unsupervised tree boosting algorithm for inferring the underlying sampling distribution of an i.i.d. sample based on fitting additive tree ensembles in a fashion analogous to supervised tree boosting. Integral to the algorithm is a new notion of "addition" on probability distributions that leads to a coherent notion of "residualization", i.e., subtracting a probability distribution from an observation to remove the distributional structure from the sampling distribution of the latter. We show that these notions arise naturally for univariate distributions through cumulative distribution function (CDF) transforms and compositions due to several "group-like" properties of univariate CDFs. While the traditional multivariate CDF does not preserve these properties, a new definition of multivariate CDF can restore these properties, thereby allowing the notions of "addition" and "residualization" to be formulated for multivariate settings as well. This then gives rise to the unsupervised boosting algorithm based on forward-stagewise fitting of an additive tree ensemble, which sequentially reduces the Kullback-Leibler divergence from the truth. The algorithm allows analytic evaluation of the fitted density and outputs a generative model that can be readily sampled from. We enhance the algorithm with scale-dependent shrinkage and a two-stage strategy that separately fits the marginals and the copula. The algorithm then performs competitively with state-of-the-art deep-learning approaches in multivariate density estimation on multiple benchmark data sets.
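In the univariate case the "residualization" idea can be illustrated directly: applying the current model's CDF to the data removes that model's structure, leaving residuals that are uniform exactly when the model matches the sampling distribution. Below is a minimal sketch with a Gaussian working model; the multivariate CDF construction and the tree ensembles are beyond this snippet.

```python
import numpy as np
from scipy import stats

def residualize(x, model_cdf):
    """Map data through the current model's CDF; if the model equals the truth,
    the residuals are i.i.d. Uniform(0, 1) and carry no remaining structure."""
    return model_cdf(x)

# data actually drawn from N(1, 2); compare a wrong and a correct working model
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=5000)

u_wrong = residualize(x, stats.norm(0, 1).cdf)   # far from uniform: structure remains
u_right = residualize(x, stats.norm(1, 2).cdf)   # approximately uniform
print(np.histogram(u_wrong, bins=5, range=(0, 1))[0])
print(np.histogram(u_right, bins=5, range=(0, 1))[0])
```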

In this paper, we introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA), which i) requires only a pre-trained source model, ii) allows the source and target domain to have different label sets, i.e., they share a common label set and hold their own private label set, and iii) requires only a few labeled samples in each class of the target domain. To address USMA, we propose a collaborative consistency training framework that regularizes the prediction consistency between two models, i.e., a pre-trained source model and its variant pre-trained with target data only, and combines their complementary strengths to learn a more powerful model. The rationale of our framework stems from the observation that the source model performs better on common categories than the target-only model, while on target-private categories, the target-only model performs better. We also propose a two-perspective, i.e., sample-wise and class-wise, consistency regularization to improve the training. Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
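A minimal sketch of a sample-wise consistency term between the two models' predictions is given below; a symmetric KL divergence between softmax outputs is one common choice, and the exact loss and the class-wise variant used in the paper may differ from this assumption.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(logits_src, logits_tgt, eps=1e-8):
    """Symmetric KL divergence between the source model's and the target-only
    model's predictive distributions, averaged over the batch."""
    p = softmax(logits_src) + eps
    q = softmax(logits_tgt) + eps
    kl_pq = (p * np.log(p / q)).sum(axis=-1)
    kl_qp = (q * np.log(q / p)).sum(axis=-1)
    return float(0.5 * (kl_pq + kl_qp).mean())
```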

Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training, which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain only preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work, are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
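The re-weighting itself is easy to state in code: the effective number of samples for a class with $n$ observations is $E_n=(1-\beta^{n})/(1-\beta)$, and the class weight is proportional to its inverse. The sketch below follows the stated formula; normalizing the weights to sum to the number of classes is a common convention and an assumption here.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Weights proportional to 1 / E_n with E_n = (1 - beta**n) / (1 - beta)."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights / weights.sum() * len(n)   # normalize so the weights sum to #classes

# e.g. a long-tailed 3-class problem: rare classes get larger weights
print(class_balanced_weights([5000, 500, 50], beta=0.999))
```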
