亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider the problem of learning the latent community structure in a Multi-Layer Contextual Block Model introduced by Ma and Nandy (2021), where the average degree for each of the observed networks is of constant order and establish a sharp detection threshold for the community structure, above which detection is possible asymptotically, while below the threshold no procedure can perform better than random guessing. We further establish that the detection threshold coincides with the threshold for weak recovery of the common community structure using multiple correlated networks and co-variate matrices. Finally, we provide a quasi-polynomial time algorithm to estimate the latent communities in the recovery regime. Our results improve upon the results of Ma and Nandy (2021), which considered the diverging degree regime and recovers the results of Lu and Sen (2020) in the special case of a single network structure.

相關內容

Networking:IFIP International Conferences on Networking。 Explanation:國際網絡會議。 Publisher:IFIP。 SIT:

Distributed sparse learning for high dimensional parameters has attached vast attentions due to its wide application in prediction and classification in diverse fields of machine learning. Existing distributed sparse regression usually takes an average way to ensemble the local results produced by distributed machines, which enjoys low communication cost but is statistical inefficient. To address this problem, we proposed a new Weighted AVerage Estimate (WAVE) for high-dimensional regressions. The WAVE is a solution to a weighted least-square loss with an adaptive $L_1$ penalty, in which the $L_1$ penalty controls the sparsity and the weight promotes the statistical efficiency. It can not only achieve a balance between the statistical and communication efficiency, but also reach a faster rate than the average estimate with a very low communication cost, requiring the local machines delivering two vectors to the master merely. The consistency of parameter estimation and model selection is also provided, which guarantees the safety of using WAVE in the distributed system. The consistency also provides a way to make hypothisis testing on the parameter. Moreover, WAVE is robust to the heterogeneous distributed samples with varied mean and covariance across machines, which has been verified by the asymptotic normality under such conditions. Other competitors, however, do not own this property. The effectiveness of WAVE is further illustrated by extensive numerical studies and real data analyses.

The core numbers of vertices in a graph are one of the most well-studied cohesive subgraph models because of the linear running time. In practice, many data graphs are dynamic graphs that are continuously changing by inserting or removing edges. The core numbers are updated in dynamic graphs with edge insertions and deletions, which is called core maintenance. When a burst of a large number of inserted or removed edges come in, we have to handle these edges on time to keep up with the data stream. There are two main sequential algorithms for core maintenance, \textsc{Traversal} and \textsc{Order}. It is proved that the \textsc{Order} algorithm significantly outperforms the \alg{Traversal} algorithm over all tested graphs with up to 2,083 times speedups. To the best of our knowledge, all existing parallel approaches are based on the \alg{Traversal} algorithm; also, their parallelism exists only for affected vertices with different core numbers, which will reduce to sequential when all vertices have the same core numbers. In this paper, we propose a new parallel core maintenance algorithm based on the \alg{Order} algorithm. Importantly, our new approach always has parallelism, even for the graphs where all vertices have the same core numbers. Extensive experiments are conducted over real-world, temporal, and synthetic graphs on a 64-core machine. The results show that for inserting and removing 100,000 edges using 16-worker, our method achieves up to 289x and 10x times speedups compared with the most efficient existing method, respectively.

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. This model generates weights for adapter modules conditioned on both tasks and language embeddings. By learning to combine task and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gain when a mixture of multiple resources is available, while being on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.

In this paper, we develop algorithms for joint user scheduling and three types of mmWave link configuration: relay selection, codebook optimization, and beam tracking in millimeter wave (mmWave) networks. Our goal is to design an online controller that dynamically schedules users and configures their links to minimize the system delay. To solve this complex scheduling problem, we model it as a dynamic decision-making process and develop two reinforcement learning-based solutions. The first solution is based on deep reinforcement learning (DRL), which leverages the proximal policy optimization to train a neural network-based solution. Due to the potential high sample complexity of DRL, we also propose an empirical multi-armed bandit (MAB)-based solution, which decomposes the decision-making process into a sequential of sub-actions and exploits classic maxweight scheduling and Thompson sampling to decide those sub-actions. Our evaluation of the proposed solutions confirms their effectiveness in providing acceptable system delay. It also shows that the DRL-based solution has better delay performance while the MAB-based solution has a faster training process.

The current success of Graph Neural Networks (GNNs) usually relies on loading the entire attributed graph for processing, which may not be satisfied with limited memory resources, especially when the attributed graph is large. This paper pioneers to propose a Binary Graph Convolutional Network (Bi-GCN), which binarizes both the network parameters and input node attributes and exploits binary operations instead of floating-point matrix multiplications for network compression and acceleration. Meanwhile, we also propose a new gradient approximation based back-propagation method to properly train our Bi-GCN. According to the theoretical analysis, our Bi-GCN can reduce the memory consumption by an average of ~31x for both the network parameters and input data, and accelerate the inference speed by an average of ~51x, on three citation networks, i.e., Cora, PubMed, and CiteSeer. Besides, we introduce a general approach to generalize our binarization method to other variants of GNNs, and achieve similar efficiencies. Although the proposed Bi-GCN and Bi-GNNs are simple yet efficient, these compressed networks may also possess a potential capacity problem, i.e., they may not have enough storage capacity to learn adequate representations for specific tasks. To tackle this capacity problem, an Entropy Cover Hypothesis is proposed to predict the lower bound of the width of Bi-GNN hidden layers. Extensive experiments have demonstrated that our Bi-GCN and Bi-GNNs can give comparable performances to the corresponding full-precision baselines on seven node classification datasets and verified the effectiveness of our Entropy Cover Hypothesis for solving the capacity problem.

Today, an increasing number of Adaptive Deep Neural Networks (AdNNs) are being used on resource-constrained embedded devices. We observe that, similar to traditional software, redundant computation exists in AdNNs, resulting in considerable performance degradation. The performance degradation is dependent on the input and is referred to as input-dependent performance bottlenecks (IDPBs). To ensure an AdNN satisfies the performance requirements of resource-constrained applications, it is essential to conduct performance testing to detect IDPBs in the AdNN. Existing neural network testing methods are primarily concerned with correctness testing, which does not involve performance testing. To fill this gap, we propose DeepPerform, a scalable approach to generate test samples to detect the IDPBs in AdNNs. We first demonstrate how the problem of generating performance test samples detecting IDPBs can be formulated as an optimization problem. Following that, we demonstrate how DeepPerform efficiently handles the optimization problem by learning and estimating the distribution of AdNNs' computational consumption. We evaluate DeepPerform on three widely used datasets against five popular AdNN models. The results show that DeepPerform generates test samples that cause more severe performance degradation (FLOPs: increase up to 552\%). Furthermore, DeepPerform is substantially more efficient than the baseline methods in generating test inputs(runtime overhead: only 6-10 milliseconds).

Graph Neural Networks (GNNs) draw their strength from explicitly modeling the topological information of structured data. However, existing GNNs suffer from limited capability in capturing the hierarchical graph representation which plays an important role in graph classification. In this paper, we innovatively propose hierarchical graph capsule network (HGCN) that can jointly learn node embeddings and extract graph hierarchies. Specifically, disentangled graph capsules are established by identifying heterogeneous factors underlying each node, such that their instantiation parameters represent different properties of the same entity. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (part) and higher-level capsules (whole) by explicitly considering the structure information among the parts. Experimental studies demonstrate the effectiveness of HGCN and the contribution of each component.

There has been appreciable progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. In this paper we theoretically group different approaches under a unifying framework and empirically investigate the effectiveness of different network representation methods. In particular, we argue that most of the UNRL approaches either explicitly or implicit model and exploit context information of a node. Consequently, we propose a framework that casts a variety of approaches -- random walk based, matrix factorization and deep learning based -- into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study the differences among these methods in detail which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering 9 popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks -- node classification and link prediction. We find that there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph related tasks. However, existing GNN models mainly focus on designing graph convolution operations. The graph pooling (or downsampling) operations, that play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of graph's topological information, we further introduce a structure learning mechanism to learn a refined graph structure for the pooled graph at each layer. By combining HGP-SL operator with graph neural networks, we perform graph level representation learning with focus on graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.

In this paper, we present an accurate and scalable approach to the face clustering task. We aim at grouping a set of faces by their potential identities. We formulate this task as a link prediction problem: a link exists between two faces if they are of the same identity. The key idea is that we find the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors. By constructing sub-graphs around each instance as input data, which depict the local context, we utilize the graph convolution network (GCN) to perform reasoning and infer the likelihood of linkage between pairs in the sub-graphs. Experiments show that our method is more robust to the complex distribution of faces than conventional methods, yielding favorably comparable results to state-of-the-art methods on standard face clustering benchmarks, and is scalable to large datasets. Furthermore, we show that the proposed method does not need the number of clusters as prior, is aware of noises and outliers, and can be extended to a multi-view version for more accurate clustering accuracy.

北京阿比特科技有限公司