我的美女教师在线观看免费_色又黄又刺激的国语对白视频_一级欧洲性爱视频_五月丁香六月婷婷视频在线观看_99久久亚洲综合精品成人毛片_激情综合网婷婷深爱五月丁香_国产高清乱伦视频

Rapid growth of genetic databases means huge savings from improvements in their data compression, what requires better inexpensive statistical models. This article proposes automatized optimizations e.g. of Markov-like models, especially context binning and model clustering. While it is popular to cut low bits of context, proposed context binning optimizes such reduction as tabled: state=bin[context] determining probability distribution, this way extracting nearly all useful information also from very large contexts, into a small number of states. Model clustering uses k-means clustering in space of general statistical models, allowing to optimize a few models (as cluster centroids) to be chosen e.g. separately for each read. There are also briefly discussed some adaptivity techniques to include data non-stationarity. This article is work in progress, to be expanded in the future.

相關內容

簇(cu)

關注 1

異常檢測 · Extensibility · 模型評估 · Conformer · 樣本復雜度 ·

2022 年 4 月 20 日

Composite Anomaly Detection via Hierarchical Dynamic Search

Benjamin Wolff,Tomer Gafni,Guy Revach,Nir Shlezinger,Kobi Cohen

Anomaly detection among a large number of processes arises in many applications ranging from dynamic spectrum access to cybersecurity. In such problems one can often obtain noisy observations aggregated from a chosen subset of processes that conforms to a tree structure. The distribution of these observations, based on which the presence of anomalies is detected, may be only partially known. This gives rise to the need for a search strategy designed to account for both the sample complexity and the detection accuracy, as well as cope with statistical models that are known only up to some missing parameters. In this work we propose a sequential search strategy using two variations of the Generalized Local Likelihood Ratio statistic. Our proposed Hierarchical Dynamic Search (HDS) strategy is shown to be order-optimal with respect to the size of the search space and asymptotically optimal with respect to the detection accuracy. An explicit upper bound on the error probability of HDS is established for the finite sample regime. Extensive experiments are conducted, demonstrating the performance gains of HDS over existing methods.

知識 (knowledge) · 查準率/準確率 · PCA · 基 · 知識庫 ·

2022 年 4 月 18 日

Knowledge Base Index Compression via Dimensionality and Precision Reduction

Vilém Zouhar,Marius Mosbach,Miaoran Zhang,Dietrich Klakow

from arxiv, To be presented at Spa-NLP workshop at ACL 2022

Recently neural network based approaches to knowledge-intensive NLP tasks, such as question answering, started to rely heavily on the combination of neural retrievers and readers. Retrieval is typically performed over a large textual knowledge base (KB) which requires significant memory and compute resources, especially when scaled up. On HotpotQA we systematically investigate reducing the size of the KB index by means of dimensionality (sparse random projections, PCA, autoencoders) and numerical precision reduction. Our results show that PCA is an easy solution that requires very little data and is only slightly worse than autoencoders, which are less stable. All methods are sensitive to pre- and post-processing and data should always be centered and normalized both before and after dimension reduction. Finally, we show that it is possible to combine PCA with using 1bit per dimension. Overall we achieve (1) 100$\times$ compression with 75%, and (2) 24$\times$ compression with 92% original retrieval performance.

Automator · Machine Learning · 學成 · AutoML · Performer ·

2022 年 4 月 18 日

AutoMLBench: A Comprehensive Experimental Evaluation of Automated Machine Learning Frameworks

Hassan Eldeeb,Mohamed Maher,Oleh Matsuk,Abdelrahman Aldallal,Radwa Elshawi,Sherif Sak

Nowadays, machine learning is playing a crucial role in harnessing the power of the massive amounts of data that we are currently producing every day in our digital world. With the booming demand for machine learning applications, it has been recognized that the number of knowledgeable data scientists can not scale with the growing data volumes and application needs in our digital world. In response to this demand, several automated machine learning (AutoML) techniques and frameworks have been developed to fill the gap of human expertise by automating the process of building machine learning pipelines. In this study, we present a comprehensive evaluation and comparison of the performance characteristics of six popular AutoML frameworks, namely, Auto-Weka, AutoSKlearn, TPOT, Recipe, ATM, and SmartML across 100 data sets from established AutoML benchmark suites. Our experimental evaluation considers different aspects for its comparison including the performance impact of several design decisions including time budget, size of search space, meta-learning, and ensemble construction. The results of our study reveal various interesting insights that can significantly guide and impact the design of AutoML frameworks.

Networking · 代碼 · 線性的 · Performer · Performance ·

2022 年 4 月 18 日

Completion Delay of Random Linear Network Coding in Full-Duplex Relay Networks

Rina Su,Qifu Tyler Sun,Zhongshan Zhang

As the next-generation wireless networks thrive, full-duplex and relaying techniques are combined to improve the network performance. Random linear network coding (RLNC) is another popular technique to enhance the efficiency and reliability in wireless communications. In this paper, in order to explore the potential of RLNC in full-duplex relay networks, we investigate two fundamental perfect RLNC schemes and theoretically analyze their completion delay performance. The first scheme is a straightforward application of conventional perfect RLNC studied in wireless broadcast, so it involves no additional process at the relay. Its performance serves as an upper bound among all perfect RLNC schemes. The other scheme allows sufficiently large buffer and unconstrained linear coding at the relay. It attains the optimal performance and serves as a lower bound among all RLNC schemes. For both schemes, closed-form formulae to characterize the expected completion delay at a single receiver as well as for the whole system are derived. Numerical results are also demonstrated to justify the theoretical characterizations, and compare the two new schemes with the existing one.

可約的 · 服務器 · 邊 · Continuity · Performer ·

2022 年 4 月 15 日

Server Free Wireless Federated Learning: Architecture, Algorithm, and Analysis

Howard H. Yang,Zihan Chen,Tony Q. S. Quek

We demonstrate that merely analog transmissions and match filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by the global parameter uploading, which exploits the full potential of UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of a server-based FL. But if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, whereas the gain is particularly pronounced when communication latency is high. These findings are corroborated via excessive simulations.

學成 · 高斯分布 · UniFormer · Pair · contrastive ·

2022 年 4 月 14 日

Testing distributional assumptions of learning algorithms

Ronitt Rubinfeld,Arsen Vasilyan

There are many important high dimensional function classes that have fast agnostic learning algorithms when strong assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be sufficiently confident that the data indeed satisfies the distributional assumption, so that one can trust in the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs $(\mathcal{A},\mathcal{T})$, such that if the distribution on examples in the data passes the tester $\mathcal{T}$ then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This qualitatively matches that of the best known ordinary agnostic learning algorithms for this task. In contrast, finite sample Gaussian distribution testers do not exist for the $L_1$ and EMD distance measures. A key step in the analysis is a novel characterization of concentration and anti-concentration properties of a distribution whose low-degree moments approximately match those of a Gaussian. We also use tools from polynomial approximation theory. In contrast, we show strong lower bounds on the combined run-times of tester-learner pairs for the problems of agnostically learning convex sets under the Gaussian distribution and for monotone Boolean functions under the uniform distribution over $\{0,1\}^n$. Through these lower bounds we exhibit natural problems where there is a dramatic gap between standard agnostic learning run-time and the run-time of the best tester-learner pair.

多峰值 · 學成 · Extensibility · 深度學習 · Processing（編程語言） ·

2021 年 5 月 24 日

Recent Advances and Trends in Multimodal Deep Learning: A Review

Jabeen Summaira,Xi Li,Amin Muhammad Shoib,Songyuan Li,Jabbar Abdul

Deep Learning has implemented a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning is to create models that can process and link information using various modalities. Despite the extensive development made for unimodal learning, it still cannot cover all the aspects of human learning. Multimodal learning helps to understand and analyze better when various senses are engaged in the processing of information. This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals. Detailed analysis of past and current baseline approaches and an in-depth study of recent advancements in multimodal deep learning applications has been provided. A fine-grained taxonomy of various multimodal deep learning applications is proposed, elaborating on different applications in more depth. Architectures and datasets used in these applications are also discussed, along with their evaluation metrics. Last, main issues are highlighted separately for each domain along with their possible future research directions.

社區發現 · 結點 · 生成模型 · 表示學習 · 多項分布 ·

2019 年 9 月 17 日

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Fan-Yun Sun,Meng Qu,Jordan Hoffmann,Chin-Wei Huang,Jian Tang

from arxiv, Accepted Paper at NeurIPS 2019

This paper focuses on two fundamental tasks of graph analysis: community detection and node representation learning, which capture the global and local structures of graphs, respectively. In the current literature, these two tasks are usually independently studied while they are actually highly correlated. We propose a probabilistic generative model called vGraph to learn community membership and node representation collaboratively. Specifically, we assume that each node can be represented as a mixture of communities, and each community is defined as a multinomial distribution over nodes. Both the mixing coefficients and the community distribution are parameterized by the low-dimensional representations of the nodes and communities. We designed an effective variational inference algorithm which regularizes the community membership of neighboring nodes to be similar in the latent space. Experimental results on multiple real-world graphs show that vGraph is very effective in both community detection and node representation learning, outperforming many competitive baselines in both tasks. We show that the framework of vGraph is quite flexible and can be easily extended to detect hierarchical communities.

Networking · Neural Networks · MoDELS · Performer · 模型性能 ·

2019 年 9 月 8 日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Yu Cheng,Duo Wang,Pan Zhou,Tao Zhang

from arxiv, Published in IEEE Signal Processing Magazine, arXiv version including some recent works

Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past few years, tremendous progress has been made in this area. In this paper, we survey the recent advanced techniques for compacting and accelerating CNNs model developed. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing will be described at the beginning, after that the other techniques will be introduced. For each scheme, we provide insightful analysis regarding the performance, related applications, advantages, and drawbacks etc. Then we will go through a few very recent additional successful methods, for example, dynamic capacity networks and stochastic depths networks. After that, we survey the evaluation matrix, the main datasets used for evaluating the model performance and recent benchmarking efforts. Finally, we conclude this paper, discuss remaining challenges and possible directions on this topic.

知識表示 · Things · 推薦系統 · MoDELS · 邊 ·

2018 年 5 月 10 日

A Unified Knowledge Representation and Context-aware Recommender System in Internet of Things

Yinhao Li,Awa Alqahtani,Ellis Solaiman,Charith Perera,Prem Prakash Jayaraman,Boualem Benatallah,Rajiv Ranjan

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service requirements (QoS), need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. But heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.