清纯唯美另类亚洲欧美综合_精品激情在线观看视频_91精品国产高清久久_欧美日韩精品一区二区性色A V_午夜刺激在线观看无打码_国产色精品一区二区三区_亚洲一级无码日韩不卡二区

Monitoring wastewater concentrations of SARS-CoV-2 yields a low-cost, noninvasive method for tracking disease prevalence and provides early warning signs of upcoming outbreaks in the serviced communities. There is tremendous clinical and public health interest in understanding the exact dynamics between wastewater viral loads and infection rates in the population. As both data sources may contain substantial noise and missingness, in addition to spatial and temporal dependencies, properly modeling this relationship must address these numerous complexities simultaneously while providing interpretable and clear insights. We propose a novel Bayesian functional concurrent regression model that accounts for both spatial and temporal correlations while estimating the dynamic effects between wastewater concentrations and positivity rates over time. We explicitly model the time lag between the two series and provide full posterior inference on the possible delay between spikes in wastewater concentrations and subsequent outbreaks. We estimate a time lag likely between 5 to 11 days between spikes in wastewater levels and reported clinical positivity rates. Additionally, we find a dynamic relationship between wastewater concentration levels and the strength of its association with positivity rates that fluctuates between outbreaks and non-outbreaks.

相關內容

泛函

關注 0

損失函數（機器學習） · 泛函 · 損失 · Taxonomy · Machine Learning ·

2023 年 1 月 13 日

A survey and taxonomy of loss functions in machine learning

Lorenzo Ciampiconi,Adam Elwood,Marco Leonardi,Ashraf Mohamed,Alessandro Rozza

Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.

貪心 · 模態 · MoDELS · 學成 · 泛化理論 ·

2022 年 2 月 10 日

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanis?aw Jastrz?bski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.

泛化理論 · UniFormer · 未標記 · TOOLS · 可辨認的 ·

2021 年 10 月 17 日

Explaining generalization in deep learning: progress and fundamental limits

Vaishnavh Nagarajan

from arxiv, arXiv admin note: text overlap with arXiv:1902.04742

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the thesis, we will empirically study how training deep networks via stochastic gradient descent implicitly controls the networks' capacity. Subsequently, to show how this leads to better generalization, we will derive {\em data-dependent} {\em uniform-convergence-based} generalization bounds with improved dependencies on the parameter count. Uniform convergence has in fact been the most widely used tool in deep learning literature, thanks to its simplicity and generality. Given its popularity, in this thesis, we will also take a step back to identify the fundamental limits of uniform convergence as a tool to explain generalization. In particular, we will show that in some example overparameterized settings, {\em any} uniform convergence bound will provide only a vacuous generalization bound. With this realization in mind, in the last part of the thesis, we will change course and introduce an {\em empirical} technique to estimate generalization using unlabeled data. Our technique does not rely on any notion of uniform-convergece-based complexity and is remarkably precise. We will theoretically show why our technique enjoys such precision. We will conclude by discussing how future work could explore novel ways to incorporate distributional assumptions in generalization bounds (such as in the form of unlabeled data) and explore other tools to derive bounds, perhaps by modifying uniform convergence or by developing completely new tools altogether.

Continuity · state-of-the-art · 學成 · Extensibility · Networking ·

2021 年 4 月 16 日

A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange,Rahaf Aljundi,Marc Masana,Sarah Parisot,Xu Jia,Ales Leonardis,Gregory Slabaugh,Tinne Tuytelaars

from arxiv, Accepted TPAMI paper, including Appendix, code publicly available

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.

過擬合 · SimPLe · Principle · 模型評估 · 統計量 ·

2021 年 3 月 16 日

Deep learning: a statistical viewpoint

Peter L. Bartlett,Andrea Montanari,Alexander Rakhlin

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

特化 · 可約的 · Neural Networks · 剪枝 · Networking ·

2021 年 1 月 31 日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Torsten Hoefler,Dan Alistarh,Tal Ben-Nun,Nikoli Dryden,Alexandra Peste

from arxiv, 90 pages, 26 figures

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.

contrastive · 學成 · 對比學習 · Extensibility · SSL ·

2020 年 6 月 18 日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Krishna Chaitanya,Ertunc Erdil,Neerav Karani,Ender Konukoglu

from arxiv, 16 pages, 2 figures, 7 tables. This article is a pre-print and is currently under review at a conference

A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.

圖形處理器 · 圖 · INTERACT · Performer · Neural Networks ·

2019 年 11 月 6 日

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang,Yuesong Zou,Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.

學成 · Performer · 深度學習 · Processing（編程語言） · 圖像處理 ·

2018 年 7 月 31 日

Deep learning in agriculture: A survey

Andreas Kamilaris,Francesc X. Prenafeta-Boldu

Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. As deep learning has been successfully applied in various domains, it has recently entered also the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of data used, and the overall performance achieved according to the metrics used at each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques, in respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.

DeepWalk · 網絡嵌入 · node2vec · Networking · 分解的 ·

2017 年 12 月 12 日

Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec

Jiezhong Qiu,Yuxiao Dong,Hao Ma,Jian Li,Kuansan Wang,Jie Tang

from arxiv, 9 pages, published in WSDM 2018 proceedings

Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.