日本人体黄色三级视频,日本一区二区三区免费在线观看

The use of Convolutional Neural Networks (CNNs) is widespread in Deep Learning due to a range of desirable model properties which result in an efficient and effective machine learning framework. However, performant CNN architectures must be tailored to specific tasks in order to incorporate considerations such as the input length, resolution, and dimentionality. In this work, we overcome the need for problem-specific CNN architectures with our Continuous Convolutional Neural Network (CCNN): a single CNN architecture equipped with continuous convolutional kernels that can be used for tasks on data of arbitrary resolution, dimensionality and length without structural changes. Continuous convolutional kernels model long range dependencies at every layer, and remove the need for downsampling layers and task-dependent depths needed in current CNN architectures. We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$\mathrm{D}$) and visual data (2$\mathrm{D}$). Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.

相關內容

CNN

關注 2

估計/估計量 · Continuity · Performer · Analysis · FAST ·

2022 年 7 月 24 日

Fast convergence rates for dose-response estimation

Matteo Bonvini,Edward H. Kennedy

We consider the problem of estimating a dose-response curve, both globally and locally at a point. Continuous treatments arise often in practice, e.g. in the form of time spent on an operation, distance traveled to a location or dosage of a drug. Letting A denote a continuous treatment variable, the target of inference is the expected outcome if everyone in the population takes treatment level A=a. Under standard assumptions, the dose-response function takes the form of a partial mean. Building upon the recent literature on nonparametric regression with estimated outcomes, we study three different estimators. As a global method, we construct an empirical-risk-minimization-based estimator with an explicit characterization of second-order remainder terms. As a local method, we develop a two-stage, doubly-robust (DR) learner. Finally, we construct a mth-order estimator based on the theory of higher-order influence functions. Under certain conditions, this higher order estimator achieves the fastest rate of convergence that we are aware of for this problem. However, the other two approaches are easier to implement using off-the-shelf software, since they are formulated as two-stage regression tasks. For each estimator, we provide an upper bound on the mean-square error and investigate its finite-sample performance in a simulation. Finally, we describe a flexible, nonparametric method to perform sensitivity analysis to the no-unmeasured-confounding assumption when the treatment is continuous.

Analysis · 劃分 · state-of-the-art · 可約的 · Projection ·

2022 年 7 月 24 日

CARGO: AI-Guided Dependency Analysis for Migrating Monolithic Applications to Microservices Architecture

Vikram Nitin,Shubhi Asthana,Baishakhi Ray,Rahul Krishna

from arxiv, ASE '22, October 10-14, 2022, Ann Arbor, MI, USA

Microservices Architecture (MSA) has become a de-facto standard for designing cloud-native enterprise applications due to its efficient infrastructure setup, service availability, elastic scalability, dependability, and better security. Existing (monolithic) systems must be decomposed into microservices to harness these characteristics. Since manual decomposition of large scale applications can be laborious and error-prone, AI-based systems to detect decomposition strategies are gaining popularity. However, the usefulness of these approaches is limited by the expressiveness of the program representation and their inability to model the application's dependency on critical external resources such as databases. Consequently, partitioning recommendations offered by current tools result in architectures that result in (a) distributed monoliths, and/or (b) force the use of (often criticized) distributed transactions. This work attempts to overcome these challenges by introducing CARGO({short for [C]ontext-sensitive l[A]bel p[R]opa[G]ati[O]n})-a novel un-/semi-supervised partition refinement technique that uses a context- and flow-sensitive system dependency graph of the monolithic application to refine and thereby enrich the partitioning quality of the current state-of-the-art algorithms. CARGO was used to augment four state-of-the-art microservice partitioning techniques that were applied on five Java EE applications (including one industrial scale proprietary project). Experiments demostrate that CARGO can improve the partition quality of all modern microservice partitioning techniques. Further, CARGO substantially reduces distributed transactions and a real-world performance evaluation of a benchmark application (deployed under varying loads) shows that CARGO also lowers the overall the latency of the deployed microservice application by 11% and increases throughput by 120% on average.

數據增強 · 生成方法 · 訓練數據 · Processing（編程語言） · Performer ·

2022 年 7 月 22 日

Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers

Markus Bayer,Marc-André Kaufhold,Bj?rn Buchhold,Marcel Keller,J?rg Dallmeyer,Christian Reuter

from arxiv, 17 pages, 3 figure, 5 tables

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve classifiers by artificially created training data. In NLP, there is the challenge of establishing universal rules for text transformations which provide new linguistic patterns. In this paper, we present and evaluate a text generation method suitable to increase the performance of classifiers for long and short texts. We achieved promising improvements when evaluating short as well as long text tasks with the enhancement by our text generation method. Especially with regard to small data analytics, additive accuracy gains of up to 15.53% and 3.56% are achieved within a constructed low data regime, compared to the no augmentation baseline and another data augmentation technique. As the current track of these constructed regimes is not universally applicable, we also show major improvements in several real world low data tasks (up to +4.84 F1-score). Since we are evaluating the method from many perspectives (in total 11 datasets), we also observe situations where the method might not be suitable. We discuss implications and patterns for the successful application of our approach on different types of datasets.

數據集 · 可約的 · Analysis · Airbnb · 可辨認的 ·

2022 年 7 月 22 日

What's in the laundromat? Mapping and characterising offshore owned domestic property in London

Jonathan Bourne,Andrea Ingianni,Rex McKenzie

from arxiv, 27 pages, 7 figures, 7 tables

The UK, particularly London, is a global hub for money laundering, a significant portion of which uses domestic property. However, understanding the distribution and characteristics of offshore domestic property in the UK is challenging due to data availability. This paper attempts to remedy that situation by enhancing a publicly available dataset of UK property owned by offshore companies. We create a data processing pipeline which draws on several datasets and machine learning techniques to create a parsed set of addresses classified into six use classes. The enhanced dataset contains 138,000 properties 44,000 more than the original dataset. The majority are domestic (95k), with a disproportionate amount of those in London (42k). The average offshore domestic property in London is worth 1.33 million GBP collectively this amounts to approximately 56 Billion GBP. We perform an in-depth analysis of the offshore domestic property in London, comparing the price, distribution and entropy/concentration with Airbnb property, low-use/empty property and conventional domestic property. We estimate that the total amount of offshore, low-use and airbnb property in London is between 144,000 and 164,000 and that they are collectively worth between 145-174 billion GBP. Furthermore, offshore domestic property is more expensive and has higher entropy/concentration than all other property types. In addition, we identify two different types of offshore property, nested and individual, which have different price and distribution characteristics. Finally, we release the enhanced offshore property dataset, the complete low-use London dataset and the pipeline for creating the enhanced dataset to reduce the barriers to studying this topic.

Performer · 對數幾率Sigmoid · Neural Networks · Networking · 激活函數 ·

2021 年 9 月 29 日

A Comprehensive Survey and Performance Analysis of Activation Functions in Deep Learning

Shiv Ram Dubey,Satish Kumar Singh,Bidyut Baran Chaudhuri

from arxiv, Submitted to Springer

Neural networks have shown tremendous growth in recent years to solve numerous problems. Various types of neural networks have been introduced to deal with different types of problems. However, the main goal of any neural network is to transform the non-linearly separable input data into more linearly separable abstract features using a hierarchy of layers. These layers are combinations of linear and nonlinear functions. The most popular and common non-linearity layers are activation functions (AFs), such as Logistic Sigmoid, Tanh, ReLU, ELU, Swish and Mish. In this paper, a comprehensive overview and survey is presented for AFs in neural networks for deep learning. Different classes of AFs such as Logistic Sigmoid and Tanh based, ReLU based, ELU based, and Learning based are covered. Several characteristics of AFs such as output range, monotonicity, and smoothness are also pointed out. A performance comparison is also performed among 18 state-of-the-art AFs with different networks on different types of data. The insights of AFs are presented to benefit the researchers for doing further research and practitioners to select among different choices. The code used for experimental comparison is released at: \url{//github.com/shivram1987/ActivationFunctions}.

特化 · 可約的 · Neural Networks · 剪枝 · Networking ·

2021 年 1 月 31 日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Torsten Hoefler,Dan Alistarh,Tal Ben-Nun,Nikoli Dryden,Alexandra Peste

from arxiv, 90 pages, 26 figures

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.

卷積神經網絡 · Neural Networks · Performer · Seven · Processing（編程語言） ·

2019 年 1 月 17 日

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Asifullah Khan,Anabia Sohail,Umme Zahoora,Aqsa Saeed Qureshi

from arxiv, Number of Pages: 60 Number of Figures: 11 Number of Tables:1

Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNN is largely achieved with the use of multiple non-linear feature extraction stages that can automatically learn hierarchical representation from the data. Availability of a large amount of data and improvements in the hardware processing units have accelerated the research in CNNs and recently very interesting deep CNN architectures are reported. The recent race in deep CNN architectures for achieving high performance on the challenging benchmarks has shown that the innovative architectural ideas, as well as parameter optimization, can improve the CNN performance on various vision-related tasks. In this regard, different ideas in the CNN design have been explored such as use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by the restructuring of the processing units. Especially, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting and attention. Additionally, it covers the elementary understanding of the CNN components and sheds light on the current challenges and applications of CNNs.

圖像分割 · 超參數 · state-of-the-art · Networking · Automator ·

2018 年 7 月 19 日

Automatically Designing CNN Architectures for Medical Image Segmentation

Aliasghar Mortazi,Ulas Bagci

from arxiv, Accepted to Machine Learning in Medical Imaging (MLMI 2018)

Deep neural network architectures have traditionally been designed and explored with human expertise in a long-lasting trial-and-error process. This process requires huge amount of time, expertise, and resources. To address this tedious problem, we propose a novel algorithm to optimally find hyperparameters of a deep network architecture automatically. We specifically focus on designing neural architectures for medical image segmentation task. Our proposed method is based on a policy gradient reinforcement learning for which the reward function is assigned a segmentation evaluation utility (i.e., dice index). We show the efficacy of the proposed method with its low computational cost in comparison with the state-of-the-art medical image segmentation networks. We also present a new architecture design, a densely connected encoder-decoder CNN, as a strong baseline architecture to apply the proposed hyperparameter search algorithm. We apply the proposed algorithm to each layer of the baseline architectures. As an application, we train the proposed system on cine cardiac MR images from Automated Cardiac Diagnosis Challenge (ACDC) MICCAI 2017. Starting from a baseline segmentation architecture, the resulting network architecture obtains the state-of-the-art results in accuracy without performing any trial-and-error based architecture design approaches or close supervision of the hyperparameters changes.

SOFT · 硬性注意力 · 注意力機制 · Performer · MoDELS ·

2018 年 1 月 31 日

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

Tao Shen,Tianyi Zhou,Guodong Long,Jing Jiang,Sen Wang,Chengqi Zhang

from arxiv, 12 pages, 3 figures

Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.

條件隨機場 · 隨機場 · INFORMS · 圖像分割 · 卷積神經網絡 ·

2017 年 12 月 27 日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Fahim Irfan Alam,Jun Zhou,Alan Wee-Chung Liew,Xiuping Jia,Jocelyn Chanussot,Yongsheng Gao

from arxiv, Submitted for Journal (Version 2)

Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, convolutional neural network (CNN) has established itself as a powerful model in segmentation and classification by demonstrating excellent performances. The use of a graphical model such as a conditional random field (CRF) contributes further in capturing contextual information and thus improving the segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF. We use multiple spectral cubes to learn deep features using CNN, and then formulate deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and experimented our proposed method on it along with several widely adopted benchmark datasets to evaluate the effectiveness of our method. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.