
Climate change results in altered air and water temperatures. Temperature increases affect physicochemical properties, such as oxygen concentration, and can shift species distribution and survival, with consequences for ecosystem functioning and services. These ecosystem services have integral value for humankind and are forecast to change under climate warming. A mechanistic understanding of the drivers and magnitude of expected changes is essential in identifying system resilience and mitigation measures. In this work, we present a selection of state-of-the-art Neural Networks (NN) for the prediction of water temperatures in six streams in Germany. We show that the use of methods that compare observed and predicted values, exemplified with the Root Mean Square Error (RMSE), is not sufficient for their assessment. Hence we introduce additional analysis methods for our models to complement the state-of-the-art metrics. These analyses evaluate the NN's robustness, possible maximal and minimal values, and the impact of single input parameters on the output. We thus contribute to understanding the processes within the NN and help practitioners choose architectures and input parameters for reliable water temperature prediction models.
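As a minimal sketch of why a single observed-versus-predicted metric can hide differences between models, the snippet below computes the RMSE for two hypothetical prediction series. The temperature values and the helper names `rmse` and `max_abs_error` are illustrative assumptions, not data or code from the paper.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error between two equally long sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def max_abs_error(observed, predicted):
    """Largest absolute deviation, a simple complementary diagnostic."""
    return max(abs(o - p) for o, p in zip(observed, predicted))

# Hypothetical water temperatures in degrees Celsius (made-up values).
observed = [10.0, 12.0, 14.0, 16.0]
pred_a = [11.0, 11.0, 15.0, 15.0]  # small error at every time step
pred_b = [10.0, 12.0, 14.0, 18.0]  # exact except for one large miss

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, rmse(observed, pred), max_abs_error(observed, pred))
```

Both series have the same RMSE, yet series B concentrates its error in a single value, which is the kind of behaviour that additional analyses are meant to expose.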

Related content

Neural Networks

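Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners who are interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to engineering and technological applications of systems that make significant use of neural network concepts and techniques. This uniquely broad scope facilitates the exchange of ideas between biological and technological studies and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the fields of expertise represented on the Neural Networks editorial board include psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, and Engineering and Applications. Official website: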

INTERACT · Better · state-of-the-art · Approximation · Inference

December 1, 2021

Long Term Motion Prediction Using Keyposes

Sena Kiciroglu,Wei Wang,Mathieu Salzmann,Pascal Fua

from arxiv, Code publicly available at: https://github.com/senakicir/KeyposePrediction

Long term human motion prediction is essential in safety-critical applications such as human-robot interaction and autonomous driving. In this paper we show that to achieve long term forecasting, predicting human pose at every time instant is unnecessary. Instead, it is more effective to predict a few keyposes and approximate intermediate ones by linearly interpolating the keyposes. We demonstrate that our approach enables us to predict realistic motions for up to 5 seconds in the future, which is far longer than the typical 1 second encountered in the literature. Furthermore, because we model future keyposes probabilistically, we can generate multiple plausible future motions by sampling at inference time. Over this extended time period, our predictions are more realistic, more diverse and better preserve the motion dynamics than those state-of-the-art methods yield.
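The interpolation step mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming a pose is stored as a flat list of joint coordinates and that keypose timestamps are strictly increasing; the function name `interpolate_keyposes` and the data layout are hypothetical and not taken from the authors' repository.

```python
from bisect import bisect_right

def interpolate_keyposes(key_times, keyposes, t):
    """Linearly interpolate a pose at time t between the two surrounding keyposes.

    key_times: strictly increasing timestamps, one per keypose.
    keyposes:  list of poses, each a flat list of joint coordinates.
    """
    if not key_times[0] <= t <= key_times[-1]:
        raise ValueError("t lies outside the range covered by the keyposes")
    # Index of the keypose to the right of t, clamped to a valid segment.
    right = min(max(bisect_right(key_times, t), 1), len(key_times) - 1)
    left = right - 1
    t0, t1 = key_times[left], key_times[right]
    w = (t - t0) / (t1 - t0)  # 0 at the left keypose, 1 at the right one
    return [(1 - w) * a + w * b for a, b in zip(keyposes[left], keyposes[right])]

# Three hypothetical keyposes with two coordinates each, at 0 s, 1 s and 5 s.
key_times = [0.0, 1.0, 5.0]
keyposes = [[0.0, 0.0], [2.0, 4.0], [4.0, 0.0]]
print(interpolate_keyposes(key_times, keyposes, 3.0))
```

Querying the function on a dense grid of timestamps yields a full motion sequence from only a handful of predicted keyposes.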

Optimizer · Vector space · Reducible · Inference · Performer

December 1, 2021

Posterior Temperature Optimization in Variational Inference for Inverse Problems

Max-Heinrich Laves,Malte Tölle,Alexander Schlaefer,Sandy Engelhardt

from arxiv, Accepted at Bayesian Deep Learning workshop, NeurIPS 2021

Bayesian methods feature useful properties for solving inverse problems, such as tomographic reconstruction. The prior distribution introduces regularization, which helps solve the ill-posed problem and reduces overfitting. In practice, this often results in a suboptimal posterior temperature and the full potential of the Bayesian approach is not realized. In this paper, we optimize both the parameters of the prior distribution and the posterior temperature using Bayesian optimization. Well-tempered posteriors lead to better predictive performance and improved uncertainty calibration, which we demonstrate for the task of sparse-view CT reconstruction.
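The abstract does not define the posterior temperature. One common formulation in the Bayesian deep learning literature, given here as an assumption rather than as the authors' exact definition, raises the unnormalized posterior to the power $1/T$, with $\theta$ denoting the model parameters and $\mathcal{D}$ the observed data:

```latex
p_T(\theta \mid \mathcal{D}) \;\propto\; \bigl( p(\mathcal{D} \mid \theta)\, p(\theta) \bigr)^{1/T}
```

Under this formulation, $T = 1$ recovers the standard Bayes posterior, $T < 1$ sharpens it, and $T > 1$ flattens it. Other variants temper only the likelihood or reweight the KL term of the variational objective.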

Jacobian · Commentator · Networking · Neural Networks · Layer

November 30, 2021

Critical initialization of wide and deep neural networks through partial Jacobians: general theory and applications to LayerNorm

Darshil Doshi,Tianyu He,Andrey Gromov

from arxiv, 28 pages, 8 figures

Deep neural networks are notorious for defying theoretical treatment. However, when the number of parameters in each layer tends to infinity, the network function is a Gaussian process (GP) and a quantitatively predictive description is possible. The Gaussian approximation allows one to formulate criteria for selecting hyperparameters, such as variances of weights and biases, as well as the learning rate. These criteria rely on the notion of criticality defined for deep neural networks. In this work we describe a new way to diagnose (both theoretically and empirically) this criticality. To that end, we introduce partial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0<l$. These quantities are particularly useful when the network architecture involves many different layers. We discuss various properties of the partial Jacobians such as their scaling with depth and relation to the neural tangent kernel (NTK). We derive the recurrence relations for the partial Jacobians and utilize them to analyze criticality of deep MLP networks with (and without) LayerNorm. We find that the normalization layer changes the optimal values of hyperparameters and critical exponents. We argue that LayerNorm is more stable when applied to preactivations, rather than activations, due to larger correlation depth.
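The verbal definition of the partial Jacobian can be written out as below. The symbol $h^{l}$ for the preactivations of layer $l$, and the plain MLP recursion used in the second line, are notational assumptions made for illustration and may differ from the paper's conventions.

```latex
% Partial Jacobian: derivative of layer-l preactivations w.r.t. layer-l_0 preactivations
J^{l_0,l}_{ij} \;=\; \frac{\partial h^{l}_{i}}{\partial h^{l_0}_{j}}, \qquad l_0 < l

% Assumed MLP recursion h^{k+1} = W^{k+1} \phi(h^{k}) + b^{k+1}; the chain rule then gives
J^{l_0,l} \;=\; W^{l}\,\operatorname{diag}\!\bigl(\phi'(h^{l-1})\bigr)\,\cdots\,W^{l_0+1}\,\operatorname{diag}\!\bigl(\phi'(h^{l_0})\bigr)
```

Each factor is the Jacobian of one layer with respect to the previous one, so the behaviour of the product as $l - l_0$ grows indicates whether signals explode, vanish, or remain of order one with depth.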

MoDELS · Approximation · Processing (programming language) · Controller · Approximation error

November 30, 2021

Prediction with Approximated Gaussian Process Dynamical Models

Thomas Beckers,Sandra Hirche

from arxiv, This article has been accepted for publication by IEEE

The modeling and simulation of dynamical systems is a necessary step for many control approaches. Using classical, parameter-based techniques for modeling of modern systems, e.g., soft robotics or human-robot interaction, is often challenging or even infeasible due to the complexity of the system dynamics. In contrast, data-driven approaches need only a minimum of prior knowledge and scale with the complexity of the system. In particular, Gaussian process dynamical models (GPDMs) provide very promising results for the modeling of complex dynamics. However, the control properties of these GP models are only sparsely researched, which leads to a "blackbox" treatment in modeling and control scenarios. In addition, sampling GPDMs for prediction purposes while respecting their non-parametric nature results in non-Markovian dynamics, making the theoretical analysis challenging. In this article, we present approximated GPDMs which are Markov and analyze their control theoretical properties. Among other things, the approximation error is analyzed and conditions for boundedness of the trajectories are provided. The outcomes are illustrated with numerical examples that show the power of the approximated models while the computational time is significantly reduced.

Networking · Neural Networks · Model evaluation · Artificial neural network · Channel

November 30, 2021

A Scheme of Channel Prediction Based on Artificial Neural Network

Zirui Wen,Ruisi He,Bo Ai,Chen Huang,Mi Yang,Zhangdui Zhong

Accurate channel modeling is the foundation of communication system design. However, the traditional measurement-based modeling approach faces increasing challenges in scenarios with insufficient measurement data. To obtain enough data for channel modeling, the Artificial Neural Network (ANN) is used in this paper to predict channel data. The high mobility railway channel is considered, which is a typical scenario where it is challenging to obtain enough data for modeling within a short sampling interval. Three types of ANNs, the Back Propagation Network, Radial Basis Function Neural Network and Extreme Learning Machine, are considered to predict channel path loss and shadow fading. The Root-Mean-Square error is used to evaluate prediction accuracy. The factors that may influence prediction accuracy are compared and discussed, including the type of network, number of neurons and proportion of training data. It is found that a larger number of neurons can significantly reduce prediction error, whereas the influence of the proportion of training data is relatively small. The results can be used to improve modeling accuracy of path loss and shadow fading when measurement data is reduced.

Forward · Transformer · Scenario · Performer · Better

August 11, 2021

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

Songhua Liu,Tianwei Lin,Dongliang He,Fu Li,Ruifeng Deng,Xin Li,Errui Ding,Hao Wang

from arxiv, Accepted by ICCV 2021 (oral). Codes will be released on https://github.com/wzmsltw/PaintTransformer

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks. While reinforcement learning (RL) based agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. On the other hand, stroke optimization methods search for a set of stroke parameters iteratively in a large search space; such low efficiency significantly limits their prevalence and practicality. Different from previous methods, in this paper, we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 * 512 in near real time. More importantly, since there is no dataset available for training the Paint Transformer, we devise a self-training pipeline such that it can be trained without any off-the-shelf dataset while still achieving excellent generalization capability. Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs. Codes and models are available.

CTR · UniFormer · MoDELS · Criteo · Model evaluation

September 12, 2020

FuxiCTR: An Open Benchmark for Click-Through Rate Prediction

Jieming Zhu,Jinyang Liu,Shuai Yang,Qi Zhang,Xiuqiang He

from arxiv, Feedback and comments are welcome!

In many applications, such as recommender systems, online advertising, and product search, click-through rate (CTR) prediction is a critical task, because its accuracy has a direct impact on both platform revenue and user experience. In recent years, with the prevalence of deep learning, CTR prediction has been widely studied in both academia and industry, resulting in an abundance of deep CTR models. Unfortunately, there is still a lack of a standardized benchmark and uniform evaluation protocols for CTR prediction. This leads to non-reproducible and even inconsistent experimental results among these studies. In this paper, we present an open benchmark (namely FuxiCTR) for reproducible research and provide a rigorous comparison of different models for CTR prediction. Specifically, we ran over 4,600 experiments for a total of more than 12,000 GPU hours in a uniform framework to re-evaluate 24 existing models on two widely-used datasets, Criteo and Avazu. Surprisingly, our experiments show that many models have smaller differences than expected and sometimes are even inconsistent with what is reported in the literature. We believe that our benchmark could not only allow researchers to gauge the effectiveness of new models conveniently, but also share some good practices to fairly compare with the state of the art. We will release all the code and benchmark settings.

MoDELS · Comprehensibility · Extensibility · Generative model · Engineering

April 13, 2020

Reverse Engineering Configurations of Neural Text Generation Models

Yi Tay,Dara Bahri,Che Zheng,Clifford Brunk,Donald Metzler,Andrew Tomkins

from arxiv, ACL 2020

This paper seeks to develop a deeper understanding of the fundamental properties of neural text generation models. The study of artifacts that emerge in machine generated text as a result of modeling choices is a nascent research area. Previously, the extent and degree to which these artifacts surface in generated text have not been well studied. In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated a piece of text, and we conduct an extensive suite of diagnostic tests to observe whether modeling choices (e.g., sampling methods, top-$k$ probabilities, model architectures, etc.) leave detectable artifacts in the text they generate. Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by observing the generated text alone. This suggests that neural text generators may be more sensitive to various modeling choices than previously thought.
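To make one of the modeling choices named in the abstract concrete, here is a minimal sketch of top-$k$ sampling over a vector of next-token logits. The function name `top_k_sample`, the toy logits, and the use of Python's standard `random` module are illustrative assumptions and do not reproduce the paper's experimental setup.

```python
import math
import random

def top_k_sample(logits, k, rng):
    """Sample a token index from the k highest-scoring logits.

    The k best logits are renormalized with a softmax; every other token
    receives zero probability and can never be sampled.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(0)
logits = [0.1, 2.0, -1.0, 1.5]  # hypothetical scores for a 4-token vocabulary
samples = [top_k_sample(logits, 2, rng) for _ in range(10)]
print(samples)  # only indices 1 and 3 can appear when k = 2
```

Because tokens outside the top $k$ never occur, the choice of $k$ shapes the statistics of the generated text, which is the kind of trace the paper sets out to detect.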

Learned · Neural Networks · Representation learning · Separate · Performer

December 10, 2018

Adaptive Neural Trees

Ryutaro Tanno,Kai Arulkumaran,Daniel C. Alexander,Antonio Criminisi,Aditya Nori

Deep neural networks and decision trees operate on largely separate paradigms; typically, the former performs representation learning with pre-specified architectures, while the latter is characterised by learning hierarchies over pre-specified features with data-driven architectures. We unite the two via adaptive neural trees (ANTs), a model that incorporates representation learning into edges, routing functions and leaf nodes of a decision tree, along with a backpropagation-based training algorithm that adaptively grows the architecture from primitive modules (e.g., convolutional layers). ANTs allow increased interpretability via hierarchical clustering, e.g., learning meaningful class associations, such as separating natural vs. man-made objects. We demonstrate this on classification and regression tasks, achieving over 99% and 90% accuracy on the MNIST and CIFAR-10 datasets, and outperforming standard neural networks, random forests and gradient boosted trees on the SARCOS dataset. Furthermore, ANT optimisation naturally adapts the architecture to the size and complexity of the training data.

Smoothing · Attention mechanism · Backpropagation · Viterbi algorithm · Regularization term

February 20, 2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Arthur Mensch,Mathieu Blondel

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
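The smoothing idea can be illustrated with one specific regularizer. The sketch below assumes negative entropy scaled by a strength `gamma`, for which the smoothed max becomes a scaled log-sum-exp and its gradient a softmax. The abstract only states that a strongly convex regularizer is used, so this particular choice and the function names are assumptions.

```python
import math

def smoothed_max(values, gamma):
    """Entropy-regularized max: gamma * log(sum(exp(v / gamma)))."""
    peak = max(values)  # shift by the hard max for numerical stability
    return peak + gamma * math.log(sum(math.exp((v - peak) / gamma) for v in values))

def smoothed_argmax(values, gamma):
    """Gradient of smoothed_max: a softmax with temperature gamma."""
    peak = max(values)
    exps = [math.exp((v - peak) / gamma) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

values = [1.0, 3.0, 2.0]
for gamma in (1.0, 0.1, 0.001):
    # As gamma shrinks, the smoothed value approaches the hard max of 3.0
    # and the gradient concentrates on the index of the largest entry.
    print(gamma, smoothed_max(values, gamma), smoothed_argmax(values, gamma))
```

Replacing every hard max in a DP recursion such as Viterbi with an operator of this kind makes the final value differentiable with respect to the input scores.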