
We define and study a fully convolutional neural network stochastic model, NN-Turb, which generates a 1-dimensional field with some turbulent velocity statistics. In particular, the generated process satisfies the Kolmogorov 2/3 law for the second-order structure function. It also presents negative skewness across scales (i.e., the Kolmogorov 4/5 law) and exhibits intermittency as characterized by skewness and flatness. Furthermore, our model is never in contact with turbulent data and only needs the desired statistical behavior of the structure functions across scales for training.
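The statistics named in this abstract can be measured on any 1-dimensional signal. The following is a minimal sketch, not the NN-Turb model itself: it computes the second-order structure function, skewness and flatness of the increments at several scales, and fits the scaling exponent of the second-order structure function. The function names and the Brownian test signal are assumptions made for illustration. A signal obeying the Kolmogorov 2/3 law would give an exponent near 2/3 in the inertial range; the Brownian path used here gives an exponent near 1, skewness near 0 and flatness near 3.

```python
import numpy as np

def structure_stats(u, lags):
    """Return S2, skewness and flatness of the increments of u for each lag."""
    s2, skew, flat = [], [], []
    for lag in lags:
        du = u[lag:] - u[:-lag]            # increments at scale `lag`
        m2 = np.mean(du ** 2)              # second-order structure function
        s2.append(m2)
        skew.append(np.mean(du ** 3) / m2 ** 1.5)
        flat.append(np.mean(du ** 4) / m2 ** 2)
    return np.array(s2), np.array(skew), np.array(flat)

def scaling_exponent(lags, s2):
    """Least-squares slope of log S2 against log lag."""
    slope, _ = np.polyfit(np.log(lags), np.log(s2), 1)
    return slope

# Hypothetical test signal: a Brownian path (NOT turbulence data).
rng = np.random.default_rng(0)
u = np.cumsum(rng.standard_normal(2 ** 16))
lags = np.array([1, 2, 4, 8, 16, 32, 64])

s2, skew, flat = structure_stats(u, lags)
zeta2 = scaling_exponent(lags, s2)
print("S2 scaling exponent:", zeta2)
print("skewness per scale:", skew)
print("flatness per scale:", flat)
```

Intermittency would show up in the last two arrays as flatness that grows toward small scales, together with negative skewness, which the Gaussian test signal does not exhibit.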

Related content

MoDELS · Networking · Tensor · Neural Networks · GPT-2 ·

January 29, 2024

TQCompressor: improving tensor decomposition methods in neural networks via permutations

V. Abronin, A. Naumov, D. Mazur, D. Bystrov, K. Tsarova, Ar. Melnikov, I. Oseledets, S. Dolgov, R. Brasher, M. Perelshtein

We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions. We explore the challenges posed by the computational and storage demands of pre-trained language models in NLP tasks and propose a permutation-based enhancement to Kronecker decomposition. This enhancement makes it possible to reduce the loss in model expressivity that is usually associated with factorization. We demonstrate this method applied to GPT-2$_{small}$. The result of the compression is the TQCompressedGPT-2 model, featuring 81 million parameters compared to 124 million in GPT-2$_{small}$. We make TQCompressedGPT-2 publicly available. We further enhance the performance of TQCompressedGPT-2 through a training strategy involving multi-step knowledge distillation, using only 3.1% of OpenWebText. TQCompressedGPT-2 surpasses DistilGPT-2 and KnGPT-2 in comparative evaluations, marking an advancement in the efficient and effective deployment of models in resource-constrained environments.
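To see why a permutation can matter for a Kronecker factorization, consider the following minimal sketch. It is not the TQCompressor algorithm: it uses the standard rearrangement-plus-SVD construction of the nearest Kronecker product, and the matrix sizes, function names and the random row permutation are assumptions chosen for illustration. A matrix with exact Kronecker structure is shuffled row-wise; the best Kronecker fit of the shuffled matrix is poor, while undoing the permutation recovers an essentially exact fit.

```python
import numpy as np

def nearest_kron(W, shape_a, shape_b):
    """Best Frobenius-norm fit W ~ kron(A, B) via a rank-1 SVD of the rearranged W."""
    (m1, n1), (m2, n2) = shape_a, shape_b
    # Rearrange so that kron(A, B) becomes the outer product vec(A) vec(B)^T.
    R = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    B = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    return A, B

def rel_error(W, A, B):
    """Relative Frobenius error of the Kronecker approximation."""
    return np.linalg.norm(W - np.kron(A, B)) / np.linalg.norm(W)

rng = np.random.default_rng(0)
A0 = rng.standard_normal((4, 3))
B0 = rng.standard_normal((5, 6))
W_clean = np.kron(A0, B0)                  # exact Kronecker structure, 20 x 18

perm = rng.permutation(W_clean.shape[0])   # hypothetical row shuffle
W_shuffled = W_clean[perm]

err_shuffled = rel_error(W_shuffled, *nearest_kron(W_shuffled, (4, 3), (5, 6)))

W_restored = W_shuffled[np.argsort(perm)]  # apply the inverse permutation
err_restored = rel_error(W_restored, *nearest_kron(W_restored, (4, 3), (5, 6)))

print("error without permutation fix:", err_shuffled)
print("error after undoing permutation:", err_restored)
print("parameters:", A0.size + B0.size, "instead of", W_clean.size)
```

In this toy case the two factors store 42 numbers instead of 360, and the right permutation is known by construction; finding a useful permutation for trained weights is the hard part that the abstract refers to.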

MoDELS · Mixture models · Analysis ·

January 29, 2024

Comparative analysis of phase-field and intrinsic cohesive zone models for fracture simulations in multiphase materials with interfaces: Investigation of the influence of the microstructure on the fracture properties

Rasoul Najafi Koopas, Shahed Rezaei, Natalie Rauter, Richard Ostwald, Rolf Lammering

This study evaluates four widely used fracture simulation methods, comparing their computational expenses and implementation complexities within the Finite Element (FE) framework when employed on heterogeneous solids. Fracture methods considered encompass the intrinsic Cohesive Zone Model (CZM) using zero-thickness cohesive interface elements (CIEs), the Standard Phase-Field Fracture (SPFM) approach, the Cohesive Phase-Field fracture (CPFM) approach, and an innovative hybrid model. The hybrid approach combines the CPFM fracture method with the CZM, specifically applying the CZM within the interface zone. A significant finding from this investigation is that the CPFM method is in agreement with the hybrid model when the interface zone thickness is not excessively small. This implies that the CPFM fracture methodology may serve as a unified fracture approach for multiphase materials, provided the interface zone's thickness is comparable to that of the other phases. In addition, this research provides valuable insights that can advance efforts to fine-tune material microstructures. An investigation of the influence of the interface material properties, morphological features and spatial arrangement of inclusions shows a pronounced effect of these parameters on the fracture toughness of the material.

Inference · Frequentist school · Smoothing · Identifiable · Estimation/estimator ·

January 29, 2024

Bayesian one- and two-sided inference on the local effective dimension

Eduard Belitser

It is a challenge to manage infinite- or high-dimensional data in situations where storage, transmission, or computation resources are constrained. In the simplest scenario, when the data consist of a noisy infinite-dimensional signal, we introduce the notion of local effective dimension (i.e., pertinent to the underlying signal), and formulate and study the problem of its recovery on the basis of noisy data. This problem can be associated with the problems of adaptive quantization, (lossy) data compression, and oracle signal estimation. We apply a Bayesian approach and study frequentist properties of the resulting posterior; a purely frequentist version of the results is also proposed. We derive certain upper and lower bound results about identifying the local effective dimension, which show that only the so-called one-sided inference on the local effective dimension can be ensured, whereas two-sided inference is in general impossible. We establish the minimal conditions under which two-sided inference can be made. Finally, a connection to the problem of smoothness estimation for some traditional smoothness scales (Sobolev scales) is considered.

Networking · Regularization term · Generalized linear models · MoDELS · Linear models ·

January 28, 2024

Doubly regularized generalized linear models for spatial observations with high-dimensional covariates

Arjun Sondhi, Si Cheng, Ali Shojaie

A discrete spatial lattice can be cast as a network structure over which spatially-correlated outcomes are observed. A second network structure may also capture similarities among measured features, when such information is available. Incorporating the network structures when analyzing such doubly-structured data can improve predictive power, and lead to better identification of important features in the data-generating process. Motivated by applications in spatial disease mapping, we develop a new doubly regularized regression framework to incorporate these network structures for analyzing high-dimensional datasets. Our estimators can easily be implemented with standard convex optimization algorithms. In addition, we describe a procedure to obtain asymptotically valid confidence intervals and hypothesis tests for our model parameters. We show empirically that our framework provides improved predictive accuracy and inferential power compared to existing high-dimensional spatial methods. These advantages hold given fully accurate network information, and also with networks which are partially misspecified or uninformative. The application of the proposed method to modeling COVID-19 mortality data suggests that it can improve prediction of deaths beyond standard spatial models, and that it selects relevant covariates more often.

Neural Networks · Networking · Optimizer · Controller · dynamic programming ·

January 27, 2024

Deep multitask neural networks for solving some stochastic optimal control problems

Christian Yeo

from arxiv, 9 pages

Most existing neural network-based approaches for solving stochastic optimal control problems using the associated backward dynamic programming principle rely on the ability to simulate the underlying state variables. However, in some problems, this simulation is infeasible, leading to the discretization of state variable space and the need to train one neural network for each data point. This approach becomes computationally inefficient when dealing with large state variable spaces. In this paper, we consider a class of this type of stochastic optimal control problems and introduce an effective solution employing multitask neural networks. To train our multitask neural network, we introduce a novel scheme that dynamically balances the learning across tasks. Through numerical experiments on real-world derivatives pricing problems, we prove that our method outperforms state-of-the-art approaches.

INTERACT · Moments · Estimation/estimator · Mean · Linear ·

January 27, 2024

A method of moments estimator for interacting particle systems and their mean field limit

Grigorios A. Pavliotis, Andrea Zanoni

We study the problem of learning unknown parameters in stochastic interacting particle systems with polynomial drift, interaction and diffusion functions from the path of one single particle in the system. Our estimator is obtained by solving a linear system which is constructed by imposing appropriate conditions on the moments of the invariant distribution of the mean field limit and on the quadratic variation of the process. Our approach is easy to implement as it only requires the approximation of the moments via the ergodic theorem and the solution of a low-dimensional linear system. Moreover, we prove that our estimator is asymptotically unbiased in the limits of infinite data and infinite number of particles (mean field limit). In addition, we present several numerical experiments that validate the theoretical analysis and show the effectiveness of our methodology to accurately infer parameters in systems of interacting particles.
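The two ingredients named in this abstract, moments approximated by ergodic averages and the quadratic variation of the path, can be illustrated on a much simpler case. The sketch below is an assumption-laden toy, not the authors' estimator: it drops the interaction term entirely and uses a single Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW, whose invariant distribution has second moment sigma^2 / (2 theta). The function names, parameter values and Euler-Maruyama discretization are all choices made for illustration.

```python
import numpy as np

def simulate_ou(theta, sigma, dt, n_steps, rng):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW, started at 0."""
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    noise = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    for k in range(n_steps):
        x[k + 1] = x[k] - theta * x[k] * dt + noise[k]
    return x

def moment_estimator(x, dt):
    """Estimate (theta, sigma) from one observed path."""
    total_time = dt * (len(x) - 1)
    sigma2 = np.sum(np.diff(x) ** 2) / total_time  # quadratic variation / T
    m2 = np.mean(x ** 2)                           # ergodic average of X^2
    theta = sigma2 / (2.0 * m2)                    # invert M2 = sigma^2 / (2 theta)
    return theta, np.sqrt(sigma2)

rng = np.random.default_rng(1)
path = simulate_ou(theta=1.0, sigma=0.5, dt=0.01, n_steps=200_000, rng=rng)
theta_hat, sigma_hat = moment_estimator(path, dt=0.01)
print("theta estimate:", theta_hat)
print("sigma estimate:", sigma_hat)
```

With polynomial drift and interaction terms, more moment conditions are needed and the scalar inversion above becomes the low-dimensional linear system mentioned in the abstract.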

Regularization term · Networking · Continuation method · Path · Continuity ·

January 26, 2024

A multiobjective continuation method to compute the regularization path of deep neural networks

Augustina C. Amakor, Konstantin Sonntag, Sebastian Peitz

from arxiv, 7 pages, 6 figures

Sparsity is a highly desired feature in deep neural networks (DNNs) since it ensures numerical efficiency and improves the interpretability (due to the smaller number of relevant features) and robustness of models. In machine learning approaches based on linear models, it is well known that there exists a connecting path between the sparsest solution in terms of the $\ell^1$ norm, i.e., zero weights, and the non-regularized solution, which is called the regularization path. Very recently, there was a first attempt to extend the concept of regularization paths to DNNs by means of treating the empirical loss and sparsity ($\ell^1$ norm) as two conflicting criteria and solving the resulting multiobjective optimization problem. However, due to the non-smoothness of the $\ell^1$ norm and the high number of parameters, this approach is not very efficient from a computational perspective. To overcome this limitation, we present an algorithm that allows for the approximation of the entire Pareto front for the above-mentioned objectives in a very efficient manner. We present numerical examples using both deterministic and stochastic gradients. We furthermore demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
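For the linear-model case mentioned in this abstract, the regularization path can be traced with a few lines of code. The sketch below is not the multiobjective continuation method of the paper: it uses plain proximal gradient descent (ISTA) with warm starts on the lasso objective (1/2n)||Xw - y||^2 + lambda ||w||_1, and the synthetic data, function names and lambda grid are assumptions made for illustration. The path starts at the all-zero solution and ends at the non-regularized least-squares solution.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def lasso_path(X, y, lambdas, n_iter=500):
    """Warm-started ISTA solutions for a decreasing sequence of lambdas."""
    n, d = X.shape
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(d)
    path = []
    for lam in lambdas:
        for _ in range(n_iter):
            grad = X.T @ (X @ w - y) / n
            w = soft_threshold(w - step * grad, step * lam)
        path.append(w.copy())
    return np.array(path)

# Hypothetical synthetic regression problem.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
w_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(50)

lam_max = np.max(np.abs(X.T @ y)) / len(y)   # smallest lambda with all-zero solution
lambdas = np.linspace(lam_max, 0.0, 20)
path = lasso_path(X, y, lambdas)

nonzeros = (np.abs(path) > 1e-10).sum(axis=1)
w_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print("non-zero weights along the path:", nonzeros)
```

Each row of `path` is one point on the trade-off between data fit and the $\ell^1$ norm, which is the two-criteria view that the abstract extends to DNNs.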

Ensemble · MoDELS · Statistics · Estimation/estimator · Training data ·

January 25, 2024

Clustering-based spatial interpolation of parametric post-processing models

Sándor Baran, Mária Lakatos

from arxiv, 19 pages, 6 figures

Since the start of the operational use of ensemble prediction systems, ensemble-based probabilistic forecasting has become the most advanced approach in weather prediction. However, despite the persistent development of the last three decades, ensemble forecasts still often suffer from the lack of calibration and might exhibit systematic bias, which calls for some form of statistical post-processing. Nowadays, one can choose from a large variety of post-processing approaches, where parametric methods provide full predictive distributions of the investigated weather quantity. Parameter estimation in these models is based on training data consisting of past forecast-observation pairs, thus post-processed forecasts are usually available only at those locations where training data are accessible. We propose a general clustering-based interpolation technique of extending calibrated predictive distributions from observation stations to any location in the ensemble domain where there are ensemble forecasts at hand. Focusing on the ensemble model output statistics (EMOS) post-processing technique, in a case study based on wind speed ensemble forecasts of the European Centre for Medium-Range Weather Forecasts, we demonstrate the predictive performance of various versions of the suggested method and show its superiority over the regionally estimated and interpolated EMOS models and the raw ensemble forecasts as well.

binary · Neural Networks · Networking · Model evaluation · Reducible ·

January 25, 2024

Binary structured physics-informed neural networks for solving equations with rapidly changing solutions

Yanzhi Liu, Ruifan Wu, Ying Jiang

Physics-informed neural networks (PINNs), rooted in deep learning, have emerged as a promising approach for solving partial differential equations (PDEs). By embedding the physical information described by PDEs into feedforward neural networks, PINNs are trained as surrogate models to approximate solutions without the need for labeled data. Nevertheless, even though PINNs have shown remarkable performance, they can face difficulties, especially when dealing with equations featuring rapidly changing solutions. These difficulties encompass slow convergence, susceptibility to becoming trapped in local minima, and reduced solution accuracy. To address these issues, we propose a binary structured physics-informed neural network (BsPINN) framework, which employs a binary structured neural network (BsNN) as the neural network component. By leveraging a binary structure that reduces inter-neuron connections compared to fully connected neural networks, BsPINNs capture the local features of solutions more effectively and efficiently. These features are particularly crucial for learning the rapidly changing nature of solutions. In a series of numerical experiments solving the Burgers equation, Euler equation, Helmholtz equation, and high-dimensional Poisson equation, BsPINNs exhibit superior convergence speed and heightened accuracy compared to PINNs. From these experiments, we discover that BsPINNs resolve the over-smoothing caused by increasing the number of hidden layers in PINNs, and prevent the decline in accuracy due to non-smoothness of PDE solutions.

Greedy · Modality · MoDELS · Learning · Generalization theory ·

February 10, 2022

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu, Stanisław Jastrzębski, Kyunghyun Cho, Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.