Traditional implicit generative models are capable of learning highly complex data distributions. However, their training involves distinguishing real data from synthetically generated data using adversarial discriminators, which can lead to unstable training dynamics and mode dropping issues. In this work, we build on the \textit{invariant statistical loss} (ISL) method introduced in \cite{de2024training}, and extend it to handle heavy-tailed and multivariate data distributions. The data generated by many real-world phenomena can only be properly characterised using heavy-tailed probability distributions, and traditional implicit methods struggle to effectively capture their asymptotic behavior. To address this problem, we introduce a generator trained with ISL that uses input noise from a generalised Pareto distribution (GPD). We refer to this generative scheme as Pareto-ISL for conciseness. Our experiments demonstrate that Pareto-ISL accurately models the tails of the distributions while still effectively capturing their central characteristics. The original ISL function was conceived for 1D data sets. When the actual data is $n$-dimensional, a straightforward extension of the method was obtained by targeting the $n$ marginal distributions of the data. This approach is computationally infeasible and ineffective in high-dimensional spaces. To overcome this, we extend the 1D approach using random projections and define a new loss function suited for multivariate data, keeping problems tractable by adjusting the number of projections. We assess its performance in multidimensional generative modeling and explore its potential as a pretraining technique for generative adversarial networks (GANs) to prevent mode collapse, reporting promising results and highlighting its robustness across various hyperparameter settings.
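As a rough illustration of the heavy-tailed input noise mentioned above, the sketch below draws samples from a generalised Pareto distribution by inverse-transform sampling. It covers only the noise source, not the ISL loss or the generator. The function names `gpd_quantile` and `sample_gpd_noise` and the parameter values (tail index `xi`, scale `sigma`, location `mu`) are assumptions chosen for the example and are not taken from the paper.

```python
import math
import random


def gpd_quantile(u, xi, sigma=1.0, mu=0.0):
    """Inverse CDF of the generalised Pareto distribution at u in [0, 1)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    if abs(xi) < 1e-12:
        # The limit xi -> 0 is a shifted, scaled exponential distribution.
        return mu - sigma * math.log1p(-u)
    return mu + sigma * ((1.0 - u) ** (-xi) - 1.0) / xi


def sample_gpd_noise(n, xi, sigma=1.0, mu=0.0, seed=0):
    """Draw n GPD samples by pushing uniform draws through the quantile function."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), xi, sigma, mu) for _ in range(n)]


# Hypothetical settings: xi = 0.5 gives a tail heavy enough that the
# variance is infinite, which Gaussian or uniform input noise cannot mimic.
noise = sample_gpd_noise(10_000, xi=0.5)
```

In a setup like the one described, such samples would take the place of the usual Gaussian or uniform latent input to the generator; larger values of `xi` produce heavier tails.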

Related content

MoDELS

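The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organised with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modelling, from languages and methods to tools and applications. Attendees of MODELS come from diverse backgrounds and include researchers, academics, engineers and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modelling and model-driven software and systems. This year's edition will give the modelling community an opportunity to further advance the foundations of modelling and to propose innovative applications of modelling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security and open source, as well as sustainability.

Optimizer · Smoothness · Functional · Class ·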

December 12, 2024

Doubly-robust inference and optimality in structure-agnostic models with smoothness

Matteo Bonvini,Edward H. Kennedy,Oliver Dukes,Sivaraman Balakrishnan

from arxiv, 68 pages, 3 figures

We study the problem of constructing an estimator of the average treatment effect (ATE) with observational data. The celebrated doubly-robust, augmented-IPW (AIPW) estimator generally requires consistent estimation of both nuisance functions for standard root-n inference, and moreover that the product of the errors of the nuisances should shrink at a rate faster than $n^{-1/2}$. A recent strand of research has aimed to understand the extent to which the AIPW estimator can be improved upon (in a minimax sense). Under structural assumptions on the nuisance functions, the AIPW estimator is typically not minimax-optimal, and improvements can be made using higher-order influence functions (Robins et al, 2017). Conversely, without any assumptions on the nuisances beyond the mean-square-error rates at which they can be estimated, the rate achieved by the AIPW estimator is already optimal (Balakrishnan et al, 2023; Jin and Syrgkanis, 2024). We make three main contributions. First, we propose a new hybrid class of distributions that combine structural agnosticism regarding the nuisance function space with additional smoothness constraints. Second, we calculate minimax lower bounds for estimating the ATE in the new class, as well as in the pure structure-agnostic one. Third, we propose a new estimator of the ATE that enjoys doubly-robust asymptotic linearity; it can yield asymptotically valid Wald-type confidence intervals even when the propensity score or the outcome model is inconsistently estimated, or estimated at a slow rate. Under certain conditions, we show that its rate of convergence in the new class can be much faster than that achieved by the AIPW estimator and, in particular, matches the minimax lower bound rate, thereby establishing its optimality. Finally, we complement our theoretical findings with simulations.
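For readers unfamiliar with the baseline that this abstract refers to, the following is a minimal sketch of the standard AIPW point estimate of the ATE, computed from already-fitted nuisance values. It is not the new estimator proposed in the paper. The function name `aipw_ate`, the argument names and the toy numbers are assumptions made for illustration.

```python
def aipw_ate(y, a, mu1, mu0, pi):
    """Standard AIPW estimate of the average treatment effect.

    y   : observed outcomes
    a   : binary treatment indicators (0 or 1)
    mu1 : fitted outcome regression under treatment, evaluated at each unit
    mu0 : fitted outcome regression under control, evaluated at each unit
    pi  : fitted propensity scores, assumed to lie strictly between 0 and 1
    """
    total = 0.0
    for yi, ai, m1, m0, p in zip(y, a, mu1, mu0, pi):
        plug_in = m1 - m0                                   # outcome-model term
        correction = ai * (yi - m1) / p - (1 - ai) * (yi - m0) / (1.0 - p)
        total += plug_in + correction                       # bias-corrected contribution
    return total / len(y)


# Toy data (hypothetical): four units, constant nuisance estimates.
y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
estimate = aipw_ate(y, a, mu1=[3.5] * 4, mu0=[1.5] * 4, pi=[0.5] * 4)
```

The correction term is what gives the estimator its double robustness: its error depends on the product of the errors in the outcome regressions and the propensity score, which is the product-rate condition discussed in the abstract.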

MoDELS · Learning · Design · Engineering · Mixture model ·

December 12, 2024

FUsion-based ConstitutivE model (FuCe): Towards model-data augmentation in constitutive modelling

Tushar,Sawan Kumar,Souvik Chakraborty

Constitutive modelling is crucial for engineering design and simulations to accurately describe material behavior. However, traditional phenomenological models often struggle to capture the complexities of real materials under varying stress conditions due to their fixed forms and limited parameters. While recent advances in deep learning have addressed some limitations of classical models, purely data-driven methods tend to require large datasets, lack interpretability, and struggle to generalize beyond their training data. To tackle these issues, we introduce "Fusion-based Constitutive model (FuCe): Towards model-data augmentation in constitutive modelling". This approach combines established phenomenological models with an ICNN architecture, designed to train on the limited and noisy force-displacement data typically available in practical applications. The hybrid model inherently adheres to necessary constitutive conditions. During inference, Monte Carlo dropout is employed to generate Bayesian predictions, providing mean values and confidence intervals that quantify uncertainty. We demonstrate the model's effectiveness by learning two isotropic constitutive models and one anisotropic model with a single fibre direction, across six different stress states. The framework's applicability is also showcased in finite element simulations across three geometries of varying complexities. Our results highlight the framework's superior extrapolation capabilities, even when trained on limited and noisy data, delivering accurate and physically meaningful predictions across all numerical examples.

Ensemble · MoDELS · Model evaluation · Comprehensibility · Performer ·

December 12, 2024

Beyond forecast leaderboards: Measuring individual model importance based on contribution to ensemble accuracy

Minsu Kim,Evan L. Ray,Nicholas G. Reich

from arxiv, 28 pages, 8 figures in the main text; includes supplementary material

Ensemble forecasts often outperform forecasts from individual standalone models, and have been used to support decision-making and policy planning in various fields. As collaborative forecasting efforts to create effective ensembles grow, so does interest in understanding individual models' relative importance in the ensemble. To this end, we propose two practical methods that measure the difference between ensemble performance when a given model is or is not included in the ensemble: a leave-one-model-out algorithm and a leave-all-subsets-of-models-out algorithm, which is based on the Shapley value. We explore the relationship between these metrics, forecast accuracy, and the similarity of errors, both analytically and through simulations. We illustrate this measure of the value a component model adds to an ensemble in the presence of other models using US COVID-19 death forecasts. This study offers valuable insight into individual models' unique features within an ensemble, which standard accuracy metrics alone cannot reveal.
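The two importance measures described above can be sketched for point forecasts as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the ensemble is an equally weighted mean, accuracy is mean absolute error, and an empty ensemble is scored with a user-supplied `baseline_error` (needed so that the Shapley value is defined). All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import factorial


def ensemble_mae(forecasts, truth, members, baseline_error):
    """MAE of the equally weighted mean forecast of `members`.

    An empty ensemble has no forecast, so it is scored with `baseline_error`
    (an assumption of this sketch).
    """
    if not members:
        return baseline_error
    total = 0.0
    for t, y in enumerate(truth):
        mean_forecast = sum(forecasts[m][t] for m in members) / len(members)
        total += abs(mean_forecast - y)
    return total / len(truth)


def leave_one_out_importance(forecasts, truth, baseline_error):
    """Increase in ensemble error when each model is removed in turn."""
    models = list(forecasts)
    full = ensemble_mae(forecasts, truth, models, baseline_error)
    return {
        m: ensemble_mae(forecasts, truth, [k for k in models if k != m], baseline_error) - full
        for m in models
    }


def shapley_importance(forecasts, truth, baseline_error):
    """Shapley value of each model, with error reduction as the payoff."""
    models = list(forecasts)
    n = len(models)
    scores = {}
    for m in models:
        others = [k for k in models if k != m]
        value = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = ensemble_mae(forecasts, truth, list(subset), baseline_error)
                with_m = ensemble_mae(forecasts, truth, list(subset) + [m], baseline_error)
                value += weight * (without - with_m)
        scores[m] = value
    return scores


# Toy data: model A is exact, B is biased high and C is biased low.
truth = [10.0, 12.0, 14.0]
forecasts = {"A": [10.0, 12.0, 14.0], "B": [12.0, 14.0, 16.0], "C": [8.0, 10.0, 12.0]}
lomo = leave_one_out_importance(forecasts, truth, baseline_error=2.0)
shap = shapley_importance(forecasts, truth, baseline_error=2.0)
```

In this toy case the exact model A receives zero leave-one-out importance, because the errors of B and C cancel once A is removed, while its Shapley value is the largest of the three. This is the kind of interaction between accuracy and error similarity that the abstract says leaderboard metrics alone cannot reveal.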

MoDELS · Model averaging · Estimation/Estimator · Weight · Pairwise ·

December 12, 2024

Dynamic prediction of an event using multiple longitudinal markers: a model averaging approach

Reza Hashemi,Taban Baghfalaki,Viviane Philipps,Helene Jacqmin-Gadda

Dynamic event prediction, using joint modeling of survival time and longitudinal variables, is extremely useful in personalized medicine. However, the estimation of joint models including many longitudinal markers is still a computational challenge because of the high number of random effects and parameters to be estimated. In this paper, we propose a model averaging strategy to combine predictions from several joint models for the event, including one longitudinal marker only or pairwise longitudinal markers. The prediction is computed as the weighted mean of the predictions from the one-marker or two-marker models, with the time-dependent weights estimated by minimizing the time-dependent Brier score. This method enables us to combine a large number of predictions issued from joint models to achieve a reliable and accurate individual prediction. Advantages and limits of the proposed methods are highlighted in a simulation study by comparison with the predictions from well-specified and misspecified all-marker joint models as well as the one-marker and two-marker joint models. Using the PBC2 data set, the method is used to predict the risk of death in patients with primary biliary cirrhosis. The method is also used to analyze a French cohort study called the 3C data. In our study, seventeen longitudinal markers are considered to predict the risk of death.
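The weighting idea can be illustrated in a deliberately reduced setting: two candidate models, a single prediction horizon, and fully observed binary outcomes. The paper estimates time-dependent weights for many models and uses a time-dependent Brier score; the sketch below ignores censoring and is only meant to show how a Brier-minimising weight arises. The names `brier`, `best_weight`, `combine` and the toy numbers are assumptions.

```python
def brier(pred, outcome):
    """Mean squared difference between predicted risks and 0/1 outcomes."""
    return sum((p - d) ** 2 for p, d in zip(pred, outcome)) / len(pred)


def best_weight(p1, p2, outcome):
    """Weight w in [0, 1] on model 1 minimising the Brier score of
    w * p1 + (1 - w) * p2 (closed-form least squares, then clipped)."""
    num = sum((d - b) * (a - b) for a, b, d in zip(p1, p2, outcome))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0.0:
        return 0.5  # identical predictions: any weight gives the same score
    return min(1.0, max(0.0, num / den))


def combine(p1, p2, w):
    """Weighted mean of the two models' predicted risks."""
    return [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]


# Toy predicted risks of the event from two hypothetical one-marker models.
p1 = [0.9, 0.6, 0.4, 0.1]
p2 = [0.5, 0.1, 0.9, 0.4]
outcome = [1, 0, 1, 0]

w = best_weight(p1, p2, outcome)
averaged = combine(p1, p2, w)
```

Because the Brier score is a convex quadratic in `w`, clipping the unconstrained minimiser to [0, 1] yields the constrained optimum, so the averaged prediction scores at least as well as either model on the data used to fit the weight.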

Optimizer · Class · Generalization theory · Learning · MoDELS ·

December 11, 2024

On improving generalization in a class of learning problems with the method of small parameters for weakly-controlled optimal gradient systems

Getachew K. Befekadu

from arxiv, 9 pages, 3 figures, 3 tables

In this paper, we provide a mathematical framework for improving generalization in a class of learning problems which is related to point estimations for modeling of high-dimensional nonlinear functions. In particular, we consider a variational problem for a weakly-controlled gradient system, whose control input enters into the system dynamics as a coefficient to a nonlinear term which is scaled by a small parameter. Here, the optimization problem consists of a cost functional, which gauges the quality of the estimated model parameters at a certain fixed final time w.r.t. the model validating dataset, and the weakly-controlled gradient system, whose time-evolution is guided by the model training dataset and its perturbed version with small random noise. Using perturbation theory, we provide results that allow us to solve a sequence of optimization problems, i.e., a set of decomposed optimization problems, so as to aggregate the corresponding approximate optimal solutions that are reasonably sufficient for improving generalization in such a class of learning problems. Moreover, we also provide an estimate for the rate of convergence for such approximate optimal solutions. Finally, we present some numerical results for a typical nonlinear regression problem.

Episode · Learning · Agent · MoDELS · state-of-the-art ·

December 11, 2024

GenPlan: Generative sequence models as adaptive planners

Akash Karthikeyan,Yash Vardhan Pant

from arxiv, Accepted in AAAI 2025. Project page: //aku02.github.io/projects/genplan/

Offline reinforcement learning has shown tremendous success in behavioral planning by learning from previously collected demonstrations. However, decision-making in multitask missions still presents significant challenges. For instance, a mission might require an agent to explore an unknown environment, discover goals, and navigate to them, even if it involves interacting with obstacles along the way. Such behavioral planning problems are difficult to solve due to: a) agents failing to adapt beyond the single task learned through their reward function, and b) the inability to generalize to new environments not covered in the training demonstrations, e.g., environments where all doors were unlocked in the demonstrations. Consequently, state-of-the-art decision making methods are limited to missions where the required tasks are well-represented in the training demonstrations and can be solved within a short (temporal) planning horizon. To address this, we propose GenPlan: a stochastic and adaptive planner that leverages discrete-flow models for generative sequence modeling, enabling sample-efficient exploration and exploitation. This framework relies on an iterative denoising procedure to generate a sequence of goals and actions. This approach captures multi-modal action distributions and facilitates goal and task discovery, thereby enhancing generalization to out-of-distribution tasks and environments, i.e., missions not part of the training data. We demonstrate the effectiveness of our method through multiple simulation environments. Notably, GenPlan outperforms the state-of-the-art methods by over 10% on adaptive planning tasks, where the agent adapts to multi-task missions while leveraging demonstrations on single-goal-reaching tasks.

MoDELS · INFORMS · Estimation/Estimator · Correlation coefficient · Full ·

December 11, 2024

Hypothesis tests and model parameter estimation on data sets with missing correlation information

Lukas Koch

from arxiv, 19 pages, 10 figures; follow-up of arxiv.org:2102.06172; added section on Goodness of Fit and composite hypothesis tests

Ideally, all analyses of normally distributed data should include the full covariance information between all data points. In practice, the full covariance matrix between all data points is not always available, either because a result was published without a covariance matrix or because one tries to combine multiple results from separate publications. For simple hypothesis tests, it is possible to define robust test statistics that will behave conservatively in the presence of unknown correlations. For model parameter fits, one can inflate the variance by a factor to ensure that things remain conservative at least up to a chosen confidence level. This paper describes a class of robust test statistics for simple hypothesis tests, as well as an algorithm to determine the necessary inflation factor for model parameter fits, Goodness of Fit tests, and composite hypothesis tests. It then presents some example applications of the methods to real neutrino interaction data and model comparisons.

Integration · MoDELS · Reviewer · Feasible · Processing (programming language) ·

December 11, 2024

A computational framework to predict weld integrity and microstructural heterogeneity: application to hydrogen transmission

J. Wijnen,J. Parker,M. Gagliano,E. Martínez-Pañeda

We present a novel computational framework to assess the structural integrity of welds. In the first stage of the simulation framework, local fractions of microstructural constituents within weld regions are predicted based on steel composition and welding parameters. The resulting phase fraction maps are used to define heterogeneous properties that are subsequently employed in structural integrity assessments using an elastoplastic phase field fracture model. The framework is particularised to predicting failure in hydrogen pipelines, demonstrating its potential to assess the feasibility of repurposing existing pipeline infrastructure to transport hydrogen. First, the process model is validated against experimental microhardness maps for vintage and modern pipeline welds. Additionally, the influence of welding conditions on hardness and residual stresses is investigated, demonstrating that variations in heat input, filler material composition, and weld bead order can significantly affect the properties within the weld region. Coupled hydrogen diffusion-fracture simulations are then conducted to determine the critical pressure at which hydrogen transport pipelines will fail. To this end, the model is enriched with a microstructure-sensitive description of hydrogen transport and hydrogen-dependent fracture resistance. The analysis of an X52 pipeline reveals that even 2 mm defects in a hard heat-affected zone can drastically reduce the critical failure pressure.

MoDELS · Inference · Learning · Link · Reducible ·

December 9, 2024

Efficient user history modeling with amortized inference for deep learning recommendation models

Lars Hertel,Neil Daftary,Fedor Borisyuk,Aman Gupta,Rahul Mazumder

from arxiv, 5 pages, 3 figures, WWW 2025

We study user history modeling via Transformer encoders in deep learning recommendation models (DLRM). Such architectures can significantly improve recommendation quality, but usually incur high latency costs, necessitating infrastructure upgrades or very small Transformer models. An important part of user history modeling is early fusion of the candidate item, and various methods have been studied. We revisit early fusion and compare concatenation of the candidate to each history item against appending it to the end of the list as a separate item. Using the latter method allows us to reformulate the recently proposed amortized history inference algorithm M-FALCON \cite{zhai2024actions} for the case of DLRM models. We show via experimental results that appending with cross-attention performs on par with concatenation and that amortization significantly reduces inference costs. We conclude with results from deploying this model on the LinkedIn Feed and Ads surfaces, where amortization reduces latency by 30\% compared to non-amortized inference.

Loss function (machine learning) · Functional · Loss · Taxonomy · Machine Learning ·

January 13, 2023

A survey and taxonomy of loss functions in machine learning

Lorenzo Ciampiconi,Adam Elwood,Marco Leonardi,Ashraf Mohamed,Alessandro Rozza

Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.