Energy systems optimization problems are complex due to strongly non-linear system behavior and multiple competing objectives, e.g. economic gain vs. environmental impact. Moreover, real-world applications commonly involve a large number of input variables of different types, e.g. continuous and categorical. In some cases, proposed optimal solutions need to obey explicit input constraints related to physical properties or safety-critical operating conditions. This paper proposes a novel data-driven strategy using tree ensembles for constrained multi-objective optimization of black-box problems with heterogeneous variable spaces, for which the underlying system dynamics are either too complex to model or unknown. In an extensive case study comprising synthetic benchmarks and relevant energy applications, we demonstrate the competitive performance and sampling efficiency of the proposed algorithm compared to other state-of-the-art tools, making it a useful all-in-one solution for real-world applications with limited evaluation budgets.
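
To make "constrained multi-objective optimization" concrete, the sketch below filters a set of already-evaluated candidates down to the feasible, non-dominated ones. It is not the paper's tree-ensemble algorithm; the names `dominates`, `pareto_front`, the example inputs, and the constraint `power <= 5.0` are all hypothetical, and every objective is assumed to be minimized.

```python
from typing import Callable, List, Sequence, Tuple

# (input configuration, objective values); the layout is an assumption for illustration.
Evaluation = Tuple[dict, Sequence[float]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(evals: List[Evaluation],
                 is_feasible: Callable[[dict], bool]) -> List[Evaluation]:
    """Keep feasible evaluations that no other feasible evaluation dominates."""
    feasible = [e for e in evals if is_feasible(e[0])]
    return [e for e in feasible
            if not any(dominates(other[1], e[1]) for other in feasible)]


# Hypothetical data: mixed continuous/categorical inputs, objectives (cost, emissions).
evals = [
    ({"power": 1.0, "fuel": "gas"}, (10.0, 5.0)),
    ({"power": 2.0, "fuel": "gas"}, (8.0, 7.0)),
    ({"power": 3.0, "fuel": "coal"}, (12.0, 6.0)),  # dominated by the first point
    ({"power": 9.0, "fuel": "solar"}, (1.0, 1.0)),  # violates the input constraint
]
front = pareto_front(evals, is_feasible=lambda x: x["power"] <= 5.0)
```

Here `front` keeps the two gas configurations: neither beats the other on both objectives, while the remaining points are either dominated or infeasible.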

Related content

Optimizer

Monte Carlo · MoDELS · Markov chain Monte Carlo · Markov chain · Graph

January 7, 2022

DERGMs: Degeneracy-restricted exponential random graph models

Vishesh Karwa,Sonja Petrović,Denis Bajić

from arxiv, Version 3

Exponential random graph models, or ERGMs, are a flexible and general class of models for modeling dependent data. While the early literature has shown them to be powerful in capturing many network features of interest, recent work highlights difficulties related to the models' ill behavior, such as most of the probability mass being concentrated on a very small subset of the parameter space. This behavior limits both the applicability of an ERGM as a model for real data and inference and parameter estimation via the usual Markov chain Monte Carlo algorithms. To address this problem, we propose a new exponential family of models for random graphs that build on the standard ERGM framework. Specifically, we solve the problem of computational intractability and 'degenerate' model behavior by an interpretable support restriction. We introduce a new parameter based on the graph-theoretic notion of degeneracy, a measure of sparsity whose value is commonly low in real-world networks. The new model family is supported on the sample space of graphs with bounded degeneracy and is called degeneracy-restricted ERGMs, or DERGMs for short. Since DERGMs generalize ERGMs -- the latter is obtained from the former by setting the degeneracy parameter to be maximal -- they inherit good theoretical properties, while at the same time placing their mass more uniformly over realistic graphs. The support restriction allows the use of new (and fast) Monte Carlo methods for inference, thus making the models scalable and computationally tractable. We study various theoretical properties of DERGMs and illustrate how the support restriction improves the model behavior. We also present a fast Monte Carlo algorithm for parameter estimation that avoids many issues faced by Markov chain Monte Carlo algorithms used for inference in ERGMs.
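
The graph-theoretic degeneracy that the support restriction is built on can be computed by repeatedly removing a vertex of minimum degree; the largest degree seen at removal time is the degeneracy. The sketch below is a simple, unoptimized version of that peeling procedure and is not taken from the paper; the function name `degeneracy` and the edge-list input format are assumptions.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Tuple


def degeneracy(edges: Iterable[Tuple[Hashable, Hashable]]) -> int:
    """Degeneracy of a simple undirected graph given as an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    k = 0
    while remaining:
        # Peel off a vertex with the fewest remaining neighbours.
        v = min(remaining, key=lambda x: len(remaining[x]))
        k = max(k, len(remaining[v]))
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
    return k


triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2), (2, 3)]
```

A path or any other tree has degeneracy 1, while a complete graph on n vertices has degeneracy n - 1, which illustrates why a small bound on this value restricts the model to sparse graphs.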

Neural Networks · Interpretability · Networking · Functional · MoDELS

January 6, 2022

Physics-informed neural networks for solving thermo-mechanics problems of functionally graded material

Mayank Raj,Pramod Kumbhar,Ratna Kumar Annabattula

from arxiv, 40 pages, 28 figures

Differential equations are indispensable to engineering and hence to innovation. In recent years, physics-informed neural networks (PINNs) have emerged as a novel method for solving differential equations. The PINN method has the advantage of being meshless and scalable, and it can potentially transfer the knowledge learned from solving one differential equation to another. Work in this field has largely been limited to linear-elasticity and crack-propagation problems. This study uses PINNs to solve coupled thermo-mechanics problems of materials with functionally graded properties. An in-depth analysis of the PINN framework has been carried out by examining the training datasets, model architecture, and loss functions. The efficacy of the PINN models in solving thermo-mechanics differential equations has been measured by comparing the obtained solutions either with analytical solutions or with finite element method-based solutions. While an $R^2$ score of more than 99% has been achieved in predicting primary variables such as displacement and temperature fields, achieving the same for secondary variables such as stress turns out to be more challenging. This study is the first to implement the PINN framework for solving coupled thermo-mechanics problems on composite materials. This study is expected to enhance the understanding of the novel PINN framework and will be seminal for further research on PINNs.

Optimization · INFORMS · state-of-the-art · Optimizer · Episode

January 6, 2022

Observability-Aware Trajectory Optimization: Theory, Viability, and State of the Art

Christopher Grebe,Emmett Wise,Jonathan Kelly

from arxiv, In Proceedings of the IEEE International Conference on Multisensor Fusion and Integration (MFI'21), Karlsruhe, Germany, Sep. 23-25, 2021

Ideally, robots should move in ways that maximize the knowledge gained about the state of both their internal system and the external operating environment. Trajectory design is a challenging problem that has been investigated from a variety of perspectives, ranging from information-theoretic analyses to learning-based approaches. Recently, observability-based metrics have been proposed to find trajectories that enable rapid and accurate state and parameter estimation. The viability and efficacy of these methods are not yet well understood in the literature. In this paper, we compare two state-of-the-art methods for observability-aware trajectory optimization and seek to add important theoretical clarifications and valuable discussion about their overall effectiveness. For evaluation, we examine the representative task of sensor-to-sensor extrinsic self-calibration using a realistic physics simulator. We also study the sensitivity of these algorithms to changes in the information content of the exteroceptive sensor measurements.

Machine Learning · Performer · Optimizer · GPU · Learning

January 5, 2022

Dynamic GPU Energy Optimization for Machine Learning Training Workloads

Farui Wang,Weizhe Zhang,Shichao Lai,Meng Hao,Zheng Wang

from arxiv, Accepted for publication in IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS)

GPUs are widely used to accelerate the training of machine learning workloads. As modern machine learning models become increasingly large, they require a longer time to train, leading to higher GPU energy consumption. This paper presents GPOEO, an online GPU energy optimization framework for machine learning training workloads. GPOEO dynamically determines the optimal energy configuration by employing novel techniques for online measurement, multi-objective prediction modeling, and search optimization. To characterize the target workload behavior, GPOEO utilizes GPU performance counters. To reduce the performance counter profiling overhead, it uses an analytical model to detect the training iteration change and only collects performance counter data when an iteration shift is detected. GPOEO employs multi-objective models based on gradient boosting and a local search algorithm to find a trade-off between execution time and energy consumption. We evaluate GPOEO by applying it to 71 machine learning workloads from two AI benchmark suites running on an NVIDIA RTX3080Ti GPU. Compared with the NVIDIA default scheduling strategy, GPOEO delivers a mean energy saving of 16.2% with a modest average execution time increase of 5.1%.

Correlation coefficient · Covariance matrix · Sample · Identifiable · Almost surely

January 4, 2022

Large sample correlation matrices: a comparison theorem and its applications

Johannes Heiny

from arxiv, 20 pages

In this paper, we show that the diagonal of a high-dimensional sample covariance matrix stemming from $n$ independent observations of a $p$-dimensional time series with finite fourth moments can be approximated in spectral norm by the diagonal of the population covariance matrix. We assume that $n,p\to \infty$ with $p/n$ tending to a constant which might be positive or zero. As applications, we provide an approximation of the sample correlation matrix ${\mathbf R}$ and derive a variety of results for its eigenvalues. We identify the limiting spectral distribution of ${\mathbf R}$ and construct an estimator for the population correlation matrix and its eigenvalues. Finally, the almost sure limits of the extreme eigenvalues of ${\mathbf R}$ in a generalized spiked correlation model are analyzed.
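
The step from the diagonal approximation to the sample correlation matrix can be written out as follows. The symbols $\mathbf{S}$ (sample covariance matrix) and $\boldsymbol{\Sigma}$ (population covariance matrix) are notation assumed here, not taken from the abstract, and the display uses the standard definition of ${\mathbf R}$; the precise normalization and mode of convergence are those of the paper and are not reproduced.

```latex
\mathbf{R}
  = \operatorname{diag}(\mathbf{S})^{-1/2}\,\mathbf{S}\,\operatorname{diag}(\mathbf{S})^{-1/2},
\qquad
\operatorname{diag}(\mathbf{S}) \approx \operatorname{diag}(\boldsymbol{\Sigma})
\ \text{(in spectral norm)}
\;\Longrightarrow\;
\mathbf{R} \approx
  \operatorname{diag}(\boldsymbol{\Sigma})^{-1/2}\,\mathbf{S}\,\operatorname{diag}(\boldsymbol{\Sigma})^{-1/2}.
```

Under this reading, and assuming the population variances stay bounded away from zero, eigenvalue results for ${\mathbf R}$ can be obtained from those of a sample covariance matrix rescaled by deterministic weights.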

Learning · INTERACT · MoDELS · Deep reinforcement learning · Reinforcement learning

January 3, 2022

Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning

Michael Curry,Alexander Trott,Soham Phade,Yu Bai,Stephan Zheng

Real economies can be seen as a sequential imperfect-information game with many heterogeneous, interacting strategic agents of various agent types, such as consumers, firms, and governments. Dynamic general equilibrium (DGE) models are common economic tools to model the economic activity, interactions, and outcomes in such systems. However, existing analytical and computational methods struggle to find explicit equilibria when all agents are strategic and interact, while joint learning is unstable and challenging. Among other reasons, a key one is that the actions of one economic agent may change the reward function of another agent, e.g., a consumer's expendable income changes when firms change prices or governments change taxes. We show that multi-agent deep reinforcement learning (RL) can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types, in economic simulations with many agents, through the use of structured learning curricula and efficient GPU-only simulation and training. Conceptually, our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing, that are commonly used for analytical tractability. Our GPU implementation enables training and analyzing economies with a large number of agents within reasonable time frames, e.g., training completes within a day. We demonstrate our approach in real-business-cycle (RBC) models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes. We validate the learned meta-game epsilon-Nash equilibria through approximate best-response analyses, show that RL policies align with economic intuitions, and show that our approach is constructive, e.g., by explicitly learning a spectrum of meta-game epsilon-Nash equilibria in open RBC models.
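
An epsilon-Nash equilibrium is a strategy profile in which no player can improve their payoff by more than epsilon through a unilateral deviation. The sketch below checks this condition exhaustively in a small finite game; it illustrates the definition only and is not the paper's approximate best-response analysis. The helper names and the prisoner's-dilemma payoff table are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Profile = Tuple[str, ...]


def max_unilateral_gain(payoff: Callable[[int, Profile], float],
                        profile: Profile,
                        actions: Sequence[Sequence[str]]) -> float:
    """Largest payoff improvement any single player gets by deviating alone."""
    best_gain = 0.0
    for i in range(len(profile)):
        current = payoff(i, profile)
        for a in actions[i]:
            deviated = profile[:i] + (a,) + profile[i + 1:]
            best_gain = max(best_gain, payoff(i, deviated) - current)
    return best_gain


def is_epsilon_nash(payoff: Callable[[int, Profile], float],
                    profile: Profile,
                    actions: Sequence[Sequence[str]],
                    eps: float) -> bool:
    """True if no unilateral deviation gains more than eps."""
    return max_unilateral_gain(payoff, profile, actions) <= eps


# Hypothetical two-player game (a prisoner's dilemma), not taken from the paper.
TABLE = {
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 5.0),
    ("D", "C"): (5.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}


def payoff(i: int, profile: Profile) -> float:
    return TABLE[profile][i]


actions = [("C", "D"), ("C", "D")]
```

In this toy game mutual defection is an exact equilibrium, while mutual cooperation only qualifies once epsilon is at least the gain from defecting. With learned policies the deviation set cannot be enumerated, which is why the abstract speaks of approximate best responses.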

Deep reinforcement learning · Reinforcement learning · Learning · Generative methods · Overfitting

November 18, 2021

A Survey of Generalisation in Deep Reinforcement Learning

Robert Kirk,Amy Zhang,Edward Grefenstette,Tim Rocktäschel

The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works. We go on to categorise existing benchmarks for generalisation, as well as current methods for tackling the generalisation problem. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in generalisation, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for generalisation, and we recommend building benchmarks in underexplored problem settings such as offline RL generalisation and reward-function variation.

Optimizer · MoDELS · Outlier · Performer · AIM

June 25, 2021

Optimal Counterfactual Explanations in Tree Ensembles

Axel Parmentier,Thibaut Vidal

from arxiv, Authors Accepted Manuscript (AAM), to be published in the Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Additional typo corrections. Open source code available at //github.com/vidalt/OCEAN

Counterfactual explanations are usually generated through heuristics that are sensitive to the search's initial conditions. The absence of guarantees of performance and robustness hinders trustworthiness. In this paper, we take a disciplined approach towards counterfactual explanations for tree ensembles. We advocate for a model-based search aiming at "optimal" explanations and propose efficient mixed-integer programming approaches. We show that isolation forests can be modeled within our framework to focus the search on plausible explanations with a low outlier score. We provide comprehensive coverage of additional constraints that model important objectives, heterogeneous data types, structural constraints on the feature space, along with resource and actionability restrictions. Our experimental analyses demonstrate that the proposed search approach requires a computational effort that is orders of magnitude smaller than previous mathematical programming algorithms. It scales up to large data sets and tree ensembles, where it provides, within seconds, systematic explanations grounded on well-defined models solved to optimality.

Learning · Constraint · Reinforcement learning · contrastive · Critic

May 21, 2021

Inverse Constrained Reinforcement Learning

Usman Anwar,Shehryar Malik,Alireza Aghasi,Ali Ahmed

from arxiv, Camera-ready version for ICML 2021

In real-world settings, numerous constraints are present which are hard to specify mathematically. However, for the real-world deployment of reinforcement learning (RL), it is critical that RL agents are aware of these constraints, so that they can act safely. In this work, we consider the problem of learning constraints from demonstrations of a constraint-abiding agent's behavior. We experimentally validate our approach and show that our framework can successfully learn the most likely constraints that the agent respects. We further show that these learned constraints are transferable to new agents that may have different morphologies and/or reward functions. Previous works in this regard have mainly been restricted to tabular (discrete) settings or specific types of constraints, or assume knowledge of the environment's transition dynamics. In contrast, our framework is able to learn arbitrary Markovian constraints in high dimensions in a completely model-free setting. The code can be found at: //github.com/shehryar-malik/icrl.

DQN · Generalization theory · Regularization term · Learning · Performer

January 30, 2019

Generalization and Regularization in DQN

Jesse Farebrother,Marlos C. Machado,Michael Bowling

Deep reinforcement learning (RL) algorithms have shown an impressive ability to learn complex control policies in high-dimensional environments. However, despite the ever-increasing performance on popular benchmarks such as the Arcade Learning Environment (ALE), policies learned by deep RL algorithms often struggle to generalize when evaluated in remarkably similar environments. In this paper, we assess the generalization capabilities of DQN, one of the most traditional deep RL algorithms in the field. We provide evidence suggesting that DQN overspecializes to the training environment. We comprehensively evaluate the impact of traditional regularization methods, $\ell_2$-regularization and dropout, and of reusing the learned representations to improve the generalization capabilities of DQN. We perform this study using different game modes of Atari 2600 games, a recently introduced modification for the ALE which supports slight variations of the Atari 2600 games traditionally used for benchmarking. Despite regularization being largely underutilized in deep RL, we show that it can, in fact, help DQN learn more general features. These features can then be reused and fine-tuned on similar tasks, considerably improving the sample efficiency of DQN.
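
The two regularizers named in the abstract can be sketched in a few lines. This is a generic, framework-free illustration and not the authors' DQN code: `l2_penalty` uses the convention lam * ||w||^2 (some libraries use lam / 2 instead), and `dropout` is the common "inverted" variant that rescales surviving units at training time. Both function names are hypothetical.

```python
import random
from typing import List


def l2_penalty(weights: List[float], lam: float) -> float:
    """Term added to the training loss: lam times the squared l2 norm."""
    return lam * sum(w * w for w in weights)


def dropout(activations: List[float], p: float, rng: random.Random,
            training: bool = True) -> List[float]:
    """Inverted dropout: zero each unit with probability p, scale the rest by 1/(1-p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)  # identity at evaluation time
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(hidden, p=0.5, rng=rng)
loss_term = l2_penalty([3.0, 4.0], lam=0.1)
```

Because survivors are rescaled during training, the expected activation is unchanged and no extra scaling is needed when the policy is evaluated.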