Parameterized quantum circuits as machine learning models are typically well described by their representation as a partial Fourier series of the input features, with frequencies uniquely determined by the feature map's generator Hamiltonians. Ordinarily, these data-encoding generators are chosen in advance, fixing the space of functions that can be represented. In this work we consider a generalization of quantum models to include a set of trainable parameters in the generator, leading to a trainable frequency (TF) quantum model. We numerically demonstrate how TF models can learn generators with desirable properties for solving the task at hand, including non-regularly spaced frequencies in their spectra and flexible spectral richness. Finally, we showcase the real-world effectiveness of our approach, demonstrating an improved accuracy in solving the Navier-Stokes equations using a TF model with only a single parameter added to each encoding operation. Since TF models encompass conventional fixed-frequency models, they may offer a sensible default choice for variational quantum machine learning.
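To make the link between generator and spectrum concrete, here is a minimal single-qubit sketch, not the paper's construction. It assumes a hypothetical encoding generator `theta * Z / 2` acting on the state |+>, followed by a Pauli-X measurement. The names `tf_model` and `theta` are illustrative. Under these assumptions the model output is `cos(theta * x)`, so the only non-zero frequency is `theta`: fixing `theta = 1` gives a conventional fixed-frequency model, while treating `theta` as trainable lets the frequency move.

```python
import cmath
import math


def tf_model(x, theta):
    """Pauli-X expectation after encoding x with the generator theta * Z / 2.

    Assumption: one qubit prepared in |+>, with no trainable layers around
    the encoding. This is a toy model, not the architecture from the paper.
    """
    # |+> = (|0> + |1>) / sqrt(2)
    amp0 = 1 / math.sqrt(2)
    amp1 = 1 / math.sqrt(2)
    # exp(-i * x * theta * Z / 2) is diagonal, so it only adds phases.
    amp0 = amp0 * cmath.exp(-1j * x * theta / 2)
    amp1 = amp1 * cmath.exp(+1j * x * theta / 2)
    # <X> = 2 * Re(conj(amp0) * amp1) = cos(theta * x)
    return 2 * (amp0.conjugate() * amp1).real


if __name__ == "__main__":
    for theta in (1.0, 2.5):
        print(theta, tf_model(0.7, theta), math.cos(theta * 0.7))
```

The eigenvalues of the assumed generator are `+theta / 2` and `-theta / 2`, and their difference is the frequency that appears in the output.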

Related content

MoDELS

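The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Vector space · Comprehensibility · Better · Optimizer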

June 4, 2024

Understanding Stochastic Natural Gradient Variational Inference

Kaiwen Wu,Jacob R. Gardner

from arxiv, ICML 2024

Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about the non-asymptotic convergence rate in the stochastic setting. We aim to lessen this gap and provide a better understanding. For conjugate likelihoods, we prove the first $\mathcal{O}(\frac{1}{T})$ non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than stochastic gradient descent (a.k.a. black-box variational inference) and the rate likely has better constant dependency that leads to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $\mathcal{O}(\frac{1}{T})$ is unlikely without some significant new understanding of optimizing the ELBO using natural gradients.

Optimizer · GROUP · MoDELS · Robustness · Maximum

June 3, 2024

Finding Optimally Robust Data Mixtures via Concave Maximization

Anvith Thudi,Chris J. Maddison

Training on mixtures of data distributions is now common in many modern machine learning pipelines, useful for performing well on several downstream tasks. Group distributionally robust optimization (group DRO) is one popular way to learn mixture weights for training a specific model class, but group DRO methods suffer for non-linear models due to non-convex loss functions and when the models are non-parametric. We address these challenges by proposing to solve a more general DRO problem, giving a method we call MixMax. MixMax selects mixture weights by maximizing a particular concave objective with entropic mirror ascent, and, crucially, we prove that optimally fitting this mixture distribution over the set of bounded predictors returns a group DRO optimal model. Experimentally, we tested MixMax on a sequence modeling task with transformers and on a variety of non-parametric learning problems. In all instances MixMax matched or outperformed the standard data mixing and group DRO baselines, and in particular, MixMax improved the performance of XGBoost over the only baseline, data balancing, for variations of the ACSIncome and CelebA annotations datasets.

Exchangeable · Reducible · Performer · Extensibility · Identifiable

June 3, 2024

Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics

Haoyang Zheng,Hengrong Du,Qi Feng,Wei Deng,Guang Lin

from arxiv, 28 pages, 13 figures

Replica exchange stochastic gradient Langevin dynamics (reSGLD) is an effective sampler for non-convex learning in large-scale datasets. However, the simulation may encounter stagnation issues when the high-temperature chain delves too deeply into the distribution tails. To tackle this issue, we propose reflected reSGLD (r2SGLD): an algorithm tailored for constrained non-convex exploration by utilizing reflection steps within a bounded domain. Theoretically, we observe that reducing the diameter of the domain enhances mixing rates, exhibiting a quadratic behavior. Empirically, we test its performance through extensive experiments, including identifying dynamical systems with physical constraints, simulations of constrained multi-modal distributions, and image classification tasks. The theoretical and empirical findings highlight the crucial role of constrained exploration in improving the simulation efficiency.
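The idea of a reflection step can be sketched in one dimension. The following is a simplified illustration, not the r2SGLD algorithm itself: it assumes the bounded domain is an interval `[lo, hi]`, uses an exact gradient rather than a stochastic one, and omits the replica-exchange swap between chains. The function names `reflect` and `reflected_langevin_step` are hypothetical.

```python
import math
import random


def reflect(x, lo, hi):
    """Fold x back into [lo, hi] by mirror reflection at the boundaries."""
    width = hi - lo
    # Position within one period of length 2 * width, measured from lo.
    y = (x - lo) % (2 * width)
    # The second half of the period is the mirrored copy of the first.
    if y > width:
        y = 2 * width - y
    return lo + y


def reflected_langevin_step(x, grad_u, step, temperature, lo, hi, rng):
    """One Langevin update on the energy U, then reflection into [lo, hi]."""
    noise = rng.gauss(0.0, 1.0)
    proposal = x - step * grad_u(x) + math.sqrt(2 * step * temperature) * noise
    return reflect(proposal, lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    # Gradient of the double-well energy U(x) = x**4 - x**2.
    grad_u = lambda z: 4 * z ** 3 - 2 * z
    x = 0.5
    for _ in range(1000):
        # A high temperature makes large jumps likely; reflection keeps
        # the chain inside the domain instead of letting it drift away.
        x = reflected_langevin_step(x, grad_u, 0.01, 5.0, -1.0, 1.0, rng)
    print(x)
```

Here the reflection keeps a high-temperature chain from wandering into the tails, which is the stagnation issue the abstract describes.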

MoDELS · Model selection · Category · INTERACT · Performer

June 1, 2024

Understanding Model Selection For Learning In Strategic Environments

Tinashe Handina,Eric Mazumdar

from arxiv, Reworded title, fixed typos and changed organization from previous version

The deployment of ever-larger machine learning models reflects a growing consensus that the more expressive the model class one optimizes over, and the more data one has access to, the more one can improve performance. As models get deployed in a variety of real-world scenarios, they inevitably face strategic environments. In this work, we consider the natural question of how the interplay of models and strategic interactions affects the relationship between performance at equilibrium and the expressivity of model classes. We find that strategic interactions can break the conventional view, meaning that performance does not necessarily monotonically improve as model classes get larger or more expressive (even with infinite data). We show the implications of this result in several contexts including strategic regression, strategic classification, and multi-agent reinforcement learning. In particular, we show that each of these settings admits a Braess' paradox-like phenomenon in which optimizing over less expressive model classes allows one to achieve strictly better equilibrium outcomes. Motivated by these examples, we then propose a new paradigm for model selection in games wherein an agent seeks to choose amongst different model classes to use as their action set in a game.

Microsoft Surface · Weight · Robustness · Functional · Generating weights

June 1, 2024

Robust Biharmonic Skinning Using Geometric Fields

Ana Dodik,Vincent Sitzmann,Justin Solomon,Oded Stein

Skinning is a popular way to rig and deform characters for animation, to compute reduced-order simulations, and to define features for geometry processing. Methods built on skinning rely on weight functions that distribute the influence of each degree of freedom across the mesh. Automatic skinning methods generate these weight functions with minimal user input, usually by solving a variational problem on a mesh whose boundary is the skinned surface. This formulation necessitates tetrahedralizing the volume inside the surface, which brings with it meshing artifacts, the possibility of tetrahedralization failure, and the impossibility of generating weights for surfaces that are not closed. We introduce a mesh-free and robust automatic skinning method that generates high-quality skinning weights comparable to the current state of the art without volumetric meshes. Our method reliably works even on open surfaces and triangle soups where current methods fail. We achieve this through the use of a Lagrangian representation for skinning weights, which circumvents the need for finite elements while optimizing the biharmonic energy.

MoDELS · SimPLe · Tractable · TOOLS · Continuity

May 31, 2024

Good Modelling Software Practices

Carsten Lemmen,Philipp Sebastian Sommer

from arxiv, 1 Figure

In socio-environmental sciences, models are frequently used as tools to represent, understand, project and predict the behaviour of these complex systems. Along the modelling chain, Good Modelling Practices have been evolving that ensure -- amongst others -- that models are transparent and replicable. Whenever such models are represented in software, good modelling meets Good Software Practices, such as a tractable development workflow, good code, collaborative development and governance, continuous integration and deployment, and Good Scientific Practices, such as attribution of copyrights and acknowledgement of intellectual property, publication of a software paper and archiving. Too often in existing socio-environmental model software, these practices have been regarded as an add-on to be considered at a later stage only; in fact, many modellers have shied away from publishing their model as open source out of fear that having to add good practices is too demanding. We here argue for making a habit of following a list of simple and not so simple practices early on in the implementation of the model life cycle. We contextualise cherry-picked and hands-on practices for supporting Good Modelling Practices, and we demonstrate their application in the example context of the Viable North Sea fisheries socio-ecological systems model.

Functional · Activation function · Linear · Range · ReLU

May 25, 2024

Expanded Gating Ranges Improve Activation Functions

Allen Hao Huang

Activation functions are core components of all deep learning architectures. Currently, the most popular activation functions are smooth ReLU variants like GELU and SiLU. These are self-gated activation functions where the range of the gating function is between zero and one. In this paper, we explore the viability of using arctan as a gating mechanism. A self-gated activation function that uses arctan as its gating function has a monotonically increasing first derivative. To make this activation function competitive, it is necessary to introduce a trainable parameter for every MLP block to expand the range of the gating function beyond zero and one. We find that this technique also improves existing self-gated activation functions. We conduct an empirical evaluation of Expanded ArcTan Linear Unit (xATLU), Expanded GELU (xGELU), and Expanded SiLU (xSiLU) and show that they outperform existing activation functions within a transformer architecture. Additionally, expanded gating ranges show promising results in improving first-order Gated Linear Units (GLU).
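A self-gated activation with an expanded gating range might be sketched as follows. The abstract does not state the formula, so the affine stretch used here, which maps a gate with range (0, 1) to the range (-alpha, 1 + alpha), is an assumption. So are the names `arctan_gate`, `expanded_gate`, and `xatlu`. In a real network `alpha` would be a trainable parameter per MLP block; here it is a plain float.

```python
import math


def arctan_gate(x):
    """Arctan shifted and rescaled so that its range is (0, 1)."""
    return (math.atan(x) + math.pi / 2) / math.pi


def expanded_gate(gate_value, alpha):
    """Assumed affine stretch of a gate from (0, 1) to (-alpha, 1 + alpha)."""
    return gate_value * (1 + 2 * alpha) - alpha


def xatlu(x, alpha):
    """Self-gated unit: the input multiplied by its own expanded arctan gate."""
    return x * expanded_gate(arctan_gate(x), alpha)


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        # alpha = 0 recovers the unexpanded gate with range (0, 1).
        print(x, xatlu(x, 0.0), xatlu(x, 0.2))
```

With `alpha > 0` the gate can go slightly negative for very negative inputs and slightly above one for large positive inputs, which is the sense in which the range extends "beyond zero and one".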

Shaping · Decoding · MoDELS · Learned · Generative model

December 6, 2018

Learning Implicit Fields for Generative Shape Modeling

Zhiqin Chen,Hao Zhang

We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality.

Performer · Deep reinforcement learning · Learned · entity · Reinforcement learning

June 28, 2018

Relational Deep Reinforcement Learning

Vinicius Zambaldi,David Raposo,Adam Santoro,Victor Bapst,Yujia Li,Igor Babuschkin,Karl Tuyls,David Reichert,Timothy Lillicrap,Edward Lockhart,Murray Shanahan,Victoria Langston,Razvan Pascanu,Matthew Botvinick,Oriol Vinyals,Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.

Long short-term memory network · Named entity recognition · MoDELS · Better · Gating

May 15, 2018

Chinese NER Using Lattice LSTM

Yue Zhang,Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.