Meta-learning, the notion of learning to learn, enables learning systems to quickly and flexibly solve new tasks. This usually involves defining a set of outer-loop meta-parameters that are then used to update a set of inner-loop parameters. Most meta-learning approaches use complicated and computationally expensive bi-level optimisation schemes to update these meta-parameters. Ideally, systems should perform multiple orders of meta-learning, i.e. to learn to learn to learn and so on, to accelerate their own learning. Unfortunately, standard meta-learning techniques are often inappropriate for these higher-order meta-parameters because the meta-optimisation procedure becomes too complicated or unstable. Inspired by the higher-order meta-learning we observe in real-world evolution, we show that using simple population-based evolution implicitly optimises for arbitrarily-high order meta-parameters. First, we theoretically prove and empirically show that population-based evolution implicitly optimises meta-parameters of arbitrarily-high order in a simple setting. We then introduce a minimal self-referential parameterisation, which in principle enables arbitrary-order meta-learning. Finally, we show that higher-order meta-learning improves performance on time series forecasting tasks.
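The idea that plain population-based evolution can tune a meta-parameter without any explicit meta-optimisation step can be illustrated with a small sketch. This is not the paper's parameterisation or experimental setup: the objective `-abs(x)`, the log-normal mutation of `sigma`, the constant `0.3`, and the function name `evolve` are all assumptions chosen for illustration. Each individual carries a parameter `x` together with its own mutation scale `sigma`; selection only ever looks at `x`, yet `sigma` is inherited and mutated alongside it.

```python
import math
import random


def evolve(pop_size=20, generations=50, seed=0):
    """Evolve (x, sigma) pairs; return the best fitness seen in each generation."""
    rng = random.Random(seed)

    def fitness(individual):
        x, _sigma = individual
        return -abs(x)  # hypothetical objective: drive x towards 0

    # Each individual carries its parameter x AND its own mutation scale sigma.
    population = [(rng.uniform(-10.0, 10.0), 1.0) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # truncation selection on x only
        children = []
        for x, sigma in survivors:
            # Mutate the meta-parameter first, then use it to mutate x.
            child_sigma = sigma * math.exp(0.3 * rng.gauss(0.0, 1.0))
            child_x = x + child_sigma * rng.gauss(0.0, 1.0)
            children.append((child_x, child_sigma))
        population = survivors + children
    return best_per_generation
```

Because the survivors are carried over unchanged, the best fitness never decreases from one generation to the next. A second-order variant would, under the same assumptions, give each individual a further value that controls how strongly `sigma` itself is mutated.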

Related content

Learning to learn

MASS · Networking · Neural Networks · Identifiable · Performer

May 8, 2023

Predicting nuclear masses with product-unit networks

Babette Dellen,Uwe Jaekel,Paulo S. A. Freitas,John W. Clark

Accurate estimation of nuclear masses and their prediction beyond the experimentally explored domains of the nuclear landscape are crucial to an understanding of the fundamental origin of nuclear properties and to many applications of nuclear science, most notably in quantifying the $r$-process of stellar nucleosynthesis. Neural networks have been applied with some success to the prediction of nuclear masses, but they are known to have shortcomings in application to extrapolation tasks. In this work, we propose and explore a novel type of neural network for mass prediction in which the usual neuron-like processing units are replaced by complex-valued product units that permit multiplicative couplings of inputs to be learned from the input data. This generalized network model is tested on both interpolation and extrapolation data sets drawn from the Atomic Mass Evaluation. Its performance is compared with that of several neural-network architectures, substantiating its suitability for nuclear mass prediction. Additionally, a prediction-uncertainty measure for such complex-valued networks is proposed that serves to identify regions of expected low prediction error.
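A product unit, in its generic textbook form, multiplies its inputs after raising each to a learnable power, instead of summing weighted inputs. The sketch below shows only that generic unit, not the network architecture or training procedure of the paper, which the abstract does not specify. The function name `product_unit` is hypothetical. Using the complex logarithm is one way to see why complex values arise: a negative input raised to a non-integer power has no real result.

```python
import cmath


def product_unit(inputs, weights):
    """Compute prod_i inputs[i] ** weights[i] as exp(sum_i w_i * log(x_i)).

    The complex logarithm makes negative inputs well defined; every input
    must be non-zero because log(0) is undefined.
    """
    total = 0j
    for x, w in zip(inputs, weights):
        total += w * cmath.log(x)
    return cmath.exp(total)
```

For positive inputs the result has a zero imaginary part and matches the ordinary real-valued product of powers.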

Estimation/Estimator · Approximation · MoDELS · Dynamical systems · Periodic

May 6, 2023

Fourier Series-Based Approximation of Time-Varying Parameters in Ordinary Differential Equations

Anna Fitzpatrick,Molly Folino,Andrea Arnold

from arxiv, 24 pages, 12 figures, 3 tables

Many real-world systems modeled using differential equations involve unknown or uncertain parameters. Standard approaches to address parameter estimation inverse problems in this setting typically focus on estimating constants; yet some unobservable system parameters may vary with time without known evolution models. In this work, we propose a novel approximation method inspired by the Fourier series to estimate time-varying parameters in deterministic dynamical systems modeled with ordinary differential equations. Using ensemble Kalman filtering in conjunction with Fourier series-based approximation models, we detail two possible implementation schemes for sequentially updating the time-varying parameter estimates given noisy observations of the system states. We demonstrate the capabilities of the proposed approach in estimating periodic parameters, both when the period is known and unknown, as well as non-periodic time-varying parameters of different forms with several computed examples using a forced harmonic oscillator. Results emphasize the importance of the frequencies and number of approximation model terms on the time-varying parameter estimates and corresponding dynamical system predictions.
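The approximation model described above can be pictured as a truncated Fourier series whose coefficients become the quantities to estimate. The sketch below only evaluates such a series. It assumes the convention theta(t) = a0 + sum_k [a_k cos(2 pi k t / T) + b_k sin(2 pi k t / T)], which may differ from the paper's exact form, and it does not include the ensemble Kalman filter update. The name `fourier_parameter` and its arguments are hypothetical.

```python
import math


def fourier_parameter(t, a0, cos_coeffs, sin_coeffs, period):
    """Evaluate a truncated Fourier series for a time-varying parameter at time t."""
    value = a0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        angle = 2.0 * math.pi * k * t / period
        value += a_k * math.cos(angle) + b_k * math.sin(angle)
    return value
```

With this parameterisation, estimating a time-varying parameter reduces to estimating the constant vector of coefficients, and the number of terms and the assumed period control how flexible the approximation is.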

EASE · Mutually independent · Beam search · Separate · Greedy

May 6, 2023

Joint order assignment and picking station scheduling in KIVA warehouses with multiple stations

Xiying Yang,Guowei Hua,Li Zhang,T. C. E. Cheng,Tsan Ming Choi

We consider the problem of allocating orders to multiple stations and sequencing the interlinked order and rack processing flows in each station in the robot-assisted KIVA warehouse. The various decisions involved in the problem, which are closely associated and must be solved in real time, are often tackled separately for ease of treatment. However, exploiting the synergy between order assignment and picking station scheduling benefits picking efficiency. We develop a comprehensive mathematical model that takes the synergy into consideration to minimize the total number of rack visits. To solve this intractable problem, we develop an efficient algorithm based on simulated annealing and beam search. Computational studies show that our proposed approach outperforms the rule-based greedy policy and the independent picking station scheduling method in terms of solution quality, saving over one-third and one-fifth of rack visits compared with the former and latter, respectively.

Fashion MNIST (dataset) · Performer · Model evaluation · Representation · Processing (programming language)

May 4, 2023

A Novel Evolutionary Algorithm for Hierarchical Neural Architecture Search

Aristeidis Christoforidis,George Kyriakides,Konstantinos Margaritis

In this work, we propose a novel evolutionary algorithm for neural architecture search, applicable to global search spaces. The algorithm's architectural representation organizes the topology in multiple hierarchical modules, and the design process exploits this representation to explore the search space. We also employ a curation system, which promotes the reuse of well-performing sub-structures in subsequent generations. We apply our method to Fashion-MNIST and NAS-Bench101, achieving accuracies of $93.2\%$ and $94.8\%$ respectively in a relatively small number of generations.

Pruning · MoDELS · Model evaluation · Loss · Networking

May 4, 2023

CrAM: A Compression-Aware Minimizer

Alexandra Peste,Adrian Vladu,Eldar Kurtic,Christoph H. Lampert,Dan Alistarh

from arxiv, Accepted to ICLR 2023

Deep neural networks (DNNs) often have to be compressed, via pruning and/or quantization, before they can be deployed in practical settings. In this work we propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way, in order to produce models whose local loss behavior is stable under compression operations such as pruning. Thus, dense models trained via CrAM should be compressible post-training, in a single step, without significant accuracy loss. Experimental results on standard benchmarks, such as residual networks for ImageNet classification and BERT models for language modelling, show that CrAM produces dense models that can be more accurate than the standard SGD/Adam-based baselines, but which are stable under weight pruning: specifically, we can prune models in one shot to 70-80% sparsity with almost no accuracy loss, and to 90% with reasonable ($\sim 1\%$) accuracy loss, which is competitive with gradual compression methods. Additionally, CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware. The code for reproducing the results is available at //github.com/IST-DASLab/CrAM.
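To make "one-shot pruning to a given sparsity" concrete, the sketch below zeroes the smallest-magnitude weights of a flat list in a single step. It is not CrAM itself, which is a training-time optimizer. It only shows the kind of compression operation applied after training, and the choice of magnitude as the pruning criterion is an assumption, since the abstract says only "weight pruning". The function name `magnitude_prune` is hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of weights with the smallest-magnitude fraction set to zero.

    sparsity is the fraction of weights to remove, e.g. 0.7 for 70% sparsity.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    num_to_prune = int(round(sparsity * len(weights)))
    # Indices ordered from smallest to largest absolute value.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in by_magnitude[:num_to_prune]:
        pruned[i] = 0.0
    return pruned
```

A model whose loss is stable under this operation should, as the abstract argues, lose little accuracy when pruned this way without further fine-tuning.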

Active learning · Free energy · Extensibility · Learned · TAP

December 2, 2021

Active Learning for Domain Adaptation: An Energy-based Approach

Binhui Xie,Longhui Yuan,Shuang Li,Chi Harold Liu,Xinjing Cheng,Guoren Wang

from arxiv, Accepted by AAAI 2022. Code is available at //github.com/BIT-DA/EADA

Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped before reaching fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from the observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we show empirically that a simple yet efficient energy-based sampling strategy is better at selecting the most valuable target samples than existing approaches that require particular architectures or distance computations. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty in every selection round. Meanwhile, by compactly aligning the free energy of target data around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
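The free energy mentioned above can be computed from a classifier's logits under a common energy-based reading of a classifier, in which the energy of a pair (x, y) is the negative logit for class y. Under that assumption, which the abstract does not spell out, the free energy of an input is the negative log-sum-exp of its logits. The function name `free_energy` and the `temperature` argument are hypothetical, and this sketch does not reproduce EADA's sampling rule or regularization term.

```python
import math


def free_energy(logits, temperature=1.0):
    """Free energy F(x) = -T * log(sum_c exp(logit_c / T)), computed stably."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)  # subtract the maximum to avoid overflow in exp
    log_sum_exp = largest + math.log(sum(math.exp(s - largest) for s in scaled))
    return -temperature * log_sum_exp
```

Under this convention, inputs on which the model produces large logits have lower free energy, so a shift in free energy between source and target data is one way the distribution gap can show up.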

Graph · Networking · INTERACT · INFORMS · Graphics processing unit

November 25, 2020

Time-Series Event Prediction with Evolutionary State Graph

Wenjie Hu,Yang Yang,Ziqiang Cheng,Carl Yang,Xiang Ren

from arxiv, A long version of EvoNet (WSDM 2021)

The accurate and interpretable prediction of future events in time-series data often requires capturing the representative patterns (referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) over time. We analyze the dynamic graphs constructed from the time-series data and show that changes in the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, EvoNet models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.

Cluster · Performer · Dataset · MoDELS · DBSCAN

October 30, 2019

Meta-Learning to Cluster

Yibo Jiang,Nakul Verma

Clustering is one of the most fundamental and widespread techniques in exploratory data analysis. Yet, the basic approach to clustering has not really changed: a practitioner hand-picks a task-specific clustering loss to optimize and fits the given data to reveal the underlying cluster structure. Some types of losses, such as k-means or its non-linear version, kernelized k-means (centroid based), and DBSCAN (density based), are popular choices due to their good empirical performance on a range of applications. Every so often, however, the clustering output using these standard losses fails to reveal the underlying structure, and the practitioner has to custom-design their own variation. In this work we take an intrinsically different approach to clustering: rather than fitting a dataset to a specific clustering loss, we train a recurrent model that learns how to cluster. The model uses as training pairs examples of datasets (as input) and their corresponding cluster identities (as output). By providing multiple types of training datasets as inputs, our model has the ability to generalize well on unseen datasets (new clustering tasks). Our experiments reveal that by training on simple synthetically generated datasets or on existing real datasets, we can achieve better clustering performance on unseen real-world datasets when compared with standard benchmark clustering techniques. Our meta clustering model works well even for small datasets where the usual deep learning models tend to perform worse.

Event extraction · Learned · Inverse reinforcement learning · GAN · Estimation/Estimator

April 21, 2018

Event Extraction with Generative Adversarial Imitation Learning

Tongtao Zhang,Heng Ji

We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse (e.g., an ambiguous but correctly detected trigger or argument should receive high gains), while traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments also demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.