
Max-stable processes provide natural models for spatial extreme values observed at a set of spatial sites. Full likelihood inference for max-stable data is, however, complicated by the form of the likelihood function, as it contains a sum over all partitions of the sites. The number of terms in this sum grows rapidly with the number of sites and quickly becomes prohibitively burdensome to compute. We propose a variational inference approach to full likelihood inference that circumvents the problematic sum. To achieve this, we first posit a parametric family of partition distributions from which partitions can be sampled. Second, we optimise the parameters of this family jointly with the max-stable model to find the partition distribution best supported by the data and to estimate the max-stable model parameters. In a simulation study we show that our method enables full likelihood inference in higher dimensions than previous methods and is readily applicable to data sets with a large number of observations. Furthermore, our method can easily be extended to a Bayesian setting. Code is available at //github.com/LPAndersson/MaxStableVI.jl.
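To make the variational construction concrete, the following illustrative Python sketch (not the authors' Julia package MaxStableVI.jl) estimates the lower bound E_q[log w(P; θ) − log q(P)] ≤ log Σ_P w(P; θ) for a toy, hypothetical partition weight `log_w` and a categorical variational family over all partitions of a small site set. In practice the partition sum cannot be enumerated, which is exactly why the bound and sampled partitions are useful.

```python
import numpy as np

def partitions(items):
    """Enumerate all set partitions of `items` (only feasible for a few sites)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def log_w(partition, theta):
    """Hypothetical stand-in for the per-partition likelihood contribution."""
    return sum(-theta * (len(block) - 1) for block in partition)

rng = np.random.default_rng(0)
all_parts = list(partitions(range(4)))              # 15 partitions of 4 sites

# Variational family: a categorical distribution over partitions with logits phi.
phi = rng.normal(size=len(all_parts))
q = np.exp(phi - phi.max())
q /= q.sum()

theta = 0.7
idx = rng.choice(len(all_parts), size=2000, p=q)    # sample partitions from q
elbo = np.mean([log_w(all_parts[i], theta) - np.log(q[i]) for i in idx])
exact = np.log(sum(np.exp(log_w(p, theta)) for p in all_parts))
print(f"lower bound {elbo:.3f} <= log of the partition sum {exact:.3f}")
```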

Related Content

This paper proposes a new algorithm, referred to as GMAB, that combines concepts from the reinforcement learning domain of multi-armed bandits with random search strategies from the domain of genetic algorithms to solve discrete stochastic optimization problems via simulation. In particular, the focus is on noisy large-scale problems, which often involve many dimensions as well as multiple local optima. Our aim is to combine the ability of multi-armed bandits to cope with volatile simulation observations with the ability of genetic algorithms to handle high-dimensional solution spaces with an enormous number of feasible solutions. For this purpose, a multi-armed bandit framework serves as the foundation, where each simulation observation is incorporated into the memory of GMAB. Based on this memory, genetic operators guide the search, as they provide powerful tools for both exploration and exploitation. The empirical results demonstrate that GMAB achieves superior performance compared to benchmark algorithms from the literature on a wide variety of test problems. In all experiments, GMAB required considerably fewer simulations to achieve similar or (far) better solutions than those generated by existing methods. At the same time, GMAB's runtime overhead is very small thanks to the suggested tree-based implementation of its memory. Furthermore, we prove its convergence to the set of global optima as the simulation effort goes to infinity.
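A minimal, hypothetical illustration of the general idea, a bandit-style memory of noisy evaluations combined with genetic crossover and mutation, is sketched below in Python. It is not the authors' GMAB algorithm (whose selection rules, memory structure and convergence analysis differ); the toy objective `simulate` and the UCB-type parent selection are assumptions made only for illustration.

```python
import math
import random

random.seed(1)
DIM = 20

def simulate(x):
    """Noisy toy objective: number of ones plus Gaussian noise (to be maximised)."""
    return sum(x) + random.gauss(0.0, 1.0)

memory = {}  # solution (tuple of bits) -> (count, running mean reward)

def observe(x):
    r = simulate(x)
    n, m = memory.get(x, (0, 0.0))
    memory[x] = (n + 1, m + (r - m) / (n + 1))

def ucb(x, total):
    n, m = memory[x]
    return m + math.sqrt(2.0 * math.log(total) / n)

# Initial random population stored in the memory.
for _ in range(10):
    observe(tuple(random.randint(0, 1) for _ in range(DIM)))

for t in range(2000):
    total = sum(n for n, _ in memory.values())
    # Bandit-style parent selection from the memory.
    parents = sorted(memory, key=lambda x: ucb(x, total), reverse=True)[:2]
    # Genetic operators: uniform crossover followed by bit-flip mutation.
    child = tuple(random.choice(pair) for pair in zip(*parents))
    child = tuple(b ^ (random.random() < 1.0 / DIM) for b in child)
    observe(child)

best = max(memory, key=lambda x: memory[x][1])
print("best mean reward:", round(memory[best][1], 2), "ones:", sum(best))
```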

Variational inequalities are a broad and flexible class of problems that includes minimization, saddle point, and fixed point problems as special cases. They are therefore used in a variety of applications, ranging from equilibrium search to adversarial learning. The ever-increasing size of data and models demands parallel and distributed computing for real-world machine learning problems, most of which can be represented as variational inequalities. Meanwhile, most distributed approaches have a significant bottleneck: the cost of communication. The three main techniques for reducing both the total number of communication rounds and the cost of each round are exploiting the similarity of local functions, compressing the transmitted information, and performing local updates. In this paper, we combine all three approaches. Such a triple synergy did not previously exist for variational inequalities and saddle point problems, nor even for minimization problems. The methods presented in this paper have the best theoretical guarantees of communication complexity and significantly outperform other methods for distributed variational inequalities. The theoretical results are confirmed by adversarial learning experiments on synthetic and real datasets.
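The sketch below illustrates just one of the three ingredients, compressed communication, on a toy distributed strongly-convex-strongly-concave saddle point problem: workers send unbiased rand-k-sparsified operator values to a server, which averages them and takes a step. It is not the paper's method (which additionally exploits similarity of local functions and local updates and has sharper guarantees); with plain unbiased compression the iterate only settles into a noise floor around the solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, d, mu, step, k = 8, 10, 1.0, 0.05, 5

A = rng.normal(size=(n_workers, d, d)) / np.sqrt(d)   # local coupling matrices
b = rng.normal(size=(n_workers, d))
c = rng.normal(size=(n_workers, d))

def local_operator(i, x, y):
    """F_i(x, y) = (grad_x f_i, -grad_y f_i) for f_i = mu/2|x|^2 + x'A_i y - mu/2|y|^2 + b_i'x - c_i'y."""
    gx = mu * x + A[i] @ y + b[i]
    gy = A[i].T @ x - mu * y - c[i]
    return np.concatenate([gx, -gy])

def rand_k(v):
    """Unbiased sparsifying compressor: keep k random coordinates, rescale by dim/k."""
    idx = rng.choice(v.size, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (v.size / k)
    return out

z = rng.normal(size=2 * d)                  # stacked (x, y) iterate held by the server
for t in range(3001):
    x, y = z[:d], z[d:]
    msgs = [rand_k(local_operator(i, x, y)) for i in range(n_workers)]  # compressed uplink
    z = z - step * np.mean(msgs, axis=0)    # server averages the messages and steps
    if t % 1000 == 0:
        full = np.mean([local_operator(i, z[:d], z[d:]) for i in range(n_workers)], axis=0)
        print(f"iter {t:4d}  ||F(z)|| = {np.linalg.norm(full):.3f}")
```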

Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Such large-scale deep models have achieved great success, but their enormous computational complexity and gigantic storage requirements make it extremely difficult to deploy them in real-time applications. On the other hand, dataset size remains a real problem in many domains: data are often missing, too expensive, or impossible to obtain for other reasons. Ensemble learning is a partial solution to the problems of small datasets and overfitting. However, ensemble learning in its basic form incurs a linear increase in computational complexity. We analyzed the impact of the ensemble decision-fusion mechanism and evaluated various methods of combining decisions, including voting algorithms. We used a modified knowledge distillation framework as the decision-fusion mechanism, which additionally allows the entire ensemble to be compressed into the weight space of a single model. We showed that knowledge distillation can aggregate knowledge from multiple teachers into a single student model and, at the same computational complexity, obtain a better-performing model than one trained in the standard manner. We also developed our own method for mimicking the responses of all teachers simultaneously. We tested these solutions on several benchmark datasets. Finally, we present practical applications of the efficient multi-teacher knowledge distillation framework: in the first, knowledge distillation is used to develop models that automate corrosion detection on aircraft fuselages; the second concerns detecting smoke on observation cameras in order to counteract wildfires in forests.
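As a concrete (simplified) example of multi-teacher distillation, the sketch below trains a student against a weighted sum of the usual cross-entropy with the labels and a KL term that matches the temperature-softened average of several teachers' logits. The averaging rule, temperature and weighting are generic assumptions, not necessarily those of the framework described above.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          temperature=4.0, alpha=0.5):
    # Soft targets: average the teachers' temperature-softened distributions.
    soft_targets = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # KL between soft targets and the student's softened prediction
    # (scaled by T^2 so the gradients keep a comparable magnitude).
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * ce + (1.0 - alpha) * kd

# Toy usage with random logits from three "teachers".
torch.manual_seed(0)
batch, classes = 8, 5
student = torch.randn(batch, classes, requires_grad=True)
teachers = [torch.randn(batch, classes) for _ in range(3)]
labels = torch.randint(0, classes, (batch,))
loss = multi_teacher_kd_loss(student, teachers, labels)
loss.backward()
print("loss:", float(loss), "grad norm:", float(student.grad.norm()))
```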

Recent approaches build on implicit neural representations (INRs) to propose generative models over function spaces. However, they are computationally intensive when dealing with inference tasks, such as missing data imputation, or cannot tackle them at all. In this work, we propose a novel deep generative model, named VAMoH. VAMoH combines the capability of modeling continuous functions using INRs with the inference capabilities of Variational Autoencoders (VAEs). In addition, VAMoH relies on a normalizing flow to define the prior and on a mixture of hypernetworks to parametrize the data log-likelihood. This gives VAMoH high expressive capability and interpretability. Through experiments on a diverse range of data types, such as images, voxels, and climate data, we show that VAMoH can effectively learn rich distributions over continuous functions. Furthermore, it can perform inference-related tasks, such as conditional super-resolution generation and in-painting, as well as or better than previous approaches, while being less computationally demanding.
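A rough sketch of the decoder idea only, a mixture of hypernetworks, each emitting the weights of a small implicit network that maps coordinates to signal values, is given below. The layer sizes, the mixing rule and all names are assumptions; the encoder, the normalizing-flow prior and the training objective of VAMoH are omitted.

```python
import torch
import torch.nn as nn

class HyperINRMixture(nn.Module):
    def __init__(self, latent_dim=16, coord_dim=2, hidden=32, out_dim=3, n_components=4):
        super().__init__()
        self.coord_dim, self.hidden, self.out_dim = coord_dim, hidden, out_dim
        n_weights = (coord_dim * hidden + hidden) + (hidden * out_dim + out_dim)
        # One hypernetwork per mixture component: latent code -> flattened INR weights.
        self.hypernets = nn.ModuleList([
            nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, n_weights))
            for _ in range(n_components)
        ])
        self.mixing = nn.Linear(latent_dim, n_components)  # per-sample mixture logits

    def run_inr(self, flat_w, coords):
        """Evaluate a two-layer INR whose weights are given as one flat vector."""
        c, h, o = self.coord_dim, self.hidden, self.out_dim
        w1, b1 = flat_w[:c * h].view(c, h), flat_w[c * h:c * h + h]
        rest = flat_w[c * h + h:]
        w2, b2 = rest[:h * o].view(h, o), rest[h * o:]
        return torch.relu(coords @ w1 + b1) @ w2 + b2

    def forward(self, z, coords):
        # Mix the outputs of the INRs generated from the latent code z.
        pi = torch.softmax(self.mixing(z), dim=-1)                                 # (K,)
        outs = torch.stack([self.run_inr(h(z), coords) for h in self.hypernets])   # (K, N, out)
        return torch.einsum("k,kno->no", pi, outs)

coords = torch.rand(100, 2)        # e.g. pixel coordinates in [0, 1]^2
z = torch.randn(16)                # latent code from an encoder (not shown)
print(HyperINRMixture()(z, coords).shape)   # -> torch.Size([100, 3])
```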

Bayesian inference problems require sampling from or approximating high-dimensional probability distributions. The focus of this paper is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm. This construction leads to interacting particle systems, the mean-field limit of which is a gradient flow on the space of probability distributions equipped with a certain geometrical structure. We leverage this viewpoint to shed some light on the convergence properties of the algorithm, in particular addressing the problem of choosing a suitable positive definite kernel function. Our analysis leads us to consider certain nondifferentiable kernels with adjusted tails. We demonstrate significant performance gains of these kernels in various numerical experiments.
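For reference, the standard SVGD update with a differentiable RBF kernel and the median bandwidth heuristic is sketched below for a two-dimensional Gaussian target. The paper's contribution concerns the choice of kernel (including nondifferentiable kernels with adjusted tails), which this baseline sketch does not implement.

```python
import numpy as np

rng = np.random.default_rng(0)
target_mean = np.array([1.0, -1.0])
target_cov_inv = np.linalg.inv(np.array([[1.0, 0.5], [0.5, 1.5]]))

def grad_log_p(x):
    """Score of the Gaussian target, evaluated row-wise."""
    return -(x - target_mean) @ target_cov_inv

def svgd_step(x, step=0.1):
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]             # pairwise differences x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    h = np.median(sq) / np.log(n + 1.0)              # median bandwidth heuristic
    k = np.exp(-sq / h)                              # RBF kernel matrix
    grad_k = 2.0 / h * diff * k[..., None]           # grad of k(x_j, x_i) w.r.t. x_j
    # Driving term (kernel-weighted scores) plus repulsive term.
    phi = (k @ grad_log_p(x) + grad_k.sum(axis=1)) / n
    return x + step * phi

particles = rng.normal(size=(100, 2))
for _ in range(500):
    particles = svgd_step(particles)
print("particle mean:", particles.mean(axis=0).round(2), "target mean:", target_mean)
```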

With modern software and online platforms collecting massive amounts of data, there is an increasing demand for applying causal inference methods at large scale when randomized experimentation is not viable. Weighting methods that directly incorporate covariate balancing have recently gained popularity for estimating causal effects in observational studies. These methods reduce the manual effort required of researchers, who would otherwise iterate between propensity score modeling and balance checking until a satisfactory covariate balance is achieved. However, conventional solvers for determining the weights lack the scalability needed to apply such methods to large-scale datasets at companies like Snap Inc. To address these limitations and improve computational efficiency, in this paper we present scalable algorithms, DistEB and DistMS, for two balancing approaches: entropy balancing and MicroSynth. The solvers have linear time complexity and can be conveniently implemented in distributed computing frameworks such as Spark, Hive, etc. We study the properties of the balancing approaches at different scales, up to 1 million treated units and 487 covariates. We find that with larger sample sizes, both the bias and the variance of the causal effect estimates are significantly reduced. The results emphasize the importance of applying balancing approaches to large-scale datasets. We combine the balancing approach with a synthetic control framework and deploy an end-to-end system for causal impact estimation at Snap Inc.
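The generic (non-distributed) dual form of entropy balancing can be written in a few lines, as sketched below: control weights w_i ∝ exp(λᵀX_i) are obtained by minimising the convex dual log Σ_i exp(λᵀX_i) − λᵀμ_treated, so that the weighted control covariate means match the treated means. This is only the textbook formulation, not the DistEB/DistMS solvers described above.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_treated, n_control, p = 200, 1000, 5
X_treated = rng.normal(loc=0.5, size=(n_treated, p))
X_control = rng.normal(loc=0.0, size=(n_control, p))
mu_treated = X_treated.mean(axis=0)

def dual(lam):
    """Convex dual of the entropy balancing problem (log-sum-exp minus linear term)."""
    s = X_control @ lam
    return np.log(np.exp(s - s.max()).sum()) + s.max() - lam @ mu_treated

res = minimize(dual, np.zeros(p), method="BFGS")
scores = X_control @ res.x
weights = np.exp(scores - scores.max())
weights /= weights.sum()                       # w_i proportional to exp(lambda' X_i)

print("max balance gap:", np.abs(weights @ X_control - mu_treated).max().round(6))
print("effective sample size:", round(1.0 / np.sum(weights ** 2), 1))
```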

Bayesian inference with nested sampling requires a likelihood-restricted prior sampling method, which draws samples from the prior distribution that exceed a likelihood threshold. For high-dimensional problems, Markov chain Monte Carlo derivatives have been proposed. We numerically study ten algorithms based on slice sampling, hit-and-run and differential evolution in ellipsoidal, non-ellipsoidal and non-convex problems from 2 to 100 dimensions. Mixing capabilities are evaluated with the nested sampling shrinkage test. This makes our results valid independently of how heavy-tailed the posteriors are. Given the same number of steps, slice sampling is outperformed by hit-and-run and whitened slice sampling, while whitened hit-and-run does not perform as well. Proposing along differential vectors of live point pairs also leads to the highest efficiencies and appears promising for multi-modal problems. The tested proposals are implemented in the UltraNest nested sampling package, enabling efficient low- and high-dimensional inference for a large class of practical problems relevant to astronomy, cosmology and particle physics.
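The sketch below shows the basic mechanism of likelihood-restricted prior sampling with a shrinking slice along a random direction (hit-and-run style) inside a unit-cube prior, replacing the worst live point in one nested sampling iteration. It is a simplified illustration with a toy Gaussian likelihood, not the UltraNest implementations benchmarked in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 10

def log_like(u):
    """Toy Gaussian likelihood on the unit-cube prior."""
    return -0.5 * np.sum(((u - 0.5) / 0.1) ** 2)

def slice_restricted_step(u, log_l_min, width=0.3):
    """One slice move along a random direction, restricted to L(theta) > L_min."""
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    left = -width * rng.random()
    right = left + width
    while True:                           # shrink the bracket until a point is accepted
        t = rng.uniform(left, right)
        proposal = u + t * direction
        inside = np.all((proposal >= 0.0) & (proposal <= 1.0))
        if inside and log_like(proposal) > log_l_min:
            return proposal
        if t < 0:
            left = t
        else:
            right = t

# One shrinkage iteration of nested sampling with a handful of live points.
live = rng.random(size=(50, dim))
log_ls = np.array([log_like(u) for u in live])
worst = np.argmin(log_ls)                                  # point to be replaced
candidates = np.delete(np.arange(len(live)), worst)
new_point = live[rng.choice(candidates)]                   # clone another live point
for _ in range(20):                                        # a few restricted slice moves
    new_point = slice_restricted_step(new_point, log_ls[worst])
live[worst] = new_point
print("threshold:", round(log_ls[worst], 1), "new point:", round(log_like(new_point), 1))
```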

Time series classification is an important real-world problem. Because time series are non-stationary, i.e., their distribution changes over time, it remains challenging to build models that generalize to unseen distributions. In this paper, we propose to view the time series classification problem from a distributional perspective. We argue that the temporal complexity is attributable to unknown latent distributions within the data. To this end, we propose DIVERSIFY to learn generalized representations for time series classification. DIVERSIFY follows an iterative process: it first obtains the worst-case distribution scenario via adversarial training, then matches the distributions of the obtained sub-domains. We also present some theoretical insights. We conduct experiments on gesture recognition, speech command recognition, wearable stress and affect detection, and sensor-based human activity recognition, with a total of seven datasets in different settings. Results demonstrate that DIVERSIFY significantly outperforms other baselines and effectively characterizes the latent distributions, as shown by qualitative and quantitative analysis. Code is available at: //github.com/microsoft/robustlearn.
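To make the distribution-matching step concrete, the sketch below computes a maximum mean discrepancy (MMD) penalty between the features of two latent sub-domains. This is a generic stand-in only; DIVERSIFY's actual adversarial worst-case and matching objectives may differ, and the feature tensors here are placeholders.

```python
import torch

def rbf_mmd(x, y, bandwidth=1.0):
    """Biased MMD^2 estimate between samples x and y with an RBF kernel."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2.0 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

torch.manual_seed(0)
feat_domain_a = torch.randn(128, 16)            # features of one latent sub-domain
feat_domain_b = torch.randn(128, 16) + 0.5      # shifted features of another
print("MMD^2:", float(rbf_mmd(feat_domain_a, feat_domain_b)))
# During training, such a term would be added to the classification loss so the
# representation becomes invariant across the discovered sub-domains.
```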

In this paper, we propose novel Gaussian process-gated hierarchical mixtures of experts (GPHMEs), in which both the gates and the experts are built with Gaussian processes. Unlike other mixtures of experts, where the gating models are linear in the input, the gating functions of our model are inner nodes built with Gaussian processes based on random features, which are nonlinear and nonparametric. Furthermore, the experts are also built with Gaussian processes and provide predictions that depend on the test data. The optimization of the GPHMEs is carried out by variational inference. The proposed GPHMEs have several advantages. First, they outperform tree-based HME benchmarks that partition the data in the input space. Second, they achieve good performance with reduced complexity. Third, they provide interpretability of deep Gaussian processes and, more generally, of deep Bayesian neural networks. Our GPHMEs demonstrate excellent performance on large-scale data sets, even with quite modest model sizes.
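The forward pass of such a model can be sketched as below: gates at the inner nodes of a small binary tree are nonlinear functions of random Fourier features (the standard finite-dimensional approximation of an RBF Gaussian process), and each input is soft-routed to experts at the leaves. The tree depth, feature counts and the use of linear leaf experts are assumptions, and the variational training of the GPHMEs is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, n_features, depth = 4, 64, 2
n_nodes, n_leaves = 2 ** depth - 1, 2 ** depth

# Random Fourier features: phi(x) = sqrt(2/m) cos(W x + b) approximates an RBF kernel.
W = rng.normal(size=(n_features, d_in))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
def features(x):
    return np.sqrt(2.0 / n_features) * np.cos(x @ W.T + b)

gate_w = rng.normal(size=(n_nodes, n_features))     # one gate per inner node
expert_w = rng.normal(size=(n_leaves, d_in))        # linear experts at the leaves

def predict(x):
    phi = features(x)
    gate = 1.0 / (1.0 + np.exp(-phi @ gate_w.T))    # P(go right) at each inner node
    # Probability of reaching each leaf: product of gate decisions along its path.
    leaf_prob = np.ones((x.shape[0], n_leaves))
    for leaf in range(n_leaves):
        node = 0
        for level in reversed(range(depth)):
            go_right = (leaf >> level) & 1
            leaf_prob[:, leaf] *= gate[:, node] if go_right else 1.0 - gate[:, node]
            node = 2 * node + 1 + go_right          # children of node i are 2i+1, 2i+2
    return np.sum(leaf_prob * (x @ expert_w.T), axis=1)

x = rng.normal(size=(5, d_in))
print(predict(x).round(3))
```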

This paper focuses on two fundamental tasks of graph analysis: community detection and node representation learning, which capture the global and local structures of graphs, respectively. In the current literature, these two tasks are usually studied independently, although they are in fact highly correlated. We propose a probabilistic generative model called vGraph to learn community membership and node representations collaboratively. Specifically, we assume that each node can be represented as a mixture of communities, and each community is defined as a multinomial distribution over nodes. Both the mixing coefficients and the community distributions are parameterized by low-dimensional representations of the nodes and communities. We design an effective variational inference algorithm that regularizes the community memberships of neighboring nodes to be similar in the latent space. Experimental results on multiple real-world graphs show that vGraph is very effective in both community detection and node representation learning, outperforming many competitive baselines in both tasks. We also show that the vGraph framework is quite flexible and can be easily extended to detect hierarchical communities.
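The generative story can be sketched as follows (the softmax parameterisations and embedding names are assumptions consistent with the description above; the variational inference algorithm and the neighborhood regularizer are not shown): for an edge (w, c), a community z is drawn from the node's mixture p(z | w), and the neighbour c from the community's multinomial p(c | z).

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_communities, dim = 30, 3, 8

phi = rng.normal(size=(n_nodes, dim))          # node embeddings (as edge source)
nu = rng.normal(size=(n_nodes, dim))           # node embeddings (as edge target)
psi = rng.normal(size=(n_communities, dim))    # community embeddings

def softmax(logits, axis=-1):
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

p_z_given_w = softmax(phi @ psi.T, axis=1)     # node's mixture over communities
p_c_given_z = softmax(nu @ psi.T, axis=0).T    # each community's multinomial over nodes

def generate_edge(w):
    z = rng.choice(n_communities, p=p_z_given_w[w])   # pick a community for node w
    c = rng.choice(n_nodes, p=p_c_given_z[z])         # pick a neighbour from that community
    return (w, c), z

edges = [generate_edge(rng.integers(n_nodes)) for _ in range(5)]
print(edges)
# Community detection reads off argmax_z p(z | w); node representations are phi.
```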
