The Gaussian process state-space model (GPSSM) has garnered considerable attention over the past decade. However, the standard GP commonly used in GPSSM studies, equipped with an elementary kernel such as the squared exponential or Mat\'{e}rn kernel, limits the model's representation power and substantially restricts its applicability to complex scenarios. To address this issue, we propose a new class of probabilistic state-space models called TGPSSMs, which leverage a parametric normalizing flow to enrich the GP priors in the standard GPSSM, enabling greater flexibility and expressivity. Additionally, we present a scalable variational inference algorithm that offers a flexible and optimal structure for the variational distribution of the latent states. The proposed algorithm is interpretable and computationally efficient owing to the sparse GP representation and the bijective nature of the normalizing flow. Moreover, we incorporate a constrained optimization framework into the algorithm to enhance the state-space representation capabilities and optimize the hyperparameters, leading to superior learning and inference performance. Experimental results on synthetic and real datasets corroborate that the proposed TGPSSM outperforms several state-of-the-art methods. The accompanying source code is available at \url{https://github.com/zhidilin/TGPSSM}.
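To make the flow-enrichment idea concrete, the following is a minimal sketch (not the authors' implementation) of warping a standard GP prior draw through an elementwise monotone, hence bijective, transformation; the warp parameters a and b are hypothetical.
\begin{verbatim}
import numpy as np

def se_kernel(x1, x2, ls=1.0, var=1.0):
    # Squared exponential (RBF) kernel
    d = x1[:, None] - x2[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 200)
K = se_kernel(x, x) + 1e-8 * np.eye(x.size)
f = rng.multivariate_normal(np.zeros(x.size), K)  # standard GP prior draw

# Elementwise bijective warp standing in for a parametric normalizing flow;
# a, b > 0 are hypothetical parameters (derivative 1 + a*b*sech^2(b*f) > 0,
# so the map is invertible).
a, b = 1.5, 0.8
g = f + a * np.tanh(b * f)  # transformed (non-Gaussian) prior draw
\end{verbatim}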
The analysis of human movements has been extensively studied due to its wide variety of practical applications, such as human-robot interaction, human learning applications, or clinical diagnosis. Nevertheless, the state of the art still faces scientific challenges in modeling human movements. First, new models must account for the stochasticity of human movement and the physical structure of the human body in order to accurately predict the evolution of full-body motion descriptors over time. Second, although deep learning algorithms are widely used, their explainability in terms of body-posture predictions needs to be improved, as they lack comprehensible representations of human movement. This paper addresses these challenges by introducing three novel methods for creating explainable representations of human movement. In this study, human body movement is formulated as a state-space model adhering to the structure of the Gesture Operational Model (GOM), whose parameters are estimated through the application of deep learning and statistical algorithms. The trained models are used for the full-body dexterity analysis of expert professionals, in which dynamic associations between body joints are identified, and for the artificial generation of professional movements.
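For reference, a state-space formulation of this kind typically takes the generic form below; the GOM prescribes a particular structure for the transition function over body-joint descriptors, which is not reproduced here.
\[
\mathbf{x}_{t+1} = f(\mathbf{x}_t) + \mathbf{w}_t, \qquad
\mathbf{y}_t = g(\mathbf{x}_t) + \mathbf{v}_t,
\]
where $\mathbf{x}_t$ collects the latent motion descriptors, $\mathbf{y}_t$ the observed body-joint measurements, and $\mathbf{w}_t$, $\mathbf{v}_t$ are process and observation noise.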
In this work, we present a phase-field model for tumour growth, where a diffuse interface separates a tumour from the surrounding host tissue. In our model, we consider transport processes driven by an internal, non-solenoidal velocity field. We include viscoelastic effects with the help of a general Oldroyd-B type description with relaxation and possible stress generation by growth. The elastic energy density is coupled to the phase-field variable, which allows us to model invasive growth towards areas with less mechanical resistance. The main analytical result is the existence of weak solutions in two and three space dimensions in the case of additional stress diffusion. The idea behind the proof is to use a numerical approximation with a fully practical, stable and (subsequence) converging finite element scheme. The physical properties of the model are preserved with the help of a regularization technique, uniform estimates and a limit passage on the fully discrete level. Finally, we illustrate the practicability of the discrete scheme with the help of numerical simulations in two and three dimensions.
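For orientation, a generic Oldroyd-B relaxation equation with stress diffusion reads as follows; the model in the paper additionally couples the elastic energy to the phase-field variable and includes growth-induced stress generation, so this constant-coefficient form is only indicative.
\[
\lambda\bigl(\partial_t\boldsymbol{\sigma} + (\mathbf{v}\cdot\nabla)\boldsymbol{\sigma} - \nabla\mathbf{v}\,\boldsymbol{\sigma} - \boldsymbol{\sigma}(\nabla\mathbf{v})^{\top}\bigr) + \boldsymbol{\sigma} = 2\eta\,\mathbf{D}(\mathbf{v}) + \alpha\,\Delta\boldsymbol{\sigma},
\]
where $\mathbf{D}(\mathbf{v}) = \tfrac{1}{2}(\nabla\mathbf{v} + \nabla\mathbf{v}^{\top})$ is the symmetric velocity gradient, $\lambda$ the relaxation time, $\eta$ a viscosity, and $\alpha > 0$ the stress-diffusion parameter.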
Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction in the time and cost of training. Adam and its variants have been state-of-the-art for years, while more sophisticated second-order (Hessian-based) optimizers often incur too much per-step overhead. In this paper, we propose Sophia, Second-order Clipped Stochastic Optimization, a simple and scalable second-order optimizer that uses a lightweight estimate of the diagonal Hessian as the pre-conditioner. The update is the moving average of the gradients divided by the moving average of the estimated Hessian, followed by element-wise clipping. The clipping controls the worst-case update size and tames the negative impact of non-convexity and rapid changes of the Hessian along the trajectory. Sophia estimates the diagonal Hessian only every handful of iterations, which incurs negligible average per-step time and memory overhead. On language modeling with GPT-2 models of sizes ranging from 125M to 770M, Sophia achieves a 2x speed-up compared with Adam in the number of steps, total compute, and wall-clock time. Theoretically, we show that Sophia adapts to the curvature in different components of the parameters, which can be highly heterogeneous for language modeling tasks. Our run-time bound does not depend on the condition number of the loss.
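A minimal sketch of the update described above, assuming a precomputed stochastic gradient and a diagonal Hessian estimate (refreshed only every handful of steps, hence the None check); the constants and exact clipping convention here are illustrative rather than the paper's.
\begin{verbatim}
import numpy as np

def sophia_style_step(theta, grad, hess_diag, state,
                      lr=1e-4, betas=(0.965, 0.99), rho=0.04, eps=1e-12):
    # state = {"m": np.zeros_like(theta), "h": np.zeros_like(theta)}
    b1, b2 = betas
    state["m"] = b1 * state["m"] + (1 - b1) * grad       # EMA of gradients
    if hess_diag is not None:                            # Hessian estimate is
        state["h"] = b2 * state["h"] + (1 - b2) * hess_diag  # refreshed rarely
    # Pre-conditioned direction with element-wise clipping
    update = np.clip(state["m"] / np.maximum(state["h"], eps), -rho, rho)
    return theta - lr * update
\end{verbatim}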
Stein Variational Gradient Descent (SVGD) is a nonparametric, particle-based, deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian as long as the initialization is Gaussian. Inspired by this fact, we undertake a detailed theoretical study of Gaussian-SVGD, i.e., SVGD projected onto the family of Gaussian distributions via the bilinear kernel, or equivalently Gaussian variational inference (GVI) with SVGD. We present a complete picture by considering both the mean-field PDE and discrete particle systems. When the target is strongly log-concave, the mean-field Gaussian-SVGD dynamics is proven to converge linearly to the Gaussian distribution closest to the target in KL divergence. In the finite-particle setting, there is both uniform-in-time convergence to the mean-field limit and linear convergence in time to the equilibrium if the target is Gaussian. In the general case, we propose a density-based and a particle-based implementation of Gaussian-SVGD, and show that several recent algorithms for GVI, proposed from different perspectives, emerge as special cases of our unified framework. Interestingly, one of the new particle-based instances of this framework empirically outperforms existing approaches. Our results make concrete contributions towards a deeper understanding of both SVGD and GVI.
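A minimal particle-based sketch of SVGD with the bilinear kernel $k(x,y) = x^{\top}y + 1$ on a Gaussian target (a toy instance, not the paper's full framework); the step size and iteration count are ad hoc.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])                    # Gaussian target N(mu, Sigma)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Sinv = np.linalg.inv(Sigma)

X = rng.normal(size=(200, 2))                 # Gaussian initialization
eps = 0.05
for _ in range(500):
    S = -(X - mu) @ Sinv                      # score: grad log p at particles
    K = X @ X.T + 1.0                         # bilinear kernel matrix
    # SVGD velocity: kernel-averaged score + repulsion (here grad_y k(y,x) = x)
    phi = (K @ S) / len(X) + X
    X = X + eps * phi

print(X.mean(0), np.cov(X.T))                 # approximately mu and Sigma
\end{verbatim}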
Autonomous parking (AP) is an emerging technique for navigating an intelligent vehicle to a parking space without any human intervention. Existing AP methods based on mathematical optimization or machine learning may lead to potential failures due to either excessive execution time or a lack of generalization. To fill this gap, this paper proposes an integrated constrained optimization and imitation learning (iCOIL) approach to achieve efficient and reliable AP. The iCOIL method has two candidate working modes, i.e., CO and IL, and adopts a hybrid scenario analysis (HSA) model to determine the better mode under various scenarios. We implement and verify iCOIL on the Macao Car Racing Metaverse (MoCAM) platform. Results show that iCOIL properly adapts to different scenarios during the entire AP procedure and achieves significantly higher success rates than other benchmarks.
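Purely to illustrate the two-mode architecture, here is a hypothetical dispatcher in the style of the HSA switch; the scenario score, threshold, and interface are all invented for illustration, since the abstract does not specify them.
\begin{verbatim}
import numpy as np

def hsa_select_mode(scenario_features, threshold=0.5):
    # Hypothetical scenario-complexity score in [0, 1]; the actual HSA
    # criterion used by iCOIL is not specified in the abstract.
    complexity = float(np.clip(np.mean(scenario_features), 0.0, 1.0))
    return "IL" if complexity > threshold else "CO"

print(hsa_select_mode(np.array([0.2, 0.7, 0.4])))  # -> "CO"
\end{verbatim}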
We develop a new method, PROTES, for black-box optimization, based on probabilistic sampling from a probability density function given in a low-parametric tensor train format. We test it on complex multidimensional arrays and discretized multivariable functions taken, among others, from real-world applications, including unconstrained binary optimization and optimal control problems, for which the possible number of elements is up to $2^{100}$. In numerical experiments, both on analytic model functions and on complex problems, PROTES outperforms popular existing discrete optimization methods (particle swarm optimization, covariance matrix adaptation, differential evolution, and others).
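To convey only the loop structure of probabilistic-sampling optimization, here is a toy sketch in which the tensor-train density is replaced by an independent Bernoulli product (so this is not PROTES itself); the objective, population sizes, and learning rate are illustrative.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def f(X):                        # toy black-box objective on binary vectors
    return X.sum(axis=1)         # minimized by the all-zeros string

d, pop, elite, lr = 50, 200, 20, 0.3
p = np.full(d, 0.5)              # stand-in density (PROTES instead uses a
                                 # low-parametric tensor-train density)
for _ in range(100):
    X = (rng.random((pop, d)) < p).astype(int)   # sample candidates
    best = X[np.argsort(f(X))[:elite]]           # keep the elite samples
    p = (1 - lr) * p + lr * best.mean(axis=0)    # shift density toward them

print(p.round(2))                # concentrates near the optimum
\end{verbatim}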
Due to the curse of dimensionality, it is often prohibitively expensive to generate deterministic space-filling designs. On the other hand, when na{\"i}ve uniform random sampling is used to generate designs cheaply, design points tend to concentrate in a small region of the design space. Although quasi-random techniques such as Sobol sequences and Latin hypercube designs are preferable to uniform random sampling in many settings, these methods have their own caveats, especially in high-dimensional spaces. In this paper, we propose a technique that addresses the fundamental issue of measure concentration by updating high-dimensional distribution functions to produce better space-filling designs. We then show that our technique can outperform Latin hypercube sampling and Sobol sequences in terms of the discrepancy metric while generating moderately sized space-filling samples for high-dimensional problems.
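As a baseline comparison in the spirit of the discrepancy evaluation mentioned above (the proposed method itself is not shown), scipy's quasi-Monte Carlo module can be used to contrast the standard generators:
\begin{verbatim}
import numpy as np
from scipy.stats import qmc

d, n = 20, 256                         # dimension, sample size (power of 2)
rng = np.random.default_rng(0)

designs = {
    "uniform": rng.random((n, d)),
    "Sobol":   qmc.Sobol(d, seed=0).random(n),
    "LHS":     qmc.LatinHypercube(d, seed=0).random(n),
}
for name, pts in designs.items():
    # Centered L2 discrepancy: lower means better space-filling
    print(name, qmc.discrepancy(pts))
\end{verbatim}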
Markov chain Monte Carlo (MCMC) allows one to generate dependent replicates from the posterior distribution of effectively any Bayesian hierarchical model. However, MCMC can impose a significant computational burden. This motivates us to find expressions of the posterior distribution from which independent replicates can be obtained directly and cheaply. We focus on a broad class of Bayesian latent Gaussian process (LGP) models that allow for spatially dependent data. First, we derive a new class of distributions that we refer to as the generalized conjugate multivariate (GCM) distribution. The theoretical development of the GCM distribution is similar to that of the CM distribution, with two main differences: namely, (1) the GCM allows for latent Gaussian process assumptions, and (2) the GCM explicitly accounts for hyperparameters through marginalization. The development of the GCM is needed to obtain independent replicates directly from the exact posterior distribution, which has an efficient projection/regression form. Hence, we refer to our method as Exact Posterior Regression (EPR). Illustrative examples are provided, including simulation studies for weakly stationary spatial processes and spatial basis function expansions. An additional analysis of poverty incidence data from the U.S. Census Bureau's American Community Survey (ACS) using a conditional autoregressive model is presented.
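The contrast with MCMC can be illustrated on a toy conjugate model where the exact posterior is available in closed form, so independent replicates are drawn directly (EPR extends this idea to LGP models via the GCM distribution; the toy below is not EPR itself):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model: y_i ~ N(theta, s2), theta ~ N(m0, v0).
y = rng.normal(2.0, 1.0, size=100)
s2, m0, v0 = 1.0, 0.0, 10.0

# Closed-form posterior: theta | y ~ N(mn, vn)
vn = 1.0 / (1.0 / v0 + y.size / s2)
mn = vn * (m0 / v0 + y.sum() / s2)

# Independent replicates, drawn directly -- no Markov chain required
theta_reps = rng.normal(mn, np.sqrt(vn), size=5000)
\end{verbatim}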
This paper focuses on the expected difference in borrowers' repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence the estimation error can be substantial. We therefore propose an alternative approach to constructing estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly large when the causal effects are accounted for correctly.
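To illustrate why ignoring confounding biases such estimates (the abstract does not specify the authors' construction, so the standard inverse-propensity-weighting adjustment below is shown purely for illustration):
\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated data: X confounds both the credit decision T and repayment Y;
# the true effect of T on Y is 2.0.
n = 5000
X = rng.normal(size=(n, 3))
propensity = 1 / (1 + np.exp(-X @ np.array([1.0, 0.5, -0.5])))
T = rng.binomial(1, propensity)
Y = 2.0 * T + X @ np.array([1.0, 1.0, -1.0]) + rng.normal(size=n)

naive = Y[T == 1].mean() - Y[T == 0].mean()        # confounded estimate
e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
ipw = np.mean(T * Y / e) - np.mean((1 - T) * Y / (1 - e))
print(naive, ipw)   # naive is biased upward; ipw is close to 2.0
\end{verbatim}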
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion using a strongly convex regularizer. This allows us to relax both the optimal value and the optimal solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators and relate them to inference in graphical models. We derive two particular instantiations of our framework: a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
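With the negative-entropy regularizer, the smoothed max becomes the familiar log-sum-exp, whose gradient is a softmax; a minimal sketch of this building block (one particular choice of regularizer in the framework) is:
\begin{verbatim}
import numpy as np
from scipy.special import logsumexp, softmax

def smoothed_max(x, gamma=1.0):
    # Entropy-regularized max: gamma * log sum exp(x / gamma).
    # As gamma -> 0 this recovers max(x).
    return gamma * logsumexp(x / gamma)

def smoothed_argmax(x, gamma=1.0):
    # Gradient of smoothed_max: a differentiable relaxation of argmax.
    return softmax(x / gamma)

x = np.array([1.0, 3.0, 2.5])
print(smoothed_max(x, 0.1))      # close to max(x) = 3.0
print(smoothed_argmax(x, 0.1))   # close to the one-hot argmax indicator
\end{verbatim}
Substituting this operator for the hard max inside the Viterbi or DTW recursions yields the smoothed variants mentioned above.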