
A mixture of experts models the conditional density of a response variable using a mixture of regression models with covariate-dependent mixture weights. We extend the finite mixture of experts model by allowing the parameters in both the mixture components and the weights to evolve in time by following random walk processes. Inference for time-varying parameters in richly parameterized mixture of experts models is challenging. We propose a sequential Monte Carlo algorithm for online inference, based on a tailored proposal distribution built on ideas from linear Bayes methods and the EM algorithm. The method gives a unified treatment for mixtures with time-varying parameters, including the special case of static parameters. We assess the properties of the method on simulated data and on industrial data where the aim is to predict software faults in a continuously upgraded large-scale software project.
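As a rough illustration (not the authors' algorithm), the model's two ingredients, Gaussian regression experts with softmax gating and random-walk parameter evolution, can be sketched in a few lines of Python; the Gaussian/softmax choices and all names here are assumptions:

```python
import numpy as np
from scipy.stats import norm

def moe_density(y, x, beta, gamma, sigma):
    """Conditional density of a K-component mixture of experts at (y, x).

    beta:  (K, d) regression coefficients of the component means
    gamma: (K, d) gating coefficients; weights are softmax(gamma @ x)
    sigma: (K,)   component standard deviations
    """
    logits = gamma @ x
    w = np.exp(logits - logits.max())
    w /= w.sum()                              # covariate-dependent mixture weights
    return float(np.sum(w * norm.pdf(y, loc=beta @ x, scale=sigma)))

def random_walk_step(params, tau, rng):
    """One random-walk transition for the time-varying parameters."""
    return params + tau * rng.standard_normal(params.shape)

rng = np.random.default_rng(0)
beta, gamma = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
sigma, x = np.ones(2), np.array([1.0, 0.5, -0.2])
print(moe_density(0.3, x, beta, gamma, sigma))
beta = random_walk_step(beta, 0.05, rng)      # parameters drift between time steps
```

The static-parameter special case corresponds to a zero random-walk step size (tau = 0), which is how the sketch above would reduce to an ordinary mixture of experts.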

Related content

We consider survival data in the presence of a cure fraction, meaning that some subjects will never experience the event of interest. We assume a mixture cure model consisting of two sub-models: one for the probability of being uncured (incidence) and one for the survival of the uncured subjects (latency). Various approaches, ranging from parametric to nonparametric, have been used to model the effect of covariates on the incidence, with the logistic model being the most common one. We propose a monotone single-index model for the incidence and introduce a new estimation method that is based on the maximum likelihood approach and techniques from isotonic regression. The monotone single-index structure relaxes the parametric logistic assumption while maintaining interpretability of the regression coefficients. We investigate the consistency of the proposed estimator and show through a simulation study that, when the monotonicity assumption is satisfied, it outperforms the single-index/Cox mixture cure model. To illustrate its practical use, we apply the new method to melanoma survival data.
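For concreteness, here is a minimal sketch of the mixture cure decomposition with a single-index incidence; the monotone link g, which the paper estimates with isotonic-regression techniques, is replaced by a fixed logistic-shaped stand-in, and all names are illustrative:

```python
import numpy as np

def population_survival(t, x, alpha, g, latency_survival):
    """Mixture cure model: S(t|x) = 1 - p(x) + p(x) * S_u(t|x),
    with incidence p(x) = g(alpha @ x) for a monotone link g."""
    p = g(x @ alpha)                      # probability of being uncured
    # as t -> infinity this tends to 1 - p(x), the cure fraction
    return 1.0 - p + p * latency_survival(t, x)

# toy stand-ins: a logistic-shaped monotone link and an exponential latency
g = lambda u: 1.0 / (1.0 + np.exp(-u))
latency = lambda t, x: np.exp(-0.1 * t)

alpha = np.array([0.8, -0.3])             # single-index coefficients
x = np.array([1.2, 0.4])
print(population_survival(5.0, x, alpha, g, latency))
```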

Online forums that allow for participatory engagement between users have been transformative for the public discussion of many important issues. However, such conversations can sometimes escalate into full-blown exchanges of hate and misinformation. Existing approaches in natural language processing (NLP), such as deep learning models for classification tasks, use as inputs only a single comment or a pair of comments depending upon whether the task concerns the inference of properties of the individual comments or the replies between pairs of comments, respectively. But in online conversations, comments and replies may be based on external context beyond the immediately relevant information that is input to the model. Therefore, being aware of the conversations' surrounding contexts should improve the model's performance for the inference task at hand. We propose GraphNLI, a novel graph-based deep learning architecture that uses graph walks to incorporate the wider context of a conversation in a principled manner. Specifically, a graph walk starts from a given comment and samples "nearby" comments in the same or parallel conversation threads, which results in additional embeddings that are aggregated together with the initial comment's embedding. We then use these enriched embeddings for downstream NLP prediction tasks that are important for online conversations. We evaluate GraphNLI on two such tasks - polarity prediction and misogynistic hate speech detection - and find that our model consistently outperforms all relevant baselines on both. Specifically, GraphNLI with a biased root-seeking random walk achieves macro-F1 scores 3 and 6 percentage points higher than the best-performing BERT-based baselines on the polarity prediction and hate speech detection tasks, respectively.
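The walk itself is easy to picture. Below is a hedged sketch of a root-biased random walk on a reply tree, with p_up an assumed bias parameter; the sampled comments are the ones whose embeddings would be aggregated with the starting comment's embedding:

```python
import random

def root_seeking_walk(start, parent, children, length=4, p_up=0.7, seed=0):
    """Sample nearby comments with a walk biased toward the root.

    parent:   dict mapping comment -> parent comment (root maps to None)
    children: dict mapping comment -> list of replies
    p_up:     probability of stepping toward the root at each move
    """
    rng = random.Random(seed)
    node, visited = start, [start]
    for _ in range(length):
        up = parent.get(node)
        down = children.get(node, [])
        if up is not None and (rng.random() < p_up or not down):
            node = up                      # biased step toward the root
        elif down:
            node = rng.choice(down)        # otherwise explore a reply
        else:
            break                          # isolated root: nowhere to go
        visited.append(node)
    return visited

parent = {"c3": "c1", "c2": "c1", "c1": None}
children = {"c1": ["c2", "c3"]}
print(root_seeking_walk("c3", parent, children))
```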

Recent years have witnessed the emergence of mobile edge computing (MEC), which promises a cost-effective enhancement in the computational ability of hardware-constrained wireless devices (WDs) comprising the Internet of Things (IoT). In a general multi-server multi-user MEC system, each WD has a computational task to execute and has to select binary (off)loading decisions, along with the analog-amplitude resource allocation variables, in an online manner, with the goal of minimizing the overall energy-delay cost (EDC) under dynamic system states. While past works typically rely on an explicit expression of the EDC function, the present contribution considers a practical setting in which the EDC function is not available in analytical form, and only its values at queried points are revealed. Towards tackling such a challenging online combinatorial problem with only bandit information, novel Bayesian optimization (BO) based approaches are put forth by leveraging the multi-armed bandit (MAB) framework. Per time slot, the discrete offloading decisions are first obtained via the MAB method, and the analog resource allocation variables are subsequently optimized using the BO selection rule. By exploiting both temporal and contextual information, two novel BO approaches, termed time-varying BO and contextual time-varying BO, are developed. Numerical tests validate the merits of the proposed BO approaches compared with contemporary benchmarks under different MEC network sizes.
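A heavily simplified per-slot loop in this spirit, with a UCB bandit over one binary offloading arm and a GP lower-confidence-bound rule for one continuous allocation variable, might look as follows; the EDC function here is a toy stand-in observed only through queried values, and nothing below reproduces the paper's actual time-varying or contextual BO rules:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
counts, means = np.zeros(2), np.zeros(2)      # bandit statistics for the two arms
X_hist, y_hist = [], []                       # (allocation, cost) pairs for the GP

def edc(arm, power, t):
    """Toy black-box energy-delay cost: only queried values are observable."""
    return (0.5 + 0.1 * np.sin(0.2 * t)) * arm \
        + (power - 0.6 * arm) ** 2 + 0.05 * rng.standard_normal()

for t in range(1, 101):
    # 1) discrete offloading decision via a UCB rule (optimistic, minimizing cost)
    if t <= 2:
        arm = t % 2                           # pull each arm once to initialize
    else:
        arm = int(np.argmin(means - np.sqrt(2 * np.log(t) / counts)))
    # 2) continuous allocation via a GP lower-confidence-bound selection rule
    cand = np.linspace(0, 1, 50).reshape(-1, 1)
    if len(X_hist) >= 3:
        gp = GaussianProcessRegressor().fit(np.array(X_hist), np.array(y_hist))
        mu, sd = gp.predict(cand, return_std=True)
        power = float(cand[np.argmin(mu - sd)])
    else:
        power = float(rng.uniform())
    cost = edc(arm, power, t)                 # bandit feedback: a single cost value
    counts[arm] += 1
    means[arm] += (cost - means[arm]) / counts[arm]
    X_hist.append([power]); y_hist.append(cost)

print("empirically better arm:", int(np.argmin(means)))
```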

We develop a flexible Erlang mixture model for survival analysis. The model for the survival density is built from a structured mixture of Erlang densities, mixing on the integer shape parameter with a common scale parameter. The mixture weights are constructed through increments of a distribution function on the positive real line, which is assigned a Dirichlet process prior. The model has a relatively simple structure, balancing flexibility with efficient posterior computation. Moreover, it implies a mixture representation for the hazard function that involves time-dependent mixture weights, thus offering a general approach to hazard estimation. We extend the model to handle survival responses corresponding to multiple experimental groups, using a dependent Dirichlet process prior for the group-specific distributions that define the mixture weights. Model properties, prior specification, and posterior simulation are discussed, and the methodology is illustrated with synthetic and real data examples.
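The mixture structure is simple enough to write down directly. Below is a minimal sketch of the implied density and hazard, with fixed weights standing in for the increments of the DP-distributed distribution function (an Erlang density is a gamma density with integer shape):

```python
import numpy as np
from scipy.stats import gamma

def erlang_mixture_pdf(t, weights, theta):
    """f(t) = sum_j w_j * Erlang(t; shape=j, scale=theta), j = 1..J."""
    return sum(w * gamma.pdf(t, a=j, scale=theta)
               for j, w in enumerate(weights, start=1))

def erlang_mixture_hazard(t, weights, theta):
    """h(t) = f(t) / S(t); the ratio implies time-dependent mixture weights."""
    surv = sum(w * gamma.sf(t, a=j, scale=theta)
               for j, w in enumerate(weights, start=1))
    return erlang_mixture_pdf(t, weights, theta) / surv

w = np.array([0.2, 0.5, 0.3])   # stand-in for increments of the DP-distributed CDF
t = np.linspace(0.1, 10.0, 5)
print(erlang_mixture_hazard(t, w, theta=1.5))
```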

Due to the domain shift, machine learning systems typically fail to generalize well to domains different from those of training data, which is the problem that domain generalization (DG) aims to address. However, most mainstream DG algorithms lack interpretability and require domain labels, which are not available in many real-world scenarios. In this work, we propose a novel DG method, HMOE: Hypernetwork-based Mixture of Experts (MoE), that does not require domain labels and is more interpretable. We use hypernetworks to generate the weights of experts, allowing experts to share some useful meta-knowledge. MoE has proven adept at detecting and identifying heterogeneous patterns in data; for DG, heterogeneity arises precisely from the domain shift. We compare HMOE with other DG algorithms under a fair and unified benchmark, DomainBed. Extensive experiments show that HMOE can perform latent domain discovery from data of mixed domains and divide them into distinct clusters that are surprisingly more consistent with human intuition than the original domain labels. Compared to other DG methods, HMOE shows competitive performance and achieves SOTA results in some cases without using domain labels.
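A toy sketch of the core architectural idea, a hypernetwork generating per-expert weights that are mixed by a softmax gate, is given below in PyTorch; the dimensions and the soft-gating choice are assumptions, not the authors' exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperMoE(nn.Module):
    """Experts whose weights are generated by a shared hypernetwork."""
    def __init__(self, d_in, d_out, n_experts, d_embed=8):
        super().__init__()
        self.embed = nn.Parameter(torch.randn(n_experts, d_embed))  # one code per expert
        self.hyper = nn.Linear(d_embed, d_out * (d_in + 1))         # generates W and b
        self.gate = nn.Linear(d_in, n_experts)
        self.d_in, self.d_out = d_in, d_out

    def forward(self, x):
        params = self.hyper(self.embed)                 # (E, d_out * (d_in + 1))
        W = params[:, :self.d_out * self.d_in].view(-1, self.d_out, self.d_in)
        b = params[:, self.d_out * self.d_in:]          # (E, d_out)
        expert_out = torch.einsum('bi,eoi->beo', x, W) + b  # (B, E, d_out)
        g = F.softmax(self.gate(x), dim=-1)             # soft expert assignment
        return torch.einsum('be,beo->bo', g, expert_out)

model = HyperMoE(d_in=16, d_out=4, n_experts=3)
print(model(torch.randn(5, 16)).shape)                  # torch.Size([5, 4])
```

Because every expert's weights pass through the same hypernetwork, the experts implicitly share meta-knowledge through the hypernetwork's parameters, which is the sharing mechanism the abstract alludes to.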

We devise coresets for kernel $k$-Means with a general kernel, and use them to obtain new, more efficient, algorithms. Kernel $k$-Means has superior clustering capability compared to classical $k$-Means, particularly when clusters are non-linearly separable, but it also introduces significant computational challenges. We address this computational issue by constructing a coreset, which is a reduced dataset that accurately preserves the clustering costs. Our main result is a coreset for kernel $k$-Means that works for a general kernel and has size $\mathrm{poly}(k\epsilon^{-1})$. Our new coreset both generalizes and greatly improves upon all previous results; moreover, it can be constructed in time near-linear in $n$, the number of input points. This result immediately implies new algorithms for kernel $k$-Means, such as a $(1+\epsilon)$-approximation in time near-linear in $n$, and a streaming algorithm using space and update time $\mathrm{poly}(k \epsilon^{-1} \log n)$. We validate our coreset on various datasets with different kernels. Our coreset performs consistently well, achieving small errors while using very few points. We show that our coresets can speed up kernel $k$-Means++ (the kernelized version of the widely used $k$-Means++ algorithm), and we further use this faster kernel $k$-Means++ for spectral clustering. In both applications, we achieve up to 1000x speedup while the error is comparable to baselines that do not use coresets.
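To make the kernel-space computation concrete, here is a sketch of kernel $k$-Means++ ($D^2$) seeding under an RBF kernel, using the identity $\|\phi(x)-\phi(c)\|^2 = K(x,x) - 2K(x,c) + K(c,c)$; running it on a coreset rather than the full dataset is what yields the speedups, and the implementation details below are assumptions:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_kmeanspp_seeds(X, k, gamma=0.5, seed=0):
    """D^2 seeding in feature space without ever forming phi(x) explicitly."""
    rng = np.random.default_rng(seed)
    n = len(X)
    centers = [rng.integers(n)]
    d2 = np.full(n, np.inf)
    for _ in range(k - 1):
        kc = rbf_kernel(X, X[centers[-1:]], gamma)[:, 0]
        # K(x, x) = K(c, c) = 1 for the RBF kernel
        d2 = np.maximum(np.minimum(d2, 1.0 - 2.0 * kc + 1.0), 0.0)
        centers.append(int(rng.choice(n, p=d2 / d2.sum())))
    return centers

X = np.random.default_rng(2).normal(size=(200, 2))
print(kernel_kmeanspp_seeds(X, k=3))   # feed a coreset instead of X for speed
```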

The capability of recurrent neural networks to approximate trajectories of a random dynamical system, with random inputs, on non-compact domains, and over an indefinite or infinite time horizon is considered. The main result states that certain random trajectories over an infinite time horizon may be approximated to any desired accuracy, uniformly in time, by a certain class of deep recurrent neural networks, with simple feedback structures. The formulation here contrasts with related literature on this topic, much of which is restricted to compact state spaces and finite time intervals. The model conditions required here are natural, mild, and easy to test, and the proof is very simple.
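To illustrate the setting (though not the paper's proof or its uniform-in-time guarantee), one can train a small recurrent network to track a contracting random dynamical system driven by the same random inputs; the system and hyper-parameters below are arbitrary choices:

```python
import torch
import torch.nn as nn

def simulate(u):
    """Target random dynamical system: x_{t+1} = 0.8 * tanh(x_t) + u_t."""
    x, xs = torch.zeros(u.shape[0], 1), []
    for t in range(u.shape[1]):
        x = 0.8 * torch.tanh(x) + u[:, t]
        xs.append(x)
    return torch.stack(xs, dim=1)

rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-2)

for step in range(300):
    u = 0.3 * torch.randn(32, 50, 1)     # random inputs drive both systems
    target = simulate(u)
    h, _ = rnn(u)                        # simple feedback structure
    loss = ((head(h) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final trajectory MSE: {loss.item():.4f}")
```

The paper's result is stronger than what this finite-horizon training loop suggests: the approximation error stays uniformly small over an infinite time horizon, without restricting the state to a compact set.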

In this paper, we propose a general framework for designing a sensing matrix $\boldsymbol{A} \in \mathbb{R}^{d\times p}$ for the estimation of a sparse covariance matrix from compressed measurements of the form $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{n}$, where $\boldsymbol{y}, \boldsymbol{n} \in \mathbb{R}^d$ and $\boldsymbol{x} \in \mathbb{R}^p$. By viewing covariance recovery as inference over factor graphs via a message passing algorithm, ideas from coding theory, such as \textit{Density Evolution} (DE), are leveraged to construct a framework for the design of the sensing matrix. The proposed framework can handle both (1) regular sensing, i.e., equal importance is given to all entries of the covariance, and (2) preferential sensing, i.e., higher importance is given to a part of the covariance matrix. Through experiments, we show that the sensing matrix designed via density evolution can match the state-of-the-art for covariance recovery in the regular sensing paradigm and attain improved performance in the preferential sensing regime. Additionally, we study the feasibility of causal graph structure recovery using the estimated covariance matrix obtained from the compressed measurements.
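The measurement model is easy to demonstrate: the second moments of $\boldsymbol{y}$ are linear in the covariance, which is what makes recovery from compressed measurements possible. The sketch below checks that relationship with a generic Gaussian sensing matrix; the paper's actual contribution, designing $\boldsymbol{A}$ via density evolution so that the underdetermined sparse recovery succeeds under message passing, is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(3)
p, d, n = 8, 4, 50000

# sparse covariance: identity plus one off-diagonal dependency
Sigma = np.eye(p)
Sigma[0, 3] = Sigma[3, 0] = 0.6

A = rng.normal(size=(d, p)) / np.sqrt(d)          # generic Gaussian sensing matrix
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
noise_sd = 0.05
Y = X @ A.T + noise_sd * rng.normal(size=(n, d))  # y = A x + n

# second moments of y are linear in Sigma: E[y y^T] = A Sigma A^T + noise_sd^2 I
C_y = Y.T @ Y / n - noise_sd**2 * np.eye(d)
# for symmetric matrices and row-major flattening, vec(A Sigma A^T) = (A kron A) vec(Sigma)
M = np.kron(A, A)
print(np.abs(M @ Sigma.flatten() - C_y.flatten()).max())  # small sampling error
```

Since $M$ has $d^2 < p^2$ rows, inverting this linear map requires exploiting the sparsity of $\boldsymbol{\Sigma}$, which is exactly what the factor-graph message passing in the paper does.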

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most existing approaches. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
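A single-level relaxation of this idea, learning Bernoulli edge logits jointly with a one-layer GCN by using the expected adjacency instead of sampled discrete graphs, can be sketched as follows; the actual method solves a bilevel program over sampled graph structures, so this is only a hypothetical simplification:

```python
import torch
import torch.nn as nn

n, d, c = 6, 5, 2                              # nodes, features, classes
X = torch.randn(n, d)
y = torch.randint(0, c, (n,))

theta = nn.Parameter(torch.zeros(n, n))        # logits of Bernoulli edge probabilities
gcn = nn.Linear(d, c)                          # one-layer GCN: softmax(A_hat X W)
opt = torch.optim.Adam([theta] + list(gcn.parameters()), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    P = torch.sigmoid(theta)                   # expected adjacency (soft relaxation)
    A_hat = (P + P.T) / 2 + torch.eye(n)       # symmetrize, add self-loops
    deg = A_hat.sum(1).sqrt()
    A_norm = A_hat / torch.outer(deg, deg)     # symmetric normalization
    loss = loss_fn(gcn(A_norm @ X), y)         # both theta and GCN weights get gradients
    opt.zero_grad(); loss.backward(); opt.step()

print(f"train loss: {loss.item():.3f}")
```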
