
We introduce a novel Bayesian approach for variable selection using Gaussian process regression, which is crucial for enhancing interpretability and model regularization. Our method employs nearest neighbor Gaussian processes, serving as scalable approximations of classical Gaussian processes. Variable selection is achieved by conditioning the process mean and covariance function on a random set that represents the indices of contributing variables. A priori beliefs regarding this set control the variable selection, while reference priors are assigned to the remaining model parameters, ensuring numerical robustness in the process covariance matrix. We propose a Metropolis-within-Gibbs algorithm for model inference. Evaluation using simulated data, a computer experiment approximation, and two real-world data sets demonstrates the effectiveness of our approach.
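
As a rough illustration of how such a sampler can explore the inclusion set, the sketch below flips one inclusion indicator at a time in a Metropolis-within-Gibbs sweep. The function `log_marginal_likelihood` is a hypothetical stand-in for the NNGP marginal likelihood conditioned on the active variables, and the Bernoulli prior probability `prior_p` is an assumed hyperparameter; neither reflects the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_marginal_likelihood(gamma, X, y):
    """Crude stand-in for the NNGP log marginal likelihood given inclusion vector gamma.
    The actual method conditions the process mean and covariance on the active variables."""
    active = X[:, gamma.astype(bool)]
    if active.shape[1] == 0:
        resid = y - y.mean()
    else:
        beta, *_ = np.linalg.lstsq(active, y, rcond=None)
        resid = y - active @ beta
    return -0.5 * len(y) * np.log(np.mean(resid ** 2) + 1e-9)

def update_inclusion(gamma, X, y, prior_p=0.2):
    """One Metropolis-within-Gibbs sweep: propose flipping each inclusion indicator."""
    gamma = gamma.copy()
    for j in range(len(gamma)):
        proposal = gamma.copy()
        proposal[j] = 1 - proposal[j]
        log_ratio = (log_marginal_likelihood(proposal, X, y)
                     - log_marginal_likelihood(gamma, X, y)
                     + (proposal[j] - gamma[j]) * np.log(prior_p / (1 - prior_p)))
        if np.log(rng.uniform()) < log_ratio:
            gamma = proposal
    return gamma
```

In the full method, the remaining covariance parameters would be updated in separate Gibbs steps under their reference priors; this sketch only shows the inclusion-set move.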

Related Content

Processing is the name of an open-source programming language and its companion integrated development environment (IDE). Processing is used in the electronic arts and visual design communities to teach the fundamentals of programming, and it is employed in a large number of new media and interactive art works.

While sequential recommendation achieves significant progress on capturing user-item transition patterns, transferring such large-scale recommender systems remains challenging due to the disjoint user and item groups across domains. In this paper, we propose a vector quantized meta learning for transferable sequential recommenders (MetaRec). Without requiring additional modalities or shared information across domains, our approach leverages user-item interactions from multiple source domains to improve the target domain performance. To solve the input heterogeneity issue, we adopt vector quantization that maps item embeddings from heterogeneous input spaces to a shared feature space. Moreover, our meta transfer paradigm exploits limited target data to guide the transfer of source domain knowledge to the target domain (i.e., learn to transfer). In addition, MetaRec adaptively transfers from multiple source tasks by rescaling meta gradients based on the source-target domain similarity, enabling selective learning to improve recommendation performance. To validate the effectiveness of our approach, we perform extensive experiments on benchmark datasets, where MetaRec consistently outperforms baseline methods by a considerable margin.
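
The following minimal sketch, with hypothetical function names, illustrates the two ingredients described above: nearest-neighbor vector quantization into a shared codebook, and rescaling of source-task meta gradients by their similarity to the target gradient. It is not MetaRec's actual implementation.

```python
import numpy as np

def quantize(item_emb, codebook):
    """Map each item embedding (n_items, d) to its nearest codebook vector (K, d),
    projecting heterogeneous input spaces into a shared feature space."""
    d2 = ((item_emb[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    codes = d2.argmin(axis=1)
    return codebook[codes], codes

def rescale_meta_gradients(source_grads, target_grad):
    """Weight each source-task gradient by its (non-negative) cosine similarity
    to the target gradient, so transfer is selective across source domains."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    weights = np.array([max(cos(g, target_grad), 0.0) for g in source_grads])
    weights = weights / (weights.sum() + 1e-12)
    return sum(w * g for w, g in zip(weights, source_grads))
```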

Efficient sampling from un-normalized target distributions is pivotal in scientific computing and machine learning. While neural samplers have demonstrated potential with a special emphasis on sampling efficiency, existing neural implicit samplers still have issues such as poor mode-covering behavior, unstable training dynamics, and sub-optimal performance. To tackle these issues, in this paper we introduce Denoising Fisher Training (DFT), a novel training approach for neural implicit samplers with theoretical guarantees. We frame the training problem as minimizing the Fisher divergence and derive a tractable yet equivalent loss function, which marks a unique theoretical contribution to assessing otherwise intractable Fisher divergences. DFT is empirically validated across diverse sampling benchmarks, including two-dimensional synthetic distributions, Bayesian logistic regression, and high-dimensional energy-based models (EBMs). Notably, in experiments with high-dimensional EBMs, our best one-step DFT neural sampler achieves results on par with MCMC methods that use up to 200 sampling steps, a more than 100-fold gain in sampling efficiency. This result not only demonstrates the superior performance of DFT in handling complex high-dimensional sampling but also sheds light on efficient sampling methodologies across broader applications.
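
For context, the Fisher divergence referred to above is conventionally defined as

$$
D_{\mathrm{F}}(q \,\|\, p) \;=\; \mathbb{E}_{x \sim q}\!\left[\, \big\| \nabla_x \log q(x) - \nabla_x \log p(x) \big\|_2^2 \,\right],
$$

which is intractable for an implicit sampler because $\nabla_x \log q(x)$ has no closed form; the tractable surrogate loss derived in the paper is not reproduced here.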

We link and extend two approaches to estimating time-varying treatment effects on repeated continuous outcomes--time-varying Difference in Differences (DiD; see Roth et al. (2023) and Chaisemartin et al. (2023) for reviews) and Structural Nested Mean Models (SNMMs; see Vansteelandt and Joffe (2014) for a review). In particular, we show that SNMMs, which were previously only known to be nonparametrically identified under a no unobserved confounding assumption, are also identified under a generalized version of the parallel trends assumption typically used to justify time-varying DiD methods. Because SNMMs model a broader set of causal estimands, our results allow practitioners of existing time-varying DiD approaches to address additional types of substantive questions under similar assumptions. SNMMs enable estimation of time-varying effect heterogeneity, lasting effects of a `blip' of treatment at a single time point, effects of sustained interventions (possibly on continuous or multi-dimensional treatments) when treatment repeatedly changes value in the data, controlled direct effects, effects of dynamic treatment strategies that depend on covariate history, and more. Our results also allow analysts who apply SNMMs under the no unobserved confounding assumption to estimate some of the same causal effects under alternative identifying assumptions. We provide a method for sensitivity analysis to violations of our parallel trends assumption. We further explain how to estimate optimal treatment regimes via optimal regime SNMMs under parallel trends assumptions plus an assumption that there is no effect modification by unobserved confounders. Finally, we illustrate our methods with real data applications estimating effects of Medicaid expansion on uninsurance rates, effects of floods on flood insurance take-up, and effects of sustained changes in temperature on crop yields.
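
For orientation, one standard way to write an SNMM blip function for an end-of-study outcome $Y$ (not necessarily the exact parameterization used in the paper) is

$$
\gamma_t(\bar{a}_t, \bar{l}_t) \;=\; \mathbb{E}\!\left[\, Y(\bar{a}_t, \underline{0}_{t+1}) - Y(\bar{a}_{t-1}, \underline{0}_{t}) \mid \bar{A}_t = \bar{a}_t,\ \bar{L}_t = \bar{l}_t \,\right],
$$

the effect of a final blip of treatment $a_t$ at time $t$, followed by no further treatment, given treatment and covariate history. The paper's contribution is to identify such blip functions under a generalized parallel trends assumption rather than under no unobserved confounding.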

HIVE4MAT is a linked data interactive application for navigating ontologies of value to materials science. HIVE enables automatic indexing of textual resources with standardized terminology. This article presents the motivation underlying HIVE4MAT, explains the system architecture, reports on two evaluations, and discusses future plans.

We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models have excelled at capturing data distributions, they still suffer from various limitations such as slow convergence, mode-collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tailed Brownian motion (BM) with independent increments. In this paper, we replace BM with an approximation of its non-Markovian counterpart, fractional Brownian motion (fBM), characterized by correlated increments and Hurst index $H \in (0,1)$, where $H=0.5$ recovers the classical BM. To ensure tractable inference and learning, we employ a recently popularized Markov approximation of fBM (MA-fBM) and derive its reverse-time model, resulting in generative fractional diffusion models (GFDM). We characterize the forward dynamics using a continuous reparameterization trick and propose augmented score matching to efficiently learn the score function, which is partly known in closed form, at minimal added cost. The ability to drive our diffusion model via MA-fBM offers flexibility and control. $H \leq 0.5$ enters the regime of rough paths whereas $H>0.5$ regularizes diffusion paths and invokes long-term memory. The Markov approximation allows added control by varying the number of Markov processes linearly combined to approximate fBM. Our evaluations on real image datasets demonstrate that GFDM achieves greater pixel-wise diversity and enhanced image quality, as indicated by a lower FID, offering a promising alternative to traditional diffusion models.
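
To make the Markov approximation concrete, the toy simulation below builds an approximate fBM path as a weighted sum of Ornstein-Uhlenbeck processes driven by one shared Brownian motion; the mean-reversion speeds and weights are illustrative placeholders, not the coefficients the paper would choose for a given Hurst index $H$, and increasing the number of components tightens the approximation.

```python
import numpy as np

def simulate_ma_fbm(T=1.0, n=1000, speeds=(0.1, 1.0, 10.0),
                    weights=(0.5, 0.3, 0.2), seed=0):
    """Simulate a Markov approximation of fBM as a weighted sum of OU processes
    driven by the same Brownian motion (illustrative speeds/weights only)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=n)   # shared Brownian increments
    Y = np.zeros((len(speeds), n + 1))          # one OU component per speed
    for k, gamma in enumerate(speeds):
        for i in range(n):
            Y[k, i + 1] = Y[k, i] - gamma * Y[k, i] * dt + dW[i]
    return np.asarray(weights) @ Y              # approximate fBM path
```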

The instrumental variable method is a prominent approach to recovering, under certain conditions, valid inference about the causal effect of a treatment even when unmeasured confounding might be present. In a groundbreaking paper, Imbens and Angrist (1994) established that a valid instrument nonparametrically identifies the average causal effect among compliers, also known as the local average treatment effect, under a certain monotonicity assumption which rules out the existence of so-called defiers. An often-cited attractive property of monotonicity is that it facilitates a causal interpretation of the instrumental variable estimand without restricting the degree of heterogeneity of the treatment effect. In this paper, we introduce an alternative, equally straightforward and interpretable condition for identification, which accommodates both the presence of defiers and heterogeneous treatment effects. Specifically, we show that under our new conditions, the instrumental variable estimand recovers the average causal effect for the subgroup of units for whom the treatment is manipulable by the instrument, a subgroup which may consist of both defiers and compliers, thereby recovering an effect estimand we aptly call the Nudge Average Treatment Effect.
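
For reference, the instrumental variable (Wald) estimand discussed above, with binary instrument $Z$ and treatment $D$, is

$$
\beta_{\mathrm{IV}} \;=\; \frac{\mathbb{E}[Y \mid Z = 1] - \mathbb{E}[Y \mid Z = 0]}{\mathbb{E}[D \mid Z = 1] - \mathbb{E}[D \mid Z = 0]}.
$$

Under Imbens and Angrist's monotonicity assumption this ratio equals the local average treatment effect among compliers; under the alternative condition proposed here it instead recovers the average effect among all units whose treatment is moved by the instrument.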

Counterfactual reasoning is pivotal in human cognition and especially important for providing explanations and making decisions. While Judea Pearl's influential approach is theoretically elegant, its generation of a counterfactual scenario often requires too much deviation from the observed scenarios to be feasible, as we show using simple examples. To mitigate this difficulty, we propose a framework of \emph{natural counterfactuals} and a method for generating counterfactuals that are more feasible with respect to the actual data distribution. Our methodology incorporates a certain amount of backtracking when needed, allowing changes in causally preceding variables to minimize deviations from realistic scenarios. Specifically, we introduce a novel optimization framework that permits but also controls the extent of backtracking with a naturalness criterion. Empirical experiments demonstrate the effectiveness of our method. The code is available at //github.com/GuangyuanHao/natural_counterfactuals.
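
As a toy illustration only (a two-variable linear SCM invented here, not the paper's model or code), the sketch below finds a counterfactual value of a downstream variable by allowing a penalized amount of backtracking on the noise of its causal parent, so that the resulting scenario stays close to what was actually observed.

```python
import numpy as np
from scipy.optimize import minimize

# Toy SCM (hypothetical): x1 = u1, x2 = 2 * x1 + u2.
# Goal: a counterfactual where x2 reaches a target value, allowing a limited,
# penalized amount of backtracking on the causally preceding noise u1.

def natural_counterfactual(x1_obs, x2_obs, x2_target, lam=1.0):
    u1_obs = x1_obs                     # abduction: recover exogenous noises
    u2_obs = x2_obs - 2.0 * x1_obs

    def objective(u):
        u1_cf = u[0]
        x2_cf = 2.0 * u1_cf + u2_obs    # u2 held fixed; only u1 may backtrack
        miss = (x2_cf - x2_target) ** 2       # reach the counterfactual target
        backtrack = (u1_cf - u1_obs) ** 2     # naturalness proxy: stay near the data
        return miss + lam * backtrack

    res = minimize(objective, x0=np.array([u1_obs]))
    u1_cf = res.x[0]
    return u1_cf, 2.0 * u1_cf + u2_obs  # counterfactual (x1, x2)
```

Larger values of the assumed penalty `lam` allow less backtracking, which loosely mirrors the "permits but controls the extent of backtracking" behavior described above.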

This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. With the framelet system, we can decompose the graph feature into low-pass and high-pass frequencies as extracted features for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling strategy by aggregating the graph feature into low-pass and high-pass spectra, which considers both the feature values and geometry of the graph data and conserves the total information. The graph neural networks with the proposed framelet convolution and pooling achieve state-of-the-art performance in many types of node and graph prediction tasks. Moreover, we propose shrinkage as a new activation for the framelet convolution, which thresholds the high-frequency information at different scales. Compared to ReLU, shrinkage in framelet convolution improves the graph neural network model in terms of denoising and signal compression: noises in both node and structure can be significantly reduced by accurately cutting off the high-pass coefficients from framelet decomposition, and the signal can be compressed to less than half its original size with the prediction performance well preserved.
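
The shrinkage activation mentioned above can be illustrated with standard soft-thresholding applied to the high-pass framelet coefficients; this is a simplified sketch, and the scale-dependent thresholds here are free parameters rather than the paper's learned values.

```python
import numpy as np

def shrinkage(coeffs, threshold):
    """Soft-threshold framelet coefficients: shrink small (noisy) high-pass
    values to zero while preserving the sign of larger ones."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

def denoise_features(low_pass, high_pass_list, thresholds):
    """Keep the low-pass band; shrink each high-pass band with its own threshold."""
    return low_pass, [shrinkage(h, t) for h, t in zip(high_pass_list, thresholds)]
```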

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
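
The sketch below, a deliberately simplified stand-in rather than the published architecture, shows the core lattice idea: the cell state at a character position is a gate-normalized mixture of the character's own candidate cell and the cells of lexicon words ending at that character. Real lattice LSTM gates are vector-valued and computed from learned parameters; scalar gates are used here only to keep the example short.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def merge_cells(char_candidate, char_gate, word_cells, word_gates):
    """Mix the character's candidate cell with the cells of lexicon words ending
    at this character, weighting each by a normalized (scalar) gate."""
    gates = softmax(np.array([char_gate] + list(word_gates)))
    cells = np.stack([np.asarray(char_candidate)] + [np.asarray(c) for c in word_cells])
    return (gates[:, None] * cells).sum(axis=0)
```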

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
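
The attention function at the heart of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a minimal single-head NumPy version follows (the optional boolean `mask` argument is an illustrative addition).

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D Q, K, V (single head)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)   # True = attend, False = block
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V
```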
