
We consider the influence maximization problem over a temporal graph with a single fixed source. We deviate from the standard model of influence maximization, where the goal is to choose the set of most influential vertices. Instead, in our model we are given a fixed vertex, or source, and the goal is to find the best time steps at which to transmit so that the influence of this vertex is maximized. We frame this problem as a spreading process that follows a variant of the susceptible-infected-susceptible (SIS) model, and we focus on four objective functions. In the MaxSpread objective, the goal is to maximize the total number of vertices that get infected at least once. In the MaxViral objective, the goal is to maximize the number of vertices that are infected at the same time step. In the MaxViralTstep objective, the goal is to maximize the number of vertices that are infected at a given time step. Finally, in MinNonViralTime, the goal is to maximize the total number of vertices that get infected every $d$ time steps. We perform a thorough complexity-theoretic analysis of these four objectives over three different scenarios: (1) the unconstrained setting, where the source can transmit whenever it wants; (2) the window-constrained setting, where the source has to transmit within either a predetermined or a shifting window; (3) the periodic setting, where the temporal graph has a small period. We prove that all of these problems, with the exception of MaxSpread for periodic graphs, are intractable even for very simple underlying graphs.

Related content

We consider a unique continuation problem where the Dirichlet trace of the solution is known to have finite dimension. We prove Lipschitz stability of the unique continuation problem and design a finite element method that exploits the finite dimensionality to enhance stability. Optimal a priori and a posteriori error estimates are shown for the method. The extension to problems where the trace is not in a finite dimensional space, but can be approximated to high accuracy using finite dimensional functions is discussed. Finally, the theory is illustrated in some numerical examples.

Inclusion of contact in mechanical designs opens a large range of design possibilities; this includes classical designs with contact, such as gears, couplings, switches, and clamps. However, incorporating contact into topology optimization is challenging, as classical contact models are not readily applicable when the boundaries are not defined. This paper aims to address the limitations of contact in topology optimization by extending the third medium contact method to topology optimization problems with internal contact. When the objective is to maximize a given contact load for a specified displacement, instabilities may arise as an optimum is approached. In order to alleviate stability problems as well as provide robustness of the optimized designs, a tangent stiffness requirement is introduced into the design objective. To avoid a non-physical exploitation of the third medium in optimized designs, small features are penalized by evaluating the volume constraint on a dilated design. The present work incorporates well-established methods in topology optimization, including Helmholtz PDE filtering, threshold projection, Solid Isotropic Material Interpolation with Penalization, and the Method of Moving Asymptotes. Three examples are used to illustrate how the approach exploits internal contact in the topology optimization of structures subjected to large deformations.

The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extends the VAE to model a sequence of observed data and a corresponding sequence of latent vectors. In almost all the DVAEs of the literature, the temporal dependencies within each sequence and across the two sequences are modeled with recurrent neural networks. In this paper, we propose to model speech signals with the Hierarchical Transformer DVAE (HiT-DVAE), which is a DVAE with two levels of latent variable (sequence-wise and frame-wise) and in which the temporal dependencies are implemented with the Transformer architecture. We show that HiT-DVAE outperforms several other DVAEs for speech spectrogram modeling, while enabling a simpler training procedure, revealing its high potential for downstream low-level speech processing tasks such as speech enhancement.

DiffRec: Diffusion Recommender Model (SIGIR'23)

TLDR: This paper applies diffusion models to recommender systems and proposes a novel diffusion recommender model, DiffRec, for personalized recommendation, together with two variants, L-DiffRec and T-DiffRec, that extend it to large-scale recommendation scenarios and to temporal modeling. Experimental results on three datasets verify the superiority of the approach.

Paper: Diffusion Recommender Model (SIGIR'23)

Code:

Abstract

Generative recommender models, such as generative adversarial networks (GANs) and variational autoencoders (VAEs), are widely used to model the generative process of user interactions. However, these generative models suffer from inherent limitations, such as the unstable training of GANs and the limited representational capacity of VAEs, which make it hard to accurately model complex user interactions (interactions that are noisy due to various interfering factors). Given the significant advantages of diffusion models (DMs) over traditional generative models in image synthesis, we propose the Diffusion Recommender Model (DiffRec), which learns the generative process of user interactions in a denoising manner. To preserve the personalized information in users' interaction histories, DiffRec reduces the amount of noise added in the forward process and avoids corrupting interactions into pure noise, as is done in image synthesis. In addition, to address two practical challenges of real-world recommender systems, namely the heavy computational cost of large-scale item prediction and the drift of user preferences over time, we propose two variants of DiffRec. L-DiffRec clusters items, compresses each cluster into a lower dimension, and runs the diffusion process in the latent space; T-DiffRec assigns different weights to a user's interactions according to their timestamps to encode temporal information. Extensive experiments on three datasets, together with further analyses, verify the superiority of DiffRec and its two variants.

Motivation

Generative recommender models (GANs, VAEs) typically assume that user-item interactions (e.g., clicks) are determined by some latent factors (e.g., user preferences). This assumption matches how interactions are generated in the real world, which is why such models have achieved notable success. Current generative recommenders fall into two main categories:

  1. GAN-based models use a generator to estimate users' interaction probabilities and optimize the model parameters with adversarial training; however, adversarial training is usually unstable, making it hard for these models to achieve satisfactory performance.
  2. VAE-based models use an encoder to approximate the posterior distribution over the latent factors and maximize the likelihood of the observed interactions, as shown in Figure 1(a). Although VAEs often outperform GANs in recommendation, they have to trade off the tractability of the posterior distribution against the representational capacity of the model.

Figure 1. Architectures of VAE, DiffRec, and L-DiffRec, and the objective of recommender systems

As shown in Figure 1(b), a diffusion model gradually corrupts an image by adding Gaussian noise in the forward process and then reconstructs the information step by step through denoising in the reverse process. The forward process keeps the posterior tractable while still allowing a neural network to model complex distributions step by step, which alleviates the trade-off faced by VAEs. Moreover, the goal of a recommender system aligns well with diffusion models: a recommender essentially infers future interaction probabilities from noisy historical interactions (e.g., false negatives and false positives), as shown in Figure 1(c). Diffusion models therefore have great potential in recommendation, since their strong expressiveness allows them to model the complex interaction-generation process more accurately.
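For concreteness, the forward corruption and reverse denoising referred to above can be written in the standard diffusion notation (a minimal sketch using the usual DDPM formulation, with $\mathbf{x}_0$ denoting a user's interaction vector rather than an image; the notation here is ours, not necessarily the paper's):

$$
q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\ \beta_t \mathbf{I}\right), \qquad
p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}\!\left(\mathbf{x}_{t-1};\ \mu_\theta(\mathbf{x}_t, t),\ \Sigma_\theta(\mathbf{x}_t, t)\right),
$$

where $\beta_t$ controls how much Gaussian noise is added at step $t$, and $\mu_\theta$, $\Sigma_\theta$ are produced by a neural network trained to reverse the corruption.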

Method

DiffRec

Figure 2. Overview of DiffRec: the bars denote a user's interaction probabilities over all items

As shown in Figure 2, DiffRec consists of two parts. Given a user's historical interactions, (1) the forward process gradually corrupts the interactions by adding Gaussian noise, and (2) the reverse process gradually denoises them and recovers the original interactions. By learning this denoising process step by step, DiffRec can model the complex interaction-generation process while mitigating the effect of real-world noise. The pseudocode for DiffRec training and inference is given in Figure 4.

Moreover, unlike in image generation, to preserve personalized user information we do not corrupt the interactions into pure noise during training, and we reduce the noise added in the forward process at both training and inference time. This is similar to how MultiVAE [1] uses $\beta$ to control the strength of the prior constraint.

Figure 4. Pseudocode for DiffRec training and inference
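To make the reduced-noise forward process concrete, the following is a minimal PyTorch-style sketch of one training step on a user's multi-hot interaction vector. The function names, the loss, and the noise schedule are illustrative assumptions for exposition, not the authors' implementation; the authoritative procedure is the pseudocode in Figure 4.

```python
import torch
import torch.nn.functional as F

def diffrec_training_step(denoiser, x0, T=5, noise_scale=1e-4):
    """One sketched training step: lightly corrupt the interaction vector x0,
    then train the denoiser to recover it.

    denoiser:    a network taking (x_t, t) and predicting x0 (assumed interface)
    x0:          (batch, n_items) multi-hot user interaction vectors
    T:           number of diffusion steps (kept small)
    noise_scale: keeps the added noise small, so x_T never becomes pure noise
    """
    # A small linear schedule; the key point is that beta stays tiny.
    betas = torch.linspace(noise_scale, 10 * noise_scale, T)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)            # \bar{alpha}_t

    t = torch.randint(0, T, (x0.shape[0],))                   # random step per user
    a_bar = alphas_bar[t].unsqueeze(1)                        # (batch, 1)

    eps = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps      # forward corruption

    x0_hat = denoiser(x_t, t)                                 # reverse-process prediction
    loss = F.mse_loss(x0_hat, x0)                             # reconstruction-style loss
    loss.backward()
    return loss
```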

L-DiffRec

Generative models usually have to predict a user's interaction probabilities over all items at once, and the resulting computational cost limits their application in industry. To reduce this cost, we propose L-DiffRec, a variant of DiffRec. As shown in Figure 5, L-DiffRec first clusters items with k-means over item representations (trained with LightGCN), splits the interaction history according to the clusters, and compresses each part with its own encoder. The forward and reverse diffusion processes are then carried out in the latent space, and multiple decoders map the result back to the original dimensionality for ranking and recommendation.

Figure 5. Architecture of L-DiffRec
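The compression step described above can be sketched as follows, assuming precomputed LightGCN item embeddings are available; the cluster count, the plain linear encoders and decoders, and all function names are illustrative choices rather than the paper's exact design.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def build_latent_codec(item_emb, n_clusters=3, latent_dim=64):
    """Cluster items by their embeddings and build one encoder/decoder pair
    per cluster, so the diffusion process can run on a much smaller latent vector."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(item_emb)
    groups = [torch.as_tensor(np.where(labels == c)[0]) for c in range(n_clusters)]
    encoders = nn.ModuleList(nn.Linear(len(g), latent_dim) for g in groups)
    decoders = nn.ModuleList(nn.Linear(latent_dim, len(g)) for g in groups)
    return groups, encoders, decoders

def encode(x, groups, encoders):
    # Split the (batch, n_items) interaction vector by cluster and compress each part.
    parts = [enc(x[:, idx]) for idx, enc in zip(groups, encoders)]
    return torch.cat(parts, dim=1)       # latent vector on which diffusion is run

def decode(z, groups, decoders, n_items, latent_dim=64):
    # Map the denoised latent back to per-item scores for ranking and recommendation.
    scores = torch.zeros(z.shape[0], n_items)
    for k, (idx, dec) in enumerate(zip(groups, decoders)):
        scores[:, idx] = dec(z[:, k * latent_dim:(k + 1) * latent_dim])
    return scores
```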

T-DiffRec

Since user preferences may change over time, it is important to incorporate temporal information into the recommender. We assume that a user's most recent interactions better reflect their current preferences, so we assign different weights to interactions according to their timestamps to encode temporal information. This strategy can be applied to DiffRec and L-DiffRec, yielding T-DiffRec and LT-DiffRec, respectively.
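A minimal sketch of this reweighting, assuming a simple linear interpolation between a lower and an upper weight by interaction recency (the exact weighting scheme and bounds here are illustrative assumptions):

```python
import torch

def time_aware_weights(n_interactions, w_min=0.1, w_max=1.0):
    """Linearly increasing weights for a user's interactions,
    ordered from oldest to most recent."""
    if n_interactions == 1:
        return torch.tensor([w_max])
    return torch.linspace(w_min, w_max, n_interactions)

def weighted_interaction_vector(item_ids, n_items):
    # item_ids are assumed to be sorted by interaction time (oldest first);
    # the weighted multi-hot vector replaces the 0/1 vector fed to DiffRec.
    x = torch.zeros(n_items)
    x[item_ids] = time_aware_weights(len(item_ids))
    return x
```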

Experiments

We conduct experiments on three public datasets (Amazon-book, Yelp, ML-1M) under different settings to verify the superiority of DiffRec.

DiffRec

The results show that most generative models outperform MF and LightGCN, and that DiffRec outperforms all baselines on all three datasets. We further analyze DiffRec beyond the main experiments; the results confirm the earlier points about preserving personalization and about the model's prediction target.

L-DiffRec

To assess L-DiffRec in terms of recommendation quality and computational savings, we compare it with MultiVAE, the best-performing baseline in the main experiments. The results show that L-DiffRec achieves performance comparable to DiffRec while requiring far fewer computational resources.

T-DiffRec

We compare T-DiffRec and LT-DiffRec with ACVAE [2], a state-of-the-art sequential recommendation model. The results show that T-DiffRec models temporal information effectively; although it has more parameters, its GPU memory consumption is much lower than that of ACVAE.

Conclusion

In this work, we propose a new generative recommendation paradigm based on diffusion models, the Diffusion Recommender Model (DiffRec), together with two variants, L-DiffRec and T-DiffRec, that address practical challenges of real-world recommender systems. Experiments on three datasets verify the superiority of DiffRec and its variants. This work opens up a new research direction for generative recommendation, with much left to explore: (1) better dimension-compression and temporal-modeling strategies for L-DiffRec and T-DiffRec; (2) controllable recommendation based on DiffRec; (3) alternative prior assumptions (e.g., noise distributions other than Gaussian) and different model architectures.

References

[1] Xiaopeng Li and James She. 2017. Collaborative variational autoencoder for recommender systems. In KDD. ACM, 305–314.

[2] Zhe Xie, Chengxuan Liu, Yichi Zhang, Hongtao Lu, Dong Wang, and Yue Ding. 2021. Adversarial and contrastive variational autoencoder for sequential recommendation. In WWW. ACM, 449–459.


The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at //github.com/BayesWatch/nas-without-training.
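As a rough illustration of the kind of training-free measure described above, the sketch below scores an untrained network by the log-determinant of a kernel built from the binary ReLU activation patterns of one minibatch. This is our reading of the idea under stated assumptions; the linked repository contains the authors' exact definition.

```python
import torch

def activation_overlap_score(model, minibatch):
    """Score an untrained network by how distinct the binary ReLU activation
    patterns of different datapoints are (higher is assumed to be better)."""
    codes = []

    def hook(_module, _inputs, output):
        codes.append((output.detach().flatten(1) > 0).float())

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, torch.nn.ReLU)]
    with torch.no_grad():
        model(minibatch)
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1)                   # (batch, total ReLU units)
    # Kernel entry (i, j) = number of units on which datapoints i and j agree.
    k = c @ c.t() + (1.0 - c) @ (1.0 - c).t()
    return torch.slogdet(k).logabsdet.item()
```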

Graph convolutional networks (GCNs) have been widely applied to traffic demand prediction because of their ability to capture non-Euclidean spatial dependencies between stations or regions. In most existing studies, however, graph convolution is performed on an adjacency matrix built from prior knowledge, which neither accurately reflects the true spatial relations between stations nor adaptively captures the multi-level spatial dependencies of demand. To address these issues, this paper proposes a novel graph convolutional network for traffic demand prediction. First, it introduces a new graph convolution architecture in which each layer has its own adjacency matrix and all adjacency matrices are learned during training. Second, it proposes a layer-wise coupling mechanism that ties the adjacency matrix of an upper layer to that of the layer below, which also reduces the number of model parameters. Finally, an end-to-end network is built that integrates the hidden spatial states with gated recurrent units to produce the final prediction, capturing multi-level spatial correlations and temporal dynamics simultaneously. The proposed model is evaluated on two real-world datasets, NYC Citi Bike and NYC Taxi, and the results demonstrate its superior performance.

//www.zhuanzhi.ai/paper/3996bc72f87617093a55530269f6fdd8
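A hedged sketch of the core idea, a layer-wise learnable adjacency with a coupling between consecutive layers, is shown below; the coupling function, normalisation, and dimensions are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class CoupledLayerwiseGCN(nn.Module):
    """Each layer uses its own adjacency; only the first one is a free
    parameter, and upper-layer adjacencies are derived from the layer below
    through a learnable coupling, which reduces the parameter count."""
    def __init__(self, n_nodes, in_dim, hidden_dim, n_layers=2):
        super().__init__()
        self.adj0 = nn.Parameter(torch.randn(n_nodes, n_nodes) * 0.01)
        self.couplings = nn.ModuleList(
            nn.Linear(n_nodes, n_nodes, bias=False) for _ in range(n_layers - 1))
        dims = [in_dim] + [hidden_dim] * n_layers
        self.weights = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(n_layers))

    def forward(self, x):
        # x: (batch, n_nodes, in_dim) node features at one time step.
        adj = torch.softmax(self.adj0, dim=-1)          # row-normalised adjacency
        for i, lin in enumerate(self.weights):
            x = torch.relu(adj @ lin(x))                # graph convolution at layer i
            if i < len(self.couplings):
                adj = torch.softmax(self.couplings[i](adj), dim=-1)
        return x   # hidden spatial states, e.g. to be fed into a GRU over time
```

In the full model, such hidden states would be produced per time step and passed through gated recurrent units to capture temporal dynamics.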


Attention networks in multimodal learning provide an efficient way to utilize given visual information selectively. However, the computational cost of learning attention distributions for every pair of multimodal input channels is prohibitively expensive. To address this, co-attention builds two separate attention distributions, one for each modality, neglecting the interaction between multimodal inputs. In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly. BAN considers bilinear interactions among two groups of input channels, while low-rank bilinear pooling extracts the joint representations for each pair of channels. Furthermore, we propose a variant of multimodal residual networks to exploit the eight attention maps of BAN efficiently. We quantitatively and qualitatively evaluate our model on the visual question answering (VQA 2.0) and Flickr30k Entities datasets, showing that BAN significantly outperforms previous methods and achieves new state-of-the-art results on both datasets.
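As a pointer to the joint-representation mechanism mentioned above, here is a minimal sketch of low-rank bilinear pooling for a single pair of channels; the dimensions and the choice of non-linearity are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LowRankBilinearPooling(nn.Module):
    """Joint representation of two input vectors via an element-wise product
    in a shared low-rank space, instead of a full bilinear tensor."""
    def __init__(self, x_dim, y_dim, rank, out_dim):
        super().__init__()
        self.U = nn.Linear(x_dim, rank, bias=False)
        self.V = nn.Linear(y_dim, rank, bias=False)
        self.P = nn.Linear(rank, out_dim, bias=False)

    def forward(self, x, y):
        # x: (batch, x_dim), e.g. a question feature;
        # y: (batch, y_dim), e.g. an image-region feature.
        return self.P(torch.tanh(self.U(x)) * torch.tanh(self.V(y)))
```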
