
Real-world content recommendation marketplaces exhibit certain behaviors and are subject to constraints that are not always apparent in common static offline data sets. One example that is common in ad marketplaces is swift ad turnover: new ads are introduced and old ads disappear at high rates every day. Another example is ad discontinuity, where existing ads may appear and disappear from the market for non-negligible amounts of time for a variety of reasons (e.g., depletion of budget, pausing by the advertiser, or flagging by the system). These behaviors sometimes cause the model's loss surface to change dramatically over short periods of time. To address these behaviors, fresh models are highly important, and to achieve this (and for several other reasons) incremental training on small chunks of past events is often employed. These behaviors and algorithmic optimizations occasionally cause model parameters to grow uncontrollably large, or \emph{diverge}. In this work, we present a systematic method to prevent model parameters from diverging by imposing a carefully chosen set of constraints on the model's latent vectors. We then devise a method inspired by primal-dual optimization algorithms to fulfill these constraints in a manner that both aligns well with incremental model training and does not require any major modifications to the underlying model training algorithm. We analyze, demonstrate, and motivate our method on OFFSET, a collaborative filtering algorithm which drives Yahoo native advertising, one of VZM's largest and fastest growing businesses, reaching a run-rate of many hundreds of millions USD per year. Finally, we conduct an online experiment which shows a substantial reduction in the number of diverging instances, and a significant improvement to both user experience and revenue.
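As a rough illustration of the primal-dual idea mentioned in the abstract, the sketch below keeps a single latent vector bounded by a norm constraint. It is not the OFFSET algorithm or the authors' constraint set: the linear toy loss, the function name `constrained_ascent`, the radius, and the step sizes are all assumptions chosen so that the unconstrained problem would diverge.

```python
import math

def constrained_ascent(c, radius=1.0, lr=0.05, dual_lr=0.05, steps=2000):
    """Minimize -c.v subject to ||v||^2 <= radius^2 with primal-dual updates.

    Hypothetical toy problem: without the constraint, v would grow without bound.
    """
    v = [0.0] * len(c)
    lam = 0.0  # Lagrange multiplier for the norm constraint
    for _ in range(steps):
        # Constraint violation measured at the current iterate
        violation = sum(x * x for x in v) - radius ** 2
        # Primal step: gradient of -c.v + lam * (||v||^2 - radius^2) w.r.t. v
        grad = [-ci + 2.0 * lam * vi for ci, vi in zip(c, v)]
        v = [vi - lr * gi for vi, gi in zip(v, grad)]
        # Dual step: raise lam while the constraint is violated, keep lam >= 0
        lam = max(0.0, lam + dual_lr * violation)
    return v, lam

v, lam = constrained_ascent([3.0, 4.0])
norm = math.sqrt(sum(x * x for x in v))
```

In this toy setting the primal update is an ordinary gradient step with one extra term, which is why a scheme of this kind can sit on top of an existing incremental trainer; the multiplier only grows while the vector's norm exceeds the bound.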

Related content


Extensibility · Networking · INFORMS · Feature extractor · Weight

January 19, 2022

Privacy-Aware Human Mobility Prediction via Adversarial Networks

Yuting Zhan,Alex Kyllo,Afra Mashhadi,Hamed Haddadi

As mobile devices and location-based services are increasingly deployed in smart city scenarios and applications, many unexpected privacy leakages have arisen from geolocated data collection and sharing. While these geolocated data could provide a rich understanding of human mobility patterns and address various societal research questions, privacy concerns about users' sensitive information have limited their utilization. In this paper, we design and implement a novel LSTM-based adversarial mechanism with representation learning to attain a privacy-preserving feature representation of the original geolocated data (mobility data) for sharing purposes. We quantify the utility-privacy trade-off of mobility datasets in terms of trajectory reconstruction risk, user re-identification risk, and mobility predictability. Our proposed architecture reports a Pareto frontier analysis that enables the user to assess this trade-off as a function of the Lagrangian loss weight parameters. Extensive comparison results on four representative mobility datasets demonstrate the superiority of our proposed architecture and the efficiency of the proposed privacy-preserving feature extractor. Our results show that by exploring the Pareto-optimal setting, we can simultaneously increase both privacy (45%) and utility (32%).

Processing (programming language) · MoDELS · Performer · Neural Networks · Networking

January 19, 2022

Bayesian Neural Hawkes Process for Event Uncertainty Prediction

Manisha Dubey,Ragja Palakkadavath,P. K. Srijith

from arxiv, 13 pages, 6 tables, 4 plots

Event data consisting of the times of occurrence of events arise in several real-world applications. Recent works have introduced neural network based point processes for modeling event times, and these were shown to provide state-of-the-art performance in predicting event times. However, neural point process models lack a good uncertainty quantification capability for their predictions. Proper uncertainty quantification in event modeling will help in better decision making for many practical applications. Therefore, we propose a novel point process model, the Bayesian Neural Hawkes process (BNHP), which leverages the uncertainty modelling capability of Bayesian models and the generalization capability of neural networks to model event occurrence times. We augment the model with spatio-temporal modeling capability so that it can consider uncertainty over the predicted time and location of events. Experiments on simulated and real-world datasets show that BNHP significantly improves prediction performance and uncertainty quantification for modelling events.
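For orientation only, the sketch below evaluates the conditional intensity of a classical Hawkes process with an exponential kernel, the parametric baseline that neural point process models generalize. It is not the BNHP model from the abstract; the function name `hawkes_intensity` and the parameters `mu`, `alpha`, and `beta` are illustrative assumptions.

```python
import math

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Classical Hawkes intensity: mu + sum_i alpha * exp(-beta * (t - t_i)).

    Only events strictly before time t contribute to the sum.
    mu, alpha, and beta are hypothetical example values.
    """
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)
    return mu + excitation

baseline = hawkes_intensity(1.0, [])        # no past events: baseline rate only
excited = hawkes_intensity(1.0, [0.0, 0.5])  # past events raise the rate
```

Each past event raises the rate of future events by an amount that decays over time, so the intensity never drops below the baseline `mu`.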

Identifiable · Comprehensibility · Cluster · Networking · Performer

January 18, 2022

An Empirical Investigation of Worker Communities in TopCoder

Razieh Saremi,Hamid Shamszare,Marzieh Lotfalian Saremi,Ye Yang

from arxiv, 10 pages, 7 figures, 4 tables

Software crowdsourcing platforms employ extrinsic rewards such as rating or ranking systems to motivate workers. Such rating systems are noisy and provide limited knowledge about workers' preferences and performance. To develop a better understanding of worker reliability and trustworthiness in software crowdsourcing, this paper reports an empirical study conducted on more than one year's real-world data from TopCoder, one of the leading software crowdsourcing platforms. First, we create a bipartite network of active workers based on common task registrations. Then, we use the Clauset-Newman-Moore graph clustering algorithm to identify worker clusters in the network. Finally, we conduct an empirical evaluation to measure and analyze workers' behavior per identified community in the platform by workers' rating. More specifically, workers' behavior is analyzed based on their performance in terms of reliability, trustworthiness, and success; their preferences in terms of efficiency and elasticity; and their strategies in terms of comfort, confidence, and deceitfulness. The study identifies four communities of active workers: mixed-ranked, high-ranked, mid-ranked, and low-ranked. It shows that the low-ranked community is associated with the most reliable workers, with an average reliability of 25%, while the mixed-ranked community contains the most trustworthy workers, with an average trustworthiness of 16%. Such empirical evidence helps in exploring resourcing options while understanding the relations among unknown resources to improve task success.

Risk function · Optimizer · CASE · Functional · Loss function (machine learning)

January 16, 2022

Faster Rates of Private Stochastic Convex Optimization

Jinyan Su,Lijie Hu,Di Wang

from arxiv, To appear in The 33rd International Conference on Algorithmic Learning Theory. In this version, we fixed some typos and corrected the proof of the lower bound

In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) and provide excess population risks for some special classes of functions that are faster than the previous results for general convex and strongly convex functions. In the first part of the paper, we study the case where the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$. Specifically, we first show that under some mild assumptions on the loss functions, there is an algorithm whose output achieves an upper bound of $\tilde{O}((\frac{1}{\sqrt{n}}+\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $(\epsilon, \delta)$-DP when $\theta\geq 2$, where $n$ is the sample size and $d$ is the dimension of the space. Then we address the inefficiency issue, improve the upper bounds by $\text{Poly}(\log n)$ factors, and extend to the case where $\theta\geq \bar{\theta}>1$ for some known $\bar{\theta}$. Next we show that the excess population risk of population functions satisfying TNC with parameter $\theta\geq 2$ is always lower bounded by $\Omega((\frac{d}{n\epsilon})^\frac{\theta}{\theta-1})$ and $\Omega((\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $\epsilon$-DP and $(\epsilon, \delta)$-DP, respectively. In the second part, we focus on a special case where the population risk function is strongly convex. Unlike previous studies, here we assume the loss function is {\em non-negative} and {\em the optimal value of population risk is sufficiently small}. With these additional assumptions, we propose a new method whose output achieves an upper bound of $O(\frac{d\log\frac{1}{\delta}}{n^2\epsilon^2}+\frac{1}{n^{\tau}})$ for any $\tau\geq 1$ in the $(\epsilon,\delta)$-DP model if the sample size $n$ is sufficiently large.

Stochastic gradient descent · Estimation/Estimator · Denoising · MoDELS · Image denoising

January 16, 2022

On Maximum-a-Posteriori estimation with Plug & Play priors and stochastic gradient descent

Rémi Laumont,Valentin de Bortoli,Andrés Almansa,Julie Delon,Alain Durmus,Marcelo Pereyra

Bayesian methods to solve imaging inverse problems usually combine an explicit data likelihood function with a prior distribution that explicitly models expected properties of the solution. Many kinds of priors have been explored in the literature, from simple ones expressing local properties to more involved ones exploiting image redundancy at a non-local scale. In a departure from explicit modelling, several recent works have proposed and studied the use of implicit priors defined by an image denoising algorithm. This approach, commonly known as Plug & Play (PnP) regularisation, can deliver remarkably accurate results, particularly when combined with state-of-the-art denoisers based on convolutional neural networks. However, the theoretical analysis of PnP Bayesian models and algorithms is difficult, and works on the topic often rely on unrealistic assumptions about the properties of the image denoiser. This paper studies maximum-a-posteriori (MAP) estimation for Bayesian models with PnP priors. We first consider questions related to existence, stability and well-posedness, and then present a convergence proof for MAP computation by PnP stochastic gradient descent (PnP-SGD) under realistic assumptions on the denoiser used. We report a range of imaging experiments demonstrating PnP-SGD as well as comparisons with other PnP schemes.

Extensibility · Mutually independent · Performance · Reviewer · Optimizer

July 1, 2021

AdaXpert: Adapting Neural Architecture for Growing Data

Shuaicheng Niu,Jiaxiang Wu,Guanghui Xu,Yifan Zhang,Yong Guo,Peilin Zhao,Peng Wang,Mingkui Tan

from arxiv, accepted by ICML 2021

In real-world applications, data often arrive in a growing manner, where the data volume and the number of classes may increase dynamically. This poses a critical challenge for learning: given the increasing data volume or number of classes, one has to promptly adjust the neural model capacity to obtain promising performance. Existing methods either ignore the growing nature of data or independently search for an optimal architecture for a given dataset, and thus are incapable of promptly adjusting the architectures for the changed data. To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data. Specifically, we introduce an architecture adjuster to generate a suitable architecture for each data snapshot, based on the previous architecture and the extent of the difference between the current and previous data distributions. Furthermore, we propose an adaptation condition to determine the necessity of adjustment, thereby avoiding unnecessary and time-consuming adjustments. Extensive experiments on two growth scenarios (increasing data volume and number of classes) demonstrate the effectiveness of the proposed method.

Decomposed · Mutually independent · Variational autoencoder · MoDELS · Representation learning

March 23, 2021

CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models

Mengyue Yang,Furui Liu,Zhitang Chen,Xinwei Shen,Jianye Hao,Jun Wang

Learning disentanglement aims at finding a low-dimensional representation which consists of multiple explanatory and generative factors of the observational data. The framework of the variational autoencoder (VAE) is commonly used to disentangle independent factors from observations. However, in real scenarios, factors with semantics are not necessarily independent. Instead, there might be an underlying causal structure which renders these factors dependent. We thus propose a new VAE-based framework named CausalVAE, which includes a Causal Layer to transform independent exogenous factors into causal endogenous ones that correspond to causally related concepts in data. We further analyze the model identifiability, showing that the proposed model learned from observations recovers the true one up to a certain degree. Experiments are conducted on various datasets, including synthetic data and the real-world benchmark CelebA. Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy. Furthermore, we demonstrate that the proposed CausalVAE model is able to generate counterfactual data through the "do-operation" on the causal factors.

Graph · Networking · INTERACT · INFORMS · Graphics processing unit

November 25, 2020

Time-Series Event Prediction with Evolutionary State Graph

Wenjie Hu,Yang Yang,Ziqiang Cheng,Carl Yang,Xiang Ren

from arxiv, A long version of EvoNet (WSDM 2021)

The accurate and interpretable prediction of future events in time-series data often requires capturing the representative patterns (also referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) over time. We analyze the dynamic graphs constructed from the time-series data and show that changes in the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, EvoNet models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.

Performer · Graphics processing unit · Graph · Neural Networks · Extensibility

October 29, 2020

Scalable Graph Neural Networks via Bidirectional Propagation

Ming Chen,Zhewei Wei,Bolin Ding,Yaliang Li,Ye Yuan,Xiaoyong Du,Ji-Rong Wen

from arxiv, NeurIPS 2020

Graph Neural Networks (GNNs) are an emerging field for learning on non-Euclidean data. Recently, there has been increased interest in designing GNNs that scale to large graphs. Most existing methods use "graph sampling" or "layer-wise sampling" techniques to reduce training time. However, these methods still suffer from degrading performance and scalability problems when applied to graphs with billions of edges. This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes. Theoretical analysis shows that GBP is the first method that achieves sub-linear time complexity for both the precomputation and the training phases. An extensive empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time. Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.

Graph · Better · MoDELS · INTERACT · Networking

May 16, 2018

Constructing Narrative Event Evolutionary Graph for Script Event Prediction

Zhongyang Li,Xiao Ding,Ting Liu

from arxiv, This paper has been accepted by IJCAI 2018

Script event prediction requires a model to predict the subsequent event given an existing event context. Previous models based on event pairs or event chains cannot make full use of dense event connections, which may limit their capability of event prediction. To remedy this, we propose constructing an event graph to better utilize the event network information for script event prediction. In particular, we first extract narrative event chains from a large news corpus, and then construct a narrative event evolutionary graph (NEEG) based on the extracted chains. NEEG can be seen as a knowledge base that describes event evolutionary principles and patterns. To solve the inference problem on NEEG, we present a scaled graph neural network (SGNN) to model event interactions and learn better event representations. Instead of computing the representations on the whole graph, SGNN processes only the concerned nodes each time, which makes our model feasible for large-scale graphs. By comparing the similarity between input context event representations and candidate event representations, we can choose the most reasonable subsequent event. Experimental results on the widely used New York Times corpus demonstrate that our model significantly outperforms state-of-the-art baseline methods under the standard multiple-choice narrative cloze evaluation.