With modern IVIS becoming more capable and complex than ever, their evaluation becomes increasingly difficult. The analysis of large amounts of user behavior data can help to cope with this complexity and can support UX experts in designing IVIS that serve customer needs and are safe to operate while driving. We, therefore, propose a Multi-level User Behavior Visualization Framework providing effective visualizations of user behavior data that is collected via telematics from production vehicles. Our approach visualizes user behavior data on three different levels: (1) The Task Level View aggregates event sequence data generated through touchscreen interactions to visualize user flows. (2) The Flow Level View allows comparing the individual flows based on a chosen metric. (3) The Sequence Level View provides detailed insights into touch interactions, glance, and driving behavior. Our case study proves that UX experts consider our approach a useful addition to their design process.
Search and recommendation are the two most common approaches used by people to obtain information. They share the same goal -- satisfying the user's information need at the right time. There are already a lot of Internet platforms and Apps providing both search and recommendation services, showing us the demand and opportunity to simultaneously handle both tasks. However, most platforms consider these two tasks independently -- they tend to train separate search model and recommendation model, without exploiting the relatedness and dependency between them. In this paper, we argue that jointly modeling these two tasks will benefit both of them and finally improve overall user satisfaction. We investigate the interactions between these two tasks in the specific information content service domain. We propose first integrating the user's behaviors in search and recommendation into a heterogeneous behavior sequence, then utilizing a joint model for handling both tasks based on the unified sequence. More specifically, we design the Unified Information Search and Recommendation model (USER), which mines user interests from the integrated sequence and accomplish the two tasks in a unified way.
The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations. Three types of qualitative features are observed in the training loss curve: fast initial convergence, oscillations, and large spikes in the late phase. The sign gradient descent (signGD) flow, which is the limit of Adam when taking the learning rate to 0 while keeping the momentum parameters fixed, is used to explain the fast initial convergence. For the late phase of Adam, three different types of qualitative patterns are observed depending on the choice of the hyper-parameters: oscillations, spikes, and divergence. In particular, Adam converges much smoother and even faster when the values of the two momentum factors are close to each other. This observation is particularly important for scientific computing tasks, for which the training process usually proceeds into the high precision regime.
Context information in search sessions has proven to be useful for capturing user search intent. Existing studies explored user behavior sequences in sessions in different ways to enhance query suggestion or document ranking. However, a user behavior sequence has often been viewed as a definite and exact signal reflecting a user's behavior. In reality, it is highly variable: user's queries for the same intent can vary, and different documents can be clicked. To learn a more robust representation of the user behavior sequence, we propose a method based on contrastive learning, which takes into account the possible variations in user's behavior sequences. Specifically, we propose three data augmentation strategies to generate similar variants of user behavior sequences and contrast them with other sequences. In so doing, the model is forced to be more robust regarding the possible variations. The optimized sequence representation is incorporated into document ranking. Experiments on two real query log datasets show that our proposed model outperforms the state-of-the-art methods significantly, which demonstrates the effectiveness of our method for context-aware document ranking.
Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture the user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite being extensively studied, these attentions still suffer from two limitations. First, conventional attentions mostly limit the attention field only to a single user's behaviors, which is not suitable in e-commerce where users often hunt for new demands that are irrelevant to any historical behaviors. Second, these attentions are usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle the two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that considers the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a priori, KFAtt resorts to global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both benchmark and a 10 billion scale real production dataset, together with an Online A/B test, show that KFAtt outperforms all compared state-of-the-arts. KFAtt has been deployed in the ranking system of a leading e commerce website, serving the main traffic of hundreds of millions of active users everyday.
Contextual reasoning is essential to understand events in long untrimmed videos. In this work, we systematically explore different captioning models with various contexts for the dense-captioning events in video task, which aims to generate captions for different events in the untrimmed video. We propose five types of contexts as well as two categories of event captioning models, and evaluate their contributions for event captioning from both accuracy and diversity aspects. The proposed captioning models are plugged into our pipeline system for the dense video captioning challenge. The overall system achieves the state-of-the-art performance on the dense-captioning events in video task with 9.91 METEOR score on the challenge testing set.
Most existing recommender systems leverage user behavior data of one type only, such as the purchase behavior in E-commerce that is directly related to the business KPI (Key Performance Indicator) of conversion rate. Besides the key behavioral data, we argue that other forms of user behaviors also provide valuable signal, such as views, clicks, adding a product to shop carts and so on. They should be taken into account properly to provide quality recommendation for users. In this work, we contribute a new solution named NMTR (short for Neural Multi-Task Recommendation) for learning recommender systems from user multi-behavior data. We develop a neural network model to capture the complicated and multi-type interactions between users and items. In particular, our model accounts for the cascading relationship among different types of behaviors (e.g., a user must click on a product before purchasing it). To fully exploit the signal in the data of multiple types of behaviors, we perform a joint optimization based on the multi-task learning framework, where the optimization on a behavior is treated as a task. Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data. Further analysis shows that modeling multiple behaviors is particularly useful for providing recommendation for sparse users that have very few interactions.
Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.
In this proposal we present the idea of a "macro recommender system", and "micro recommender system". Both systems can be considered as a recommender system for recommendation algorithms. A macro recommender system recommends the best performing recommendation algorithm to an organization that wants to build a recommender system. This way, an organization does not need to test many algorithms over long periods to find the best one for their particular platform. A micro recommender system recommends the best performing recommendation algorithm for each individual recommendation request. This proposal is based on the premise that there is no single-best algorithm for all users, items, and contexts. For instance, a micro recommender system might recommend one algorithm when recommendations for an elderly male user in the evening should be created. When recommendations for a young female user in the morning should be given, the micro recommender system might recommend a different algorithm.
Many recommendation algorithms rely on user data to generate recommendations. However, these recommendations also affect the data obtained from future users. This work aims to understand the effects of this dynamic interaction. We propose a simple model where users with heterogeneous preferences arrive over time. Based on this model, we prove that naive estimators, i.e. those which ignore this feedback loop, are not consistent. We show that consistent estimators are efficient in the presence of myopic agents. Our results are validated using extensive simulations.
Social Media users tend to mention entities when reacting to news events. The main purpose of this work is to create entity-centric aggregations of tweets on a daily basis. By applying topic modeling and sentiment analysis, we create data visualization insights about current events and people reactions to those events from an entity-centric perspective.