In search systems, the multi-stage cascade architecture is prevalent, typically consisting of sequential modules such as matching, pre-ranking, and ranking. It is generally acknowledged that the model used in the pre-ranking stage must strike a balance between efficacy and efficiency. Thus, the most commonly employed architecture is the representation-focused, vector-product-based model. However, this architecture lacks effective interaction between the query and the document, which reduces the effectiveness of the search system. To address this issue, we present a novel pre-ranking framework called RankDFM. Our framework leverages DeepFM as the backbone and employs a pairwise training paradigm to learn the ranking of videos under a query. The capability of RankDFM to cross features provides significant improvement in offline and online A/B testing performance. Furthermore, we introduce a learnable feature selection scheme to optimize the model and reduce the time required for online inference to a level equivalent to a tree model. Currently, RankDFM has been deployed in the search system of a short-video app, providing daily services to hundreds of millions of users.
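As a rough illustration of the two ideas named in the abstract, feature crossing and pairwise training, the sketch below computes the second-order factorization-machine term that DeepFM builds on and a RankNet-style pairwise logistic loss. It is not the authors' implementation: the deep component of DeepFM is omitted, and the names `fm_score`, `pairwise_logistic_loss`, `w0`, `w`, and `V`, as well as the exact loss form, are assumptions made for illustration.

```python
import math


def fm_score(x, w0, w, V):
    """Factorization-machine score: bias + linear term + pairwise feature crosses.

    x  : list of n feature values
    w0 : scalar bias (hypothetical parameter name)
    w  : list of n linear weights
    V  : n x k matrix of latent factors, one row per feature
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    cross = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #         = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        cross += 0.5 * (s * s - s_sq)
    return w0 + linear + cross


def pairwise_logistic_loss(score_pos, score_neg):
    """Pairwise loss: small when the preferred item outscores the other one."""
    return math.log1p(math.exp(-(score_pos - score_neg)))
```

In a pairwise setup of this kind, each training example would be a pair of documents for the same query, and the loss pushes the score of the preferred document above the other.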

Related content

Sorting

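Sorting is an operation performed frequently in computing; its purpose is to rearrange an "unordered" sequence of records into an "ordered" one. It is divided into internal sorting and external sorting. If the entire sorting process can be completed without accessing external storage, the problem is called internal sorting. Conversely, if the number of records to be sorted is so large that the whole sequence cannot be sorted in memory, the problem is called external sorting. Internal sorting is a process of gradually extending the length of the ordered sequence of records.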
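The distinction can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not a production implementation: the records are integers, each in-memory chunk is sorted internally and written to a temporary file as a sorted run, and the runs are then combined with a k-way merge. The names `external_sort`, `_write_run`, and `chunk_size` are hypothetical.

```python
import heapq
import tempfile


def _write_run(chunk):
    """Sort one chunk in memory (internal sort) and spill it to a temp file."""
    run = tempfile.TemporaryFile(mode="w+")
    run.writelines(f"{value}\n" for value in sorted(chunk))
    run.seek(0)  # rewind so the run can be read back during the merge
    return run


def external_sort(values, chunk_size):
    """Yield integers in ascending order, holding at most chunk_size in memory
    while the sorted runs are being built."""
    runs, chunk = [], []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            runs.append(_write_run(chunk))
            chunk = []
    if chunk:
        runs.append(_write_run(chunk))
    try:
        # Each run is read lazily, line by line; heapq.merge does the k-way merge.
        streams = [(int(line) for line in run) for run in runs]
        yield from heapq.merge(*streams)
    finally:
        for run in runs:
            run.close()
```

For example, `list(external_sort([5, 3, 9, 1, 4, 8, 2], chunk_size=3))` builds three runs on disk and merges them into one sorted sequence.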

INTERACT · Integration · Recommender systems · Loss · Noise

May 25, 2023

Integrating Item Relevance in Training Loss for Sequential Recommender Systems

Andrea Bacciu,Federico Siciliano,Nicola Tonellotto,Fabrizio Silvestri

Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items into account and (ii) introduce a novel relevance-aware loss function to train an SRS with multiple future items to make it more robust to noise. Our relevance-aware models obtain an improvement of ~1.2% of NDCG@10 and 0.88% in the traditional evaluation protocol, while in the new evaluation protocol, the improvement is ~1.63% of NDCG@10 and ~1.5% of HR with respect to the best-performing models.
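For readers unfamiliar with the two metrics quoted above, the following sketch shows one common way to compute HR@k and NDCG@k when several future items count as relevant. It assumes binary relevance (an item is either among the future items or not); the paper's protocol may weight items differently, and the function names are hypothetical.

```python
import math


def hit_rate_at_k(ranked, relevant, k):
    """Return 1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so rank 1 gets 1 / log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

Both functions score a single user; a reported figure such as NDCG@10 would typically be the average over all test users.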

Diversity · End-to-end · Rank · Online · Representation

May 24, 2023

Representation Online Matters: Practical End-to-End Diversification in Search and Recommender Systems

Pedro Silva,Bhawna Juneja,Shloka Desai,Ashudeep Singh,Nadia Fawaz

from arxiv, In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12--15, 2023, Chicago, IL, USA

As the use of online platforms continues to grow across all demographics, users often express a desire to feel represented in the content. To improve representation in search results and recommendations, we introduce end-to-end diversification, ensuring that diverse content flows throughout the various stages of these systems, from retrieval to ranking. We develop, experiment with, and deploy scalable diversification mechanisms in multiple production surfaces on the Pinterest platform, including Search, Related Products, and New User Homefeed, to improve the representation of different skin tones in beauty and fashion content. Diversification in production systems includes three components: identifying requests that will trigger diversification, ensuring diverse content is retrieved from the large content corpus during the retrieval stage, and finally, balancing the diversity-utility trade-off in a self-adjusting manner in the ranking stage. Our approaches, which evolved from using a Strong-OR logical operator to bucketized retrieval at the retrieval stage and from greedy re-rankers to multi-objective optimization using determinantal point processes for the ranking stage, balance diversity and utility while enabling fast iterations and scalable expansion to diversification over multiple dimensions. Our experiments indicate that these approaches significantly improve diversity metrics, with a neutral-to-positive impact on utility metrics and improved user satisfaction, both qualitatively and quantitatively, in production.

Vision · Learning · Representation · Representation learning · Lecture notes

May 24, 2023

Efficient Large-Scale Vision Representation Learning

Eden Dolev,Alaa Awad,Denisa Roberts,Zahra Ebrahimzadeh,Marcin Mejran,Vaibhav Malpani,Mahir Yavuz

In this article, we present our approach to single-modality vision representation learning. Understanding vision representations of product content is vital for recommendations, search, and advertising applications in e-commerce. We detail and contrast techniques used to fine-tune large-scale vision representation learning models efficiently under low-resource settings, including several pretrained backbone architectures from both the convolutional neural network and the vision transformer families. We highlight the challenges for e-commerce applications at scale and the efforts to train, evaluate, and serve visual representations more efficiently. We present ablation studies for several downstream tasks, including our visually similar ad recommendations. We evaluate the offline performance of the derived visual representations in downstream tasks. To this end, we present a novel text-to-image generative offline evaluation method for visually similar recommendation systems. Finally, we include online results from deployed machine learning systems in production at Etsy.

Image registration · Layer · Learning · Inductive bias · Deep learning

May 24, 2023

SITReg: Multi-resolution architecture for symmetric, inverse consistent, and topology preserving image registration using deformation inversion layers

Joel Honkamaa,Pekka Marttinen

Deep-learning-based deformable medical image registration methods have emerged as a strong alternative to classical iterative registration methods. Since image registration is in general an ill-defined problem, the usefulness of the inductive biases of symmetricity, inverse consistency, and topology preservation has been widely accepted by the research community. However, while many deep learning registration methods enforce these properties via loss functions, no prior deep learning registration method fulfills all of these properties by construction. Here, we propose a novel multi-resolution registration architecture which is by construction symmetric, inverse consistent, and topology preserving. We also develop an implicit layer for memory-efficient inversion of the deformation fields. The proposed method achieves state-of-the-art registration accuracy on two datasets.

state-of-the-art · Performer · Statistic · Ground truth · Example

May 24, 2023

RankSEG: A Consistent Ranking-based Framework for Segmentation

Ben Dai,Chunlin Li

from arxiv, 50 pages

Segmentation has emerged as a fundamental field of computer vision and natural language processing, which assigns a label to every pixel/feature to extract regions of interest from an image/text. To evaluate the performance of segmentation, the Dice and IoU metrics are used to measure the degree of overlap between the ground truth and the predicted segmentation. In this paper, we establish a theoretical foundation of segmentation with respect to the Dice/IoU metrics, including the Bayes rule and Dice-/IoU-calibration, analogous to classification-calibration or Fisher consistency in classification. We prove that the existing thresholding-based framework with most operating losses is not consistent with respect to the Dice/IoU metrics, and thus may lead to a suboptimal solution. To address this pitfall, we propose a novel consistent ranking-based framework, namely RankDice/RankIoU, inspired by plug-in rules of the Bayes segmentation rule. Three numerical algorithms with GPU parallel execution are developed to implement the proposed framework in large-scale and high-dimensional segmentation. We study statistical properties of the proposed framework. We show that it is Dice-/IoU-calibrated, and we also provide its excess risk bounds and rate of convergence. The numerical effectiveness of RankDice/mRankDice is demonstrated in various simulated examples and on the Fine-annotated CityScapes, Pascal VOC, and Kvasir-SEG datasets with state-of-the-art deep learning architectures.
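The two overlap metrics mentioned in the abstract can be written down directly. The sketch below computes Dice and IoU for a pair of flattened binary masks; it only illustrates the metrics and is not the RankDice/RankIoU procedure. Returning 1.0 when both masks are empty is a convention assumed here, and the function name is hypothetical.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equal-length sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * intersection / (pred_size + truth_size) if pred_size + truth_size else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou
```

With `pred = [1, 1, 0, 0]` and `truth = [1, 0, 1, 0]`, one pixel overlaps out of two predicted and two true pixels, so Dice is 2·1/(2+2) and IoU is 1/3.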

Rank · MoDELS · AUC · Online · Taobao

May 23, 2023

Rethinking the Role of Pre-ranking in Large-scale E-Commerce Searching System

Zhixuan Zhang,Yuheng Huang,Dan Ou,Sen Li,Longbin Li,Qingwen Liu,Xiaoyi Zeng

from arxiv, 13 pages, 7 figures, submitted to KDD 2023

E-commerce search systems such as Taobao Search, the largest e-commerce search system in China, aim at providing users with the most preferred items (e.g., products). Due to the massive data and limited response time, a typical industrial ranking system consists of three or more modules, including matching, pre-ranking, and ranking. The pre-ranking is widely considered a mini-ranking module, as it needs to rank hundreds of times more items than the ranking under limited latency. Existing research focuses on building a lighter model that imitates the ranking model. As such, the metric of a pre-ranking model follows the ranking model, using Area Under the ROC Curve (AUC) for offline evaluation. However, such a metric is inconsistent with online A/B tests in practice, so engineers have to perform costly online tests to reach a convincing conclusion. In our work, we rethink the role of the pre-ranking. We argue that the primary goal of the pre-ranking stage is to return an optimal unordered set rather than an ordered list of items because it is the ranking that determines the final exposures. Since AUC measures the quality of an ordered item list, it is not suitable for evaluating the quality of the output unordered set. This paper proposes a new evaluation metric called All-Scenario Hitrate (ASH) for pre-ranking. ASH is proven effective in the offline evaluation and consistent with online A/B tests based on numerous experiments in Taobao Search. We also introduce an all-scenario-based multi-objective learning framework (ASMOL), which improves the ASH significantly. Surprisingly, the new pre-ranking model can outperform the ranking model when outputting thousands of items. The phenomenon validates that the pre-ranking stage should not imitate the ranking blindly. With the improvements in ASH consistently translating to online improvement, it makes a 1.2% GMV improvement on Taobao Search.
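The argument that pre-ranking should be judged on an unordered set can be made concrete with a generic set-level hit rate. The sketch below is not the paper's ASH definition, which the abstract does not spell out; it only shows a metric that depends on which items survive the cut, not on their order. The name `set_hit_rate` and its arguments are hypothetical.

```python
def set_hit_rate(scores, positives, k):
    """Fraction of positive items that survive in the top-k set.

    scores    : dict mapping item id -> pre-ranking score
    positives : iterable of item ids the user eventually interacted with
    k         : number of items passed on to the ranking stage
    """
    positives = set(positives)
    if not positives:
        return 0.0
    # Only membership in the top-k set matters; order inside it is ignored.
    top_k = set(sorted(scores, key=scores.get, reverse=True)[:k])
    return len(top_k & positives) / len(positives)
```

Two models that keep the same k items receive the same value even if they order those items differently, which is the property an order-based metric such as AUC lacks.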

massive MIMO · MIMO · Estimation/Estimator · Channel · Performer

May 21, 2023

Pilotless Uplink for Massive MIMO Systems

P Aswathylakshmi,Radha Krishna Ganti

from arxiv, 6 pages, 9 figures, submitted to IEEE Global Communications Conference (Globecom) 2023

Massive MIMO antennas in cellular systems help support a large number of users in the same time-frequency resource and also provide significant array gain for uplink reception. However, channel estimation in such large antenna systems can be tricky, not only since pilot assignment for multiple users is challenging, but also because the pilot overhead especially for rapidly changing channels can diminish the system throughput quite significantly. A pilotless transceiver where the receiver can perform blind demodulation can solve these issues and boost system throughput by eliminating the need for pilots in channel estimation. In this paper, we propose an iterative matrix decomposition algorithm for the blind demodulation of massive MIMO OFDM signals. This new decomposition technique provides estimates of both the user symbols and the user channel in the frequency domain simultaneously (to a scaling factor) without any pilots. Simulation results demonstrate that the lack of pilots does not affect the error performance of the proposed algorithm when compared to maximal-ratio-combining (MRC) with pilot-based channel estimation across a wide range of signal strengths.

Learned · SSL · Taxonomy · Specialization · Unlabeled

March 29, 2022

Self-Supervised Learning for Recommender Systems: A Survey

Junliang Yu,Hongzhi Yin,Xin Xia,Tong Chen,Jundong Li,Zi Huang

from arxiv, 20 pages. Submitted to TKDE

Neural architecture-based recommender systems have achieved tremendous success in recent years. However, when dealing with highly sparse data, they still fall short of expectations. Self-supervised learning (SSL), as an emerging technique to learn with unlabeled data, has recently drawn considerable attention in many fields. There is also a growing body of research on applying SSL to recommendation to mitigate the data sparsity issue. In this survey, a timely and systematic review of the research efforts on self-supervised recommendation (SSR) is presented. Specifically, we propose an exclusive definition of SSR, on top of which we build a comprehensive taxonomy to divide existing SSR methods into four categories: contrastive, generative, predictive, and hybrid. For each category, the narrative unfolds along its concept and formulation, the involved methods, and its pros and cons. Meanwhile, to facilitate the development and evaluation of SSR models, we release an open-source library SELFRec, which incorporates multiple benchmark datasets and evaluation metrics, and has implemented a number of state-of-the-art SSR models for empirical comparison. Finally, we shed light on the limitations in the current research and outline the future research directions.

Rank · Transform · Performer · Processing (programming language) · Sorting

October 13, 2020

Pretrained Transformers for Text Ranking: BERT and Beyond

Jimmy Lin,Rodrigo Nogueira,Andrew Yates

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications. This survey provides an overview of text ranking with neural network architectures known as transformers, of which BERT is the best-known example. The combination of transformers and self-supervised pretraining has, without exaggeration, revolutionized the fields of natural language processing (NLP), information retrieval (IR), and beyond. In this survey, we provide a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. We cover a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage ranking architectures and learned dense representations that attempt to perform ranking directly. There are two themes that pervade our survey: techniques for handling long documents, beyond the typical sentence-by-sentence processing approaches used in NLP, and techniques for addressing the tradeoff between effectiveness (result quality) and efficiency (query latency). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this survey also attempts to prognosticate where the field is heading.

Image retrieval · Model evaluation · Dataset · INFORMS · UniFormer

December 4, 2018

Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

Marvin Teichmann,Andre Araujo,Menglong Zhu,Jack Sim

Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uniform or class-agnostic region selection. In this paper, we first fill the void by providing a new dataset of landmark bounding boxes, based on the Google Landmarks dataset, that includes $94k$ images with manually curated boxes from $15k$ unique landmarks. Then, we demonstrate how a trained landmark detector, using our new dataset, can be leveraged to index image regions and improve retrieval accuracy while being much more efficient than existing regional methods. In addition, we further introduce a novel regional aggregated selective match kernel (R-ASMK) to effectively combine information from detected regions into an improved holistic image representation. R-ASMK boosts image retrieval accuracy substantially at no additional memory cost, while even outperforming systems that index image regions independently. Our complete image retrieval system improves upon the previous state-of-the-art by significant margins on the Revisited Oxford and Paris datasets. Code and data will be released.