亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Factorization machine (FM) variants are widely used for large scale real-time content recommendation systems, since they offer an excellent balance between model accuracy and low computational costs for training and inference. These systems are trained on tabular data with both numerical and categorical columns. Incorporating numerical columns poses a challenge, and they are typically incorporated using a scalar transformation or binning, which can be either learned or chosen a-priori. In this work, we provide a systematic and theoretically-justified way to incorporate numerical features into FM variants by encoding them into a vector of function values for a set of functions of one's choice. We view factorization machines as approximators of segmentized functions, namely, functions from a field's value to the real numbers, assuming the remaining fields are assigned some given constants, which we refer to as the segment. From this perspective, we show that our technique yields a model that learns segmentized functions of the numerical feature spanned by the set of functions of one's choice, namely, the spanning coefficients vary between segments. Hence, to improve model accuracy we advocate the use of functions known to have strong approximation power, and offer the B-Spline basis due to its well-known approximation power, availability in software libraries, and efficiency. Our technique preserves fast training and inference, and requires only a small modification of the computational graph of an FM model. Therefore, it is easy to incorporate into an existing system to improve its performance. Finally, we back our claims with a set of experiments, including synthetic, performance evaluation on several data-sets, and an A/B test on a real online advertising system which shows improved performance.

相關內容

Attention-based graph neural networks have made great progress in feature matching learning. However, insight of how attention mechanism works for feature matching is lacked in the literature. In this paper, we rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering. In order to facilitate the learning of matching and filtering, we inject the similarity of descriptors and relative positions into cross- and self-attention score, respectively. In this way, the attention can focus on learning residual matching and filtering functions with reference to the basic functions of measuring visual and spatial correlation. Moreover, we mine intra- and inter-neighbors according to the similarity of descriptors and relative positions. Then sparse attention for each point can be performed only within its neighborhoods to acquire higher computation efficiency. Feature matching networks equipped with our full and sparse residual attention learning strategies are termed ResMatch and sResMatch respectively. Extensive experiments, including feature matching, pose estimation and visual localization, confirm the superiority of our networks.

Providing a model that achieves a strong predictive performance and at the same time is interpretable by humans is one of the most difficult challenges in machine learning research due to the conflicting nature of these two objectives. To address this challenge, we propose a modification of the Radial Basis Function Neural Network model by equipping its Gaussian kernel with a learnable precision matrix. We show that precious information is contained in the spectrum of the precision matrix that can be extracted once the training of the model is completed. In particular, the eigenvectors explain the directions of maximum sensitivity of the model revealing the active subspace and suggesting potential applications for supervised dimensionality reduction. At the same time, the eigenvectors highlight the relationship in terms of absolute variation between the input and the latent variables, thereby allowing us to extract a ranking of the input variables based on their importance to the prediction task enhancing the model interpretability. We conducted numerical experiments for regression, classification, and feature selection tasks, comparing our model against popular machine learning models and the state-of-the-art deep learning-based embedding feature selection techniques. Our results demonstrate that the proposed model does not only yield an attractive prediction performance with respect to the competitors but also provides meaningful and interpretable results that potentially could assist the decision-making process in real-world applications. A PyTorch implementation of the model is available on GitHub at the following link. //github.com/dannyzx/GRBF-NNs

Magnetic polarizability tensors (MPTs) provide an economical characterisation of conducting metallic objects and can aid in the solution of metal detection inverse problems, such as scrap metal sorting, searching for unexploded ordnance in areas of former conflict, and security screening at event venues and transport hubs. Previous work has established explicit formulae for their coefficients and a rigorous mathematical theory for the characterisation they provide. In order to assist with efficient computation of MPT spectral signatures of different objects to enable the construction of large dictionaries of characterisations for classification approaches, this work proposes a new, highly-efficient, strategy for predicting MPT coefficients. This is achieved by solving an eddy current type problem using hp-finite elements in combination with a proper orthogonal decomposition reduced order modelling (ROM) methodology and offers considerable computational savings over our previous approach. Furthermore, an adaptive approach is described for generating new frequency snapshots to further improve the accuracy of the ROM. To improve the resolution of highly conducting and magnetic objects, a recipe is proposed to choose the number and thicknesses of prismatic boundary layers for accurate resolution of thin skin depths in such problems. The paper includes a series of challenging examples to demonstrate the success of the proposed methodologies.

A function is differentially algebraic (or simply D-algebraic) if there is a polynomial relationship between some of its derivatives and the indeterminate variable. Many functions in the sciences, such as Mathieu functions, the Weierstrass elliptic functions, and holonomic or D-finite functions are D-algebraic. These functions form a field, and are closed under composition, taking functional inverse, and derivation. We present implementation for each underlying operation. We also give a systematic way for computing an algebraic differential equation from a linear differential equation with D-finite function coefficients. Each command is a feature of our Maple package $NLDE$ available at //mathrepo.mis.mpg.de/OperationsForDAlgebraicFunctions.

A spanner of a graph is a subgraph that preserves lengths of shortest paths up to a multiplicative distortion. For every $k$, a spanner with size $O(n^{1+1/k})$ and stretch $(2k+1)$ can be constructed by a simple centralized greedy algorithm, and this is tight assuming Erd\H{o}s girth conjecture. In this paper we study the problem of constructing spanners in a local manner, specifically in the Local Computation Model proposed by Rubinfeld et al. (ICS 2011). We provide a randomized Local Computation Agorithm (LCA) for constructing $(2r-1)$-spanners with $\tilde{O}(n^{1+1/r})$ edges and probe complexity of $\tilde{O}(n^{1-1/r})$ for $r \in \{2,3\}$, where $n$ denotes the number of vertices in the input graph. Up to polylogarithmic factors, in both cases, the stretch factor is optimal (for the respective number of edges). In addition, our probe complexity for $r=2$, i.e., for constructing a $3$-spanner, is optimal up to polylogarithmic factors. Our result improves over the probe complexity of Parter et al. (ITCS 2019) that is $\tilde{O}(n^{1-1/2r})$ for $r \in \{2,3\}$. Both our algorithms and the algorithms of Parter et al. use a combination of neighbor-probes and pair-probes in the above-mentioned LCAs. For general $k\geq 1$, we provide an LCA for constructing $O(k^2)$-spanners with $\tilde{O}(n^{1+1/k})$ edges using $O(n^{2/3}\Delta^2)$ neighbor-probes, improving over the $\tilde{O}(n^{2/3}\Delta^4)$ algorithm of Parter et al. By developing a new randomized LCA for graph decomposition, we further improve the probe complexity of the latter task to be $O(n^{2/3-(1.5-\alpha)/k}\Delta^2)$, for any constant $\alpha>0$. This latter LCA may be of independent interest.

Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.

Existing recommender systems extract the user preference based on learning the correlation in data, such as behavioral correlation in collaborative filtering, feature-feature, or feature-behavior correlation in click-through rate prediction. However, regretfully, the real world is driven by causality rather than correlation, and correlation does not imply causation. For example, the recommender systems can recommend a battery charger to a user after buying a phone, in which the latter can serve as the cause of the former, and such a causal relation cannot be reversed. Recently, to address it, researchers in recommender systems have begun to utilize causal inference to extract causality, enhancing the recommender system. In this survey, we comprehensively review the literature on causal inference-based recommendation. At first, we present the fundamental concepts of both recommendation and causal inference as the basis of later content. We raise the typical issues that the non-causality recommendation is faced. Afterward, we comprehensively review the existing work of causal inference-based recommendation, based on a taxonomy of what kind of problem causal inference addresses. Last, we discuss the open problems in this important research area, along with interesting future works.

Recommender system is one of the most important information services on today's Internet. Recently, graph neural networks have become the new state-of-the-art approach of recommender systems. In this survey, we conduct a comprehensive review of the literature in graph neural network-based recommender systems. We first introduce the background and the history of the development of both recommender systems and graph neural networks. For recommender systems, in general, there are four aspects for categorizing existing works: stage, scenario, objective, and application. For graph neural networks, the existing methods consist of two categories, spectral models and spatial ones. We then discuss the motivation of applying graph neural networks into recommender systems, mainly consisting of the high-order connectivity, the structural property of data, and the enhanced supervision signal. We then systematically analyze the challenges in graph construction, embedding propagation/aggregation, model optimization, and computation efficiency. Afterward and primarily, we provide a comprehensive overview of a multitude of existing works of graph neural network-based recommender systems, following the taxonomy above. Finally, we raise discussions on the open problems and promising future directions of this area. We summarize the representative papers along with their codes repositories in //github.com/tsinghua-fib-lab/GNN-Recommender-Systems.

Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey for MTL from the perspective of algorithmic modeling, applications and theoretical analyses. For algorithmic modeling, we give a definition of MTL and then classify different MTL algorithms into five categories, including feature learning approach, low-rank approach, task clustering approach, task relation learning approach and decomposition approach as well as discussing the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, we review online, parallel and distributed MTL models as well as dimensionality reduction and feature hashing to reveal their computational and storage advantages. Many real-world applications use MTL to boost their performance and we review representative works in this paper. Finally, we present theoretical analyses and discuss several future directions for MTL.

Click-through rate (CTR) prediction plays a critical role in recommender systems and online advertising. The data used in these applications are multi-field categorical data, where each feature belongs to one field. Field information is proved to be important and there are several works considering fields in their models. In this paper, we proposed a novel approach to model the field information effectively and efficiently. The proposed approach is a direct improvement of FwFM, and is named as Field-matrixed Factorization Machines (FmFM, or $FM^2$). We also proposed a new explanation of FM and FwFM within the FmFM framework, and compared it with the FFM. Besides pruning the cross terms, our model supports field-specific variable dimensions of embedding vectors, which acts as soft pruning. We also proposed an efficient way to minimize the dimension while keeping the model performance. The FmFM model can also be optimized further by caching the intermediate vectors, and it only takes thousands of floating-point operations (FLOPs) to make a prediction. Our experiment results show that it can out-perform the FFM, which is more complex. The FmFM model's performance is also comparable to DNN models which require much more FLOPs in runtime.

北京阿比特科技有限公司