
Categorical encoders transform categorical features into numerical representations that are indispensable for a wide range of machine learning models. Existing encoder benchmark studies lack generalizability because of their limited choice of (1) encoders, (2) experimental factors, and (3) datasets. Additionally, inconsistencies arise from the adoption of varying aggregation strategies. This paper presents the most comprehensive benchmark of categorical encoders to date: an extensive evaluation of 32 encoder configurations from diverse families, under 36 combinations of experimental factors, on 50 datasets. The study shows the profound influence of dataset selection, experimental factors, and aggregation strategies on the benchmark's conclusions -- aspects disregarded in previous encoder benchmarks.
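
To make the encoder families concrete, here is a minimal sketch of one of them, smoothed target encoding, in plain Python. The smoothing constant and toy data are illustrative assumptions, not settings from the benchmark, which evaluates 32 encoder configurations.

```python
# Minimal sketch of one encoder family from the benchmark space: mean
# target encoding with additive smoothing toward the global prior.
from collections import defaultdict

def fit_target_encoder(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    prior = sum(targets) / len(targets)          # global mean as fallback
    encoder = {
        c: (sums[c] + smoothing * prior) / (counts[c] + smoothing)
        for c in counts
    }
    return encoder, prior

def transform(encoder, prior, categories):
    # Unseen categories fall back to the global prior.
    return [encoder.get(c, prior) for c in categories]

enc, prior = fit_target_encoder(["a", "a", "b", "c"], [1, 0, 1, 1])
print(transform(enc, prior, ["a", "b", "d"]))
```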

Related content

Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aims to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have largely focused on modeling the user's spatial preference. However, existing works that leverage spatial information do so only by aggregating users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods harms the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a Diffusion-based model that samples the user's spatial preference for the next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy in addressing the limitations of existing methods.
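
As a rough illustration of the sampling machinery Diff-POI builds on, the sketch below runs generic score-based (Langevin) sampling with a stand-in score function; the actual model learns its score from graph-encoded visiting sequences, which this toy omits entirely.

```python
# Generic sketch of score-based (Langevin) sampling, the family of
# samplers diffusion-based recommenders draw on. The score function is
# a stand-in; Diff-POI learns it from the user's visiting history.
import numpy as np

def langevin_sample(score_fn, dim, n_steps=500, step_size=1e-2, rng=None):
    """Draw one sample via discretised Langevin dynamics on the score."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.standard_normal(dim)                 # start from a simple prior
    for _ in range(n_steps):
        noise = rng.standard_normal(dim)
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x

# Toy target: standard Gaussian, whose score function is -x.
sample = langevin_sample(lambda x: -x, dim=8)
print(sample.round(2))
```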

Simulated Tempering (ST) is an MCMC algorithm for complex target distributions that operates on a path between the target and a more amenable reference distribution. Crucially, if the reference enables i.i.d. sampling, ST is regenerative and can be parallelized across independent tours. However, the difficulty of tuning ST has hindered its widespread adoption. In this work, we develop a simple nonreversible ST (NRST) algorithm, a general theoretical analysis of ST, and an automated tuning procedure for ST. A core contribution that arises from the analysis is a novel performance metric -- Tour Effectiveness (TE) -- that controls the asymptotic variance of estimates from ST for bounded test functions. We use the TE to show that NRST dominates its reversible counterpart. We then develop an automated tuning procedure for NRST algorithms that targets the TE while minimizing computational cost. This procedure enables straightforward integration of NRST into existing probabilistic programming languages. We provide extensive experimental evidence that our tuning scheme improves the performance and robustness of NRST algorithms on a diverse set of probabilistic models.
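
A toy sketch of the nonreversible index moves is given below, assuming a Gaussian tempering path with known normalising constants; in practice those affinities are exactly what the paper's tuning procedure estimates, so this is only a schematic of the mechanism, not the proposed algorithm.

```python
# Toy nonreversible simulated tempering sketch: the chain carries a
# temperature index i and a direction eps, pushing the index persistently
# one way and flipping eps only on rejection or at the ladder's ends.
# Tempered target: N(0, 1/beta_i), with exact log-normalisers (assumed
# known here, estimated by the tuning procedure in practice).
import numpy as np

betas = np.linspace(0.1, 1.0, 10)               # reference -> target path
rng = np.random.default_rng(1)

def log_pi(i, x):                               # normalised log density
    return -0.5 * betas[i] * x**2 + 0.5 * np.log(betas[i] / (2 * np.pi))

x, i, eps = 0.0, 0, 1
for _ in range(10_000):
    # Exploration move at fixed temperature (random-walk Metropolis).
    prop = x + rng.normal(scale=1.0 / np.sqrt(betas[i]))
    if np.log(rng.uniform()) < log_pi(i, prop) - log_pi(i, x):
        x = prop
    # Nonreversible index move: try i + eps, reverse direction on failure.
    j = i + eps
    if 0 <= j < len(betas) and np.log(rng.uniform()) < log_pi(j, x) - log_pi(i, x):
        i = j
    else:
        eps = -eps                               # flip only when blocked

print(f"final state x={x:.2f} at level {i}")
```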

We introduce two new classes of covering codes in graphs for every positive integer $r$. These new codes are called local $r$-identifying and local $r$-locating-dominating codes, and they are derived from $r$-identifying and $r$-locating-dominating codes, respectively. We study the sizes of optimal local 1-identifying codes in binary hypercubes. We obtain lower and upper bounds that are asymptotically tight. Together, the bounds show that the cost of changing covering codes into local 1-identifying codes is negligible. For some small $n$, optimal constructions are obtained. Moreover, the upper bound is obtained by a linear code construction. Also, we study the densities of optimal local 1-identifying codes and local 1-locating-dominating codes in the infinite square grid, the hexagonal grid, the triangular grid, and the king grid. We prove that seven of our eight constructions have optimal densities.
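
For readers new to these codes, the following brute-force check, written directly against the definition, verifies the parent notion of the local variant (the plain 1-identifying property) in a small binary hypercube; it is a naive verifier for intuition, not one of the paper's constructions.

```python
# Brute-force check of the 1-identifying property in the hypercube Q_n:
# every vertex's closed 1-ball must intersect the code C in a nonempty
# set, and these intersections must be pairwise distinct.
def is_1_identifying(code, n):
    code = set(code)
    seen = set()
    for v in range(2**n):
        ball = {v} | {v ^ (1 << k) for k in range(n)}   # closed 1-ball
        sig = frozenset(ball & code)
        if not sig or sig in seen:
            return False
        seen.add(sig)
    return True

# Tiny example in Q_2 (vertices 0..3): the full vertex set is 1-identifying.
print(is_1_identifying({0, 1, 2, 3}, 2))
```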

With the increasing availability of large-scale datasets, computational power, and tools like automatic differentiation and expressive neural network architectures, sequential data are now often treated in a data-driven way, with a dynamical model trained from the observation data. While neural networks are often seen as uninterpretable black-box architectures, they can still benefit from physical priors on the data and from mathematical knowledge. In this paper, we use a neural network architecture which leverages the long-known Koopman operator theory to embed dynamical systems in latent spaces where their dynamics can be described linearly, enabling a number of appealing features. We introduce methods that enable training such a model for long-term continuous reconstruction, even in difficult contexts where the data come as irregularly sampled time series. The potential for self-supervised learning is also demonstrated, as we show the promising use of trained dynamical models as priors for variational data assimilation techniques, with applications to, e.g., time series interpolation and forecasting.
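
The sketch below illustrates the underlying idea of linear latent dynamics, using a fixed, hand-picked lifting and a least-squares (DMD-style) fit; the paper's contribution is to learn the embedding with a neural network and to handle irregular sampling, neither of which this toy attempts.

```python
# Koopman-style sketch: lift states with a fixed dictionary of
# observables and fit a linear operator K by least squares, so that
# z_{t+1} ~ K z_t in the lifted space. The lift here is hand-picked;
# the paper learns it with a neural encoder.
import numpy as np

def lift(x):                                     # chosen observables
    return np.array([x[0], x[1], x[0] * x[1], x[0]**2])

# Toy trajectory of a weakly nonlinear 2-D system.
traj = [np.array([1.0, 0.5])]
for _ in range(200):
    a, b = traj[-1]
    traj.append(np.array([0.9 * a, 0.8 * b + 0.1 * a * a]))

Z = np.stack([lift(s) for s in traj]).T          # lifted snapshots
K, *_ = np.linalg.lstsq(Z[:, :-1].T, Z[:, 1:].T, rcond=None)
K = K.T

# Long-term reconstruction: iterate the linear model in latent space.
z = lift(traj[0])
for _ in range(5):
    z = K @ z
print(z[:2])     # first two coordinates approximately track the state
```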

The evaluation of noisy binary classifiers on unlabeled data is treated as a streaming task: given a data sketch of the decisions by an ensemble, estimate the true prevalence of the labels as well as each classifier's accuracy on them. Two fully algebraic evaluators are constructed to do this. Both are based on the assumption that the classifiers make independent errors. The first is based on majority voting. The second, the main contribution of the paper, is guaranteed to be correct. But how do we know the classifiers are independent on any given test? This principal/agent monitoring paradox is ameliorated by exploiting the failures of the independent evaluator to return sensible estimates. A search for nearly error-independent trios is carried out empirically on the \texttt{adult}, \texttt{mushroom}, and \texttt{two-norm} datasets by using the algebraic failure modes to reject evaluation ensembles as too correlated. The searches are refined by constructing a surface in evaluation space that contains the true value point. The algebra of arbitrarily correlated classifiers permits the selection of a polynomial subset free of any correlation variables. Candidate evaluation ensembles are rejected if their data sketches produce independent estimates too far from the constructed surface. The estimates produced by the surviving ensembles can sometimes be accurate to within 1\%. But handling even small amounts of correlation remains a challenge. A Taylor expansion of the estimates produced when independence is assumed but the classifiers are, in fact, slightly correlated helps clarify how the independent evaluator has algebraic `blind spots'.
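
A sketch of the first, majority-voting evaluator is given below; the vote matrix and accuracy levels are simulated stand-ins, and the resulting estimates are only trustworthy under the independent-errors assumption discussed above.

```python
# Sketch of the majority-voting evaluator for a trio of binary
# classifiers: treat the majority label as ground truth and read off
# prevalence and per-classifier accuracy. Biased for correlated trios.
import numpy as np

def majority_vote_evaluate(decisions):
    """decisions: (n_items, 3) array of 0/1 votes from three classifiers."""
    majority = (decisions.sum(axis=1) >= 2).astype(int)
    prevalence = majority.mean()                 # estimated share of label 1
    accuracies = (decisions == majority[:, None]).mean(axis=0)
    return prevalence, accuracies

# Simulated test: 30% true prevalence, three 80%-accurate classifiers
# that err independently of one another.
rng = np.random.default_rng(0)
truth = rng.uniform(size=1000) < 0.3
votes = np.stack([np.where(rng.uniform(size=1000) < 0.8, truth, ~truth)
                  for _ in range(3)], axis=1).astype(int)
print(majority_vote_evaluate(votes))
```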

We provide a construction of Gabor frames that encode local linearizations of a signal detected on a curved smooth manifold of arbitrary dimension, with Gabor filters that can detect the presence of higher-dimensional boundaries in the manifold signal. We describe an application to configuration spaces in robotics with sharp constraints. The construction is a higher-dimensional generalization of the geometric setting developed for the study of signal analysis in the visual cortex.
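
As a point of reference, the following sketch builds the classical flat two-dimensional Gabor atom, a plane wave under a Gaussian window, and applies it to a toy edge; the paper's construction generalises such atoms to curved manifolds of arbitrary dimension.

```python
# Classical 2-D Gabor filter: a complex plane wave modulated by a
# Gaussian window. Filter size, bandwidth, and frequency are arbitrary
# illustrative choices.
import numpy as np

def gabor_2d(size=21, sigma=4.0, freq=0.25, theta=0.0):
    ax = np.arange(size) - size // 2
    X, Y = np.meshgrid(ax, ax)
    rot = X * np.cos(theta) + Y * np.sin(theta)  # coordinate along the wave
    window = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
    return window * np.exp(2j * np.pi * freq * rot)

# A vertical edge responds strongly to a horizontally oriented filter.
image = np.zeros((21, 21))
image[:, 10:] = 1.0
response = np.abs(np.sum(gabor_2d(theta=0.0) * image))
print(round(response, 3))
```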

This paper introduces a prognostic method called FLASH that addresses the problem of joint modelling of longitudinal data and censored durations when a large number of both longitudinal and time-independent features are available. In the literature, standard joint models are either of the shared random effect or joint latent class type. Combining ideas from both worlds and using appropriate regularisation techniques, we define a new model with the ability to automatically identify significant prognostic longitudinal features in a high-dimensional context, which is of increasing importance in many areas such as personalised medicine or churn prediction. We develop an estimation methodology based on the EM algorithm and provide an efficient implementation. The statistical performance of the method is demonstrated both in extensive Monte Carlo simulation studies and on publicly available real-world datasets. Our method significantly outperforms the state-of-the-art joint models in predicting the latent class membership probability in terms of the C-index in a so-called ``real-time'' prediction setting, with a computational speed that is orders of magnitude faster than competing methods. In addition, our model automatically identifies significant features that are relevant from a practical perspective, making it interpretable.
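
Since the evaluation metric above is the C-index, here is a minimal sketch of how it is computed for right-censored data; the toy times, event indicators, and risk scores are invented for illustration.

```python
# Harrell's C-index for right-censored data: among comparable pairs
# (an observed event preceding another subject's follow-up time), count
# how often the higher risk score goes with the earlier event.
def c_index(times, events, risk):
    """times: follow-up times; events: 1 if event observed; risk: scores."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue          # censored subjects cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

print(c_index([2, 4, 3, 5], [1, 0, 1, 1], [0.9, 0.1, 0.6, 0.2]))  # 1.0
```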

Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are robust to large random perturbations of the input data remain susceptible to small, easily constructed, adversarial perturbations of their inputs. Here, we show that this may be seen as a fundamental feature of classifiers working with high-dimensional input data. We introduce a simple, generic, and generalisable framework in which key behaviours observed in practical systems arise with high probability -- notably the simultaneous susceptibility of the (otherwise accurate) model to easily constructed adversarial attacks, and robustness to random perturbations of the input data. We confirm that the same phenomena are directly observed in practical neural networks trained on standard image classification problems, where even large additive random noise fails to trigger the adversarial instability of the network. A surprising takeaway is that even small margins separating a classifier's decision surface from training and testing data can hide adversarial susceptibility from being detected using randomly sampled perturbations. Counterintuitively, using additive noise during training or testing is therefore inefficient for eradicating or detecting adversarial examples, and more demanding adversarial training is required.
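
The core geometric contrast can be reproduced numerically for a linear classifier, as in the sketch below; the dimension, margin, and perturbation budget are illustrative assumptions. A random perturbation of a fixed norm barely moves the decision value, while a worst-case perturbation of the same norm crosses the boundary.

```python
# High-dimensional contrast for a linear classifier sign(w @ x): random
# noise of norm eps shifts the decision value by roughly eps/sqrt(d),
# whereas the worst-case perturbation of the same norm shifts it by the
# full eps, so a small margin hides fragility from random probing.
import numpy as np

rng = np.random.default_rng(0)
d = 10_000
w = rng.standard_normal(d)
w /= np.linalg.norm(w)                           # unit normal of the boundary
x = 0.05 * w                                     # point with margin 0.05

eps = 0.2                                        # L2 perturbation budget
adv = -eps * w                                   # worst case: straight at the boundary
rand = rng.standard_normal(d)
rand *= eps / np.linalg.norm(rand)               # random direction, same norm

print("margin after adversarial:", w @ (x + adv))    # flips sign
print("margin after random:     ", w @ (x + rand))   # ~0.05 +- eps/sqrt(d)
```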

A general class of almost instantaneous fixed-to-variable-length (AIFV) codes is proposed, which contains every possible binary code we can construct when allowing a finite number of bits of decoding delay. The contribution of the paper lies in the following. (i) Introducing $N$-bit-delay AIFV codes, constructed from multiple code trees with higher flexibility than the conventional AIFV codes. (ii) Proving that the proposed codes can represent any uniquely-encodable and uniquely-decodable variable-to-variable-length code. (iii) Showing how to express codes as multiple code trees with minimum decoding delay. (iv) Formulating the constraints of decodability as the comparison of intervals on the real number line. The theoretical results in this paper are expected to be useful for further study of AIFV codes.
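
The role of decoding delay can be seen in the deliberately simple example below: the code {A: "0", B: "01"} is not prefix-free, yet one bit of lookahead decodes it uniquely, because no codeword starts with "1". This flat sketch shows only the delay mechanism itself; AIFV codes organise such lookahead decoding via multiple code trees.

```python
# Toy 1-bit-delay decoding: {A: "0", B: "01"} is not prefix-free, but a
# single bit of lookahead resolves every ambiguity, since a "1" after a
# "0" can only belong to codeword "01".
def encode(msg):
    return "".join({"A": "0", "B": "01"}[s] for s in msg)

def decode_with_1bit_delay(bits):
    out, i = [], 0
    while i < len(bits):
        # After reading "0", peek one bit ahead: "1" forces codeword "01".
        if i + 1 < len(bits) and bits[i + 1] == "1":
            out.append("B")
            i += 2
        else:
            out.append("A")
            i += 1
    return "".join(out)

msg = "ABAAB"
assert decode_with_1bit_delay(encode(msg)) == msg
print(encode(msg), "->", decode_with_1bit_delay(encode(msg)))
```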

Spatially misaligned data can be fused using a Bayesian melding model that assumes that underlying all observations there is a spatially continuous Gaussian random field. This model can be used, for example, to predict air pollution levels by combining point data from monitoring stations and areal data from satellite imagery. However, if the data exhibit preferential sampling, that is, if the observed point locations are not independent of the underlying spatial process, the inference obtained from models that ignore such a dependence structure might not be valid. In this paper, we present a Bayesian spatial model for the fusion of point and areal data that takes preferential sampling into account. The model combines the Bayesian melding specification with a model for the stochastically dependent sampling and underlying spatial processes. Fast Bayesian inference is performed using the integrated nested Laplace approximation (INLA) and stochastic partial differential equation (SPDE) approaches. The performance of the model is assessed using simulated data in a range of scenarios and sampling strategies that can appear in real settings. The model is also applied to predict air pollution in the USA.
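
The effect of ignoring preferential sampling can be simulated in a few lines, as below; the one-dimensional random-walk field and the exponential site-selection weights are illustrative assumptions, and the point is only that a naive summary of preferentially placed observations is biased.

```python
# Preferential sampling in one dimension: monitoring sites are placed
# preferentially where a latent field is high, so the naive mean of the
# observations typically overstates the field average. The paper corrects
# for this by jointly modelling the sampling and the field.
import numpy as np

rng = np.random.default_rng(0)
n = 200
field = np.cumsum(rng.normal(size=n)) * 0.1      # crude correlated field

# Site-selection probability increases with the field value.
p = np.exp(2.0 * field)
p /= p.sum()
sites = rng.choice(n, size=30, replace=False, p=p)
obs = field[sites] + rng.normal(scale=0.1, size=30)

print("true field mean:        ", field.mean().round(3))
print("naive mean of obs sites:", obs.mean().round(3))   # typically biased upward
```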
