亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

As the digitization of travel industry accelerates, analyzing and understanding travelers' behaviors becomes increasingly important. However, traveler data frequently exhibit high data sparsity due to the relatively low frequency of user interactions with travel providers. Compounding this effect the multiplication of devices, accounts and platforms while browsing travel products online also leads to data dispersion. To deal with these challenges, probabilistic traveler matching can be used. Most existing solutions for user matching are not suitable for traveler matching as a traveler's browsing history is typically short and URLs in the travel industry are very heterogeneous with many tokens. To deal with these challenges, we propose the similarity based multi-view information fusion to learn a better user representation from URLs by treating the URLs as multi-view data. The experimental results show that the proposed multi-view user representation learning can take advantage of the complementary information from different views, highlight the key information in URLs and perform significantly better than other representation learning solutions for the user matching task.

相關內容

《計算機信息》雜志發表高質量的論文,擴大了運籌學和計算的范圍,尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文,以及描述新的和有用的軟件工具的論文。官網鏈接: · Learning · Performer · 遷移學習 · 目標檢測 ·
2024 年 2 月 9 日

The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets, and is not trained on the task-specific domain. We validate our approach on object detection tasks, specifically focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.

Multi-product formulas (MPF) are linear combinations of Trotter circuits offering high-quality simulation of Hamiltonian time evolution with fewer Trotter steps. Here we report two contributions aimed at making multi-product formulas more viable for near-term quantum simulations. First, we extend the theory of Trotter error with commutator scaling developed by Childs, Su, Tran et al. to multi-product formulas. Our result implies that multi-product formulas can achieve a quadratic reduction of Trotter error in 1-norm (nuclear norm) on arbitrary time intervals compared with the regular product formulas without increasing the required circuit depth or qubit connectivity. The number of circuit repetitions grows only by a constant factor. Second, we introduce dynamic multi-product formulas with time-dependent coefficients chosen to minimize a certain efficiently computable proxy for the Trotter error. We use a minimax estimation method to make dynamic multi-product formulas robust to uncertainty from algorithmic errors, sampling and hardware noise. We call this method Minimax MPF and we provide a rigorous bound on its error.

Many products in engineering are highly reliable with large mean lifetimes to failure. Performing lifetests under normal operations conditions would thus require long experimentation times and high experimentation costs. Alternatively, accelerated lifetests shorten the experimentation time by running the tests at higher than normal stress conditions, thus inducing more failures. Additionally, a log-linear regression model can be used to relate the lifetime distribution of the product to the level of stress it experiences. After estimating the parameters of this relationship, results can be extrapolated to normal operating conditions. On the other hand, censored data is common in reliability analysis. Interval-censored data arise when continuous inspection is difficult or infeasible due to technical or budgetary constraints. In this paper, we develop robust restricted estimators based on the density power divergence for step-stress accelerated life-tests under Weibull distributions with interval-censored data. We present theoretical asymptotic properties of the estimators and develop robust Rao-type test statistics based on the proposed robust estimators for testing composite null hypothesis on the model parameters.

Quantum networks crucially rely on the availability of high-quality entangled pairs of qubits, known as entangled links, distributed across distant nodes. Maintaining the quality of these links is a challenging task due to the presence of time-dependent noise, also known as decoherence. Entanglement purification protocols offer a solution by converting multiple low-quality entangled states into a smaller number of higher-quality ones. In this work, we introduce a framework to analyse the performance of entanglement buffering setups that combine entanglement consumption, decoherence, and entanglement purification. We propose two key metrics: the availability, which is the steady-state probability that an entangled link is present, and the average consumed fidelity, which quantifies the steady-state quality of consumed links. We then investigate a two-node system, where each node possesses two quantum memories: one for long-term entanglement storage, and another for entanglement generation. We model this setup as a continuous-time stochastic process and derive analytical expressions for the performance metrics. Our findings unveil a trade-off between the availability and the average consumed fidelity. We also bound these performance metrics for a buffering system that employs the well-known bilocal Clifford purification protocols. Importantly, our analysis demonstrates that, in the presence of noise, consistently purifying the buffered entanglement increases the average consumed fidelity, even when some buffered entanglement is discarded due to purification failures.

In the realm of statistical exploration, the manipulation of pseudo-random values to discern their impact on data distribution presents a compelling avenue of inquiry. This article investigates the question: Is it possible to add pseudo-random values without compelling a shift towards a normal distribution?. Employing Python techniques, the study explores the nuances of pseudo-random value addition within the context of additions, aiming to unravel the interplay between randomness and resulting statistical characteristics. The Materials and Methods chapter details the construction of datasets comprising up to 300 billion pseudo-random values, employing three distinct layers of manipulation. The Results chapter visually and quantitatively explores the generated datasets, emphasizing distribution and standard deviation metrics. The study concludes with reflections on the implications of pseudo-random value manipulation and suggests avenues for future research. In the layered exploration, the first layer introduces subtle normalization with increasing summations, while the second layer enhances normality. The third layer disrupts typical distribution patterns, leaning towards randomness despite pseudo-random value summation. Standard deviation patterns across layers further illuminate the dynamic interplay of pseudo-random operations on statistical characteristics. While not aiming to disrupt academic norms, this work modestly contributes insights into data distribution complexities. Future studies are encouraged to delve deeper into the implications of data manipulation on statistical outcomes, extending the understanding of pseudo-random operations in diverse contexts.

We introduce a method for computing immediately human interpretable yet accurate classifiers from tabular data. The classifiers obtained are short DNF-formulas, computed via first discretizing the original data to Boolean form and then using feature selection coupled with a very fast algorithm for producing the best possible Boolean classifier for the setting. We demonstrate the approach via 14 experiments, obtaining results with accuracies mainly similar to ones obtained via random forests, XGBoost, and existing results for the same datasets in the literature. In several cases, our approach in fact outperforms the reference results in relation to accuracy, even though the main objective of our study is the immediate interpretability of our classifiers. We also prove a new result on the probability that the classifier we obtain from real-life data corresponds to the ideally best classifier with respect to the background distribution the data comes from.

We propose a new loss function for supervised and physics-informed training of neural networks and operators that incorporates a posteriori error estimate. More specifically, during the training stage, the neural network learns additional physical fields that lead to rigorous error majorants after a computationally cheap postprocessing stage. Theoretical results are based upon the theory of functional a posteriori error estimates, which allows for the systematic construction of such loss functions for a diverse class of practically relevant partial differential equations. From the numerical side, we demonstrate on a series of elliptic problems that for a variety of architectures and approaches (physics-informed neural networks, physics-informed neural operators, neural operators, and classical architectures in the regression and physics-informed settings), we can reach better or comparable accuracy and in addition to that cheaply recover high-quality upper bounds on the error after training.

Federated Learning (FL) has emerged as a privacy-preserving machine learning paradigm facilitating collaborative training across multiple clients without sharing local data. Despite advancements in edge device capabilities, communication bottlenecks present challenges in aggregating a large number of clients; only a portion of the clients can update their parameters upon each global aggregation. This phenomenon introduces the critical challenge of stragglers in FL and the profound impact of client scheduling policies on global model convergence and stability. Existing scheduling strategies address staleness but predominantly focus on either timeliness or content. Motivated by this, we introduce the novel concept of Version Age of Information (VAoI) to FL. Unlike traditional Age of Information metrics, VAoI considers both timeliness and content staleness. Each client's version age is updated discretely, indicating the freshness of information. VAoI is incorporated into the client scheduling policy to minimize the average VAoI, mitigating the impact of outdated local updates and enhancing the stability of FL systems.

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of features learned.

Online customer data provides valuable information for product design and marketing research, as it can reveal the preferences of customers. However, analyzing these data using artificial intelligence (AI) for data-driven design is a challenging task due to potential concealed patterns. Moreover, in these research areas, most studies are only limited to finding customers' needs. In this study, we propose a game theory machine learning (ML) method that extracts comprehensive design implications for product development. The method first uses a genetic algorithm to select, rank, and combine product features that can maximize customer satisfaction based on online ratings. Then, we use SHAP (SHapley Additive exPlanations), a game theory method that assigns a value to each feature based on its contribution to the prediction, to provide a guideline for assessing the importance of each feature for the total satisfaction. We apply our method to a real-world dataset of laptops from Kaggle, and derive design implications based on the results. Our approach tackles a major challenge in the field of multi-criteria decision making and can help product designers and marketers, to understand customer preferences better with less data and effort. The proposed method outperforms benchmark methods in terms of relevant performance metrics.

北京阿比特科技有限公司