亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Distribution shift in machine learning models can be a primary cause of performance degradation. This paper delves into the characteristics of these shifts, primarily motivated by Real-Time Bidding (RTB) market models. We emphasize the challenges posed by class imbalance and sample selection bias, both potent instigators of distribution shifts. This paper introduces the Exponential Tilt Reweighting Alignment (ExTRA) algorithm, as proposed by Marty et al. (2023), to address distribution shifts in data. The ExTRA method is designed to determine the importance weights on the source data, aiming to minimize the KL divergence between the weighted source and target datasets. A notable advantage of this method is its ability to operate using labeled source data and unlabeled target data. Through simulated real-world data, we investigate the nature of distribution shift and evaluate the applicacy of the proposed model.

相關內容

Explainable AI has become a popular tool for validating machine learning models. Mismatches between the explained model's decision strategy and the user's domain knowledge (e.g. Clever Hans effects) have also been recognized as a starting point for improving faulty models. However, it is less clear what to do when the user and the explanation agree. In this paper, we demonstrate that acceptance of explanations by the user is not a guarantee for a machine learning model to function well, in particular, some Clever Hans effects may remain undetected. Such hidden flaws of the model can nevertheless be mitigated, and we demonstrate this by contributing a new method, Explanation-Guided Exposure Minimization (EGEM), that preemptively prunes variations in the ML model that have not been the subject of positive explanation feedback. Experiments on natural image data demonstrate that our approach leads to models that strongly reduce their reliance on hidden Clever Hans strategies, and consequently achieve higher accuracy on new data.

This paper develops an approach to detect identification failure in moment condition models. This is achieved by introducing a quasi-Jacobian matrix computed as the slope of a linear approximation of the moments on an estimate of the identified set. It is asymptotically singular when local and/or global identification fails, and equivalent to the usual Jacobian matrix which has full rank when the model is point and locally identified. Building on this property, a simple test with chi-squared critical values is introduced to conduct subvector inferences allowing for strong, semi-strong, and weak identification without \textit{a priori} knowledge about the underlying identification structure. Monte-Carlo simulations and an empirical application to the Long-Run Risks model illustrate the results.

A method is introduced for approximate marginal likelihood inference via adaptive Gaussian quadrature in mixed models with a single grouping factor. The core technical contribution is an algorithm for computing the exact gradient of the approximate log marginal likelihood. This leads to efficient maximum likelihood via quasi-Newton optimization that is demonstrated to be faster than existing approaches based on finite-differenced gradients or derivative-free optimization. The method is specialized to Bernoulli mixed models with multivariate, correlated Gaussian random effects; here computations are performed using an inverse log-Cholesky parameterization of the Gaussian density that involves no matrix decomposition during model fitting, while Wald confidence intervals are provided for variance parameters on the original scale. Simulations give evidence of these intervals attaining nominal coverage if enough quadrature points are used, for data comprised of a large number of very small groups exhibiting large between-group heterogeneity. In contrast, the Laplace approximation is shown to give especially poor coverage and high bias for data comprised of a large number of small groups. Adaptive quadrature mitigates this, and the methods in this paper improve the computational feasibility of this more accurate method. All results may be reproduced using code available at \url{//github.com/awstringer1/aghmm-paper-code}.

This paper studies the problem of learning interactive recommender systems from logged feedbacks without any exploration in online environments. We address the problem by proposing a general offline reinforcement learning framework for recommendation, which enables maximizing cumulative user rewards without online exploration. Specifically, we first introduce a probabilistic generative model for interactive recommendation, and then propose an effective inference algorithm for discrete and stochastic policy learning based on logged feedbacks. In order to perform offline learning more effectively, we propose five approaches to minimize the distribution mismatch between the logging policy and recommendation policy: support constraints, supervised regularization, policy constraints, dual constraints and reward extrapolation. We conduct extensive experiments on two public real-world datasets, demonstrating that the proposed methods can achieve superior performance over existing supervised learning and reinforcement learning methods for recommendation.

Statistical models are an essential tool to model, forecast and understand the hydrological processes in watersheds. In particular, the modeling of time lags associated with the time between rainfall occurrence and subsequent changes in streamflow, is of high practical importance. Since water can take a variety of flowpaths to generate streamflow, a series of distinct runoff pulses from different flowpath may combine to create the observed streamflow time series. Current state-of-the-art models are not able to sufficiently confront the problem complexity with interpretable parametrization, which would allow insights into the dynamics of the distinct flow paths for hydrological inference. The proposed Gaussian Sliding Windows Regression Model targets this problem by combining the concept of multiple windows sliding along the time axis with multiple linear regression. The window kernels, which indicate the weights applied to different time lags, are implemented via Gaussian-shaped kernels. As a result, each window can represent one flowpath and, thus, offers the potential for straightforward process inference. Experiments on simulated and real-world scenarios underline that the proposed model achieves accurate parameter estimates and competitive predictive performance, while fostering explainable and interpretable hydrological modeling.

The past decade has witnessed substantial growth of data-driven speech enhancement (SE) techniques thanks to deep learning. While existing approaches have shown impressive performance in some common datasets, most of them are designed only for a single condition (e.g., single-channel, multi-channel, or a fixed sampling frequency) or only consider a single task (e.g., denoising or dereverberation). Currently, there is no universal SE approach that can effectively handle diverse input conditions with a single model. In this paper, we make the first attempt to investigate this line of research. First, we devise a single SE model that is independent of microphone channels, signal lengths, and sampling frequencies. Second, we design a universal SE benchmark by combining existing public corpora with multiple conditions. Our experiments on a wide range of datasets show that the proposed single model can successfully handle diverse conditions with strong performance.

Large discrete action spaces (LDAS) remain a central challenge in reinforcement learning. Existing solution approaches can handle unstructured LDAS with up to a few million actions. However, many real-world applications in logistics, production, and transportation systems have combinatorial action spaces, whose size grows well beyond millions of actions, even on small instances. Fortunately, such action spaces exhibit structure, e.g., equally spaced discrete resource units. With this work, we focus on handling structured LDAS (SLDAS) with sizes that cannot be handled by current benchmarks: we propose Dynamic Neighborhood Construction (DNC), a novel exploitation paradigm for SLDAS. We present a scalable neighborhood exploration heuristic that utilizes this paradigm and efficiently explores the discrete neighborhood around the continuous proxy action in structured action spaces with up to $10^{73}$ actions. We demonstrate the performance of our method by benchmarking it against three state-of-the-art approaches designed for large discrete action spaces across two distinct environments. Our results show that DNC matches or outperforms state-of-the-art approaches while being computationally more efficient. Furthermore, our method scales to action spaces that so far remained computationally intractable for existing methodologies.

To improve the performance in identifying the faults under strong noise for rotating machinery, this paper presents a dynamic feature reconstruction signal graph method, which plays the key role of the proposed end-to-end fault diagnosis model. Specifically, the original mechanical signal is first decomposed by wavelet packet decomposition (WPD) to obtain multiple subbands including coefficient matrix. Then, with originally defined two feature extraction factors MDD and DDD, a dynamic feature selection method based on L2 energy norm (DFSL) is proposed, which can dynamically select the feature coefficient matrix of WPD based on the difference in the distribution of norm energy, enabling each sub-signal to take adaptive signal reconstruction. Next the coefficient matrices of the optimal feature sub-bands are reconstructed and reorganized to obtain the feature signal graphs. Finally, deep features are extracted from the feature signal graphs by 2D-Convolutional neural network (2D-CNN). Experimental results on a public data platform of a bearing and our laboratory platform of robot grinding show that this method is better than the existing methods under different noise intensities.

In this paper, we proposed to apply meta learning approach for low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on unseen target language, via recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, since MAML's model-agnostic property, this paper also opens new research direction of applying meta learning to more speech-related applications.

This paper surveys the machine learning literature and presents machine learning as optimization models. Such models can benefit from the advancement of numerical optimization techniques which have already played a distinctive role in several machine learning settings. Particularly, mathematical optimization models are presented for commonly used machine learning approaches for regression, classification, clustering, and deep neural networks as well new emerging applications in machine teaching and empirical model learning. The strengths and the shortcomings of these models are discussed and potential research directions are highlighted.

北京阿比特科技有限公司