亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels have been proposed as a solution. However, fitting models independently does not make efficient use of all available data. Conversely, fitting a single shared model to the full data set relies on imputation which often leads to biased results when missingness depends on unobserved factors. We propose an alternative approach, called sharing pattern submodels, which i) makes predictions that are robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels, and iii) has a short description, enabling improved interpretability. Parameter sharing is enforced through sparsity-inducing regularization which we prove leads to consistent estimation. Finally, we give conditions for when a sharing model is optimal, even when both missingness and the target outcome depend on unobserved variables. Classification and regression experiments on synthetic and real-world data sets demonstrate that our models achieve a favorable tradeoff between pattern specialization and information sharing.

相關內容

Interacting particle or agent systems that display a rich variety of swarming behaviours are ubiquitous in science and engineering. A fundamental and challenging goal is to understand the link between individual interaction rules and swarming. In this paper, we study the data-driven discovery of a second-order particle swarming model that describes the evolution of $N$ particles in $\mathbb{R}^d$ under radial interactions. We propose a learning approach that models the latent radial interaction function as Gaussian processes, which can simultaneously fulfill two inference goals: one is the nonparametric inference of {the} interaction function with pointwise uncertainty quantification, and the other one is the inference of unknown scalar parameters in the non-collective friction forces of the system. We formulate the learning problem as a statistical inverse problem and provide a detailed analysis of recoverability conditions, establishing that a coercivity condition is sufficient for recoverability. Given data collected from $M$ i.i.d trajectories with independent Gaussian observational noise, we provide a finite-sample analysis, showing that our posterior mean estimator converges in a Reproducing kernel Hilbert space norm, at an optimal rate in $M$ equal to the one in the classical 1-dimensional Kernel Ridge regression. As a byproduct, we show we can obtain a parametric learning rate in $M$ for the posterior marginal variance using $L^{\infty}$ norm, and the rate could also involve $N$ and $L$ (the number of observation time instances for each trajectory), depending on the condition number of the inverse problem. Numerical results on systems that exhibit different swarming behaviors demonstrate efficient learning of our approach from scarce noisy trajectory data.

The aim of this work is to study the effect of imposing different types of boundary conditions in the Exner model describing shallow water equations with sedimentation, when a train of waves is imposed at one of the boundaries. The numerical solver is a second order finite-volume scheme, with semi-implicit time discretization based on Implicit-Explicit (IMEX) schemes, which guarantee better stability properties than explicit ones. We show the effect of spurious reflected waves at the right edge of the computational domain, propose two remedies, and show how such spurious effects can be reduced by suitable non-reflecting boundary conditions.

Large-scale e-commercial platforms in the real-world usually contain various recommendation scenarios (domains) to meet demands of diverse customer groups. Multi-Domain Recommendation (MDR), which aims to jointly improve recommendations on all domains and easily scales to thousands of domains, has attracted increasing attention from practitioners and researchers. Existing MDR methods usually employ a shared structure and several specific components to respectively leverage reusable features and domain-specific information. However, data distribution differs across domains, making it challenging to develop a general model that can be applied to all circumstances. Additionally, during training, shared parameters often suffer from the domain conflict while specific parameters are inclined to overfitting on data sparsity domains. we first present a scalable MDR platform served in Taobao that enables to provide services for thousands of domains without specialists involved. To address the problems of MDR methods, we propose a novel model agnostic learning framework, namely MAMDR, for the multi-domain recommendation. Specifically, we first propose a Domain Negotiation (DN) strategy to alleviate the conflict between domains. Then, we develop a Domain Regularization (DR) to improve the generalizability of specific parameters by learning from other domains. We integrate these components into a unified framework and present MAMDR, which can be applied to any model structure to perform multi-domain recommendation. Finally, we present a large-scale implementation of MAMDR in the Taobao application and construct various public MDR benchmark datasets which can be used for following studies. Extensive experiments on both benchmark datasets and industry datasets demonstrate the effectiveness and generalizability of MAMDR.

Periodicity detection is an important task in time series analysis, but still a challenging problem due to the diverse characteristics of time series data like abrupt trend change, outlier, noise, and especially block missing data. In this paper, we propose a robust and effective periodicity detection algorithm for time series with block missing data. We first design a robust trend filter to remove the interference of complicated trend patterns under missing data. Then, we propose a robust autocorrelation function (ACF) that can handle missing values and outliers effectively. We rigorously prove that the proposed robust ACF can still work well when the length of the missing block is less than $1/3$ of the period length. Last, by combining the time-frequency information, our algorithm can generate the period length accurately. The experimental results demonstrate that our algorithm outperforms existing periodicity detection algorithms on real-world time series datasets.

The Cox model is an indispensable tool for time-to-event analysis, particularly in biomedical research. However, medicine is undergoing a profound transformation, generating data at an unprecedented scale, which opens new frontiers to study and understand diseases. With the wealth of data collected, new challenges for statistical inference arise, as datasets are often high dimensional, exhibit an increasing number of measurements at irregularly spaced time points, and are simply too large to fit in memory. Many current implementations for time-to-event analysis are ill-suited for these problems as inference is computationally demanding and requires access to the full data at once. Here we propose a Bayesian version for the counting process representation of Cox's partial likelihood for efficient inference on large-scale datasets with millions of data points and thousands of time-dependent covariates. Through the combination of stochastic variational inference and a reweighting of the log-likelihood, we obtain an approximation for the posterior distribution that factorizes over subsamples of the data, enabling the analysis in big data settings. Crucially, the method produces viable uncertainty estimates for large-scale and high-dimensional datasets. We show the utility of our method through a simulation study and an application to myocardial infarction in the UK Biobank.

The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. The null hypothesis is that the predictive model is calibrated, while the alternative hypothesis is that the deviation from calibration is sufficiently large. We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. When the conditional class probabilities are H\"older continuous, we propose T-Cal, a minimax optimal test for calibration based on a debiased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE). We further propose Adaptive T-Cal, a version that is adaptive to unknown smoothness. We verify our theoretical findings with a broad range of experiments, including with several popular deep neural net architectures and several standard post-hoc calibration methods. T-Cal is a practical general-purpose tool, which -- combined with classical tests for discrete-valued predictors -- can be used to test the calibration of virtually any probabilistic classification method.

Hyper-parameter optimization is one of the most tedious yet crucial steps in training machine learning models. There are numerous methods for this vital model-building stage, ranging from domain-specific manual tuning guidelines suggested by the oracles to the utilization of general-purpose black-box optimization techniques. This paper proposes an agent-based collaborative technique for finding near-optimal values for any arbitrary set of hyper-parameters (or decision variables) in a machine learning model (or general function optimization problem). The developed method forms a hierarchical agent-based architecture for the distribution of the searching operations at different dimensions and employs a cooperative searching procedure based on an adaptive width-based random sampling technique to locate the optima. The behavior of the presented model, specifically against the changes in its design parameters, is investigated in both machine learning and global function optimization applications, and its performance is compared with that of two randomized tuning strategies that are commonly used in practice. According to the empirical results, the proposed model outperformed the compared methods in the experimented classification, regression, and multi-dimensional function optimization tasks, notably in a higher number of dimensions and in the presence of limited on-device computational resources.

In complex survey data, each sampled observation has assigned a sampling weight, indicating the number of units that it represents in the population. Whether sampling weights should or not be considered in the estimation process of model parameters is a question that still continues to generate much discussion among researchers in different fields. We aim to contribute to this debate by means of a real data based simulation study in the framework of logistic regression models. In order to study their performance, three methods have been considered for estimating the coefficients of the logistic regression model: a) the unweighted model, b) the weighted model, and c) the unweighted mixed model. The results suggest the use of the weighted logistic regression model, showing the importance of using sampling weights in the estimation of the model parameters.

While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, this may not account for perturbations encountered in several real world settings. In many such cases although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space, without having access to the data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations, and the outer minimization finding model parameters by optimizing the loss on adversarial perturbations generated from the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained using our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.

Knowledge graphs capture structured information and relations between a set of entities or items. As such they represent an attractive source of information that could help improve recommender systems. However existing approaches in this domain rely on manual feature engineering and do not allow for end-to-end training. Here we propose knowledge-aware graph neural networks with label smoothness regularization to provide better recommendations. Conceptually, our approach computes user-specific item embeddings by first applying a trainable function that identifies important knowledge graph relationships for a given user. This way we transform the knowledge graph into a user-specific weighted graph and then applies a graph neural network to compute personalized item embeddings. To provide better inductive bias, we use label smoothness, which assumes that adjacent items in the knowledge graph are likely to have similar user relevance labels/scores. Label smoothness provides regularization over edge weights and we prove that it is equivalent to a label propagation scheme on a graph. Finally, we combine knowledge-aware graph neural networks and label smoothness and present the unified model. Experiment results show that our method outperforms strong baselines in four datasets. It also achieves strong performance in the scenario where user-item interactions are sparse.

北京阿比特科技有限公司