
This paper is concerned with inference on the conditional mean of a high-dimensional linear model when outcomes are missing at random. We propose an estimator which combines a Lasso pilot estimate of the regression function with a bias correction term based on the weighted residuals of the Lasso regression. The weights depend on estimates of the missingness probabilities (propensity scores) and solve a convex optimization program that trades off bias and variance optimally. Provided that the propensity scores can be consistently estimated, the proposed estimator is asymptotically normal and semi-parametrically efficient among all asymptotically linear estimators. The rate at which the propensity scores are consistent is essentially irrelevant, allowing us to estimate them via modern machine learning techniques. We validate the finite-sample performance of the proposed estimator through comparative simulation studies and the real-world problem of inferring the stellar masses of galaxies in the Sloan Digital Sky Survey.

Related content

Membership inference attacks (MIA) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals. This article explores the fundamental statistical limitations associated with MIAs on machine learning models. More precisely, we first derive the statistical quantity that governs the effectiveness and success of such attacks. Then, we investigate several situations for which we provide bounds on this quantity of interest. This allows us to infer the accuracy of potential attacks as a function of the number of samples and other structural parameters of learning models, which in some cases can be directly estimated from the dataset.

This paper considers a generalized multiple-input multiple-output (GMIMO) system under practical assumptions, such as massive antennas, practical channel coding, arbitrary input distributions, and general right-unitarily-invariant channel matrices (covering Rayleigh fading and certain ill-conditioned and correlated channel matrices). Orthogonal/vector approximate message passing (OAMP/VAMP) has been proved to be information-theoretically optimal in GMIMO, but it is limited by its high complexity. Meanwhile, low-complexity memory approximate message passing (MAMP) was shown to be Bayes optimal in GMIMO, but channel coding was ignored. Therefore, how to design a low-complexity and information-theoretically optimal receiver for GMIMO is still an open issue. In this paper, we propose an information-theoretically optimal MAMP receiver for coded GMIMO, whose achievable rate analysis and optimal coding principle are provided to demonstrate its information-theoretic optimality. Specifically, state evolution (SE) for MAMP is intricately multi-dimensional because of the nature of local memory detection. To address this, a fixed-point consistency lemma is proposed to derive the simplified variational SE (VSE) for MAMP, based on which the achievable rate of MAMP is calculated, and the optimal coding principle is derived to maximize the achievable rate. Subsequently, we prove the information-theoretic optimality of MAMP. Numerical results show that the finite-length performance of MAMP with optimized LDPC codes is about 1.0 to 2.7 dB away from the associated constrained capacities. It is worth noting that MAMP can achieve the same performance as OAMP/VAMP with 0.4% of the time consumption for large-scale systems.

In many scenarios, one uses a large training set to train a model with the goal of performing well on a smaller testing set with a different distribution. Learning a weight for each data point of the training set is an appealing solution, as it ideally allows one to automatically learn the importance of each training point for generalization on the testing set. This task is usually formalized as a bilevel optimization problem. Classical bilevel solvers are based on a warm-start strategy where both the parameters of the models and the data weights are learned at the same time. We show that this joint dynamic may lead to sub-optimal solutions, for which the final data weights are very sparse. This finding illustrates the difficulty of data reweighting and offers a clue as to why this method is rarely used in practice.

Identification of optimal dose combinations in early phase dose-finding trials is challenging, due to the trade-off between precisely estimating the many parameters required to flexibly model the dose-response surface, and the small sample sizes in early phase trials. Existing methods often restrict the search to pre-defined dose combinations, which may fail to identify regions of optimality in the dose combination space. These difficulties are even more pertinent in the context of personalized dose-finding, where patient characteristics are used to identify tailored optimal dose combinations. To overcome these challenges, we propose the use of Bayesian optimization for finding optimal dose combinations in standard ("one size fits all") and personalized multi-agent dose-finding trials. Bayesian optimization is a method for estimating the global optima of expensive-to-evaluate objective functions. The objective function is approximated by a surrogate model, commonly a Gaussian process, paired with a sequential design strategy to select the next point via an acquisition function. This work is motivated by an industry-sponsored problem, where focus is on optimizing a dual-agent therapy in a setting featuring minimal toxicity. To compare the performance of the standard and personalized methods under this setting, simulation studies are performed for a variety of scenarios. Our study concludes that taking a personalized approach is highly beneficial in the presence of heterogeneity.
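The surrogate-plus-acquisition loop described above can be illustrated with a single acquisition step. The abstract does not say which acquisition function is used; the sketch below assumes expected improvement for a maximization problem, and the names `expected_improvement` and `pick_next_dose` as well as the candidate format are hypothetical. The posterior mean and standard deviation are assumed to come from some surrogate model, such as a Gaussian process, that is fitted elsewhere.

```python
import math


def expected_improvement(mean: float, std: float, best_so_far: float) -> float:
    """Expected improvement over best_so_far under a Gaussian posterior.

    Assumes the objective is maximized; mean and std are the surrogate's
    posterior mean and standard deviation at one candidate point.
    """
    if std < 0:
        raise ValueError("std must be non-negative")
    improvement = mean - best_so_far
    if std == 0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + std * pdf


def pick_next_dose(candidates, best_so_far):
    """Return the candidate with the largest expected improvement.

    candidates: iterable of (dose_combination, posterior_mean, posterior_std)
    tuples (a hypothetical format chosen for this sketch).
    """
    return max(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best_so_far),
    )[0]
```

In a sequential design, the chosen dose combination would be evaluated, the surrogate refitted, and the step repeated; a candidate with a modest mean but large uncertainty can win over one with a slightly higher mean and no uncertainty, which is how the acquisition function balances exploration and exploitation.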

In this paper, we design a regularization-free algorithm for high-dimensional support vector machines (SVMs) by integrating over-parameterization with Nesterov's smoothing method, and provide theoretical guarantees for the induced implicit regularization phenomenon. In particular, we construct an over-parameterized hinge loss function and estimate the true parameters by leveraging regularization-free gradient descent on this loss function. The utilization of Nesterov's method enhances the computational efficiency of our algorithm, especially in terms of determining the stopping criterion and reducing computational complexity. With appropriate choices of initialization, step size, and smoothness parameter, we demonstrate that unregularized gradient descent achieves a near-oracle statistical convergence rate. Additionally, we verify our theoretical findings through a variety of numerical experiments and compare the proposed method with explicit regularization. Our results illustrate the advantages of employing implicit regularization via gradient descent in conjunction with over-parameterization in sparse SVMs.
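To make the smoothing step concrete, here is a minimal sketch of a Nesterov-smoothed hinge loss. It assumes the common quadratic proximity term, which yields a Huber-like loss with smoothness parameter `mu`; the abstract does not state the exact smoothing or the over-parameterization used, so the function names and this particular form are illustrative assumptions rather than the authors' construction.

```python
def smoothed_hinge(margin: float, mu: float) -> float:
    """Smoothed version of the hinge loss max(0, 1 - margin).

    Obtained from max over u in [0, 1] of u * (1 - margin) - (mu / 2) * u**2,
    which is differentiable everywhere for mu > 0.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    slack = 1.0 - margin
    if slack <= 0.0:
        return 0.0                           # correctly classified with margin
    if slack < mu:
        return slack * slack / (2.0 * mu)    # quadratic zone near the kink
    return slack - mu / 2.0                  # linear zone, shifted by mu / 2


def smoothed_hinge_grad(margin: float, mu: float) -> float:
    """Derivative of smoothed_hinge with respect to the margin."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    slack = 1.0 - margin
    return -min(1.0, max(0.0, slack / mu))
```

Because the gradient is continuous, plain gradient descent can be run on this loss without subgradient handling, and as `mu` shrinks the smoothed loss approaches the original hinge loss (the gap is at most `mu / 2`).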

Matroid intersection is a classical optimization problem where, given two matroids over the same ground set, the goal is to find the largest common independent set. In this paper, we show that there exists a certain "sparsifier": a subset of elements, of size $O(|S^{opt}| \cdot 1/\varepsilon)$, where $S^{opt}$ denotes the optimal solution, that is guaranteed to contain a $3/2 + \varepsilon$ approximation, while guaranteeing certain robustness properties. We call such a small subset a Density Constrained Subset (DCS), which is inspired by the Edge-Degree Constrained Subgraph (EDCS) [Bernstein and Stein, 2015], originally designed for the maximum cardinality matching problem in a graph. Our proof is constructive and hinges on a greedy decomposition of matroids, which we call the density-based decomposition. We show that this sparsifier has certain robustness properties that can be used in one-way communication and random-order streaming models.

This paper proposes a method to estimate the class separability of an unlabeled text dataset by inspecting the topological characteristics of sentence-transformer embeddings of the text. Experiments conducted involve both binary and multi-class cases, with balanced and imbalanced scenarios. The results demonstrate a clear correlation and a better consistency between the proposed method and other separability and classification metrics, such as Thornton's method and the AUC score of a logistic regression classifier, as well as unsupervised methods. Finally, we empirically show that the proposed method can be part of a stopping criterion for fine-tuning language-model classifiers. By monitoring the class separability of the embedding space after each training iteration, we can detect when the training process stops improving the separability of the embeddings without using additional labels.
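For reference, Thornton's method, one of the comparison metrics named above, can be sketched in a few lines. It is a supervised index (it needs labels, unlike the proposed topological method): the fraction of points whose nearest neighbor carries the same label. The function name `thornton_index` and the use of Euclidean distance on raw embedding vectors are assumptions made for this sketch.

```python
import math


def thornton_index(points, labels):
    """Fraction of points whose nearest other point has the same label.

    points: sequence of equal-length numeric vectors (e.g. embeddings).
    labels: sequence of class labels, one per point.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    if len(points) < 2:
        raise ValueError("need at least two points")
    same = 0
    for i, p in enumerate(points):
        nearest, nearest_dist = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            if d < nearest_dist:
                nearest, nearest_dist = j, d
        if labels[nearest] == labels[i]:
            same += 1
    return same / len(points)
```

Values near 1 indicate well-separated classes, while values near the chance level indicate heavily interleaved classes. This brute-force version is quadratic in the number of points, so a nearest-neighbor index would be preferable for large embedding sets.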

This study enhances option pricing by presenting a new pricing model, the fractional-order Black-Scholes-Merton (FOBSM) model, which is based on the Black-Scholes-Merton (BSM) model. The main goal is to improve the precision and realism of option prices, matching them more closely with the financial landscape. The approach integrates the strengths of both the BSM model and neural networks (NN) with complex diffusion dynamics. This study emphasizes the need to take fractional derivatives into account when analyzing financial market dynamics. Since FOBSM captures memory characteristics in sequential data, it is better at simulating real-world systems than integer-order models. The findings reveal that, in complex diffusion dynamics, this hybrid approach to option pricing improves the accuracy of price predictions. The key contribution of this work is the development of a novel option pricing model (FOBSM) that leverages fractional calculus and neural networks to enhance accuracy in capturing complex diffusion dynamics and memory effects in financial data.
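As a point of comparison for the fractional-order model, the classical integer-order BSM price of a European call has a closed form. The sketch below implements only that standard baseline, not FOBSM, whose fractional and neural-network components are not specified in the abstract; the function and parameter names are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call_price(spot: float, strike: float, rate: float,
                   sigma: float, maturity: float) -> float:
    """Classical Black-Scholes-Merton price of a European call.

    Assumes no dividends, a constant risk-free rate and constant volatility;
    maturity is in years, rate and sigma are annualized.
    """
    if min(spot, strike, sigma, maturity) <= 0:
        raise ValueError("spot, strike, sigma and maturity must be positive")
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
```

For an at-the-money call with `spot=100`, `strike=100`, `rate=0.05`, `sigma=0.2` and `maturity=1.0`, this gives a price of about 10.45. A fractional-order model replaces the integer-order derivatives of the underlying equation, so its prices would generally differ from this baseline.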

We propose a novel method for automatic reasoning on knowledge graphs based on debate dynamics. The main idea is to frame the task of triple classification as a debate game between two reinforcement learning agents which extract arguments -- paths in the knowledge graph -- with the goal to promote the fact being true (thesis) or the fact being false (antithesis), respectively. Based on these arguments, a binary classifier, called the judge, decides whether the fact is true or false. The two agents can be considered as sparse, adversarial feature generators that present interpretable evidence for either the thesis or the antithesis. In contrast to other black-box methods, the arguments allow users to get an understanding of the decision of the judge. Since the focus of this work is to create an explainable method that maintains a competitive predictive accuracy, we benchmark our method on the triple classification and link prediction task. Thereby, we find that our method outperforms several baselines on the benchmark datasets FB15k-237, WN18RR, and Hetionet. We also conduct a survey and find that the extracted arguments are informative for users.

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
