
Missing data is a common problem in practical settings. Various imputation methods have been developed to deal with missing data. However, even though the label is usually available in the training data, common imputation practice relies only on the input and ignores the label. In this work, we illustrate how stacking the label onto the input can significantly improve the imputation of the input. In addition, we propose a classification strategy that initializes the predicted test labels as missing values and stacks the labels with the input for imputation, which allows the label and the input to be imputed at the same time. Moreover, the technique can handle training data with missing labels without any prior imputation and is applicable to continuous, categorical, or mixed-type data. Experiments show promising results in terms of accuracy.
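
A minimal sketch of the label-stacking idea described above, using scikit-learn's IterativeImputer as a stand-in for whatever imputation engine is actually used; the function name, the joint train/test stacking, and the 0.5 threshold for binary labels are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def stacked_impute_classify(X_train, y_train, X_test):
    """Impute inputs and predict test labels jointly by stacking the label
    as an extra column; test labels are initialized as missing values."""
    # Training rows carry their observed labels; test rows get NaN labels.
    train_block = np.column_stack([X_train, y_train.astype(float)])
    test_block = np.column_stack([X_test, np.full(len(X_test), np.nan)])
    stacked = np.vstack([train_block, test_block])

    # Any imputer could be plugged in here; IterativeImputer is one choice.
    imputed = IterativeImputer(max_iter=10, random_state=0).fit_transform(stacked)

    X_imputed = imputed[:, :-1]                # imputed inputs (train + test)
    y_test_pred = imputed[len(X_train):, -1]   # imputed test "labels"
    return X_imputed, (y_test_pred >= 0.5).astype(int)  # threshold for binary labels
```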

Related content

Linear non-Gaussian causal models postulate that each random variable is a linear function of parent variables and non-Gaussian exogenous error terms. We study identification of the linear coefficients when such models contain latent variables. Our focus is on the commonly studied acyclic setting, where each model corresponds to a directed acyclic graph (DAG). For this case, prior literature has demonstrated that connections to overcomplete independent component analysis yield effective criteria to decide parameter identifiability in latent variable models. However, this connection is based on the assumption that the observed variables linearly depend on the latent variables. Departing from this assumption, we treat models that allow for arbitrary non-linear latent confounding. Our main result is a graphical criterion that is necessary and sufficient for deciding the generic identifiability of direct causal effects. Moreover, we provide an algorithmic implementation of the criterion with a run time that is polynomial in the number of observed variables. Finally, we report on estimation heuristics based on the identification result, explore a generalization to models with feedback loops, and provide new results on the identifiability of the causal graph.
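
One way to write down the model class this abstract describes; the notation below is our own illustration, not necessarily the paper's.

```latex
% Illustrative formalization (notation assumed): each observed variable is
% linear in its observed parents, with arbitrary (possibly non-linear)
% latent confounding and non-Gaussian exogenous errors.
\[
  X_j \;=\; \sum_{k \in \mathrm{pa}(j)} \lambda_{jk}\, X_k \;+\; g_j(L) \;+\; \varepsilon_j ,
\]
% where the \lambda_{jk} are the direct causal effects to be identified,
% L collects the latent variables entering through arbitrary functions g_j,
% and the \varepsilon_j are mutually independent and non-Gaussian.
```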

Hate detection has long been a challenging task for the NLP community. The task becomes more complex in a code-mixed environment because the models must understand the context and the hate expressed through language alternation. Compared to the monolingual setup, there is far less work on code-mixed hate because large-scale annotated hate corpora are unavailable for such studies. To overcome this bottleneck, we propose using native-language hate samples. We hypothesise that, in the era of multilingual language models (MLMs), hate in code-mixed settings can be detected by relying largely on native-language samples. Even though the NLP literature reports the effectiveness of MLMs on hate detection in many cross-lingual settings, their extensive evaluation in a code-mixed scenario is yet to be done. This paper attempts to fill this gap through rigorous empirical experiments. We consider the Hindi-English code-mixed setup as a case study, as we have the linguistic expertise for it. Among our key observations: (i) adding native hate samples to the code-mixed training set, even in small quantities, improved the performance of MLMs for code-mixed hate detection; (ii) MLMs trained with native samples alone were observed to detect code-mixed hate to a large extent; (iii) visualisation of attention scores revealed that, when native samples were included in training, MLMs could better focus on the hate-emitting words in the code-mixed context; and (iv) when hate is subjective or sarcastic, naively mixing in native samples does not help much in detecting code-mixed hate. We will release the data and code repository to reproduce the reported results.

Case-control sampling is a commonly used retrospective sampling design to alleviate the imbalanced structure of binary data. When fitting the logistic regression model with case-control data, although the slope parameter of the model can be consistently estimated, the intercept parameter is not identifiable, and the marginal case proportion is not estimable either. We consider situations in which, besides the case-control data from the main study (called the internal study), there also exists summary-level information from related external studies. An empirical-likelihood-based approach is proposed to make inference for the logistic model by incorporating the internal case-control data and the external information. We show that the intercept parameter is identifiable with the help of external information, and then all the regression parameters, as well as the marginal case proportion, can be estimated consistently. The proposed method also accounts for possible variability in the external studies. The resultant estimators are shown to be asymptotically normally distributed, and the asymptotic variance-covariance matrix can be consistently estimated from the case-control data. The optimal way to utilize external information is discussed. Simulation studies are conducted to verify the theoretical findings, and a real data set is analyzed for illustration.
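
The non-identifiability of the intercept is a standard logistic-regression fact; the display below states it in illustrative notation (ours, not the paper's) to show why external information about the marginal case proportion restores identifiability.

```latex
% With prospective model P(Y=1 \mid x) = \operatorname{expit}(\alpha + \beta^{\top}x),
% sampling n_1 cases and n_0 controls from a population with case proportion
% \pi = P(Y=1) shifts the intercept of the fitted retrospective model to
\[
  \alpha^{*} \;=\; \alpha \;+\; \log\frac{n_1}{n_0} \;+\; \log\frac{1-\pi}{\pi},
\]
% so \beta is consistently estimable but \alpha and \pi are confounded.
% External information that pins down \pi (or related functionals) therefore
% identifies \alpha and the marginal case proportion.
```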

Robotic exploration has long captivated researchers aiming to map complex environments efficiently. Techniques such as potential fields and frontier exploration have traditionally been employed in this pursuit, primarily focusing on solitary agents. Recent advancements have shifted towards optimizing exploration efficiency through multiagent systems. However, many existing approaches overlook critical real-world factors, such as broadcast range limitations, communication costs, and coverage overlap. This paper addresses these gaps by proposing a distributed maze exploration strategy (CU-LVP) that assumes constrained broadcast ranges and utilizes Voronoi diagrams for better area partitioning. By adapting traditional multiagent methods to distributed environments with limited broadcast ranges, this study evaluates their performance across diverse maze topologies, demonstrating the efficacy and practical applicability of the proposed method. The code and experimental results supporting this study are available in the following repository: //github.com/manouslinard/multiagent-exploration/.
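
The specifics of CU-LVP are in the linked repository; the snippet below is only a generic sketch of Voronoi-style area partitioning under a broadcast-range constraint, with all function and variable names assumed for illustration.

```python
import numpy as np

def voronoi_partition(free_cells, agent_positions, broadcast_range=np.inf):
    """Assign each free cell to its closest agent (a discrete Voronoi partition).
    Cells farther than the broadcast range from every agent stay unassigned (-1)."""
    cells = np.asarray(free_cells, dtype=float)        # shape (n_cells, 2)
    agents = np.asarray(agent_positions, dtype=float)  # shape (n_agents, 2)
    # Pairwise Euclidean distances between cells and agents.
    dists = np.linalg.norm(cells[:, None, :] - agents[None, :, :], axis=2)
    owner = dists.argmin(axis=1)
    owner[dists.min(axis=1) > broadcast_range] = -1
    return owner  # owner[i] = index of the agent responsible for cell i

# Example: three agents partitioning a handful of maze cells.
print(voronoi_partition([(0, 0), (0, 5), (4, 4), (9, 9)],
                        [(0, 1), (5, 5), (9, 8)],
                        broadcast_range=6.0))
```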

We introduce a method for computing immediately human-interpretable yet accurate classifiers from tabular data. The classifiers obtained are short Boolean formulas, computed by first discretizing the original data and then using feature selection coupled with a very fast algorithm for producing the best possible Boolean classifier for the setting. We demonstrate the approach via 13 experiments, obtaining results with accuracies comparable to those obtained via random forests, XGBoost, and existing results for the same datasets in the literature. In most cases, the accuracy of our method is in fact similar to that of the reference methods, even though the main objective of our study is the immediate interpretability of our classifiers. We also prove a new result on the probability that the classifier we obtain from real-life data corresponds to the ideally best classifier with respect to the background distribution the data comes from.
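
The following toy sketch illustrates what "searching for a short Boolean formula on binarized features" can look like; it is an exhaustive search over small conjunctions and is not the authors' (much faster) algorithm. All names are assumptions.

```python
from itertools import combinations, product
import numpy as np

def best_conjunction(X_bin, y, max_literals=2):
    """Exhaustively search short conjunctions of literals (a feature or its
    negation) and return the most accurate one on the training data."""
    n_features = X_bin.shape[1]
    best_acc, best_formula = 0.0, None
    for k in range(1, max_literals + 1):
        for feats in combinations(range(n_features), k):
            for signs in product([True, False], repeat=k):
                pred = np.ones(len(X_bin), dtype=bool)
                for f, s in zip(feats, signs):
                    lit = X_bin[:, f].astype(bool)
                    pred &= lit if s else ~lit       # conjunction of literals
                acc_pos = (pred == y).mean()
                acc_neg = ((~pred) == y).mean()      # also allow negated output
                acc, negate = max((acc_pos, False), (acc_neg, True))
                if acc > best_acc:
                    best_acc = acc
                    best_formula = (list(zip(feats, signs)), negate)
    return best_formula, best_acc
```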

The problem of binary hypothesis testing between two probability measures is considered. New sharp bounds are derived for the best achievable error probability of such tests based on independent and identically distributed observations. Specifically, the asymmetric version of the problem is examined, where different requirements are placed on the two error probabilities. Accurate nonasymptotic expansions with explicit constants are obtained for the error probability, using tools from large deviations and Gaussian approximation. Examples are shown indicating that, in the asymmetric regime, the approximations suggested by the new bounds are significantly more accurate than the approximations provided by either of the two main earlier approaches -- normal approximation and error exponents.
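
For context, the classical normal-approximation benchmark that the abstract contrasts with can be stated as below (Stein/Strassen-type expansion; the notation is ours and the paper's sharper bounds make the constants explicit).

```latex
% For i.i.d. observations from P versus Q, the smallest type-II error
% \beta_n(\epsilon) over tests with type-I error at most \epsilon satisfies
\[
  -\log \beta_n(\epsilon) \;=\; n\,D(P\|Q) \;+\; \sqrt{n\,V(P\|Q)}\;\Phi^{-1}(\epsilon) \;+\; O(\log n),
\]
% where D is the relative entropy, V the corresponding divergence variance,
% and \Phi^{-1} the standard normal quantile function.
```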

The accuracy of finite element solutions is closely tied to mesh quality. In particular, geometrically nonlinear problems involving large and strongly localized deformations often result in prohibitively large element distortions. In this work, we propose a novel mesh regularization approach that restores a non-distorted, high-quality mesh in an adaptive manner without the need for expensive re-meshing procedures. The core idea of this approach lies in the definition of a finite element distortion potential considering contributions from different distortion modes such as skewness and aspect ratio of the elements. The regularized mesh is found by minimization of this potential. Moreover, based on the concept of spatial localization functions, the method allows tailored requirements on mesh resolution and quality to be specified for regions with strongly localized mechanical deformation and mesh distortion. In addition, while existing mesh regularization schemes often keep the boundary nodes of the discretization fixed, we propose a mesh-sliding algorithm based on variationally consistent mortar methods that allows unrestricted tangential motion of nodes along the problem boundary. Especially for problems involving significant surface deformation (e.g., frictional contact), this approach allows for improved mesh relaxation compared to schemes with fixed boundary nodes. To transfer data such as tensor-valued history variables of the material model from the old (distorted) to the new (regularized) mesh, a structure-preserving invariant interpolation scheme for second-order tensors is employed, which has been proposed in our previous work and is designed to preserve important mechanical properties of tensor-valued data such as objectivity and positive definiteness.
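
A schematic form of the distortion potential described above; the symbols are our own shorthand, and the actual functional, weighting, and distortion measures in the paper may differ.

```latex
% Schematic distortion potential: a weighted sum of per-element penalties,
% one per distortion mode (skewness, aspect ratio, ...), minimized over the
% nodal positions \mathbf{x} to obtain the regularized mesh.
\[
  \Pi_{\mathrm{dist}}(\mathbf{x}) \;=\; \sum_{e} \sum_{m \in \{\mathrm{skew},\,\mathrm{aspect},\,\dots\}}
  w_m\!\left(\mathbf{x}_e\right)\, \varphi_m\!\left(\mathbf{x}_e\right),
\]
% where \varphi_m penalizes one distortion mode of element e and the weights
% w_m encode the spatial localization functions that tighten the requirements
% in regions of strongly localized deformation.
```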

Reinforcement learning is commonly concerned with problems of maximizing accumulated rewards in Markov decision processes. Oftentimes, a certain goal state or a subset of the state space attains maximal reward. In such a case, the environment may be considered solved when the goal is reached. Whereas numerous techniques, learning-based or not, exist for solving environments, doing so optimally is the biggest challenge; for instance, one may choose a reward rate that penalizes the action effort. Reinforcement learning is currently among the most actively developed frameworks for solving environments optimally by virtue of maximizing accumulated reward, in other words, returns. Yet, tuning agents is a notoriously hard task, as reported in a series of works. Our aim here is to help the agent learn a near-optimal policy efficiently while ensuring the goal-reaching property of some basis policy that merely solves the environment. We suggest an algorithm that is fairly flexible and can be used to augment practically any agent, as long as it includes a critic. A formal proof of the goal-reaching property is provided. Simulation experiments on six problems under five agents, including the benchmarked one, provide empirical evidence that learning can indeed be boosted while the goal-reaching property is ensured.
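
One plausible reading of such a critic-based augmentation is a runtime switch between the learning agent and the goal-reaching basis policy; the acceptance condition below, and all class and method names, are assumptions made purely to illustrate the idea, not the paper's actual mechanism.

```python
import numpy as np

class GoalReachingWrapper:
    """Hedged sketch: take the learning agent's action when its critic
    certifies progress; otherwise fall back to a basis policy that is
    known to reach the goal (so the goal-reaching property is preserved)."""

    def __init__(self, agent, basis_policy, critic, min_improvement=1e-3):
        self.agent = agent                  # any agent exposing .act(obs)
        self.basis_policy = basis_policy    # merely solves the environment
        self.critic = critic                # critic(obs) -> value estimate
        self.min_improvement = min_improvement
        self._best_value = -np.inf

    def act(self, obs):
        value = self.critic(obs)
        if value >= self._best_value + self.min_improvement:
            self._best_value = value        # critic certifies progress
            return self.agent.act(obs)
        return self.basis_policy(obs)       # safe fallback toward the goal
```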

In many statistical modeling problems, such as classification and regression, it is common to encounter sparse and blocky coefficients. The sparse fused Lasso is specifically designed to recover these sparse and blocky structured features, especially in cases where the design matrix has ultrahigh dimensions, meaning that the number of features significantly surpasses the number of samples. The quantile loss is a well-known robust loss function that is widely used in statistical modeling. In this paper, we propose a new sparse fused Lasso classification model and develop a unified multi-block linearized alternating direction method of multipliers (ADMM) algorithm that effectively selects sparse and blocky features for regression and classification. Our algorithm is proven to converge with a derived linear convergence rate. Additionally, it has a significant advantage over existing methods for solving ultrahigh-dimensional sparse fused Lasso regression and classification models due to its lower time complexity. Note that the algorithm can be easily extended to solve various existing fused Lasso models. Finally, we present numerical results for several synthetic and real-world examples, which demonstrate the robustness, scalability, and accuracy of the proposed classification model and algorithm.
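
For reference, the standard sparse fused Lasso objective with the quantile (check) loss is shown below in the regression form; the paper's classification model adapts the loss term, so this display is a baseline formulation rather than the proposed model itself.

```latex
\[
  \min_{\beta}\; \sum_{i=1}^{n} \rho_{\tau}\!\left(y_i - x_i^{\top}\beta\right)
  \;+\; \lambda_1 \|\beta\|_1
  \;+\; \lambda_2 \sum_{j=2}^{p} \left|\beta_j - \beta_{j-1}\right|,
  \qquad \rho_{\tau}(u) = u\left(\tau - \mathbf{1}\{u < 0\}\right),
\]
% where the \ell_1 term induces sparsity, the total-variation (fusion) term
% induces blocky coefficient structure, and the multi-block (linearized) ADMM
% handles these nonsmooth terms in separate blocks.
```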

We consider the problem of sampling a high-dimensional multimodal target probability measure. We assume that a good proposal kernel to move only a subset of the degrees of freedom (also known as collective variables) is known a priori. This proposal kernel can, for example, be built using normalizing flows. We show how to extend the move from the collective-variable space to the full space and how to implement an accept-reject step in order to obtain a chain that is reversible with respect to the target probability measure. The accept-reject step does not require knowing the marginal of the original measure in the collective variables (namely, the free energy). The obtained algorithm admits several variants, some of them very close to methods previously proposed in the literature. We show how the obtained acceptance ratio can be expressed in terms of the work which appears in the Jarzynski-Crooks equality, at least for some variants. Numerical illustrations demonstrate the efficiency of the approach on various simple test cases and allow us to compare the variants of the algorithm.
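
The accept-reject step is an instance of the generic Metropolis-Hastings rule recalled below; how to evaluate it for moves proposed in collective-variable space and extended to the full space, without knowledge of the free energy, is exactly the paper's contribution, so the display is only the textbook form in our notation.

```latex
% Generic Metropolis--Hastings acceptance probability for target \pi and
% proposal kernel q (here, the full-space extension of the collective-variable
% proposal, e.g. built from a normalizing flow):
\[
  a(x \to x') \;=\; \min\!\left(1,\;
  \frac{\pi(x')\, q(x' \to x)}{\pi(x)\, q(x \to x')}\right).
\]
```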
