亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

This paper introduces the Trinary decision tree, an algorithm designed to improve the handling of missing data in decision tree regressors and classifiers. Unlike other approaches, the Trinary decision tree does not assume that missing values contain any information about the response. Both theoretical calculations on estimator bias and numerical illustrations using real data sets are presented to compare its performance with established algorithms in different missing data scenarios (Missing Completely at Random (MCAR), and Informative Missingness (IM)). Notably, the Trinary tree outperforms its peers in MCAR settings, especially when data is only missing out-of-sample, while lacking behind in IM settings. A hybrid model, the TrinaryMIA tree, which combines the Trinary tree and the Missing In Attributes (MIA) approach, shows robust performance in all types of missingness. Despite the potential drawback of slower training speed, the Trinary tree offers a promising and more accurate method of handling missing data in decision tree algorithms.

相關內容

決策樹(Decision Tree)是在已知各種情況發生概率的基礎上,通過構成決策樹來求取凈現值的期望值大于等于零的概率,評價項目風險,判斷其可行性的決策分析方法,是直觀運用概率分析的一種圖解法。由于這種決策分支畫成圖形很像一棵樹的枝干,故稱決策樹。在機器學習中,決策樹是一個預測模型,他代表的是對象屬性與對象值之間的一種映射關系。Entropy = 系統的凌亂程度,使用算法ID3, C4.5和C5.0生成樹算法使用熵。這一度量是基于信息學理論中熵的概念。 決策樹是一種樹形結構,其中每個內部節點表示一個屬性上的測試,每個分支代表一個測試輸出,每個葉節點代表一種類別。 分類樹(決策樹)是一種十分常用的分類方法。他是一種監管學習,所謂監管學習就是給定一堆樣本,每個樣本都有一組屬性和一個類別,這些類別是事先確定的,那么通過學習得到一個分類器,這個分類器能夠對新出現的對象給出正確的分類。這樣的機器學習就被稱之為監督學習。

知識薈萃

精品入門和進階教程、論文和代碼整理等

更多

查看相關VIP內容、論文、資訊等

This work proposes the extended functional tensor train (EFTT) format for compressing and working with multivariate functions on tensor product domains. Our compression algorithm combines tensorized Chebyshev interpolation with a low-rank approximation algorithm that is entirely based on function evaluations. Compared to existing methods based on the functional tensor train format, the adaptivity of our approach often results in reducing the required storage, sometimes considerably, while achieving the same accuracy. In particular, we reduce the number of function evaluations required to achieve a prescribed accuracy by up to over 96% compared to the algorithm from [Gorodetsky, Karaman and Marzouk, Comput. Methods Appl. Mech. Eng., 347 (2019)].

Some mechanical systems, that are modeled to have inelastic collisions, nonetheless possess energy-conserving intermittent-contact solutions, known as collisionless solutions. Such a solution, representing a persistent hopping or walking across a level ground, may be important for understanding animal locomotion or for designing efficient walking machines. So far, collisionless motion has been analytically studied in simple two degrees of freedom (DOF) systems, or in a system that decouples into 2-DOF subsystems in the harmonic approximation. In this paper we extend the consideration to a N-DOF system, recovering the known solutions as a special N = 2 case of the general formulation. We show that in the harmonic approximation the collisionless solution is determined by the spectrum of the system. We formulate a solution existence condition, which requires the presence of at least one oscillating normal mode in the most constrained phase of the motion. An application of the developed general framework is illustrated by finding a collisionless solution for a rocking motion of a biped with an armed standing torso.

Interpolation of data on non-Euclidean spaces is an active research area fostered by its numerous applications. This work considers the Hermite interpolation problem: finding a sufficiently smooth manifold curve that interpolates a collection of data points on a Riemannian manifold while matching a prescribed derivative at each point. We propose a novel procedure relying on the general concept of retractions to solve this problem on a large class of manifolds, including those for which computing the Riemannian exponential or logarithmic maps is not straightforward, such as the manifold of fixed-rank matrices. We analyze the well-posedness of the method by introducing and showing the existence of retraction-convex sets, a generalization of geodesically convex sets. We extend to the manifold setting a classical result on the asymptotic interpolation error of Hermite interpolation. We finally illustrate these results and the effectiveness of the method with numerical experiments on the manifold of fixed-rank matrices and the Stiefel manifold of matrices with orthonormal columns.

The author has recently introduced an abstract algebraic framework of analogical proportions within the general setting of universal algebra. This paper studies analogical proportions in the boolean domain consisting of two elements 0 and 1 within his framework. It turns out that our notion of boolean proportions coincides with two prominent models from the literature in different settings. This means that we can capture two separate modellings of boolean proportions within a single framework which is mathematically appealing and provides further evidence for the robustness and applicability of the general framework.

In a regression model with multiple response variables and multiple explanatory variables, if the difference of the mean vectors of the response variables for different values of explanatory variables is always in the direction of the first principal eigenvector of the covariance matrix of the response variables, then it is called a multivariate allometric regression model. This paper studies the estimation of the first principal eigenvector in the multivariate allometric regression model. A class of estimators that includes conventional estimators is proposed based on weighted sum-of-squares matrices of regression sum-of-squares matrix and residual sum-of-squares matrix. We establish an upper bound of the mean squared error of the estimators contained in this class, and the weight value minimizing the upper bound is derived. Sufficient conditions for the consistency of the estimators are discussed in weak identifiability regimes under which the difference of the largest and second largest eigenvalues of the covariance matrix decays asymptotically and in ``large $p$, large $n$" regimes, where $p$ is the number of response variables and $n$ is the sample size. Several numerical results are also presented.

Cognitive decision-making processes are crucial aspects of human behavior, influencing various personal and professional domains. This research delves into the application of differential equations in analyzing decision-making accuracy by leveraging eye-tracking data within a virtual industrial town setting. The study unveils a systematic approach to transforming raw data into a differential equation, essential for deciphering the relationship between eye movements during decision-making processes. Mathematical relationship extraction and variable-parameter definition pave the way for deriving a differential equation that encapsulates the growth of fixations on characters. The key factors in this equation encompass the fixation rate $(\lambda)$ and separation rate $(\mu)$, reflecting user interaction dynamics and their impact on decision-making complexities tied to user engagement with virtual characters. For a comprehensive grasp of decision dynamics, solving this differential equation requires initial fixation counts, fixation rate, and separation rate. The formulation of differential equations incorporates various considerations such as engagement duration, character-player distance, relative speed, and character attributes, enabling the representation of fixation changes, speed dynamics, distance variations, and the effects of character attributes. This comprehensive analysis not only enhances our comprehension of decision-making processes but also provides a foundational framework for predictive modeling and data-driven insights for future research and applications in cognitive science and virtual reality environments.

Counterfactual generation lies at the core of various machine learning tasks, including image translation and controllable text generation. This generation process usually requires the identification of the disentangled latent representations, such as content and style, that underlie the observed data. However, it becomes more challenging when faced with a scarcity of paired data and labeling information. Existing disentangled methods crucially rely on oversimplified assumptions, such as assuming independent content and style variables, to identify the latent variables, even though such assumptions may not hold for complex data distributions. For instance, food reviews tend to involve words like tasty, whereas movie reviews commonly contain words such as thrilling for the same positive sentiment. This problem is exacerbated when data are sampled from multiple domains since the dependence between content and style may vary significantly over domains. In this work, we tackle the domain-varying dependence between the content and the style variables inherent in the counterfactual generation task. We provide identification guarantees for such latent-variable models by leveraging the relative sparsity of the influences from different latent variables. Our theoretical insights enable the development of a doMain AdapTive counTerfactual gEneration model, called (MATTE). Our theoretically grounded framework achieves state-of-the-art performance in unsupervised style transfer tasks, where neither paired data nor style labels are utilized, across four large-scale datasets. Code is available at //github.com/hanqi-qi/Matte.git

Vessel segmentation and centerline extraction are two crucial preliminary tasks for many computer-aided diagnosis tools dealing with vascular diseases. Recently, deep-learning based methods have been widely applied to these tasks. However, classic deep-learning approaches struggle to capture the complex geometry and specific topology of vascular networks, which is of the utmost importance in most applications. To overcome these limitations, the clDice loss, a topological loss that focuses on the vessel centerlines, has been recently proposed. This loss requires computing, with a proposed soft-skeleton algorithm, the skeletons of both the ground truth and the predicted segmentation. However, the soft-skeleton algorithm provides suboptimal results on 3D images, which makes the clDice hardly suitable on 3D images. In this paper, we propose to replace the soft-skeleton algorithm by a U-Net which computes the vascular skeleton directly from the segmentation. We show that our method provides more accurate skeletons than the soft-skeleton algorithm. We then build upon this network a cascaded U-Net trained with the clDice loss to embed topological constraints during the segmentation. The resulting model is able to predict both the vessel segmentation and centerlines with a more accurate topology.

Full waveform inversion (FWI) is a large-scale nonlinear ill-posed problem for which computationally expensive Newton-type methods can become trapped in undesirable local minima, particularly when the initial model lacks a low-wavenumber component and the recorded data lacks low-frequency content. A modification to the Gauss-Newton (GN) method is proposed to address these issues. The standard GN system for multisource multireceiver FWI is reformulated into an equivalent matrix equation form, with the solution becoming a diagonal matrix rather than a vector as in the standard system. The search direction is transformed from a vector to a matrix by relaxing the diagonality constraint, effectively adding a degree of freedom to the subsurface offset axis. The relaxed system can be explicitly solved with only the inversion of two small matrices that deblur the data residual matrix along the source and receiver dimensions, which simplifies the inversion of the Hessian matrix. When used to solve the extended source FWI objective function, the Extended GN (EGN) method integrates the benefits of both model and source extension. The EGN method effectively combines the computational effectiveness of the reduced FWI method with the robustness characteristics of extended formulations and offers a promising solution for addressing the challenges of FWI. It bridges the gap between these extended formulations and the reduced FWI method, enhancing inversion robustness while maintaining computational efficiency. The robustness and stability of the EGN algorithm for waveform inversion are demonstrated numerically.

The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this effect, our paper proposes a novel attention based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multihop relations in our model. Our empirical study offers insights into the efficacy of our attention based model and we show marked performance gains in comparison to state of the art methods on all datasets.

北京阿比特科技有限公司