
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable $\mathrm{Y}$ given explanatory features $\boldsymbol{\mathrm{X}}$. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function and because the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for vector-valued target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions via optimal transport. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target $\boldsymbol{\mathrm{Y}}$ given the features $\boldsymbol{\mathrm{X}}$; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, number of regressed quantile levels, or number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the quantile functions estimated by VQR are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to facilitate the widespread use of VQR in real-world applications.
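
To ground the scalar objective that VQR generalizes, below is a minimal sketch of linear quantile regression under the pinball (check) loss, fitted on synthetic data with NumPy and SciPy. It is illustrative only and does not use or reproduce the released VQR package or its solvers.

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(u, tau):
    # rho_tau(u) = u * (tau - 1{u < 0}); asymmetric penalty on residuals
    return u * (tau - (u < 0).astype(float))

def fit_linear_qr(X, y, tau):
    """Fit y ~ X at quantile level tau by minimizing the mean pinball loss."""
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])          # prepend an intercept column
    objective = lambda beta: pinball_loss(y - Xb @ beta, tau).mean()
    res = minimize(objective, np.zeros(d + 1), method="Nelder-Mead")
    return res.x

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1))
y = 2.0 * X[:, 0] + (1.0 + 0.5 * np.abs(X[:, 0])) * rng.normal(size=500)
print("median fit:", fit_linear_qr(X, y, tau=0.5))        # ~ [0, 2]
print("90th percentile fit:", fit_linear_qr(X, y, tau=0.9))
```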

Related content


This paper considers the specification of covariance structures with tail estimates. We focus on two aspects: (i) the estimation of the VaR-CoVaR risk matrix in the case of a larger number of time-series observations than assets in a portfolio, using quantile predictive regression models without assuming the presence of nonstationary regressors; and (ii) the construction of a novel variable selection algorithm, the so-called Feature Ordering by Centrality Exclusion (FOCE), which is based on an assumption-lean regression framework, has no tuning parameters, and is proved to be consistent under general sparsity assumptions. We illustrate the usefulness of our proposed methodology with numerical studies of real and simulated datasets when modelling systemic risk in a network.
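
As a concrete illustration of a quantile-regression-based tail-risk matrix, below is a generic sketch of the standard CoVaR construction (not the paper's estimator or the FOCE algorithm): each off-diagonal entry is the tau-quantile of asset j's return conditional on asset i sitting at its own VaR, obtained from a pairwise quantile regression.

```python
import numpy as np
import statsmodels.api as sm

def var_covar_matrix(returns, tau=0.05):
    """returns: (T, N) array of asset returns; tau: tail probability level."""
    T, N = returns.shape
    var = np.quantile(returns, tau, axis=0)   # unconditional VaR per asset
    covar = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i == j:
                covar[i, j] = var[i]
                continue
            # quantile regression of asset j's return on asset i's return
            X = sm.add_constant(returns[:, i])
            fit = sm.QuantReg(returns[:, j], X).fit(q=tau)
            # CoVaR: tau-quantile of j given that i is at its own VaR
            covar[i, j] = fit.params[0] + fit.params[1] * var[i]
    return covar

rng = np.random.default_rng(1)
C = np.array([[1.0, 0.4, 0.2], [0.4, 1.0, 0.3], [0.2, 0.3, 1.0]])
R = rng.standard_t(df=4, size=(1000, 3)) @ np.linalg.cholesky(C).T
print(var_covar_matrix(R, tau=0.05))
```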

Representation learning plays a crucial role in automated feature selection, particularly in the context of high-dimensional data, where non-parametric methods often struggle. In this study, we focus on supervised learning scenarios where the pertinent information resides within a lower-dimensional linear subspace of the data, namely the multi-index model. If this subspace were known, it would greatly enhance prediction, computation, and interpretation. To address this challenge, we propose a novel method for linear feature learning with non-parametric prediction, which simultaneously estimates the prediction function and the linear subspace. Our approach employs empirical risk minimisation, augmented with a penalty on function derivatives, ensuring versatility. Leveraging the orthogonality and rotation invariance properties of Hermite polynomials, we introduce our estimator, named RegFeaL. By utilising alternating minimisation, we iteratively rotate the data to improve alignment with leading directions and accurately estimate the relevant dimension in practical settings. We establish that our method yields a consistent estimator of the prediction function with explicit rates. Additionally, we provide empirical results demonstrating the performance of RegFeaL in various experiments.
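
A minimal sketch of the estimation target, assuming a one-dimensional index: here the relevant subspace is recovered from the average gradient outer product of a generic nonparametric fit, a classical alternative to RegFeaL's Hermite-polynomial construction, which is not reproduced here.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(2)
d, n = 5, 1000
X = rng.normal(size=(n, d))
w = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)  # true index direction
y = np.sin(X @ w) + 0.1 * rng.normal(size=n)          # multi-index model, k=1

model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5).fit(X, y)

# Estimate gradients of the fitted function by central finite differences.
eps = 1e-3
grads = np.zeros((n, d))
for j in range(d):
    e = np.zeros(d); e[j] = eps
    grads[:, j] = (model.predict(X + e) - model.predict(X - e)) / (2 * eps)

# The leading eigenvector of the average gradient outer product spans the
# estimated one-dimensional index subspace.
M = grads.T @ grads / n
eigvals, eigvecs = np.linalg.eigh(M)
w_hat = eigvecs[:, -1]
print("alignment:", abs(w_hat @ w))  # close to 1 if the subspace is recovered
```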

Hypertension is a highly prevalent chronic medical condition and a strong risk factor for cardiovascular disease (CVD), accounting for more than $45\%$ of CVD cases. The relation between blood pressure (BP) and its risk factors cannot be explored clearly by standard linear models. Although fractional polynomials (FPs) provide a concise and accurate formula for examining smooth relationships between response and predictors, modelling conditional mean functions captures only a partial view of the response distribution, and the distributions of many response variables, such as BP measures, are typically skewed. Modelling 'average' BP may thus be linked to CVD, whereas extremely high BP could yield deeper and more precise insight into CVD. Existing mean-based FP approaches for modelling the relationship between risk factors and BP therefore cannot answer these key questions. Conditional quantile functions with FPs provide a comprehensive description of the relationship between the response variable and its predictors, covering, for example, the median and the extremely high BP measures that are often required in practical data analysis. To the best of our knowledge, this is new in the literature. In this paper, we therefore employ Bayesian variable selection with a quantile-dependent prior for the FP model, yielding a Bayesian variable-selection approach to parametric nonlinear quantile regression. The objective is to examine the nonlinear relationship between BP measures and their risk factors across the median and upper quantile levels, using data extracted from the 2007-2008 National Health and Nutrition Examination Survey (NHANES). The variable selection in the model analysis identified the nonlinear terms of the continuous variables (body mass index, age) and the categorical variables (ethnicity, gender and marital status) as important predictors across all quantile levels.
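
A minimal sketch of the frequentist core of this idea: quantile regression with fractional-polynomial terms for a positive predictor, fitted at the median and upper quantile levels with statsmodels on hypothetical BP-like data. The paper's Bayesian variable selection with quantile-dependent priors is not implemented.

```python
import numpy as np
import statsmodels.api as sm

def fp_basis(x, powers=(-1.0, 0.5, 2.0)):
    """Fractional-polynomial basis; power 0 is conventionally log(x)."""
    cols = [np.log(x) if p == 0 else x ** p for p in powers]
    return np.column_stack(cols)

rng = np.random.default_rng(3)
bmi = rng.uniform(18, 40, size=1500)                        # hypothetical BMI values
bp = 90 + 40 / bmi + 0.05 * bmi**2 + rng.gamma(2, 3, 1500)  # skewed "BP" response

X = sm.add_constant(fp_basis(bmi))
for tau in (0.5, 0.9, 0.95):  # median and upper quantile levels
    fit = sm.QuantReg(bp, X).fit(q=tau)
    print(f"tau={tau}: coefficients {np.round(fit.params, 3)}")
```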

Multivariate functional data that are cross-sectionally compositional data are attracting increasing interest in the statistical modeling literature, a major example being trajectories over time of compositions derived from cause-specific mortality rates. In this work, we develop a novel functional concurrent regression model in which independent variables are functional compositions. This allows us to investigate the relationship over time between life expectancy at birth and compositions derived from cause-specific mortality rates of four distinct age classes, namely 0--4, 5--39, 40--64 and 65+. A penalized approach is developed to estimate the regression coefficients and select the relevant variables. Then an efficient computational strategy based on an augmented Lagrangian algorithm is derived to solve the resulting optimization problem. The good performance of the model in predicting the response function and estimating the unknown functional coefficients is shown in a simulation study. The results on real data confirm the important role of neoplasms and cardiovascular diseases in determining life expectancy that has emerged in other studies, and reveal several other contributions not yet observed.
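
A minimal sketch of the concurrent-regression idea, assuming synthetic compositional trajectories: each composition is mapped to unconstrained log-ratio coordinates and regressed pointwise on a common time grid; the paper's roughness penalty, variable selection, and augmented-Lagrangian solver are omitted.

```python
import numpy as np

def alr(comp):
    """Additive log-ratio transform of compositions, last part as reference."""
    return np.log(comp[..., :-1] / comp[..., -1:])

rng = np.random.default_rng(4)
n, T, parts = 200, 50, 4          # subjects, grid points, composition parts
raw = rng.gamma(2.0, size=(n, T, parts))
comp = raw / raw.sum(axis=-1, keepdims=True)    # compositions on the simplex

Z = alr(comp)                                   # (n, T, parts-1) coordinates
beta_true = np.stack([np.sin(np.linspace(0, np.pi, T))] * (parts - 1), axis=1)
y = np.einsum("ntk,tk->nt", Z, beta_true) + 0.1 * rng.normal(size=(n, T))

# Pointwise least squares at each grid point gives the concurrent coefficients.
beta_hat = np.zeros((T, parts - 1))
for t in range(T):
    beta_hat[t], *_ = np.linalg.lstsq(Z[:, t, :], y[:, t], rcond=None)
print("max abs coefficient error:", np.abs(beta_hat - beta_true).max())
```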

Symbolic Regression is a powerful data-driven technique that searches for mathematical expressions that explain the relationship between input variables and a target of interest. Due to its efficiency and flexibility, Genetic Programming can be seen as the standard search technique for Symbolic Regression. However, the conventional Genetic Programming algorithm requires storing all data in a central location, which is not always feasible due to growing concerns about data privacy and security. While privacy-preserving techniques have advanced recently and might offer a solution to this problem, their application to Symbolic Regression remains largely unexplored. Furthermore, existing work focuses only on the horizontally partitioned setting, whereas the vertically partitioned setting, another popular scenario, has yet to be investigated. Herein, we propose an approach that employs a privacy-preserving technique called Secure Multiparty Computation to enable parties to jointly build Symbolic Regression models in the vertical scenario without revealing private data. Preliminary experimental results indicate that our proposed method delivers comparable performance to the centralized solution while safeguarding data privacy.
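
A minimal sketch of the core primitive, additive secret sharing: each party splits a private value (say, its local contribution to a candidate expression's error) into random shares so that only the aggregate is ever revealed. This toy example illustrates the principle, not the paper's full protocol.

```python
import secrets

P = 2**61 - 1  # a public prime modulus

def share(value, n_parties):
    """Split an integer into n additive shares modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Two parties privately hold partial squared errors of a candidate model.
err_party_a, err_party_b = 1234, 5678
shares_a = share(err_party_a, 2)
shares_b = share(err_party_b, 2)

# Each party locally adds the shares it holds; only the total is revealed.
local_sums = [(shares_a[i] + shares_b[i]) % P for i in range(2)]
print(reconstruct(local_sums) == err_party_a + err_party_b)  # True
```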

Many problems in robotics, such as estimating the state from noisy sensor data or aligning two point clouds, can be posed and solved as least-squares problems. Unfortunately, vanilla nonminimal solvers for least-squares problems are notoriously sensitive to outliers. As such, various robust loss functions have been proposed to reduce the sensitivity to outliers. Examples of loss functions include pseudo-Huber, Cauchy, and Geman-McClure. Recently, these loss functions have been generalized into a single loss function that enables the best loss function to be found adaptively based on the distribution of the residuals. However, even with the generalized robust loss function, most nonminimal solvers can only be solved locally given a prior state estimate due to the nonconvexity of the problem. The first contribution of this paper is to combine graduated nonconvexity (GNC) with the generalized robust loss function to solve least-squares problems without a prior state estimate and without the need to specify a loss function. Moreover, existing loss functions, including the generalized loss function, are based on a Gaussian-like distribution. However, residuals are often defined as the squared norm of a multivariate error and are distributed in a Chi-like fashion. The second contribution of this paper is to apply a norm-aware adaptive robust loss function within a GNC framework. The proposed approach enables a GNC formulation of a generalized loss function such that GNC can be readily applied to a wider family of loss functions. Furthermore, simulations and experiments demonstrate that the proposed method is more robust compared to non-GNC counterparts, and yields faster convergence times compared to other GNC formulations.
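
A minimal sketch of GNC with a plain Geman-McClure loss for robust linear least squares via iteratively reweighted least squares: the surrogate starts nearly convex (large $\mu$) and is annealed toward the true nonconvex loss, so neither an initial guess nor a hand-tuned loss is required. The paper's norm-aware generalized loss is not reproduced here.

```python
import numpy as np

def gnc_geman_mcclure(A, b, c=1.0, anneal=1.4, iters=100):
    x = np.linalg.lstsq(A, b, rcond=None)[0]        # ordinary LS start
    r = b - A @ x
    mu = 2.0 * (r**2).max() / c**2                  # start near-convex
    for _ in range(iters):
        w = (mu * c**2 / (r**2 + mu * c**2)) ** 2   # GNC Geman-McClure weights
        W = np.sqrt(w)
        x = np.linalg.lstsq(A * W[:, None], b * W, rcond=None)[0]
        r = b - A @ x
        mu = max(1.0, mu / anneal)                  # anneal toward the true loss
    return x

rng = np.random.default_rng(5)
A = np.column_stack([rng.uniform(-1, 1, 200), np.ones(200)])
x_true = np.array([3.0, -1.0])
b = A @ x_true + 0.05 * rng.normal(size=200)
b[:40] += rng.uniform(5, 10, 40)                    # 20% gross outliers
print("estimate:", gnc_geman_mcclure(A, b))         # close to [3, -1]
```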

Perfect synchronization in distributed machine learning problems is inefficient and even impossible due to the existence of latency, packet losses and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), where each device performs local computation and communication at its own pace without any form of synchronization. Different from existing asynchronous distributed algorithms, R-FAST can eliminate the impact of data heterogeneity across devices and allow for packet losses by employing a robust gradient tracking strategy that relies on properly designed auxiliary variables for tracking and buffering the overall gradient vector. More importantly, the proposed method utilizes two spanning-tree graphs for communication so long as both share at least one common root, enabling flexible designs in communication architectures. We show that R-FAST converges in expectation to a neighborhood of the optimum with a geometric rate for smooth and strongly convex objectives; and to a stationary point with a sublinear rate for general non-convex settings. Extensive experiments demonstrate that R-FAST runs 1.5-2 times faster than synchronous benchmark algorithms, such as Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and outperforms existing asynchronous SOTA algorithms, such as AD-PSGD and OSGP, especially in the presence of stragglers.
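
A minimal sketch of the synchronous gradient-tracking building block that R-FAST robustifies and makes fully asynchronous (the auxiliary buffering, packet-loss handling, and spanning-tree communication design are not reproduced): each agent on a ring mixes with its neighbors while a second variable tracks the network-average gradient.

```python
import numpy as np

n, alpha, steps = 5, 0.05, 300
rng = np.random.default_rng(6)
a = rng.uniform(0.5, 2.0, n)
b = rng.normal(size=n)
grad = lambda x: a * (a * x - b)       # gradient of f_i(x) = 0.5*(a_i x - b_i)^2

# Doubly stochastic mixing matrix for a ring graph.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

x = np.zeros(n)              # each agent's local estimate
y = grad(x)                  # trackers initialized to local gradients
g_old = grad(x)
for _ in range(steps):
    x = W @ x - alpha * y                 # consensus + tracked-gradient step
    g_new = grad(x)
    y = W @ y + g_new - g_old             # track the average gradient
    g_old = g_new

x_star = (a * b).sum() / (a**2).sum()     # minimizer of sum_i f_i
print("consensus error:", np.abs(x - x_star).max())
```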

A new Las Vegas algorithm is presented for the composition of two polynomials modulo a third one, over an arbitrary field. When the degrees of these polynomials are bounded by $n$, the algorithm uses $O(n^{1.43})$ field operations, breaking through the $3/2$ barrier in the exponent for the first time. The previous fastest algebraic algorithms, due to Brent and Kung in 1978, require $O(n^{1.63})$ field operations in general, and ${n^{3/2+o(1)}}$ field operations in the special case of power series over a field of large enough characteristic. If cubic-time matrix multiplication is used, the new algorithm runs in ${n^{5/3+o(1)}}$ operations, while previous ones run in $O(n^2)$ operations. Our approach relies on the computation of a matrix of algebraic relations that is typically of small size. Randomization is used to reduce arbitrary input to this favorable situation.
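
The new $n^{1.43}$ algorithm relies on computing a matrix of algebraic relations and does not fit in a short sketch, but the classical Brent-Kung baby-step/giant-step scheme it improves upon does; below is a sympy version over a prime field, with hypothetical small inputs chosen only for illustration.

```python
import math
from sympy import symbols, Poly, GF

x = symbols("x")

def brent_kung(g, h, f):
    """Compute g(h) mod f via baby steps (powers of h) and giant steps (Horner in h^m)."""
    n = f.degree()
    m = math.isqrt(max(n - 1, 1)) + 1
    coeffs = list(reversed(g.all_coeffs()))   # coefficient of x^i at index i
    k = (len(coeffs) + m - 1) // m            # number of giant-step blocks
    coeffs += [0] * (k * m - len(coeffs))
    # Baby steps: h^0, h^1, ..., h^m, all reduced mod f.
    H = [Poly(1, x, domain=f.domain)]
    for _ in range(m):
        H.append((H[-1] * h) % f)
    # Giant steps: Horner's rule in h^m over blocks of m coefficients.
    result = Poly(0, x, domain=f.domain)
    for i in reversed(range(k)):
        block = Poly(0, x, domain=f.domain)
        for j in range(m):
            block = block + H[j] * coeffs[i * m + j]
        result = (result * H[m] + block) % f
    return result

p = 101
f = Poly(x**8 + 3*x + 1, x, domain=GF(p))
g = Poly(x**7 + 5*x**2 + 2, x, domain=GF(p))
h = Poly(2*x**3 + x + 4, x, domain=GF(p))
assert brent_kung(g, h, f) == g.compose(h) % f
```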

Flexible modeling of how an entire distribution changes with covariates is an important yet challenging generalization of mean-based regression that has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities and differences between these approaches, extensions, properties and limitations, estimation procedures, and the availability of software. In view of the increasing complexity and availability of large-scale data, this review also discusses the scalability of traditional estimation methods, current trends, and open challenges. Illustrations are provided using data on childhood malnutrition in Nigeria and Australian electricity prices.
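
A minimal sketch of the simplest instance of distributional regression, a Gaussian location-scale model in the GAMLSS spirit: both the mean and the log standard deviation depend on the covariate and are fitted jointly by maximum likelihood.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
x = rng.uniform(0, 1, 1000)
y = 1.0 + 2.0 * x + np.exp(-1.0 + 1.5 * x) * rng.normal(size=1000)

def neg_log_lik(theta):
    b0, b1, g0, g1 = theta
    mu = b0 + b1 * x                 # mean model
    sigma = np.exp(g0 + g1 * x)      # scale model (log link keeps sigma > 0)
    return np.sum(np.log(sigma) + 0.5 * ((y - mu) / sigma) ** 2)

fit = minimize(neg_log_lik, x0=np.zeros(4), method="BFGS")
print("mean and scale coefficients:", np.round(fit.x, 2))  # ~ [1, 2, -1, 1.5]
```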

The traditional method of computing the singular value decomposition (SVD) of a data matrix is based on a least-squares principle and is thus very sensitive to the presence of outliers. Hence, the resulting inferences across different applications using the classical SVD are extremely degraded in the presence of data contamination (e.g., in video surveillance background modelling tasks). A robust singular value decomposition method using the minimum density power divergence estimator (rSVDdpd) has been found to provide a satisfactory solution to this problem and works well in applications. For example, it provides a neat solution to the background modelling problem of video surveillance data in the presence of camera tampering. In this paper, we investigate the theoretical properties of the rSVDdpd estimator, such as convergence, equivariance and consistency, under reasonable assumptions. Since the dimension of the parameters, i.e., the number of singular values and the dimension of the singular vectors, can grow linearly with the size of the data, the usual M-estimation theory has to be suitably modified with concentration bounds to establish the asymptotic properties. We believe that we have been able to accomplish this satisfactorily in the present work. We also demonstrate the efficiency of rSVDdpd through extensive simulations.
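
A minimal sketch of the problem setting using a generic robust surrogate (alternating weighted least squares with Huber weights), shown only to illustrate robust rank-one fitting under cell-wise contamination; the minimum density power divergence estimator of rSVDdpd itself is not implemented here.

```python
import numpy as np

def robust_rank1(Xmat, c=1.345, iters=50):
    """Rank-one fit X ~ u v^T by alternating weighted least squares with Huber weights."""
    U, S, Vt = np.linalg.svd(Xmat, full_matrices=False)
    u, v = U[:, 0] * S[0], Vt[0].copy()              # classical SVD as a warm start
    for _ in range(iters):
        R = Xmat - np.outer(u, v)
        s = np.median(np.abs(R)) / 0.6745 + 1e-12    # robust (MAD) residual scale
        W = np.minimum(1.0, c * s / (np.abs(R) + 1e-12))   # Huber weights in [0, 1]
        u = (W * Xmat) @ v / ((W * v**2).sum(axis=1) + 1e-12)
        v = (W * Xmat).T @ u / ((W.T * u**2).sum(axis=1) + 1e-12)
    return u, v

rng = np.random.default_rng(8)
u0, v0 = rng.normal(size=50), rng.normal(size=30)
X = np.outer(u0, v0) + 0.05 * rng.normal(size=(50, 30))
X[rng.random((50, 30)) < 0.05] += 20.0               # 5% gross cell-wise outliers
u_hat, v_hat = robust_rank1(X)
cos = abs(u_hat @ u0) / (np.linalg.norm(u_hat) * np.linalg.norm(u0))
print("alignment with true left singular vector:", round(cos, 3))
```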
