In many applications, sparse and blocky coefficients often occur in regression and classification problems. The fused Lasso was designed to recover these sparse structured features especially when the design matrix encounters the situation of ultrahigh dimension. Quantile loss is well known as a robust loss function in regression and classification. In this paper, we combine quantile loss and fused Lasso penalty together to produce quantile fused Lasso which can achieve sparse and blocky feature selection in both regression and classification. Interestingly, our proposed model has the unified optimization formula for regression and classification. For ultrahigh dimensional collected data, we derive multi-block linearized alternating direction method of multipliers (LADMM) to deal with it. Moreover, we prove convergence and derive convergence rates of the proposed LADMM algorithm through an elegant method. Note that the algorithm can be easily extended to solve many existing fused Lasso models. Finally, we present some numerical results for several synthetic and real world examples, which illustrate the robustness, scalability, and accuracy of the proposed method.
High-dimensional data are routinely collected in many areas. We are particularly interested in Bayesian classification models in which one or more variables are imbalanced. Current Markov chain Monte Carlo algorithms for posterior computation are inefficient as $n$ and/or $p$ increase due to worsening time per step and mixing rates. One strategy is to use a gradient-based sampler to improve mixing while using data sub-samples to reduce per-step computational complexity. However, usual sub-sampling breaks down when applied to imbalanced data. Instead, we generalize piece-wise deterministic Markov chain Monte Carlo algorithms to include importance-weighted and mini-batch sub-sampling. These approaches maintain the correct stationary distribution with arbitrarily small sub-samples, and substantially outperform current competitors. We provide theoretical support and illustrate gains in simulated and real data applications.
Various methods have recently been proposed to estimate causal effects with confidence intervals that are uniformly valid over a set of data generating processes when high-dimensional nuisance models are estimated by post-model-selection or machine learning estimators. These methods typically require that all the confounders are observed to ensure identification of the effects. We contribute by showing how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models. We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small relative to the sample size; the latter is formalized in terms of convergence rates. Simulation experiments illustrate the finite sample properties of the proposed intervals and investigate an alternative procedure that improves the empirical coverage of the intervals when the amount of unobserved confounding is large. Finally, a case study on the effect of smoking during pregnancy on birth weight is used to illustrate the use of the methods introduced to perform a sensitivity analysis to unobserved confounding.
We consider scalar semilinear elliptic PDEs, where the nonlinearity is strongly monotone, but only locally Lipschitz continuous. To linearize the arising discrete nonlinear problem, we employ a damped Zarantonello iteration, which leads to a linear Poisson-type equation that is symmetric and positive definite. The resulting system is solved by a contractive algebraic solver such as a multigrid method with local smoothing. We formulate a fully adaptive algorithm that equibalances the various error components coming from mesh refinement, iterative linearization, and algebraic solver. We prove that the proposed adaptive iteratively linearized finite element method (AILFEM) guarantees convergence with optimal complexity, where the rates are understood with respect to the overall computational cost (i.e., the computational time). Numerical experiments investigate the involved adaptivity parameters.
Advances in survival analysis have facilitated unprecedented flexibility in data modeling, yet there remains a lack of tools for graphically illustrating the influence of continuous covariates on predicted survival outcomes. We propose the utilization of a colored contour plot to depict the predicted survival probabilities over time, and provide a Shiny app and R package as implementations of this tool. Our approach is capable of supporting conventional models, including the Cox and Fine-Gray models. However, its capability shines when coupled with cutting-edge machine learning models such as random survival forests and deep neural networks.
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data, i.e., images, text, and audio. Accordingly, its promising performance has led to the GAN-based adversarial attack methods in the white-box and black-box attack scenarios. The importance of transferable black-box attacks lies in their ability to be effective across different models and settings, more closely aligning with real-world applications. However, it remains challenging to retain the performance in terms of transferable adversarial examples for such methods. Meanwhile, we observe that some enhanced gradient-based transferable adversarial attack algorithms require prolonged time for adversarial sample generation. Thus, in this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples whilst improving the algorithm's efficiency. The main approach is via optimising the training process of the generator parameters. With the functional and characteristic similarity analysis, we introduce a novel gradient editing (GE) mechanism and verify its feasibility in generating transferable samples on various models. Moreover, by exploring the frequency domain information to determine the gradient editing direction, GE-AdvGAN can generate highly transferable adversarial samples while minimizing the execution time in comparison to the state-of-the-art transferable adversarial attack algorithms. The performance of GE-AdvGAN is comprehensively evaluated by large-scale experiments on different datasets, which results demonstrate the superiority of our algorithm. The code for our algorithm is available at: //github.com/LMBTough/GE-advGAN
Recent wildfires in Australia have led to considerable economic loss and property destruction, and there is increasing concern that climate change may exacerbate their intensity, duration, and frequency. Hazard quantification for extreme wildfires is an important component of wildfire management, as it facilitates efficient resource distribution, adverse effect mitigation, and recovery efforts. However, although extreme wildfires are typically the most impactful, both small and moderate fires can still be devastating to local communities and ecosystems. Therefore, it is imperative to develop robust statistical methods to reliably model the full distribution of wildfire spread. We do so for a novel dataset of Australian wildfires from 1999 to 2019, and analyse monthly spread over areas approximately corresponding to Statistical Areas Level~1 and~2 (SA1/SA2) regions. Given the complex nature of wildfire ignition and spread, we exploit recent advances in statistical deep learning and extreme value theory to construct a parametric regression model using graph convolutional neural networks and the extended generalized Pareto distribution, which allows us to model wildfire spread observed on an irregular spatial domain. We highlight the efficacy of our newly proposed model and perform a wildfire hazard assessment for Australia and population-dense communities, namely Tasmania, Sydney, Melbourne, and Perth.
In recent years a great deal of attention has been paid to discretizations of the incompressible Stokes equations that exactly preserve the incompressibility constraint. These are of substantial interest because these discretizations are pressure-robust, i.e. the error estimates for the velocity do not depend on the error in the pressure. Similar considerations arise in nearly incompressible linear elastic solids. Conforming discretizations with this property are now well understood in two dimensions, but remain poorly understood in three dimensions. In this work we state two conjectures on this subject. The first is that the Scott-Vogelius element pair is inf-sup stable on uniform meshes for velocity degree $k \ge 4$; the best result available in the literature is for $k \ge 6$. The second is that there exists a stable space decomposition of the kernel of the divergence for $k \ge 5$. We present numerical evidence supporting our conjectures.
We consider the Navier-Stokes-Fourier system governing the motion of a general compressible, heat conducting, Newtonian fluid driven by random initial/boundary data. Convergence of the stochastic collocation and Monte Carlo numerical methods is shown under the hypothesis that approximate solutions are bounded in probability. Abstract results are illustrated by numerical experiments for the Rayleigh-Benard convection problem.
This paper proposes a new approach to fit a linear regression for symbolic internal-valued variables, which improves both the Center Method suggested by Billard and Diday in \cite{BillardDiday2000} and the Center and Range Method suggested by Lima-Neto, E.A. and De Carvalho, F.A.T. in \cite{Lima2008, Lima2010}. Just in the Centers Method and the Center and Range Method, the new methods proposed fit the linear regression model on the midpoints and in the half of the length of the intervals as an additional variable (ranges) assumed by the predictor variables in the training data set, but to make these fitments in the regression models, the methods Ridge Regression, Lasso, and Elastic Net proposed by Tibshirani, R. Hastie, T., and Zou H in \cite{Tib1996, HastieZou2005} are used. The prediction of the lower and upper of the interval response (dependent) variable is carried out from their midpoints and ranges, which are estimated from the linear regression models with shrinkage generated in the midpoints and the ranges of the interval-valued predictors. Methods presented in this document are applied to three real data sets cardiologic interval data set, Prostate interval data set and US Murder interval data set to then compare their performance and facility of interpretation regarding the Center Method and the Center and Range Method. For this evaluation, the root-mean-squared error and the correlation coefficient are used. Besides, the reader may use all the methods presented herein and verify the results using the {\tt RSDA} package written in {\tt R} language, that can be downloaded and installed directly from {\tt CRAN} \cite{Rod2014}.
Transformer based Large Language Models (LLMs) have been widely used in many fields, and the efficiency of LLM inference becomes hot topic in real applications. However, LLMs are usually complicatedly designed in model structure with massive operations and perform inference in the auto-regressive mode, making it a challenging task to design a system with high efficiency. In this paper, we propose an efficient LLM inference solution with low latency and high throughput. Firstly, we simplify the LLM decoder layer by fusing data movement and element-wise operations to reduce the memory access frequency and lower system latency. We also propose a segment KV cache policy to keep key/value of the request and response tokens in separate physical memory for effective device memory management, helping enlarge the runtime batch size and improve system throughput. A customized Scaled-Dot-Product-Attention kernel is designed to match our fusion policy based on the segment KV cache solution. We implement our LLM inference solution on Intel GPU and publish it publicly. Compared with the standard HuggingFace implementation, the proposed solution achieves up to 7x lower token latency and 27x higher throughput for some popular LLMs on Intel GPU.