
This paper deals with variable selection in the multivariate linear regression model when the data are observations on a spatial domain, namely a grid of sites in $\mathbb{Z}^d$ with $d\geqslant 2$. We use a criterion that characterizes the subset of relevant variables through two parameters, and we propose estimators of these parameters based on the spatially dependent observations. We prove the consistency of the proposed method under specified assumptions. A simulation study conducted to assess its finite-sample behaviour, in comparison with existing methods, is also presented.

Related Content

Batch active learning is a popular approach for efficiently training machine learning models on large, initially unlabelled datasets by repeatedly acquiring labels for batches of data points. However, many recent batch active learning methods are white-box approaches and are often limited to differentiable parametric models: they score unlabeled points using acquisition functions based on model embeddings or first- and second-order derivatives. In this paper, we propose black-box batch active learning for regression tasks as an extension of white-box approaches. Crucially, our method only relies on model predictions. This approach is compatible with a wide range of machine learning models, including regular and Bayesian deep learning models and non-differentiable models such as random forests. It is rooted in Bayesian principles and utilizes recent kernel-based approaches. This allows us to extend a wide range of existing state-of-the-art white-box batch active learning methods (BADGE, BAIT, LCMD) to black-box models. We demonstrate the effectiveness of our approach through extensive experimental evaluations on regression datasets, achieving surprisingly strong performance compared to white-box approaches for deep learning models.
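As a rough illustration of the black-box idea, the sketch below builds feature vectors purely from model predictions (here, the per-tree predictions of a random forest, one possible choice) and runs a BADGE-style k-means++ batch selection on them; the feature map and selection rule are our assumptions, not necessarily the paper's exact construction.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def prediction_features(model, X):
        # Black-box feature map: per-tree predictions (only model outputs
        # are used; any predictive ensemble would work here).
        return np.stack([t.predict(X) for t in model.estimators_], axis=1)

    def kmeanspp_batch(F, batch_size, rng):
        # BADGE-style k-means++ selection on the feature rows of F (n, m).
        idx = [int(rng.integers(len(F)))]
        d2 = np.sum((F - F[idx[0]]) ** 2, axis=1)
        for _ in range(batch_size - 1):
            j = int(rng.choice(len(F), p=d2 / d2.sum()))
            idx.append(j)
            d2 = np.minimum(d2, np.sum((F - F[j]) ** 2, axis=1))
        return np.array(idx)

    rng = np.random.default_rng(0)
    X_lab = rng.normal(size=(50, 5)); y_lab = X_lab @ rng.normal(size=5)
    X_pool = rng.normal(size=(500, 5))
    model = RandomForestRegressor(n_estimators=100).fit(X_lab, y_lab)
    batch = kmeanspp_batch(prediction_features(model, X_pool), 10, rng)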

We consider the problem of mixed sparse linear regression with two components, where two real $k$-sparse signals $\beta_1, \beta_2$ are to be recovered from $n$ unlabelled noisy linear measurements. The sparsity is allowed to be sublinear in the dimension, and the additive noise is assumed to be independent Gaussian with variance $\sigma^2$. Prior work has shown that the problem suffers from a $\frac{k}{SNR^2}$-to-$\frac{k^2}{SNR^2}$ statistical-to-computational gap, resembling other computationally challenging high-dimensional inference problems such as Sparse PCA and Robust Sparse Mean Estimation; here $SNR$ is the signal-to-noise ratio. We establish the existence of a more extensive computational barrier for this problem through the method of low-degree polynomials, but show that the problem is computationally hard only in a very narrow symmetric parameter regime. We identify a smooth information-computation tradeoff between the sample complexity $n$ and runtime for any randomized algorithm in this hard regime. Via a simple reduction, this provides novel rigorous evidence for a computational barrier to exact support recovery in sparse phase retrieval with sample complexity $n = \tilde{o}(k^2)$. Our second contribution is to analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem in $O(np)$ time with the square root of the number of samples, and matches the sample complexity required for (non-mixed) sparse linear regression; this allows the recovery problem to be subsequently solved by state-of-the-art techniques from the dense case. As a special case of our results, we show that this simple algorithm is order-optimal among a large family of algorithms for solving exact signed support recovery in sparse linear regression.
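To make the thresholding idea concrete, here is a minimal sketch of an $O(np)$ coordinate-wise detector; the second-moment statistic and the ad hoc threshold are our illustrative choices, since the abstract does not specify the exact statistic.

    import numpy as np

    def marginal_scores(X, y):
        # One pass over the matrix, O(np): s_j = (1/n) sum_i x_ij^2 y_i^2.
        # These second moments separate support from non-support coordinates
        # even in the symmetric mixture beta_2 = -beta_1, where plain
        # correlations cancel.  (Illustrative, not the paper's statistic.)
        return (X ** 2).T @ (y ** 2) / X.shape[0]

    rng = np.random.default_rng(1)
    n, p, k, sigma = 5000, 500, 5, 0.5
    beta = np.zeros(p); beta[:k] = 1.0
    s = 2 * rng.integers(2, size=n) - 1        # hidden mixture labels +/-1
    X = rng.normal(size=(n, p))
    y = s * (X @ beta) + sigma * rng.normal(size=n)
    scores = marginal_scores(X, y)
    print(np.flatnonzero(scores > np.median(scores) + 1.0))  # expect 0..4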

This paper presents a novel approach for optical flow control of Micro Air Vehicles (MAVs). The task is challenging due to the nonlinearity of optical flow observables. Our proposed Incremental Nonlinear Dynamic Inversion (INDI) control scheme incorporates an efficient data-driven method to address this nonlinearity. It directly estimates the inverse of the time-varying control effectiveness in real time, eliminating the constant-effectiveness assumption and the high computational cost of traditional INDI. This approach effectively handles the fast-changing system dynamics commonly encountered in optical flow control, particularly height-dependent changes. We demonstrate the robustness and efficiency of the proposed control scheme in numerical simulations and in real-world flight tests: multiple landings of an MAV on a static, flat surface with various tracking setpoints, and hovering and landings on moving and undulating surfaces. Despite noisy optical flow estimates and the lateral and vertical movement of the landing surfaces, the MAV successfully tracks or lands on the surface, with height and vertical velocity decaying exponentially to zero at almost the same time, as desired.
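A minimal one-dimensional sketch of the idea, assuming a recursive-least-squares update (our choice; the paper's estimator may differ): regressing the control increment on the measured acceleration increment yields the inverse effectiveness $1/g$ directly, so the INDI command needs no matrix inversion and no constant-effectiveness assumption.

    import numpy as np

    class AdaptiveINDI:
        # 1-D sketch: online estimate of the INVERSE control effectiveness
        # 1/g(t) via exponential-forgetting least squares (an assumption;
        # the paper's exact estimator may differ).
        def __init__(self, inv_g0=1.0, lam=0.98):
            self.inv_g = inv_g0   # current estimate of 1/g(t)
            self.P = 1.0          # RLS "covariance"
            self.lam = lam        # forgetting factor tracks time variation

        def update(self, du, da):
            # Regress du on da: du ~ (1/g) * da, so the fit IS the inverse.
            k = self.P * da / (self.lam + da * self.P * da)
            self.inv_g += k * (du - self.inv_g * da)
            self.P = (self.P - k * da * self.P) / self.lam

        def command(self, u_prev, a_des, a_meas):
            # INDI increment: u = u_prev + (1/g) * (a_des - a_meas)
            return u_prev + self.inv_g * (a_des - a_meas)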

Forward simulation-based uncertainty quantification that studies the distribution of quantities of interest (QoI) is a crucial component for computationally robust engineering design and prediction. There is a large body of literature devoted to accurately assessing statistics of QoIs, and in particular, multilevel or multifidelity approaches are known to be effective, leveraging cost-accuracy tradeoffs between a given ensemble of models. However, effective algorithms that can estimate the full distribution of QoIs are still under active development. In this paper, we introduce a general multifidelity framework for estimating the cumulative distribution function (CDF) of a vector-valued QoI associated with a high-fidelity model under a budget constraint. Given a family of appropriate control variates obtained from lower-fidelity surrogates, our framework involves identifying the most cost-effective model subset and then using it to build an approximate control variates estimator for the target CDF. We instantiate the framework by constructing a family of control variates using intermediate linear approximators and rigorously analyze the corresponding algorithm. Our analysis reveals that the resulting CDF estimator is uniformly consistent and asymptotically optimal as the budget tends to infinity, with only mild moment and regularity assumptions on the joint distribution of QoIs. The approach provides a robust multifidelity CDF estimator that is adaptive to the available budget, does not require \textit{a priori} knowledge of cross-model statistics or model hierarchy, and applies to multiple dimensions. We demonstrate the efficiency and robustness of the approach using test examples of parametric PDEs and stochastic differential equations including both academic instances and more challenging engineering problems.
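The basic building block, stripped of the model-subset search and the linear approximators, is a control-variate estimator applied pointwise to CDF indicators; the two-model sketch below illustrates this under the assumption of one paired high/low-fidelity sample and one large low-fidelity-only sample.

    import numpy as np

    def cv_cdf(y_hf, y_lf_paired, y_lf_cheap, t_grid):
        # Pointwise control variates on indicators: for each threshold t,
        #   F_hat(t) = mean 1{Y_hf<=t} - alpha * (mean 1{Y_lf<=t} - F_lf(t)),
        # with F_lf(t) estimated from the large cheap sample and alpha the
        # variance-optimal coefficient.  (Two models only; the paper selects
        # a cost-effective model subset under a budget.)
        F = np.empty(len(t_grid))
        for i, t in enumerate(t_grid):
            h = (y_hf <= t).astype(float)
            g = (y_lf_paired <= t).astype(float)
            c = np.cov(h, g)
            alpha = c[0, 1] / c[1, 1] if c[1, 1] > 0 else 0.0
            F[i] = h.mean() - alpha * (g.mean() - (y_lf_cheap <= t).mean())
        return np.clip(F, 0.0, 1.0)   # note: monotonicity is not enforced

    rng = np.random.default_rng(2)
    x = rng.normal(size=100)                     # shared randomness
    y_hf = x + 0.1 * rng.normal(size=100)        # expensive QoI samples
    y_lf = x + 0.5                               # correlated cheap surrogate
    y_lf_only = rng.normal(size=100_000) + 0.5   # large low-fidelity run
    F_hat = cv_cdf(y_hf, y_lf, y_lf_only, np.linspace(-3, 3, 61))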

The demand for computational resources in the modeling process grows with the scale of the dataset, since traditional regression approaches involve inverting huge data matrices. The main problem lies in the large data size, so a standard remedy is subsampling, which aims at obtaining the most informative portion of the big data. In this paper, we explore an existing approach based on leverage scores, originally proposed for subdata selection in linear model discrimination. Our objective is to apply this approach to select the most informative data points for estimating the unknown parameters in both a first-order linear model and a model with interactions. We conclude that the leverage-score-based approach improves on existing approaches, and we support this with simulation experiments as well as a real data application.
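For reference, a minimal sketch of leverage-score subsampling for the first-order model (keeping the highest-leverage rows is one common rule; sampling with probability proportional to the scores is another, and for the interaction model one would first augment X with product columns):

    import numpy as np

    def leverage_scores(X):
        # Exact leverages h_ii = x_i^T (X^T X)^{-1} x_i via a thin QR,
        # avoiding an explicit inverse: H = Q Q^T, so h_ii = ||Q_i||^2.
        Q, _ = np.linalg.qr(X)
        return np.sum(Q ** 2, axis=1)

    rng = np.random.default_rng(3)
    n, p = 100_000, 5
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
    y = X @ rng.normal(size=p + 1) + rng.normal(size=n)
    idx = np.argsort(leverage_scores(X))[-1000:]   # keep 1000 rows
    beta_hat, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)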

Data reduction is a fundamental challenge of modern technology, where classical statistical methods are not applicable because of computational limitations. We consider linear regression for an extraordinarily large number of observations but only a few covariates. Subsampling aims at selecting a given percentage of the existing original data. Under distributional assumptions on the covariates, we derive D-optimal subsampling designs and study their theoretical properties. We make use of fundamental concepts of optimal design theory and an equivalence theorem from constrained convex optimization. The subsampling designs thus obtained provide simple rules for whether to accept or reject a data point, allowing for an easy algorithmic implementation. In addition, we propose a simplified subsampling method that differs from the D-optimal design but requires less computing time. We present a simulation study comparing both subsampling schemes with the IBOSS method.
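A sketch of what such an accept/reject rule can look like: under roughly spherical covariates, high-information points are those far from the center, so one keeps points whose Mahalanobis norm exceeds a quantile threshold. The exact D-optimal acceptance region depends on the covariate distribution, so treat this as an illustration only.

    import numpy as np

    def accept(X, frac):
        # Accept a point iff its squared Mahalanobis norm exceeds the
        # (1 - frac)-quantile: a simple threshold rule in the spirit of
        # D-optimal subsampling (illustrative, not the derived design).
        Xc = X - X.mean(axis=0)
        Sinv = np.linalg.inv(np.cov(Xc, rowvar=False))
        d2 = np.einsum("ij,jk,ik->i", Xc, Sinv, Xc)
        return d2 >= np.quantile(d2, 1.0 - frac)

    rng = np.random.default_rng(4)
    X = rng.normal(size=(1_000_000, 3))
    mask = accept(X, frac=0.01)    # retain 1% of the data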

We consider the problem of empirical Bayes estimation for (multivariate) Poisson means. Existing solutions that have been shown theoretically optimal for minimizing the regret (excess risk over the Bayesian oracle that knows the prior) have several shortcomings. For example, the classical Robbins estimator does not retain the monotonicity property of the Bayes estimator and performs poorly under moderate sample size. Estimators based on the minimum distance and non-parametric maximum likelihood (NPMLE) methods correct these issues, but are computationally expensive, with complexity growing exponentially in the dimension. Extending the approach of Barbehenn and Zhao (2022), in this work we construct monotone estimators based on empirical risk minimization (ERM) that retain similar theoretical guarantees and can be computed much more efficiently. Adapting the idea of offset Rademacher complexity (Liang et al., 2015) to the non-standard loss and function class arising in empirical Bayes, we show that the shape-constrained ERM estimator attains the minimax regret within constant factors in one dimension and within logarithmic factors in multiple dimensions.
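For intuition, the sketch below computes the classical Robbins estimator and then enforces monotonicity with weighted isotonic regression; this is a simple stand-in for the paper's shape-constrained ERM estimator, not its exact procedure.

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    def robbins(y):
        # Robbins: theta_hat(t) = (t + 1) * N(t + 1) / N(t), with N(t) the
        # count of observations equal to t.  Not monotone in general.
        counts = np.bincount(y, minlength=y.max() + 2).astype(float)
        t = np.arange(y.max() + 1)
        with np.errstate(divide="ignore", invalid="ignore"):
            est = (t + 1) * counts[t + 1] / counts[t]
        return np.nan_to_num(est, nan=0.0, posinf=0.0)

    def monotone_robbins(y):
        # Monotone correction via isotonic regression weighted by N(t):
        # a simple stand-in for the shape-constrained ERM estimator.
        est = robbins(y)
        t = np.arange(len(est))
        w = np.bincount(y, minlength=len(est)).astype(float) + 1e-12
        return IsotonicRegression().fit(t, est, sample_weight=w).predict(t)

    rng = np.random.default_rng(5)
    theta = rng.gamma(2.0, 2.0, size=5000)   # latent means from a prior
    y = rng.poisson(theta)                   # observed Poisson counts
    print(monotone_robbins(y)[:10])          # estimates for t = 0..9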

The accurate and efficient simulation of Partial Differential Equations (PDEs) in and around arbitrarily defined geometries is critical for many application domains. Immersed boundary methods (IBMs) alleviate the usually laborious and time-consuming process of creating body-fitted meshes around complex geometry models (described by CAD or other representations, e.g., STL, point clouds), especially when high levels of mesh adaptivity are required. In this work, we advance the field of IBM in the context of the recently developed Shifted Boundary Method (SBM). In the SBM, the location where boundary conditions are enforced is shifted from the actual boundary of the immersed object to a nearby surrogate boundary, and boundary conditions are corrected utilizing Taylor expansions. This approach allows choosing surrogate boundaries that conform to a Cartesian mesh without losing accuracy or stability. Our contributions in this work are as follows: (a) we show that the SBM numerical error can be greatly reduced by an optimal choice of the surrogate boundary, (b) we mathematically prove the optimal convergence of the SBM for this optimal choice of the surrogate boundary, (c) we deploy the SBM on massively parallel octree meshes, including algorithmic advances to handle incomplete octrees, and (d) we showcase the applicability of these approaches with a wide variety of simulations involving complex shapes, sharp corners, and different topologies. Specific emphasis is given to Poisson's equation and the linear elasticity equations.
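To first order, the shifted Dirichlet condition reads as follows: if $\mathbf{x}\in\Gamma$ is the true boundary point closest to the surrogate point $\tilde{\mathbf{x}}\in\tilde{\Gamma}$ and $\mathbf{d}=\mathbf{x}-\tilde{\mathbf{x}}$, a Taylor expansion of the numerical solution $\tilde{u}$ gives

$$\tilde{u}(\tilde{\mathbf{x}}) + \nabla\tilde{u}(\tilde{\mathbf{x}})\cdot\mathbf{d} = u_D(\mathbf{x}) + O(\|\mathbf{d}\|^2) \quad \text{on } \tilde{\Gamma},$$

so enforcing the left-hand side as the boundary condition on the surrogate boundary retains accuracy while the mesh stays Cartesian.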

This paper serves as a survey of recent advances in large margin training and its theoretical foundations, mostly for (nonlinear) deep neural networks (DNNs), which have arguably been the most prominent machine learning models for large-scale data over the past decade. We generalize the formulation of classification margins from classical research to the latest DNNs, summarize theoretical connections between the margin, network generalization, and robustness, and comprehensively review recent efforts to enlarge the margins of DNNs. Since different methods take disparate viewpoints, we categorize them into groups for ease of comparison and discussion. We hope our discussion and overview inspire new research aimed at improving the performance of DNNs, and we also point to directions in which the large margin principle could be verified, providing theoretical evidence for why certain regularizations of DNNs work well in practice. We have kept the paper concise so that the crucial spirit of large margin learning and related methods is clearly emphasized.

Multivariate time series forecasting has been extensively studied over the years, with ubiquitous applications in areas such as finance, traffic, and the environment. Still, concerns have been raised that traditional methods are incapable of modeling the complex patterns and dependencies found in real-world data. To address these concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, have been proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, lack of explainability remains a serious drawback of deep neural network models. Inspired by the Memory Network proposed for question answering, we propose a deep learning based model named Memory Time-series Network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component, all trained jointly. Additionally, the attention mechanism we design makes MTNet highly interpretable: we can easily tell which part of the historical data is referenced the most.
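A toy numerical sketch of the interpretability claim, with placeholder encoders standing in for MTNet's three learned encoders (dimensions and readout are our assumptions): the attention weights over the memory blocks directly reveal which part of the history drives the forecast.

    import numpy as np

    def softmax(z):
        z = z - z.max()
        return np.exp(z) / np.exp(z).sum()

    def mtnet_step(blocks, query, enc_m, enc_q, enc_c, w_ar):
        # blocks: list of past series chunks held in memory; query: recent
        # chunk.  Real MTNet uses three learned encoders; enc_* here are
        # placeholders filling the same roles.
        m = np.stack([enc_m(b) for b in blocks])    # memory embeddings
        q = enc_q(query)                            # query embedding
        attn = softmax(m @ q)                       # interpretable weights
        ctx = attn @ np.stack([enc_c(b) for b in blocks])
        y_ar = w_ar @ query[-len(w_ar):]            # autoregressive part
        return ctx.mean() + q.mean() + y_ar, attn   # toy readout + AR term

    rng = np.random.default_rng(6)
    enc = lambda b: np.tanh(b[:8])                  # placeholder encoder
    series = rng.normal(size=128)
    blocks = [series[i:i + 16] for i in range(0, 96, 16)]
    y_hat, attn = mtnet_step(blocks, series[-16:], enc, enc, enc,
                             w_ar=0.1 * rng.normal(size=4))
    print(attn)   # which memory block was referenced the most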
