久草精品视频在线观看,亚洲精品无码中出中文字幕

Large language model (LLM) has marked a pivotal moment in the field of machine learning and deep learning. Recently its capability for query planning has been investigated, including both single-modal and multi-modal queries. However, there is no work on the query optimization capability of LLM. As a critical (or could even be the most important) step that significantly impacts the execution performance of the query plan, such analysis and attempts should not be missed. From another aspect, existing query optimizers are usually rule-based or rule-based + cost-based, i.e., they are dependent on manually created rules to complete the query plan rewrite/transformation. Given the fact that modern optimizers include hundreds to thousands of rules, designing a multi-modal query optimizer following a similar way is significantly time-consuming since we will have to enumerate as many multi-modal optimization rules as possible, which has not been well addressed today. In this paper, we investigate the query optimization ability of LLM and use LLM to design LaPuda, a novel LLM and Policy based multi-modal query optimizer. Instead of enumerating specific and detailed rules, LaPuda only needs a few abstract policies to guide LLM in the optimization, by which much time and human effort are saved. Furthermore, to prevent LLM from making mistakes or negative optimization, we borrow the idea of gradient descent and propose a guided cost descent (GCD) algorithm to perform the optimization, such that the optimization can be kept in the correct direction. In our evaluation, our methods consistently outperform the baselines in most cases. For example, the optimized plans generated by our methods result in 1~3x higher execution speed than those by the baselines.

相關內容

優化器

關注 4

優化器 · 代碼 · 大語言模型 · 語言模型化 · Learning ·

2024 年 5 月 3 日

LangProp: A code optimization framework using Large Language Models applied to driving

Shu Ishida,Gianluca Corrado,George Fedoseev,Hudson Yeo,Lloyd Russell,Jamie Shotton,Jo?o F. Henriques,Anthony Hu

We propose LangProp, a framework for iteratively optimizing code generated by large language models (LLMs), in both supervised and reinforcement learning settings. While LLMs can generate sensible coding solutions zero-shot, they are often sub-optimal. Especially for code generation tasks, it is likely that the initial code will fail on certain edge cases. LangProp automatically evaluates the code performance on a dataset of input-output pairs, catches any exceptions, and feeds the results back to the LLM in the training loop, so that the LLM can iteratively improve the code it generates. By adopting a metric- and data-driven training paradigm for this code optimization procedure, one could easily adapt findings from traditional machine learning techniques such as imitation learning, DAgger, and reinforcement learning. We show LangProp's applicability to general domains such as Sudoku and CartPole, as well as demonstrate the first proof of concept of automated code optimization for autonomous driving in CARLA. We show that LangProp can generate interpretable and transparent policies that can be verified and improved in a metric- and data-driven way. Our code is available at //github.com/shuishida/LangProp.

Networking · Neural Networks · Tensor · CIFAR-10 · 可約的 ·

2024 年 5 月 3 日

Compressing neural network by tensor network with exponentially fewer variational parameters

Yong Qing,Ke Li,Peng-Fei Zhou,Shi-Ju Ran

from arxiv, 6 pages, 3 figures for the main text and 3 pages for the appendices

Neural network (NN) designed for challenging machine learning tasks is in general a highly nonlinear mapping that contains massive variational parameters. High complexity of NN, if unbounded or unconstrained, might unpredictably cause severe issues including over-fitting, loss of generalization power, and unbearable cost of hardware. In this work, we propose a general compression scheme that significantly reduces the variational parameters of NN by encoding them to deep automatically-differentiable tensor network (ADTN) that contains exponentially-fewer free parameters. Superior compression performance of our scheme is demonstrated on several widely-recognized NN's (FC-2, LeNet-5, AlextNet, ZFNet and VGG-16) and datasets (MNIST, CIFAR-10 and CIFAR-100). For instance, we compress two linear layers in VGG-16 with approximately $10^{7}$ parameters to two ADTN's with just 424 parameters, where the testing accuracy on CIFAR-10 is improved from $90.17 \%$ to $91.74\%$. Our work suggests TN as an exceptionally efficient mathematical structure for representing the variational parameters of NN's, which exhibits superior compressibility over the commonly-used matrices and multi-way arrays.

峰值 · MoDELS · 相互獨立的 · 數學 · 設計 ·

2024 年 5 月 3 日

On human-centred security: A new systems model based on modes and mode transitions

Edwin J Beggs,John V Tucker,Victoria Wang

We propose an abstract conceptual framework for analysing complex security systems using a new notion of modes and mode transitions. A mode is an independent component of a system with its own objectives, monitoring data, algorithms, and scope and limits. The behaviour of a mode, including its transitions to other modes, is determined by interpretations of the mode's monitoring data in the light of its objectives and capabilities -- these interpretations we call beliefs. We formalise the conceptual framework mathematically and, by quantifying and visualising beliefs in higher-dimensional geometric spaces, we argue our models may help both design, analyse and explain systems. The mathematical models are based on simplicial complexes.

MoDELS · 假陽性 · 假正例率 · 估計/估計量 · state-of-the-art ·

2024 年 5 月 3 日

Predictive change point detection for heterogeneous data

Anna-Christina Glock,Florian Sobieczky,Johannes Fürnkranz,Peter Filzmoser,Martin Jech

A change point detection (CPD) framework assisted by a predictive machine learning model called "Predict and Compare" is introduced and characterised in relation to other state-of-the-art online CPD routines which it outperforms in terms of false positive rate and out-of-control average run length. The method's focus is on improving standard methods from sequential analysis such as the CUSUM rule in terms of these quality measures. This is achieved by replacing typically used trend estimation functionals such as the running mean with more sophisticated predictive models (Predict step), and comparing their prognosis with actual data (Compare step). The two models used in the Predict step are the ARIMA model and the LSTM recursive neural network. However, the framework is formulated in general terms, so as to allow the use of other prediction or comparison methods than those tested here. The power of the method is demonstrated in a tribological case study in which change points separating the run-in, steady-state, and divergent wear phases are detected in the regime of very few false positives.

估計/估計量 · 線性的 · 動力系統 · UniFormer · 方陣 ·

2024 年 5 月 2 日

Learning linear dynamical systems under convex constraints

Hemant Tyagi,Denis Efimov

from arxiv, 29 pages; added Example 4 (Lipschitz regression), Corollary 4 and Remark 2; corrected minor typos

We consider the problem of finite-time identification of linear dynamical systems from $T$ samples of a single trajectory. Recent results have predominantly focused on the setup where no structural assumption is made on the system matrix $A^* \in \mathbb{R}^{n \times n}$, and have consequently analyzed the ordinary least squares (OLS) estimator in detail. We assume prior structural information on $A^*$ is available, which can be captured in the form of a convex set $\mathcal{K}$ containing $A^*$. For the solution of the ensuing constrained least squares estimator, we derive non-asymptotic error bounds in the Frobenius norm that depend on the local size of $\mathcal{K}$ at $A^*$. To illustrate the usefulness of these results, we instantiate them for four examples, namely when (i) $A^*$ is sparse and $\mathcal{K}$ is a suitably scaled $\ell_1$ ball; (ii) $\mathcal{K}$ is a subspace; (iii) $\mathcal{K}$ consists of matrices each of which is formed by sampling a bivariate convex function on a uniform $n \times n$ grid (convex regression); (iv) $\mathcal{K}$ consists of matrices each row of which is formed by uniform sampling (with step size $1/T$) of a univariate Lipschitz function. In all these situations, we show that $A^*$ can be reliably estimated for values of $T$ much smaller than what is needed for the unconstrained setting.

標注 · 圖像分割 · 類別 · 傳感器 · MoDELS ·

2024 年 5 月 2 日

CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation

Chenying Liu,Conrad Albrecht,Yi Wang,Xiao Xiang Zhu

from arxiv, Accepted as an oral presentation by ICLR 2024 ML4RS workshop

We study the potential of noisy labels y to pretrain semantic segmentation models in a multi-modal learning framework for geospatial applications. Specifically, we propose a novel Cross-modal Sample Selection method (CromSS) that utilizes the class distributions P^{(d)}(x,c) over pixels x and classes c modelled by multiple sensors/modalities d of a given geospatial scene. Consistency of predictions across sensors $d$ is jointly informed by the entropy of P^{(d)}(x,c). Noisy label sampling we determine by the confidence of each sensor d in the noisy class label, P^{(d)}(x,c=y(x)). To verify the performance of our approach, we conduct experiments with Sentinel-1 (radar) and Sentinel-2 (optical) satellite imagery from the globally-sampled SSL4EO-S12 dataset. We pair those scenes with 9-class noisy labels sourced from the Google Dynamic World project for pretraining. Transfer learning evaluations (downstream task) on the DFC2020 dataset confirm the effectiveness of the proposed method for remote sensing image segmentation.

控制器 · 線性的 · 約束 · Extensibility · CASES ·

2024 年 5 月 2 日

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

Severin Bals,Alexandros Evangelidis,Jan K?etínsky,Jakob Waibel

We present MULTIGAIN 2.0, a major extension to the controller synthesis tool MULTIGAIN, built on top of the probabilistic model checker PRISM. This new version extends MULTIGAIN's multi-objective capabilities, by allowing for the formal verification and synthesis of controllers for probabilistic systems with multi-dimensional long-run average reward structures, steady-state constraints, and linear temporal logic properties. Additionally, MULTIGAIN 2.0 can modify the underlying linear program to prevent unbounded-memory and other unintuitive solutions and visualizes Pareto curves, in the two- and three-dimensional cases, to facilitate trade-off analysis in multi-objective scenarios.

維數災難 · 零空間 · 線性的 · 泛函 · 可約的 ·

2024 年 5 月 1 日

The Loewner framework for parametric systems: Taming the curse of dimensionality

Athanasios C. Antoulas,Ion Victor Gosea,Charles Poussot-Vassal

from arxiv, 32 pages, 4 figures

The Loewner framework is an interpolatory framework for the approximation of linear and nonlinear systems. The purpose here is to extend this framework to linear parametric systems with an arbitrary number n of parameters. One main innovation established here is the construction of data-based realizations for any number of parameters. Equally importantly, we show how to alleviate the computational burden, by avoiding the explicit construction of large-scale n-dimensional Loewner matrices of size $N \times N$. This reduces the complexity from $O(N^3)$ to about $O(N^{1.4})$, thus taming the curse of dimensionality and making the solution scalable to very large data sets. To achieve this, a new generalized multivariate rational function realization is defined. Then, we introduce the n-dimensional multivariate Loewner matrices and show that they can be computed by solving a coupled set of Sylvester equations. The null space of these Loewner matrices then allows the construction of the multivariate barycentric transfer function. The principal result of this work is to show how the null space of the n-dimensional Loewner matrix can be computed using a sequence of 1-dimensional Loewner matrices, leading to a drastic computational burden reduction. Finally, we suggest two algorithms (one direct and one iterative) to construct, directly from data, multivariate (or parametric) realizations ensuring (approximate) interpolation. Numerical examples highlight the effectiveness and scalability of the method.

統計量 · MoDELS · 樣本 · 估計/估計量 · Seven ·

2024 年 5 月 1 日

Modern extreme value statistics for Utopian extremes

Jordan Richards,Noura Alotaibi,Daniela Cisneros,Yan Gong,Matheus B. Guerrero,Paolo Redondo,Xuanjie Shao

Capturing the extremal behaviour of data often requires bespoke marginal and dependence models which are grounded in rigorous asymptotic theory, and hence provide reliable extrapolation into the upper tails of the data-generating distribution. We present a toolbox of four methodological frameworks, motivated by modern extreme value theory, that can be used to accurately estimate extreme exceedance probabilities or the corresponding level in either a univariate or multivariate setting. Our frameworks were used to facilitate the winning contribution of Team Yalla to the EVA (2023) Conference Data Challenge, which was organised for the 13$^\text{th}$ International Conference on Extreme Value Analysis. This competition comprised seven teams competing across four separate sub-challenges, with each requiring the modelling of data simulated from known, yet highly complex, statistical distributions, and extrapolation far beyond the range of the available samples in order to predict probabilities of extreme events. Data were constructed to be representative of real environmental data, sampled from the fantasy country of "Utopia"

過擬合 · SimPLe · Principle · 模型評估 · 統計量 ·

2021 年 3 月 16 日

Deep learning: a statistical viewpoint

Peter L. Bartlett,Andrea Montanari,Alexander Rakhlin

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.