We consider the task of estimating a conditional density using i.i.d. samples from a joint distribution, which is a fundamental problem with applications in both classification and uncertainty quantification for regression. For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capacity. When applying these results to conditional density estimation, the use of uniform entropy -- which is infinite when the covariate space is unbounded and suffers from the curse of dimensionality -- can lead to suboptimal rates. Consequently, minimax rates for conditional density estimation cannot be characterized using these classical results. We resolve this problem for well-specified models, obtaining matching (within logarithmic factors) upper and lower bounds on the minimax Kullback--Leibler risk in terms of the empirical Hellinger entropy for the conditional density class. The use of empirical entropy allows us to appeal to concentration arguments based on local Rademacher complexity, which -- in contrast to uniform entropy -- leads to matching rates for large, potentially nonparametric classes and captures the correct dependence on the complexity of the covariate space. Our results require only that the conditional densities are bounded above, and do not require that they are bounded below or otherwise satisfy any tail conditions.

Related content

Minimax

Model evaluation · Characteristic function · Approximation · Functional · Continuity

August 8, 2023

The Random Feature Method for Solving Interface Problems

Xurong Chi,Jingrun Chen,Zhouwang Yang

from arxiv, Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file

Interface problems have long been a major focus of scientific computing, leading to the development of various numerical methods. Traditional mesh-based methods often employ time-consuming body-fitted meshes with standard discretization schemes or unfitted meshes with tailored schemes to achieve controllable accuracy and convergence rate. Along another line, mesh-free methods bypass mesh generation but lack robustness in terms of convergence and accuracy due to the low regularity of solutions. In this study, we propose a novel method for solving interface problems within the framework of the random feature method. This approach utilizes random feature functions in conjunction with a partition of unity as approximation functions. It evaluates partial differential equations, boundary conditions, and interface conditions at collocation points on an equal footing, and solves a linear least-squares system to obtain the approximate solution. To address the issue of low regularity, two sets of random feature functions are used to approximate the solution on each side of the interface, which are then coupled together via interface conditions. We validate our method through a series of increasingly complex numerical examples. Our findings show that despite the solution often being only continuous or even discontinuous, our method not only eliminates the need for mesh generation but also maintains high accuracy, akin to the spectral collocation method for smooth solutions. Remarkably, for the same accuracy requirement, our method requires two to three orders of magnitude fewer degrees of freedom than traditional methods, demonstrating its significant potential for solving interface problems with complex geometries.

MoDELS · Language modeling · Mathematics · Identifiable · Analysis

August 8, 2023

Generating Mathematical Derivations with Large Language Models

Jordan Meadows,Marco Valentino,Andre Freitas

from arxiv, 10 pages

The derivation of mathematical results in specialised fields, using Large Language Models (LLMs), is an emerging research direction that can help identify models' limitations, and potentially support mathematical discovery. In this paper, we leverage a symbolic engine to generate derivations of equations at scale, and investigate the capabilities of LLMs when deriving goal equations from premises. Specifically, we employ in-context learning for GPT and fine-tune a range of T5 models to compare the robustness and generalisation of pre-training strategies to specialised models. Empirical results show that fine-tuned FLAN-T5-large (MathT5) outperforms GPT models on all static and out-of-distribution test sets in conventional scores. However, an in-depth analysis reveals that the fine-tuned models are more sensitive to perturbations involving unseen symbols and (to a lesser extent) changes to equation structure. In addition, we analyse 1.7K equations, and over 200 derivations, to highlight common reasoning errors such as the inclusion of incorrect, irrelevant, and redundant equations. Finally, we explore the suitability of existing metrics for evaluating mathematical derivations and find evidence that, while they can capture general properties such as sensitivity to perturbations, they fail to highlight fine-grained reasoning errors and essential differences between models. Overall, this work demonstrates that training models on synthetic data may improve their math capabilities beyond much larger LLMs, but current metrics are not appropriately assessing the quality of generated mathematical text.

3D · Networking · Backpropagation · Knowledge · Neural Networks

August 7, 2023

Differentiable Rendering for Synthetic Aperture Radar Imagery

Michael Wilmanski,Jonathan Tamir

from arxiv, This version of the manuscript is an updated preprint which has been recently accepted by IEEE Transactions on Aerospace and Electronic Systems, but has not yet been published or processed by IEEE

There is rising interest in differentiable rendering, which allows explicitly modeling geometric priors and constraints in optimization pipelines using first-order methods such as backpropagation. Incorporating such domain knowledge can lead to deep neural networks that are trained more robustly and with limited data, as well as the capability to solve ill-posed inverse problems. Existing efforts in differentiable rendering have focused on imagery from electro-optical sensors, particularly conventional RGB-imagery. In this work, we propose an approach for differentiable rendering of Synthetic Aperture Radar (SAR) imagery, which combines methods from 3D computer graphics with neural rendering. We demonstrate the approach on the inverse graphics problem of 3D Object Reconstruction from limited SAR imagery using high-fidelity simulated SAR data.

Linear · MoDELS · Approximation · Power method · Moment

August 7, 2023

Linear Convergence Bounds for Diffusion Models via Stochastic Localization

Joe Benton,Valentin De Bortoli,Arnaud Doucet,George Deligiannidis

Diffusion models are a powerful method for generating approximate samples from high-dimensional data distributions. Several recent results have provided polynomial bounds on the convergence rate of such models, assuming $L^2$-accurate score estimators. However, up until now the best known such bounds were either superlinear in the data dimension or required strong smoothness assumptions. We provide the first convergence bounds which are linear in the data dimension (up to logarithmic factors) assuming only finite second moments of the data distribution. We show that diffusion models require at most $\tilde O(\frac{d \log^2(1/\delta)}{\varepsilon^2})$ steps to approximate an arbitrary data distribution on $\mathbb{R}^d$ corrupted with Gaussian noise of variance $\delta$ to within $\varepsilon^2$ in Kullback--Leibler divergence. Our proof builds on the Girsanov-based methods of previous works. We introduce a refined treatment of the error arising from the discretization of the reverse SDE, which is based on tools from stochastic localization.
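To make the scaling in the stated bound concrete, the following minimal sketch evaluates only the leading term $d \log^2(1/\delta) / \varepsilon^2$. The function name `step_bound_scaling` is hypothetical, and the constants and logarithmic factors hidden by $\tilde O$ are deliberately left out, so the returned number is useful only for comparing settings, not as an actual step count.

```python
import math


def step_bound_scaling(d: int, delta: float, eps: float) -> float:
    """Leading-order term d * log^2(1/delta) / eps^2 of the abstract's bound.

    Assumption: constants and the log factors hidden by the tilde-O are
    dropped, so only ratios between two settings are meaningful.
    """
    if d <= 0:
        raise ValueError("d must be a positive dimension")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return d * math.log(1.0 / delta) ** 2 / eps ** 2


# Doubling the dimension doubles the leading term (linear in d),
# while halving eps multiplies it by four.
base = step_bound_scaling(d=100, delta=1e-3, eps=0.1)
ratio_dimension = step_bound_scaling(d=200, delta=1e-3, eps=0.1) / base
ratio_accuracy = step_bound_scaling(d=100, delta=1e-3, eps=0.05) / base
```

The two ratios illustrate the dependence claimed in the abstract: linear in the data dimension and inverse-quadratic in $\varepsilon$.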

Neural Networks · Networking · Networks · Optimizer · Reducible

August 6, 2023

Local Randomized Neural Networks Methods for Interface Problems

Yunlong Li,Fei Wang

from arxiv, 22 pages, 15 figures

Accurate modeling of complex physical problems, such as fluid-structure interaction, requires multiphysics coupling across the interface, which often has intricate geometry and dynamic boundaries. Conventional numerical methods face challenges in handling interface conditions. Deep neural networks offer a mesh-free and flexible alternative, but they suffer from drawbacks such as time-consuming optimization and local optima. In this paper, we propose a mesh-free approach based on Randomized Neural Networks (RNNs), which avoid optimization solvers during training, making them more efficient than traditional deep neural networks. Our approach, called Local Randomized Neural Networks (LRNNs), uses different RNNs to approximate solutions in different subdomains. We discretize the interface problem into a linear system at randomly sampled points across the domain, boundary, and interface using a finite difference scheme, and then solve it by a least-squares method. For time-dependent interface problems, we use a space-time approach based on LRNNs. We show the effectiveness and robustness of the LRNN methods through numerical examples of elliptic and parabolic interface problems. We also demonstrate that our approach can handle high-dimensional interface problems. Compared to conventional numerical methods, our approach achieves higher accuracy with fewer degrees of freedom, eliminates the need for complex interface meshing and fitting, and significantly reduces training time, outperforming deep neural networks.

Bandit / slot machine · Graph · Optimizer · Learning · Analysis

August 4, 2023

Improved Algorithms for Bandit with Graph Feedback via Regret Decomposition

Yuchen He,Chihao Zhang

The problem of bandit with graph feedback generalizes both the multi-armed bandit (MAB) problem and the learning with expert advice problem by encoding in a directed graph how the loss vector can be observed in each round of the game. The minimax regret is closely related to the structure of the feedback graph, and their connection is far from being fully understood. We propose a new algorithmic framework for the problem based on a partition of the feedback graph. Our analysis reveals the interplay between various parts of the graph by decomposing the regret into the sum of the regret caused by small parts and the regret caused by their interaction. As a result, our algorithm can be viewed as an interpolation and generalization of the optimal algorithms for MAB and learning with expert advice. Our framework unifies previous algorithms for both strongly observable graphs and weakly observable graphs, resulting in improved and optimal regret bounds on a wide range of graph families including graphs of bounded degree and strongly observable graphs with a few corrupted arms.
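As a small illustration of how a directed feedback graph can encode both special cases named in the abstract, the sketch below represents the graph as a mapping from each arm to the set of arms whose losses are revealed when it is played. This edge convention and all names (`observed_losses`, `bandit_graph`, `expert_graph`) are assumptions made for the example; it shows only the observation model, not the authors' algorithm.

```python
from typing import Dict, List, Set

FeedbackGraph = Dict[int, Set[int]]


def observed_losses(graph: FeedbackGraph, played: int,
                    losses: List[float]) -> Dict[int, float]:
    """Return the losses revealed in one round after playing `played`.

    Assumption: an edge played -> arm means that playing `played`
    reveals the loss of `arm`.
    """
    return {arm: losses[arm] for arm in sorted(graph[played])}


def bandit_graph(num_arms: int) -> FeedbackGraph:
    """Self-loops only: the player sees just the loss of the arm it played."""
    return {arm: {arm} for arm in range(num_arms)}


def expert_graph(num_arms: int) -> FeedbackGraph:
    """Complete graph with self-loops: every loss is revealed each round."""
    return {arm: set(range(num_arms)) for arm in range(num_arms)}


round_losses = [0.1, 0.5, 0.9]
bandit_view = observed_losses(bandit_graph(3), played=1, losses=round_losses)
expert_view = observed_losses(expert_graph(3), played=1, losses=round_losses)
```

With self-loops only, the player learns a single entry of the loss vector (the MAB setting); with the complete graph, the whole vector is revealed (the expert-advice setting). Graphs between these two extremes give the intermediate observation structures the abstract refers to.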

Graph · Learned · MoDELS · Extensibility · Deep learning

February 24, 2022

Bayesian Deep Learning for Graphs

Federico Errica

from arxiv, PhD Thesis

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various kinds. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles on which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

entity · Graph · Knowledge graph · MoDELS · Similarity

September 11, 2019

Domain Representation for Knowledge Graph Embedding

Cunxiang Wang,Feiliang Ren,Zhichao Lin,Chenxv Zhao,Tian Xie,Yue Zhang

from arxiv, Accepted by NLPCC 2019

Embedding entities and relations into a continuous multi-dimensional vector space has become the dominant method for knowledge graph embedding in representation learning. However, most existing models fail to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We propose to learn domain representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-the-art baseline knowledge graph embedding models.

Question answering · MoDELS · Networking · Processing (programming language) · state-of-the-art

June 1, 2018

An Interpretable Reasoning Network for Multi-Relation Question Answering

Mantong Zhou,Minlie Huang,Xiaoyan Zhu

from arxiv, COLING 2018, 13 pages

Multi-relation Question Answering is a challenging task, because it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.

Example · Black box · Networking · MoDELS · Origin

January 15, 2018

Generating Adversarial Examples with Adversarial Networks

Chaowei Xiao,Bo Li,Jun-Yan Zhu,Warren He,Mingyan Liu,Dawn Song

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires further research. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate adversarial perturbations efficiently for any instance, so as to potentially accelerate adversarial training as a defense. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models achieve a high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.