Many problems in science and engineering involve optimizing an expensive black-box function over a high-dimensional space. For such black-box optimization (BBO) problems, we typically assume a small budget for online function evaluations, but we often also have access to a fixed, offline dataset for pretraining. Prior approaches use the offline data to approximate the function or its inverse, but these approximations are not sufficiently accurate far from the data distribution. We propose BONET, a generative framework for pretraining a novel black-box optimizer using offline datasets. In BONET, we train an autoregressive model on fixed-length trajectories derived from an offline dataset. We design a sampling strategy that synthesizes trajectories from offline data using a simple heuristic: rolling out monotonic transitions from low-fidelity to high-fidelity samples. Empirically, we instantiate BONET with a causally masked Transformer and evaluate it on Design-Bench, where it achieves the best average rank, outperforming state-of-the-art baselines.
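To make the trajectory-rollout heuristic concrete, here is a minimal sketch assuming an offline dataset of (design, score) pairs; the function name, trajectory length, and sampling rule are illustrative assumptions, not BONET's exact procedure.

```python
import numpy as np

def build_trajectories(xs, ys, traj_len=8, n_traj=100, seed=0):
    """Roll out monotonic low-to-high-fidelity trajectories from offline data.

    xs: designs of shape (N, d); ys: scores of shape (N,).
    Each trajectory is a sequence of designs whose scores never decrease,
    imitating the behavior of an improving optimizer.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(ys)                     # low-fidelity first, high-fidelity last
    trajectories = []
    for _ in range(n_traj):
        # sample positions along the sorted order and keep them sorted,
        # so scores along each trajectory are monotonically non-decreasing
        idx = np.sort(rng.choice(len(xs), size=traj_len, replace=False))
        trajectories.append(xs[order[idx]])
    return np.stack(trajectories)              # (n_traj, traj_len, d), ready for autoregressive training
```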
We present a generalized FDTD scheme to simulate moving electromagnetic structures with arbitrary space-time configurations. This scheme is a local adaptation and 2+1-dimensional extension of the uniform, 1+1-dimensional scheme recently reported in [1]. The local adaptation, enabled by the inherently matched nature of the generalized Yee cell to the conventional Yee cell, extends the range of applicability of the scheme in [1] to moving structures with multiple and arbitrary velocity profiles, while remaining fully compatible with conventional absorbing boundary conditions and standard treatments of medium dispersion. We first show that a direct application of the conventional FDTD scheme predicts qualitatively correct spectral transitions but quantitatively erroneous scattering amplitudes. From this observation, we infer generalized, hybrid fields - physical and auxiliary (non-physical) - that automatically satisfy the moving boundary conditions in the laboratory frame, and we accordingly establish local update equations based on the related Maxwell's equations and constitutive relations. We subsequently provide a detailed stability analysis with a generalization of the Courant criterion to the dynamic regime. We finally validate and illustrate the proposed method with several representative examples. The proposed scheme fills an important gap in the open literature on computational electromagnetics and offers an unprecedented, direct solution for moving structures in commercial software platforms.
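For reference, the conventional 2D FDTD Courant stability criterion that this work generalizes to the dynamic (moving-structure) regime is the standard bound below; the generalized criterion itself is derived in the paper and not reproduced here.

```latex
\Delta t \;\le\; \frac{1}{c\,\sqrt{\dfrac{1}{(\Delta x)^{2}} + \dfrac{1}{(\Delta y)^{2}}}}
```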
Optimal Transport (OT) is a useful metric for comparing probability distributions and for computing a pairing given a ground cost. Its entropic regularization variant (eOT) is crucial for fast algorithms and for reflecting fuzzy/noisy matchings. This work focuses on Inverse Optimal Transport (iOT), the problem of inferring the ground cost from samples drawn from a coupling that solves an eOT problem. It is a relevant problem that can be used to infer unobserved or missing links and to obtain meaningful information about the structure of the ground cost yielding the pairing. On the one hand, iOT benefits from convexity; on the other hand, being ill-posed, it requires regularization to handle the sampling noise. This work presents an in-depth theoretical study of l1 regularization, used to model, for instance, Euclidean costs with sparse interactions between features. Specifically, we derive a sufficient condition for the robust recovery of the sparsity of the ground cost that can be seen as a far-reaching generalization of the Lasso's celebrated Irrepresentability Condition. To provide additional insight into this condition, we work out the Gaussian case in detail. We show that, as the entropic penalty varies, the iOT problem interpolates between a graphical Lasso and a classical Lasso, thereby establishing a connection between iOT and graph estimation, an important problem in ML.
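As a reminder of the forward problem being inverted, the entropic OT coupling for marginals \mu, \nu and ground cost C, together with a generic l1-regularized inverse problem, can be written as follows; the notation is ours and is only a sketch of the setup, so the paper's precise estimator may differ.

```latex
\pi_{\varepsilon}(C) \;=\; \operatorname*{arg\,min}_{\pi \in \Pi(\mu,\nu)}\;
  \langle C, \pi\rangle \;+\; \varepsilon\,\mathrm{KL}\!\big(\pi \,\|\, \mu \otimes \nu\big),
\qquad
\widehat{C} \;\in\; \operatorname*{arg\,min}_{C}\;
  \mathcal{L}\big(\pi_{\varepsilon}(C);\,\hat{\pi}_{\mathrm{samples}}\big) \;+\; \lambda\,\lVert C \rVert_{1},
```

where \(\mathcal{L}\) denotes a data-fitting loss, for instance the negative log-likelihood of the observed pairs.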
Nonlinear Boolean equation systems play an important role in a wide range of applications. Grover's algorithm is one of the best-known quantum search algorithms for solving nonlinear Boolean equation systems on quantum computers. In this paper, we propose three novel techniques to improve efficiency within the framework of Grover's algorithm. First, a W-cycle circuit construction uses recursion to increase the number of Boolean equations that can be solved with a fixed number of qubits. Second, a greedy compression technique reduces the depth of the oracle circuit. Finally, a randomized Grover's algorithm chooses a random subset of the equations to form the oracle at every iteration, which further reduces the circuit depth and the number of ancilla qubits. Numerical results on Boolean quadratic equations demonstrate the efficiency of the proposed techniques.
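For context, in the standard Grover framework that these techniques build on, finding one of M satisfying assignments among N = 2^n candidate assignments takes approximately the following number of oracle calls; this is a textbook result, and the proposed techniques modify the oracle circuit rather than this scaling.

```latex
r \;\approx\; \left\lfloor \frac{\pi}{4}\sqrt{\frac{N}{M}} \right\rfloor,
\qquad N = 2^{n}.
```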
A novel unconstrained optimization model named weighted trace-penalty minimization (WTPM) is proposed to address the extreme eigenvalue problem arising from the Full Configuration Interaction (FCI) method. Theoretical analysis shows that the global minimizers of the WTPM objective function are the desired eigenvectors themselves, rather than just the eigenspace they span. A detailed analysis of the condition number of the Hessian operator guides the choice of a near-optimal weight matrix. With the sparsity of FCI matrices in mind, the coordinate descent (CD) method is adapted to WTPM, resulting in the WTPM-CD method. The reduction of computational and storage costs in each iteration shows the efficiency of the proposed algorithm. Finally, numerical experiments demonstrate that the method can handle large-scale FCI matrices.
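For orientation, the standard (unweighted) trace-penalty objective that WTPM augments with a weight matrix has the well-known form below, whose global minimizers span the invariant subspace of the k smallest eigenvalues for a suitable penalty parameter \mu; the weighting is what singles out individual eigenvectors rather than an arbitrary basis of that subspace. The notation is ours, and the exact weighted objective and weight-matrix choice are given in the paper.

```latex
\min_{X \in \mathbb{R}^{n \times k}} \ \ \frac{1}{2}\,\operatorname{tr}\!\left(X^{\top} A X\right)
\;+\; \frac{\mu}{4}\,\big\lVert X^{\top} X - I_{k} \big\rVert_{F}^{2}
```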
Cost-guided bottom-up search (BUS) algorithms use a cost function to guide the search in program synthesis tasks. In this paper, we show that current state-of-the-art cost-guided BUS algorithms suffer from a common problem: they can lose useful information given by the model and fail to search in a best-first order according to the cost function. We introduce a novel best-first bottom-up search algorithm, which we call Bee Search, that does not suffer from this information loss and performs cost-guided bottom-up synthesis in a best-first manner. Importantly, Bee Search is best-first with respect to the generation of programs: it does not even create in memory programs that are more expensive than the solution program. It attains this ordering by searching in an abstract space of program costs. We also introduce a new cost function that better uses the information provided by an existing cost model. Empirical results on string manipulation and bit-vector tasks show that Bee Search outperforms existing cost-guided BUS approaches when more complex domain-specific languages (DSLs) are employed; Bee Search and previous approaches perform equally well with simpler DSLs. Furthermore, our new cost function combined with Bee Search outperforms previous cost functions on string manipulation tasks.
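The following is a minimal, self-contained sketch of best-first enumeration over an abstract space of program costs on a toy arithmetic DSL; the names, the unit-cost model, and the DSL are illustrative assumptions, and the real Bee Search operates on the grammar of the target DSL.

```python
import heapq

# Toy DSL: integer literals and two binary operators, all with unit cost.
LITERALS = [1, 2]
OPS = [("+", lambda a, b: a + b), ("*", lambda a, b: a * b)]
OP_COST = 1.0

def best_first_cost_search(target, max_cost=12.0):
    """Enumerate programs in increasing cost order: a cost level is expanded
    only once it is the cheapest unexplored level, so no program more
    expensive than the solution is ever materialized."""
    programs = {1.0: [(lit, str(lit)) for lit in LITERALS]}   # cost -> [(value, expr)]
    heap, explored = [1.0], []
    while heap:
        cost = heapq.heappop(heap)
        if cost in explored or cost > max_cost:
            continue
        for value, expr in programs.get(cost, []):
            if value == target:
                return expr, cost                              # cheapest solution found first
        # combine the newly reached cost level with every cheaper (or equal) level
        for prev in explored + [cost]:
            new_cost = cost + prev + OP_COST
            if new_cost > max_cost:
                continue
            bucket = programs.setdefault(new_cost, [])
            for name, fn in OPS:
                for v1, e1 in programs.get(cost, []):
                    for v2, e2 in programs.get(prev, []):
                        bucket.append((fn(v1, v2), f"({e1}{name}{e2})"))
                        bucket.append((fn(v2, v1), f"({e2}{name}{e1})"))
            if new_cost not in explored and new_cost not in heap:
                heapq.heappush(heap, new_cost)
        explored.append(cost)
    return None

# Example: best_first_cost_search(6) returns a cheapest expression evaluating to 6.
```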
It is important to retrain a machine learning (ML) model in order to maintain its performance as the data changes over time. However, this can be costly as it usually requires processing the entire dataset again. This creates a trade-off between retraining too frequently, which leads to unnecessary computing costs, and not retraining often enough, which results in stale and inaccurate ML models. To address this challenge, we propose ML systems that make automated and cost-effective decisions about when to retrain an ML model. We aim to optimize the trade-off by considering the costs associated with each decision. Our research focuses on determining whether to retrain or keep an existing ML model based on various factors, including the data, the model, and the predictive queries answered by the model. Our main contribution is a Cost-Aware Retraining Algorithm called Cara, which optimizes the trade-off over streams of data and queries. To evaluate the performance of Cara, we analyzed synthetic datasets and demonstrated that Cara can adapt to different data drifts and retraining costs while performing similarly to an optimal retrospective algorithm. We also conducted experiments with real-world datasets and showed that Cara achieves better accuracy than drift detection baselines while making fewer retraining decisions, ultimately resulting in lower total costs.
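A minimal sketch of one cost-aware retraining rule in the same spirit, assuming we can estimate the staleness cost incurred per batch of queries and know a (roughly) fixed retraining cost; the rule, names, and numbers are illustrative and not Cara's actual algorithm.

```python
def should_retrain(staleness_costs, retrain_cost):
    """Ski-rental-style rule: retrain once the staleness cost accumulated
    since the last retraining matches the cost of retraining itself."""
    return sum(staleness_costs) >= retrain_cost


# Usage over a stream of query batches with estimated staleness costs:
accumulated = []
for batch_staleness in [0.1, 0.3, 0.2, 0.9]:       # hypothetical per-batch estimates
    accumulated.append(batch_staleness)
    if should_retrain(accumulated, retrain_cost=1.2):
        print("retrain the model now")
        accumulated = []                            # reset after retraining
```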
Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling. Fueled by their flexible formulation and the strong modeling power of the latent space, recent works built upon them have made interesting attempts at interpretable text modeling. However, latent space EBMs also inherit some flaws from EBMs in data space: degenerate MCMC sampling quality in practice can lead to poor generation quality and instability in training, especially on data with complex latent structures. Inspired by recent efforts that leverage diffusion recovery likelihood learning as a cure for the sampling issue, we introduce a novel symbiosis between diffusion models and latent space EBMs in a variational learning framework, coined the latent diffusion energy-based model. We develop a geometric clustering-based regularization, jointly with the information bottleneck, to further improve the quality of the learned latent space. Experiments on several challenging tasks demonstrate the superior performance of our model on interpretable text modeling over strong counterparts.
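For background, the diffusion recovery likelihood idea borrowed here replaces sampling from the EBM marginal with sampling from a conditional distribution that is close to Gaussian when the noise level is small; the notation below is ours and is written for a generic latent variable z.

```latex
p_{\theta}\!\left(z \mid \tilde{z}\right) \;\propto\;
\exp\!\Big( f_{\theta}(z) \;-\; \tfrac{1}{2\sigma^{2}}\,\lVert \tilde{z} - z \rVert_{2}^{2} \Big),
\qquad \tilde{z} = z + \sigma\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I).
```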
Federated Learning (FL) is a decentralized machine-learning paradigm in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges on FL, as it can produce drifted global models that are slow to converge. Knowledge distillation has recently emerged to tackle this issue by refining the server model using aggregated knowledge from heterogeneous users, rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner; the generator is then broadcast to users, regulating local training by using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state of the art.
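One plausible way such a server-learned generator could regulate local training is sketched below (PyTorch-style); the model split, the generator interface, and the loss weighting are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def local_step_with_generator(model, generator, x, y, num_classes, lam=1.0):
    """One local objective combining the user's own data with knowledge
    distilled from a server-broadcast generator (illustrative sketch).

    Assumes `model` exposes `features` and `classifier` submodules and that
    `generator(z, labels)` produces synthetic feature vectors.
    """
    # standard supervised loss on the user's local data
    ce = F.cross_entropy(model.classifier(model.features(x)), y)

    # inductive bias from the server: the local head should map generated
    # features of a sampled label back to that label
    y_fake = torch.randint(0, num_classes, (x.size(0),))
    z = torch.randn(x.size(0), generator.noise_dim)
    reg = F.cross_entropy(model.classifier(generator(z, y_fake)), y_fake)

    return ce + lam * reg
```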
Data augmentation has been widely used to improve the generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits the possible manipulation operations: augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node classification. We discuss practical and theoretical motivations, considerations, and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure. Our main contribution introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
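A minimal sketch of edge-prediction-guided augmentation on a dense adjacency matrix, assuming an external edge predictor has already scored every node pair; the fractions, names, and thresholding rule are illustrative assumptions rather than GAug's exact procedure.

```python
import numpy as np

def augment_graph(adj, edge_probs, add_frac=0.05, drop_frac=0.05):
    """Add the most likely missing edges and drop the least likely existing
    edges, as suggested by a neural edge predictor.

    adj:        binary adjacency matrix, shape (n, n), symmetric, zero diagonal
    edge_probs: predicted edge probability for each node pair, shape (n, n)
    """
    n = adj.shape[0]
    iu = np.triu_indices(n, k=1)                          # operate on the upper triangle
    flat_adj, flat_p = adj[iu], edge_probs[iu]
    existing = np.flatnonzero(flat_adj == 1)
    missing = np.flatnonzero(flat_adj == 0)

    k_add = int(add_frac * existing.size)
    k_drop = int(drop_frac * existing.size)
    add = missing[np.argsort(-flat_p[missing])[:k_add]]     # highest-probability non-edges
    drop = existing[np.argsort(flat_p[existing])[:k_drop]]  # lowest-probability edges

    new_flat = flat_adj.copy()
    new_flat[add], new_flat[drop] = 1, 0
    new_adj = np.zeros_like(adj)
    new_adj[iu] = new_flat
    return new_adj + new_adj.T                            # symmetrize back
```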
Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features - which occur regularly in real-world input domains and within the hidden layers of GNNs - and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). Finally, we compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of our model. With this work, we hope to steer some of the GNN research towards new aggregation methods which we believe are essential in the search for powerful and robust models.
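To make the aggregator/scaler combination concrete, the following sketches a single PNA-style neighbourhood aggregation in numpy; the learned transformations applied before and after aggregation in the actual layer are omitted, and the variable names are ours.

```python
import numpy as np

def pna_aggregate(neighbor_feats, delta):
    """Combine several aggregators with degree-based scalers, PNA-style.

    neighbor_feats: features of one node's neighbours, shape (deg, f), deg >= 1
    delta:          average of log(degree + 1) over the training graphs
    """
    deg = neighbor_feats.shape[0]
    aggregated = np.concatenate([
        neighbor_feats.mean(axis=0),
        neighbor_feats.max(axis=0),
        neighbor_feats.min(axis=0),
        neighbor_feats.std(axis=0),
    ])                                                    # multiple aggregators
    s = np.log(deg + 1) / delta
    scalers = [1.0, s, 1.0 / s]                           # identity, amplification, attenuation
    return np.concatenate([w * aggregated for w in scalers])   # shape (3 * 4 * f,)
```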