Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in the field of generative modeling due to their flexibility in formulation and the strong modeling power of the latent space. However, the common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling is hindering the model from further progress; the degenerate MCMC sampling quality in practice often leads to degraded generation quality and instability in training, especially with highly multi-modal and/or high-dimensional target distributions. To remedy this sampling issue, in this paper we introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it. We provide theoretical evidence that the learned amortization of MCMC is a valid long-run MCMC sampler. Experiments on several image modeling benchmark datasets demonstrate the superior performance of our method compared with strong counterparts.
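To make the short-run versus long-run distinction concrete, here is a minimal sketch of unadjusted Langevin sampling on a toy one-dimensional energy. It is not the paper's diffusion-based amortization method; the Gaussian energy, the step size, the chain lengths, and all function names are assumptions chosen only for illustration.

```python
import random

def grad_energy(z, mu=3.0):
    # Toy energy E(z) = 0.5 * (z - mu)^2, so dE/dz = z - mu.
    # (Hypothetical target; a learned latent EBM would replace this.)
    return z - mu

def langevin_chain(num_steps, step_size, rng, mu=3.0):
    # Start from noise, as short-run MCMC typically does.
    z = rng.gauss(0.0, 1.0)
    for _ in range(num_steps):
        noise = rng.gauss(0.0, 1.0)
        # Unadjusted Langevin update: gradient step plus Gaussian noise.
        z = z - 0.5 * step_size ** 2 * grad_energy(z, mu) + step_size * noise
    return z

def mean_of_chains(num_chains, num_steps, step_size=0.3, seed=0):
    # Average the final state of many independent chains.
    rng = random.Random(seed)
    samples = [langevin_chain(num_steps, step_size, rng) for _ in range(num_chains)]
    return sum(samples) / num_chains

if __name__ == "__main__":
    print("short-run mean:", mean_of_chains(400, 10))   # still biased toward the init
    print("long-run mean:", mean_of_chains(400, 300))   # close to the target mean 3.0
```

With only 10 steps the chains remain biased toward their noise initialization, while 300 steps bring the sample mean close to the target mean. This gap is the kind of non-convergence the abstract attributes to short-run MCMC.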

Related content

MCMC

Forward · MoDELS · Cost · Reducible · Unbiased estimation

November 20, 2023

Forward Gradients for Data-Driven CFD Wall Modeling

Jan Hückelheim, Tadbhagya Kumar, Krishnan Raghavan, Pinaki Pal

Computational Fluid Dynamics (CFD) is used in the design and optimization of gas turbines and many other industrial and scientific applications. However, its practical use is often limited by the high computational cost, and the accurate resolution of near-wall flow is a significant contributor to this cost. Machine learning (ML) and other data-driven methods can complement existing wall models. Nevertheless, training these models is bottlenecked by the large computational effort and memory footprint demanded by back-propagation. Recent work has presented alternatives for computing gradients of neural networks where a separate forward and backward sweep is not needed and storage of intermediate results between sweeps is not required because an unbiased estimator for the gradient is computed in a single forward sweep. In this paper, we discuss the application of this approach for training a subgrid wall model that could potentially be used as a surrogate in wall-bounded flow CFD simulations to reduce the computational overhead while preserving predictive accuracy.
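The idea of an unbiased gradient estimator computed in a single forward sweep can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' wall-model training code: it uses a hand-written dual-number class for forward-mode differentiation, a hypothetical sum-of-squares objective, and a random direction v drawn from a standard normal, so that (∇f · v) v equals the gradient in expectation.

```python
import random

class Dual:
    """Number val + tan * eps with eps^2 = 0; tan carries a directional derivative."""
    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.tan + other.tan)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule for the tangent part.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)
    __rmul__ = __mul__

def loss(x):
    # Hypothetical toy objective: sum of squares, whose true gradient is 2 * x.
    total = Dual(0.0)
    for xi in x:
        total = total + xi * xi
    return total

def forward_gradient(f, x, rng):
    # Sample a random direction v ~ N(0, I).
    v = [rng.gauss(0.0, 1.0) for _ in x]
    # One forward sweep yields the directional derivative grad(f) . v in out.tan.
    out = f([Dual(xi, vi) for xi, vi in zip(x, v)])
    # (grad(f) . v) * v is an unbiased estimate of grad(f).
    return [out.tan * vi for vi in v]

def averaged_forward_gradient(f, x, n_samples, seed=0):
    # Averaging many estimates only demonstrates unbiasedness;
    # a training loop would typically use one estimate per step.
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        g = forward_gradient(f, x, rng)
        acc = [a + gi for a, gi in zip(acc, g)]
    return [a / n_samples for a in acc]
```

No intermediate results are stored for a backward pass: each estimate needs only the tangent values that travel alongside the forward computation. The price is variance, which is why a single estimate is noisy even though its expectation is the true gradient.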

BERT · Identifiable · White-box · Understandability · CASE

November 19, 2023

Tensor-Aware Energy Accounting

Timur Babakol, Yu David Liu

With the rapid growth of Artificial Intelligence (AI) applications supported by deep learning (DL), the energy efficiency of these applications has an increasingly large impact on sustainability. We introduce Smaragdine, a new energy accounting system for tensor-based DL programs implemented with TensorFlow. At the heart of Smaragdine is a novel white-box methodology of energy accounting: Smaragdine is aware of the internal structure of the DL program, which we call tensor-aware energy accounting. With Smaragdine, the energy consumption of a DL program can be broken down into units aligned with its logical hierarchical decomposition structure. We apply Smaragdine for understanding the energy behavior of BERT, one of the most widely used language models. Layer-by-layer and tensor-by-tensor, Smaragdine is capable of identifying the highest energy/power-consuming components of BERT. Furthermore, we conduct two case studies on how Smaragdine supports downstream toolchain building, one on the comparative energy impact of hyperparameter tuning of BERT, the other on the energy behavior evolution when BERT evolves to its next generation, ALBERT.

Quadratic programming · Square matrix · Minimum point · MATLAB · Optimizer

November 19, 2023

On Non-Negative Quadratic Programming in Geometric Optimization

Siu-Wing Cheng, Man Ting Wong

We present experimental and theoretical results on a method that applies a numerical solver iteratively to solve several non-negative quadratic programming problems in geometric optimization. The method gains efficiency by exploiting the potential sparsity of the intermediate solutions. We implemented the method to call quadprog of MATLAB iteratively. In comparison with a single call of quadprog, we obtain a 10-fold speedup on two proximity graph problems in $\mathbb{R}^d$ on some public data sets, a 10-fold speedup on the minimum enclosing ball problem on random points in a unit cube in $\mathbb{R}^d$, and a 5-fold speedup on the polytope distance problem on random points from a cube in $\mathbb{R}^d$ when the input size is significantly larger than the dimension; we also obtain a 2-fold or more speedup on deblurring some gray-scale space and thermal images via non-negative least squares. We compare with two minimum enclosing ball software packages by Gärtner and Fischer et al.; for 1000 nearly cospherical points or random points in a unit cube, the iterative method overtakes the software by Gärtner at 20 dimensions and the software by Fischer et al. at 170 dimensions. In the image deblurring experiments, the iterative method compares favorably with other software that can solve non-negative least squares, including FISTA with backtracking, SBB, FNNLS, and lsqnonneg of MATLAB. We analyze theoretically the number of iterations taken by the iterative scheme to reduce the gap between the current solution value and the optimum by a factor $e$. Under certain assumptions, we prove a bound proportional to the square root of the number of variables.

Networking · Performer · Knowledge · Scenario · MoDELS

November 19, 2023

Open Set Dandelion Network for IoT Intrusion Detection

Jiashu Wu, Hao Dai, Kenneth B. Kent, Jerome Yen, Chengzhong Xu, Yang Wang

As IoT devices become widely deployed, it is crucial to protect them from malicious intrusions. However, the data scarcity of IoT limits the applicability of traditional intrusion detection methods, which are highly data-dependent. To address this, in this paper we propose the Open-Set Dandelion Network (OSDN) based on unsupervised heterogeneous domain adaptation in an open-set manner. The OSDN model performs intrusion knowledge transfer from the knowledge-rich source network intrusion domain to facilitate more accurate intrusion detection for the data-scarce target IoT intrusion domain. Under the open-set setting, it can also detect newly-emerged target domain intrusions that are not observed in the source domain. To achieve this, the OSDN model forms the source domain into a dandelion-like feature space in which each intrusion category is compactly grouped and different intrusion categories are separated, i.e., simultaneously emphasising inter-category separability and intra-category compactness. The dandelion-based target membership mechanism then forms the target dandelion. Then, the dandelion angular separation mechanism achieves better inter-category separability, and the dandelion embedding alignment mechanism further aligns both dandelions in a finer manner. To promote intra-category compactness, the discriminating sampled dandelion mechanism is used. Assisted by the intrusion classifier trained using both known and generated unknown intrusion knowledge, a semantic dandelion correction mechanism emphasises easily-confused categories and guides better inter-category separability. Holistically, these mechanisms form the OSDN model that effectively performs intrusion knowledge transfer to benefit IoT intrusion detection. Comprehensive experiments on several intrusion datasets verify the effectiveness of the OSDN model, outperforming three state-of-the-art baseline methods by 16.9%.

Estimation/Estimator · OD · UKF · Robustness · Origin

November 17, 2023

Square-Root Higher-Order Unscented Estimators for Robust Orbit Determination

Yang Yang

Orbit determination (OD) is a fundamental problem in space surveillance and tracking, crucial for ensuring the safety of space assets. Real-world ground-based optical tracking scenarios often involve challenges such as limited measurement time, short visible arcs, and the presence of outliers, leading to sparse and non-Gaussian observational data. Additionally, the highly perturbative and nonlinear orbit dynamics of resident space objects (RSOs) in low Earth orbit (LEO) add further complexity to the OD problem. This paper introduces a novel variant of the higher-order unscented Kalman estimator (HOUSE) called $w$-HOUSE, which employs a square-root formulation and addresses the challenges posed by nonlinear and non-Gaussian OD problems. The effectiveness of $w$-HOUSE was demonstrated through synthetic and real-world measurements, specifically outlier-contaminated angle-only measurements collected for the Sentinel 6A satellite flying in LEO. Comparative analyses are conducted with the original HOUSE (referred to as $\delta$-HOUSE), unscented Kalman filters (UKF), conjugate unscented transformation (CUT) filters, and precise orbit determination solutions estimated via onboard global navigation satellite systems measurements. The results reveal that the proposed $w$-HOUSE filter exhibits greater robustness when dealing with varying values of the dependent parameter compared to the original $\delta$-HOUSE. Moreover, it surpasses all other filters in terms of positioning accuracy, achieving three-dimensional root-mean-square errors of less than 60 m in a three-day scenario. This research suggests that the new $w$-HOUSE filter represents a viable alternative to UKF and CUT filters, offering improved positioning performance in handling the nonlinear and non-Gaussian OD problem associated with LEO RSOs.

Generalization theory · Domain shift · Performer · Modality · Data acquisition

November 16, 2023

Gradient-Map-Guided Adaptive Domain Generalization for Cross Modality MRI Segmentation

Bingnan Li, Zhitong Gao, Xuming He

from arxiv, 9 pages, Machine Learning for Health (ML4H) 2023

Cross-modal MRI segmentation is of great value for computer-aided medical diagnosis, enabling flexible data acquisition and model generalization. However, most existing methods have difficulty in handling local variations in domain shift and typically require a significant amount of data for training, which hinders their usage in practice. To address these problems, we propose a novel adaptive domain generalization framework, which integrates a learning-free cross-domain representation based on image gradient maps and a class prior-informed test-time adaptation strategy for mitigating local domain shift. We validate our approach on two multi-modal MRI datasets with six cross-modal segmentation tasks. Across all the task settings, our method consistently outperforms competing approaches and shows a stable performance even with limited training data.

Med-PaLM 2 · Performer · Language modeling · MoDELS · Question answering

May 16, 2023

Towards Expert-Level Medical Question Answering with Large Language Models

Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, Kevin Clark, Stephen Pfohl, Heather Cole-Lewis, Darlene Neal, Mike Schaekermann, Amy Wang, Mohamed Amin, Sami Lachgar, Philip Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Nenad Tomasev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.

Graph · Networking · Processing (programming language) · Graph convolution · Graph convolutional neural network/Graph convolutional network

December 27, 2021

Powerful Graph Convolutional Networks with Adaptive Propagation Mechanism for Homophily and Heterophily

Tao Wang, Rui Wang, Di Jin, Dongxiao He, Yuxiao Huang

Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their significant power in processing graph-structured data. Typical GCNs and their variants work under a homophily assumption (i.e., nodes of the same class are prone to connect to each other), while ignoring the heterophily which exists in many real-world networks (i.e., nodes of different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining the immediate representations, which leads to noise and irrelevant information in the result. But these methods do not change the propagation mechanism, a fundamental part of GCNs, which works under the homophily assumption. This makes it difficult to distinguish the representations of nodes from different classes. To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of homophily degree between node pairs, which are learned based on topological and attribute information, respectively. Then we incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end schema, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms the state-of-the-art methods under heterophily or low homophily, and gains competitive performance under homophily.

GPU · MoDELS · Networking · Neural Networks · Graph

June 9, 2021

Cross-Node Federated Graph Neural Network for Spatio-Temporal Data Modeling

Chuizheng Meng, Sirisha Rambhatla, Yan Liu

from arxiv, To be published in the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 21)

The vast amount of data generated from networks of sensors, wearables, and the Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of decentralized data due to the need for edge computation and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities still remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes is generated locally on each node and remains decentralized. CNFGNN operates by disentangling the temporal dynamics modeling on devices and spatial dynamics on the server, utilizing alternating optimization to reduce the communication cost, facilitating computations on the edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring modest communication cost.

Entity · Few-shot learning · Attention mechanism · Graph · Networking

October 19, 2020

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng, Shu Guo, Zhenyu Chen, Juwei Yue, Lihong Wang, Tingwen Liu, Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.