国产高清一区二区在线影院_国产精品一看一级毛片_人妻无码一区二区三区不卡_国产精品亚洲一区二区无码色欲_国内精品视频在线播放_女生下面喷水的视频_99久久久国产精品99综合无

Understanding the relationship between different parts of the image plays a crucial role in many visual recognition tasks. Despite the fact that Convolutional Neural Networks (CNNs) have demonstrated impressive results in detecting single objects, they lack the capability to extract the relationship between various regions of an image, which is a crucial factor in human action recognition. To address this problem, this paper proposes a new module that functions like a convolutional layer using Vision Transformer (ViT). The proposed action recognition model comprises two components: the first part is a deep convolutional network that extracts high-level spatial features from the image, and the second component of the model utilizes a Vision Transformer that extracts the relationship between various regions of the image using the feature map generated by the CNN output. The proposed model has been evaluated on the Stanford40 and PASCAL VOC 2012 action datasets and has achieved 95.5% mAP and 91.5% mAP results, respectively, which are promising compared to other state-of-the-art methods.

相關內容

卷(juan)積

關注 11

在數(shu)(shu)學（特別是(shi)功能分(fen)(fen)析）中，卷(juan)積(ji)是(shi)對兩(liang)個(ge)函(han)(han)數(shu)(shu)（f和g）的(de)數(shu)(shu)學運(yun)算，產生三個(ge)函(han)(han)數(shu)(shu)，表示(shi)第一(yi)個(ge)函(han)(han)數(shu)(shu)的(de)形(xing)狀如何被另一(yi)個(ge)函(han)(han)數(shu)(shu)修改。卷(juan)積(ji)一(yi)詞既指結果函(han)(han)數(shu)(shu)，又指計(ji)算結果的(de)過程。它定義為兩(liang)個(ge)函(han)(han)數(shu)(shu)的(de)乘積(ji)在一(yi)個(ge)函(han)(han)數(shu)(shu)反轉和移位(wei)后的(de)積(ji)分(fen)(fen)。并(bing)針(zhen)對所(suo)有(you)shift值評(ping)估積(ji)分(fen)(fen)，從而生成卷(juan)積(ji)函(han)(han)數(shu)(shu)。

Neural Networks · Networking · Analysis · 環 · MoDELS ·

2023 年 9 月 11 日

Quantitative Verification with Neural Networks

Alessandro Abate,Alec Edwards,Mirco Giacobbe,Hashan Punchihewa,Diptarko Roy

from arxiv, The conference version of this manuscript appeared at CONCUR 2023

We present a data-driven approach to the quantitative verification of probabilistic programs and stochastic dynamical models. Our approach leverages neural networks to compute tight and sound bounds for the probability that a stochastic process hits a target condition within finite time. This problem subsumes a variety of quantitative verification questions, from the reachability and safety analysis of discrete-time stochastic dynamical models, to the study of assertion-violation and termination analysis of probabilistic programs. We rely on neural networks to represent supermartingale certificates that yield such probability bounds, which we compute using a counterexample-guided inductive synthesis loop: we train the neural certificate while tightening the probability bound over samples of the state space using stochastic optimisation, and then we formally check the certificate's validity over every possible state using satisfiability modulo theories; if we receive a counterexample, we add it to our set of samples and repeat the loop until validity is confirmed. We demonstrate on a diverse set of benchmarks that, thanks to the expressive power of neural networks, our method yields smaller or comparable probability bounds than existing symbolic methods in all cases, and that our approach succeeds on models that are entirely beyond the reach of such alternative techniques.

Networking · 可約的 · Performer · 相關系數 · Integration ·

2023 年 9 月 11 日

Beamforming in Wireless Coded-Caching Systems

Sneha Madhusudan,Charitha Madapatha,Behrooz Makki,Hao Guo,Tommy Svensson

from arxiv, Submitted to IEEE Future Networks World Forum, 2023

Increased capacity in the access network poses capacity challenges on the transport network due to the aggregated traffic. However, there are spatial and time correlation in the user data demands that could potentially be utilized. To that end, we investigate a wireless transport network architecture that integrates beamforming and coded-caching strategies. Especially, our proposed design entails a server with multiple antennas that broadcasts content to cache nodes responsible for serving users. Traditional caching methods face the limitation of relying on the individual memory with additional overhead. Hence, we develop an efficient genetic algorithm-based scheme for beam optimization in the coded-caching system. By exploiting the advantages of beamforming and coded-caching, the architecture achieves gains in terms of multicast opportunities, interference mitigation, and reduced peak backhaul traffic. A comparative analysis of this joint design with traditional, un-coded caching schemes is also conducted to assess the benefits of the proposed approach. Additionally, we examine the impact of various buffering and decoding methods on the performance of the coded-caching scheme. Our findings suggest that proper beamforming is useful in enhancing the effectiveness of the coded-caching technique, resulting in significant reduction in peak backhaul traffic.

TransAct · 設計 · 情景 · 極大 · 數學 ·

2023 年 9 月 8 日

Maximizing Miner Revenue in Transaction Fee Mechanism Design

Ke Wu,Elaine Shi,Hao Chung

Transaction fee mechanism design is a new decentralized mechanism design problem where users bid for space on the blockchain. Several recent works showed that the transaction fee mechanism design fundamentally departs from classical mechanism design. They then systematically explored the mathematical landscape of this new decentralized mechanism design problem in two settings: in the plain setting where no cryptography is employed, and in a cryptography-assisted setting where the rules of the mechanism are enforced by a multi-party computation protocol. Unfortunately, in both settings, prior works showed that if we want the mechanism to incentivize honest behavior for both users as well as miners (possibly colluding with users), then the miner revenue has to be zero. Although adopting a relaxed, approximate notion of incentive compatibility gets around this zero miner-revenue limitation, the scaling of the miner revenue is nonetheless poor. In this paper, we show that if we make a mildly stronger reasonable-world assumption than prior works, we can circumvent the known limitations on miner revenue, and design auctions that generate optimal miner revenue. We also systematically explore the mathematical landscape of transaction fee mechanism design under the new reasonable-world and demonstrate how such assumptions can alter the feasibility and infeasibility landscape.

圖像分割 · MoDELS · INTERACT · GROUP · FAST ·

2023 年 9 月 8 日

Grouping Boundary Proposals for Fast Interactive Image Segmentation

Li Liu,Da Chen,Minglei Shu,Laurent D. Cohen

Geodesic models are known as an efficient tool for solving various image segmentation problems. Most of existing approaches only exploit local pointwise image features to track geodesic paths for delineating the objective boundaries. However, such a segmentation strategy cannot take into account the connectivity of the image edge features, increasing the risk of shortcut problem, especially in the case of complicated scenario. In this work, we introduce a new image segmentation model based on the minimal geodesic framework in conjunction with an adaptive cut-based circular optimal path computation scheme and a graph-based boundary proposals grouping scheme. Specifically, the adaptive cut can disconnect the image domain such that the target contours are imposed to pass through this cut only once. The boundary proposals are comprised of precomputed image edge segments, providing the connectivity information for our segmentation model. These boundary proposals are then incorporated into the proposed image segmentation model, such that the target segmentation contours are made up of a set of selected boundary proposals and the corresponding geodesic paths linking them. Experimental results show that the proposed model indeed outperforms state-of-the-art minimal paths-based image segmentation approaches.

塑造 · 穩健性 ·

2023 年 9 月 8 日

Bayesian Phase Search for Probabilistic Amplitude Shaping

Mohammad Taha Askari,Lutz Lampe

from arxiv, 4 pages, 2 figures. Submitted to the 49th European Conference on Optical Communications

We introduce a Bayesian carrier phase recovery (CPR) algorithm which is robust against low signal-to-noise ratio scenarios. It is therefore effective for phase recovery for probabilistic amplitude shaping (PAS). Results validate that the new algorithm overcomes the degradation experienced by blind phase-search CPR for PAS.

優化器 · 語言模型化 · BART · Less · 最優化 ·

2023 年 9 月 7 日

Claim Optimization in Computational Argumentation

Gabriella Skitalinskaya,Maximilian Splieth?ver,Henning Wachsmuth

from arxiv, Accepted as a long paper at INLG 2023

An optimal delivery of arguments is key to persuasion in any debate, both for humans and for AI systems. This requires the use of clear and fluent claims relevant to the given debate. Prior work has studied the automatic assessment of argument quality extensively. Yet, no approach actually improves the quality so far. To fill this gap, this paper proposes the task of claim optimization: to rewrite argumentative claims in order to optimize their delivery. As multiple types of optimization are possible, we approach this task by first generating a diverse set of candidate claims using a large language model, such as BART, taking into account contextual information. Then, the best candidate is selected using various quality metrics. In automatic and human evaluation on an English-language corpus, our quality-based candidate selection outperforms several baselines, improving 60% of all claims (worsening 16% only). Follow-up analyses reveal that, beyond copy editing, our approach often specifies claims with details, whereas it adds less evidence than humans do. Moreover, its capabilities generalize well to other domains, such as instructional texts.

Med-PaLM 2 · Performer · 語言模型化 · MoDELS · 自動問答 ·

2023 年 5 月 16 日

Towards Expert-Level Medical Question Answering with Large Language Models

Karan Singhal,Tao Tu,Juraj Gottweis,Rory Sayres,Ellery Wulczyn,Le Hou,Kevin Clark,Stephen Pfohl,Heather Cole-Lewis,Darlene Neal,Mike Schaekermann,Amy Wang,Mohamed Amin,Sami Lachgar,Philip Mansfield,Sushant Prakash,Bradley Green,Ewa Dominowska,Blaise Aguera y Arcas,Nenad Tomasev,Yun Liu,Renee Wong,Christopher Semturs,S. Sara Mahdavi,Joelle Barral,Dale Webster,Greg S. Corrado,Yossi Matias,Shekoofeh Azizi,Alan Karthikesalingam,Vivek Natarajan

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.

contrastive · 對比學習 · 學成 · Extensibility · 稀疏 ·

2022 年 3 月 25 日

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Fangzhou Hong,Liang Pan,Zhongang Cai,Ziwei Liu

from arxiv, CVPR 2022; Project Page //hongfz16.github.io/projects/HCMoCo.html; Codes available at //github.com/hongfz16/HCMoCo

Human-centric perception plays a vital role in vision and graphics. But their data annotations are prohibitively expensive. Therefore, it is desirable to have a versatile pre-train model that serves as a foundation for data-efficient downstream tasks transfer. To this end, we propose the Human-Centric Multi-Modal Contrastive Learning framework HCMoCo that leverages the multi-modal nature of human data (e.g. RGB, depth, 2D keypoints) for effective representation learning. The objective comes with two main challenges: dense pre-train for multi-modality data, efficient usage of sparse human priors. To tackle the challenges, we design the novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featured with continuous and ordinal feature distribution and structure-aware semantic consistency. HCMoCo provides pre-train for different modalities by combining heterogeneous datasets, which allows efficient usage of existing task-specific human data. Extensive experiments on four downstream tasks of different modalities demonstrate the effectiveness of HCMoCo, especially under data-efficient settings (7.16% and 12% improvement on DensePose Estimation and Human Parsing). Moreover, we demonstrate the versatility of HCMoCo by exploring cross-modality supervision and missing-modality inference, validating its strong ability in cross-modal association and reasoning.

圖 · MoDELS · Continuity · 圖形處理器 · 隱藏層 ·

2020 年 6 月 7 日

Principal Neighbourhood Aggregation for Graph Nets

Gabriele Corso,Luca Cavalleri,Dominique Beaini,Pietro Liò,Petar Veli?kovi?

Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features - which occur regularly in real-world input domains and within the hidden layers of GNNs - and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). Finally, we compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of our model. With this work, we hope to steer some of the GNN research towards new aggregation methods which we believe are essential in the search for powerful and robust models.

模式崩潰 · 對抗自編碼 · 自編碼器 · 峰值 · Better ·

2018 年 3 月 23 日

Generative Adversarial Autoencoder Networks

Ngoc-Trung Tran,Tuan-Anh Bui,Ngai-Man Cheung

We introduce an effective model to overcome the problem of mode collapse when training Generative Adversarial Networks (GAN). Firstly, we propose a new generator objective that finds it better to tackle mode collapse. And, we apply an independent Autoencoders (AE) to constrain the generator and consider its reconstructed samples as "real" samples to slow down the convergence of discriminator that enables to reduce the gradient vanishing problem and stabilize the model. Secondly, from mappings between latent and data spaces provided by AE, we further regularize AE by the relative distance between the latent and data samples to explicitly prevent the generator falling into mode collapse setting. This idea comes when we find a new way to visualize the mode collapse on MNIST dataset. To the best of our knowledge, our method is the first to propose and apply successfully the relative distance of latent and data samples for stabilizing GAN. Thirdly, our proposed model, namely Generative Adversarial Autoencoder Networks (GAAN), is stable and has suffered from neither gradient vanishing nor mode collapse issues, as empirically demonstrated on synthetic, MNIST, MNIST-1K, CelebA and CIFAR-10 datasets. Experimental results show that our method can approximate well multi-modal distribution and achieve better results than state-of-the-art methods on these benchmark datasets. Our model implementation is published here: //github.com/tntrung/gaan