又大又硬又长又粗免费看-欧美日韩国产在线一区二区观看

Real-time computational speed and a high degree of precision are requirements for computer-assisted interventions. Applying a segmentation network to a medical video processing task can introduce significant inter-frame prediction noise. Existing approaches can reduce inconsistencies by including temporal information but often impose requirements on the architecture or dataset. This paper proposes a method to include temporal information in any segmentation model and, thus, a technique to improve video segmentation performance without alterations during training or additional labeling. With Motion-Corrected Moving Average, we refine the exponential moving average between the current and previous predictions. Using optical flow to estimate the movement between consecutive frames, we can shift the prior term in the moving-average calculation to align with the geometry of the current frame. The optical flow calculation does not require the output of the model and can therefore be performed in parallel, leading to no significant runtime penalty for our approach. We evaluate our approach on two publicly available segmentation datasets and two proprietary endoscopic datasets and show improvements over a baseline approach.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · MoDELS · 估計/估計量 · Transformer模型 · 泛函 ·

2024 年 4 月 16 日

Bayesian Penalized Transformation Models: Structured Additive Location-Scale Regression for Arbitrary Conditional Distributions

Johannes Brachem,Paul F. V. Wiemann,Thomas Kneib

Penalized transformation models (PTMs) are a novel form of location-scale regression. In PTMs, the shape of the response's conditional distribution is estimated directly from the data, and structured additive predictors are placed on its location and scale. The core of the model is a monotonically increasing transformation function that relates the response distribution to a reference distribution. The transformation function is equipped with a smoothness prior that regularizes how much the estimated distribution diverges from the reference distribution. These models can be seen as a bridge between conditional transformation models and generalized additive models for location, scale and shape. Markov chain Monte Carlo inference for PTMs can be conducted with the No-U-Turn sampler and offers straightforward uncertainty quantification for the conditional distribution as well as for the covariate effects. A simulation study demonstrates the effectiveness of the approach. We apply the model to data from the Fourth Dutch Growth Study and the Framingham Heart Study. A full-featured implementation is available as a Python library.

INTERACT · 量子計算 · 在線 · INFORMS · ForCES ·

2024 年 4 月 16 日

Quantum Computing for All: Online Courses Built Around Interactive Visual Quantum Circuit Simulator

Juha Reinikainen,Vlad Stirbu,Teiko Heinosaari,Vesa Lappalainen,Tommi Mikkonen

Quantum computing is a highly abstract scientific discipline, which, however, is expected to have great practical relevance in future information technology. This forces educators to seek new methods to teach quantum computing for students with diverse backgrounds and with no prior knowledge of quantum physics. We have developed an online course built around an interactive quantum circuit simulator designed to enable easy creation and maintenance of course material with ranging difficulty. The immediate feedback and automatically evaluated tasks lowers the entry barrier to quantum computing for all students, regardless of their background.

massive MIMO · MIMO · ILP · Boosting（一種模型訓練加速方式） · 路徑 ·

2024 年 4 月 16 日

Smart Pilot Assignment for IoT in Massive MIMO Systems: A Path Towards Scalable IoT Infrastructure

Muhammad Kamran Saeed,Ashfaq Khokhar

from arxiv, Accepted At ICC-2024

5G sets the foundation for an era of creativity with its faster speeds, increased data throughput, reduced latency, and enhanced IoT connectivity, all enabled by Massive MIMO (M-MIMO) technology. M-MIMO boosts network efficiency and enhances user experience by employing intelligent user scheduling. This paper presents a user scheduling scheme and pilot assignment strategy designed for IoT devices, emphasizing mitigating pilot contamination, a key obstacle to improving spectral efficiency (SE) and system scalability in M-MIMO networks. We utilize a user clustering-based pilot allocation scheme to boost IoT device scalability in M-MIMO systems. Additionally, our smart pilot allocation minimizes interference and enhances SE by treating pilot assignment as a graph coloring problem, optimizing it through integer linear programming (ILP). Recognizing the computational complexity of ILP, we introduced a binary search-based heuristic predicated on interference threshold to expedite the computation, while maintaining a near-optimal solution. The simulation results show a significant decrease in the required pilot overhead (about 17%), and substantial enhancement in SE (about 8-14%).

Networking · SR · Performer · Attention · Readability ·

2024 年 4 月 15 日

PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution

Zuoyan Zhao,Hui Xue,Pengfei Fang,Shipeng Zhu

Scene text image super-resolution (STISR) aims at simultaneously increasing the resolution and readability of low-resolution scene text images, thus boosting the performance of the downstream recognition task. Two factors in scene text images, visual structure and semantic information, affect the recognition performance significantly. To mitigate the effects from these factors, this paper proposes a Prior-Enhanced Attention Network (PEAN). Specifically, an attention-based modulation module is leveraged to understand scene text images by neatly perceiving the local and global dependence of images, despite the shape of the text. Meanwhile, a diffusion-based module is developed to enhance the text prior, hence offering better guidance for the SR network to generate SR images with higher semantic accuracy. Additionally, a multi-task learning paradigm is employed to optimize the network, enabling the model to generate legible SR images. As a result, PEAN establishes new SOTA results on the TextZoom benchmark. Experiments are also conducted to analyze the importance of the enhanced text prior as a means of improving the performance of the SR network. Code will be made available at //github.com/jdfxzzy/PEAN.

端到端 · Performer · 稀疏 · SOTA · 模態 ·

2024 年 4 月 10 日

SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving

Diankun Zhang,Guoan Wang,Runwen Zhu,Jianbo Zhao,Xiwu Chen,Siyu Zhang,Jiahao Gong,Qibin Zhou,Wenyuan Zhang,Ningzi Wang,Feiyang Tan,Hangning Zhou,Ziyao Xu,Haotian Yao,Chi Zhang,Xiaojun Liu,Xiaoguang Di,Bin Li

End-to-End paradigms use a unified framework to implement multi-tasks in an autonomous driving system. Despite simplicity and clarity, the performance of end-to-end autonomous driving methods on sub-tasks is still far behind the single-task methods. Meanwhile, the widely used dense BEV features in previous end-to-end methods make it costly to extend to more modalities or tasks. In this paper, we propose a Sparse query-centric paradigm for end-to-end Autonomous Driving (SparseAD), where the sparse queries completely represent the whole driving scenario across space, time and tasks without any dense BEV representation. Concretely, we design a unified sparse architecture for perception tasks including detection, tracking, and online mapping. Moreover, we revisit motion prediction and planning, and devise a more justifiable motion planner framework. On the challenging nuScenes dataset, SparseAD achieves SOTA full-task performance among end-to-end methods and significantly narrows the performance gap between end-to-end paradigms and single-task methods. Codes will be released soon.

MoDELS · 蒸餾 · 樣本 · Better · 推斷 ·

2024 年 4 月 3 日

Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models

Nikita Starodubcev,Artem Fedorov,Artem Babenko,Dmitry Baranchuk

from arxiv, CVPR2024 camera ready

Knowledge distillation methods have recently shown to be a promising direction to speedup the synthesis of large-scale diffusion models by requiring only a few inference steps. While several powerful distillation methods were recently proposed, the overall quality of student samples is typically lower compared to the teacher ones, which hinders their practical usage. In this work, we investigate the relative quality of samples produced by the teacher text-to-image diffusion model and its distilled student version. As our main empirical finding, we discover that a noticeable portion of student samples exhibit superior fidelity compared to the teacher ones, despite the "approximate" nature of the student. Based on this finding, we propose an adaptive collaboration between student and teacher diffusion models for effective text-to-image synthesis. Specifically, the distilled model produces the initial sample, and then an oracle decides whether it needs further improvements with a slow teacher model. Extensive experiments demonstrate that the designed pipeline surpasses state-of-the-art text-to-image alternatives for various inference budgets in terms of human preference. Furthermore, the proposed approach can be naturally used in popular applications such as text-guided image editing and controllable generation.

獎勵函數 · Integration · 泛函 · Learning · 強化學習 ·

2024 年 4 月 1 日

Steady-State Error Compensation for Reinforcement Learning with Quadratic Rewards

Liyao Wang,Zishun Zheng,Yuan Lin

The selection of a reward function in Reinforcement Learning (RL) has garnered significant attention because of its impact on system performance. Issues of significant steady-state errors often manifest when quadratic reward functions are employed. Although absolute-value-type reward functions alleviate this problem, they tend to induce substantial fluctuations in specific system states, leading to abrupt changes. In response to this challenge, this study proposes an approach that introduces an integral term. By integrating this integral term into quadratic-type reward functions, the RL algorithm is adeptly tuned, augmenting the system's consideration of reward history, and consequently alleviates concerns related to steady-state errors. Through experiments and performance evaluations on the Adaptive Cruise Control (ACC) and lane change models, we validate that the proposed method effectively diminishes steady-state errors and does not cause significant spikes in some system states.

分解的 · 秩 · Networking · Learning · INTERACT ·

2024 年 3 月 30 日

DSFNet: Learning Disentangled Scenario Factorization for Multi-Scenario Route Ranking

Jiahao Yu,Yihai Duan,Longfei Xu,Chao Chen,Shuliang Liu,Li Chen,Kaikui Liu,Fan Yang,Ning Guo

Multi-scenario route ranking (MSRR) is crucial in many industrial mapping systems. However, the industrial community mainly adopts interactive interfaces to encourage users to select pre-defined scenarios, which may hinder the downstream ranking performance. In addition, in the academic community, the multi-scenario ranking works only come from other fields, and there are no works specifically focusing on route data due to lacking a publicly available MSRR dataset. Moreover, all the existing multi-scenario works still fail to address the three specific challenges of MSRR simultaneously, i.e. explosion of scenario number, high entanglement, and high-capacity demand. Different from the prior, to address MSRR, our key idea is to factorize the complicated scenario in route ranking into several disentangled factor scenario patterns. Accordingly, we propose a novel method, Disentangled Scenario Factorization Network (DSFNet), which flexibly composes scenario-dependent parameters based on a high-capacity multi-factor-scenario-branch structure. Then, a novel regularization is proposed to induce the disentanglement of factor scenarios. Furthermore, two extra novel techniques, i.e. scenario-aware batch normalization and scenario-aware feature filtering, are developed to improve the network awareness of scenario representation. Additionally, to facilitate MSRR research in the academic community, we propose MSDR, the first large-scale publicly available annotated industrial Multi-Scenario Driving Route dataset. Comprehensive experimental results demonstrate the superiority of our DSFNet, which has been successfully deployed in AMap to serve the major online traffic.

MoDELS · 語言模型化 · 得分 · 相關系數 · CRAFT ·

2024 年 3 月 29 日

ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models

Thibaut Thonet,Jos Rozen,Laurent Besacier

Research on Large Language Models (LLMs) has recently witnessed an increasing interest in extending models' context size to better capture dependencies within long documents. While benchmarks have been proposed to assess long-range abilities, existing efforts primarily considered generic tasks that are not necessarily aligned with real-world applications. In contrast, our work proposes a new benchmark for long-context LLMs focused on a practical meeting assistant scenario. In this scenario, the long contexts consist of transcripts obtained by automatic speech recognition, presenting unique challenges for LLMs due to the inherent noisiness and oral nature of such data. Our benchmark, named ELITR-Bench, augments the existing ELITR corpus' transcripts with 271 manually crafted questions and their ground-truth answers. Our experiments with recent long-context LLMs on ELITR-Bench highlight a gap between open-source and proprietary models, especially when questions are asked sequentially within a conversation. We also provide a thorough analysis of our GPT-4-based evaluation method, encompassing insights from a crowdsourcing study. Our findings suggest that while GPT-4's evaluation scores are correlated with human judges', its ability to differentiate among more than three score levels may be limited.

圖 · 知識圖譜 · 語言模型化 · entity · BERT ·

2019 年 9 月 7 日

KG-BERT: BERT for Knowledge Graph Completion

Liang Yao,Chengsheng Mao,Yuan Luo

Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes entity and relation descriptions of a triple as input and computes scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction and relation prediction tasks.