国产欧美日韩视频一区二区_一级黄色视频一区_久久国产精品污视频_中文字幕黄色一区二区在线视频_国产在线观看无码一区二区三_午夜无码无遮档在线视频_日本大陆三级片黄色网址

In this work we propose an audio recording segmentation method based on an adaptive change point detection (A-CPD) for machine guided weak label annotation of audio recording segments. The goal is to maximize the amount of information gained about the temporal activation's of the target sounds. For each unlabeled audio recording, we use a prediction model to derive a probability curve used to guide annotation. The prediction model is initially pre-trained on available annotated sound event data with classes that are disjoint from the classes in the unlabeled dataset. The prediction model then gradually adapts to the annotations provided by the annotator in an active learning loop. The queries used to guide the weak label annotator towards strong labels are derived using change point detection on these probabilities. We show that it is possible to derive strong labels of high quality even with a limited annotation budget, and show favorable results for A-CPD when compared to two baseline query strategies.

相關內容

標注

關注 2

多峰值 · MoDELS · GPT-4V · Vision · Performer ·

2024 年 4 月 25 日

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Zhe Chen,Weiyun Wang,Hao Tian,Shenglong Ye,Zhangwei Gao,Erfei Cui,Wenwen Tong,Kongzhi Hu,Jiapeng Luo,Zheng Ma,Ji Ma,Jiaqi Wang,Xiaoyi Dong,Hang Yan,Hewei Guo,Conghui He,Zhenjiang Jin,Chao Xu,Bin Wang,Xingjian Wei,Wei Li,Wenjian Zhang,Lewei Lu,Xizhou Zhu,Tong Lu,Dahua Lin,Yu Qiao

from arxiv, Technical report

In this report, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements: (1) Strong Vision Encoder: we explored a continuous learning strategy for the large-scale vision foundation model -- InternViT-6B, boosting its visual understanding capabilities, and making it can be transferred and reused in different LLMs. (2) Dynamic High-Resolution: we divide images into tiles ranging from 1 to 40 of 448$\times$448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-Quality Bilingual Dataset: we carefully collected a high-quality bilingual dataset that covers common scenes, document images, and annotated them with English and Chinese question-answer pairs, significantly enhancing performance in OCR- and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks. Code has been released at //github.com/OpenGVLab/InternVL.

Machine Learning · Performer · Learning · Kubernetes · 可辨認的 ·

2024 年 4 月 25 日

Benchmarking Machine Learning Applications on Heterogeneous Architecture using Reframe

Christopher Rae,Joseph K. L. Lee,James Richings,Michele Weiland

from arxiv, Author accepted version of paper in the PERMAVOST workshop at the 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC 24)

With the rapid increase in machine learning workloads performed on HPC systems, it is beneficial to regularly perform machine learning specific benchmarks to monitor performance and identify issues. Furthermore, as part of the Edinburgh International Data Facility, EPCC currently hosts a wide range of machine learning accelerators including Nvidia GPUs, the Graphcore Bow Pod64 and Cerebras CS-2, which are managed via Kubernetes and Slurm. We extended the Reframe framework to support the Kubernetes scheduler backend, and utilise Reframe to perform machine learning benchmarks, and we discuss the preliminary results collected and challenges involved in integrating Reframe across multiple platforms and architectures.

MoDELS · Performer · 情景 · Machine Translation · 語言模型化 ·

2024 年 4 月 23 日

Setting up the Data Printer with Improved English to Ukrainian Machine Translation

Yurii Paniv,Dmytro Chaplynskyi,Nikita Trynus,Volodymyr Kyrylov

To build large language models for Ukrainian we need to expand our corpora with large amounts of new algorithmic tasks expressed in natural language. Examples of task performance expressed in English are abundant, so with a high-quality translation system our community will be enabled to curate datasets faster. To aid this goal, we introduce a recipe to build a translation system using supervised finetuning of a large pretrained language model with a noisy parallel dataset of 3M pairs of Ukrainian and English sentences followed by a second phase of training using 17K examples selected by k-fold perplexity filtering on another dataset of higher quality. Our decoder-only model named Dragoman beats performance of previous state of the art encoder-decoder models on the FLORES devtest set.

Performer · Learning · AI · 回合 · Projection ·

2024 年 4 月 22 日

Lessons Learned in Performing a Trustworthy AI and Fundamental Rights Assessment

Marjolein Boonstra,Frédérick Bruneault,Subrata Chakraborty,Tjitske Faber,Alessio Gallucci,Eleanore Hickman,Gerard Kema,Heejin Kim,Jaap Kooiker,Elisabeth Hildt,Annegret Lamadé,Emilie Wiinblad Mathez,Florian M?slein,Genien Pathuis,Giovanni Sartor,Marijke Steege,Alice Stocco,Willy Tadema,Jarno Tuimala,Isabel van Vledder,Dennis Vetter,Jana Vetter,Magnus Westerlund,Roberto V. Zicari

from arxiv, On behalf of the Z-Inspection$^{\small{\circledR}}$ Initiative

This report shares the experiences, results and lessons learned in conducting a pilot project ``Responsible use of AI'' in cooperation with the Province of Friesland, Rijks ICT Gilde-part of the Ministry of the Interior and Kingdom Relations (BZK) (both in The Netherlands) and a group of members of the Z-Inspection$^{\small{\circledR}}$ Initiative. The pilot project took place from May 2022 through January 2023. During the pilot, the practical application of a deep learning algorithm from the province of Fr\^yslan was assessed. The AI maps heathland grassland by means of satellite images for monitoring nature reserves. Environmental monitoring is one of the crucial activities carried on by society for several purposes ranging from maintaining standards on drinkable water to quantifying the CO2 emissions of a particular state or region. Using satellite imagery and machine learning to support decisions is becoming an important part of environmental monitoring. The main focus of this report is to share the experiences, results and lessons learned from performing both a Trustworthy AI assessment using the Z-Inspection$^{\small{\circledR}}$ process and the EU framework for Trustworthy AI, and combining it with a Fundamental Rights assessment using the Fundamental Rights and Algorithms Impact Assessment (FRAIA) as recommended by the Dutch government for the use of AI algorithms by the Dutch public authorities.

Performer · 通道 · 樣本 · INFORMS · state-of-the-art ·

2024 年 4 月 22 日

Amplifying Main Memory-Based Timing Covert and Side Channels using Processing-in-Memory Operations

Konstantinos Kanellopoulos,F. Nisa Bostanci,Ataberk Olgun,A. Giray Yaglikci,Ismail Emir Yuksel,Nika Mansouri Ghiasi,Zulal Bingol,Mohammad Sadrosadati,Onur Mutlu

The adoption of processing-in-memory (PiM) architectures has been gaining momentum because they provide high performance and low energy consumption by alleviating the data movement bottleneck. Yet, the security of such architectures has not been thoroughly explored. The adoption of PiM solutions provides a new way to directly access main memory, which can be potentially exploited by malicious user applications. We show that this new way to access main memory opens opportunities for high-throughput timing attack vectors that are hard-to-mitigate without significant performance overhead. We introduce IMPACT, a set of high-throughput main memory-based timing attacks that leverage characteristics of PiM architectures to establish covert and side channels. IMPACT enables high-throughput communication and private information leakage. To achieve this, IMPACT (i) eliminates expensive cache bypassing steps required by processor-centric main memory and cache-based timing attacks and (ii) leverages the intrinsic parallelism of PiM operations. First, we showcase two covert-channel attack variants that run on the host CPU and leverage PiM architectures to gain direct and fast access to main memory and establish high-throughput communication covert channels. Second, we showcase a side-channel attack on a DNA sequence analysis application that leaks the private characteristics of a user's sample genome by leveraging PiM operations. Our results demonstrate that (i) our covert channels achieve up to 14.16 Mb/s communication throughput, which is 6.38x faster than the state-of-the-art main memory-based covert channels, and (ii) our side-channel attack allows the attacker to determine the properties of a sample genome at a throughput of 7.5 Mb/s with 96% accuracy. We discuss and evaluate several countermeasures for IMPACT to enable secure and robust PiM architectures.

Microsoft Surface · MoDELS · Processing（編程語言） · 模型評估 · 3D ·

2024 年 4 月 22 日

3D Uncertain Implicit Surface Mapping using GMM and GP

Qianqian Zou,Monika Sester

In this study, we address the challenge of constructing continuous three-dimensional (3D) models that accurately represent uncertain surfaces, derived from noisy and incomplete LiDAR scanning data. Building upon our prior work, which utilized the Gaussian Process (GP) and Gaussian Mixture Model (GMM) for structured building models, we introduce a more generalized approach tailored for complex surfaces in urban scenes, where GMM Regression and GP with derivative observations are applied. A Hierarchical GMM (HGMM) is employed to optimize the number of GMM components and speed up the GMM training. With the prior map obtained from HGMM, GP inference is followed for the refinement of the final map. Our approach models the implicit surface of the geo-object and enables the inference of the regions that are not completely covered by measurements. The integration of GMM and GP yields well-calibrated uncertainty estimates alongside the surface model, enhancing both accuracy and reliability. The proposed method is evaluated on real data collected by a mobile mapping system. Compared to the performance in mapping accuracy and uncertainty quantification of other methods, such as Gaussian Process Implicit Surface map (GPIS) and log-Gaussian Process Implicit Surface map (Log-GPIS), the proposed method achieves lower RMSEs, higher log-likelihood values and lower computational costs for the evaluated datasets.

控制器 · MoDELS · 總體代價 · Performer · 代價 ·

2024 年 4 月 18 日

Distributed Model Predictive Control for Heterogeneous Platoons with Affine Spacing Policies and Arbitrary Communication Topologies

Michael H. Shaham,Taskin Padir

This paper presents a distributed model predictive control (DMPC) algorithm for a heterogeneous platoon using arbitrary communication topologies, as long as each vehicle is able to communicate with a preceding vehicle in the platoon. The proposed DMPC algorithm is able to accommodate any spacing policy that is affine in a vehicle's velocity, which includes constant distance or constant time headway spacing policies. By analyzing the total cost for the entire platoon, a sufficient condition is derived to guarantee platoon asymptotic stability. Simulation experiments with a platoon of 50 vehicles and hardware experiments with a platoon of four 1/10th scale vehicles validate the algorithm and compare performance under different spacing policies and communication topologies.

回合 · Automator · 3D · 機器人 · Extensibility ·

2024 年 4 月 18 日

Automated Real-Time Inspection in Indoor and Outdoor 3D Environments with Cooperative Aerial Robots

Andreas Anastasiou,Angelos Zacharia,Savvas Papaioannou,Panayiotis Kolios,Christos G. Panayiotou,Marios M. Polycarpou

from arxiv, 2024 International Conference on Unmanned Aircraft Systems (ICUAS)

This work introduces a cooperative inspection system designed to efficiently control and coordinate a team of distributed heterogeneous UAV agents for the inspection of 3D structures in cluttered, unknown spaces. Our proposed approach employs a two-stage innovative methodology. Initially, it leverages the complementary sensing capabilities of the robots to cooperatively map the unknown environment. It then generates optimized, collision-free inspection paths, thereby ensuring comprehensive coverage of the structure's surface area. The effectiveness of our system is demonstrated through qualitative and quantitative results from extensive Gazebo-based simulations that closely replicate real-world inspection scenarios, highlighting its ability to thoroughly inspect real-world-like 3D structures.

自下而上 · 優化器 · Processing（編程語言） · Networking · 離散化 ·

2021 年 10 月 12 日

Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design

Wenhao Gao,Rocío Mercado,Connor W. Coley

Molecular design and synthesis planning are two critical steps in the process of molecular discovery that we propose to formulate as a single shared task of conditional synthetic pathway generation. We report an amortized approach to generate synthetic pathways as a Markov decision process conditioned on a target molecular embedding. This approach allows us to conduct synthesis planning in a bottom-up manner and design synthesizable molecules by decoding from optimized conditional codes, demonstrating the potential to solve both problems of design and synthesis simultaneously. The approach leverages neural networks to probabilistically model the synthetic trees, one reaction step at a time, according to reactivity rules encoded in a discrete action space of reaction templates. We train these networks on hundreds of thousands of artificial pathways generated from a pool of purchasable compounds and a list of expert-curated templates. We validate our method with (a) the recovery of molecules using conditional generation, (b) the identification of synthesizable structural analogs, and (c) the optimization of molecular structures given oracle functions relevant to drug discovery.

MoDELS · 注意力機制 · RNN · 標注 · Networking ·

2017 年 12 月 20 日

Order-Free RNN with Visual Attention for Multi-Label Classification

Shang-Fu Chen,Yi-Chen Chen,Chih-Kuan Yeh,Yu-Chiang Frank Wang

from arxiv, Accepted at 32nd AAAI Conference on Artificial Intelligence (AAAI-18)

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.