亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this work we propose an audio recording segmentation method based on an adaptive change point detection (A-CPD) for machine guided weak label annotation of audio recording segments. The goal is to maximize the amount of information gained about the temporal activation's of the target sounds. For each unlabeled audio recording, we use a prediction model to derive a probability curve used to guide annotation. The prediction model is initially pre-trained on available annotated sound event data with classes that are disjoint from the classes in the unlabeled dataset. The prediction model then gradually adapts to the annotations provided by the annotator in an active learning loop. The queries used to guide the weak label annotator towards strong labels are derived using change point detection on these probabilities. We show that it is possible to derive strong labels of high quality even with a limited annotation budget, and show favorable results for A-CPD when compared to two baseline query strategies.

相關內容

In this report, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements: (1) Strong Vision Encoder: we explored a continuous learning strategy for the large-scale vision foundation model -- InternViT-6B, boosting its visual understanding capabilities, and making it can be transferred and reused in different LLMs. (2) Dynamic High-Resolution: we divide images into tiles ranging from 1 to 40 of 448$\times$448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-Quality Bilingual Dataset: we carefully collected a high-quality bilingual dataset that covers common scenes, document images, and annotated them with English and Chinese question-answer pairs, significantly enhancing performance in OCR- and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks. Code has been released at //github.com/OpenGVLab/InternVL.

With the rapid increase in machine learning workloads performed on HPC systems, it is beneficial to regularly perform machine learning specific benchmarks to monitor performance and identify issues. Furthermore, as part of the Edinburgh International Data Facility, EPCC currently hosts a wide range of machine learning accelerators including Nvidia GPUs, the Graphcore Bow Pod64 and Cerebras CS-2, which are managed via Kubernetes and Slurm. We extended the Reframe framework to support the Kubernetes scheduler backend, and utilise Reframe to perform machine learning benchmarks, and we discuss the preliminary results collected and challenges involved in integrating Reframe across multiple platforms and architectures.

To build large language models for Ukrainian we need to expand our corpora with large amounts of new algorithmic tasks expressed in natural language. Examples of task performance expressed in English are abundant, so with a high-quality translation system our community will be enabled to curate datasets faster. To aid this goal, we introduce a recipe to build a translation system using supervised finetuning of a large pretrained language model with a noisy parallel dataset of 3M pairs of Ukrainian and English sentences followed by a second phase of training using 17K examples selected by k-fold perplexity filtering on another dataset of higher quality. Our decoder-only model named Dragoman beats performance of previous state of the art encoder-decoder models on the FLORES devtest set.

This report shares the experiences, results and lessons learned in conducting a pilot project ``Responsible use of AI'' in cooperation with the Province of Friesland, Rijks ICT Gilde-part of the Ministry of the Interior and Kingdom Relations (BZK) (both in The Netherlands) and a group of members of the Z-Inspection$^{\small{\circledR}}$ Initiative. The pilot project took place from May 2022 through January 2023. During the pilot, the practical application of a deep learning algorithm from the province of Fr\^yslan was assessed. The AI maps heathland grassland by means of satellite images for monitoring nature reserves. Environmental monitoring is one of the crucial activities carried on by society for several purposes ranging from maintaining standards on drinkable water to quantifying the CO2 emissions of a particular state or region. Using satellite imagery and machine learning to support decisions is becoming an important part of environmental monitoring. The main focus of this report is to share the experiences, results and lessons learned from performing both a Trustworthy AI assessment using the Z-Inspection$^{\small{\circledR}}$ process and the EU framework for Trustworthy AI, and combining it with a Fundamental Rights assessment using the Fundamental Rights and Algorithms Impact Assessment (FRAIA) as recommended by the Dutch government for the use of AI algorithms by the Dutch public authorities.

The adoption of processing-in-memory (PiM) architectures has been gaining momentum because they provide high performance and low energy consumption by alleviating the data movement bottleneck. Yet, the security of such architectures has not been thoroughly explored. The adoption of PiM solutions provides a new way to directly access main memory, which can be potentially exploited by malicious user applications. We show that this new way to access main memory opens opportunities for high-throughput timing attack vectors that are hard-to-mitigate without significant performance overhead. We introduce IMPACT, a set of high-throughput main memory-based timing attacks that leverage characteristics of PiM architectures to establish covert and side channels. IMPACT enables high-throughput communication and private information leakage. To achieve this, IMPACT (i) eliminates expensive cache bypassing steps required by processor-centric main memory and cache-based timing attacks and (ii) leverages the intrinsic parallelism of PiM operations. First, we showcase two covert-channel attack variants that run on the host CPU and leverage PiM architectures to gain direct and fast access to main memory and establish high-throughput communication covert channels. Second, we showcase a side-channel attack on a DNA sequence analysis application that leaks the private characteristics of a user's sample genome by leveraging PiM operations. Our results demonstrate that (i) our covert channels achieve up to 14.16 Mb/s communication throughput, which is 6.38x faster than the state-of-the-art main memory-based covert channels, and (ii) our side-channel attack allows the attacker to determine the properties of a sample genome at a throughput of 7.5 Mb/s with 96% accuracy. We discuss and evaluate several countermeasures for IMPACT to enable secure and robust PiM architectures.

In this study, we address the challenge of constructing continuous three-dimensional (3D) models that accurately represent uncertain surfaces, derived from noisy and incomplete LiDAR scanning data. Building upon our prior work, which utilized the Gaussian Process (GP) and Gaussian Mixture Model (GMM) for structured building models, we introduce a more generalized approach tailored for complex surfaces in urban scenes, where GMM Regression and GP with derivative observations are applied. A Hierarchical GMM (HGMM) is employed to optimize the number of GMM components and speed up the GMM training. With the prior map obtained from HGMM, GP inference is followed for the refinement of the final map. Our approach models the implicit surface of the geo-object and enables the inference of the regions that are not completely covered by measurements. The integration of GMM and GP yields well-calibrated uncertainty estimates alongside the surface model, enhancing both accuracy and reliability. The proposed method is evaluated on real data collected by a mobile mapping system. Compared to the performance in mapping accuracy and uncertainty quantification of other methods, such as Gaussian Process Implicit Surface map (GPIS) and log-Gaussian Process Implicit Surface map (Log-GPIS), the proposed method achieves lower RMSEs, higher log-likelihood values and lower computational costs for the evaluated datasets.

This paper presents a distributed model predictive control (DMPC) algorithm for a heterogeneous platoon using arbitrary communication topologies, as long as each vehicle is able to communicate with a preceding vehicle in the platoon. The proposed DMPC algorithm is able to accommodate any spacing policy that is affine in a vehicle's velocity, which includes constant distance or constant time headway spacing policies. By analyzing the total cost for the entire platoon, a sufficient condition is derived to guarantee platoon asymptotic stability. Simulation experiments with a platoon of 50 vehicles and hardware experiments with a platoon of four 1/10th scale vehicles validate the algorithm and compare performance under different spacing policies and communication topologies.

This work introduces a cooperative inspection system designed to efficiently control and coordinate a team of distributed heterogeneous UAV agents for the inspection of 3D structures in cluttered, unknown spaces. Our proposed approach employs a two-stage innovative methodology. Initially, it leverages the complementary sensing capabilities of the robots to cooperatively map the unknown environment. It then generates optimized, collision-free inspection paths, thereby ensuring comprehensive coverage of the structure's surface area. The effectiveness of our system is demonstrated through qualitative and quantitative results from extensive Gazebo-based simulations that closely replicate real-world inspection scenarios, highlighting its ability to thoroughly inspect real-world-like 3D structures.

Molecular design and synthesis planning are two critical steps in the process of molecular discovery that we propose to formulate as a single shared task of conditional synthetic pathway generation. We report an amortized approach to generate synthetic pathways as a Markov decision process conditioned on a target molecular embedding. This approach allows us to conduct synthesis planning in a bottom-up manner and design synthesizable molecules by decoding from optimized conditional codes, demonstrating the potential to solve both problems of design and synthesis simultaneously. The approach leverages neural networks to probabilistically model the synthetic trees, one reaction step at a time, according to reactivity rules encoded in a discrete action space of reaction templates. We train these networks on hundreds of thousands of artificial pathways generated from a pool of purchasable compounds and a list of expert-curated templates. We validate our method with (a) the recovery of molecules using conditional generation, (b) the identification of synthesizable structural analogs, and (c) the optimization of molecular structures given oracle functions relevant to drug discovery.

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.

北京阿比特科技有限公司