
Magnetorheological materials · Identifiable · INFORMS · Comprehensibility · Cognition ·

January 20, 2024

Sound Unblending: Exploring Sound Manipulations for Accessible Mixed-Reality Awareness

Ruei-Che Chang,Chia-Sheng Hung,Bing-Yu Chen,Dhruv Jain,Anhong Guo

Mixed-reality (MR) soundscapes blend real-world sound with virtual audio from hearing devices, presenting intricate auditory information that is hard to discern and differentiate. This is particularly challenging for blind or visually impaired individuals, who rely on sounds and descriptions in their everyday lives. To understand how complex audio information is consumed, we analyzed online forum posts within the blind community, identifying prevailing challenges, needs, and desired solutions. We synthesized the results and proposed Sound Unblending for increasing MR sound awareness, which includes six sound manipulations: Ambience Builder, Feature Shifter, Earcon Generator, Prioritizer, Spatializer, and Stylizer. To evaluate the effectiveness of sound unblending, we conducted a user study with 18 blind participants across three simulated MR scenarios, where participants identified specific sounds within intricate soundscapes. We found that sound unblending increased MR sound awareness and minimized cognitive load. Finally, we developed three real-world example applications to demonstrate the practicality of sound unblending.

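Related content

Magnetorheological materials

Magnetorheological (MR) materials are a new class of smart materials whose rheological properties can be controlled by a magnetic field. Because they respond quickly (on the order of milliseconds), are highly reversible (they return to their initial state once the field is removed), and allow their mechanical properties to be varied continuously by adjusting the field strength, they have been widely used in recent years in automotive, construction, and vibration-control applications.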

Unbiased · Unlabeled · Annotation · Learning · False positive ·

March 5, 2024

Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

Yuxin Guo,Shijie Ma,Hu Su,Zhiqing Wang,Yuhao Zhao,Wei Zou,Siyang Sun,Yun Zheng

from arxiv, Accepted to NeurIPS2023

Audio-Visual Source Localization (AVSL) aims to locate sounding objects within video frames given the paired audio clips. Existing methods predominantly rely on self-supervised contrastive learning of audio-visual correspondence. Without any bounding-box annotations, they struggle to achieve precise localization, especially for small objects, and suffer from blurry boundaries and false positives. Moreover, naive semi-supervised methods make poor use of the abundant unlabeled data. In this paper, we propose a novel semi-supervised learning framework for AVSL, namely Dual Mean-Teacher (DMT), comprising two teacher-student structures to circumvent the confirmation bias issue. Specifically, two teachers, pre-trained on limited labeled data, are employed to filter out noisy samples via the consensus between their predictions, and then generate high-quality pseudo-labels by intersecting their confidence maps. The sufficient utilization of both labeled and unlabeled data and the proposed unbiased framework enable DMT to outperform current state-of-the-art methods by a large margin, with CIoU of 90.4% and 48.8% on Flickr-SoundNet and VGG-Sound Source, obtaining 8.9%, 9.6% and 4.6%, 6.4% improvements over self- and semi-supervised methods respectively, given only 3% positional annotations. We also extend our framework to some existing AVSL methods and consistently boost their performance.
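The filtering and pseudo-labeling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`binarize`, `iou`, `pseudo_label`), both thresholds, and the use of IoU as the measure of agreement between the two teachers are assumptions made for the example.

```python
from typing import List, Optional

ConfMap = List[List[float]]  # per-pixel confidence in [0, 1]
Mask = List[List[int]]       # per-pixel 0/1 label


def binarize(conf: ConfMap, thresh: float) -> Mask:
    """Turn a confidence map into a 0/1 mask."""
    return [[1 if v >= thresh else 0 for v in row] for row in conf]


def iou(a: Mask, b: Mask) -> float:
    """Intersection-over-union of two binary masks of equal shape."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def pseudo_label(conf_a: ConfMap, conf_b: ConfMap,
                 mask_thresh: float = 0.5,
                 agree_thresh: float = 0.5) -> Optional[Mask]:
    """Return a pseudo-label mask, or None if the teachers disagree.

    Both thresholds are hypothetical values chosen for illustration.
    """
    mask_a = binarize(conf_a, mask_thresh)
    mask_b = binarize(conf_b, mask_thresh)
    # Consensus filter: drop the sample when the two teachers disagree.
    if iou(mask_a, mask_b) < agree_thresh:
        return None
    # Pseudo-label: keep only pixels that both teachers mark as sounding.
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]
```

A sample for which `pseudo_label` returns `None` would simply be skipped in the unsupervised part of the loss; how the paper weights or schedules such samples is not stated in the abstract.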

Fidelity · Noise · Denoising · Latent · MoDELS ·

March 5, 2024

Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation

Weijie Li,Litong Gong,Yiran Zhu,Fanda Fan,Biao Wang,Tiezheng Ge,Bo Zheng

Image-to-video (I2V) generation often struggles to maintain high fidelity in open domains. Traditional image animation techniques primarily focus on specific domains such as faces or human poses, making them difficult to generalize to open domains. Several recent I2V frameworks based on diffusion models can generate dynamic content for open domain images but fail to maintain fidelity. We find that the two main causes of low fidelity are the loss of image details and noise prediction biases during the denoising process. To address this, we propose an effective method that can be applied to mainstream video diffusion models. The method achieves high fidelity by supplying more precise image information and rectifying the noise. Specifically, given an input image, our method first adds noise to the input image latent to keep more details, then denoises the noisy latent with proper rectification to alleviate the noise prediction biases. Our method is tuning-free and plug-and-play. The experimental results demonstrate the effectiveness of our approach in improving the fidelity of generated videos. For more image-to-video generated results, please refer to the project website: //noise-rectification.github.io.
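The two steps named in this abstract (noising the image latent, then denoising with a rectified noise estimate) can be illustrated with a small sketch. The noising step uses the standard diffusion forward process. The abstract does not give the rectification rule, so the linear blend in `rectify_noise`, its `weight` parameter, and all function names are assumptions made purely for illustration.

```python
import math
import random
from typing import List, Tuple


def add_noise(latent: List[float], alpha_bar: float,
              rng: random.Random) -> Tuple[List[float], List[float]]:
    """Standard forward diffusion: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps."""
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [math.sqrt(alpha_bar) * z + math.sqrt(1.0 - alpha_bar) * e
             for z, e in zip(latent, noise)]
    return noisy, noise


def rectify_noise(predicted: List[float], known: List[float],
                  weight: float) -> List[float]:
    """Hypothetical rectification: blend the model's noise prediction
    with the noise that was actually added to the image latent."""
    return [weight * k + (1.0 - weight) * p for p, k in zip(predicted, known)]


def predict_clean(noisy: List[float], noise_estimate: List[float],
                  alpha_bar: float) -> List[float]:
    """Invert the forward process using a given noise estimate."""
    return [(z - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for z, e in zip(noisy, noise_estimate)]
```

With `weight=1.0` the known noise is used as-is and `predict_clean` recovers the original latent; with `weight=0.0` the model's prediction is left untouched. A real sampler would apply this at each denoising step of a video diffusion model rather than in a single inversion.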

Multi-hop · Performer · Estimation/Estimator · MoDELS · Biased ·

March 5, 2024

Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment

Congzhi Zhang,Linhai Zhang,Deyu Zhou

from arxiv, Accepted by AAAI 2024

Conventional multi-hop fact verification models are prone to rely on spurious correlations from the annotation artifacts, leading to an obvious performance decline on unbiased datasets. Among the various debiasing works, the causal inference-based methods have become popular by performing theoretically guaranteed debiasing such as causal intervention or counterfactual reasoning. However, existing causal inference-based debiasing methods, which mainly formulate fact verification as a single-hop reasoning task to tackle shallow bias patterns, cannot deal with the complicated bias patterns hidden in multiple hops of evidence. To address the challenge, we propose Causal Walk, a novel method for debiasing multi-hop fact verification from a causal perspective with front-door adjustment. Specifically, in the structural causal model, the reasoning path between the treatment (the input claim-evidence graph) and the outcome (the veracity label) is introduced as the mediator to block the confounder. With the front-door adjustment, the causal effect between the treatment and the outcome is decomposed into the causal effect between the treatment and the mediator, which is estimated by applying the idea of random walk, and the causal effect between the mediator and the outcome, which is estimated with normalized weighted geometric mean approximation. To investigate the effectiveness of the proposed method, an adversarial multi-hop fact verification dataset and a symmetric multi-hop fact verification dataset are proposed with the help of a large language model. Experimental results show that Causal Walk outperforms some previous debiasing methods on both existing datasets and the newly constructed datasets. Code and data will be released at //github.com/zcccccz/CausalWalk.
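For reference, the standard front-door adjustment formula that this decomposition relies on is given below. The roles follow the abstract: X is the treatment (the input claim-evidence graph), M is the mediator (the reasoning path), and Y is the outcome (the veracity label). The symbol names are chosen for this note and may differ from the paper's notation.

```latex
P\bigl(Y \mid do(X = x)\bigr)
  = \sum_{m} P(M = m \mid X = x)
    \sum_{x'} P(Y \mid M = m, X = x')\, P(X = x')
```

The first factor is the effect of the treatment on the mediator, which the abstract says is estimated with a random walk. The inner sum is the effect of the mediator on the outcome, which the abstract says is approximated with a normalized weighted geometric mean.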

Language modeling · Task-oriented dialogue systems · Comprehensibility · MoDELS · Learning ·

March 4, 2024

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Zhifeng Kong,Arushi Goel,Rohan Badlani,Wei Ping,Rafael Valle,Bryan Catanzaro

Augmenting large language models (LLMs) to understand audio -- including non-speech sounds and non-verbal speech -- is critically important for diverse real-world applications of LLMs. In this paper, we propose Audio Flamingo, a novel audio language model with 1) strong audio understanding abilities, 2) the ability to quickly adapt to unseen tasks via in-context learning and retrieval, and 3) strong multi-turn dialogue abilities. We introduce a series of training techniques, architecture design, and data strategies to enhance our model with these abilities. Extensive evaluations across various audio understanding tasks confirm the efficacy of our method, setting new state-of-the-art benchmarks. Our demo website is: //audioflamingo.github.io/.

Color · INTERACT · Identifiable · Reducible · Continuity ·

March 4, 2024

Piet: Facilitating Color Authoring for Motion Graphics Video

Xinyu Shi,Yinghou Wang,Yun Wang,Jian Zhao

from arxiv, Accepted by CHI 2024

Motion graphic (MG) videos are effective and compelling for presenting complex concepts through animated visuals; and colors are important to convey desired emotions, maintain visual continuity, and signal narrative transitions. However, current video color authoring workflows are fragmented, lacking contextual previews, hindering rapid theme adjustments, and not aligning with progressive authoring flows of designers. To bridge this gap, we introduce Piet, the first tool tailored for MG video color authoring. Piet features an interactive palette to visually represent color distributions, support controllable focus levels, and enable quick theme probing via grouped color shifts. We interviewed 6 domain experts to identify the frustrations in current tools and inform the design of Piet. An in-lab user study with 13 expert designers showed that Piet effectively simplified the MG video color authoring and reduced the friction in creative color theme exploration.

SimPLe · MoDELS · Dataset · binary · state-of-the-art ·

March 2, 2024

ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies

Oren Sultan,Yonatan Bitton,Ron Yosef,Dafna Shahaf

Analogy-making is central to human cognition, allowing us to adapt to novel situations -- an ability that current AI systems still lack. Most analogy datasets today focus on simple analogies (e.g., word analogies); datasets including complex types of analogies are typically manually curated and very small. We believe that this holds back progress in computational analogy. In this work, we design a data generation pipeline, ParallelPARC (Parallel Paragraph Creator), leveraging state-of-the-art Large Language Models (LLMs) to create complex, paragraph-based analogies, as well as distractors, both simple and challenging. We demonstrate our pipeline and create ProPara-Logy, a dataset of analogies between scientific processes. We publish a gold-set, validated by humans, and a silver-set, generated automatically. We test LLMs' and humans' analogy recognition in binary and multiple-choice settings, and find that humans outperform the best models (~13% gap) after light supervision. We demonstrate that our silver-set is useful for training models. Lastly, we show challenging distractors confuse LLMs, but not humans. We hope our pipeline will encourage research in this emerging field.

Networking · CNN · MoDELS · Performer · Mathematics ·

March 5, 2023

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

Xuan Shen,Yaohua Wang,Ming Lin,Yilun Huang,Hao Tang,Xiuyu Sun,Yanzhi Wang

from arxiv, Accepted by CVPR 2023

The rapid advances in Vision Transformer (ViT) have refreshed the state-of-the-art performance in various vision tasks, overshadowing conventional CNN-based models. This has sparked recent counter-efforts in the CNN community showing that pure CNN models can achieve performance as good as ViT models when carefully tuned. While encouraging, designing such high-performance CNN models is challenging, requiring non-trivial prior knowledge of network design. To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way. In DeepMAD, a CNN network is modeled as an information processing system whose expressiveness and effectiveness can be analytically formulated in terms of its structural parameters. Then a constrained mathematical programming (MP) problem is proposed to optimize these structural parameters. The MP problem can be easily solved by off-the-shelf MP solvers on CPUs with a small memory footprint. In addition, DeepMAD is a pure mathematical framework: no GPU or training data is required during network design. The superiority of DeepMAD is validated on multiple large-scale computer vision benchmark datasets. Notably on ImageNet-1k, only using conventional convolutional layers, DeepMAD achieves 0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin on Tiny level, and 0.8% and 0.9% higher on Small level.

Pyramid · MoDELS · Extensibility · state-of-the-art · Performer ·

December 1, 2022

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Wan-Cyuan Fan,Yen-Chun Chen,Dongdong Chen,Yu Cheng,Lu Yuan,Yu-Chiang Frank Wang

from arxiv, AAAI 2023

Diffusion models (DMs) have shown great potential for high-quality image synthesis. However, when it comes to producing images with complex scenes, how to properly describe both image global structures and object details remains a challenging task. In this paper, we present Frido, a Feature Pyramid Diffusion model performing a multi-scale coarse-to-fine denoising process for image synthesis. Our model decomposes an input image into scale-dependent vector quantized features, followed by a coarse-to-fine gating for producing image output. During the above multi-scale representation learning stage, additional input conditions like text, scene graph, or image layout can be further exploited. Thus, Frido can be also applied for conditional or cross-modality image synthesis. We conduct extensive experiments over various unconditioned and conditional image generation tasks, ranging from text-to-image synthesis, layout-to-image, scene-graph-to-image, to label-to-image. More specifically, we achieved state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and OpenImages, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO. Code is available at //github.com/davidhalladay/Frido.

Learned · Networking · INFORMS · Performer · Neural Networks ·

February 27, 2020

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh,Sunwoo Cho,Nam Ik Cho

from arxiv, Will be presented in CVPR 2020

Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific data conditions on which they were trained. For instance, the low-resolution (LR) image should be a "bicubic" downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, it requires thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results (see Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process.
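The idea of adapting from a generic initial parameter with a single gradient update can be shown on a toy problem. The sketch below is not MZSR itself: a two-parameter linear model stands in for the super-resolution network, the `(xs, ys)` pairs stand in for training pairs built from the test image, and the function names and learning rate are assumptions made for illustration.

```python
from typing import List, Tuple


def mse_and_grad(w: float, b: float,
                 xs: List[float], ys: List[float]) -> Tuple[float, float, float]:
    """Mean squared error of y = w * x + b and its gradient w.r.t. (w, b)."""
    n = len(xs)
    residuals = [w * x + b - y for x, y in zip(xs, ys)]
    loss = sum(r * r for r in residuals) / n
    grad_w = sum(2.0 * r * x for r, x in zip(residuals, xs)) / n
    grad_b = sum(2.0 * r for r in residuals) / n
    return loss, grad_w, grad_b


def adapt_one_step(w: float, b: float, xs: List[float], ys: List[float],
                   lr: float = 0.1) -> Tuple[float, float]:
    """One gradient update from the initial parameters (w, b)
    on the task-specific data (xs, ys)."""
    _, grad_w, grad_b = mse_and_grad(w, b, xs, ys)
    return w - lr * grad_w, b - lr * grad_b
```

In this toy setting the starting point `(w, b)` plays the role of the meta-learned initialization: the closer it already sits to good solutions for many tasks, the more a single call to `adapt_one_step` achieves on any one of them.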

Example · Similarity · Speech recognition · End-to-end · Transcription ·

January 5, 2018

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Nicholas Carlini,David Wagner

We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (at a rate of up to 50 characters per second). We apply our iterative optimization-based attack to Mozilla's implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.


北京阿比特科技有限公司

Registered address: Room 301-191, 3rd Floor, Building 2, No. 18 Yangfangdian Road, Haidian District, Beijing
