黄色视频在线观看男人插女人的视频在线观看,中文字幕AV一区二区三区亭亭色,国产一级毛片中文字幕AV,久久精品无码一区二区三区日韩

Deep Learning models like Convolutional Neural Networks (CNN) are powerful image classifiers, but what factors determine whether they attend to similar image areas as humans do? While previous studies have focused on technological factors, little is known about the role of factors that affect human attention. In the present study, we investigated how the tasks used to elicit human attention maps interact with image characteristics in modulating the similarity between humans and CNN. We varied the intentionality of human tasks, ranging from spontaneous gaze during categorization over intentional gaze-pointing up to manual area selection. Moreover, we varied the type of image to be categorized, using either singular, salient objects, indoor scenes consisting of object arrangements, or landscapes without distinct objects defining the category. The human attention maps generated in this way were compared to the CNN attention maps revealed by explainable artificial intelligence (Grad-CAM). The influence of human tasks strongly depended on image type: For objects, human manual selection produced maps that were most similar to CNN, while the specific eye movement task has little impact. For indoor scenes, spontaneous gaze produced the least similarity, while for landscapes, similarity was equally low across all human tasks. To better understand these results, we also compared the different human attention maps to each other. Our results highlight the importance of taking human factors into account when comparing the attention of humans and CNN.

相關內容

相似度

關注 2

特征選擇 · INFORMS · 情景 · 模型評估 · 統計量 ·

2023 年 11 月 30 日

A data-science pipeline to enable the Interpretability of Many-Objective Feature Selection

Uchechukwu F. Njoku,Alberto Abelló,Besim Bilalli,Gianluca Bontempi

from arxiv, 8 pages, 5 figures, 6 tables

Many-Objective Feature Selection (MOFS) approaches use four or more objectives to determine the relevance of a subset of features in a supervised learning task. As a consequence, MOFS typically returns a large set of non-dominated solutions, which have to be assessed by the data scientist in order to proceed with the final choice. Given the multi-variate nature of the assessment, which may include criteria (e.g. fairness) not related to predictive accuracy, this step is often not straightforward and suffers from the lack of existing tools. For instance, it is common to make use of a tabular presentation of the solutions, which provide little information about the trade-offs and the relations between criteria over the set of solutions. This paper proposes an original methodology to support data scientists in the interpretation and comparison of the MOFS outcome by combining post-processing and visualisation of the set of solutions. The methodology supports the data scientist in the selection of an optimal feature subset by providing her with high-level information at three different levels: objectives, solutions, and individual features. The methodology is experimentally assessed on two feature selection tasks adopting a GA-based MOFS with six objectives (number of selected features, balanced accuracy, F1-Score, variance inflation factor, statistical parity, and equalised odds). The results show the added value of the methodology in the selection of the final subset of features.

特化 · MoDELS · Learning · 稀疏化 · GROUP ·

2023 年 11 月 30 日

Less is More -- Towards parsimonious multi-task models using structured sparsity

Richa Upadhyay,Ronald Phlypo,Rajkumar Saini,Marcus Liwicki

from arxiv, accepted at First Conference on Parsimony and Learning (CPAL 2024)

Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also possess the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups i.e., channels (due to l1 regularization) and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely-used Multi-Task Learning (MTL) datasets: NYU-v2 and CelebAMask-HQ. On both datasets, which consist of three different computer vision tasks each, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model's performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.

線性的 · 線性模型 · 圖 · 歸納偏好 · MoDELS ·

2023 年 11 月 30 日

Relating graph auto-encoders to linear models

Solveig Klepper,Ulrike von Luxburg

from arxiv, accepted to TMLR

Graph auto-encoders are widely used to construct graph representations in Euclidean vector spaces. However, it has already been pointed out empirically that linear models on many tasks can outperform graph auto-encoders. In our work, we prove that the solution space induced by graph auto-encoders is a subset of the solution space of a linear map. This demonstrates that linear embedding models have at least the representational power of graph auto-encoders based on graph convolutional networks. So why are we still using nonlinear graph auto-encoders? One reason could be that actively restricting the linear solution space might introduce an inductive bias that helps improve learning and generalization. While many researchers believe that the nonlinearity of the encoder is the critical ingredient towards this end, we instead identify the node features of the graph as a more powerful inductive bias. We give theoretical insights by introducing a corresponding bias in a linear model and analyzing the change in the solution space. Our experiments are aligned with other empirical work on this question and show that the linear encoder can outperform the nonlinear encoder when using feature information.

機器人 · bulk · 設計 · motivation · prototype ·

2023 年 11 月 29 日

Method for robotic motion compensation during PET imaging of mobile subjects

Junxiang Wang,Iulian I. Iordachita,Peter Kazanzides

from arxiv, 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Studies of the human brain during natural activities, such as locomotion, would benefit from the ability to image deep brain structures during these activities. While Positron Emission Tomography (PET) can image these structures, the bulk and weight of current scanners are not compatible with the desire for a wearable device. This has motivated the design of a robotic system to support a PET imaging system around the subject's head and to move the system to accommodate natural motion. We report here the design and experimental evaluation of a prototype robotic system that senses motion of a subject's head, using parallel string encoders connected between the robot-supported imaging ring and a helmet worn by the subject. This measurement is used to robotically move the imaging ring (coarse motion correction) and to compensate for residual motion during image reconstruction (fine motion correction). Minimization of latency and measurement error are the key design goals, respectively, for coarse and fine motion correction. The system is evaluated using recorded human head motions during locomotion, with a mock imaging system consisting of lasers and cameras, and is shown to provide an overall system latency of about 80 ms, which is sufficient for coarse motion correction and collision avoidance, as well as a measurement accuracy of about 0.5 mm for fine motion correction.

Use Case · CASE · 可辨認的 · motivation · 相似度 ·

2023 年 11 月 29 日

LoCoMotif: Discovering time-warped motifs in time series

Daan Van Wesenbeeck,Aras Yurtman,Wannes Meert,Hendrik Blockeel

from arxiv, 26 pages, 15 figures. Submitted to the journal track of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD) 2024 in partnership with the Data Mining and Knowledge Discovery journal. Source code of the method is available at //github.com/ML-KULeuven/locomotif

Time Series Motif Discovery (TSMD) refers to the task of identifying patterns that occur multiple times (possibly with minor variations) in a time series. All existing methods for TSMD have one or more of the following limitations: they only look for the two most similar occurrences of a pattern; they only look for patterns of a pre-specified, fixed length; they cannot handle variability along the time axis; and they only handle univariate time series. In this paper, we present a new method, LoCoMotif, that has none of these limitations. The method is motivated by a concrete use case from physiotherapy. We demonstrate the value of the proposed method on this use case. We also introduce a new quantitative evaluation metric for motif discovery, and benchmark data for comparing TSMD methods. LoCoMotif substantially outperforms the existing methods, on top of being more broadly applicable.

MoDELS · 可理解性 · Prompt · 語言模型化 · SimPLe ·

2023 年 11 月 29 日

SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models

Shanshan Zhong,Zhongzhan Huang,Wushao Wen,Jinghui Qin,Liang Lin

from arxiv, accepted by ACM MM 2023

Diffusion models, which have emerged to become popular text-to-image generation models, can produce high-quality and content-rich images guided by textual prompts. However, there are limitations to semantic understanding and commonsense reasoning in existing models when the input prompts are concise narrative, resulting in low-quality image generation. To improve the capacities for narrative prompts, we propose a simple-yet-effective parameter-efficient fine-tuning approach called the Semantic Understanding and Reasoning adapter (SUR-adapter) for pre-trained diffusion models. To reach this goal, we first collect and annotate a new dataset SURD which consists of more than 57,000 semantically corrected multi-modal samples. Each sample contains a simple narrative prompt, a complex keyword-based prompt, and a high-quality image. Then, we align the semantic representation of narrative prompts to the complex prompts and transfer knowledge of large language models (LLMs) to our SUR-adapter via knowledge distillation so that it can acquire the powerful semantic understanding and reasoning capabilities to build a high-quality textual semantic representation for text-to-image generation. We conduct experiments by integrating multiple LLMs and popular pre-trained diffusion models to show the effectiveness of our approach in enabling diffusion models to understand and reason concise natural language without image quality degradation. Our approach can make text-to-image diffusion models easier to use with better user experience, which demonstrates our approach has the potential for further advancing the development of user-friendly text-to-image generation models by bridging the semantic gap between simple narrative prompts and complex keyword-based prompts. The code is released at //github.com/Qrange-group/SUR-adapter.

MoDELS · 共軛 · Networking · Performer · Neural Networks ·

2023 年 11 月 24 日

Deep convolutional encoder-decoder hierarchical neural networks for conjugate heat transfer surrogate modeling

Takiah Ebbs-Picken,David A. Romero,Carlos M. Da Silva,Cristina H. Amon

Conjugate heat transfer (CHT) models are vital for the design of many engineering systems. However, high-fidelity CHT models are computationally intensive, which limits their use in applications such as design optimization, where hundreds to thousands of model evaluations are required. In this work, we develop a modular deep convolutional encoder-decoder hierarchical (DeepEDH) neural network, a novel deep-learning-based surrogate modeling methodology for computationally intensive CHT models. Leveraging convective temperature dependencies, we propose a two-stage temperature prediction architecture that couples velocity and temperature models. The proposed DeepEDH methodology is demonstrated by modeling the pressure, velocity, and temperature fields for a liquid-cooled cold-plate-based battery thermal management system with variable channel geometry. A computational model of the cold plate is developed and solved using the finite element method (FEM), generating a dataset of 1,500 simulations. The FEM results are transformed and scaled from unstructured to structured, image-like meshes to create training and test datasets. The DeepEDH methodology's performance is examined in relation to data scaling, training dataset size, and network depth. Our performance analysis covers the impact of the novel architecture, separate field models, output geometry masks, multi-stage temperature models, and optimizations of the hyperparameters and architecture. Furthermore, we quantify the influence of the CHT thermal boundary condition on surrogate model performance, highlighting improved temperature model performance with higher heat fluxes. Compared to other deep learning neural network surrogate models, such as U-Net and DenseED, the proposed DeepEDH methodology for CHT models exhibits up to a 65% enhancement in the coefficient of determination ($R^{2}$).

圖 · 異常點 · Networking · Performer · Learning ·

2023 年 8 月 13 日

Learning on Graphs with Out-of-Distribution Nodes

Yu Song,Donglin Wang

from arxiv, Accepted by KDD'22

Graph Neural Networks (GNNs) are state-of-the-art models for performing prediction tasks on graphs. While existing GNNs have shown great performance on various tasks related to graphs, little attention has been paid to the scenario where out-of-distribution (OOD) nodes exist in the graph during training and inference. Borrowing the concept from CV and NLP, we define OOD nodes as nodes with labels unseen from the training set. Since a lot of networks are automatically constructed by programs, real-world graphs are often noisy and may contain nodes from unknown distributions. In this work, we define the problem of graph learning with out-of-distribution nodes. Specifically, we aim to accomplish two tasks: 1) detect nodes which do not belong to the known distribution and 2) classify the remaining nodes to be one of the known classes. We demonstrate that the connection patterns in graphs are informative for outlier detection, and propose Out-of-Distribution Graph Attention Network (OODGAT), a novel GNN model which explicitly models the interaction between different kinds of nodes and separate inliers from outliers during feature propagation. Extensive experiments show that OODGAT outperforms existing outlier detection methods by a large margin, while being better or comparable in terms of in-distribution classification.

圖片分類 · 前饋網絡 · INTERACT · Networking · 前饋 ·

2021 年 5 月 7 日

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron,Piotr Bojanowski,Mathilde Caron,Matthieu Cord,Alaaeldin El-Nouby,Edouard Grave,Armand Joulin,Gabriel Synnaeve,Jakob Verbeek,Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.

Performer · Color · Networking · CRAFT · 均方誤差 ·

2018 年 1 月 25 日

C2MSNet: A Novel approach for single image haze removal

Akshay Dudhane,Subrahmanyam Murala

from arxiv, Accepted in Winter Conference on Applications of Computer Vision (WACV-2018)

Degradation of image quality due to the presence of haze is a very common phenomenon. Existing DehazeNet [3], MSCNN [11] tackled the drawbacks of hand crafted haze relevant features. However, these methods have the problem of color distortion in gloomy (poor illumination) environment. In this paper, a cardinal (red, green and blue) color fusion network for single image haze removal is proposed. In first stage, network fusses color information present in hazy images and generates multi-channel depth maps. The second stage estimates the scene transmission map from generated dark channels using multi channel multi scale convolutional neural network (McMs-CNN) to recover the original scene. To train the proposed network, we have used two standard datasets namely: ImageNet [5] and D-HAZY [1]. Performance evaluation of the proposed approach has been carried out using structural similarity index (SSIM), mean square error (MSE) and peak signal to noise ratio (PSNR). Performance analysis shows that the proposed approach outperforms the existing state-of-the-art methods for single image dehazing.