欧美精品日韩精品国内精品_亚洲一区二区免费视频_天天爽天天干天天透天天要天天射_亚洲日韩欧在线观看_欧美成综合网网站_在线观看免费黄色视频_中文在线字幕日韩一区二区

While the term `art' defies any concrete definition, this paper aims to examine how digital images produced by generative AI systems, such as Midjourney, have come to be so regularly referred to as such. The discourse around the classification of AI-generated imagery as art is currently somewhat homogeneous, lacking the more nuanced aspects that would apply to more traditional modes of artistic media production. This paper aims to bring important philosophical considerations to the surface of the discussion around AI-generated imagery in the context of art. We employ existing philosophical frameworks and theories of language to suggest that some AI-generated imagery, by virtue of its visual properties within these frameworks, can be presented as `readymades' for consideration as art.

相關內容

Midjourney

關注 1

MoDELS · INFORMS · Performer · contrastive · 可行 ·

2023 年 9 月 1 日

Amyloid-Beta Axial Plane PET Synthesis from Structural MRI: An Image Translation Approach for Screening Alzheimer's Disease

Fernando Vega,Abdoljalil Addeh,M. Ethan MacDonald

from arxiv, Abstract submitted and presented to the International Society of Magnetic Resonance in Medicine (ISMRM 2023), Toronto, Canada

In this work, an image translation model is implemented to produce synthetic amyloid-beta PET images from structural MRI that are quantitatively accurate. Image pairs of amyloid-beta PET and structural MRI were used to train the model. We found that the synthetic PET images could be produced with a high degree of similarity to truth in terms of shape, contrast and overall high SSIM and PSNR. This work demonstrates that performing structural to quantitative image translation is feasible to enable the access amyloid-beta information from only MRI.

MoDELS · Guidance · 優化器 · Extensibility · 相互獨立的 ·

2023 年 9 月 1 日

MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance

Ernie Chu,Tzuhsuan Huang,Shuo-Yen Lin,Jun-Cheng Chen

from arxiv, Project page: //medm2023.github.io

This study introduces an efficient and effective method, MeDM, that utilizes pre-trained image Diffusion Models for video-to-video translation with consistent temporal flow. The proposed framework can render videos from scene position information, such as a normal G-buffer, or perform text-guided editing on videos captured in real-world scenarios. We employ explicit optical flows to construct a practical coding that enforces physical constraints on generated frames and mediates independent frame-wise scores. By leveraging this coding, maintaining temporal consistency in the generated videos can be framed as an optimization problem with a closed-form solution. To ensure compatibility with Stable Diffusion, we also suggest a workaround for modifying observed-space scores in latent-space Diffusion Models. Notably, MeDM does not require fine-tuning or test-time optimization of the Diffusion Models. Through extensive qualitative, quantitative, and subjective experiments on various benchmarks, the study demonstrates the effectiveness and superiority of the proposed approach. Project page can be found at //medm2023.github.io

Markov · 隱馬爾科夫模型 · MoDELS · 估計/估計量 · 線性的 ·

2023 年 9 月 1 日

Nonparametric Identification and Estimation of Earnings Dynamics using a Hidden Markov Model: Evidence from the PSID

Tong Zhou

This paper presents a hidden Markov model designed to investigate the complex nature of earnings persistence. The proposed model assumes that the residuals of log-earnings consist of a persistent component and a transitory component, both following general Markov processes. Nonparametric identification is achieved through spectral decomposition of linear operators, and a modified stochastic EM algorithm is introduced for model estimation. Applying the framework to the Panel Study of Income Dynamics (PSID) dataset, we find that the earnings process displays nonlinear persistence, conditional skewness, and conditional kurtosis. Additionally, the transitory component is found to possess non-Gaussian properties, resulting in a significantly asymmetric distributional impact when high-earning households face negative shocks or low-earning households encounter positive shocks. Our empirical findings also reveal the presence of ARCH effects in earnings at horizons ranging from 2 to 8 years, further highlighting the complex dynamics of earnings persistence.

Learning · 正則化項 · 表示 · 潛在 · INFORMS ·

2023 年 8 月 31 日

RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability

Chuning Zhu,Max Simchowitz,Siri Gadipudi,Abhishek Gupta

Visual model-based RL methods typically encode image observations into low-dimensional representations in a manner that does not eliminate redundant information. This leaves them susceptible to spurious variations -- changes in task-irrelevant components such as background distractors or lighting conditions. In this paper, we propose a visual model-based RL method that learns a latent representation resilient to such spurious variations. Our training objective encourages the representation to be maximally predictive of dynamics and reward, while constraining the information flow from the observation to the latent representation. We demonstrate that this objective significantly bolsters the resilience of visual model-based RL methods to visual distractors, allowing them to operate in dynamic environments. We then show that while the learned encoder is resilient to spirious variations, it is not invariant under significant distribution shift. To address this, we propose a simple reward-free alignment procedure that enables test time adaptation of the encoder. This allows for quick adaptation to widely differing environments without having to relearn the dynamics and policy. Our effort is a step towards making model-based RL a practical and useful tool for dynamic, diverse domains. We show its effectiveness in simulation benchmarks with significant spurious variations as well as a real-world egocentric navigation task with noisy TVs in the background. Videos and code at //zchuning.github.io/repo-website/.

Performer · 極大似然 · state-of-the-art · 成對型 · 蒙特卡羅 ·

2023 年 8 月 31 日

On the Performance of RIS-Aided Spatial Scattering Modulation for mmWave Transmission

Xusheng Zhu,Wen Chen,Zhendong Li,Qingqing Wu,Ziheng Zhang,Kunlun Wang,Jun Li

from arxiv, arXiv admin note: substantial text overlap with arXiv:2307.14662

In this paper, we investigate a state-of-the-art reconfigurable intelligent surface (RIS)-assisted spatial scattering modulation (SSM) scheme for millimeter-wave (mmWave) systems, where a more practical scenario that the RIS is near the transmitter while the receiver is far from RIS is considered. To this end, the line-of-sight (LoS) and non-LoS links are utilized in the transmitter-RIS and RIS-receiver channels, respectively. By employing the maximum likelihood detector at the receiver, the conditional pairwise error probability (CPEP) expression for the RIS-SSM scheme is derived under the two scenarios that the received beam demodulation is correct or not. Furthermore, the union upper bound of average bit error probability (ABEP) is obtained based on the CPEP expression. Finally, the derivation results are exhaustively validated by the Monte Carlo simulations.

估計/估計量 · 3D · MoDELS · 近似 · 機器人 ·

2023 年 8 月 31 日

6D Object Pose Estimation from Approximate 3D Models for Orbital Robotics

Maximilian Ulmer,Maximilian Durner,Martin Sundermeyer,Manuel Stoiber,Rudolph Triebel

from arxiv, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

We present a novel technique to estimate the 6D pose of objects from single images where the 3D geometry of the object is only given approximately and not as a precise 3D model. To achieve this, we employ a dense 2D-to-3D correspondence predictor that regresses 3D model coordinates for every pixel. In addition to the 3D coordinates, our model also estimates the pixel-wise coordinate error to discard correspondences that are likely wrong. This allows us to generate multiple 6D pose hypotheses of the object, which we then refine iteratively using a highly efficient region-based approach. We also introduce a novel pixel-wise posterior formulation by which we can estimate the probability for each hypothesis and select the most likely one. As we show in experiments, our approach is capable of dealing with extreme visual conditions including overexposure, high contrast, or low signal-to-noise ratio. This makes it a powerful technique for the particularly challenging task of estimating the pose of tumbling satellites for in-orbit robotic applications. Our method achieves state-of-the-art performance on the SPEED+ dataset and has won the SPEC2021 post-mortem competition.

MoDELS · 變換 · Vision · Extensibility · state-of-the-art ·

2023 年 8 月 31 日

IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer

Xiaochen Ma,Bo Du,Zhuohang Jiang,Ahmed Y. Al Hammadi,Jizhe Zhou

Advanced image tampering techniques are increasingly challenging the trustworthiness of multimedia, leading to the development of Image Manipulation Localization (IML). But what makes a good IML model? The answer lies in the way to capture artifacts. Exploiting artifacts requires the model to extract non-semantic discrepancies between manipulated and authentic regions, necessitating explicit comparisons between the two areas. With the self-attention mechanism, naturally, the Transformer should be a better candidate to capture artifacts. However, due to limited datasets, there is currently no pure ViT-based approach for IML to serve as a benchmark, and CNNs dominate the entire task. Nevertheless, CNNs suffer from weak long-range and non-semantic modeling. To bridge this gap, based on the fact that artifacts are sensitive to image resolution, amplified under multi-scale features, and massive at the manipulation border, we formulate the answer to the former question as building a ViT with high-resolution capacity, multi-scale feature extraction capability, and manipulation edge supervision that could converge with a small amount of data. We term this simple but effective ViT paradigm IML-ViT, which has significant potential to become a new benchmark for IML. Extensive experiments on five benchmark datasets verified our model outperforms the state-of-the-art manipulation localization methods.Code and models are available at \url{//github.com/SunnyHaze/IML-ViT}.

Vision · 圖 · 變換 · Networking · 圖形處理器 ·

2022 年 9 月 27 日

A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective

Chaoqi Chen,Yushuang Wu,Qiyuan Dai,Hong-Yu Zhou,Mutian Xu,Sibei Yang,Xiaoguang Han,Yizhou Yu

from arxiv, Preprint

Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (\emph{e.g.,} social network analysis and recommender systems), computer vision (\emph{e.g.,} object detection and point cloud learning), and natural language processing (\emph{e.g.,} relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, \emph{i.e.,} 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. Based on the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of the representative approaches, as well as discussions regarding insights, limitations, and future directions.

entity · MINE · 可約的 · 規范化的 · 實體對齊 ·

2021 年 3 月 29 日

Boosting the Speed of Entity Alignment 10*: Dual Attention Matching Network with Normalized Hard Sample Mining

Xin Mao,Wenting Wang,Yuanbin Wu,Man Lan

from arxiv, 12 pages; Accepted by TheWebConf(WWW) 2021

Seeking the equivalent entities among multi-source Knowledge Graphs (KGs) is the pivotal step to KGs integration, also known as \emph{entity alignment} (EA). However, most existing EA methods are inefficient and poor in scalability. A recent summary points out that some of them even require several days to deal with a dataset containing 200,000 nodes (DWY100K). We believe over-complex graph encoder and inefficient negative sampling strategy are the two main reasons. In this paper, we propose a novel KG encoder -- Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information smartly, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. The experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method could be finished in 1,100 seconds, at least 10* faster than previous work. The performances of our method also outperform previous works across all datasets, where Hits@1 and MRR have been improved from 6% to 13%.

Networking · 數據集 · 遷移學習 · 學成 · 可約的 ·

2018 年 5 月 10 日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Assia Benbihi,Matthieu Geist,Cédric Pradalier

Deep Convolutional Neural Networks have pushed the state-of-the art for semantic segmentation provided that a large amount of images together with pixel-wise annotations is available. Data collection is expensive and a solution to alleviate it is to use transfer learning. This reduces the amount of annotated data required for the network training but it does not get rid of this heavy processing step. We propose a method of transfer learning without annotations on the target task for datasets with redundant content and distinct pixel distributions. Our method takes advantage of the approximate content alignment of the images between two datasets when the approximation error prevents the reuse of annotation from one dataset to another. Given the annotations for only one dataset, we train a first network in a supervised manner. This network autonomously learns to generate deep data representations relevant to the semantic segmentation. Then the images in the new dataset, we train a new network to generate a deep data representation that matches the one from the first network on the previous dataset. The training consists in a regression between feature maps and does not require any annotations on the new dataset. We show that this method reaches performances similar to a classic transfer learning on the PASCAL VOC dataset with synthetic transformations.