苹果电影在线观看免费高清_亚洲精品无码黄色网站在线观看_日韩精品免费视频一区二区三区_色偷偷激情日本亚洲一区二区_国产激情视频一区二区_一区二区三区高清无马在线_欲帝精品导航

We address the problem of generating realistic 3D human-object interactions (HOIs) driven by textual prompts. To this end, we take a modular design and decompose the complex task into simpler sub-tasks. We first develop a dual-branch diffusion model (HOI-DM) to generate both human and object motions conditioned on the input text, and encourage coherent motions by a cross-attention communication module between the human and object motion generation branches. We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object during the interactions driven by the textual prompt. The APDM is independent of the results by the HOI-DM and thus can correct potential errors by the latter. Moreover, it stochastically generates the contacting points to diversify the generated motions. Finally, we incorporate the estimated contacting points into the classifier-guidance to achieve accurate and close contact between humans and objects. To train and evaluate our approach, we annotate BEHAVE dataset with text descriptions. Experimental results on BEHAVE and OMOMO demonstrate that our approach produces realistic HOIs with various interactions and different types of objects.

相關內容

INTERACT

關注 5

IFIP TC13 Conference on Human-Computer Interaction是人機交互領域的研究者和實踐者展示其工作的重要平臺。多年來，這些會議吸引了來自幾個國家和文化的研究人員。官網鏈接： · 數據集 · 自頂向下 · 縮放 · Facebook ·

2024 年 4 月 26 日

Bored to Death: Artificial Intelligence Research Reveals the Role of Boredom in Suicide Behavior

Shir Lissak,Yaakov Ophir,Refael Tikochinski,Anat Brunstein Klomek,Itay Sisso,Eyal Fruchter,Roi Reichart

Background: Recent advancements in Artificial Intelligence (AI) contributed significantly to suicide assessment, however, our theoretical understanding of this complex behavior is still limited. Objective: This study aimed to harness AI methodologies to uncover hidden risk factors that trigger or aggravate suicide behaviors. Method: The primary dataset included 228,052 Facebook postings by 1,006 users who completed the gold-standard Columbia Suicide Severity Rating Scale. This dataset was analyzed using a bottom-up research pipeline without a-priory hypotheses and its findings were validated using a top-down analysis of a new dataset. This secondary dataset included responses by 1,062 participants to the same suicide scale as well as to well-validated scales measuring depression and boredom. Results: An almost fully automated, AI-guided research pipeline resulted in four Facebook topics that predicted the risk of suicide, of which the strongest predictor was boredom. A comprehensive literature review using APA PsycInfo revealed that boredom is rarely perceived as a unique risk factor of suicide. A complementing top-down path analysis of the secondary dataset uncovered an indirect relationship between boredom and suicide, which was mediated by depression. An equivalent mediated relationship was observed in the primary Facebook dataset as well. However, here, a direct relationship between boredom and suicide risk was also observed. Conclusions: Integrating AI methods allowed the discovery of an under-researched risk factor of suicide. The study signals boredom as a maladaptive 'ingredient' that might trigger suicide behaviors, regardless of depression. Further studies are recommended to direct clinicians' attention to this burdening, and sometimes existential experience.

置信度 · MoDELS · 泛函 · 獎勵函數 · Performer ·

2024 年 4 月 26 日

When to Trust LLMs: Aligning Confidence with Response Quality

Shuchang Tao,Liuyi Yao,Hanxing Ding,Yuexiang Xie,Qi Cao,Fei Sun,Jinyang Gao,Huawei Shen,Bolin Ding

Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods, which rely on verbalizing confidence to tell the reliability by inducing top-k responses and sampling-aggregating multiple responses, often fail, due to the lack of objective guidance of confidence. To address this, we propose CONfidence-Quality-ORDerpreserving alignment approach (CONQORD), leveraging reinforcement learning with a tailored dual-component reward function. This function encompasses quality reward and orderpreserving alignment reward functions. Specifically, the order-preserving reward incentivizes the model to verbalize greater confidence for responses of higher quality to align the order of confidence and quality. Experiments demonstrate that our CONQORD significantly improves the alignment performance between confidence levels and response accuracy, without causing the model to become over-cautious. Furthermore, the aligned confidence provided by CONQORD informs when to trust LLMs, and acts as a determinant for initiating the retrieval process of external knowledge. Aligning confidence with response quality ensures more transparent and reliable responses, providing better trustworthiness.

設計 · 變換 · 有向 · 可約的 · Integration ·

2024 年 4 月 26 日

Beyond Efficiency and Convenience. Using Post-growth Values as a Nucleus to Transform Design Education and Society

Matthias Laschke,Lenneke Kuijer

In this position paper we present Municipan, an artefact resulting from a post-growth design experiment, applied in a student design project. In contrast to mainstream human-centered design directed at efficiency and convenience, which we argue leads to deskilling, dependency, and the progression of the climate crisis, we challenged students to envision an opposite user that is willing to invest time and effort and learn new skills. While Municipan is not a direct step towards a postgrowth society, integrating the way it was created in design education can act as a nucleus, bringing forth design professionals inclined to create technologies with potential to gradually transform society towards postgrowth living. Bringing in examples from our own research, we illustrate that designs created in this mindset, such as heating systems that train cold resistance, or navigation systems that train orientation have potential to reskill users, reduce technological dependency and steer consumption within planetary limits.

INFORMS · Agent · 設計 · Extensibility · 可辨認的 ·

2024 年 4 月 26 日

Designing Shared Information Displays for Agents of Varying Strategic Sophistication

Dongping Zhang,Jason Hartline,Jessica Hullman

from arxiv, 34 pages, 11 figures, 7 tables. Accepted by ACM CSCW 2024

Data-driven predictions are often perceived as inaccurate in hindsight due to behavioral responses. In this study, we explore the role of interface design choices in shaping individuals' decision-making processes in response to predictions presented on a shared information display in a strategic setting. We introduce a novel staged experimental design to investigate the effects of design features, such as visualizations of prediction uncertainty and error, within a repeated congestion game. In this game, participants assume the role of taxi drivers and use a shared information display to decide where to search for their next ride. Our experimental design endows agents with varying level-$k$ depths of thinking, allowing some agents to possess greater sophistication in anticipating the decisions of others using the same information display. Through several extensive experiments, we identify trade-offs between displays that optimize individual decisions and those that best serve the collective social welfare of the system. We find that the influence of display characteristics varies based on an agent's strategic sophistication. We observe that design choices promoting individual-level decision-making can lead to suboptimal system outcomes, as manifested by a lower realization of potential social welfare. However, this decline in social welfare is offset by a reduction in the distribution shift, narrowing the gap between predicted and realized system outcomes, which potentially enhances the perceived reliability and trustworthiness of the information display post hoc. Our findings pave the way for new research questions concerning the design of effective prediction interfaces in strategic environments.

層 · 解碼 · 推斷 · 暫退法 · MoDELS ·

2024 年 4 月 25 日

Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding

Mostafa Elhoushi,Akshat Shrivastava,Diana Liskovich,Basil Hosmer,Bram Wasti,Liangzhen Lai,Anas Mahmoud,Bilge Acun,Saurabh Agarwal,Ahmed Roman,Ahmed A Aly,Beidi Chen,Carole-Jean Wu

from arxiv, Code open sourcing is in progress

We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exit at earlier layers, without adding any auxiliary layers or modules to the model. Third, we present a novel self-speculative decoding solution where we exit at early layers and verify and correct with remaining layers of the model. Our proposed self-speculative decoding approach has less memory footprint than other speculative decoding approaches and benefits from shared compute and activations of the draft and verification stages. We run experiments on different Llama model sizes on different types of training: pretraining from scratch, continual pretraining, finetuning on specific data domain, and finetuning on specific task. We implement our inference solution and show speedups of up to 2.16x on summarization for CNN/DM documents, 1.82x on coding, and 2.0x on TOPv2 semantic parsing task.

噪聲 · MoDELS · Learning · 講稿 · 分離的 ·

2024 年 4 月 25 日

One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns

Arman Maesumi,Dylan Hu,Krishi Saripalli,Vladimir G. Kim,Matthew Fisher,S?ren Pirk,Daniel Ritchie

from arxiv, In ACM Transactions on Graphics (Proceedings of SIGGRAPH) 2024, 21 pages

Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit "natural" random variation. Many different types of noise exist, each produced by a separate algorithm. In this paper, we present a single generative model which can learn to generate multiple types of noise as well as blend between them. In addition, it is capable of producing spatially-varying noise blends despite not having access to such data for training. These features are enabled by training a denoising diffusion model using a novel combination of data augmentation and network conditioning techniques. Like procedural noise generators, the model's behavior is controllable via interpretable parameters and a source of randomness. We use our model to produce a variety of visually compelling noise textures. We also present an application of our model to improving inverse procedural material design; using our model in place of fixed-type noise nodes in a procedural material graph results in higher-fidelity material reconstructions without needing to know the type of noise in advance.

收縮 · 有偏 · 平穩的 · 噪聲 · 平穩分布 ·

2024 年 4 月 24 日

Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA

Yixuan Zhang,Dongyan Huo,Yudong Chen,Qiaomin Xie

from arxiv, ACM SIGMETRICS 2024. 71 pages, 3 figures

Motivated by Q-learning, we study nonsmooth contractive stochastic approximation (SA) with constant stepsize. We focus on two important classes of dynamics: 1) nonsmooth contractive SA with additive noise, and 2) synchronous and asynchronous Q-learning, which features both additive and multiplicative noise. For both dynamics, we establish weak convergence of the iterates to a stationary limit distribution in Wasserstein distance. Furthermore, we propose a prelimit coupling technique for establishing steady-state convergence and characterize the limit of the stationary distribution as the stepsize goes to zero. Using this result, we derive that the asymptotic bias of nonsmooth SA is proportional to the square root of the stepsize, which stands in sharp contrast to smooth SA. This bias characterization allows for the use of Richardson-Romberg extrapolation for bias reduction in nonsmooth SA.

多峰值 · MoDELS · Performer · Integration · 語言模型化 ·

2024 年 2 月 19 日

The (R)Evolution of Multimodal Large Language Models: A Survey

Davide Caffagni,Federico Cocchi,Luca Barsellotti,Nicholas Moratelli,Sara Sarto,Lorenzo Baraldi,Lorenzo Baraldi,Marcella Cornia,Rita Cucchiara

Connecting text and visual modalities plays an essential role in generative intelligence. For this reason, inspired by the success of large language models, significant research efforts are being devoted to the development of Multimodal Large Language Models (MLLMs). These models can seamlessly integrate visual and textual modalities, both as input and output, while providing a dialogue-based interface and instruction-following capabilities. In this paper, we provide a comprehensive review of recent visual-based MLLMs, analyzing their architectural choices, multimodal alignment strategies, and training techniques. We also conduct a detailed analysis of these models across a wide range of tasks, including visual grounding, image generation and editing, visual understanding, and domain-specific applications. Additionally, we compile and describe training datasets and evaluation benchmarks, conducting comparisons among existing models in terms of performance and computational requirements. Overall, this survey offers a comprehensive overview of the current state of the art, laying the groundwork for future MLLMs.

Networking · 講稿 · 可理解性 · AI · 多代理人模型 ·

2023 年 10 月 11 日

Generative Agent-Based Social Networks for Disinformation: Research Opportunities and Open Challenges

Javier Pastor-Galindo,Pantaleone Nespoli,José A. Ruipérez-Valiente

This article presents the affordances that Generative Artificial Intelligence can have in disinformation context, one of the major threats to our digitalized society. We present a research framework to generate customized agent-based social networks for disinformation simulations that would enable understanding and evaluation of the phenomena whilst discussing open challenges.

Automator · AutoML · Machine Learning · 學成 · 可約的 ·

2019 年 1 月 17 日

Taking Human out of Learning Applications: A Survey on Automated Machine Learning

Quanming Yao,Mengshuo Wang,Yuqiang Chen,Wenyuan Dai,Hu Yi-Qi,Li Yu-Feng,Tu Wei-Wei,Yang Qiang,Yu Yang

from arxiv, This is a preliminary and will be kept updated

Machine learning techniques have deeply rooted in our everyday life. However, since it is knowledge- and labor-intensive to pursue good learning performance, human experts are heavily involved in every aspect of machine learning. In order to make machine learning techniques easier to apply and reduce the demand for experienced human experts, automated machine learning (AutoML) has emerged as a hot topic with both industrial and academic interest. In this paper, we provide an up to date survey on AutoML. First, we introduce and define the AutoML problem, with inspiration from both realms of automation and machine learning. Then, we propose a general AutoML framework that not only covers most existing approaches to date but also can guide the design for new methods. Subsequently, we categorize and review the existing works from two aspects, i.e., the problem setup and the employed techniques. Finally, we provide a detailed analysis of AutoML approaches and explain the reasons underneath their successful applications. We hope this survey can serve as not only an insightful guideline for AutoML beginners but also an inspiration for future research.