姑娘日本电影免费观看全集中文,日韩一区二区三区免费在线观看,国产精品色视频一区二欧美,又粗又大又猛又黄又爽大片视频

Hierarchical leaf vein segmentation is a crucial but under-explored task in agricultural sciences, where analysis of the hierarchical structure of plant leaf venation can contribute to plant breeding. While current segmentation techniques rely on data-driven models, there is no publicly available dataset specifically designed for hierarchical leaf vein segmentation. To address this gap, we introduce the HierArchical Leaf Vein Segmentation (HALVS) dataset, the first public hierarchical leaf vein segmentation dataset. HALVS comprises 5,057 real-scanned high-resolution leaf images collected from three plant species: soybean, sweet cherry, and London planetree. It also includes human-annotated ground truth for three orders of leaf veins, with a total labeling effort of 83.8 person-days. Based on HALVS, we further develop a label-efficient learning paradigm that leverages partial label information, i.e. missing annotations for tertiary veins. Empirical studies are performed on HALVS, revealing new observations, challenges, and research directions on leaf vein segmentation.

相關內容

數據集(ji)

關注 88

數據集，又稱為資料集、數據集合或資料集合，是一種由數據所組成的集合。
Data set（或dataset）是一個數據的集合，通常以表格形式出現。每一列代表一個特定變量。每一行都對應于某一成員的數據集的問題。它列出的價值觀為每一個變量，如身高和體重的一個物體或價值的隨機數。每個數值被稱為數據資料。對應于行數，該數據集的數據可能包括一個或多個成員。

SGD · 平穩的 · 平穩分布 · 學習率 · Learning ·

2024 年 6 月 23 日

Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary Distribution

Naoki Yoshida,Shogo Nakakita,Masaaki Imaizumi

from arxiv, 28 pages

We consider a variant of the stochastic gradient descent (SGD) with a random learning rate and reveal its convergence properties. SGD is a widely used stochastic optimization algorithm in machine learning, especially deep learning. Numerous studies reveal the convergence properties of SGD and its simplified variants. Among these, the analysis of convergence using a stationary distribution of updated parameters provides generalizable results. However, to obtain a stationary distribution, the update direction of the parameters must not degenerate, which limits the applicable variants of SGD. In this study, we consider a novel SGD variant, Poisson SGD, which has degenerated parameter update directions and instead utilizes a random learning rate. Consequently, we demonstrate that a distribution of a parameter updated by Poisson SGD converges to a stationary distribution under weak assumptions on a loss function. Based on this, we further show that Poisson SGD finds global minima in non-convex optimization problems and also evaluate the generalization error using this method. As a proof technique, we approximate the distribution by Poisson SGD with that of the bouncy particle sampler (BPS) and derive its stationary distribution, using the theoretical advance of the piece-wise deterministic Markov process (PDMP).

Facebook AI Research · INFORMS · 可辨認的 · 樣例 · 有向 ·

2024 年 6 月 21 日

Balancing The Perception of Cheating Detection, Privacy and Fairness: A Mixed-Methods Study of Visual Data Obfuscation in Remote Proctoring

Suvadeep Mukherjee,Verena Distler,Gabriele Lenzini,Pedro Cardoso-Leite

Remote proctoring technology, a cheating-preventive measure, often raises privacy and fairness concerns that may affect test-takers' experiences and the validity of test results. Our study explores how selectively obfuscating information in video recordings can protect test-takers' privacy while ensuring effective and fair cheating detection. Interviews with experts (N=9) identified four key video regions indicative of potential cheating behaviors: the test-taker's face, body, background and the presence of individuals in the background. Experts recommended specific obfuscation methods for each region based on privacy significance and cheating behavior frequency, ranging from conventional blurring to advanced methods like replacement with deepfake, 3D avatars and silhouetting. We then conducted a vignette experiment with potential test-takers (N=259, non-experts) to evaluate their perceptions of cheating detection, visual privacy and fairness, using descriptions and examples of still images for each expert-recommended combination of video regions and obfuscation methods. Our results indicate that the effectiveness of obfuscation methods varies by region. Tailoring remote proctoring with region-specific advanced obfuscation methods can improve the perceptions of privacy and fairness compared to the conventional methods, though it may decrease perceived information sufficiency for detecting cheating. However, non-experts preferred conventional blurring for videos they were more willing to share, highlighting a gap between the perceived effectiveness of the advanced obfuscation methods and their practical acceptance. This study contributes to the field of user-centered privacy by suggesting promising directions to address current remote proctoring challenges and guiding future research.

Machine Learning · Learning · CASE · 數據集 · CAD ·

2024 年 6 月 20 日

Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD

Tereso del Río,Matthew England

Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from having them made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications which was found to be imbalanced with respect to the variable ordering decision. We introduce an augmentation technique for polynomial systems problems that allows us to balance and further augment the dataset, improving the machine learning results by 28\% and 38\% on average, respectively. We then demonstrate how the existing machine learning methodology used for the problem $-$ classification $-$ might be recast into the regression paradigm. While this does not have a radical change on the performance, it does widen the scope in which the methodology can be applied to make choices.

語言模型化 · MoDELS · 大語言模型 · INTERACT · INFORMS ·

2024 年 6 月 20 日

Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy

Efe Bozkir,Süleyman ?zdel,Ka Hei Carrie Lau,Mengdi Wang,Hong Gao,Enkelejda Kasneci

from arxiv, ACM Conversational User Interfaces 2024

Advances in artificial intelligence and human-computer interaction will likely lead to extended reality (XR) becoming pervasive. While XR can provide users with interactive, engaging, and immersive experiences, non-player characters are often utilized in pre-scripted and conventional ways. This paper argues for using large language models (LLMs) in XR by embedding them in avatars or as narratives to facilitate inclusion through prompt engineering and fine-tuning the LLMs. We argue that this inclusion will promote diversity for XR use. Furthermore, the versatile conversational capabilities of LLMs will likely increase engagement in XR, helping XR become ubiquitous. Lastly, we speculate that combining the information provided to LLM-powered spaces by users and the biometric data obtained might lead to novel privacy invasions. While exploring potential privacy breaches, examining user privacy concerns and preferences is also essential. Therefore, despite challenges, LLM-powered XR is a promising area with several opportunities.

泛函 · 塑造 · MoDELS · 廣義函數 · 估計/估計量 ·

2024 年 6 月 18 日

Intrinsic Modeling of Shape-Constrained Functional Data, With Applications to Growth Curves and Activity Profiles

Poorbita Kundu,Hans-Georg Müller

Shape-constrained functional data encompass a wide array of application fields especially in the life sciences, such as activity profiling, growth curves, healthcare and mortality. Most existing methods for general functional data analysis often ignore that such data are subject to inherent shape constraints, while some specialized techniques rely on strict distributional assumptions. We propose an approach for modeling such data that harnesses the intrinsic geometry of functional trajectories by decomposing them into size and shape components. We focus on the two most prevalent shape constraints, positivity and monotonicity, and develop individual-level estimators for the size and shape components. Furthermore, we demonstrate the applicability of our approach by conducting subsequent analyses involving Fr\'{e}chet mean and Fr\'{e}chet regression and establish rates of convergence for the empirical estimators. Illustrative examples include simulations and data applications for activity profiles for Mediterranean fruit flies during their entire lifespan and for data from the Z\"{u}rich longitudinal growth study.

極大 · Agent · Markov · Extensibility · Principle ·

2024 年 6 月 18 日

The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough

Riccardo Zamboni,Duilio Cirino,Marcello Restelli,Mirco Mutti

The problem of pure exploration in Markov decision processes has been cast as maximizing the entropy over the state distribution induced by the agent's policy, an objective that has been extensively studied. However, little attention has been dedicated to state entropy maximization under partial observability, despite the latter being ubiquitous in applications, e.g., finance and robotics, in which the agent only receives noisy observations of the true state governing the system's dynamics. How can we address state entropy maximization in those domains? In this paper, we study the simple approach of maximizing the entropy over observations in place of true latent states. First, we provide lower and upper bounds to the approximation of the true state entropy that only depends on some properties of the observation function. Then, we show how knowledge of the latter can be exploited to compute a principled regularization of the observation entropy to improve performance. With this work, we provide both a flexible approach to bring advances in state entropy maximization to the POMDP setting and a theoretical characterization of its intrinsic limits.

數據集 · 話題 · surge · 周期的 · Projection ·

2024 年 6 月 18 日

EUvsDisinfo: a Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles

Jo?o A. Leite,Olesya Razuvayevskaya,Kalina Bontcheva,Carolina Scarton

from arxiv, 4 pages, 3 figures, 2 tables

This work introduces EUvsDisinfo, a multilingual dataset of trustworthy and disinformation articles related to pro-Kremlin themes. It is sourced directly from the debunk articles written by experts leading the EUvsDisinfo project. Our dataset is the largest to-date resource in terms of the overall number of articles and distinct languages. It also provides the largest topical and temporal coverage. Using this dataset, we investigate the dissemination of pro-Kremlin disinformation across different languages, uncovering language-specific patterns targeting specific disinformation topics. We further analyse the evolution of topic distribution over an eight-year period, noting a significant surge in disinformation content before the full-scale invasion of Ukraine in 2022. Lastly, we demonstrate the dataset's applicability in training models to effectively distinguish between disinformation and trustworthy content in multilingual settings.

Cognition · Performer · Agent · 知識 (knowledge) · MoDELS ·

2023 年 7 月 14 日

Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

Zhenhailong Wang,Shaoguang Mao,Wenshan Wu,Tao Ge,Furu Wei,Heng Ji

from arxiv, work in progress

Human intelligence thrives on the concept of cognitive synergy, where collaboration and information integration among different cognitive processes yield superior outcomes compared to individual cognitive processes in isolation. Although Large Language Models (LLMs) have demonstrated promising performance as general task-solving agents, they still struggle with tasks that require intensive domain knowledge and complex reasoning. In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas. A cognitive synergist refers to an intelligent agent that collaborates with multiple minds, combining their individual strengths and knowledge, to enhance problem-solving and overall performance in complex tasks. By dynamically identifying and simulating different personas based on task inputs, SPP unleashes the potential of cognitive synergy in LLMs. We have discovered that assigning multiple, fine-grained personas in LLMs elicits better problem-solving abilities compared to using a single or fixed number of personas. We evaluate SPP on three challenging tasks: Trivia Creative Writing, Codenames Collaborative, and Logic Grid Puzzle, encompassing both knowledge-intensive and reasoning-intensive types. Unlike previous works, such as Chain-of-Thought, that solely enhance the reasoning abilities in LLMs, SPP effectively elicits internal knowledge acquisition abilities, reduces hallucination, and maintains strong reasoning capabilities. Code, data, and prompts can be found at: //github.com/MikeWangWZHL/Solo-Performance-Prompting.git.

知識 (knowledge) · Processing（編程語言） · 圖 · NLP · 知識圖譜 ·

2022 年 9 月 30 日

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider,Tim Schopf,Juraj Vladika,Mikhail Galkin,Elena Simperl,Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

學成 · 大數據 · 相同 · 人工智能 · 統計方法 ·

2020 年 5 月 5 日

A Survey of Learning Causality with Data: Problems and Methods

Ruocheng Guo,Lu Cheng,Jundong Li,P. Richard Hahn,Huan Liu

from arxiv, 35 pages, accepted by ACM CSUR

This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.