Gaussian Process Upper Confidence Bound (GP-UCB) is one of the most popular methods for optimizing black-box functions with noisy observations, due to its simple structure and superior performance. Its empirical successes lead to a natural, yet unresolved question: Is GP-UCB regret optimal? In this paper, we offer the first generally affirmative answer to this important open question in the Bayesian optimization literature. We establish new upper bounds on both the simple and cumulative regret of GP-UCB when the objective function to optimize admits a certain smoothness property. These upper bounds match the known minimax lower bounds (up to logarithmic factors independent of the feasible region's dimensionality) for optimizing functions with the same smoothness. Intriguingly, our findings indicate that, with the same level of exploration, GP-UCB can simultaneously achieve optimality in both simple and cumulative regret. The crux of our analysis is a refined uniform error bound for online estimation of functions in reproducing kernel Hilbert spaces. This error bound, which we derive from empirical process theory, is of independent interest, and its potential applications may reach beyond the scope of this study.
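For readers unfamiliar with the algorithm the abstract analyzes, the following is a minimal sketch of a GP-UCB loop on a one-dimensional grid. The squared-exponential kernel, its length scale, the fixed exploration weight `beta`, the noise level, and all function names are assumptions chosen for illustration; none of them come from the paper, whose contribution is the regret analysis rather than any particular implementation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_go = rbf_kernel(x_grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; subtract the explained part.
    explained = np.sum(k_go * np.linalg.solve(k_oo, k_go.T).T, axis=1)
    var = 1.0 - explained
    return mean, np.sqrt(np.clip(var, 0.0, None))

def gp_ucb(objective, x_grid, n_rounds=30, beta=4.0, noise_std=0.1, seed=0):
    """Run GP-UCB with a fixed (assumed) exploration weight beta."""
    rng = np.random.default_rng(seed)
    # Start from one randomly chosen grid point.
    x_obs = np.array([x_grid[rng.integers(len(x_grid))]])
    y_obs = np.array([objective(x_obs[0]) + noise_std * rng.normal()])
    for _ in range(n_rounds - 1):
        mean, std = gp_posterior(x_obs, y_obs, x_grid, noise_var=noise_std ** 2)
        # Query the point with the highest upper confidence bound.
        x_next = x_grid[np.argmax(mean + np.sqrt(beta) * std)]
        y_next = objective(x_next) + noise_std * rng.normal()
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, y_next)
    return x_obs, y_obs
```

Calling `gp_ucb(lambda x: -(x - 0.3) ** 2, np.linspace(0.0, 1.0, 101))` returns the queried points and their noisy observations. In theoretical treatments the exploration weight typically varies with the round index; holding it constant here is a simplification.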

Related content

Facebook AI Research · Upsampling · MoDELS · Scenario · Diversity

January 24, 2024

Benchmarking the Fairness of Image Upsampling Methods

Mike Laszkiewicz,Imant Daunhawer,Julia E. Vogt,Asja Fischer,Johannes Lederer

Recent years have witnessed a rapid development of deep generative models for creating synthetic media, such as images and videos. While the practical applications of these models in everyday tasks are enticing, it is crucial to assess the inherent risks regarding their fairness. In this work, we introduce a comprehensive framework for benchmarking the performance and fairness of conditional generative models. We develop a set of metrics, inspired by their supervised fairness counterparts, to evaluate the models on their fairness and diversity. Focusing on the specific application of image upsampling, we create a benchmark covering a wide variety of modern upsampling methods. As part of the benchmark, we introduce UnfairFace, a subset of FairFace that replicates the racial distribution of common large-scale face datasets. Our empirical study highlights the importance of using an unbiased training set and reveals variations in how the algorithms respond to dataset imbalances. Alarmingly, we find that none of the considered methods produces statistically fair and diverse results.

Specialization · Estimation/Estimator · Linear regression · Linear · Square matrix

January 24, 2024

The Fragility of Sparsity

Michal Kolesár,Ulrich K. Müller,Sebastian T. Roelsgaard

from arxiv, 44 pages, including appendices

We show, using three empirical applications, that linear regression estimates which rely on the assumption of sparsity are fragile in two ways. First, we document that different choices of the regressor matrix that do not impact ordinary least squares (OLS) estimates, such as the choice of baseline category with categorical controls, can move sparsity-based estimates two standard errors or more. Second, we develop two tests of the sparsity assumption based on comparing sparsity-based estimators with OLS. The tests tend to reject the sparsity assumption in all three applications. Unless the number of regressors is comparable to or exceeds the sample size, OLS yields more robust results at little efficiency cost.

Approximation · Ensemble · Curvature · Functional · Principle

January 24, 2024

Adversarial Detection by Approximation of Ensemble Boundary

T. Windeatt

from arxiv, 17 pages, 5 figures, 5 tables

A new method of detecting adversarial attacks is proposed for an ensemble of Deep Neural Networks (DNNs) solving two-class pattern recognition problems. The ensemble is combined using Walsh coefficients which are capable of approximating Boolean functions and thereby controlling the complexity of the ensemble decision boundary. The hypothesis in this paper is that decision boundaries with high curvature allow adversarial perturbations to be found, but change the curvature of the decision boundary, which is then approximated in a different way by Walsh coefficients compared to the clean images. By observing the difference in Walsh coefficient approximation between clean and adversarial images, it is shown experimentally that transferability of attack may be used for detection. Furthermore, approximating the decision boundary may aid in understanding the learning and transferability properties of DNNs. While the experiments here use images, the proposed approach of modelling two-class ensemble decision boundaries could in principle be applied to any application area. Code for approximating Boolean functions using Walsh coefficients: //doi.org/10.24433/CO.3695905.v1
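To make the phrase "Walsh coefficients which are capable of approximating Boolean functions" concrete, here is a small generic sketch. It computes the Walsh spectrum of a Boolean function with outputs coded as -1/+1 and evaluates a truncated expansion that keeps only low-order terms. This is a textbook construction written for illustration, with hypothetical function names; it is not the author's published code and does not reproduce the ensemble-combination method of the paper.

```python
import itertools
import numpy as np

def walsh_coefficients(truth_table, n_vars):
    """Walsh spectrum of f: {0,1}^n -> {-1,+1}, keyed by 0/1 subset masks.

    truth_table lists f(x) for x in lexicographic order of
    itertools.product([0, 1], repeat=n_vars).
    """
    inputs = np.array(list(itertools.product([0, 1], repeat=n_vars)))
    table = np.asarray(truth_table, dtype=float)
    coeffs = {}
    for mask in itertools.product([0, 1], repeat=n_vars):
        # Parity function: +1 or -1 depending on the selected input bits.
        parity = (-1) ** (inputs @ np.array(mask))
        coeffs[mask] = float(np.mean(table * parity))
    return coeffs

def truncated_approximation(coeffs, x, max_order):
    """Evaluate the expansion at x using only terms of order <= max_order."""
    total = 0.0
    for mask, c in coeffs.items():
        if sum(mask) <= max_order:
            total += c * (-1) ** int(np.dot(mask, x))
    return total
```

Keeping every term reproduces the function exactly, while lowering `max_order` yields a smoother approximation; the sign of the truncated sum can then serve as a simplified decision rule.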

CASES · Question answering · MoDELS · LangChain · Language modeling

January 23, 2024

Revolutionizing Retrieval-Augmented Generation with Enhanced PDF Structure Recognition

Demiao Lin

from arxiv, 18 pages, 16 figures

With the rapid development of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) has become a predominant method in the field of professional knowledge-based question answering. Presently, major foundation model companies have opened up Embedding and Chat APIs, and frameworks like LangChain have already integrated the RAG process. It appears that the key models and steps in RAG have been resolved, leading to the question: are professional knowledge QA systems now approaching perfection? This article finds that current primary methods depend on the premise of accessing high-quality text corpora. However, since professional documents are mainly stored in PDFs, the low accuracy of PDF parsing significantly impacts the effectiveness of professional knowledge-based QA. We conducted an empirical RAG experiment across hundreds of questions from the corresponding real-world professional documents. The results show that ChatDOC, a RAG system equipped with a panoptic and pinpoint PDF parser, retrieves more accurate and complete segments, and thus produces better answers. ChatDOC is superior to the baseline on nearly 47% of questions, ties on 38%, and falls short on only 15%. This suggests that enhanced PDF structure recognition may revolutionize RAG.

Facebook AI Research · Ghost (blogging platform) · Mutually independent · Proportional · Scenario

January 23, 2024

The Fairness of Redistricting Ghost

Jia-Wei Liang,Nina Amenta

We explore the fairness of a redistricting game introduced by Mixon and Villar, which provides a two-party protocol for dividing a state into electoral districts, without the participation of an independent authority. We analyze the game in an abstract setting that ignores the geographic distribution of voters and assumes that voter preferences are fixed and known. We show that the minority player can always win at least $p-1$ districts, where $p$ is proportional to the percentage of minority voters. We give an upper bound on the number of districts won by the minority based on a "cracking" strategy for the majority.

MathVista · GPT-4V · Performer · MoDELS · Mathematics

January 21, 2024

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu,Hritik Bansal,Tony Xia,Jiacheng Liu,Chunyuan Li,Hannaneh Hajishirzi,Hao Cheng,Kai-Wei Chang,Michel Galley,Jianfeng Gao

from arxiv, 116 pages, 120 figures. Accepted to ICLR 2024

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks. It consists of 6,141 examples, derived from 28 existing multimodal datasets involving mathematics and 3 newly created datasets (i.e., IQTest, FunctionQA, and PaperQA). Completing these tasks requires fine-grained, deep visual understanding and compositional reasoning, which all state-of-the-art foundation models find challenging. With MathVista, we have conducted a comprehensive, quantitative evaluation of 12 prominent foundation models. The best-performing GPT-4V model achieves an overall accuracy of 49.9%, substantially outperforming Bard, the second-best performer, by 15.1%. Our in-depth analysis reveals that the superiority of GPT-4V is mainly attributed to its enhanced visual perception and mathematical reasoning. However, GPT-4V still falls short of human performance by 10.4%, as it often struggles to understand complex figures and perform rigorous reasoning. This significant gap underscores the critical role that MathVista will play in the development of general-purpose AI agents capable of tackling mathematically intensive and visually rich real-world tasks. We further explore the new ability of self-verification, the application of self-consistency, and the interactive chatbot capabilities of GPT-4V, highlighting its promising potential for future research. The project is available at //mathvista.github.io/.

TIP · Support vector machine · Model evaluation · Modality · Identifiable

January 19, 2024

Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation

Jared Lawson,Madison Veliky,Colette P. Abah,Mary S. Dietrich,Rohan Chitale,Nabil Simaan

Objective: The objective of this work is to introduce and demonstrate the effectiveness of a novel sensing modality for contact detection between an off-the-shelf aspiration catheter and a thrombus. Methods: A custom robotic actuator with a pressure sensor was used to generate an oscillatory vacuum excitation and sense the pressure inside the extracorporeal portion of the catheter. Vacuum pressure profiles and robotic motion data were used to train a support vector machine (SVM) classification model to detect contact between the aspiration catheter tip and a mock thrombus. Validation consisted of benchtop accuracy verification, as well as user study comparison to the current standard of angiographic presentation. Results: Benchtop accuracy of the sensing modality was shown to be 99.67%. The user study demonstrated statistically significant improvement in identifying catheter-thrombus contact compared to the current standard. The odds ratio of successful detection of clot contact was 2.86 (p=0.03) when using the proposed sensory method compared to without it. Conclusion: The results of this work indicate that the proposed sensing modality can offer intraoperative feedback to interventionalists that can improve their ability to detect contact between the distal tip of a catheter and a thrombus. Significance: By offering a relatively low-cost technology that affords off-the-shelf aspiration catheters as clot-detecting sensors, interventionalists can improve the first-pass effect of the mechanical thrombectomy procedure while reducing procedural times and mental burden.

Language modeling · MoDELS · Generalization theory · Identifiable · Continuity

July 12, 2023

A Comprehensive Overview of Large Language Models

Humza Naveed,Asad Ullah Khan,Shi Qiu,Muhammad Saqib,Saeed Anwar,Muhammad Usman,Nick Barnes,Ajmal Mian

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose various new architectures, tweak existing architectures with refined training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper also discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLMs.

Learning · Scenario · Cluster · Better · Processing (programming language)

January 19, 2023

A Survey of Meta-Reinforcement Learning

Jacob Beck,Risto Vuorio,Evan Zheran Liu,Zheng Xiong,Luisa Zintgraf,Chelsea Finn,Shimon Whiteson

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.

Knowledge representation · Things · Recommender system · MoDELS · Edge

May 10, 2018

A Unified Knowledge Representation and Context-aware Recommender System in Internet of Things

Yinhao Li,Awa Alqahtani,Ellis Solaiman,Charith Perera,Prem Prakash Jayaraman,Boualem Benatallah,Rajiv Ranjan

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service (QoS) requirements need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. However, heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.