国产成人精品三级在线-爆乳护士一区二区三区在线播放

The Bayesian Context Trees (BCT) framework is a recently introduced, general collection of statistical and algorithmic tools for modelling, analysis and inference with discrete-valued time series. The foundation of this development is built in part on some well-known information-theoretic ideas and techniques, including Rissanen's tree sources and Willems et al.'s context-tree weighting algorithm. This paper presents a collection of theoretical results that provide mathematical justifications and further insight into the BCT modelling framework and the associated practical tools. It is shown that the BCT prior predictive likelihood (the probability of a time series of observations averaged over all models and parameters) is both pointwise and minimax optimal, in agreement with the MDL principle and the BIC criterion. The posterior distribution is shown to be asymptotically consistent with probability one (over both models and parameters), and asymptotically Gaussian (over the parameters). And the posterior predictive distribution is also shown to be asymptotically consistent with probability one.

相關內容

Weight

關注 0

GPT-3 · MoDELS · 樣例 · 泛函 · 數據集 ·

2023 年 9 月 1 日

GPTCloneBench: A comprehensive benchmark of semantic clones and cross-language clones using GPT-3 model and SemanticCloneBench

Ajmain Inqiad Alam,Palash Ranjan Roy,Farouq Al-omari,Chanchal Kumar Roy,Banani Roy,Kevin Schneider

from arxiv, Accepted in 39th IEEE International Conference on Software Maintenance and Evolution(ICSME 2023)

With the emergence of Machine Learning, there has been a surge in leveraging its capabilities for problem-solving across various domains. In the code clone realm, the identification of type-4 or semantic clones has emerged as a crucial yet challenging task. Researchers aim to utilize Machine Learning to tackle this challenge, often relying on the BigCloneBench dataset. However, it's worth noting that BigCloneBench, originally not designed for semantic clone detection, presents several limitations that hinder its suitability as a comprehensive training dataset for this specific purpose. Furthermore, CLCDSA dataset suffers from a lack of reusable examples aligning with real-world software systems, rendering it inadequate for cross-language clone detection approaches. In this work, we present a comprehensive semantic clone and cross-language clone benchmark, GPTCloneBench by exploiting SemanticCloneBench and OpenAI's GPT-3 model. In particular, using code fragments from SemanticCloneBench as sample inputs along with appropriate prompt engineering for GPT-3 model, we generate semantic and cross-language clones for these specific fragments and then conduct a combination of extensive manual analysis, tool-assisted filtering, functionality testing and automated validation in building the benchmark. From 79,928 clone pairs of GPT-3 output, we created a benchmark with 37,149 true semantic clone pairs, 19,288 false semantic pairs(Type-1/Type-2), and 20,770 cross-language clones across four languages (Java, C, C#, and Python). Our benchmark is 15-fold larger than SemanticCloneBench, has more functional code examples for software systems and programming language support than CLCDSA, and overcomes BigCloneBench's qualities, quantification, and language variety limitations.

DC · 相互獨立的 · 線性的 · 原點 · 分解 ·

2023 年 8 月 31 日

Revisiting the specification decomposition for synthesis based on LTL solvers

Josu Oca,Montserrat Hermo,Alexander Bolotov

Recently, several algorithms have been proposed for decomposing reactive synthesis specifications into independent and simpler sub-specifications. Being inspired by one of the approaches, developed by Antonio Iannopollo (2018), who designed the so-called (DC) algorithm, we present here our solution that takes his ideas further and provides mathematical formalisation of the strategy behind DC. We rigorously define the main notions involved in the algorithm, explain the technique, and demonstrate its application on examples. The core technique of DC is based on the detection of independent variables in linear temporal logic formulae by exploiting the power and efficiency of a model checker. Although the DC algorithm is sound, it is not complete, as its author already pointed out. In this paper, we provide a counterexample that shows this fact and propose relevant changes to adapt the original DC strategy to ensure its correctness. The modification of DC and the detailed proof of its soundness and completeness are the main contributions of this work.

INTERACT · 優化器 · MoDELS · 控制器 · Networking ·

2023 年 8 月 31 日

On a Connection between Differential Games, Optimal Control, and Energy-based Models for Multi-Agent Interactions

Christopher Diehl,Tobias Klosek,Martin Krüger,Nils Murzyn,Torsten Bertram

from arxiv, International Conference on Machine Learning, Workshop on New Frontiers in Learning, Control, and Dynamical Systems (ICML 2023 Frontiers4LCD)

Game theory offers an interpretable mathematical framework for modeling multi-agent interactions. However, its applicability in real-world robotics applications is hindered by several challenges, such as unknown agents' preferences and goals. To address these challenges, we show a connection between differential games, optimal control, and energy-based models and demonstrate how existing approaches can be unified under our proposed Energy-based Potential Game formulation. Building upon this formulation, this work introduces a new end-to-end learning application that combines neural networks for game-parameter inference with a differentiable game-theoretic optimization layer, acting as an inductive bias. The experiments using simulated mobile robot pedestrian interactions and real-world automated driving data provide empirical evidence that the game-theoretic layer improves the predictive performance of various neural network backbones.

MoDELS · Extensibility · 可約的 · 分離的 · 可理解性 ·

2023 年 8 月 30 日

Formal specification terminology for demographic agent-based models of fixed-step single-clocked simulations

Atiyah Elsheikh

from arxiv, arXiv admin note: substantial text overlap with arXiv:2307.16548

This document presents adequate formal terminology for the mathematical specification of a subset of Agent Based Models (ABMs) in the field of Demography. The simulation of the targeted ABMs follows a fixed-step single-clocked pattern. The proposed terminology further improves the model understanding and can act as a stand-alone methodology for the specification and optionally the documentation of a significant set of (demographic) ABMs. Nevertheless, it is imaginable the this terminology probably with further extensions can be merged with the largely-informal widely-used model documentation and communication O.D.D. protocol [Grimm and et al., 2020, Amouroux et al., 2010] to reduce many sources of ambiguity, hindering model replications by other modelers. A published demographic model documentation, largely simplified version of the Lone Parent Model [Gostoli and Silverman, 2020] is separately published in [Elsheikh, 2023b] as illustration for the formal terminology. The model was implemented in the Julia language [Elsheikh, 2023a] based on the Agents.jl julia package [Datseris et al., 2022].

近似 · 周期的 · 維數災難 · Analysis · 稀疏 ·

2023 年 8 月 29 日

Compressive Fourier collocation methods for high-dimensional diffusion equations with periodic boundary conditions

Weiqi Wang,Simone Brugiapaglia

from arxiv, 33 pages, 9 figures

High-dimensional Partial Differential Equations (PDEs) are a popular mathematical modelling tool, with applications ranging from finance to computational chemistry. However, standard numerical techniques for solving these PDEs are typically affected by the curse of dimensionality. In this work, we tackle this challenge while focusing on stationary diffusion equations defined over a high-dimensional domain with periodic boundary conditions. Inspired by recent progress in sparse function approximation in high dimensions, we propose a new method called compressive Fourier collocation. Combining ideas from compressive sensing and spectral collocation, our method replaces the use of structured collocation grids with Monte Carlo sampling and employs sparse recovery techniques, such as orthogonal matching pursuit and $\ell^1$ minimization, to approximate the Fourier coefficients of the PDE solution. We conduct a rigorous theoretical analysis showing that the approximation error of the proposed method is comparable with the best $s$-term approximation (with respect to the Fourier basis) to the solution. Using the recently introduced framework of random sampling in bounded Riesz systems, our analysis shows that the compressive Fourier collocation method mitigates the curse of dimensionality with respect to the number of collocation points under sufficient conditions on the regularity of the diffusion coefficient. We also present numerical experiments that illustrate the accuracy and stability of the method for the approximation of sparse and compressible solutions.

Perplexity · 超參數 · tuning · 隨機近鄰嵌入 · Analysis ·

2023 年 8 月 29 日

Tuning the perplexity for and computing sampling-based t-SNE embeddings

Martin Skrodzki,Nicolas Chaves-de-Plaza,Klaus Hildebrandt,Thomas H?llt,Elmar Eisemann

Widely used pipelines for the analysis of high-dimensional data utilize two-dimensional visualizations. These are created, e.g., via t-distributed stochastic neighbor embedding (t-SNE). When it comes to large data sets, applying these visualization techniques creates suboptimal embeddings, as the hyperparameters are not suitable for large data. Cranking up these parameters usually does not work as the computations become too expensive for practical workflows. In this paper, we argue that a sampling-based embedding approach can circumvent these problems. We show that hyperparameters must be chosen carefully, depending on the sampling rate and the intended final embedding. Further, we show how this approach speeds up the computation and increases the quality of the embeddings.

Performer · 數據集 · HTTPS · Performance · AI ·

2023 年 8 月 29 日

Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication

Shahriar Elahi Dhruvo,Mohammad Akhlaqur Rahman,Manash Kumar Mandal,Md. Istiak Hossain Shihab,A. A. Noman Ansary,Kaneez Fatema Shithi,Sanjida Khanom,Rabeya Akter,Safaeid Hossain Arib,M. N. Ansary,Sazia Mehnaz,Rezwana Sultana,Sejuti Rahman,Sayma Sultana Chowdhury,Sabbir Ahmed Chowdhury,Farig Sadeque,Asif Sushmit

from arxiv, 6 pages, 7 figures

The absence of annotated sign language datasets has hindered the development of sign language recognition and translation technologies. In this paper, we introduce Bornil; a crowdsource-friendly, multilingual sign language data collection, annotation, and validation platform. Bornil allows users to record sign language gestures and lets annotators perform sentence and gloss-level annotation. It also allows validators to make sure of the quality of both the recorded videos and the annotations through manual validation to develop high-quality datasets for deep learning-based Automatic Sign Language Recognition. To demonstrate the system's efficacy; we collected the largest sign language dataset for Bangladeshi Sign Language dialect, perform deep learning based Sign Language Recognition modeling, and report the benchmark performance. The Bornil platform, BornilDB v1.0 Dataset, and the codebases are available on //bornil.bengali.ai

Prompt · 推斷 · MoDELS · state-of-the-art · Performer ·

2023 年 8 月 29 日

All-in-SAM: from Weak Annotation to Pixel-wise Nuclei Segmentation with Prompt-based Finetuning

Can Cui,Ruining Deng,Quan Liu,Tianyuan Yao,Shunxing Bao,Lucas W. Remedios,Yucheng Tang,Yuankai Huo

The Segment Anything Model (SAM) is a recently proposed prompt-based segmentation model in a generic zero-shot segmentation approach. With the zero-shot segmentation capacity, SAM achieved impressive flexibility and precision on various segmentation tasks. However, the current pipeline requires manual prompts during the inference stage, which is still resource intensive for biomedical image segmentation. In this paper, instead of using prompts during the inference stage, we introduce a pipeline that utilizes the SAM, called all-in-SAM, through the entire AI development workflow (from annotation generation to model finetuning) without requiring manual prompts during the inference stage. Specifically, SAM is first employed to generate pixel-level annotations from weak prompts (e.g., points, bounding box). Then, the pixel-level annotations are used to finetune the SAM segmentation model rather than training from scratch. Our experimental results reveal two key findings: 1) the proposed pipeline surpasses the state-of-the-art (SOTA) methods in a nuclei segmentation task on the public Monuseg dataset, and 2) the utilization of weak and few annotations for SAM finetuning achieves competitive performance compared to using strong pixel-wise annotated data.

分離的 · 圖 · 樣例 · 近似 · contrastive ·

2023 年 8 月 28 日

Borel versions of the Local Lemma and LOCAL algorithms for graphs of finite asymptotic separation index

Anton Bernshteyn,Felix Weilacher

Asymptotic separation index is a parameter that measures how easily a Borel graph can be approximated by its subgraphs with finite components. In contrast to the more classical notion of hyperfiniteness, asymptotic separation index is well-suited for combinatorial applications in the Borel setting. The main result of this paper is a Borel version of the Lov\'asz Local Lemma -- a powerful general-purpose tool in probabilistic combinatorics -- under a finite asymptotic separation index assumption. As a consequence, we show that locally checkable labeling problems that are solvable by efficient randomized distributed algorithms admit Borel solutions on bounded degree Borel graphs with finite asymptotic separation index. From this we derive a number of corollaries, for example a Borel version of Brooks's theorem for graphs with finite asymptotic separation index.

圖片分類 · 前饋網絡 · INTERACT · Networking · 前饋 ·

2021 年 5 月 7 日

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron,Piotr Bojanowski,Mathilde Caron,Matthieu Cord,Alaaeldin El-Nouby,Edouard Grave,Armand Joulin,Gabriel Synnaeve,Jakob Verbeek,Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.