
In this study, we address the problem of statistical inference for Markov jump processes from discrete-time observations; the central task is to accurately estimate the infinitesimal generator of the process, which is critical in many applications. To tackle this problem, we begin by reviewing established methods for generating sample paths of a Markov jump process conditioned on its endpoints, known as Markov bridges. We then introduce a novel algorithm grounded in the concept of time reversal, which is our main contribution. The proposed method is employed to estimate the infinitesimal generator of a Markov jump process using a combination of Markov chain Monte Carlo techniques and the Monte Carlo expectation-maximization (MCEM) algorithm. The results demonstrate that our approach provides accurate parameter estimates. To assess its efficacy, we conduct a comprehensive comparison with existing techniques (Bisection, Uniformization, Direct, Rejection, and Modified Rejection) in terms of both speed and accuracy. Notably, our method is the fastest among the alternatives while maintaining high precision.
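The abstract does not spell out the estimation formula, but the MCEM approach it describes builds on the complete-data maximum-likelihood estimator of a generator: given a fully observed path, the rate estimate is jump counts divided by holding times, Q_ij = N_ij / R_i. A minimal sketch of that complete-data step (simulating a path and recovering the generator; the simulation and variable names are illustrative, not the authors' implementation):

```python
import numpy as np

def simulate_mjp(Q, x0, T, rng):
    """Simulate a Markov jump process with generator Q on [0, T] (Gillespie-style)."""
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        rate = -Q[x, x]                      # total jump rate out of state x
        t += rng.exponential(1.0 / rate)     # exponential holding time
        if t >= T:
            break
        probs = Q[x].copy()
        probs[x] = 0.0
        probs /= rate                        # jump distribution over next states
        x = rng.choice(len(Q), p=probs)
        path.append((t, x))
    return path

def mle_generator(path, T, n_states):
    """Complete-data MLE: Q_ij = N_ij / R_i (jump counts over holding times)."""
    R = np.zeros(n_states)                   # total time spent in each state
    N = np.zeros((n_states, n_states))       # observed jump counts
    times = [t for t, _ in path] + [T]
    states = [x for _, x in path]
    for k, x in enumerate(states):
        R[x] += times[k + 1] - times[k]
        if k + 1 < len(states):
            N[x, states[k + 1]] += 1
    Q_hat = N / np.maximum(R, 1e-12)[:, None]
    np.fill_diagonal(Q_hat, -Q_hat.sum(axis=1))
    return Q_hat

rng = np.random.default_rng(0)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
path = simulate_mjp(Q, 0, 500.0, rng)
Q_hat = mle_generator(path, 500.0, 2)
```

In the discretely observed setting of the paper, the path between observations is latent, which is exactly why Markov bridge sampling is needed: the E-step imputes bridges, and the M-step applies this count-over-time update to the imputed sufficient statistics.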


We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. We test MIPS on a benchmark of 62 algorithmic tasks that can be learned by an RNN and find it highly complementary to GPT-4: MIPS solves 32 of them, including 13 that are not solved by GPT-4 (which also solves 30). MIPS uses an integer autoencoder to convert the RNN into a finite state machine, then applies Boolean or integer symbolic regression to capture the learned algorithm. As opposed to large language models, this program synthesis technique makes no use of (and is therefore not limited by) human training data such as algorithms and code from GitHub. We discuss opportunities and challenges for scaling up this approach to make machine-learned models more interpretable and trustworthy.
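The core move described above, turning a trained network's discretized dynamics into a finite state machine, can be illustrated on a toy scale. The sketch below assumes the hidden state has already been mapped to integers (the role MIPS assigns to its integer autoencoder) and uses a hand-written parity step as a stand-in for a trained RNN; none of this is the authors' actual pipeline:

```python
# Toy stand-in for a trained RNN step on the bit-parity task. In MIPS the
# integer state would come from an autoencoder applied to the RNN's hidden
# vector; here we assume that discretization has already happened.
def rnn_step(state, bit):
    return (state + bit) % 2

def extract_fsm(step, states, inputs):
    """Read off a finite state machine by enumerating (state, input) pairs."""
    return {(s, a): step(s, a) for s in states for a in inputs}

def run_fsm(fsm, start, seq):
    """Execute the extracted transition table on an input sequence."""
    s = start
    for a in seq:
        s = fsm[(s, a)]
    return s

fsm = extract_fsm(rnn_step, states=[0, 1], inputs=[0, 1])
```

Once the learned algorithm is captured as an explicit transition table like `fsm`, symbolic regression over its entries is what lets MIPS distill it into readable Python code.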

This study introduces a novel machine learning framework, integrating domain knowledge, to accurately predict the bearing capacity of CFSTs, bridging the gap between traditional engineering and machine learning techniques. Utilizing a comprehensive database of 2621 experimental data points on CFSTs, we developed a Domain Knowledge Enhanced Neural Network (DKNN) model. This model incorporates advanced feature engineering techniques, including Pearson correlation, XGBoost, and Random tree algorithms. The DKNN model demonstrated a marked improvement in prediction accuracy, with a Mean Absolute Percentage Error (MAPE) reduction of over 50% compared to existing models. Its robustness was confirmed through extensive performance assessments, maintaining high accuracy even in noisy environments. Furthermore, sensitivity and SHAP analyses were conducted to assess the contribution of each effective parameter to axial load capacity and to propose design recommendations for cross-section diameter, material strength range, and material combination. This research advances CFST predictive modelling, showcasing the potential of integrating machine learning with domain expertise in structural engineering. The DKNN model sets a new benchmark for accuracy and reliability in the field.
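Of the feature engineering techniques listed, Pearson correlation screening is the simplest to show concretely. The sketch below ranks candidate input features by absolute correlation with the target; the feature names and synthetic data are hypothetical, standing in for CFST design parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical synthetic stand-ins for CFST inputs (e.g. diameter, wall
# thickness, steel strength, concrete strength) and axial capacity target.
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

# Rank candidate features by |Pearson correlation| with the target.
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
ranking = np.argsort(-np.abs(corr))
```

In a pipeline like the DKNN's, such a ranking would be combined with model-based importances (e.g. from XGBoost) before the retained features are fed to the neural network.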

Researchers would often like to leverage data from a collection of sources (e.g., primary studies in a meta-analysis) to estimate causal effects in a target population of interest. However, traditional meta-analytic methods do not produce causally interpretable estimates for a well-defined target population. In this paper, we present the CausalMetaR R package, which implements efficient and robust methods to estimate causal effects in a given internal or external target population using multi-source data. The package includes estimators of average and subgroup treatment effects for the entire target population. To produce efficient and robust estimates of causal effects, the package implements doubly robust and non-parametric efficient estimators and supports using flexible data-adaptive (e.g., machine learning techniques) methods and cross-fitting techniques to estimate the nuisance models (e.g., the treatment model, the outcome model). We describe the key features of the package and demonstrate how to use the package through an example.

Constant (naive) imputation is still widely used in practice, as it is an easy-to-use first technique for dealing with missing data. Yet this simple method could be expected to induce a large bias for prediction purposes, as the imputed input may strongly differ from the true underlying data. However, recent works suggest that this bias is low in the context of high-dimensional linear predictors when data are missing completely at random (MCAR). This paper completes the picture for linear predictors by confirming the intuition that the bias is negligible and that, surprisingly, naive imputation remains relevant even in very low dimension. To this end, we consider a single underlying random features model, which offers a rigorous framework for studying predictive performance while the dimension of the observed features varies. Building on these theoretical results, we establish finite-sample bounds on stochastic gradient descent (SGD) predictors applied to zero-imputed data, a strategy particularly well suited for large-scale learning. Although the MCAR assumption may appear strong, we show that similar favorable behaviors occur under more complex missing-data scenarios.
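The zero-imputation-plus-SGD strategy studied here is simple enough to sketch end to end. The toy below generates MCAR-masked linear data, imputes missing entries by zero, and runs one constant-step SGD pass; dimensions and rates are illustrative, and this is not the paper's exact experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
beta = np.ones(d)
y = X @ beta + 0.1 * rng.normal(size=n)

# MCAR mask: each entry is missing independently with probability 0.2,
# then naively imputed by zero.
mask = rng.random((n, d)) < 0.2
X_imp = np.where(mask, 0.0, X)

# Plain SGD on the zero-imputed design (single pass, constant step size).
w = np.zeros(d)
lr = 0.01
for i in range(n):
    grad = (X_imp[i] @ w - y[i]) * X_imp[i]
    w -= lr * grad
```

With independent standardized features, zero-imputed least squares has a benign population minimizer here (masking shrinks the covariance and the cross-moment by the same factor), which is one intuition for why the bias can stay small under MCAR.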

In this work, we provide four methods for constructing new maximum sum-rank distance (MSRD) codes. The first method, a variant of cartesian products, allows faster decoding than known MSRD codes of the same parameters. The other three methods allow us to extend or modify existing MSRD codes in order to obtain new explicit MSRD codes for sets of matrix sizes (numbers of rows and columns in different blocks) that were not attainable by previous constructions. In this way, we show that MSRD codes exist (by giving explicit constructions) for new ranges of parameters, in particular with different numbers of rows and columns at different positions.

This paper presents the results of the first application of BERTopic, a state-of-the-art topic modeling technique, to short text written in a morphologically rich language. We applied BERTopic with three multilingual embedding models on two levels of text preprocessing (partial and full) to evaluate its performance on partially preprocessed short text in Serbian. We also compared it to LDA and NMF on fully preprocessed text. The experiments were conducted on a dataset of tweets expressing hesitancy toward COVID-19 vaccination. Our results show that with adequate parameter setting, BERTopic can yield informative topics even when applied to partially preprocessed short text. When the same parameters are applied in both preprocessing scenarios, the performance drop on partially preprocessed text is minimal. Compared to LDA and NMF, judging by the keywords, BERTopic offers more informative topics and gives novel insights when the number of topics is not limited. The findings of this paper can be significant for researchers working with other morphologically rich low-resource languages and short text.

Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest in the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and against substituting human participants with modern generative AI. Our scoping review indicates that the recent wave of these proposals is motivated by goals such as reducing the costs of research and development work and increasing the diversity of collected data. However, these proposals ignore and ultimately conflict with foundational values of work with human participants: representation, inclusion, and understanding. This paper critically examines the principles and goals underlying human participation to help chart out paths for future work that truly centers and empowers participants.

The variable-order fractional Laplacian plays an important role in the study of heterogeneous systems. In this paper, we propose the first numerical methods for the variable-order Laplacian $(-\Delta)^{\alpha({\bf x})/2}$ with $0 < \alpha({\bf x}) \le 2$, which will also be referred to as the variable-order fractional Laplacian if $\alpha({\bf x})$ is strictly less than 2. We present a class of hypergeometric functions whose variable-order Laplacian can be expressed analytically. Building on these analytical results, we design meshfree methods based on globally supported radial basis functions (RBFs), including Gaussian, generalized inverse multiquadric, and Bessel-type RBFs, to approximate the variable-order Laplacian $(-\Delta)^{\alpha({\bf x})/2}$. Our meshfree methods integrate the advantages of both the pseudo-differential and the hypersingular integral forms of the variable-order fractional Laplacian, and thus avoid numerically approximating the hypersingular integral. Moreover, our methods are simple, flexible with respect to domain geometry, and their computer implementation remains the same for any dimension $d \ge 1$. Compared to finite difference methods, our methods can achieve a desired accuracy with far fewer points. This makes our method particularly attractive for problems involving the variable-order fractional Laplacian in which the number of points required is a critical cost. We then apply our method to study solution behaviors of variable-order fractional PDEs arising in different fields, including the transition of waves between classical and fractional media, and the coexistence of anomalous and normal diffusion in both the diffusion equation and the Allen-Cahn equation. These results provide insights for further understanding and applications of variable-order fractional derivatives.
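The analytic action of the variable-order Laplacian on the authors' hypergeometric basis is beyond a short snippet, but the underlying meshfree machinery, expanding in globally supported RBFs and solving a dense collocation system, can be sketched with ordinary interpolation. The example below fits a Gaussian-RBF interpolant in 1D (shape parameter and node counts are illustrative, not the paper's choices):

```python
import numpy as np

def gaussian_rbf(r, eps=5.0):
    """Globally supported Gaussian RBF phi(r) = exp(-(eps r)^2)."""
    return np.exp(-(eps * r) ** 2)

# Scattered centers in 1D; the same code works in any dimension d >= 1 by
# taking x of shape (n, d) and pairwise Euclidean distances.
x = np.linspace(-1.0, 1.0, 25)
f = np.sin(np.pi * x)                    # function to approximate

# Dense interpolation matrix A_ij = phi(|x_i - x_j|); solve A c = f for the
# expansion weights c.
A = gaussian_rbf(np.abs(x[:, None] - x[None, :]))
c = np.linalg.solve(A, f)

# Evaluate the RBF expansion on a finer grid.
xe = np.linspace(-1.0, 1.0, 101)
s = gaussian_rbf(np.abs(xe[:, None] - x[None, :])) @ c
err = np.max(np.abs(s - np.sin(np.pi * xe)))
```

In the paper's method, the same expansion weights would be combined with the analytically known variable-order Laplacian of each basis function, which is what sidesteps the hypersingular integral entirely.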

We study variation in policing outcomes attributable to differential policing practices in New York City (NYC) using geographic regression discontinuity designs (GeoRDDs). By focusing on small geographic windows near police precinct boundaries, we can estimate local average treatment effects of police precincts on arrest rates. We propose estimands and develop estimators for the GeoRDD when the data come from a spatial point process. Additionally, standard GeoRDDs rely on continuity assumptions on the potential outcome surface or a local randomization assumption within a window around the boundary. These assumptions, however, can easily be violated in realistic applications. We develop a novel and robust approach to testing whether there are differences in policing outcomes that are caused by differences in police precincts across NYC. Importantly, this approach is applicable to standard regression discontinuity designs with both numeric and point process data. It is robust to violations of the traditional assumptions and is valid under weaker ones. We use a unique form of resampling to provide a valid estimate of our test statistic's null distribution even under violations of standard assumptions. This procedure gives substantially different results in the analysis of NYC arrest rates than those that rely on standard assumptions.
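The paper's resampling procedure is specialized, but the general idea of testing a boundary discontinuity by resampling a test statistic's null distribution can be illustrated with a plain permutation test. The data, boundary, and test statistic below are simplified stand-ins, not the authors' estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: stops near a straight precinct boundary at x = 0,
# with a binary arrest indicator whose rate differs across the boundary.
n = 800
x = rng.uniform(-1.0, 1.0, n)
p = np.where(x < 0, 0.30, 0.15)          # simulated precinct effect
arrest = rng.binomial(1, p)
side = (x >= 0).astype(int)              # which precinct each stop falls in

def gap(side, arrest):
    """Absolute difference in arrest rates between the two sides."""
    return abs(arrest[side == 0].mean() - arrest[side == 1].mean())

obs = gap(side, arrest)

# Resampled null: shuffle the side labels, which mimics "no precinct
# effect" near the boundary (a simplification of the paper's procedure).
null = np.array([gap(rng.permutation(side), arrest) for _ in range(2000)])
pval = (1 + np.sum(null >= obs)) / (1 + len(null))
```

A key caveat, and part of the paper's motivation, is that naive label shuffling ignores spatial structure in the point process; their resampling scheme is designed to remain valid when such structure violates the standard assumptions.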

Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models used to collect numeric data are large, general-purpose models, either commercial -- Google Speech-to-Text (STT), Amazon Transcribe -- or open source (OpenAI's Whisper). Such ASR models are trained on hundreds of thousands of hours of audio data and require considerable resources to run. Despite recent progress in large speech recognition models, we highlight the potential of smaller, specialized "micro" models. Such lightweight models can be trained to perform well on number-recognition-specific tasks, competing with general models like Whisper or Google STT while using less than 80 minutes of training time and occupying at least an order of magnitude less memory. Also, unlike larger speech recognition models, micro-models are trained on carefully selected and curated datasets, which makes them highly accurate, agile, and easy to retrain while using low compute resources. We present our work on creating micro models for multi-digit number recognition that handle diverse speaking styles reflecting real-world pronunciation patterns. Our work contributes to domain-specific ASR models, improving digit recognition accuracy and the privacy of data. An added advantage is that their low resource consumption allows them to be hosted on premise, keeping private data local instead of uploading it to an external cloud. Our results indicate that our micro-model makes fewer errors than best-of-breed commercial or open-source ASRs in recognizing digits (a 1.8% error rate for our best micro-model versus 5.8% for Whisper), and it has a low memory footprint (0.66 GB of VRAM for our model versus 11 GB for Whisper).
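Error rates like the 1.8% vs 5.8% figures above are conventionally computed as edit distance over reference length. A minimal sketch of that metric for digit sequences (the abstract does not specify the authors' exact scoring, so treat this as the standard convention, not their code):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (insertions,
    deletions, and substitutions each cost 1)."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]

def digit_error_rate(refs, hyps):
    """Total edits over total reference digits, averaged across utterances."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return edits / total

refs = [list("1234"), list("507")]
hyps = [list("1235"), list("507")]
rate = digit_error_rate(refs, hyps)   # 1 edit over 7 reference digits
```

Scoring at the digit level rather than the word level matters for this task, since a single mis-heard digit invalidates an entire account or card number.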
