Bayesian filtering is widely used in mathematical finance, primarily in stochastic volatility models, where it helps estimate unobserved latent variables from observed market data. The field has developed rapidly in recent years, driven by increased computational power and by advances in model parameter estimation and implied volatility theory. In this paper, we design a novel method to estimate the underlying states (volatility and risk) from option prices using Bayesian filtering theory and the Posterior Cramér-Rao Lower Bound (PCRLB), and then use the estimated states for option price prediction. Several Bayesian filters, such as the Extended Kalman Filter (EKF), the Unscented Kalman Filter (UKF), and the Particle Filter (PF), are used for latent state estimation of the Black-Scholes model under GARCH dynamics. We employ an Average and Best case switching strategy for adaptive state estimation of a non-linear, discrete-time state-space model (SSM) such as Black-Scholes, using a PCRLB-based performance measure to select the best filter at each time step [1]. Since a closed-form expression for the PCRLB is non-trivial to obtain, we employ a particle-filter-based approximation of the PCRLB following [2]. We test the proposed framework on S\&P 500 option data, estimating the underlying state from real option prices and using it to compute the theoretical option price and to forecast future prices. The proposed method performs considerably better than any individual filter applied to the state estimation task and substantially improves forecasting capability.
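To make the filtering setup concrete, here is a minimal sketch of a bootstrap particle filter in which the latent variance follows GARCH(1,1)-style dynamics and the observation is a noisy Black-Scholes call price. It is not the paper's Average/Best switching framework, and all function names, parameter values, and noise models are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def particle_filter(prices, S, K, T, r, omega, alpha, beta,
                    n_particles=2000, obs_noise=0.05, seed=0):
    """Bootstrap particle filter: latent variance h_t follows GARCH(1,1)-style
    dynamics; the observation is a noisy Black-Scholes call price."""
    rng = np.random.default_rng(seed)
    h = np.full(n_particles, omega / (1.0 - alpha - beta))  # stationary variance
    estimates = []
    for t, y in enumerate(prices):
        # propagate particles through the (assumed) GARCH(1,1) transition
        eps = rng.standard_normal(n_particles)
        h = omega + alpha * h * eps**2 + beta * h
        # weight particles by the likelihood of the observed option price
        model_price = bs_call(S[t], K, T[t], r, np.sqrt(h))
        w = np.maximum(norm.pdf(y, loc=model_price, scale=obs_noise), 1e-300)
        w /= w.sum()
        estimates.append(np.sum(w * np.sqrt(h)))   # posterior-mean volatility
        # multinomial resampling
        idx = rng.choice(n_particles, size=n_particles, p=w)
        h = h[idx]
    return np.array(estimates)
```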
The estimation of parameter standard errors for semi-variogram models is challenging, given the two-step process required to fit a parametric model to spatially correlated data. Motivated by an application in social epidemiology, we focus on exponential semi-variogram models fitted to data comprising 500 to 2000 observations with little control over the sampling design. Previously proposed methods for estimating standard errors cannot be applied in this context: approximate closed-form solutions based on generalized least squares are too costly in terms of memory. The generalized bootstrap proposed by Olea and Pardo-Ig\'uzquiza remains applicable when weighted rather than generalized least squares is used; however, the resulting standard error estimates are severely biased and imprecise. We therefore propose a filtering method added to the generalized bootstrap. The new development is presented and evaluated in a simulation study, which shows that the generalized bootstrap with check-based filtering leads to substantially improved results compared to the quantile-based filter method and to previously developed approaches. We illustrate the method in a case study using birthweight data.
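As an illustration of the second step of such a two-step fit, the sketch below fits an exponential semi-variogram model to an empirical semi-variogram by weighted least squares, with weights proportional to the number of point pairs per lag bin. The function names, starting values, and weighting choice are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_semivariogram(h, nugget, psill, range_):
    """Exponential model: gamma(h) = nugget + psill * (1 - exp(-h / range))."""
    return nugget + psill * (1.0 - np.exp(-h / range_))

def fit_wls(lags, gamma_hat, n_pairs):
    """Weighted least-squares fit of the empirical semi-variogram.
    Weights proportional to the number of point pairs per lag bin."""
    sigma = 1.0 / np.sqrt(n_pairs)            # smaller sigma -> larger weight
    p0 = [0.1 * gamma_hat.max(), gamma_hat.max(), lags.max() / 3.0]
    params, cov = curve_fit(exp_semivariogram, lags, gamma_hat,
                            p0=p0, sigma=sigma, absolute_sigma=False)
    return params, cov
```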
Evaluating predictive models is a crucial task in predictive analytics. This process is especially challenging with time series data, where observations exhibit temporal dependencies. Several studies have analysed how different performance estimation methods compare with each other in approximating the true loss incurred by a given forecasting model. However, these studies do not address how the estimators behave for model selection: the ability to select the best solution among a set of alternatives. We address this issue and compare a set of estimation methods for model selection in time series forecasting tasks. We attempt to answer two main questions: (i) how often do the estimators select the best possible model; and (ii) what is the performance loss when they do not. We find empirically that the accuracy of the estimators in selecting the best solution is low, and that the overall forecasting performance loss associated with the model selection process ranges from 1.2% to 2.3%. We also find that some factors, such as the sample size, influence the relative performance of the estimators.
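A minimal sketch of what estimator-based model selection looks like in this setting, assuming a blocked time-series cross-validation estimator and two toy forecasters; the models and data below are purely illustrative and are not the estimators compared in the paper.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error

def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return np.full(horizon, train[-1])

def mean_forecast(train, horizon):
    """Repeat the historical mean."""
    return np.full(horizon, train.mean())

def select_by_cv(y, models, n_splits=5):
    """Pick the model with the lowest average error over blocked CV folds."""
    scores = {name: [] for name in models}
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(y):
        train, test = y[train_idx], y[test_idx]
        for name, f in models.items():
            scores[name].append(mean_absolute_error(test, f(train, len(test))))
    return min(scores, key=lambda name: np.mean(scores[name]))

y = np.cumsum(np.random.default_rng(1).normal(size=300))   # synthetic series
best = select_by_cv(y, {"naive": naive_forecast, "mean": mean_forecast})
print("selected model:", best)
```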
The Lee-Carter model is a traditional and widely adopted mortality rate projection technique that represents the log mortality rate as a simple bilinear form $\log(m_{x,t})=a_x+b_xk_t$. The model has been studied extensively over the past 30 years; however, little attention has been paid to its performance in the presence of outliers, particularly for the estimation of $b_x$. In this paper, we propose a robust estimation method for the Lee-Carter model by formulating it as probabilistic principal component analysis (PPCA) with multivariate $t$-distributions, together with an efficient expectation-maximization (EM) algorithm for implementation. The advantages of the method are threefold: it yields significantly more robust estimates of both $b_x$ and $k_t$; it preserves the fundamental interpretation of $b_x$ as the first principal component, as in the traditional approach; and it can be flexibly integrated with existing time series models for $k_t$. Parameter uncertainty is examined with a standard residual bootstrap. A simulation study based on the Human Mortality Database shows that the proposed model outperforms conventional approaches.
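For reference, the classical (non-robust) Lee-Carter fit via singular value decomposition can be sketched as follows; this is the conventional baseline against which the robust PPCA estimator is compared, not the proposed method itself.

```python
import numpy as np

def lee_carter_svd(log_m):
    """Classical Lee-Carter fit of log m_{x,t} = a_x + b_x k_t via SVD.
    log_m: (ages x years) matrix of log mortality rates."""
    a = log_m.mean(axis=1)                        # a_x: age-specific average level
    centred = log_m - a[:, None]
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    b = U[:, 0]                                   # first left singular vector
    k = s[0] * Vt[0, :]                           # first right singular vector, scaled
    c = b.sum()
    b, k = b / c, k * c                           # enforce sum(b) = 1
    a, k = a + b * k.mean(), k - k.mean()         # enforce sum(k) = 0
    return a, b, k
```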
Trajectory forecasting is critical for autonomous platforms to plan and act safely. Currently, most trajectory forecasting methods assume that object trajectories have already been extracted and develop predictors directly on ground-truth trajectories. However, this assumption does not hold in practice: trajectories obtained from object detection and tracking are inevitably noisy, which can cause serious forecasting errors for predictors built on ground-truth trajectories. In this paper, we propose a trajectory predictor that operates directly on detection results without relying on explicitly formed trajectories. In contrast to traditional methods that encode an agent's motion cue from its clearly defined trajectory, we extract motion information only from affinity cues among detection results, and design an affinity-aware state update mechanism that accounts for association uncertainty. In addition, since there can be multiple plausible matching candidates, we aggregate their states. This design mitigates the undesirable effect of noisy trajectories obtained from data association. Extensive ablation experiments validate the effectiveness of our method and its ability to generalize across different detectors. Cross-comparison with other forecasting schemes further demonstrates the superiority of our method. Code will be released upon acceptance.
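A minimal sketch of the kind of affinity-weighted state aggregation described above, assuming soft association weights over candidate detections; the gating rule and blending scheme are illustrative assumptions, not the paper's exact update mechanism.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def affinity_aware_update(prev_state, candidates, affinities, gate=0.5):
    """Aggregate candidate detection states weighted by their (softmaxed) affinity
    to the tracked agent, instead of committing to a single hard association.
    prev_state: (d,) previous state; candidates: (m, d) detection states;
    affinities: (m,) association scores. Falls back to the previous state when
    no candidate is sufficiently plausible."""
    w = softmax(np.asarray(affinities, dtype=float))
    if w.max() < gate:
        return prev_state                        # uncertain association: keep prior state
    aggregated = w @ np.asarray(candidates)      # soft aggregation over candidates
    confidence = w.max()
    return confidence * aggregated + (1.0 - confidence) * prev_state
```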
The stabilizer rank of a quantum state $\psi$ is the minimal $r$ such that $\left| \psi \right\rangle = \sum_{j=1}^r c_j \left|\varphi_j \right\rangle$ for $c_j \in \mathbb{C}$ and stabilizer states $\varphi_j$. The running time of several classical simulation methods for quantum circuits is determined by the stabilizer rank of the $n$-th tensor power of single-qubit magic states. We prove a lower bound of $\Omega(n)$ on the stabilizer rank of such states, improving the previous lower bound of $\Omega(\sqrt{n})$ of Bravyi, Smith and Smolin (arXiv:1506.01396). Further, we prove that for a sufficiently small constant $\delta$, the stabilizer rank of any state which is $\delta$-close to those states is $\Omega(\sqrt{n}/\log n)$. This is the first non-trivial lower bound for approximate stabilizer rank. Our techniques rely on the representation of stabilizer states as quadratic functions over affine subspaces of $\mathbb{F}_2^n$, and we use tools from the analysis of Boolean functions and complexity theory. The proof of the first result involves a careful analysis of directional derivatives of quadratic polynomials, whereas the proof of the second result uses Razborov-Smolensky low-degree polynomial approximations and correlation bounds against the majority function.
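For concreteness, the standard single-qubit magic state (taken here as the canonical example; the abstract itself does not fix a particular state) already illustrates the definition:
\[
|T\rangle \;=\; \frac{1}{\sqrt{2}}\left(|0\rangle + e^{i\pi/4}|1\rangle\right)
\;=\; \frac{1}{\sqrt{2}}\,|0\rangle + \frac{e^{i\pi/4}}{\sqrt{2}}\,|1\rangle ,
\]
so the stabilizer rank satisfies $\chi(|T\rangle)\le 2$, since $|0\rangle$ and $|1\rangle$ are stabilizer states, and in fact $\chi(|T\rangle)=2$ because $|T\rangle$ is not itself a stabilizer state. The quantity governing simulation cost is $\chi\!\left(|T\rangle^{\otimes n}\right)$, which the result above lower-bounds by $\Omega(n)$.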
Policy makers typically need to estimate the long-term effects of novel treatments while having historical data only for older treatment options. We assume access to a long-term dataset in which only past treatments were administered and a short-term dataset in which novel treatments have been administered. We propose a surrogate-based approach in which the long-term effect is assumed to be channeled through a multitude of available short-term proxies. Our work combines three major recent techniques from the causal machine learning literature, namely surrogate indices, dynamic treatment effect estimation, and double machine learning, in a unified pipeline. We show that our method is consistent and provides root-$n$ asymptotically normal estimates under a Markovian assumption on the data and the observational policy. We use a dataset from a major corporation containing customer investments over a three-year period to create a semi-synthetic data distribution in which the major qualitative properties of the real dataset are preserved. We evaluate the performance of our method and discuss practical challenges of deploying our formal methodology and how to address them.
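A highly simplified two-stage surrogate-index estimator is sketched below; it omits the dynamic treatment effect and double machine learning components of the proposed pipeline, and the variable names, learner, and data layout are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def surrogate_index_effect(long_term, short_term):
    """Simplified surrogate-index estimator (not the paper's full DML pipeline):
    1) on the long-term data, learn E[long-term outcome | short-term surrogates];
    2) impute that index on the short-term experimental data;
    3) contrast the imputed index between treated and control units."""
    # long_term: dict with surrogates S (n x d) and long-term outcome Y (n,)
    index_model = GradientBoostingRegressor().fit(long_term["S"], long_term["Y"])
    # short_term: dict with surrogates S (m x d) and binary treatment D (m,)
    y_hat = index_model.predict(short_term["S"])
    d = short_term["D"].astype(bool)
    return y_hat[d].mean() - y_hat[~d].mean()
```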
Simulator-based models are models for which the likelihood is intractable but simulation of synthetic data is possible. They are often used to describe complex real-world phenomena and, as such, are frequently misspecified in practice. Unfortunately, existing Bayesian approaches for simulators are known to perform poorly in such cases. In this paper, we propose a novel algorithm based on the posterior bootstrap and maximum mean discrepancy estimators. This leads to a highly parallelisable Bayesian inference algorithm with strong robustness properties. We demonstrate this through an in-depth theoretical study, which includes generalisation bounds and proofs of frequentist consistency and robustness of our posterior. The approach is then assessed on a range of examples, including a g-and-k distribution and a toggle-switch model.
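A minimal sketch of a posterior bootstrap with a maximum mean discrepancy loss, assuming a Gaussian kernel, Dirichlet weights over the observations, and a toy Gaussian location simulator standing in for an intractable one; grid minimisation is used in place of a proper optimiser, so this is an illustration of the idea rather than the paper's algorithm.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    return np.exp(-(x[:, None] - y[None, :])**2 / (2 * bandwidth**2))

def weighted_mmd2(obs, sims, w):
    """Weighted (biased) MMD^2 between weighted observations and equally
    weighted simulator draws, with a Gaussian kernel."""
    k_oo = w @ gaussian_kernel(obs, obs) @ w
    k_ss = gaussian_kernel(sims, sims).mean()
    k_os = (w @ gaussian_kernel(obs, sims)).mean()
    return k_oo + k_ss - 2 * k_os

def posterior_bootstrap(obs, simulate, theta_grid, n_draws=200, n_sims=500, seed=0):
    """Posterior-bootstrap sketch: each posterior draw minimises a weighted MMD
    with Dirichlet-distributed weights over the observations."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_draws):
        w = rng.dirichlet(np.ones(len(obs)))
        losses = [weighted_mmd2(obs, simulate(theta, n_sims, rng), w)
                  for theta in theta_grid]
        draws.append(theta_grid[int(np.argmin(losses))])
    return np.array(draws)

# toy simulator: Gaussian location model standing in for an intractable simulator
simulate = lambda theta, n, rng: theta + rng.standard_normal(n)
obs = np.random.default_rng(1).normal(loc=2.0, size=100)
post = posterior_bootstrap(obs, simulate, theta_grid=np.linspace(0, 4, 81))
print(post.mean(), post.std())
```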
We consider the combinatorial bandits problem with semi-bandit feedback under a finite sampling budget constraint, in which the learner can carry out its action only a limited number of times specified by an overall budget. The action is to choose a set of arms, whereupon feedback is received for each arm in the chosen set. Unlike existing works, we study this problem in a non-stochastic setting with subset-dependent feedback, i.e., the semi-bandit feedback may be generated by an oblivious adversary and may depend on the chosen set of arms. In addition, we consider a general feedback scenario covering both the numerical and the preference-based case, and introduce a sound theoretical framework for this setting that guarantees sensible notions of optimal arms, which the learner seeks to find. We propose a generic algorithm that covers the full spectrum of conceivable arm elimination strategies, from aggressive to conservative. We answer theoretical questions about the budget that is sufficient and necessary for the algorithm to find the best arm, and complement these results with lower bounds that hold for any learning algorithm in this setting.
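To illustrate the elimination-based template, the sketch below implements a simple budgeted successive-elimination scheme with random query subsets and mean-based feedback aggregation; the query policy, elimination fraction, and stopping rule are assumptions, not the generic algorithm analysed in the paper.

```python
import numpy as np

def budgeted_elimination(pull_set, n_arms, budget, set_size=3,
                         queries_per_round=20, elim_fraction=0.5, seed=0):
    """Budgeted successive elimination (illustrative): each round, query random
    subsets of the surviving arms, accumulate semi-bandit feedback per arm, then
    eliminate the worst-performing fraction; stop when the budget is exhausted
    or one arm remains. `pull_set(arms)` returns one feedback value per arm."""
    rng = np.random.default_rng(seed)
    alive = np.arange(n_arms)
    totals, counts = np.zeros(n_arms), np.zeros(n_arms)
    while budget > 0 and len(alive) > 1:
        for _ in range(min(queries_per_round, budget)):
            arms = rng.choice(alive, size=min(set_size, len(alive)), replace=False)
            totals[arms] += pull_set(arms)
            counts[arms] += 1
        budget -= min(queries_per_round, budget)
        means = totals[alive] / np.maximum(counts[alive], 1)
        n_keep = max(1, int(np.ceil(len(alive) * (1 - elim_fraction))))
        alive = alive[np.argsort(means)[::-1][:n_keep]]   # keep the best arms
    return alive[np.argmax(totals[alive] / np.maximum(counts[alive], 1))]
```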
Panel Vector Autoregressions (PVARs) are a popular tool for analyzing multi-country datasets. However, the number of estimated parameters can be enormous, leading to computational and statistical issues. In this paper, we develop fast Bayesian methods for estimating PVARs using integrated rotated Gaussian approximations. We exploit the fact that domestic information is often more important than international information and group the coefficients accordingly. Fast approximations are used to estimate the coefficients associated with international information, while the domestic coefficients are estimated with precision using Markov chain Monte Carlo techniques. Using a huge model of the world economy, we illustrate that the approach produces competitive forecasts quickly.
Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Classical statistical models often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it has been shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real-world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining accuracy comparable to state-of-the-art point forecasting methods.
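A minimal sketch of particle-based probabilistic forecasting for a nonlinear state-space model; a plain bootstrap particle filter stands in for the particle-flow approximation used in the paper, and the transition and observation functions below are hypothetical.

```python
import numpy as np

def particle_forecast(y, f, h, q_std, r_std, horizon=5,
                      n_particles=1000, seed=0):
    """Probabilistic forecasting with a particle approximation of the state
    posterior (a bootstrap particle filter here, standing in for particle flow).
    f: state transition, h: observation function."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)                     # initial particle cloud
    for obs in y:
        x = f(x) + q_std * rng.standard_normal(n_particles)  # propagate
        w = np.exp(-0.5 * ((obs - h(x)) / r_std) ** 2)       # weight
        w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]     # resample
    # roll the particle cloud forward to obtain a predictive distribution
    forecasts = []
    for _ in range(horizon):
        x = f(x) + q_std * rng.standard_normal(n_particles)
        forecasts.append(h(x))
    forecasts = np.array(forecasts)                          # (horizon, n_particles)
    return forecasts.mean(axis=1), np.quantile(forecasts, [0.05, 0.95], axis=1)

# toy nonlinear state-space model (hypothetical, for illustration only)
f = lambda x: 0.9 * x + 2.0 * np.tanh(x)
h = lambda x: x + 0.1 * x**2
y = np.sin(np.linspace(0, 6, 60))                            # stand-in observations
mean, (lo, hi) = particle_forecast(y, f, h, q_std=0.3, r_std=0.2)
```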