A natural goal when designing online learning algorithms for non-stationary environments is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Intuitively, when the variation is small, it should be easier for the algorithm to achieve low regret, since past observations are predictive of future inputs. Such data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including online convex optimization (OCO) and bandits. We obtain the first pathlength regret bounds for online control and estimation (e.g., Kalman filtering) in linear dynamical systems. The key idea in our derivation is to reduce pathlength-optimal filtering and control to certain variational problems in robust estimation and control; these reductions may be of independent interest. Numerical simulations confirm that our pathlength-optimal algorithms outperform traditional $H_2$ and $H_{\infty}$ algorithms when the environment varies over time.
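For concreteness, one common convention defines the pathlength of a comparator or disturbance sequence $w_{1:T}$ as its total temporal variation,
$$\mathrm{PL}(w_{1:T}) \;=\; \sum_{t=1}^{T-1} \big\|w_{t+1} - w_t\big\|,$$
and a pathlength regret bound then takes the form $\mathrm{Regret}_T = O(\mathrm{PL}(w_{1:T}))$ up to problem-dependent factors, so that a slowly varying environment automatically yields low regret.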
Reducing the computational time required by high-fidelity, full-order models (FOMs) for the solution of problems in cardiac mechanics is crucial to allow the translation of patient-specific simulations into clinical practice. While FOMs, such as those based on the finite element method, provide valuable information about cardiac mechanical function, up to hundreds of thousands of degrees of freedom may be needed to obtain accurate numerical results. Indeed, simulating even a few heartbeats can require hours to days of CPU time, even on powerful supercomputers. Moreover, cardiac models depend on a set of input parameters that one may wish to vary in order to explore multiple virtual scenarios. To compute reliable solutions at a greatly reduced computational cost, we rely on a reduced basis method empowered with a new deep-learning-based operator approximation, which we refer to as the Deep-HyROMnet technique. Our strategy combines a projection-based POD-Galerkin method with deep neural networks for the approximation of (reduced) nonlinear operators, overcoming the computational bottleneck typically associated with standard hyper-reduction techniques. This method is shown to provide reliable approximations to cardiac mechanics problems, outperforming classical projection-based ROMs by orders of magnitude in computational speed-up and enabling forward uncertainty quantification analyses that would otherwise be unaffordable.
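To illustrate the general pattern (a minimal sketch with hypothetical names; the actual Deep-HyROMnet architecture and training pipeline are more elaborate), a POD basis is extracted from FOM snapshots via an SVD, and a trained network replaces the expensive assembly and projection of the reduced nonlinear operator inside the reduced Newton iterations:

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """Compute a POD basis from a matrix of FOM snapshots (n_dofs x n_snaps)."""
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes]  # V: n_dofs x n_modes

def rom_solve(q0, mu, V, reduced_operator_net, reduced_jacobian_net,
              tol=1e-8, max_iter=20):
    """Newton iterations entirely in the reduced space: the networks
    approximate the projected residual V^T F(V q; mu) and its Jacobian,
    avoiding assembly on the full-order mesh (the usual hyper-reduction
    bottleneck)."""
    q = q0.copy()
    for _ in range(max_iter):
        r = reduced_operator_net(q, mu)       # (n_modes,)
        if np.linalg.norm(r) < tol:
            break
        J = reduced_jacobian_net(q, mu)       # (n_modes, n_modes)
        q -= np.linalg.solve(J, r)
    return V @ q  # lift the reduced solution back to the FOM space
```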
Platform trials evaluate multiple experimental treatments under a single master protocol, where new treatment arms are added to the trial over time. Given the multiple treatment comparisons, there is the potential for inflation of the overall type I error rate, which is complicated by the fact that the hypotheses are tested at different times and are not all necessarily pre-specified. Online error control methodology provides a possible solution to the problem of multiplicity for platform trials where a relatively large number of hypotheses are expected to be tested over time. In the online testing framework, hypotheses are tested in a sequential manner: at each time step, an analyst decides whether to reject the current null hypothesis without knowledge of future tests, based solely on past decisions. Methodology has recently been developed for online control of both the false discovery rate and the familywise error rate (FWER). In this paper, we describe how to apply online error control to the platform trial setting, present extensive simulation results, and give recommendations for the use of this new methodology in practice. We show that algorithms for online error rate control can achieve a substantially lower FWER than uncorrected testing, while still delivering noticeable gains in power compared with a Bonferroni procedure. We also illustrate how online error control would have impacted a currently ongoing platform trial.
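As a point of reference (a sketch of the simplest such procedure, not the specific algorithms evaluated here), online FWER control can be achieved by pre-allocating a convergent sequence of significance levels to the stream of hypotheses:

```python
import math

def online_bonferroni(p_values, alpha=0.05):
    """Simplest online FWER procedure: spend alpha_i = alpha * 6/(pi^2 i^2)
    on the i-th hypothesis, so that sum_i alpha_i <= alpha and the
    familywise error rate is controlled at level alpha no matter how
    many hypotheses eventually arrive."""
    decisions = []
    for i, p in enumerate(p_values, start=1):
        alpha_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        decisions.append(p <= alpha_i)
    return decisions

# e.g., three treatment arms tested as their data mature:
print(online_bonferroni([0.001, 0.20, 0.004]))  # [True, False, False]
```

More powerful procedures, such as the online fallback procedure, additionally recycle the error budget of rejected hypotheses based on past decisions.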
We study ROUND-UFP and ROUND-SAP, two generalizations of the classical BIN PACKING problem that correspond to the unsplittable flow problem on a path (UFP) and the storage allocation problem (SAP), respectively. We are given a path with capacities on its edges and a set of tasks, where for each task we are given a demand and a subpath. In ROUND-UFP, the goal is to find a packing of all tasks into a minimum number of copies (rounds) of the given path such that for each copy, the total demand of tasks on any edge does not exceed the capacity of the respective edge. In ROUND-SAP, the tasks are considered to be rectangles and the goal is to find a non-overlapping packing of these rectangles into a minimum number of rounds such that all rectangles lie completely below the capacity profile of the edges. We show that, in contrast to BIN PACKING, neither problem admits an asymptotic polynomial-time approximation scheme (APTAS), even when all edge capacities are equal. However, for this setting, we obtain asymptotic $(2+\varepsilon)$-approximations for both problems. For the general case, we obtain an $O(\log\log n)$-approximation algorithm and an $O(\log\log\frac{1}{\delta})$-approximation under $(1+\delta)$-resource augmentation for both problems. For the intermediate setting of the no-bottleneck assumption (i.e., the maximum task demand is at most the minimum edge capacity), we obtain an absolute $12$-approximation for ROUND-UFP and an asymptotic $(16+\varepsilon)$-approximation for ROUND-SAP.
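As a small illustration of the constraint involved (a sketch; tasks are encoded as (demand, first_edge, last_edge) triples), checking whether a given subset of tasks fits into a single round of ROUND-UFP amounts to verifying the capacity constraint on every edge:

```python
def fits_one_round(tasks, capacities):
    """ROUND-UFP feasibility for one copy of the path: on every edge,
    the total demand of tasks whose subpath uses that edge must not
    exceed that edge's capacity."""
    load = [0.0] * len(capacities)
    for demand, first_edge, last_edge in tasks:
        for e in range(first_edge, last_edge + 1):
            load[e] += demand
    return all(load[e] <= capacities[e] for e in range(len(capacities)))

# two tasks sharing edge 1 on a 3-edge path:
print(fits_one_round([(2, 0, 1), (3, 1, 2)], [4, 5, 3]))  # True
```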
Policy optimization is among the most popular and successful classes of reinforcement learning algorithms, and there is increasing interest in understanding its theoretical guarantees. In this work, we initiate the study of policy optimization for the stochastic shortest path (SSP) problem, a goal-oriented reinforcement learning model that strictly generalizes the finite-horizon model and better captures many applications. We consider a wide range of settings, including stochastic and adversarial environments under full information or bandit feedback, and propose a policy optimization algorithm for each setting that makes use of novel correction terms and/or variants of dilated bonuses (Luo et al., 2021). For most settings, our algorithm is shown to achieve a near-optimal regret bound. One key technical contribution of this work is a new approximation scheme for SSP problems that we call \textit{stacked discounted approximation} and use in all our proposed algorithms. Unlike the finite-horizon approximation heavily used in recent SSP algorithms, our new approximation enables us to learn a near-stationary policy with only logarithmic changes during an episode and could lead to an exponential improvement in space complexity.
Accurate and efficient estimation of high-dimensional channels is one of the critical challenges for practical applications of massive multiple-input multiple-output (MIMO). In the context of hybrid analog-digital (HAD) transceivers, channel estimation becomes even more complicated due to the information loss caused by the limited number of radio-frequency chains. Conventional compressive sensing (CS) algorithms usually suffer from unsatisfactory performance and high computational complexity. In this paper, we propose a novel deep learning (DL) based framework for uplink channel estimation in HAD massive MIMO systems. To better exploit the sparsity structure of channels in the angular domain, a novel angular space segmentation method is proposed, where the entire angular space is segmented into many small regions and a dedicated neural network is trained offline for each region. During online testing, the most suitable network is selected based on global positioning system (GPS) information. Inside each neural network, the region-specific measurement matrix and channel estimator are jointly optimized, which not only improves the signal measurement efficiency but also enhances the channel estimation capability. Simulation results show that the proposed approach significantly outperforms state-of-the-art CS algorithms in terms of both estimation performance and computational complexity.
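Schematically (with hypothetical names; the actual networks and measurement design are learned jointly offline), the online stage amounts to a simple dispatch: the GPS-derived angle selects the angular region, and that region's measurement matrix and estimator network are applied to the received pilots:

```python
import numpy as np

def estimate_channel(pilots, angle_deg, regions):
    """Select the region-specific (measurement matrix, estimator network)
    pair whose angular interval contains the GPS-derived angle, then
    estimate the channel from the compressed pilot observations."""
    for (lo, hi), (A, estimator_net) in regions.items():
        if lo <= angle_deg < hi:
            y = A @ pilots            # region-specific measurement
            return estimator_net(y)   # region-specific channel estimator
    raise ValueError("angle outside the trained angular space")
```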
We say that a continuous real-valued function $x$ admits the Hurst roughness exponent $H$ if the $p^{\text{th}}$ variation of $x$ converges to zero for $p>1/H$ and to infinity for $p<1/H$. For the sample paths of many stochastic processes, such as fractional Brownian motion, the Hurst roughness exponent exists and equals the standard Hurst parameter. In our main result, we provide a mild condition on the Faber--Schauder coefficients of $x$ under which the Hurst roughness exponent exists and is given as the limit of the classical Gladyshev estimates $\widehat H_n(x)$. This result can be viewed as a strong consistency result for the Gladyshev estimators in an entirely model-free setting, because no assumption whatsoever is made on the possible dynamics of the function $x$. Nonetheless, our proof is probabilistic and relies on a martingale that is hidden in the Faber--Schauder expansion of $x$. Since the Gladyshev estimators are not scale-invariant, we construct several scale-invariant estimators derived from the sequence $(\widehat H_n)_{n\in\mathbb N}$. We also discuss how a dynamic change in the Hurst roughness parameter of a time series can be detected. Finally, we extend our results to the case in which the $p^{\text{th}}$ variation of $x$ is defined over a sequence of unequally spaced partitions. Our results are illustrated by means of high-frequency financial time series.
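In symbols, taking dyadic partitions of $[0,1]$ for concreteness, the $n^{\text{th}}$ approximation of the $p^{\text{th}}$ variation and the defining property of the exponent read
$$\langle x\rangle^{(p)}_n := \sum_{k=1}^{2^n} \big|x(k2^{-n}) - x((k-1)2^{-n})\big|^p, \qquad \lim_{n\to\infty}\langle x\rangle^{(p)}_n = \begin{cases} 0 & \text{if } p > 1/H,\\ +\infty & \text{if } p < 1/H.\end{cases}$$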
In this paper we propose a deep learning based numerical scheme for strongly coupled FBSDEs stemming from stochastic control. It is a modification of the deep BSDE method in which the initial value of the backward equation is not a free parameter, and which uses a new loss function given by the weighted sum of the cost of the control problem and a variance term that coincides with the mean squared error in the terminal condition. We show by a numerical example that a direct extension of the classical deep BSDE method to FBSDEs fails for a simple linear-quadratic control problem, and we motivate why the new method works. Under regularity and boundedness assumptions on the exact controls of the time-continuous and time-discrete control problems, we provide an error analysis for our method. We show empirically that the method converges for three different problems, including the one for which the direct extension of the deep BSDE method fails.
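A minimal sketch of the resulting objective (our notation and discretization; all function names are placeholders, and the exact coupling used in the scheme differs in details): simulate the controlled forward process and the backward process on an Euler-Maruyama grid, then combine the accumulated control cost with the mean squared error in the terminal condition:

```python
import torch

def fbsde_control_loss(x0, u_net, z_net, b, sigma, f, g, running_cost,
                       terminal_cost, n_steps, dt, weight=1.0):
    """Loss = control cost + weight * MSE in the terminal condition.
    Unlike the classical deep BSDE method, the initial value y0 of the
    backward equation is fixed by the scheme rather than trained."""
    x = x0
    y = torch.zeros(x0.shape[0], 1)          # y0: not a free parameter
    cost = torch.zeros(x0.shape[0], 1)
    for k in range(n_steps):
        t = k * dt
        u = u_net(t, x)                      # control network
        z = z_net(t, x)                      # network for the Z-process
        dw = torch.randn_like(x) * dt ** 0.5
        cost = cost + running_cost(t, x, u) * dt
        y = y - f(t, x, y, z) * dt + (z * dw).sum(dim=1, keepdim=True)
        x = x + b(t, x, u) * dt + sigma(t, x) * dw
    cost = cost + terminal_cost(x)
    terminal_mse = ((y - g(x)) ** 2).mean()  # variance / MSE penalty
    return cost.mean() + weight * terminal_mse
```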
Successful applications of InfoNCE and its variants have popularized the use of contrastive variational mutual information (MI) estimators in machine learning. While featuring superior stability, these estimators crucially depend on costly large-batch training, and they sacrifice bound tightness for variance reduction. To overcome these limitations, we revisit the mathematics of popular variational MI bounds through the lens of unnormalized statistical modeling and convex optimization. Our investigation not only yields a new unified theoretical framework encompassing popular variational MI bounds but also leads to a novel, simple, and powerful contrastive MI estimator named FLO. Theoretically, we show that the FLO estimator is tight and provably converges under stochastic gradient descent. Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently. The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
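For reference, the InfoNCE baseline that motivates this work illustrates both limitations at once: the bound is easy to compute from a $K \times K$ matrix of critic scores, but it saturates at $\log K$, which is why tightness requires large batches. A standard sketch:

```python
import math
import torch

def infonce_lower_bound(scores):
    """InfoNCE MI lower bound from a K x K matrix of critic scores
    f(x_i, y_j), where diagonal entries correspond to positive pairs.
    The estimate can never exceed log K, hence large-batch training."""
    K = scores.shape[0]
    return (scores.diag() - torch.logsumexp(scores, dim=1)).mean() + math.log(K)
```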
In this paper, we are interested in the performance of a variable-length stop-feedback (VLSF) code with $m$ optimal decoding times for the binary-input additive white Gaussian noise channel. We first develop tight approximations to the tail probability of the length-$n$ cumulative information density. Building on the work of Yavas \emph{et al.}, for a given information density threshold, we formulate the integer program of minimizing the upper bound on average blocklength over all decoding times, subject to the average error probability, minimum-gap, and integer constraints. Minimizing the resulting locally minimum upper bounds over all thresholds then yields the globally minimum upper bound; we refer to this procedure as two-step minimization. For the integer program, we present a greedy algorithm that yields possibly suboptimal integer decoding times. By allowing positive real-valued decoding times, we develop the gap-constrained sequential differential optimization (SDO) procedure that sequentially produces the optimal real-valued decoding times. We identify the error regime in which Polyanskiy's scheme of stopping at zero does not improve the achievability bound. In this error regime, the two-step minimization with the gap-constrained SDO shows that a finite $m$ suffices to attain Polyanskiy's bound for VLSF codes with $m = \infty$.
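Schematically, for a fixed information-density threshold $\gamma$ the inner integer program has the form (our notation)
$$\min_{n_1 < n_2 < \cdots < n_m}\; \bar N(n_1,\dots,n_m;\gamma) \quad \text{s.t.} \quad \epsilon(n_1,\dots,n_m;\gamma) \le \epsilon^\ast, \quad n_{i+1}-n_i \ge \Delta, \quad n_i \in \mathbb{Z}_{>0},$$
where $\bar N$ denotes the upper bound on average blocklength, $\epsilon$ the average error probability, and $\Delta$ the minimum gap; the outer step then minimizes the resulting value over $\gamma$.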
We introduce Monte-Carlo Attention (MCA), a randomized approximation method for reducing the computational cost of self-attention mechanisms in Transformer architectures. MCA exploits the fact that the importance of each token in an input sequence varies with its attention score; thus, some degree of error is tolerable when encoding tokens with low attention. Using approximate matrix multiplication, MCA applies different error bounds when encoding input tokens, so that those with low attention scores are computed with relaxed precision while errors on salient elements are minimized. MCA can operate in parallel with other attention optimization schemes and does not require model modification. We study the theoretical error bounds and demonstrate that MCA reduces attention complexity (in FLOPs) for various Transformer models by up to 11$\times$ on GLUE benchmarks without compromising model accuracy.
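To illustrate the core primitive (a sketch under our own simplifications, with a crude saliency proxy; not the authors' implementation), Monte-Carlo approximate matrix multiplication estimates a product $AV$ from $s$ sampled column/row pairs, and the per-query sample budget can be tied to attention saliency:

```python
import numpy as np

def amm(A, V, s, rng):
    """Monte-Carlo approximation of A @ V: sample s column/row pairs
    with probabilities proportional to ||A[:, k]|| * ||V[k, :]|| and
    rescale, giving an unbiased estimator of the exact product."""
    norms = np.linalg.norm(A, axis=0) * np.linalg.norm(V, axis=1)
    p = norms / norms.sum()
    idx = rng.choice(len(p), size=s, p=p)
    return (A[:, idx] / (s * p[idx])) @ V[idx, :]

def mc_attention(scores, V, min_s=4, max_s=64, rng=None):
    """Per-query adaptive precision: queries whose attention is spread
    thin (low peak score) get fewer samples; salient queries get more."""
    rng = rng or np.random.default_rng(0)
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                 # softmax attention
    out = np.empty((A.shape[0], V.shape[1]))
    for i, row in enumerate(A):
        s = int(min_s + (max_s - min_s) * row.max())  # saliency proxy
        out[i] = amm(row[None, :], V, min(s, len(row)), rng)[0]
    return out
```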