We construct maximally recoverable codes (corresponding to partial MDS codes) which are based on linearized Reed-Solomon codes. The new codes have a smaller field size requirement compared with known constructions. For certain asymptotic regimes, the constructed codes have order-optimal alphabet size, asymptotically matching the known lower bound.
For an abelian group $H$ acting on the set $[\ell]$, an $(H,\ell)$-lift of a graph $G_0$ is a graph obtained by replacing each vertex by $\ell$ copies, and each edge by a matching corresponding to the action of an element of $H$. In this work, we show the following explicit constructions of expanders obtained via abelian lifts. For every (transitive) abelian group $H \leqslant \text{Sym}(\ell)$, constant degree $d \ge 3$ and $\epsilon > 0$, we construct explicit $d$-regular expander graphs $G$ obtained from an $(H,\ell)$-lift of a (suitable) base $n$-vertex expander $G_0$ with the following parameters: (i) $\lambda(G) \le 2\sqrt{d-1} + \epsilon$, for any lift size $\ell \le 2^{n^{\delta}}$ where $\delta=\delta(d,\epsilon)$, (ii) $\lambda(G) \le \epsilon \cdot d$, for any lift size $\ell \le 2^{n^{\delta_0}}$ for a fixed $\delta_0 > 0$, when $d \ge d_0(\epsilon)$, or (iii) $\lambda(G) \le \widetilde{O}(\sqrt{d})$, for lift size ``exactly'' $\ell = 2^{\Theta(n)}$. As corollaries, we obtain explicit quantum lifted product codes of Panteleev and Kalachev of almost linear distance (and also in a wide range of parameters) and explicit classical quasi-cyclic LDPC codes with wide range of circulant sizes. Items $(i)$ and $(ii)$ above are obtained by extending the techniques of Mohanty, O'Donnell and Paredes [STOC 2020] for $2$-lifts to much larger abelian lift sizes (as a byproduct simplifying their construction). This is done by providing a new encoding of special walks arising in the trace power method, carefully "compressing'" depth-first search traversals. Result $(iii)$ is via a simpler proof of Agarwal et al. [SIAM J. Discrete Math 2019] at the expense of polylog factors in the expansion.
We approach self-supervised learning of image representations from a statistical dependence perspective, proposing Self-Supervised Learning with the Hilbert-Schmidt Independence Criterion (SSL-HSIC). SSL-HSIC maximizes dependence between representations of transformations of an image and the image identity, while minimizing the kernelized variance of those representations. This framework yields a new understanding of InfoNCE, a variational lower bound on the mutual information (MI) between different transformations. While the MI itself is known to have pathologies which can result in learning meaningless representations, its bound is much better behaved: we show that it implicitly approximates SSL-HSIC (with a slightly different regularizer). Our approach also gives us insight into BYOL, a negative-free SSL method, since SSL-HSIC similarly learns local neighborhoods of samples. SSL-HSIC allows us to directly optimize statistical dependence in time linear in the batch size, without restrictive data assumptions or indirect mutual information estimators. Trained with or without a target network, SSL-HSIC matches the current state-of-the-art for standard linear evaluation on ImageNet, semi-supervised learning and transfer to other classification and vision tasks such as semantic segmentation, depth estimation and object recognition. Code is available at //github.com/deepmind/ssl_hsic .
The conditions for a Runge--Kutta method to be of order $p$ with $p\ge 5$ for a scalar non-autonomous problem are a proper subset of the order conditions for a vector problem. Nevertheless, Runge--Kutta methods that were derived historically only for scalar problems happened to be of the same order for vector problems. We relate the order conditions for scalar problems to factorisations of the Runge--Kutta trees into "atomic stumps" and enumerate those conditions up to $p=20$. Using a special search procedure over unsatisfied order conditions, new Runge--Kutta methods of "ambiguous orders" five and six are constructed. These are used to verify the validity of the results.
Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects' predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer's Disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models that explain a clinically realistic percentage of the outcome variance.
A matrix formalism for the determination of the best estimator in certain simulation-based parameter estimation problems will be presented and discussed. The equations, termed as the Linear Template Fit, combine a linear regression with a least square method and its optimization. The Linear Template Fit employs only predictions that are calculated beforehand and which are provided for a few values of the parameter of interest. Therefore, the Linear Template Fit is particularly suited for parameter estimation with computationally intensive simulations that are otherwise often limited in their usability for statistical inference, or for performance critical applications. Equations for error propagation are discussed, and the analytic form provides comprehensive insights into the parameter estimation problem. Furthermore, the quickly-converging algorithm of the Quadratic Template Fit will be presented, which is suitable for a non-linear dependence on the parameters. As an example application, a determination of the strong coupling constant, $\alpha_s(m_Z)$, from inclusive jet cross section data at the CERN Large Hadron Collider is studied and compared with previously published results.
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers. Compared to convolutional layers, FC layers are more efficient, better at modeling the long-range dependencies and positional patterns, but worse at capturing the local structures, hence usually less favored for image recognition. We propose a structural re-parameterization technique that adds local prior into an FC to make it powerful for image recognition. Specifically, we construct convolutional layers inside a RepMLP during training and merge them into the FC for inference. On CIFAR, a simple pure-MLP model shows performance very close to CNN. By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs. Our intriguing findings highlight that combining the global representational capacity and positional perception of FC with the local prior of convolution can improve the performance of neural network with faster speed on both the tasks with translation invariance (e.g., semantic segmentation) and those with aligned images and positional patterns (e.g., face recognition). The code and models are available at //github.com/DingXiaoH/RepMLP.
We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to guarantee both optimism and convergence of the associated value iteration scheme. We prove that EB-SSP achieves the minimax regret rate $\widetilde{O}(B_{\star} \sqrt{S A K})$, where $K$ is the number of episodes, $S$ is the number of states, $A$ is the number of actions and $B_{\star}$ bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of $B_{\star}$, nor of $T_{\star}$ which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of $T_{\star}$ is available) where the regret only contains a logarithmic dependence on $T_{\star}$, thus yielding the first horizon-free regret bound beyond the finite-horizon MDP setting.
We consider the exploration-exploitation trade-off in reinforcement learning and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft-Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
Various 3D reconstruction methods have enabled civil engineers to detect damage on a road surface. To achieve the millimetre accuracy required for road condition assessment, a disparity map with subpixel resolution needs to be used. However, none of the existing stereo matching algorithms are specially suitable for the reconstruction of the road surface. Hence in this paper, we propose a novel dense subpixel disparity estimation algorithm with high computational efficiency and robustness. This is achieved by first transforming the perspective view of the target frame into the reference view, which not only increases the accuracy of the block matching for the road surface but also improves the processing speed. The disparities are then estimated iteratively using our previously published algorithm where the search range is propagated from three estimated neighbouring disparities. Since the search range is obtained from the previous iteration, errors may occur when the propagated search range is not sufficient. Therefore, a correlation maxima verification is performed to rectify this issue, and the subpixel resolution is achieved by conducting a parabola interpolation enhancement. Furthermore, a novel disparity global refinement approach developed from the Markov Random Fields and Fast Bilateral Stereo is introduced to further improve the accuracy of the estimated disparity map, where disparities are updated iteratively by minimising the energy function that is related to their interpolated correlation polynomials. The algorithm is implemented in C language with a near real-time performance. The experimental results illustrate that the absolute error of the reconstruction varies from 0.1 mm to 3 mm.
In this paper we address the problem of learning robust cross-domain representations for sketch-based image retrieval (SBIR). While most SBIR approaches focus on extracting low- and mid-level descriptors for direct feature matching, recent works have shown the benefit of learning coupled feature representations to describe data from two related sources. However, cross-domain representation learning methods are typically cast into non-convex minimization problems that are difficult to optimize, leading to unsatisfactory performance. Inspired by self-paced learning, a learning methodology designed to overcome convergence issues related to local optima by exploiting the samples in a meaningful order (i.e. easy to hard), we introduce the cross-paced partial curriculum learning (CPPCL) framework. Compared with existing self-paced learning methods which only consider a single modality and cannot deal with prior knowledge, CPPCL is specifically designed to assess the learning pace by jointly handling data from dual sources and modality-specific prior information provided in the form of partial curricula. Additionally, thanks to the learned dictionaries, we demonstrate that the proposed CPPCL embeds robust coupled representations for SBIR. Our approach is extensively evaluated on four publicly available datasets (i.e. CUFS, Flickr15K, QueenMary SBIR and TU-Berlin Extension datasets), showing superior performance over competing SBIR methods.