We study dihedral quantum codes of short block length, a large class of quantum CSS codes obtained by the lifted product construction. We present the code construction and give a formula for the code dimension, depending on the two classical codes on which the CSS code is based. We also give a lower bound on the code distance. Finally, we construct an example of short dihedral quantum codes, improving the parameters of previously known quantum codes.
We study dynamic algorithms in the model of algorithms with predictions. We assume the algorithm is given imperfect predictions regarding future updates, and we ask how such predictions can be used to improve the running time. This can be seen as a model interpolating between classic online and offline dynamic algorithms. Our results give smooth tradeoffs between these two extreme settings. First, we give algorithms for incremental and decremental transitive closure and approximate APSP that take as an additional input a predicted sequence of updates (edge insertions, or edge deletions, respectively). They preprocess it in $\tilde{O}(n^{(3+\omega)/2})$ time, and then handle updates in $\tilde{O}(1)$ worst-case time and queries in $\tilde{O}(\eta^2)$ worst-case time. Here $\eta$ is an error measure that can be bounded by the maximum difference between the predicted and actual insertion (deletion) time of an edge, i.e., by the $\ell_\infty$-error of the predictions. The second group of results concerns fully dynamic problems with vertex updates, where the algorithm has access to a predicted sequence of the next $n$ updates. We show how to solve fully dynamic triangle detection, maximum matching, single-source reachability, and more, in $O(n^{\omega-1}+n\eta_i)$ worst-case update time. Here $\eta_i$ denotes how much earlier the $i$-th update occurs than predicted. Our last result is a reduction that transforms a worst-case incremental algorithm without predictions into a fully dynamic algorithm which is given a predicted deletion time for each element at the time of its insertion. As a consequence we can, e.g., maintain fully dynamic exact APSP with such predictions in $\tilde{O}(n^2)$ worst-case vertex insertion time and $\tilde{O}(n^2 (1+\eta_i))$ worst-case vertex deletion time (for the prediction error $\eta_i$ defined as above).
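As an illustration of the error measure, the following minimal sketch computes the $\ell_\infty$ bound on $\eta$ described above: the maximum difference between the predicted and actual update time of an edge. The function name `linf_prediction_error` and the choice to represent an update sequence as a list whose index is the update time are assumptions made for this example, not the paper's notation.

```python
def linf_prediction_error(predicted, actual):
    """Max over edges of |predicted time - actual time|.

    Both arguments are lists of edges (hashable items); the index of an
    edge in a list is taken as its update time. Assumes both lists
    contain the same edges, each exactly once (an assumption of this sketch).
    """
    if len(predicted) != len(actual) or set(predicted) != set(actual):
        raise ValueError("sequences must contain the same edges")
    predicted_time = {edge: t for t, edge in enumerate(predicted)}
    return max(
        (abs(predicted_time[edge] - t) for t, edge in enumerate(actual)),
        default=0,
    )


# Hypothetical data: the first two insertions arrive in swapped order.
predicted = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
actual = [("b", "c"), ("a", "b"), ("c", "d"), ("d", "e")]
error = linf_prediction_error(predicted, actual)
```

Here each of the two swapped edges is off by one position, so the bound is 1; a perfectly predicted sequence gives 0.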
Negative control variables are sometimes used in non-experimental studies to detect the presence of confounding by hidden factors. A negative control outcome (NCO) is an outcome that is influenced by unobserved confounders of the exposure effects on the outcome in view, but is not causally impacted by the exposure. Tchetgen Tchetgen (2013) introduced the Control Outcome Calibration Approach (COCA) as a formal NCO counterfactual method to detect and correct for residual confounding bias. For identification, COCA treats the NCO as an error-prone proxy of the treatment-free counterfactual outcome of interest, and involves regressing the NCO on the treatment-free counterfactual, together with a rank-preserving structural model which assumes a constant individual-level causal effect. In this work, we establish nonparametric COCA identification for the average causal effect for the treated, without requiring rank-preservation, therefore accommodating unrestricted effect heterogeneity across units. This nonparametric identification result has important practical implications, as it provides single-proxy confounding control, in contrast to recently proposed proximal causal inference, which relies for identification on a pair of confounding proxies. For COCA estimation we propose three separate strategies: (i) an extended propensity score approach, (ii) an outcome bridge function approach, and (iii) a doubly robust approach. Finally, we illustrate the proposed methods in an application evaluating the causal impact of a Zika virus outbreak on birth rate in Brazil.
The fidelity of relighting is bounded by both geometry and appearance representations. For geometry, both mesh and volumetric approaches have difficulty modeling intricate structures like 3D hair geometry. For appearance, existing relighting models are limited in fidelity and often too slow to render in real time with high-resolution continuous environments. In this work, we present Relightable Gaussian Codec Avatars, a method to build high-fidelity relightable head avatars that can be animated to generate novel expressions. Our geometry model based on 3D Gaussians can capture 3D-consistent sub-millimeter details such as hair strands and pores on dynamic face sequences. To support diverse materials of human heads such as the eyes, skin, and hair in a unified manner, we present a novel relightable appearance model based on learnable radiance transfer. Together with global illumination-aware spherical harmonics for the diffuse components, we achieve real-time relighting with spatially all-frequency reflections using spherical Gaussians. This appearance model can be efficiently relit under both point light and continuous illumination. We further improve the fidelity of eye reflections and enable explicit gaze control by introducing relightable explicit eye models. Our method outperforms existing approaches without compromising real-time performance. We also demonstrate real-time relighting of avatars on a tethered consumer VR headset, showcasing the efficiency and fidelity of our avatars.
We present a framework for constraining the automatic sequential generation of equations to obey the rules of dimensional analysis by construction. Combining this approach with reinforcement learning, we built $\Phi$-SO, a Physical Symbolic Optimization method for recovering analytical functions from physical data leveraging units constraints. Our symbolic regression algorithm achieves state-of-the-art results in contexts in which variables and constants have known physical units, outperforming all other methods on SRBench's Feynman benchmark in the presence of noise (exceeding 0.1%) and showing resilience even in the presence of significant (10%) levels of noise.
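To make the idea of generating equations that obey dimensional analysis more concrete, here is a minimal sketch of the kind of rule involved: units are tracked as exponent vectors, multiplication adds exponents, and a symbol may only be added to a term with identical units. This is not the $\Phi$-SO implementation; the `Units` class, the `allowed_addends` helper, and the restriction to length, mass, and time are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Units:
    """Exponents of (length, mass, time); a hypothetical representation."""
    length: int = 0
    mass: int = 0
    time: int = 0

    def __mul__(self, other):
        # Multiplying quantities adds their unit exponents.
        return Units(self.length + other.length,
                     self.mass + other.mass,
                     self.time + other.time)

    def __truediv__(self, other):
        # Dividing quantities subtracts their unit exponents.
        return Units(self.length - other.length,
                     self.mass - other.mass,
                     self.time - other.time)


def allowed_addends(target, candidates):
    """Names of candidate symbols whose units allow adding them to `target`."""
    return [name for name, units in candidates.items() if units == target]


LENGTH = Units(length=1)
TIME = Units(time=1)
VELOCITY = LENGTH / TIME

candidates = {"v": VELOCITY, "x_over_t": LENGTH / TIME, "t": TIME, "x": LENGTH}
# Only symbols with velocity units may be added to a velocity term.
allowed = allowed_addends(VELOCITY, candidates)
```

In a sequential generator, a filter of this kind could be applied at each step to mask out symbols that would make the partial expression dimensionally inconsistent.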
Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions. Choosing BC or IRL for imitation depends on the quality and state-action coverage of the demonstrations, as well as additional access to the Markov decision process. Hybrid strategies that combine BC and IRL are not common, as initial policy optimization against inaccurate rewards diminishes the benefit of pretraining the policy with BC. This work derives an imitation method that captures the strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement learning setting, we show that the behavior-cloned policy can be used as both a shaped reward and a critic hypothesis space by inverting the regularized policy update. This coherency facilitates fine-tuning cloned policies using the reward estimate and additional interactions with the environment. This approach conveniently achieves imitation learning through initial behavioral cloning, followed by refinement via RL with online or offline data sources. The simplicity of the approach enables graceful scaling to high-dimensional and vision-based tasks, with stable learning and minimal hyperparameter tuning, in contrast to adversarial approaches. For the open-source implementation and simulation results, see //joemwatson.github.io/csil/.
We propose Functional Flow Matching (FFM), a function-space generative model that generalizes the recently introduced Flow Matching model to operate in infinite-dimensional spaces. Our approach works by first defining a path of probability measures that interpolates between a fixed Gaussian measure and the data distribution, followed by learning a vector field on the underlying space of functions that generates this path of measures. Our method does not rely on likelihoods or simulations, making it well-suited to the function space setting. We provide both a theoretical framework for building such models and an empirical evaluation of our techniques. We demonstrate through experiments on several real-world benchmarks that our proposed FFM method outperforms several recently proposed function-space generative models.
Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form. Separating the underlying factors of variation into variables with semantic meaning helps in learning explainable representations of data, imitating the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this article, we comprehensively review DRL from various aspects including motivations, definitions, methodologies, evaluations, applications, and model designs. We discuss works on DRL based on two well-recognized definitions, i.e., the Intuitive Definition and the Group Theory Definition. We further categorize the methodologies for DRL into four groups, i.e., Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches and Other Approaches. We also analyze principles for designing different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigation. We believe this work may provide insights for promoting DRL research in the community.
Adversarial attack is a technique for deceiving Machine Learning (ML) models, which provides a way to evaluate adversarial robustness. In practice, attack algorithms are manually selected and tuned by human experts to break an ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistaken assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms and their hyper-parameters from a candidate pool of \textbf{32 base attackers}. We design a search space where an attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for its successors. The multi-objective NSGA-II genetic algorithm is adopted for finding the strongest attack policy with minimum complexity. The experimental results show that CAA beats 10 top attackers on 11 diverse defenses with less elapsed time (\textbf{6 $\times$ faster than AutoAttack}), and achieves a new state of the art on $l_{\infty}$, $l_{2}$ and unrestricted adversarial attacks.
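The attacking-sequence idea, in which each attacker is initialized with the output of the previous one, can be sketched as follows. This is a toy illustration under stated assumptions, not CAA itself: the scalar input, the `make_step_attacker` helper, and the quadratic `loss` are all hypothetical, and the NSGA-II search over sequences is not shown.

```python
def make_step_attacker(step, x_orig, epsilon):
    """Toy attacker (hypothetical): try a step in both directions and keep
    the candidate with the highest loss inside the epsilon ball."""
    def attack(x, loss):
        candidates = [x, x + step, x - step]
        # Project every candidate onto [x_orig - epsilon, x_orig + epsilon].
        candidates = [min(max(c, x_orig - epsilon), x_orig + epsilon)
                      for c in candidates]
        return max(candidates, key=loss)
    return attack


def run_attack_sequence(attackers, x_start, loss):
    """Apply attackers in order; each starts from its predecessor's output."""
    x = x_start
    for attack in attackers:
        x = attack(x, loss)
    return x


def loss(x):
    # Stand-in for a model's loss; the attackers try to maximize it.
    return -(x - 0.3) ** 2


x_orig, epsilon = 0.0, 0.25
policy = [make_step_attacker(0.2, x_orig, epsilon),
          make_step_attacker(0.1, x_orig, epsilon)]
adversarial = run_attack_sequence(policy, x_orig, loss)
```

In this toy run the coarse attacker moves the input to 0.2 and the finer one then pushes it to the boundary of the allowed perturbation at 0.25, which neither attacker reaches from the original input on its own.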
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
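As a small illustration of what "all potential words that match a lexicon" means, the sketch below enumerates every multi-character substring of a sentence that appears in a lexicon; these spans are the word candidates a lattice would attach to the character sequence. The function name `match_lexicon_words`, the `max_len` cutoff, and the tiny lexicon are assumptions for this example, and the LSTM itself is not shown.

```python
def match_lexicon_words(chars, lexicon, max_len=4):
    """Return (start, end, word) for every substring of length 2..max_len
    that is in the lexicon; `end` is exclusive."""
    matches = []
    for start in range(len(chars)):
        last_end = min(start + max_len, len(chars))
        for end in range(start + 2, last_end + 1):
            word = chars[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


# "Nanjing Yangtze River Bridge"; the lexicon below is illustrative only.
sentence = "南京市长江大桥"
lexicon = {
    "南京",      # Nanjing
    "南京市",    # Nanjing City
    "市长",      # mayor
    "长江",      # Yangtze River
    "大桥",      # bridge
    "长江大桥",  # Yangtze River Bridge
}
lattice_words = match_lexicon_words(sentence, lexicon)
```

The result contains overlapping candidates such as "市长" (mayor) and "长江" (Yangtze River), which is exactly the ambiguity a segmentation step would have to resolve up front and a lattice model can leave open.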
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing with both large and limited training data.
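The abstract does not spell out the attention computation. As a minimal sketch, the following implements scaled dot-product attention, $\mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$, the standard form associated with the Transformer. It is a single head with no masking and no learned projections, written in plain Python for readability; the function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    queries: list of vectors of size d_k
    keys:    list of vectors of size d_k (one per value)
    values:  list of vectors of size d_v
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(d_v)])
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
# A zero query scores every key equally, so the output is the mean value.
uniform = scaled_dot_product_attention([[0.0, 0.0]], keys, values)
```

Because every query attends to all keys independently, the per-query loop could be evaluated in parallel, which is consistent with the parallelizability the abstract highlights.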