This paper presents a {\delta}-PI algorithm, based on the damped Newton method, for the H{\infty} tracking control problem of unknown continuous-time nonlinear systems. A discounted performance function and an augmented system are used to derive the tracking Hamilton-Jacobi-Isaacs (HJI) equation. The tracking HJI equation is a nonlinear partial differential equation, and traditional reinforcement learning methods for solving it are mostly based on the Newton method, which usually guarantees only local convergence and needs a good initial guess. Based on the damped Newton iteration operator equation, a generalized tracking Bellman equation is first derived. The {\delta}-PI algorithm can then seek the optimal solution of the tracking HJI equation by iteratively solving the generalized tracking Bellman equation. Both on-policy and off-policy {\delta}-PI reinforcement learning methods are provided. The off-policy {\delta}-PI algorithm is model-free and can be performed without a priori knowledge of the system dynamics. A neural network (NN)-based implementation scheme for the off-policy {\delta}-PI algorithm is presented. The suitability of the model-free {\delta}-PI algorithm is illustrated with a nonlinear system simulation.
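To make the contrast between the plain and damped Newton iterations concrete, the following is a minimal scalar sketch, not the {\delta}-PI algorithm itself, which operates on the HJI operator equation. The function name `newton_iterate`, the test function arctan, the starting point 3.0, and the fixed damping factor 0.2 are all assumptions chosen for illustration.

```python
import math

def newton_iterate(f, df, x0, delta=1.0, steps=200):
    """Iterate x <- x - delta * f(x) / f'(x).

    delta = 1 is the plain Newton method; 0 < delta < 1 is a damped step.
    Returns float('inf') if the iterates blow up.
    """
    x = x0
    for _ in range(steps):
        fx = f(x)
        if abs(fx) < 1e-12:          # residual small enough: stop
            break
        x = x - delta * fx / df(x)   # (damped) Newton update
        if not math.isfinite(x) or abs(x) > 1e12:
            return float("inf")      # treat as divergence
    return x

def atan_derivative(x):
    """Derivative of arctan, whose only root is x = 0."""
    return 1.0 / (1.0 + x * x)

# From the poor initial guess x0 = 3, plain Newton overshoots further each step,
# while the damped iteration (hypothetical delta = 0.2) settles at the root.
plain = newton_iterate(math.atan, atan_derivative, 3.0, delta=1.0)
damped = newton_iterate(math.atan, atan_derivative, 3.0, delta=0.2)
```

The fixed damping factor trades speed for a larger region of convergence: near the root the damped iteration contracts only linearly, whereas plain Newton, when it converges, does so quadratically.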
The $k$-principal component analysis ($k$-PCA) problem is a fundamental algorithmic primitive that is widely used in data analysis and dimensionality reduction applications. In statistical settings, the goal of $k$-PCA is to identify a top eigenspace of the covariance matrix of a distribution, which we only have implicit access to via samples. Motivated by these implicit settings, we analyze black-box deflation methods as a framework for designing $k$-PCA algorithms, where we model access to the unknown target matrix via a black-box $1$-PCA oracle which returns an approximate top eigenvector, under two popular notions of approximation. Despite being arguably the most natural reduction-based approach to $k$-PCA algorithm design, such black-box methods, which recursively call a $1$-PCA oracle $k$ times, were previously poorly understood. Our main contribution is significantly sharper bounds on the approximation parameter degradation of deflation methods for $k$-PCA. For a quadratic form notion of approximation we term ePCA (energy PCA), we show deflation methods suffer no parameter loss. For an alternative well-studied approximation notion we term cPCA (correlation PCA), we tightly characterize the parameter regimes where deflation methods are feasible. Moreover, we show that in all feasible regimes, $k$-cPCA deflation algorithms suffer no asymptotic parameter loss for any constant $k$. We apply our framework to obtain state-of-the-art $k$-PCA algorithms robust to dataset contamination, improving prior work both in sample complexity and approximation quality.
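The deflation framework described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the oracle is a plain power iteration on an explicitly given matrix (in the paper's setting the matrix is only accessible implicitly), deflation is taken to mean projecting onto the orthogonal complement of each returned vector, and the names `one_pca_oracle`, `deflate`, and `k_pca_by_deflation` are hypothetical.

```python
import math

def matvec(M, v):
    """Multiply the square matrix M (list of rows) by the vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def one_pca_oracle(M, iters=500):
    """Hypothetical 1-PCA oracle: power iteration from a fixed start vector."""
    d = len(M)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:              # M annihilates v: nothing left to find
            break
        v = [x / norm for x in w]
    return v

def deflate(M, u):
    """Return (I - u u^T) M (I - u u^T) for a unit vector u."""
    d = len(M)
    P = [[(1.0 if i == j else 0.0) - u[i] * u[j] for j in range(d)]
         for i in range(d)]
    PM = [[sum(P[i][k] * M[k][j] for k in range(d)) for j in range(d)]
          for i in range(d)]
    return [[sum(PM[i][k] * P[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def k_pca_by_deflation(M, k):
    """Call the 1-PCA oracle k times, deflating the matrix after each call."""
    vectors = []
    for _ in range(k):
        u = one_pca_oracle(M)
        vectors.append(u)
        M = deflate(M, u)
    return vectors

# Toy covariance with eigenvalues 5, 2, 1 along the coordinate axes.
top_two = k_pca_by_deflation(
    [[5.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]], 2)
```

Because each oracle call is only approximate, errors in earlier vectors propagate into the deflated matrix; how much the approximation parameter degrades across the $k$ calls is exactly the question the abstract addresses.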
The remarkable capability of large language models (LLMs) for in-context learning (ICL) needs to be activated by demonstration examples. Prior work has extensively explored the selection of examples for ICL, predominantly following the "select then organize" paradigm; such approaches often neglect the internal relationships between examples and suffer from an inconsistency between training and inference. In this paper, we formulate the problem as a $\textit{se}$quential $\textit{se}$lection problem and introduce $Se^2$, a sequential-aware method that leverages the LLM's feedback on varying context, aiding in capturing inter-relationships and sequential information among examples and significantly enriching the contextuality and relevance of ICL prompts. Meanwhile, we utilize beam search to seek and construct example sequences, enhancing both quality and diversity. Extensive experiments across 23 NLP tasks from 8 distinct categories illustrate that $Se^2$ markedly surpasses competitive baselines and achieves a 42% relative improvement over random selection. Further in-depth analysis shows the effectiveness of the proposed strategies, highlighting $Se^2$'s exceptional stability and adaptability across various scenarios. Our code will be released to facilitate future research.
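As a rough illustration of building example sequences with beam search, the sketch below keeps the best few partial sequences at each step and extends each by one unused candidate. It is not the $Se^2$ implementation: `beam_search_select` and `toy_score` are hypothetical names, and `toy_score` is an order-sensitive stand-in for the LLM feedback that the method actually uses.

```python
def beam_search_select(candidates, score_fn, seq_len, beam_width):
    """Build ordered example sequences of length seq_len with beam search.

    score_fn takes a tuple of examples and returns a number (higher is better).
    Returns the surviving (sequence, score) pairs, best first.
    """
    beams = [((), 0.0)]
    for _ in range(seq_len):
        expanded = []
        for seq, _ in beams:
            for c in candidates:
                if c in seq:                     # use each example at most once
                    continue
                new_seq = seq + (c,)
                expanded.append((new_seq, score_fn(new_seq)))
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]            # keep only the best partial sequences
    return beams

# Hypothetical order-sensitive score: later positions weigh more.
VALUES = {"a": 1, "b": 2, "c": 3}

def toy_score(seq):
    return sum((i + 1) * VALUES[x] for i, x in enumerate(seq))

greedy = beam_search_select(["a", "b", "c"], toy_score, seq_len=2, beam_width=1)
beamed = beam_search_select(["a", "b", "c"], toy_score, seq_len=2, beam_width=2)
```

With this toy score, a beam width of 1 commits to "c" first and ends at ("c", "b"), while a width of 2 keeps ("b",) alive long enough to find the higher-scoring ("b", "c"), which is the kind of order effect a sequential formulation is meant to capture.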
A convergent numerical method for $\alpha$-dissipative solutions of the Hunter--Saxton equation is derived. The method is based on applying a tailor-made projection operator to the initial data, and then solving exactly using the generalized method of characteristics. The projection step is the only step that introduces any approximation error. It is therefore crucial that its design ensures not only a good approximation of the initial data, but also that errors due to the energy dissipation at later times remain small. Furthermore, it is shown that the main quantity of interest, the wave profile, converges in $L^{\infty}$ for all $t \geq 0$, while a subsequence of the energy density converges weakly for almost every time.
Inferring causal structure from data is a challenging task of fundamental importance in science. Observational data are often insufficient to identify a system's causal structure uniquely. While conducting interventions (i.e., experiments) can improve the identifiability, such samples are usually challenging and expensive to obtain. Hence, experimental design approaches for causal discovery aim to minimize the number of interventions by estimating the most informative intervention target. In this work, we propose a novel Gradient-based Intervention Targeting method, abbreviated GIT, that 'trusts' the gradient estimator of a gradient-based causal discovery framework to provide signals for the intervention acquisition function. We provide extensive experiments in simulated and real-world datasets and demonstrate that GIT performs on par with competitive baselines, surpassing them in the low-data regime.
We analyze deep Neural Network emulation rates of smooth functions with point singularities in bounded, polytopal domains $\mathrm{D} \subset \mathbb{R}^d$, $d=2,3$. We prove exponential emulation rates in Sobolev spaces in terms of the number of neurons and in terms of the number of nonzero coefficients for Gevrey-regular solution classes defined in terms of weighted Sobolev scales in $\mathrm{D}$, comprising the countably-normed spaces of I.M. Babu\v{s}ka and B.Q. Guo. As an intermediate result, we prove that continuous, piecewise polynomial high order (``$p$-version'') finite elements with elementwise polynomial degree $p\in\mathbb{N}$ on arbitrary, regular, simplicial partitions of polyhedral domains $\mathrm{D} \subset \mathbb{R}^d$, $d\geq 2$ can be exactly emulated by neural networks combining ReLU and ReLU$^2$ activations. On shape-regular, simplicial partitions of polytopal domains $\mathrm{D}$, both the number of neurons and the number of nonzero parameters are proportional to the number of degrees of freedom of the finite element space, in particular for the $hp$-Finite Element Method of I.M. Babu\v{s}ka and B.Q. Guo.
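One elementary reason that ReLU and ReLU$^2$ units can reproduce polynomials exactly is that $x^2 = \mathrm{ReLU}(x)^2 + \mathrm{ReLU}(-x)^2$ and $xy = \tfrac{1}{4}\big((x+y)^2 - (x-y)^2\big)$. The sketch below checks these identities numerically; it illustrates the building blocks only and is not the paper's construction for finite element spaces. All function names are chosen here for illustration.

```python
def relu(x):
    """Standard ReLU activation."""
    return max(x, 0.0)

def relu2(x):
    """Squared ReLU activation, ReLU(x)^2."""
    return relu(x) ** 2

def identity(x):
    """x = ReLU(x) - ReLU(-x): two ReLU units reproduce the identity."""
    return relu(x) - relu(-x)

def square(x):
    """x^2 = ReLU(x)^2 + ReLU(-x)^2: two ReLU^2 units reproduce the square."""
    return relu2(x) + relu2(-x)

def product(x, y):
    """xy via the polarization identity, built from four ReLU^2 units."""
    return (square(x + y) - square(x - y)) / 4.0

# Products of linear functions give all monomials, hence all polynomials.
value = product(3.0, -2.0)   # evaluates (1 - 25) / 4
```

Since every polynomial is a finite combination of such products and linear maps, composing these small networks yields exact, rather than approximate, representations.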
Geometric matching is an important topic in computational geometry and has been extensively studied over decades. In this paper, we study a geometric-matching problem, known as geometric many-to-many matching. In this problem, the input is a set $S$ of $n$ colored points in $\mathbb{R}^d$, which implicitly defines a graph $G = (S,E(S))$ where $E(S) = \{(p,q): p,q \in S \text{ have different colors}\}$, and the goal is to compute a minimum-cost subset $E^* \subseteq E(S)$ of edges that cover all points in $S$. Here the cost of $E^*$ is the sum of the costs of all edges in $E^*$, where the cost of a single edge $e$ is the Euclidean distance (or more generally, the $L_p$-distance) between the two endpoints of $e$. Our main result is a $(1+\varepsilon)$-approximation algorithm with an optimal running time $O_\varepsilon(n \log n)$ for geometric many-to-many matching in any fixed dimension, which works under any $L_p$-norm. This is the first near-linear approximation scheme for the problem in any $d \geq 2$. Prior to this work, only the bipartite case of geometric many-to-many matching was considered in $\mathbb{R}^1$ and $\mathbb{R}^2$, and the best known approximation scheme in $\mathbb{R}^2$ takes $O_\varepsilon(n^{1.5} \cdot \mathsf{poly}(\log n))$ time.
The variational autoencoder (VAE) typically employs a standard normal prior as a regularizer for the probabilistic latent encoder. However, the Gaussian tail often decays too quickly to effectively accommodate the encoded points, failing to preserve crucial structures hidden in the data. In this paper, we explore the use of heavy-tailed models to combat over-regularization. Drawing upon insights from information geometry, we propose $t^3$VAE, a modified VAE framework that incorporates Student's t-distributions for the prior, encoder, and decoder. This results in a joint model distribution of a power form which we argue can better fit real-world datasets. We derive a new objective by reformulating the evidence lower bound as a joint optimization of the KL divergence between two statistical manifolds and replacing it with the $\gamma$-power divergence, a natural alternative for power families. $t^3$VAE demonstrates superior generation of low-density regions when trained on heavy-tailed synthetic data. Furthermore, we show that $t^3$VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
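The claim that the Gaussian tail decays too quickly can be made concrete by comparing one-dimensional densities far from the origin. The sketch below uses the standard closed-form densities; the evaluation point $x = 6$ and the choice of 3 degrees of freedom are arbitrary illustrative assumptions, not values from the paper.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def student_t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1.0) / 2.0 * math.log1p(x * x / nu))

# Six units from the origin, the Gaussian density decays like exp(-x^2 / 2),
# while the t density decays only polynomially, like |x|^-(nu + 1).
tail_ratio = student_t_pdf(6.0, 3.0) / normal_pdf(6.0)
```

The ratio is several orders of magnitude, which is why a Gaussian prior assigns almost no mass to latent codes that a heavy-tailed prior can still accommodate.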
Given the growing significance of reliable, trustworthy, and explainable machine learning, the requirement of uncertainty quantification for anomaly detection systems has become increasingly important. In this context, effectively controlling Type I error rates ($\alpha$) without compromising the statistical power ($1-\beta$) of these systems can build trust and reduce costs related to false discoveries, particularly when follow-up procedures are expensive. Conformal prediction is a promising approach for providing the corresponding statistical guarantees by calibrating a model's uncertainty. This work introduces a novel framework for anomaly detection, termed cross-conformal anomaly detection, building upon well-known cross-conformal methods designed for prediction tasks. It thereby addresses a natural research gap by extending previous works on inductive conformal anomaly detection, which rely on the split-conformal approach for model calibration. Drawing on insights from conformal prediction, we demonstrate that the derived methods for calculating cross-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal) for uncertainty-quantified anomaly detection on benchmark datasets.
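A cross-conformal $p$-value can be sketched as follows, assuming the usual fold-wise construction: each fold serves once as the calibration set for a detector fitted on the remaining folds, and the counts of calibration scores at least as large as the test score are pooled. The anomaly score used here (distance from the training mean) and the names `fit_scorer` and `cross_conformal_p_value` are hypothetical stand-ins; the paper's exact variants may differ.

```python
def fit_scorer(train):
    """Hypothetical detector: anomaly score = distance from the training mean."""
    mean = sum(train) / len(train)
    return lambda x: abs(x - mean)

def cross_conformal_p_value(data, x_test, n_folds):
    """Pool fold-wise calibration counts into a single p-value for x_test."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    count = 0
    for k, calib in enumerate(folds):
        # Fit on every fold except the k-th, which is held out for calibration.
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        score = fit_scorer(train)
        s_test = score(x_test)
        # Count calibration points that look at least as anomalous as x_test.
        count += sum(1 for x in calib if score(x) >= s_test)
    return (1 + count) / (len(data) + 1)

data = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
p_outlier = cross_conformal_p_value(data, 100.0, n_folds=3)
p_inlier = cross_conformal_p_value(data, 2.5, n_folds=3)
```

Unlike the split-conformal approach, every data point is used for both fitting and calibration, at the cost of fitting one detector per fold; a small $p$-value flags the test point as anomalous at the chosen level $\alpha$.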
This paper introduces $\infty$-Diff, a generative diffusion model defined in an infinite-dimensional Hilbert space, which can model infinite resolution data. By training on randomly sampled subsets of coordinates and denoising content only at those locations, we learn a continuous function for arbitrary resolution sampling. Unlike prior neural field-based infinite-dimensional models, which use point-wise functions requiring latent compression, our method employs non-local integral operators to map between Hilbert spaces, allowing spatial context aggregation. This is achieved with an efficient multi-scale function-space architecture that operates directly on raw sparse coordinates, coupled with a mollified diffusion process that smooths out irregularities. Through experiments on high-resolution datasets, we found that even at an $8\times$ subsampling rate, our model retains high-quality diffusion. This leads to significant run-time and memory savings, delivers samples with lower FID scores, and scales beyond the training resolution while retaining detail.
The idea of decision-aware model learning, that models should be accurate where it matters for decision-making, has gained prominence in model-based reinforcement learning. While promising theoretical results have been established, the empirical performance of algorithms leveraging a decision-aware loss has been lacking, especially in continuous control problems. In this paper, we present a study on the necessary components for decision-aware reinforcement learning models and we showcase design choices that enable well-performing algorithms. To this end, we provide a theoretical and empirical investigation into algorithmic ideas in the field. We highlight that empirical design decisions established in the MuZero line of works, most importantly the use of a latent model, are vital to achieving good performance for related algorithms. Furthermore, we show that the MuZero loss function is biased in stochastic environments and establish that this bias has practical consequences. Building on these findings, we present an overview of which decision-aware loss functions are best used in what empirical scenarios, providing actionable insights to practitioners in the field.