This paper introduces the Lagrange Policy for Continuous Actions (LPCA), a reinforcement learning algorithm specifically designed for weakly coupled MDP problems with continuous action spaces. LPCA addresses the challenge of resource constraints dependent on continuous actions by introducing a Lagrange relaxation of the weakly coupled MDP problem within a neural network framework for Q-value computation. This approach effectively decouples the MDP, enabling efficient policy learning in resource-constrained environments. We present two variations of LPCA: LPCA-DE, which utilizes differential evolution for global optimization, and LPCA-Greedy, a method that incrementally and greadily selects actions based on Q-value gradients. Comparative analysis against other state-of-the-art techniques across various settings highlight LPCA's robustness and efficiency in managing resource allocation while maximizing rewards.
We derive a general lower bound for the generalized Hamming weights of nested matrix-product codes, with a particular emphasis on the cases with two and three constituent codes. We also provide an upper bound which is reminiscent of the bounds used for the minimum distance of matrix-product codes. When the constituent codes are two Reed-Solomon codes, we obtain an explicit formula for the generalized Hamming weights of the resulting matrix-product code. We also deal with the non-nested case for the case of two constituent codes.
This study proposes a unified theory and statistical learning approach for traffic conflict detection, addressing the long-existing call for a consistent and comprehensive methodology to evaluate the collision risk emerged in road user interactions. The proposed theory assumes a context-dependent probabilistic collision risk and frames conflict detection as estimating the risk by statistical learning from observed proximities and contextual variables. Three primary tasks are integrated: representing interaction context from selected observables, inferring proximity distributions in different contexts, and applying extreme value theory to relate conflict intensity with conflict probability. As a result, this methodology is adaptable to various road users and interaction scenarios, enhancing its applicability without the need for pre-labelled conflict data. Demonstration experiments are executed using real-world trajectory data, with the unified metric trained on lane-changing interactions on German highways and applied to near-crash events from the 100-Car Naturalistic Driving Study in the U.S. The experiments demonstrate the methodology's ability to provide effective collision warnings, generalise across different datasets and traffic environments, cover a broad range of conflicts, and deliver a long-tailed distribution of conflict intensity. This study contributes to traffic safety by offering a consistent and explainable methodology for conflict detection applicable across various scenarios. Its societal implications include enhanced safety evaluations of traffic infrastructures, more effective collision warning systems for autonomous and driving assistance systems, and a deeper understanding of road user behaviour in different traffic conditions, contributing to a potential reduction in accident rates and improving overall traffic safety.
The efficacy of interpolating via Variably Scaled Kernels (VSKs) is known to be dependent on the definition of a proper scaling function, but no numerical recipes to construct it are available. Previous works suggest that such a function should mimic the target one, but no theoretical evidence is provided. This paper fills both the gaps: it proves that a scaling function reflecting the target one may lead to enhanced approximation accuracy, and it provides a user-independent tool for learning the scaling function by means of Discontinuous Neural Networks ({\delta}NN), i.e., NNs able to deal with possible discontinuities. Numerical evidence supports our claims, as it shows that the key features of the target function can be clearly recovered in the learned scaling function.
This paper introduces a comprehensive framework to adjust a discrete test statistic for improving its hypothesis testing procedure. The adjustment minimizes the Wasserstein distance to a null-approximating continuous distribution, tackling some fundamental challenges inherent in combining statistical significances derived from discrete distributions. The related theory justifies Lancaster's mid-p and mean-value chi-squared statistics for Fisher's combination as special cases. However, in order to counter the conservative nature of Lancaster's testing procedures, we propose an updated null-approximating distribution. It is achieved by further minimizing the Wasserstein distance to the adjusted statistics within a proper distribution family. Specifically, in the context of Fisher's combination, we propose an optimal gamma distribution as a substitute for the traditionally used chi-squared distribution. This new approach yields an asymptotically consistent test that significantly improves type I error control and enhances statistical power.
This paper proposes LCFL, a novel clustering metric for evaluating clients' data distributions in federated learning. LCFL aligns with federated learning requirements, accurately assessing client-to-client variations in data distribution. It offers advantages over existing clustered federated learning methods, addressing privacy concerns, improving applicability to non-convex models, and providing more accurate classification results. LCFL does not require prior knowledge of clients' data distributions. We provide a rigorous mathematical analysis, demonstrating the correctness and feasibility of our framework. Numerical experiments with neural network instances highlight the superior performance of LCFL over baselines on several clustered federated learning benchmarks.
Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction datasets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different datasets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
Incorporating prior knowledge into pre-trained language models has proven to be effective for knowledge-driven NLP tasks, such as entity typing and relation extraction. Current pre-training procedures usually inject external knowledge into models by using knowledge masking, knowledge fusion and knowledge replacement. However, factual information contained in the input sentences have not been fully mined, and the external knowledge for injecting have not been strictly checked. As a result, the context information cannot be fully exploited and extra noise will be introduced or the amount of knowledge injected is limited. To address these issues, we propose MLRIP, which modifies the knowledge masking strategies proposed by ERNIE-Baidu, and introduce a two-stage entity replacement strategy. Extensive experiments with comprehensive analyses illustrate the superiority of MLRIP over BERT-based models in military knowledge-driven NLP tasks.
This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data to zero error? In the first part of the thesis, we will empirically study how training deep networks via stochastic gradient descent implicitly controls the networks' capacity. Subsequently, to show how this leads to better generalization, we will derive {\em data-dependent} {\em uniform-convergence-based} generalization bounds with improved dependencies on the parameter count. Uniform convergence has in fact been the most widely used tool in deep learning literature, thanks to its simplicity and generality. Given its popularity, in this thesis, we will also take a step back to identify the fundamental limits of uniform convergence as a tool to explain generalization. In particular, we will show that in some example overparameterized settings, {\em any} uniform convergence bound will provide only a vacuous generalization bound. With this realization in mind, in the last part of the thesis, we will change course and introduce an {\em empirical} technique to estimate generalization using unlabeled data. Our technique does not rely on any notion of uniform-convergece-based complexity and is remarkably precise. We will theoretically show why our technique enjoys such precision. We will conclude by discussing how future work could explore novel ways to incorporate distributional assumptions in generalization bounds (such as in the form of unlabeled data) and explore other tools to derive bounds, perhaps by modifying uniform convergence or by developing completely new tools altogether.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. As deep learning has been successfully applied in various domains, it has recently entered also the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of data used, and the overall performance achieved according to the metrics used at each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques, in respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.