We present an experimental validation of a recently proposed optimization technique for reservoir computing, using an optoelectronic setup. Reservoir computing is a robust framework for signal processing applications, and the development of efficient optimization approaches remains a key challenge. The technique we address leverages solely a delayed version of the input signal to identify the optimal operational region of the reservoir, simplifying the traditionally time-consuming task of hyperparameter tuning. We verify the effectiveness of this approach on different benchmark tasks and reservoir operating conditions.
Randomized controlled trials (RCTs) serve as the cornerstone for understanding causal effects, yet extending inferences to target populations presents challenges due to effect heterogeneity and underrepresentation. Our paper addresses the critical issue of identifying and characterizing underrepresented subgroups in RCTs, proposing a novel framework for refining target populations to improve generalizability. We introduce an optimization-based approach, Rashomon Set of Optimal Trees (ROOT), to characterize underrepresented groups. ROOT optimizes the target subpopulation distribution by minimizing the variance of the target average treatment effect estimate, ensuring more precise treatment effect estimations. Notably, ROOT generates interpretable characteristics of the underrepresented population, aiding researchers in effective communication. Our approach demonstrates improved precision and interpretability compared to alternatives, as illustrated with synthetic data experiments. We apply our methodology to extend inferences from the Starting Treatment with Agonist Replacement Therapies (START) trial -- investigating the effectiveness of medication for opioid use disorder -- to the real-world population represented by the Treatment Episode Dataset: Admissions (TEDS-A). By refining target populations using ROOT, our framework offers a systematic approach to enhance decision-making accuracy and inform future trials in diverse populations.
With the growing availability of efficient tools, persistent homology is becoming a useful methodology in a variety of applications. Significant work has been devoted to implementing tools for persistent homology diagrams; however, computing representative cycles corresponding to each point in the diagram can still be inefficient. To circumvent this problem, we extend the twist algorithm of Chen and Kerber. Our extension is based on a new technique we call saving, which supplements their existing killing technique. The resulting two-pass strategy can be realized using an existing matrix reduction implementation as a black-box and improves the efficiency of computing representatives of persistent homology generators. We prove the correctness of the new approach and experimentally show its performance.
This study proposes a system designed to enumerate the process of collaborative composition among humans, using automatic music composition technology. By integrating multiple Recurrent Neural Network (RNN) models, the system provides an experience akin to collaborating with several composers, thereby fostering diverse creativity. Through dynamic adaptation to the user's creative intentions, based on feedback, the system enhances its capability to generate melodies that align with user preferences and creative needs. The system's effectiveness was evaluated through experiments with composers of varying backgrounds, revealing its potential to facilitate musical creativity and suggesting avenues for further refinement. The study underscores the importance of interaction between the composer and AI, aiming to make music composition more accessible and personalized. This system represents a step towards integrating AI into the creative process, offering a new tool for composition support and collaborative artistic exploration.
To enable large-scale and efficient deployment of artificial intelligence (AI), the combination of AI and edge computing has spawned Edge Intelligence, which leverages the computing and communication capabilities of end devices and edge servers to process data closer to where it is generated. A key technology for edge intelligence is the privacy-protecting machine learning paradigm known as Federated Learning (FL), which enables data owners to train models without having to transfer raw data to third-party servers. However, FL networks are expected to involve thousands of heterogeneous distributed devices. As a result, communication efficiency remains a key bottleneck. To reduce node failures and device exits, a Hierarchical Federated Learning (HFL) framework is proposed, where a designated cluster leader supports the data owner through intermediate model aggregation. Therefore, based on the improvement of edge server resource utilization, this paper can effectively make up for the limitation of cache capacity. In order to mitigate the impact of soft clicks on the quality of user experience (QoE), the authors model the user QoE as a comprehensive system cost. To solve the formulaic problem, the authors propose a decentralized caching algorithm with federated deep reinforcement learning (DRL) and federated learning (FL), where multiple agents learn and make decisions independently
We present AlloyInEcore, a tool for specifying metamodels with their static semantics to facilitate automated, formal reasoning on models. Software development projects require that software systems be specified in various models (e.g., requirements models, architecture models, test models, and source code). It is crucial to reason about those models to ensure the correct and complete system specifications. AlloyInEcore allows the user to specify metamodels with their static semantics, while, using the semantics, it automatically detects inconsistent models, and completes partial models. It has been evaluated on three industrial case studies in the automotive domain (//modelwriter.github.io/AlloyInEcore/).
We present an optimal transport framework for performing regression when both the covariate and the response are probability distributions on a compact Euclidean subset $\Omega\subset\mathbb{R}^d$, where $d>1$. Extending beyond compactly supported distributions, this method also applies when both the predictor and responses are Gaussian distributions on $\mathbb{R}^d$. Our approach generalizes an existing transportation-based regression model to higher dimensions. This model postulates that the conditional Fr\'echet mean of the response distribution is linked to the covariate distribution via an optimal transport map. We establish an upper bound for the rate of convergence of a plug-in estimator. We propose an iterative algorithm for computing the estimator, which is based on DC (Difference of Convex Functions) Programming. In the Gaussian case, the estimator achieves a parametric rate of convergence, and the computation of the estimator simplifies to a finite-dimensional optimization over positive definite matrices, allowing for an efficient solution. The performance of the estimator is demonstrated in a simulation study.
Many existing mechanisms to achieve differential privacy (DP) on infinite-dimensional functional summaries often involve embedding these summaries into finite-dimensional subspaces and applying traditional DP techniques. Such mechanisms generally treat each dimension uniformly and struggle with complex, structured summaries. This work introduces a novel mechanism for DP functional summary release: the Independent Component Laplace Process (ICLP) mechanism. This mechanism treats the summaries of interest as truly infinite-dimensional objects, thereby addressing several limitations of existing mechanisms. We establish the feasibility of the proposed mechanism in multiple function spaces. Several statistical estimation problems are considered, and we demonstrate one can enhance the utility of sanitized summaries by oversmoothing their non-private counterpart. Numerical experiments on synthetic and real datasets demonstrate the efficacy of the proposed mechanism.
The curse-of-dimensionality taxes computational resources heavily with exponentially increasing computational cost as the dimension increases. This poses great challenges in solving high-dimensional PDEs, as Richard E. Bellman first pointed out over 60 years ago. While there has been some recent success in solving numerically partial differential equations (PDEs) in high dimensions, such computations are prohibitively expensive, and true scaling of general nonlinear PDEs to high dimensions has never been achieved. We develop a new method of scaling up physics-informed neural networks (PINNs) to solve arbitrary high-dimensional PDEs. The new method, called Stochastic Dimension Gradient Descent (SDGD), decomposes a gradient of PDEs into pieces corresponding to different dimensions and randomly samples a subset of these dimensional pieces in each iteration of training PINNs. We prove theoretically the convergence and other desired properties of the proposed method. We demonstrate in various diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schr\"{o}dinger equations in tens of thousands of dimensions very fast on a single GPU using the PINNs mesh-free approach. Notably, we solve nonlinear PDEs with nontrivial, anisotropic, and inseparable solutions in 100,000 effective dimensions in 12 hours on a single GPU using SDGD with PINNs. Since SDGD is a general training methodology of PINNs, it can be applied to any current and future variants of PINNs to scale them up for arbitrary high-dimensional PDEs.
Deployed multimodal systems can fail in ways that evaluators did not anticipate. In order to find these failures before deployment, we introduce MultiMon, a system that automatically identifies systematic failures -- generalizable, natural-language descriptions of patterns of model failures. To uncover systematic failures, MultiMon scrapes a corpus for examples of erroneous agreement: inputs that produce the same output, but should not. It then prompts a language model (e.g., GPT-4) to find systematic patterns of failure and describe them in natural language. We use MultiMon to find 14 systematic failures (e.g., "ignores quantifiers") of the CLIP text-encoder, each comprising hundreds of distinct inputs (e.g., "a shelf with a few/many books"). Because CLIP is the backbone for most state-of-the-art multimodal systems, these inputs produce failures in Midjourney 5.1, DALL-E, VideoFusion, and others. MultiMon can also steer towards failures relevant to specific use cases, such as self-driving cars. We see MultiMon as a step towards evaluation that autonomously explores the long tail of potential system failures. Code for MULTIMON is available at //github.com/tsb0601/MultiMon.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.