Individual modules of programmable matter participate in their system's collective behavior by expending energy to perform actions. However, not all modules may have access to the external energy source powering the system, necessitating a local and distributed strategy for supplying energy to modules. In this work, we present a general energy distribution framework for the canonical amoebot model of programmable matter that transforms energy-agnostic algorithms into energy-constrained ones with equivalent behavior and an $\mathcal{O}(n^2)$-round runtime overhead -- even under an unfair adversary -- provided the original algorithms satisfy certain conventions. We then prove that existing amoebot algorithms for leader election (ICDCN 2023) and shape formation (Distributed Computing, 2023) are compatible with this framework and show simulations of their energy-constrained counterparts, demonstrating how other unfair algorithms can be generalized to the energy-constrained setting with relatively little effort. Finally, we show that our energy distribution framework can be composed with the concurrency control framework for amoebot algorithms (Distributed Computing, 2023), allowing algorithm designers to focus on the simpler energy-agnostic, sequential setting but gain the general applicability of energy-constrained, asynchronous correctness.
Generative diffusion models can serve as a prior which ensures that solutions of image restoration systems adhere to the manifold of natural images. However, for restoring facial images, a personalized prior is necessary to accurately represent and reconstruct unique facial features of a given individual. In this paper, we propose a simple, yet effective, method for personalized restoration, called Dual-Pivot Tuning, a two-stage approach that personalizes a blind restoration system while maintaining the integrity of the general prior and the distinct role of each component. Our key observation is that for optimal personalization, the generative model should be tuned around a fixed text pivot, while the guiding network should be tuned in a generic (non-personalized) manner, using the personalized generative model as a fixed ``pivot''. This approach ensures that personalization does not interfere with the restoration process, resulting in a natural appearance with high fidelity to the person's identity and the attributes of the degraded image. We evaluated our approach both qualitatively and quantitatively through extensive experiments with images of widely recognized individuals, comparing it against relevant baselines. Surprisingly, we found that our personalized prior not only achieves higher fidelity to the person's identity, but also outperforms state-of-the-art generic priors in terms of general image quality. Project webpage: //personalized-restoration.github.io
Drowsy driving is a major contributor to traffic accidents, and drowsy driving detection systems have been shown to significantly reduce the occurrence of such accidents. Despite the development of numerous drowsy driving detection algorithms, many of them impose specific prerequisites such as the availability of complete facial images, optimal lighting conditions, and the use of RGB images. In our study, we introduce a novel approach called the Multi-Attention Fusion Drowsy Driving Detection Model (MAF). MAF aims to significantly enhance classification performance, especially in scenarios involving partial facial occlusion and low lighting conditions. It accomplishes this by capitalizing on the local feature extraction capabilities provided by multi-attention fusion, thereby enhancing the algorithm's overall robustness. To enhance our dataset, we collected real-world data that includes both occluded and unoccluded faces captured under nighttime and daytime lighting conditions. We conducted a comprehensive series of experiments using both publicly available datasets and our self-built data. The results of these experiments demonstrate that our proposed model achieves a driver drowsiness detection accuracy of 96.8%.
We consider estimation of a functional parameter of a realistically modeled data distribution based on independent and identically distributed observations. Suppose that the true function is defined as the minimizer of the expectation of a specified loss function over its parameter space. Suppose further that $J$ estimators of the true function are provided, which we view as a data-adaptive coordinate transformation for the true function. For any $J$-dimensional real-valued cadlag function with finite sectional variation norm, we define a candidate ensemble estimator as the mapping from the data into the composition of the cadlag function and the $J$ estimated functions. Using $V$-fold cross-validation, we define the cross-validated empirical risk of each cadlag-function-specific ensemble estimator. We then define the Meta Highly Adaptive Lasso Minimum Loss Estimator (M-HAL-MLE) as the cadlag function that minimizes this cross-validated empirical risk over all cadlag functions with a uniform bound on the sectional variation norm. For each of the $V$ training samples, this yields a composition of the M-HAL-MLE ensemble and the $J$ estimated functions trained on the training sample. We can estimate the true function with the average of these $V$ estimated functions, which we call the M-HAL super-learner. The M-HAL super-learner converges to the oracle estimator at a rate $n^{-2/3}$ (up to a $\log n$ factor) w.r.t. excess risk, where the oracle estimator minimizes the excess risk among all considered ensembles. The excess risk of the oracle estimator relative to the true function is generally second order. Under weak conditions on the $J$ candidate estimators, target features of the undersmoothed M-HAL super-learner are asymptotically linear estimators of the corresponding target features of the true function, with influence curve either the efficient influence curve or, potentially, a super-efficient influence curve.
We investigate the problem of embedding multiplex graphs, that is, graphs in which nodes interact through multiple types of relations (dimensions). In recent years, several methods have been developed to address this problem. However, the need for more effective and specialized approaches grows with the production of graph data with diverse characteristics. In particular, real-world multiplex graphs may exhibit a high number of dimensions, making it difficult to construct a single consensus representation. Furthermore, important information can be hidden in complex latent structures scattered across multiple dimensions. To address these issues, we propose HMGE, a novel embedding method based on hierarchical aggregation for high-dimensional multiplex graphs. Hierarchical aggregation consists of learning a hierarchical combination of the graph dimensions and refining the embeddings at each hierarchy level. Non-linear combinations are computed from previous ones, thus uncovering complex information and latent structures hidden in the multiplex graph dimensions. Moreover, we leverage mutual information maximization between local patches and global summaries to train the model without supervision. This allows the model to capture globally relevant information present in diverse locations of the graph. Detailed experiments on synthetic and real-world data illustrate the suitability of our approach to downstream supervised tasks, including link prediction and node classification.
We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits, whose size is bounded by the planning horizon and can be automatically adapted according to the underlying domain without any prior domain knowledge beyond a generative model. We empirically test SYMBOL in four large POMDP benchmark problems to demonstrate its effectiveness and robustness w.r.t. the choice of hyperparameters and evaluate its adaptive memory consumption. We also compare its performance with other open-loop planning algorithms and POMCP.
Continuously observed event occurrences often exhibit self- and mutually-exciting effects, which can be well modeled using temporal point processes. Beyond that, these event dynamics may also change over time, with certain periodic trends. We propose a novel variational auto-encoder to capture such a mixture of temporal dynamics. More specifically, the whole time interval of the input sequence is partitioned into a set of sub-intervals. The event dynamics are assumed to be stationary within each sub-interval, but may change across sub-intervals. In particular, we use a sequential latent variable model to learn a dependency graph between the observed dimensions for each sub-interval. The model predicts future event times by using the learned dependency graph to remove the noncontributing influences of past events. As a result, the proposed model achieves higher accuracy in predicting inter-event times and event types on several real-world event sequences than existing state-of-the-art neural point processes.
Few-shot prompting elicits the remarkable abilities of large language models by equipping them with a few demonstration examples in the input. However, the traditional method of providing large language models with all demonstration input-output pairs at once may not effectively guide large language models to learn the specific input-output mapping relationship. In this paper, inspired by the regulatory and supportive role of metacognition in students' learning, we propose a novel metacognition-enhanced few-shot prompting, which guides large language models to reflect on their thought processes to comprehensively learn the given demonstration examples. Furthermore, considering that positive reinforcement can improve students' learning motivation, we introduce positive reinforcement into our metacognition-enhanced few-shot prompting to promote the few-shot learning of large language models by providing response-based positive feedback. The experimental results on two real-world datasets show that our metacognition-enhanced few-shot prompting with positive reinforcement surpasses traditional few-shot prompting in classification accuracy and macro F1.
This paper develops techniques for reconstructing high-dimensional datasets given each bivariate projection, as would be found in a matrix scatterplot. A graph-based solution involving clique-finding is introduced, which provides a set of possible rows that might make up the original dataset. Complications are discussed, including cases where phantom cliques are found, as well as cases where an exact solution is impossible. Additional methods are shown, some that fully deduce rows and others that rank some possibilities as more likely than others. Results show that these methods are highly successful in recreating a significant portion of the original dataset in many cases, for both randomly generated and real-world datasets, with the factors leading to a greater rate of failure being lower dimension, higher n, and lower interval.
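The clique-based idea can be illustrated with a minimal sketch, under assumptions that are not from the paper: the function names `bivariate_projections` and `candidate_rows` are hypothetical, each projection is treated as a set of points (so duplicate rows are indistinguishable), and a brute-force filter over all value combinations stands in for a dedicated clique-finding algorithm. A candidate row corresponds to a clique with one (dimension, value) node per dimension, where two nodes are adjacent when their value pair appears in the matching projection.

```python
from itertools import combinations, product

def bivariate_projections(rows):
    """Return {(i, j): set of (row[i], row[j])} for all i < j,
    i.e. the point sets a scatterplot matrix would display."""
    d = len(rows[0])
    return {(i, j): {(r[i], r[j]) for r in rows}
            for i, j in combinations(range(d), 2)}

def candidate_rows(projections, d):
    """Enumerate rows whose every coordinate pair occurs in the
    corresponding projection (hypothetical brute-force stand-in
    for clique-finding; exponential in d, toy sizes only)."""
    # Collect the values observed in each dimension.
    values = [set() for _ in range(d)]
    for (i, j), points in projections.items():
        for x, y in points:
            values[i].add(x)
            values[j].add(y)
    pairs = list(combinations(range(d), 2))
    # Keep only the combinations that are consistent with all projections.
    return sorted(
        row for row in product(*(sorted(v) for v in values))
        if all((row[i], row[j]) in projections[(i, j)] for i, j in pairs)
    )

# Three original rows; the projections also admit a fourth, phantom row.
original = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
candidates = candidate_rows(bivariate_projections(original), d=3)
phantoms = [row for row in candidates if row not in original]
```

In this toy dataset the row (0, 0, 0) is consistent with every projection even though it never occurs in the data, which is one way a phantom clique can arise.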
This tutorial aims to establish connections among polynomial modular multiplication over a ring, circular convolution, and the discrete Fourier transform (DFT). The main goal is to extend the well-known theory of the DFT in signal processing (SP) to other applications involving polynomials in a ring, such as homomorphic encryption (HE). HE allows any third party to operate on encrypted data without decrypting it in advance. Since most HE schemes are constructed from the ring learning with errors (R-LWE) problem, efficient implementation of polynomial modular multiplication becomes critical. Any improvement in the execution of these building blocks would have significant consequences for the overall performance of HE. This lecture note describes three approaches to implementing long polynomial modular multiplication using the number theoretic transform (NTT): zero-padded convolution; convolution without zero-padding, also referred to as negative wrapped convolution (NWC); and low-complexity NWC (LC-NWC).
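As a rough illustration of negative wrapped convolution, the sketch below multiplies two polynomials modulo $x^n + 1$ and a prime $q$ by weighting the coefficients with powers of a $2n$-th root of unity $\psi$, so that an ordinary length-$n$ circular convolution yields the negacyclic product. The toy parameters ($n = 4$, $q = 17$, $\psi = 2$), the function names, and the naive $\mathcal{O}(n^2)$ transform used in place of a fast butterfly NTT are all assumptions chosen for readability, not the tutorial's implementation.

```python
def ntt(vec, root, q):
    """Naive O(n^2) transform: X[k] = sum_j vec[j] * root^(j*k) mod q."""
    n = len(vec)
    return [sum(vec[j] * pow(root, j * k, q) for j in range(n)) % q
            for k in range(n)]

def nwc_multiply(a, b, psi, q):
    """Compute a(x) * b(x) mod (x^n + 1, q), assuming psi^n = -1 mod q."""
    n = len(a)
    omega = psi * psi % q        # primitive n-th root of unity
    psi_inv = pow(psi, -1, q)    # modular inverse (Python 3.8+)
    n_inv = pow(n, -1, q)
    # Pre-weight coefficient i by psi^i, then transform.
    A = ntt([a[i] * pow(psi, i, q) % q for i in range(n)], omega, q)
    B = ntt([b[i] * pow(psi, i, q) % q for i in range(n)], omega, q)
    # Pointwise product in the transform domain.
    C = [x * y % q for x, y in zip(A, B)]
    # Inverse transform, scaled by n^-1, then undo the weighting.
    c_weighted = [n_inv * v % q for v in ntt(C, pow(omega, -1, q), q)]
    return [c_weighted[i] * pow(psi_inv, i, q) % q for i in range(n)]

def schoolbook_negacyclic(a, b, q):
    """Reference O(n^2) product with the wrap-around rule x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % q
            else:
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % q
    return c

# Toy parameters: 2^4 = 16 = -1 (mod 17), so psi = 2 is a primitive 8th root.
Q, PSI = 17, 2
product_nwc = nwc_multiply([1, 2, 3, 4], [5, 6, 7, 8], PSI, Q)
product_ref = schoolbook_negacyclic([1, 2, 3, 4], [5, 6, 7, 8], Q)
```

Because the weighting absorbs the sign flip of the wrap-around terms, no zero-padding to length $2n$ is needed; the two results above should agree coefficient by coefficient.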
Multi-relation Question Answering is a challenging task because it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.