Traditional brain-computer systems are complex and expensive, and emotion classification algorithms lack representations of the intrinsic relationships between different channels of electroencephalogram (EEG) signals, leaving room for improvement in accuracy. To lower the research barrier for EEG and harness the rich information embedded in multi-channel EEG, we propose and implement a simple, user-friendly brain-computer system for classifying four emotions: happiness, sorrow, sadness, and tranquility. The system fuses convolutional attention mechanisms with fully pre-activated residual blocks, termed the Attention-Convolution-based Pre-Activated Residual Network (ACPA-ResNet). In the hardware acquisition and preprocessing phase, we employ the ADS1299 integrated chip as the analog front-end and an ESP32 microcontroller for initial EEG signal processing; data is transmitted wirelessly to a PC over the UDP protocol for further preprocessing. In the emotion analysis phase, ACPA-ResNet automatically extracts and learns features from EEG signals, enabling accurate classification of emotional states from their time-frequency characteristics. ACPA-ResNet adds an attention mechanism to a residual-network backbone, adaptively assigning different weights to each channel. This allows it to focus on more meaningful EEG signals in both the spatial and channel dimensions while avoiding the vanishing- and exploding-gradient problems associated with deep network architectures. In tests on 16 subjects, the system demonstrates stable EEG signal acquisition and transmission, and the proposed network significantly enhances emotion recognition, achieving an average emotion classification accuracy of 95.1%.
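As an illustrative sketch only (not the authors' implementation, whose layer sizes and training details are not given in the abstract), a squeeze-and-excitation-style channel-attention step combined with a fully pre-activated residual update can be written in a few lines of NumPy; the bottleneck size and weight shapes here are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Assign an adaptive weight in (0, 1) to each EEG channel.

    x  : (channels, time) feature map
    w1 : (bottleneck, channels) weights of a small MLP (hypothetical sizes)
    w2 : (channels, bottleneck)
    """
    squeeze = x.mean(axis=1)                 # global average pool per channel
    hidden = np.maximum(0.0, w1 @ squeeze)   # ReLU bottleneck
    weights = sigmoid(w2 @ hidden)           # per-channel attention weights
    return x * weights[:, None]              # reweight the channels

def preact_residual(x, transform):
    """Fully pre-activated residual block: activation is applied *before*
    the transform, so the identity path x + (...) keeps gradients flowing
    through deep stacks."""
    return x + transform(np.maximum(0.0, x))
```

Because the identity path bypasses the transform entirely, very deep stacks of such blocks avoid the gradient attenuation that plain convolutional stacks suffer, which is the property the abstract appeals to.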
Computational problems concerning the orbit of a point under the action of a matrix group occur in numerous subfields of computer science, including complexity theory, program analysis, quantum computation, and automata theory. In many cases the focus extends beyond orbits proper to orbit closures under a suitable topology. Typically one starts from a group and several points and asks questions about the orbit closure of the points under the action of the group, e.g., whether two given orbit closures intersect. In this paper we consider a collection of what we call determination problems concerning groups and orbit closures. These problems begin with a given variety and seek to understand whether and how it arises either as an algebraic group or as an orbit closure. The how question asks whether the underlying group is $s$-generated, meaning it is topologically generated by $s$ matrices for a given number $s$. Among other applications, problems of this type have recently been studied in the context of synthesising loops subject to certain specified invariants on program variables. Our main result is a polynomial-space procedure that inputs a variety $V$ and a number $s$ and determines whether $V$ arises as an orbit closure of a point under an $s$-generated commutative matrix group. The main tools in our approach are rooted in structural properties of commutative algebraic matrix groups and lattice theory. We leave open the question of determining whether a variety is an orbit closure of a point under an algebraic matrix group (without the requirement of commutativity). In this regard, we note that a recent paper by Nosan et al. [NPSHW2021] gives an elementary procedure to compute the orbit closure of a point under finitely many matrices.
Speech encoders pretrained through self-supervised learning (SSL) have demonstrated remarkable performance in various downstream tasks, including Spoken Language Understanding (SLU) and Automatic Speech Recognition (ASR). For instance, fine-tuning SSL models for such tasks has shown significant potential, leading to improvements in the SOTA performance across challenging datasets. In contrast to existing research, this paper contributes by comparing the effectiveness of SSL approaches in the context of (i) the low-resource spoken Tunisian Arabic dialect and (ii) its combination with a low-resource SLU and ASR scenario, where only a few semantic annotations are available for fine-tuning. We conduct experiments using many SSL speech encoders on the TARIC-SLU dataset. We use speech encoders that were pre-trained on either monolingual or multilingual speech data. Some of them were further refined, without any in-domain or Tunisian data, through a multimodal supervised teacher-student paradigm. This study yields numerous significant findings that we discuss in this paper.
We consider a reconfigurable intelligent surface (RIS) assisted cell-free massive multiple-input multiple-output non-orthogonal multiple access (NOMA) system, where each access point (AP) serves all the users with the aid of the RIS. We model the system practically by considering imperfect instantaneous channel state information (CSI) and employing imperfect successive interference cancellation at the users' end. We first obtain the channel estimates using a linear minimum mean square error approach that accounts for spatial correlation at the RIS, and then derive a closed-form downlink spectral efficiency (SE) expression using the statistical CSI. We next formulate a joint optimization problem to maximize the sum SE of the system. We first introduce a novel successive Quadratic Transform (successive-QT) algorithm to optimize the transmit power coefficients using the concept of block optimization along with the quadratic transform, and then use the particle swarm optimization technique to design the RIS phase shifts. Note that most of the existing works on RIS-aided cell-free systems are specific instances of the general scenario studied in this work. We numerically show that i) the RIS-assisted link is more advantageous in the lower transmit-power region, where the direct link between AP and user is weak, ii) NOMA outperforms orthogonal multiple access schemes in terms of SE, and iii) the proposed joint optimization framework significantly improves the sum SE of the system.
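For context, the quadratic transform invoked by the successive-QT step (in the sense of Shen and Yu's fractional programming) decouples each ratio term of a sum-of-ratios objective. Sketched in generic form, with $A_m$ and $B_m$ standing in for the signal and interference-plus-noise terms of user $m$ (our notation, not necessarily the paper's):

```latex
% Quadratic transform for sum-of-ratios maximization:
\max_{x \in \mathcal{X}} \; \sum_{m} \frac{A_m(x)}{B_m(x)}
\;=\;
\max_{x \in \mathcal{X},\; y} \; \sum_{m}
  \Big( 2\, y_m \sqrt{A_m(x)} \;-\; y_m^{2}\, B_m(x) \Big).
% For fixed x, the inner maximization is solved in closed form by
y_m^{\star} \;=\; \frac{\sqrt{A_m(x)}}{B_m(x)},
% so alternating updates of the auxiliary variables y and the block
% variables x give a monotone block-coordinate ascent on the objective.
```

Each alternation removes the ratio structure, leaving a subproblem in $x$ that is concave whenever $\sqrt{A_m}$ is concave and $B_m$ is convex, which is what makes the block updates tractable.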
Quantum low-density parity-check (qLDPC) codes offer a promising route to scalable fault-tolerant quantum computation with constant overhead. Recent advancements have shown that qLDPC codes can outperform the quantum memory capability of surface codes even with near-term hardware. The question of how to implement logical gates fault-tolerantly for these codes is still open. We present new examples of high-rate bivariate bicycle (BB) codes with enhanced symmetry properties. These codes feature explicit nice bases of logical operators (similar to toric codes) and support fold-transversal Clifford gates without overhead. As examples, we construct $[[98,6,12]]$ and $[[162, 8, 12]]$ BB codes which admit interesting fault-tolerant Clifford gates. Our work also lays the mathematical foundations for explicit bases of logical operators and fold-transversal gates in quantum two-block and group algebra codes, which might be of independent interest.
Emergent communication, or emergent language, is the field of research which studies how human language-like communication systems emerge de novo in deep multi-agent reinforcement learning environments. The possibilities of replicating the emergence of a complex behavior like language have strong intuitive appeal, yet it is necessary to complement this with clear notions of how such research can be applicable to other fields of science, technology, and engineering. This paper comprehensively reviews the applications of emergent communication research across machine learning, natural language processing, linguistics, and cognitive science. Each application is illustrated with a description of its scope, an explication of emergent communication's unique role in addressing it, a summary of the extant literature working towards the application, and brief recommendations for near-term research directions.
We conduct a systematic study of the approximation properties of the Transformer for sequence modeling with long, sparse, and complicated memory. We investigate the mechanisms through which different components of the Transformer, such as the dot-product self-attention, positional encoding, and feed-forward layer, affect its expressive power, and we study their combined effects by establishing explicit approximation rates. Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads. These theoretical insights are validated experimentally and offer natural suggestions for alternative architectures.
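The components named above have standard forms; restated here for reference (the paper's precise parameterization may differ), a single attention head with queries $Q$, keys $K$, values $V$ and key dimension $d_k$, together with the position-wise feed-forward layer, is:

```latex
\mathrm{Attention}(Q, K, V)
  \;=\; \mathrm{softmax}\!\Big( \tfrac{Q K^{\top}}{\sqrt{d_k}} \Big) V,
\qquad
\mathrm{FFN}(x) \;=\; W_2\, \sigma(W_1 x + b_1) + b_2 .
```

The approximation-rate analysis asks how well stacks of these two maps, combined with a positional encoding, can represent targets whose dependence on the past is long-ranged and sparse.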
Crowdsourcing solves problems by assigning them to a large group of non-experts, the crowd, who contribute in their spare time. In these systems, the final answer to a question is determined by aggregating the votes obtained from the community. Easy access to community members through mobile phones and the Internet has made such systems increasingly popular. One of the issues raised in crowdsourcing is how to choose people and how to collect answers. Usually, users are separated based on their performance in a pre-test. Designing the pre-test is challenging: its questions should probe the characteristics of people that are relevant to the main questions. One way to increase the accuracy of crowdsourcing systems is to take people's cognitive characteristics and decision-making models into account when forming a crowd and when estimating the accuracy of their answers. People can estimate the correctness of their responses while making a decision; the accuracy of this estimate is a quantity called metacognitive ability. Metacognition here refers to recording a confidence level alongside each answer in order to increase the accuracy of the solution. In this paper, through both mathematical and experimental analysis, we answer the following question: can the performance of a crowdsourcing system be improved by knowing individuals' metacognition and by recording and using users' confidence in their answers?
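A minimal sketch of the idea, assuming binary ±1 answers and self-reported confidences in (0.5, 1); the log-odds weighting below is one standard aggregation choice, not necessarily the rule analyzed in the paper:

```python
import numpy as np

def weighted_majority(votes, confidences):
    """Aggregate binary votes (+1 / -1), weighting each worker by the
    log-odds of their reported confidence: w = log(c / (1 - c)).
    A well-calibrated, highly confident worker can thus outvote several
    low-confidence ones, which is where metacognitive ability pays off."""
    votes = np.asarray(votes, dtype=float)
    c = np.clip(np.asarray(confidences, dtype=float), 0.501, 0.999)
    w = np.log(c / (1.0 - c))
    return 1 if float(np.sum(w * votes)) >= 0.0 else -1
```

For example, a single dissenter reporting 0.95 confidence overrides two mildly confident workers: `weighted_majority([1, 1, -1], [0.6, 0.55, 0.95])` returns `-1`, whereas an unweighted majority vote would return `+1`. The scheme only helps if confidences are well calibrated, which is exactly what metacognitive ability measures.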
The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, a growing number of applications in power systems collect data from non-Euclidean domains and represent it as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, we present a comprehensive overview of graph neural networks (GNNs) in power systems. Specifically, several classical paradigms of GNN structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, the main issues and some research trends concerning the applications of GNNs in power systems are discussed.
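As a concrete reference point for the first paradigm listed (a generic sketch, not code from any surveyed paper), a single graph-convolutional layer in the common Kipf-Welling form H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W) can be written as:

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One graph-convolutional layer on a small dense graph.

    adj      : (n, n) symmetric adjacency matrix (e.g., bus connectivity,
               with edges as transmission lines -- an illustrative reading)
    features : (n, f_in) per-node feature matrix H
    weight   : (f_in, f_out) learnable weight matrix W
    """
    a_hat = adj + np.eye(adj.shape[0])                    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, a_norm @ features @ weight)    # ReLU activation
```

Each layer mixes every node's features with those of its neighbors under the normalized adjacency, so stacking k layers propagates information k hops across the grid graph.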
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.