In this paper, we introduce a new class of codes, called weighted parity-check codes, where each parity-check bit has a weight that indicates its likelihood to be one (instead of fixing each parity-check bit to be zero). It is applicable to a wide range of settings, e.g. asymmetric channels, channels with state and/or cost constraints, and the Wyner-Ziv problem, and can provably achieve the capacity. For the channels with state (Gelfand-Pinsker) setting, the proposed coding scheme has two advantages compared to the nested linear code. First, it achieves the capacity of any channel with state (e.g. asymmetric channels). Second, simulation results show that the proposed code achieves a smaller error rate compared to the nested linear code. We also discuss a sparse construction where the belief propagation algorithm can be applied to improve the coding efficiency.
Quantum machine learning has become an area of growing interest but has certain theoretical and hardware-specific limitations. Notably, the problem of vanishing gradients, or barren plateaus, renders training impossible for circuits with high qubit counts, imposing a limit on the number of qubits that data scientists can use for solving problems. Independently, angle-embedded supervised quantum neural networks were shown to produce truncated Fourier series with a degree directly dependent on two factors: the depth of the encoding and the number of parallel qubits the encoding is applied to. The degree of the Fourier series limits the model expressivity. This work introduces two new architectures whose Fourier degrees grow exponentially: the sequential and parallel exponential quantum machine learning architectures. This is done by efficiently using the available Hilbert space when encoding, increasing the expressivity of the quantum encoding. The exponential growth therefore allows staying at the low-qubit limit to create highly expressive circuits that avoid barren plateaus. Practically, the parallel exponential architecture was shown to outperform the existing linear architectures by reducing their final mean square error value by up to 44.7% in a one-dimensional test problem. Furthermore, the feasibility of this technique was also shown on a trapped ion quantum processing unit.
We propose a simple and efficient approach to generate prediction intervals (PIs) for approximated and forecasted trends. Our method leverages a weighted asymmetric loss function to estimate the lower and upper bounds of the PI, with the weights determined by its coverage probability. We provide a concise mathematical proof of the method, show how it can be extended to derive PIs for parametrised functions and discuss its effectiveness when training deep neural networks. The presented tests of the method on a real-world forecasting task using a neural network-based model show that it can produce reliable PIs in complex machine learning scenarios.
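As an illustration of how a weighted asymmetric loss can yield interval bounds, the sketch below uses the standard pinball (quantile) loss. This is an assumption: the abstract does not state the exact loss, so the function names, the toy data, and the grid search are all hypothetical stand-ins rather than the authors' method. For a central interval with coverage c, the lower bound is fitted with weight (1 - c)/2 and the upper bound with weight (1 + c)/2.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean asymmetric (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def interval_quantile_levels(coverage):
    """Quantile levels of a central interval with the given coverage."""
    alpha = 1.0 - coverage
    return alpha / 2.0, 1.0 - alpha / 2.0


def best_constant(y_true, q, candidates):
    """Constant prediction that minimizes the pinball loss (grid search)."""
    n = len(y_true)
    return min(candidates, key=lambda c: pinball_loss(y_true, [c] * n, q))


# Toy sample (hypothetical): the values 1, 2, ..., 100.
data = [float(v) for v in range(1, 101)]
lo_q, hi_q = interval_quantile_levels(0.9)

# Candidate bounds on a grid 0.0, 0.5, ..., 101.0.
grid = [v / 2.0 for v in range(0, 203)]
lower = best_constant(data, lo_q, grid)
upper = best_constant(data, hi_q, grid)

# Fraction of the sample inside [lower, upper]; should be close to 0.9.
covered = sum(lower <= y <= upper for y in data) / len(data)
```

In a neural-network setting the same loss would be applied to two model outputs (one per bound) instead of a grid of constants; the constant case is used here only to keep the sketch self-contained.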
The target sensing/localization performance is fundamentally limited by the line-of-sight link and severe signal attenuation over long distances. This paper considers a challenging scenario where the direct link between the base station (BS) and the target is blocked due to the surrounding blockages and leverages the intelligent reflecting surface (IRS) with some active sensors, termed as \textit{semi-passive IRS}, for localization. To be specific, the active sensors receive echo signals reflected by the target and apply signal processing techniques to estimate the target location. We consider the joint time-of-arrival (ToA) and direction-of-arrival (DoA) estimation for localization and derive the corresponding Cram\'{e}r-Rao bound (CRB), and then a simple ToA/DoA estimator without iteration is proposed. In particular, the relationships of the CRB for ToA/DoA with the number of frames for IRS beam adjustments, number of IRS reflecting elements, and number of sensors are theoretically analyzed and demystified. Simulation results show that the proposed semi-passive IRS architecture provides sub-meter level positioning accuracy even over a long localization range from the BS to the target and also demonstrate a significant localization accuracy improvement compared to the fully passive IRS architecture.
Boolean satisfiability (SAT) is a fundamental NP-complete problem with many applications, including automated planning and scheduling. To solve large instances, SAT solvers have to rely on heuristics, e.g., choosing a branching variable in DPLL and CDCL solvers. Such heuristics can be improved with machine learning (ML) models; they can reduce the number of steps but usually hinder the running time because useful models are relatively large and slow. We suggest the strategy of making a few initial steps with a trained ML model and then releasing control to classical heuristics; this simplifies cold start for SAT solving and can decrease both the number of steps and overall runtime, but requires a separate decision of when to release control to the solver. Moreover, we introduce a modification of Graph-Q-SAT tailored to SAT problems converted from other domains, e.g., open shop scheduling problems. We validate the feasibility of our approach with random and industrial SAT problems.
We study how ambient energy harvesting may be used as an attack vector in the battery-less Internet of Things (IoT). Battery-less IoT devices rely on ambient energy harvesting and are employed in a multitude of applications, including safety-critical ones such as biomedical implants. Due to scarce energy intakes and limited energy buffers, their executions become intermittent, alternating periods of active operation with periods of recharging energy buffers. Through an independent exploratory study and a follow-up systematic analysis, we demonstrate that by exerting limited control on ambient energy one can create situations of livelock, denial of service, and priority inversion, without physical device access. We call these situations energy attacks. Using concepts of approximate intermittent computing and machine learning, we design a technique that can detect energy attacks with 92%+ accuracy, that is, up to 37% better than the baselines, and with up to one fifth of their energy overhead. Crucially, by design, our technique does not cause any additional energy failure compared to the regular intermittent processing. We conclude with directions to inspire defense techniques and a discussion on the feasibility of energy attacks.
This paper considers the Cauchy problem for the nonlinear dynamic string equation of Kirchhoff-type with time-varying coefficients. The objective of this work is to develop a temporal discretization algorithm capable of approximating a solution to this initial-boundary value problem. To this end, a symmetric three-layer semi-discrete scheme is employed with respect to the temporal variable, wherein the value of a nonlinear term is evaluated at the middle node point. This approach enables the numerical solutions per temporal step to be obtained by inverting the linear operators, yielding a system of second-order linear ordinary differential equations. Local convergence of the proposed scheme is established, and it achieves quadratic convergence with respect to the time step size on the local temporal interval. We have conducted several numerical experiments using the proposed algorithm for various test problems to validate its performance. The obtained numerical results are in accordance with the theoretical findings.
We propose an unsupervised deep learning-based decoding scheme that enables one-shot decoding of polar codes. In the proposed scheme, rather than using the information bit vectors as labels for training the neural network (NN) through supervised learning as the conventional scheme does, the NN is trained to function as a bounded distance decoder by leveraging the generator matrix of polar codes through self-supervised learning. This approach eliminates the reliance on predefined labels, making it possible to train directly on the actual data within communication systems and thereby enhancing the applicability. Furthermore, computer simulations demonstrate that (i) the bit error rate (BER) and block error rate (BLER) performances of the proposed scheme can approach those of the maximum a posteriori (MAP) decoder for very short packets and (ii) the proposed NN decoder exhibits much better generalization ability than the conventional one.
The estimation of parameter standard errors for semi-variogram models is challenging, given the two-step process required to fit a parametric model to spatially correlated data. Motivated by an application in social epidemiology, we focus on exponential semi-variogram models fitted to data with between 500 and 2000 observations and little control over the sampling design. Previously proposed methods for the estimation of standard errors cannot be applied in this context. Approximate closed-form solutions based on generalized least squares are too costly in terms of memory. The generalized bootstrap proposed by Olea and Pardo-Ig\'uzquiza is nonetheless applicable with weighted instead of generalized least squares. However, the resulting standard error estimates are hugely biased and imprecise. Therefore, we propose a filtering method added to the generalized bootstrap. The new development is presented and evaluated with a simulation study which shows that the generalized bootstrap with check-based filtering leads to massively improved results compared to the quantile-based filter method and previously developed approaches. We provide a case study using birthweight data.
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs. First, we develop an asymmetric encoder-decoder architecture, with an encoder that operates only on the visible subset of patches (without mask tokens), along with a lightweight decoder that reconstructs the original image from the latent representation and mask tokens. Second, we find that masking a high proportion of the input image, e.g., 75%, yields a nontrivial and meaningful self-supervisory task. Coupling these two designs enables us to train large models efficiently and effectively: we accelerate training (by 3x or more) and improve accuracy. Our scalable approach allows for learning high-capacity models that generalize well: e.g., a vanilla ViT-Huge model achieves the best accuracy (87.8%) among methods that use only ImageNet-1K data. Transfer performance in downstream tasks outperforms supervised pre-training and shows promising scaling behavior.
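The random masking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `random_mask` is hypothetical, and the 224x224 image size and 16x16 patch size are assumed typical ViT settings that the abstract does not state. Only the 75% masking ratio comes from the text.

```python
import random


def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into visible and masked sets, uniformly at random."""
    num_masked = int(round(num_patches * mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed setup: a 224x224 image cut into 16x16 patches gives a 14x14 grid.
grid_side = 224 // 16
num_patches = grid_side * grid_side

# Mask 75% of the patches; the encoder would only receive the visible ones.
visible, masked = random_mask(num_patches, 0.75, random.Random(0))
```

Under these assumptions the encoder processes 49 of the 196 patches, which is where the training speed-up of the asymmetric design comes from; the lightweight decoder then receives the encoded visible patches together with mask tokens at the masked positions.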
Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it is fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation which means they cannot be applied directly for multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.