Block orthogonal sparse superposition (BOSS) codes are a class of joint coded modulation methods that can closely approach the finite-blocklength capacity with a low-complexity decoder at a few coding rates under Gaussian channels. Under fading channels, however, the code performance degrades considerably because coded symbols experience different channel fading effects. In this paper, we put forth novel joint demodulation and decoding methods for BOSS codes under fading channels. For a fast fading channel, we present a minimum mean square error approximate maximum a posteriori (MMSE-A-MAP) algorithm for joint demodulation and decoding when channel state information is available at the receiver (CSIR). We also propose a joint demodulation and decoding method that does not use CSIR for a block fading channel scenario. We refer to this as the non-coherent sphere decoding (NSD) algorithm. Simulation results demonstrate that BOSS codes with MMSE-A-MAP decoding outperform CRC-aided polar codes, while NSD decoding achieves performance comparable to quasi-maximum likelihood decoding with significantly reduced complexity. Both decoding algorithms are suitable for parallelization, satisfying low-latency constraints. Additionally, real-time simulations on a software-defined radio testbed validate the feasibility of using BOSS codes for low-power transmission.
Cross-encoder (CE) models, which compute similarity by jointly encoding a query-item pair, perform better than embedding-based models (dual-encoders) at estimating query-item relevance. Existing approaches perform k-NN search with CE by approximating the CE similarity with a vector embedding space fit either with dual-encoders (DE) or CUR matrix factorization. DE-based retrieve-and-rerank approaches suffer from poor recall on new domains, and the retrieval with DE is decoupled from the CE. While CUR-based approaches can be more accurate than the DE-based approach, they require a prohibitively large number of CE calls to compute item embeddings, making them impractical for deployment at scale. In this paper, we address these shortcomings with our proposed sparse-matrix factorization based method that efficiently computes latent query and item embeddings to approximate CE scores and performs k-NN search with the approximate CE similarity. We compute item embeddings offline by factorizing a sparse matrix containing query-item CE scores for a set of train queries. Our method produces a high-quality approximation while requiring only a fraction of the CE calls needed by CUR-based methods, and allows for leveraging DE to initialize the embedding space while avoiding compute- and resource-intensive finetuning of DE via distillation. At test time, the item embeddings remain fixed and retrieval occurs over rounds, alternating between a) estimating the test query embedding by minimizing error in approximating CE scores of items retrieved thus far, and b) using the updated test query embedding for retrieving more items. Our k-NN search method improves recall by up to 5% (k=1) and 54% (k=100) over DE-based approaches. Additionally, our indexing approach achieves a speedup of up to 100x over CUR-based and 5x over DE distillation methods, while matching or improving k-NN search recall over baselines.
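The alternating test-time procedure in steps a) and b) can be illustrated with a small sketch. This is not the authors' implementation: the function names (`solve_least_squares`, `adaptive_retrieve`), the random first round, the equal split of the CE-call budget across rounds, and the tiny ridge term are all assumptions made for illustration. Item embeddings are taken as given and fixed, and `ce_score` stands in for an expensive cross-encoder call.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_least_squares(rows, targets, ridge=1e-8):
    """Return q minimizing sum_i (q.rows[i] - targets[i])^2 + ridge*|q|^2.

    Solves the normal equations by Gauss-Jordan elimination (assumed solver,
    chosen only to keep the sketch dependency-free).
    """
    d = len(rows[0])
    # Augmented matrix [V^T V + ridge*I | V^T y].
    aug = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(d)]
           + [sum(r[i] * t for r, t in zip(rows, targets))]
           for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(d):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][d] / aug[i][i] for i in range(d)]

def adaptive_retrieve(item_vecs, ce_score, budget, rounds, rng):
    """Alternate between fitting the query embedding and retrieving more items."""
    n = len(item_vecs)
    per_round = budget // rounds
    # Round 0 (assumption): no query embedding yet, so start from random items.
    retrieved = rng.sample(range(n), per_round)
    scores = [ce_score(i) for i in retrieved]
    for _ in range(rounds - 1):
        # a) fit the query embedding to the CE scores observed so far
        q = solve_least_squares([item_vecs[i] for i in retrieved], scores)
        # b) use it to pick the next batch among items not yet scored
        seen = set(retrieved)
        candidates = sorted((i for i in range(n) if i not in seen),
                            key=lambda i: dot(q, item_vecs[i]), reverse=True)
        new = candidates[:per_round]
        retrieved += new
        scores += [ce_score(i) for i in new]
    # Final ranking uses the exact CE scores of everything retrieved.
    return sorted(zip(retrieved, scores), key=lambda t: t[1], reverse=True)
```

Only `budget` cross-encoder calls are made in total, and each round's fit reuses all scores collected so far, which is the coupling between retrieval and the CE that the abstract describes.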
Combining model-based and model-free reinforcement learning approaches, this paper proposes and analyzes an $\epsilon$-policy gradient algorithm for the online pricing learning task. The algorithm extends the $\epsilon$-greedy algorithm by replacing greedy exploitation with a gradient descent step and facilitates learning via model inference. We optimize the regret of the proposed algorithm by quantifying the exploration cost in terms of the exploration probability $\epsilon$ and the exploitation cost in terms of the gradient descent optimization and gradient estimation errors. The algorithm achieves an expected regret of order $\mathcal{O}(\sqrt{T})$ (up to a logarithmic factor) over $T$ trials.
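A toy sketch may help make the explore-or-gradient-step structure concrete. It is not the paper's algorithm: the linear demand model $d(p) = a - bp + \text{noise}$, the constant exploration probability, the fixed step size, the price clipping, and the function name `epsilon_policy_gradient` are all assumptions chosen for illustration. With probability $\epsilon$ a random price is tried; otherwise the policy price takes one gradient step on the revenue predicted by the fitted demand model (ascent on revenue, equivalently descent on its negative).

```python
import random

def epsilon_policy_gradient(T, eps, step, true_a, true_b, noise,
                            p0, p_min, p_max, rng):
    """Run T pricing rounds; return (policy price, a_hat, b_hat)."""
    # Running sums for a one-dimensional least-squares fit of demand on price.
    n = sx = sy = sxx = sxy = 0.0
    a_hat = b_hat = None
    price = p0
    for _ in range(T):
        # Explore with probability eps, or whenever no model is fitted yet.
        if a_hat is None or rng.random() < eps:
            p = rng.uniform(p_min, p_max)
        else:
            # Gradient of estimated revenue p * (a_hat - b_hat * p) w.r.t. p.
            grad = a_hat - 2.0 * b_hat * price
            price = min(p_max, max(p_min, price + step * grad))
            p = price
        # Environment response (assumed linear demand with Gaussian noise).
        demand = true_a - true_b * p + rng.gauss(0.0, noise)
        # Model inference: refit the demand line from all observations.
        n += 1
        sx += p
        sy += demand
        sxx += p * p
        sxy += p * demand
        denom = n * sxx - sx * sx
        if n >= 2 and denom > 1e-9:
            slope = (n * sxy - sx * sy) / denom
            a_hat = (sy - slope * sx) / n
            b_hat = -slope
    return price, a_hat, b_hat
```

Under this assumed model the revenue-maximizing price is $a/(2b)$, so a run with `true_a=10` and `true_b=1` should see the policy price drift toward 5 as the fitted model improves.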
Locally repairable codes (LRCs) are designed for distributed storage systems to reduce the repair bandwidth and disk I/O complexity during the storage node repair process. A code with $(r,\delta)$-locality (also called an $(r,\delta)$-LRC) can simultaneously repair up to $\delta-1$ symbols in a codeword by accessing at most $r$ other symbols in the codeword. In this paper, we propose a new method to calculate the $(r,\delta)$-locality of cyclic codes. Initially, we give a description of the algebraic structure of repeated-root cyclic codes of prime power lengths. Using this result, we derive a formula of $(r,\delta)$-locality of these cyclic codes for a wide range of $\delta$ values. Furthermore, we calculate the parameters of repeated-root cyclic codes of prime power lengths and obtain several infinite families of optimal cyclic $(r,\delta)$-LRCs, which exhibit new parameters compared with existing research on optimal $(r,\delta)$-LRCs with a cyclic structure. For the specific case of $\delta=2$, we have comprehensively identified all potential optimal cyclic $(r,2)$-LRCs of prime power lengths.
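As background on what "optimal" usually means in this setting (this is the standard Singleton-type bound from the LRC literature, not a formula quoted from the abstract): an $[n,k,d]$ linear code with $(r,\delta)$-locality satisfies the bound below, and a code is commonly called optimal when it attains it with equality.

```latex
d \;\le\; n - k + 1 - \left( \left\lceil \frac{k}{r} \right\rceil - 1 \right) (\delta - 1)
```

For $\delta = 2$ this reduces to $d \le n - k - \lceil k/r \rceil + 2$.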
We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms assume a trusted central curator and do not extend to federated settings. Naively combining the secure and DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves four orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but also improve it further by taking advantage of constrained clustering techniques.
We consider a novel algorithm for the completion of partially observed low-rank matrices in a structured setting where each entry can be chosen from a finite discrete alphabet set, such as in common recommender systems. The proposed low-rank matrix completion (MC) method improves on the state-of-the-art (SotA) discrete-aware matrix completion method that we previously proposed: discreteness is enforced by an $\ell_0$-norm regularizer that, rather than being replaced with the $\ell_1$-norm, is approximated by a continuous and differentiable function normalized via fractional programming (FP) under a proximal gradient (PG) framework. Simulation results demonstrate the superior performance of the new method compared to the SotA techniques as well as the earlier $\ell_1$-norm-based discrete-aware matrix completion approach.
Prior work has explicated the coloniality of artificial intelligence (AI) development and deployment through mechanisms such as extractivism, automation, sociological essentialism, surveillance, and containment. However, that work has not engaged much with alignment: teaching behaviors to a large language model (LLM) in line with desired values, and has not considered a mechanism that arises within that process: moral absolutism -- a part of the coloniality of knowledge. Colonialism has a history of altering the beliefs and values of colonized peoples; in this paper, I argue that this history is recapitulated in current LLM alignment practices and technologies. Furthermore, I suggest that AI alignment be decolonialized using three forms of openness: openness of models, openness to society, and openness to excluded knowledges. This suggested approach to decolonial AI alignment uses ideas from the argumentative moral philosophical tradition of Hinduism, which has been described as an open-source religion. One concept used is vi\'{s}e\d{s}a-dharma, or particular context-specific notions of right and wrong. At the end of the paper, I provide a suggested reference architecture to work toward the proposed framework.
The number of low-weight codewords is critical to the performance of error-correcting codes. In 1970, Kasami and Tokura characterized the codewords of Reed-Muller (RM) codes whose weights are less than $2w_{\min}$, where $w_{\min}$ represents the minimum weight. In this paper, we extend their results to decreasing polar codes. We present the closed-form expressions for the number of codewords in decreasing polar codes with weights less than $2w_{\min}$. Moreover, the proposed enumeration algorithm runs in polynomial time with respect to the code length.
Coresets are arguably the most popular compression paradigm for center-based clustering objectives such as $k$-means. Given a point set $P$, a coreset $\Omega$ is a small, weighted summary that preserves the cost of all candidate solutions $S$ up to a $(1\pm \varepsilon)$ factor. For $k$-means in $d$-dimensional Euclidean space the cost for solution $S$ is $\sum_{p\in P}\min_{s\in S}\|p-s\|^2$. A very popular method for coreset construction, both in theory and practice, is Sensitivity Sampling, where points are sampled in proportion to their importance. We show that Sensitivity Sampling yields optimal coresets of size $\tilde{O}(k/\varepsilon^2\min(\sqrt{k},\varepsilon^{-2}))$ for worst-case instances. Uniquely among all known coreset algorithms, for well-clusterable data sets with $\Omega(1)$ cost stability, Sensitivity Sampling gives coresets of size $\tilde{O}(k/\varepsilon^2)$, improving over the worst-case lower bound. Notably, Sensitivity Sampling does not have to know the cost stability in order to exploit it: It is appropriately sensitive to the clusterability of the data set while being oblivious to it. We also show that any coreset for stable instances consisting of only input points must have size $\Omega(k/\varepsilon^2)$. Our results for Sensitivity Sampling also extend to the $k$-median problem, and more general metric spaces.
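The cost function and the sampling idea can be sketched in a few lines. This is an illustrative sketch, not the construction analyzed in the paper: the sensitivity proxy used here (a point's share of the cost under a rough solution plus the inverse size of its cluster) is a commonly used form assumed for illustration, and the names `kmeans_cost` and `sensitivity_sample` are hypothetical.

```python
import random

def sq_dist(p, s):
    return sum((a - b) ** 2 for a, b in zip(p, s))

def kmeans_cost(points, centers, weights=None):
    """Weighted k-means cost: sum_p w_p * min_s ||p - s||^2."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(w * min(sq_dist(p, s) for s in centers)
               for p, w in zip(points, weights))

def sensitivity_sample(points, rough_centers, m, rng):
    """Sample m weighted points with probability proportional to a sensitivity proxy."""
    # Assign every point to its nearest center in the rough solution.
    assign = [min(range(len(rough_centers)),
                  key=lambda j: sq_dist(p, rough_centers[j]))
              for p in points]
    sizes = [0] * len(rough_centers)
    for j in assign:
        sizes[j] += 1
    total = kmeans_cost(points, rough_centers)
    # Assumed proxy: cost share under the rough solution + 1 / cluster size.
    scores = []
    for p, j in zip(points, assign):
        share = sq_dist(p, rough_centers[j]) / total if total > 0 else 0.0
        scores.append(share + 1.0 / sizes[j])
    z = sum(scores)
    probs = [s / z for s in scores]
    idx = rng.choices(range(len(points)), weights=probs, k=m)
    # Inverse-probability weights keep the cost estimate unbiased.
    return [(points[i], 1.0 / (m * probs[i])) for i in idx]
```

Evaluating `kmeans_cost` on the sampled points with their weights gives an unbiased estimate of the full cost for any candidate center set; how large `m` must be for a $(1\pm\varepsilon)$ guarantee is exactly what the bounds in the abstract address.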
Description Logics are a formalism used in knowledge representation, where knowledge is captured in the form of concepts constructed in a controlled way from a restricted vocabulary. This allows one to test effectively for the consistency of concepts and for subsumption between them. Unification of concepts may likewise become a useful tool in analysing the relations between concepts. The unification problem has been solved for the description logics $\mathcal{FL}_0$ and $\mathcal{EL}$. These small logics do not provide any means to express negation. Here we show an algorithm solving unification in $\mathcal{FL}_\bot$, the logic that extends $\mathcal{FL}_0$ with the bottom concept. Bottom allows one to express that two concepts are disjoint. Our algorithm runs in exponential time with respect to the size of the problem.
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is as important as, or even more important than, the answer itself, since it makes the question-answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.