We use matrices over bit strings as platforms for Diffie-Hellman-like public key exchange protocols. When multiplying such matrices, we use the Boolean OR operation on bit strings in place of addition and the Boolean AND operation in place of multiplication. As a result, (1) computations with these matrices are very efficient; (2) standard methods of attacking Diffie-Hellman-like protocols are not applicable.
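To make the arithmetic concrete, here is a minimal Python sketch (our illustration, not code from the paper) of this multiplication, with matrix entries represented as Python integers interpreted as bit strings:

```python
# Multiply matrices of bit strings, using OR for "+" and AND for "*".
# Entries are Python ints interpreted as fixed-width bit strings.

def bool_matmul(A, B):
    """(A * B)[i][j] = OR_k (A[i][k] AND B[k][j])."""
    n, m, p = len(A), len(B), len(B[0])
    assert len(A[0]) == m
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            acc = 0
            for k in range(m):
                acc |= A[i][k] & B[k][j]   # AND in place of *, OR in place of +
            C[i][j] = acc
    return C

# Example with 4-bit entries:
A = [[0b1010, 0b0110],
     [0b0001, 0b1100]]
B = [[0b0011, 0b1000],
     [0b0101, 0b1110]]
print(bool_matmul(A, B))   # C[0][0] = (1010&0011)|(0110&0101) = 0110
```

Since entries pack into machine words and OR/AND cost one instruction per word, products over long bit strings remain cheap, which is the efficiency claim (1) above.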
Dawar and Wilsenach (ICALP 2020) introduce the model of symmetric arithmetic circuits and show an exponential separation between the sizes of symmetric circuits for computing the determinant and the permanent. The symmetry restriction is that a circuit taking a matrix as input is unchanged by any permutation applied simultaneously to the rows and columns of the matrix. Under this restriction there are polynomial-size circuits for computing the determinant but no subexponential-size circuits for the permanent. Here, we consider a more stringent symmetry requirement, namely that the circuits are unchanged by arbitrary even permutations applied separately to rows and columns, and prove an exponential lower bound even for circuits computing the determinant. The result requires substantial new machinery. We develop a general framework for proving lower bounds for symmetric circuits with restricted symmetries, based on a new support theorem and new two-player restricted bijection games. These are applied to the determinant problem with a novel construction of matrices that are bi-adjacency matrices of graphs based on the CFI construction. Our general framework opens the way to exploring a variety of symmetry restrictions and studying trade-offs between symmetry and other resources used by arithmetic circuits.
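For intuition on why the determinant can even satisfy this stronger symmetry: even permutations $\sigma, \tau$ applied separately to the rows and columns correspond to permutation matrices of determinant $+1$, so
$$\det(P_\sigma M P_\tau) = \det(P_\sigma)\det(M)\det(P_\tau) = \det(M) \qquad \text{for all } \sigma, \tau \in A_n.$$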
The correlation measure of order $k$ is an important measure of randomness for binary sequences. It captures dependencies among several shifted versions of a sequence. We study the relation between the correlation measure of order $k$ and two other pseudorandomness measures: the $N$th linear complexity and the $N$th maximum order complexity. We simplify and improve several state-of-the-art lower bounds for these two measures using the Hamming bound, as well as weaker bounds derived from it.
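For concreteness, here is a brute-force Python sketch of the standard definition (due to Mauduit and S\'ark\"ozy); it is our illustration, costs roughly $O(N^{k+1})$ time, and is only meant for tiny examples:

```python
from itertools import combinations

def correlation_measure(e, k):
    """Brute-force correlation measure of order k for a +-1 sequence e[0..N-1]:
    max over shifts 0 <= d_1 < ... < d_k and window lengths M (M + d_k <= N)
    of |sum_{n=0}^{M-1} e[n+d_1] * ... * e[n+d_k]| (0-indexed)."""
    N = len(e)
    best = 0
    for D in combinations(range(N), k):
        dk = D[-1]
        s = 0
        for n in range(N - dk):      # running sum covers every window length M
            prod = 1
            for d in D:
                prod *= e[n + d]
            s += prod
            best = max(best, abs(s))
    return best

seq = [1, -1, 1, 1, -1, -1, 1, -1]   # toy +-1 sequence
print(correlation_measure(seq, 2))
```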
With the growth of wearable devices, which are usually constrained in computational power and user interface, device pairing has to be autonomous. Considering devices that have no prior information about each other, secure communication should be established by generating a shared secret key derived from a common context between the devices. Context-based pairing solutions increase the usability of wearable device pairing by eliminating any human involvement in the pairing process. This is possible by utilizing onboard sensors (with the same sensing modalities) to capture a common physical context (e.g., body motion, gait, heartbeat, respiration, or an EMG signal). A wide range of approaches has been proposed to address autonomous pairing in wearable devices. This paper surveys context-based pairing in wearable devices, focusing on the signals and sensors exploited. We review the steps needed for generating a common key and survey the existing techniques used in each step.
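As a rough illustration of the pipeline common to the surveyed schemes (the windowing policy and guard band below are our own simplifications, not any particular paper's design), each device converts its sensed context into a bit string that seeds the shared key:

```python
import numpy as np

def signal_to_bits(signal, window, guard=0.25):
    """Toy context-to-bits quantizer: one bit per window, comparing the
    window mean to the overall median; windows inside a guard band around
    the median emit no bit, reducing mismatches between the two devices."""
    med = np.median(signal)
    spread = np.std(signal)
    bits = []
    for start in range(0, len(signal) - window + 1, window):
        m = np.mean(signal[start:start + window])
        if abs(m - med) < guard * spread:
            continue                     # ambiguous window: skip it
        bits.append(1 if m > med else 0)
    return bits

rng = np.random.default_rng(0)
motion = np.sin(np.linspace(0, 20, 1000)) + 0.1 * rng.standard_normal(1000)
print(signal_to_bits(motion, window=50))
```

The remaining steps surveyed (reconciliation of residual mismatches and privacy amplification) then turn these noisy bit strings into an identical key on both devices.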
We propose Breath to Pair (B2P), a protocol for pairing and shared-key generation for wearable devices that leverages the wearer's respiration activity to ensure that the devices are part of the same body-area network. We assume that the devices exploit different types of sensors to extract and process the respiration signal. We illustrate B2P for the case of two devices that use respiratory inductance plethysmography (RIP) and accelerometer sensors, respectively. Allowing for different types of sensors in pairing lets us include wearable devices that use a variety of sensors. In practice, this sensor variety creates a number of challenges that limit the ability of the shared-key establishment algorithm to generate matching keys. The two main obstacles are the lack of synchronization across the devices and the need to correct noise-induced mismatches between the generated key bit-strings. B2P addresses the synchronization challenge by utilizing Change Point Detection (CPD) to detect abrupt changes in the respiration signal and treating their occurrences as synchronization points. Any potential mismatches are handled by optimal quantization and encoding of the respiration signal in order to maximize the error-correction rate and minimize message overhead. Extensive evaluation on a dataset collected from 30 volunteers demonstrates that our protocol can generate a secure 256-bit key every 2.85 seconds (around one breathing cycle). Particular attention is given to securing B2P against device impersonation attacks.
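To illustrate the synchronization idea (only the general idea: the detector below is a generic two-sided CUSUM of our choosing, not B2P's actual CPD algorithm), both devices can run the same change-point detector on their own respiration estimates and align their quantization windows at the detected points:

```python
import numpy as np

def cusum_change_points(x, drift=0.05, threshold=2.0):
    """Generic two-sided CUSUM: flag indices where the cumulative deviation
    from the running mean exceeds `threshold`. Both devices run this on
    their own respiration estimate and use the flags as sync points."""
    mean = x[0]
    pos = neg = 0.0
    points = []
    for i, v in enumerate(x[1:], start=1):
        mean += (v - mean) / (i + 1)            # running mean
        pos = max(0.0, pos + v - mean - drift)
        neg = max(0.0, neg - v + mean - drift)
        if pos > threshold or neg > threshold:
            points.append(i)
            pos = neg = 0.0                      # restart after a detection
    return points
```

Segments between consecutive synchronization points are then quantized and encoded so that residual mismatches fall within the error-correction budget.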
Markov-modulated L\'evy processes lead to matrix integral equations of the form $A_0 + A_1 X + A_2 X^2 + A_3(X) = 0$, where $A_0$, $A_1$, $A_2$ are given matrix coefficients, while $A_3(X)$ is a nonlinear function expressed in terms of integrals involving the exponential of the matrix $X$ itself. In this paper we propose numerical methods for the solution of this class of matrix equations, perform a theoretical convergence analysis, and show the effectiveness of the new methods by means of extensive numerical experiments.
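As a minimal sketch of how one might attack such an equation numerically (our illustration, assuming $A_1$ is nonsingular and $A_3$ is supplied as a callable; this is not necessarily one of the paper's methods, and convergence is exactly what a proper analysis must establish):

```python
import numpy as np

def fixed_point_solve(A0, A1, A2, A3, X0, tol=1e-12, max_iter=500):
    """Naive functional iteration X <- -A1^{-1}(A0 + A2 X^2 + A3(X)) for
    A0 + A1 X + A2 X^2 + A3(X) = 0.  A3 is a callable matrix -> matrix."""
    A1_inv = np.linalg.inv(A1)
    X = X0
    for _ in range(max_iter):
        X_new = -A1_inv @ (A0 + A2 @ X @ X + A3(X))
        if np.linalg.norm(X_new - X, 1) < tol:
            return X_new
        X = X_new
    raise RuntimeError("no convergence within max_iter")

# With A3 = lambda X: np.zeros_like(X), this reduces to the classical
# quadratic matrix equation A0 + A1 X + A2 X^2 = 0.
```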
Suppose we observe an infinite series of coin flips $X_1,X_2,\ldots$, and wish to sequentially test the null that these binary random variables are exchangeable. Nonnegative supermartingales (NSMs) are a workhorse of sequential inference, but we prove that they are powerless for this problem. First, utilizing a geometric concept called fork-convexity (a sequential analog of convexity), we show that any process that is an NSM under a set of distributions is also necessarily an NSM under their "fork-convex hull". Second, we demonstrate that the fork-convex hull of the exchangeable null consists of all possible laws over binary sequences; this implies that any NSM under exchangeability is necessarily nonincreasing and hence always yields a powerless test for any alternative. Since testing arbitrary deviations from exchangeability is information-theoretically impossible, we focus on Markovian alternatives. We combine ideas from universal inference and the method of mixtures to derive a "safe e-process", which is a nonnegative process with expectation at most one under the null at any stopping time, and is upper bounded by a martingale, but is not itself an NSM. This in turn yields a level-$\alpha$ sequential test that is consistent; regret bounds from universal coding also demonstrate rate-optimal power. We present ways to extend these results to any finite alphabet and to Markovian alternatives of any order using a "double mixture" approach. We provide an array of simulations, and give general approaches based on betting for unstructured or ill-specified alternatives. Finally, inspired by Shafer, Vovk, and Ville, we provide game-theoretic interpretations of our e-processes and pathwise results.
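The following Python sketch illustrates the universal-inference recipe in its simplest binary, first-order form (our simplified illustration, not the paper's exact construction): a Krichevsky-Trofimov mixture over Markov chains in the numerator, divided by the running maximum likelihood over the null, which by de Finetti (infinite exchangeable binary sequences are Bernoulli mixtures) is the Bernoulli MLE likelihood:

```python
import math

def eprocess_exchangeability(x):
    """Illustrative e-process for H0: the 0/1 sequence x is exchangeable,
    against first-order Markov alternatives.
    Numerator: prequential Krichevsky-Trofimov mixture over Markov chains.
    Denominator: running max likelihood under the null (Bernoulli MLE)."""
    trans = {0: [0, 0], 1: [0, 0]}   # trans[s][b]: #times state s -> bit b
    log_num = math.log(0.5)          # first flip: uniform predictive
    ones = x[0]
    path = []
    for t in range(1, len(x)):
        s, b = x[t - 1], x[t]
        pred = (trans[s][b] + 0.5) / (sum(trans[s]) + 1.0)  # KT predictive
        log_num += math.log(pred)
        trans[s][b] += 1
        ones += b
        n, k = t + 1, ones
        log_den = (k * math.log(k / n) if k else 0.0) \
                + ((n - k) * math.log((n - k) / n) if n - k else 0.0)
        path.append(math.exp(log_num - log_den))
    return path

# Stopping when the path crosses 1/alpha gives a level-alpha sequential test.
print(eprocess_exchangeability([0, 1] * 20)[-1])  # alternating flips: grows fast
```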
We present a geometric framework for constructing stabiliser codes, both additive and non-additive, which encompasses standard stabiliser codes as well as graphical non-additive stabiliser codes.
The distributed matrix multiplication problem with an unknown number of stragglers is considered, where the goal is to efficiently and flexibly obtain the product of two massive matrices by distributing the computation across N servers. There are up to N - R stragglers, but the exact number is not known a priori. Motivated by reducing the computation load of each server, a flexible solution is proposed to fully utilize the computation capability of the available servers. The computing task for each server is separated into several subtasks constructed from the Entangled Polynomial codes of Yu et al. The final results can be obtained either from a larger number of servers, each completing a smaller amount of computation, or from a smaller number of servers, each completing a larger amount of computation. The required finite field size of the proposed solution is less than 2N. Moreover, the optimal design parameters, such as the partitioning of the input matrices, are discussed. Our constructions can also be generalized to other settings such as batch distributed matrix multiplication and secure distributed matrix multiplication.
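To make the coding idea concrete, here is a toy sketch of the underlying polynomial-code construction of Yu et al. over the reals (our illustration; the paper's flexible construction further splits each server's work into subtasks and operates over a finite field of size less than 2N):

```python
import numpy as np

def poly_encode(A, B, m, n, xs):
    """Split A into m row blocks and B into n column blocks, encode as
    A~(x) = sum_i A_i x^i and B~(x) = sum_j B_j x^{m j}, and hand server k
    the pair (A~(x_k), B~(x_k))."""
    Ab = np.split(A, m, axis=0)
    Bb = np.split(B, n, axis=1)
    return [(sum(Ab[i] * x**i for i in range(m)),
             sum(Bb[j] * x**(m * j) for j in range(n))) for x in xs]

def poly_decode(products, xs, m, n):
    """Interpolate the degree-(mn-1) product polynomial from any mn server
    results; block C_{ij} = A_i B_j is the coefficient of x^{i + m j}."""
    V = np.vander(np.array(xs), m * n, increasing=True)      # Vandermonde
    coeffs = np.einsum('sk,kij->sij', np.linalg.inv(V), np.stack(products))
    return np.block([[coeffs[i + m * j] for j in range(n)] for i in range(m)])

m, n, N = 2, 2, 6                        # any mn = 4 results suffice
A, B = np.random.randn(4, 3), np.random.randn(3, 4)
xs = [float(k + 1) for k in range(N)]
work = [(k, a @ b) for k, (a, b) in enumerate(poly_encode(A, B, m, n, xs))]
done = work[:m * n]                      # pretend the last N - 4 servers straggled
C = poly_decode([p for _, p in done], [xs[k] for k, _ in done], m, n)
print(np.allclose(C, A @ B))             # True
```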
Because of the superior feature representation ability of deep learning, various deep Click-Through Rate (CTR) models are deployed in commercial systems by industrial companies. To achieve better performance, it is necessary to train the deep CTR models on huge volumes of training data efficiently, which makes speeding up the training process an essential problem. Different from models with dense training data, the training data for CTR models is usually high-dimensional and sparse. To transform the high-dimensional sparse input into low-dimensional dense real-valued vectors, almost all deep CTR models adopt an embedding layer, which easily reaches hundreds of GB or even TB in size. Since a single GPU cannot accommodate all the embedding parameters, it is not practical to rely on data parallelism alone when performing distributed training. Therefore, existing distributed training platforms for recommendation adopt model parallelism. Specifically, they use the CPU (Host) memory of servers to maintain and update the embedding parameters and utilize GPU workers to conduct the forward and backward computations. Unfortunately, these platforms suffer from two bottlenecks: (1) the latency of pull \& push operations between Host and GPU; (2) parameter update and synchronization in the CPU servers. To address these bottlenecks, in this paper we propose ScaleFreeCTR (SFCTR): a MixCache-based distributed training system for CTR models. Specifically, in SFCTR, we still store the huge embedding table in CPU memory but utilize the GPU instead of the CPU to conduct embedding synchronization efficiently. To reduce the latency of data transfer both between GPU and Host and among GPUs, the MixCache mechanism and the Virtual Sparse Id operation are proposed. Comprehensive experiments and ablation studies are conducted to demonstrate the effectiveness and efficiency of SFCTR.
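As a rough sketch of the general host-backed-cache idea (the class, eviction policy, and sizes below are our own toy choices, not SFCTR's actual MixCache or Virtual Sparse Id mechanisms):

```python
import numpy as np

class HostBackedEmbeddingCache:
    """Toy illustration: the full embedding table lives in large, cheap host
    memory, while a small cache of hot rows stands in for GPU memory, so
    most lookups avoid the host<->device pull/push that bottlenecks
    CPU-parameter-server training."""

    def __init__(self, num_rows, dim, cache_rows):
        self.host_table = np.random.randn(num_rows, dim).astype(np.float32)
        self.cache = {}                  # id -> row, stands in for GPU memory
        self.cache_rows = cache_rows

    def lookup(self, ids):
        out, misses = [], []
        for i in ids:
            if i in self.cache:
                out.append(self.cache[i])        # "GPU" hit: no transfer
            else:
                misses.append(i)
                out.append(self.host_table[i])   # host pull
        for i in misses:                         # admit missed rows (FIFO-ish)
            if len(self.cache) >= self.cache_rows:
                self.cache.pop(next(iter(self.cache)))
            self.cache[i] = self.host_table[i]
        return np.stack(out)

emb = HostBackedEmbeddingCache(num_rows=10**6, dim=8, cache_rows=10**4)
batch = np.random.zipf(1.3, size=256) % 10**6    # skewed ids: caching pays off
vecs = emb.lookup(batch)
```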
Interest point descriptors have fueled progress on almost every problem in computer vision. Recent advances in deep neural networks have enabled task-specific learned descriptors that outperform hand-crafted descriptors on many problems. We demonstrate that commonly used metric learning approaches do not optimally leverage the feature hierarchies learned in a Convolutional Neural Network (CNN), especially when applied to the task of geometric feature matching. While a metric loss applied to the deepest layer of a CNN is often expected to yield ideal features irrespective of the task, in fact the growing receptive field and striding effects cause shallower features to be better at high-precision matching tasks. We leverage this insight, together with explicit supervision at multiple levels of the feature hierarchy for better regularization, to learn more effective descriptors in the context of geometric matching tasks. Further, we propose to use activation maps at different layers of a CNN as an effective and principled replacement for the multi-resolution image pyramids often used for matching tasks. We propose concrete CNN architectures employing these ideas, and evaluate them on multiple datasets for 2D and 3D geometric matching as well as optical flow, demonstrating state-of-the-art results and generalization across datasets.
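The following PyTorch sketch (our illustration; the paper proposes its own architectures) shows how activation maps from successive stages of a standard CNN backbone already form a multi-resolution pyramid that can replace an explicit image pyramid for matching:

```python
import torch
import torchvision

# Collect activation maps at several depths of a ResNet backbone; the layer
# choices are illustrative assumptions, not the paper's architecture.
model = torchvision.models.resnet18(weights=None).eval()
feats = {}

def save_to(name):
    def hook(module, inputs, output):
        feats[name] = output.detach()
    return hook

for name in ["layer1", "layer2", "layer3"]:     # shallow -> deep
    getattr(model, name).register_forward_hook(save_to(name))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))          # stand-in input image

for name, f in feats.items():
    print(name, tuple(f.shape))
# layer1 (1, 64, 56, 56); layer2 (1, 128, 28, 28); layer3 (1, 256, 14, 14):
# each stage halves the resolution, giving a built-in multi-scale pyramid.
```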