Secure multi-party computation provides a wide array of protocols for mutually distrustful parties be able to securely evaluate functions of private inputs. Within recent years, many such protocols have been proposed representing a plethora of strategies to securely and efficiently handle such computation. These protocols have become increasingly efficient, but their performance still is impractical in many settings. We propose new approaches to some of these problems which are either more efficient than previous works within the same security models or offer better security guarantees with comparable efficiency. The goals of this research are to improve efficiency and security of secure multi-party protocols and explore the application of such approaches to novel threat scenarios. Some of the novel optimizations employed are dynamically switching domains of shared secrets, asymmetric computations, and advantageous functional transformations, among others. Specifically, this work presents a novel combination of Shamir and Additive secret sharing to be used in parallel which allows for the transformation of efficient protocols secure against passive adversaries to be secure against active adversaries. From this set of primitives we propose the construction of a comparison protocol which can be implemented under that approach with a complexity which is more efficient than other recent works for common domains of interest. Finally, we present a system which addresses a critical security threat for the protection and obfuscation of information which may be of high consequence.
We propose a method for checking generalized reachability properties in Petri nets that takes advantage of structural reductions and that can be used, transparently, as a pre-processing step of existing model-checkers. Our approach is based on a new procedure that can project a property, about an initial Petri net, into an equivalent formula that only refers to the reduced version of this net. Our projection is defined as a variable elimination procedure for linear integer arithmetic tailored to the specific kind of constraints we handle. It has linear complexity, is guaranteed to return a sound property, and makes use of a simple condition to detect when the result is exact. Experimental results show that our approach works well in practice and that it can be useful even when there is only a limited amount of reductions.
The ability to derive useful information by asking clarifying questions (ACQ) is an important element of real life collaboration on reasoning tasks, such as question answering (QA). Existing natural language ACQ challenges, however, evaluate generations based on word overlap rather than the value of the information itself. Word overlap is often an inappropriate metric for question generation since many different questions could be useful in a given situation, and a single question can be phrased many different ways. Instead, we propose evaluating questions pragmatically based on the value of the information they retrieve. Here we present a definition and framework for natural language pragmatic asking of clarifying questions (PACQ), the problem of generating questions that result in answers useful for a reasoning task. We also present fact-level masking (FLM), a procedure for converting natural language datasets into self-supervised PACQ datasets by omitting particular critical facts. Finally, we generate a PACQ dataset from the HotpotQA dataset using FLM and evaluate several zero-shot language models on it. Our experiments show that current zero-shot models struggle to ask questions that retrieve useful information, as compared to human annotators. These results demonstrate an opportunity to use FLM datasets and the PACQ framework to objectively evaluate and improve question generation and other language models.
The main computational challenge in Bayesian inference is to compute integrals against a high-dimensional posterior distribution. In the past decades, variational inference (VI) has emerged as a tractable approximation to these integrals, and a viable alternative to the more established paradigm of Markov Chain Monte Carlo. However, little is known about the approximation accuracy of VI. In this work, we bound the TV error and the mean and covariance approximation error of Gaussian VI in terms of dimension and sample size. Our error analysis relies on a Hermite series expansion of the log posterior whose first terms are precisely cancelled out by the first order optimality conditions associated to the Gaussian VI optimization problem.
We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on the prior knowledge of the eigenvalue spectrum, which could be expensive to estimate. This issue can be lessened by the analytic spectrum estimation of the Laplacian or normalized Laplacian matrices in spectral clustering, making the proposed algorithm very efficient for spectral clustering. Second, to make the proposed algorithm capable of analyzing big data, a distributed and parallel version has been developed with attractive scalability. The speedup by parallel computing is approximately equivalent to $\sqrt{p}$, where $p$ denotes the number of processes. {Numerical results will be provided to demonstrate its efficiency in spectral clustering and scalability advantage over existing eigensolvers used for spectral clustering in parallel computing environments.}
sEMG pattern recognition algorithms have been explored extensively in decoding movement intent, yet are known to be vulnerable to changing recording conditions, exhibiting significant drops in performance across subjects, and even across sessions. Multi-channel surface EMG, also referred to as high-density sEMG (HD-sEMG) systems, have been used to improve performance with the information collected through the use of additional electrodes. However, a lack of robustness is ever present due to limited datasets and the difficulties in addressing sources of variability, such as electrode placement. In this study, we propose training on a collection of input channel subsets and augmenting our training distribution with data from different electrode locations, simultaneously targeting electrode shift and reducing input dimensionality. Our method increases robustness against electrode shift and results in significantly higher intersession performance across subjects and classification algorithms.
Current state-of-the-art analyses on the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Lojaciewicz (PL) condition and the restricted strong convexity. While gradient descent converges linearly under such conditions, it remains an open question whether Nesterov's momentum enjoys accelerated convergence under similar settings and assumptions. In this work, we consider a new class of objective functions, where only a subset of the parameters satisfies strong convexity, and show Nesterov's momentum achieves acceleration in theory for this objective class. We provide two realizations of the problem class, one of which is deep ReLU networks, which --to the best of our knowledge--constitutes this work the first that proves accelerated convergence rate for non-trivial neural network architectures.
We propose a data-driven approach for propagating uncertainty in stochastic power grid simulations and apply it to the estimation of transmission line failure probabilities. A reduced-order equation governing the evolution of the observed line energy probability density function is derived from the Fokker--Planck equation of the full-order continuous Markov process. Our method consists of estimates produced by numerically integrating this reduced equation. Numerical experiments for scalar- and vector-valued energy functions are conducted using the classical multimachine model under spatiotemporally correlated noise perturbation. The method demonstrates a more sample-efficient approach for computing probabilities of tail events when compared with kernel density estimation. Moreover, it produces vastly more accurate estimates of joint event occurrence when compared with independent models.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.