Algorithmic and statistical approaches to congressional redistricting are becoming increasingly valuable tools in courts and redistricting commissions for quantifying gerrymandering in the United States. While there is existing literature covering how various Markov chain Monte Carlo distributions differ in terms of projected electoral outcomes and geometric quantifiers of compactness, there is still work to be done on measuring similarities between different congressional redistricting plans. This paper introduces an intuitive and interpretable measure of similarity, together with a corresponding assignment matrix, that captures the percentage of a state's area or population that stays in the same congressional district between two plans. We then show how to compute this measure in polynomial time and briefly demonstrate some potential use cases.
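The measure described above can be sketched as an optimal assignment between the districts of the two plans. The snippet below is a minimal illustration, assuming the district-overlap matrix has already been computed from the two plans; the function name and toy data are ours, and the Hungarian algorithm (via `scipy.optimize.linear_sum_assignment`) is one polynomial-time way to solve the assignment step.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def plan_similarity(overlap):
    """Similarity between two districting plans.

    overlap[i, j] = population (or area) shared by district i of
    plan A and district j of plan B.  The optimal district matching
    is found with the Hungarian algorithm, so the whole computation
    runs in polynomial time.
    """
    rows, cols = linear_sum_assignment(overlap, maximize=True)
    return overlap[rows, cols].sum() / overlap.sum()

# Toy example: two 2-district plans over 4 equal-population units.
overlap = np.array([[2.0, 0.0],
                    [1.0, 1.0]])
print(plan_similarity(overlap))  # 0.75
```

Here 3 of the 4 population units stay in matched districts, so the similarity is 0.75.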
We give an axiomatic foundation to $\Lambda$-quantiles, a family of generalized quantiles introduced by Frittelli et al. (2014) under the name of Lambda Value at Risk. Under mild assumptions, we show that these functionals are characterized by a property that we call "locality", meaning that any change in the distribution of the probability mass that arises entirely above or below the value of the $\Lambda$-quantile does not modify its value. We compare this with a related axiomatization of the usual quantiles given by Chambers (2009), based on the stronger property of "ordinal covariance", meaning that quantiles are covariant with respect to increasing transformations. Further, we present a systematic treatment of the properties of $\Lambda$-quantiles, refining some of the results of Frittelli et al. (2014) and Burzoni et al. (2017) and showing that, in the case of a nonincreasing $\Lambda$, the properties of $\Lambda$-quantiles closely resemble those of the usual quantiles.
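For concreteness, one common formulation of the Lambda Value at Risk replaces the fixed level $\lambda \in (0,1)$ of the usual quantile with a function $\Lambda$; up to sign and left/right-continuity conventions, it reads:

```latex
% Usual lower quantile of a distribution function F at level \lambda:
q_\lambda(F) = \inf\{ x \in \mathbb{R} : F(x) \geq \lambda \}
% \Lambda-quantile: the constant level \lambda is replaced by a
% function \Lambda : \mathbb{R} \to (0,1), so the threshold varies with x:
q_\Lambda(F) = \inf\{ x \in \mathbb{R} : F(x) \geq \Lambda(x) \}
```

When $\Lambda \equiv \lambda$ is constant, the second definition reduces to the first, which is why these functionals generalize the usual quantiles.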
The extreme or maximum age of information (AoI) is analytically studied for wireless communication systems. In particular, we consider a wireless powered single-antenna source node and a receiver (connected to the power grid) equipped with multiple antennas, operating over independent Rayleigh-faded channels. Via extreme value theory and its corresponding statistical features, we demonstrate that the extreme AoI converges to the Gumbel distribution and obtain its parameters in straightforward closed-form expressions. Capitalizing on this result, the risk of an extreme AoI realization is analytically evaluated according to some relevant performance metrics, and several useful engineering insights are drawn.
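The extreme-value convergence invoked above can be illustrated numerically. The sketch below is not the paper's wireless model; it is a generic example, assuming exponentially tailed samples (as arise under Rayleigh fading), showing that a suitably centered maximum follows the standard Gumbel law $e^{-e^{-x}}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 20000
# Max of n iid Exp(1) samples, centered by log(n): converges to Gumbel(0, 1)
m = rng.exponential(size=(trials, n)).max(axis=1) - np.log(n)

# Compare the empirical CDF at x = 1 with the Gumbel CDF exp(-exp(-x))
x = 1.0
emp = (m <= x).mean()
gumbel = np.exp(-np.exp(-x))
print(emp, gumbel)  # both close to 0.69
```

The two numbers agree to roughly two decimal places, which is the point of the Gumbel limit: tail risks of the maximum can be read off a simple closed-form distribution.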
Due to dependence between codeword elements, index modulation (IM) and related modulation techniques struggle to provide simple solutions to practical problems such as Gray coding between information bits and constellation points and low-complexity log-likelihood ratio (LLR) calculation for channel-encoded information bits. In this paper, we show that a modulation technique based on a simple maximum distance separable (MDS) code, in other words, MDS modulation, can provide simple yet effective solutions to these problems, rendering the MDS techniques more beneficial in the presence of coding. We also compare the coded error performance of the MDS methods with that of the IM methods and demonstrate that MDS modulation outperforms IM.
The concept of median/consensus has been widely investigated in order to provide a statistical summary of ranking data, i.e. realizations of a random permutation $\Sigma$ of a finite set, $\{1,\; \ldots,\; n\}$ with $n\geq 1$ say. As it sheds light on only one aspect of $\Sigma$'s distribution $P$, it may neglect other informative features. The purpose of this paper is to define analogs of quantiles, ranks and statistical procedures based on such quantities for the analysis of ranking data, by means of a metric-based notion of depth function on the symmetric group. Overcoming the absence of vector space structure on $\mathfrak{S}_n$, the latter defines a center-outward ordering of the permutations in the support of $P$ and extends the classic metric-based formulation of consensus ranking (medians then corresponding to the deepest permutations). The axiomatic properties that ranking depths should ideally possess are listed, while computational and generalization issues are studied at length. Beyond the theoretical analysis carried out, the relevance of the novel concepts and methods introduced for a wide variety of statistical tasks is also supported by numerous numerical experiments.
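A metric-based depth of the kind described can be sketched in a few lines: a permutation is deeper the smaller its average distance to the sample. The example below is a toy illustration with the Kendall tau distance on $\mathfrak{S}_3$, using brute force over all permutations (feasible only for small $n$); all names are ours.

```python
from itertools import permutations, combinations

def kendall_tau(p, q):
    """Number of item pairs ordered differently by permutations p and q."""
    pos_p = {v: i for i, v in enumerate(p)}
    pos_q = {v: i for i, v in enumerate(q)}
    return sum((pos_p[a] < pos_p[b]) != (pos_q[a] < pos_q[b])
               for a, b in combinations(p, 2))

def depth(sigma, sample):
    """Metric-based depth: minus the average distance to the sample."""
    return -sum(kendall_tau(sigma, s) for s in sample) / len(sample)

# The deepest permutation recovers the consensus (median) ranking.
sample = [(0, 1, 2), (0, 1, 2), (1, 0, 2), (0, 2, 1)]
deepest = max(permutations(range(3)), key=lambda s: depth(s, sample))
print(deepest)  # (0, 1, 2)
```

Sorting all permutations by this depth yields exactly the center-outward ordering mentioned above, with the median ranking at the center.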
The state of the art of electromagnetic integral equations has seen significant growth over the past few decades, overcoming some of the fundamental bottlenecks: computational complexity, low-frequency and dense-discretization breakdown, preconditioning, and so on. Likewise, the community has seen extensive investment in the development of methods for higher-order analysis, in both geometry and physics. Unfortunately, the standard geometric descriptors are continuous, but their normals are discontinuous at the boundaries between triangular tessellations of control nodes, or patches, with a few exceptions; as a result, one needs to define additional mathematical infrastructure to define physical basis sets for vector problems. In stark contrast, the geometric representations used for design are second-order differentiable almost everywhere on the surface. Using these descriptions for analysis opens the door to several possibilities, and this is the area we explore in this paper. Our focus is on Loop-subdivision-based isogeometric methods. Our goals are twofold: (i) to develop computational infrastructure for isogeometric analysis of electrically large, simply connected objects, and (ii) to introduce the notion of manifold harmonics transforms and their utility in computational electromagnetics. Several results highlighting the efficacy of these two methods are presented.
With the continued rise of COVID-19 cases worldwide, it is imperative to ensure that vulnerable countries lacking vaccine resources receive sufficient support to contain the risks. COVAX is such an initiative, operated by the WHO, to supply vaccines to the countries most in need. One critical problem faced by COVAX is how to distribute the limited supply of vaccines among these countries in the most efficient and equitable manner. This paper aims to address this challenge by first proposing a data-driven risk assessment and prediction model and then developing a decision-making framework to support strategic vaccine distribution. The machine learning-based risk prediction model characterizes how the risk is influenced by the underlying essential factors, e.g., the vaccination level of the population in each COVAX country. This predictive model is then leveraged to design the optimal vaccine distribution strategy, which simultaneously minimizes the resulting risks while maximizing the vaccination coverage in the countries targeted by COVAX. Finally, we corroborate the proposed framework using case studies with real-world data.
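To make the allocation step concrete, here is a deliberately simplified sketch: a greedy, risk-weighted allocation under a fixed supply constraint. This is not the paper's optimization model (which couples a learned risk predictor with the distribution decision); all names and numbers are illustrative.

```python
def allocate(supply, needs, risks):
    """Toy allocation: serve the highest-risk countries first, capped
    by each country's remaining need and the total supply."""
    alloc = {c: 0 for c in needs}
    for c in sorted(needs, key=risks.get, reverse=True):
        alloc[c] = min(needs[c], supply)
        supply -= alloc[c]
    return alloc

print(allocate(100,
               {"A": 60, "B": 80, "C": 50},
               {"A": 0.9, "B": 0.5, "C": 0.7}))
# {'A': 60, 'B': 0, 'C': 40}
```

A real framework would replace the static `risks` dictionary with predictions from the risk model and add equity constraints, but the tension between supply, need, and risk is already visible in this toy.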
This study reviews the approaches used for measuring sentence similarity. Measuring similarity between natural language sentences is a crucial task for many Natural Language Processing applications such as text classification, information retrieval, question answering, and plagiarism detection. This survey classifies approaches to calculating sentence similarity into three categories based on the adopted methodology: word-to-word-based, structure-based, and vector-based approaches are the most widely used. Each approach measures relatedness between short texts from a specific perspective. In addition, the datasets most often used as benchmarks for evaluating techniques in this field are introduced, to provide a complete view of the issue. Approaches that combine more than one perspective give better results. Moreover, structure-based similarity, which measures similarity between sentence structures, needs more investigation.
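As a concrete instance of the vector-based category, the simplest baseline represents each sentence as a bag-of-words vector and compares the vectors with cosine similarity. The sketch below (our own naming, standard library only) illustrates the idea.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(s1, s2):
    """Vector-based similarity: cosine between bag-of-words count vectors."""
    v1, v2 = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1)
    norm = (sqrt(sum(c * c for c in v1.values()))
            * sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0

print(cosine_similarity("the cat sat on the mat",
                        "the cat lay on the mat"))  # 0.875
```

This baseline ignores word order and meaning, which is precisely the gap that the structure-based and word-to-word (knowledge-based) approaches surveyed here try to close.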
Seam-cutting and seam-driven techniques have proven effective for handling imperfect image series in image stitching. Generally, seam-driven methods use seam-cutting to find the best seam among one or finitely many alignment hypotheses, based on a predefined seam quality metric. However, the quality metrics in most methods measure the average performance of the pixels on the seam without considering the relevance and variance among them. As a result, the seam with the minimal measure may not be optimal (perception-inconsistent) in human perception. In this paper, we propose a novel coarse-to-fine seam estimation method that applies the evaluation in a different way. For pixels on the seam, we develop a patch-point evaluation algorithm that concentrates on their correlation and variation. The evaluations are then used to recalculate the difference map of the overlapping region and to reestimate the stitching seam. This evaluation-reestimation procedure iterates until the current seam changes negligibly compared with the previous seams. Experiments show that our proposed method finds a nearly perception-consistent seam after several iterations, outperforming conventional seam-cutting and other seam-driven methods.
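For readers unfamiliar with the seam-cutting step, the core operation, finding a minimal-cost seam through a difference map of the overlapping region, can be sketched with classic dynamic programming. This is a generic illustration, not the paper's patch-point evaluation; all names are ours.

```python
import numpy as np

def min_cost_seam(diff):
    """Minimal-cost vertical seam through a difference map, via the
    classic dynamic program: each row's cost adds the cheapest of the
    three neighbouring cells in the row above."""
    h, w = diff.shape
    cost = diff.astype(float).copy()
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 2, w)
            cost[r, c] += cost[r - 1, lo:hi].min()
    # Backtrack from the cheapest bottom cell.
    seam = [int(cost[-1].argmin())]
    for r in range(h - 2, -1, -1):
        c = seam[-1]
        lo = max(c - 1, 0)
        seam.append(lo + int(cost[r, lo:min(c + 2, w)].argmin()))
    return seam[::-1]

diff = np.array([[1, 9, 9],
                 [9, 1, 9],
                 [9, 9, 1]])
print(min_cost_seam(diff))  # [0, 1, 2]
```

The iterative method described above would rewrite the entries of `diff` after each patch-point evaluation and rerun this search until the seam stabilizes.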
Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
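The "add, then normalise" construction can be illustrated with a crude finite-dimensional stand-in for a completely random measure: each group's random probability measure mixes atoms shared across groups (from a common component) with its own group-specific atoms, which is what induces dependence. This is only a toy sketch under a naive gamma-jump approximation; it is not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def gamma_atoms(n_atoms):
    """Crude finite stand-in for a gamma completely random measure:
    random atom locations with independent gamma jump sizes."""
    locs = rng.uniform(size=n_atoms)
    jumps = rng.gamma(shape=1.0 / n_atoms, size=n_atoms)
    return locs, jumps

# Common component mu_0, shared by every group.
locs0, w0 = gamma_atoms(50)

group_measures = []
for _ in range(2):                        # two groups
    locs_g, w_g = gamma_atoms(50)         # group-specific component
    locs = np.concatenate([locs0, locs_g])
    w = np.concatenate([w0, w_g])
    group_measures.append((locs, w / w.sum()))  # normalise

# Both random probability measures place mass on the shared atoms of
# mu_0, so draws from different groups can tie across samples.
```

Because the common atoms carry positive weight in every group, the resulting random probability measures are dependent without being identical, which is the middle ground between full exchangeability and independence discussed above.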
In this paper, we study the optimal convergence rate for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and attains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
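As a reference point for the accelerated method invoked above, here is a minimal centralized sketch of Nesterov's accelerated gradient on a smooth, strongly convex quadratic. It is not the distributed dual algorithm studied in the paper; the function names and constants are illustrative.

```python
import numpy as np

def nesterov(grad, x0, L, mu, iters):
    """Nesterov's accelerated gradient for an L-smooth, mu-strongly
    convex function (constant-momentum variant)."""
    x = y = np.asarray(x0, dtype=float)
    q = np.sqrt(mu / L)
    beta = (1 - q) / (1 + q)          # momentum for the strongly convex case
    for _ in range(iters):
        x_new = y - grad(y) / L        # gradient step from the lookahead point
        y = x_new + beta * (x_new - x) # momentum extrapolation
        x = x_new
    return x

# Minimize f(x) = 0.5 * x^T A x with A positive definite (minimizer: 0).
A = np.diag([1.0, 10.0])               # mu = 1, L = 10
x = nesterov(lambda v: A @ v, [5.0, 5.0], L=10.0, mu=1.0, iters=100)
print(np.round(x, 6))  # essentially [0, 0]
```

In the distributed setting of the paper, the same accelerated iteration is applied to the dual of the affinely constrained problem, with each gradient evaluation implementable through local communication, at a rate degraded only by the spectral gap of the interaction matrix.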