This study is motivated by the works of Huang et al. (Soft Comput. 25, 2513--2520, 2021) and Wang et al. (Inf. Sci. 179, 3026--3040, 2009), which introduced ranking techniques for interval-valued intuitionistic fuzzy numbers (IVIFNs). We prove that the space of all IVIFNs, equipped with the order relation arising from the comparison method based on a score function and three types of entropy functions, is a complete chain, and we show that this relation is an admissible order. Moreover, we demonstrate that the space of IVIFNs is also a complete chain under the order relation arising from the comparison method based on the score, accuracy, membership uncertainty index, and hesitation uncertainty index functions.
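Comparison methods of the kind discussed above can be illustrated with the classical score and accuracy functions for an IVIFN $([a,b],[c,d])$ (membership interval $[a,b]$, non-membership interval $[c,d]$). The following sketch is purely illustrative: it uses only the standard score/accuracy lexicographic comparison, not the entropy-based refinements studied in the paper, and the sample values are hypothetical.

```python
def score(ivifn):
    # ivifn = ((a, b), (c, d)): membership interval [a, b],
    # non-membership interval [c, d]
    (a, b), (c, d) = ivifn
    return (a + b - c - d) / 2

def accuracy(ivifn):
    (a, b), (c, d) = ivifn
    return (a + b + c + d) / 2

def compare(x, y):
    """Return -1, 0, or 1: lexicographic comparison by score, then accuracy."""
    sx, sy = score(x), score(y)
    if sx != sy:
        return -1 if sx < sy else 1
    hx, hy = accuracy(x), accuracy(y)
    if hx != hy:
        return -1 if hx < hy else 1
    return 0

A = ((0.4, 0.6), (0.2, 0.3))  # score 0.25
B = ((0.3, 0.5), (0.3, 0.4))  # score 0.05
print(compare(A, B))  # A ranks above B by score
```

Ties on both score and accuracy are declared equal here; the entropy functions in the paper serve precisely to refine such ties into a complete chain.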
Probabilistic databases (PDBs) model uncertainty in data in a quantitative way. In the established formal framework, probabilistic (relational) databases are finite probability spaces over relational database instances. This finiteness can clash with intuitive query behavior (Ceylan et al., KR 2016) and with application scenarios that are better modeled by continuous probability distributions (Dalvi et al., CACM 2009). We formally introduced infinite PDBs in (Grohe and Lindner, PODS 2019), with a primary focus on countably infinite spaces. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerning the measurability of events and queries, and ultimately the question of whether queries have a well-defined semantics. We argue that finite point processes are an appropriate model from probability theory for dealing with general probabilistic databases. This allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and Datalog queries.
One of the main reasons for the query model's prominence in quantum complexity is the availability of concrete lower-bound techniques: the polynomial method and the adversary method. There have been considerable efforts not just to prove lower bounds using these methods but also to compare and relate them. We explore the value of these bounds on quantum query complexity for the class of symmetric functions, arguably one of the most natural and basic sets of Boolean functions. We show that the recently introduced measure of spectral sensitivity gives the same value as both these bounds (the positive adversary and approximate degree) for every total symmetric Boolean function. We also look at the quantum query complexity of Gap Majority, a partial symmetric function that has recently gained importance for understanding the composition of randomized query complexity. We characterize the quantum query complexity of Gap Majority and show a lower bound on noisy randomized query complexity (Ben-David and Blais, FOCS 2020) in terms of quantum query complexity. In addition, we study how large certificate complexity and block sensitivity can be relative to sensitivity (even up to constant factors) for symmetric functions. We show tight separations, i.e., we give upper bounds on the possible separations and construct functions achieving them.
Portnoy (2019) considered the problem of constructing an optimal confidence interval for the mean based on a single observation $\, X \sim {\cal{N}}(\mu , \, \sigma^2) \,$. Here we extend this result to obtain one-sample confidence intervals for $\, \sigma \,$, and to the cases of symmetric unimodal distributions and of distributions with compact support. Finally, we extend the multivariate result of Portnoy (2019) to allow a sample of size $\, m \,$ from a multivariate normal distribution, where $m$ may be less than the dimension.
The probabilistic method is a technique for proving combinatorial existence results by showing that a randomly chosen object has the desired properties with positive probability. A particularly powerful probabilistic tool is the Lov\'{a}sz Local Lemma (the LLL for short), which was introduced by Erd\H{o}s and Lov\'{a}sz in the mid-1970s. Here we develop a version of the LLL that can be used to prove the existence of continuous colorings. We then give several applications in Borel and topological dynamics.

* Seward and Tucker-Drob showed that every free Borel action $\Gamma \curvearrowright X$ of a countable group $\Gamma$ admits an equivariant Borel map $\pi \colon X \to Y$ to a free subshift $Y \subset 2^\Gamma$. We give a new simple proof of this result.

* We show that for a countable group $\Gamma$, $\mathrm{Free}(2^\Gamma)$ is weakly contained, in the sense of Elek, in every free continuous action of $\Gamma$ on a zero-dimensional Polish space. This fact is analogous to the theorem of Ab\'{e}rt and Weiss for probability-measure-preserving actions and has a number of consequences in continuous combinatorics. In particular, we deduce that a coloring problem admits a continuous solution on $\mathrm{Free}(2^\Gamma)$ if and only if it can be solved on finite subgraphs of the Cayley graph of $\Gamma$ by an efficient deterministic distributed algorithm (this fact was also proved independently, using different methods, by Seward). This establishes a formal correspondence between questions that have been studied independently in continuous combinatorics and in distributed computing.
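For orientation, the classical symmetric form of the LLL (not the continuous version developed in the paper) reduces to a purely numerical condition: if each bad event has probability at most $p$ and depends on at most $d$ other bad events, and $e \cdot p \cdot (d+1) \le 1$, then with positive probability no bad event occurs. A minimal sketch with hypothetical parameter values:

```python
import math

def symmetric_lll_applies(p, d):
    """Symmetric Lovasz Local Lemma condition: e * p * (d + 1) <= 1.

    p: upper bound on the probability of each bad event.
    d: upper bound on the number of other bad events any one depends on.
    """
    return math.e * p * (d + 1) <= 1

# Hypothetical parameters, e.g. bad events = monochromatic edges
# under a uniformly random coloring.
print(symmetric_lll_applies(p=0.01, d=20))  # e * 0.01 * 21 is about 0.57
print(symmetric_lll_applies(p=0.05, d=20))  # e * 0.05 * 21 is about 2.85
```

When the condition holds, a desired coloring exists; the paper's contribution is a variant whose conclusion yields a *continuous* coloring rather than a mere existence statement.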
A triangle in a hypergraph $\mathcal{H}$ is a set of three distinct edges $e, f, g\in\mathcal{H}$ and three distinct vertices $u, v, w\in V(\mathcal{H})$ such that $\{u, v\}\subseteq e$, $\{v, w\}\subseteq f$, $\{w, u\}\subseteq g$ and $\{u, v, w\}\cap e\cap f\cap g=\emptyset$. Johansson proved in 1996 that $\chi(G)=\mathcal{O}(\Delta/\log\Delta)$ for any triangle-free graph $G$ with maximum degree $\Delta$. Cooper and Mubayi later generalized Johansson's theorem to all rank $3$ hypergraphs. In this paper we provide a common generalization of both these results to all hypergraphs, showing that if $\mathcal{H}$ is a rank $k$, triangle-free hypergraph, then its list chromatic number satisfies \[ \chi_{\ell}(\mathcal{H})\leq \mathcal{O}\left(\max_{2\leq \ell \leq k} \left\{\left( \frac{\Delta_{\ell}}{\log \Delta_{\ell}} \right)^{\frac{1}{\ell-1}} \right\}\right), \] where $\Delta_{\ell}$ is the maximum $\ell$-degree of $\mathcal{H}$. The result is sharp apart from the constant. Moreover, our result implies, generalizes, and improves several earlier results on the chromatic number and independence number of hypergraphs, while its proof is based on a different approach from those of prior works on hypergraphs (and therefore provides alternative proofs of them). In particular, as an application, we establish a bound on the chromatic number of sparse hypergraphs in which each vertex is contained in few triangles, thus extending results of Alon, Krivelevich, and Sudakov and of Cooper and Mubayi from hypergraphs of rank 2 and 3, respectively, to all hypergraphs.
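To see the shape of the displayed bound, one can evaluate the maximum over $\ell$ directly. The sketch below uses hypothetical maximum $\ell$-degrees for a rank-3 hypergraph and omits the implicit constant of the theorem.

```python
import math

def chromatic_bound(deltas):
    """Evaluate max over l of (Delta_l / log Delta_l) ** (1 / (l - 1)),
    where deltas[l] is the maximum l-degree for l = 2, ..., k.
    The constant hidden in the big-O of the theorem is omitted."""
    return max((d / math.log(d)) ** (1.0 / (l - 1))
               for l, d in deltas.items())

# Hypothetical rank-3 hypergraph with Delta_2 = 100 and Delta_3 = 1000:
# the l = 2 term 100/log(100) dominates the l = 3 term sqrt(1000/log(1000)).
print(chromatic_bound({2: 100, 3: 1000}))
```

Which $\ell$ attains the maximum depends on how the $\ell$-degrees compare; for graphs ($k=2$) the expression collapses to Johansson's $\Delta/\log\Delta$ bound.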
Computational feasibility is a widespread concern that guides the framing and modeling of biological and artificial intelligence. The specification of cognitive system capacities is often shaped by unexamined intuitive assumptions about the search space and complexity of a subcomputation. However, a mistaken intuition can render such initial conceptualizations misleading with respect to which empirical questions appear relevant later on. As a case study, we undertake computational-level modeling and complexity analyses of segmentation - a widely hypothesized subcomputation that plays a requisite role in explanations of capacities across domains - to show how crucial it is to formally assess these assumptions. We mathematically prove two sets of results regarding hardness and search space size that may run counter to intuition, and we position their implications with respect to existing views on the subcapacity.
In this paper, I consider a fine-grained dichotomy for the Boolean counting constraint satisfaction problem (#CSP) under the counting version of the exponential time hypothesis (#ETH). Suppose $\mathscr{F}$ is a finite set of algebraic complex-valued functions defined on the Boolean domain. I prove that if $\mathscr{F}$ is a subset of either of two special function sets, then #CSP($\mathscr{F}$) is polynomial-time solvable; otherwise it cannot be computed in sub-exponential time unless #ETH fails. I also strengthen the result by proving that the same dichotomy holds for #CSP with bounded degree (every variable appears in at most a constant number of constraints), even for #R$_3$-CSP. An important preparation before proving the result is to show that pinning (using the two special unary functions $[1,0]$ and $[0,1]$ to reduce arity) preserves the sub-exponential lower bound of a Boolean #CSP problem. I discuss this issue by utilizing some common methods for proving #P-hardness of counting problems. The proof illustrates the internal correlation among these commonly used methods.
The ongoing NIST standardization process has shown that Proof of Knowledge (PoK) based signatures have become an important type of candidate post-quantum signature. In code-based cryptography, the original approach to PoK-based signatures is the Stern protocol, which allows one to prove knowledge of a small-weight vector solving a given instance of the Syndrome Decoding (SD) problem over F2. It features a soundness error of 2/3. This protocol was improved a few years later by V\'eron, who proposed a variant of the scheme based on the General Syndrome Decoding (GSD) problem, leading to better results in terms of communication. Later, the AGS protocol introduced a variant of the V\'eron protocol based on Quasi-Cyclic (QC) matrices. The AGS protocol achieves an asymptotic soundness error of 1/2 and a further improvement in terms of communication. In the present paper, we introduce the Quasi-Cyclic Stern PoK, an adaptation of the AGS scheme to the SD setting, together with several new optimizations for code-based PoK. Our main optimization on the signature size cannot be applied to GSD-based protocols such as AGS, which motivated the design of our new protocol. In addition, we provide a special soundness proof that is compatible with the use of the Fiat-Shamir transform for 5-round protocols. This approach is valid for our protocol, but also for the AGS protocol, which lacked such a proof. We compare our results with existing signatures, including the recent code-based signatures based on PoK leveraging the MPC-in-the-head paradigm. In practice, our new protocol is as fast as AGS while reducing its signature length by 20%. As a consequence, it constitutes an interesting trade-off between signature length and execution time for the design of a code-based signature relying only on the difficulty of the SD problem.
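The practical impact of the soundness error can be made concrete: a protocol with per-round soundness error $\varepsilon$ must be repeated $t \geq \lambda / \log_2(1/\varepsilon)$ times to reach $\lambda$-bit security, since a cheating prover succeeds with probability $\varepsilon^t$. The sketch below computes this generic repetition count for the 2/3 (Stern) and 1/2 (AGS-style) soundness errors; the concrete round counts and signature sizes in the paper depend on further protocol-specific optimizations not modeled here.

```python
import math

def rounds_needed(soundness_error, security_bits=128):
    """Smallest t with soundness_error ** t <= 2 ** (-security_bits)."""
    return math.ceil(security_bits / math.log2(1 / soundness_error))

print(rounds_needed(2 / 3))  # Stern-style soundness error 2/3
print(rounds_needed(1 / 2))  # AGS-style asymptotic soundness error 1/2
```

The gap between the two repetition counts (roughly a factor of $\log_2 3 - 1$ in the exponent) is one reason reducing the soundness error from 2/3 toward 1/2 translates into shorter signatures.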
In clinical trials, the identification of prognostic and predictive biomarkers is essential to precision medicine. Prognostic biomarkers can be useful for preventing the occurrence of the disease, and predictive biomarkers can be used to identify patients likely to benefit from the treatment. Previous research mainly focused on clinical characteristics, and the use of genomic data in this area has hardly been studied. A new method is required to simultaneously select prognostic and predictive biomarkers in high-dimensional genomic data where biomarkers are highly correlated. We propose a novel approach called PPLasso (Prognostic Predictive Lasso) that integrates prognostic and predictive effects into one statistical model. PPLasso also takes into account the correlations between biomarkers, which can impair biomarker selection accuracy. Our method consists of transforming the design matrix to remove the correlations between the biomarkers before applying the generalized Lasso. In a comprehensive numerical evaluation, we show that PPLasso outperforms the traditional Lasso approach for both prognostic and predictive biomarker identification in various scenarios. Finally, our method is applied to publicly available transcriptomic data from the clinical trial RV144. Our method is implemented in the PPLasso R package, which will soon be available from the Comprehensive R Archive Network (CRAN).
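The two-step idea of "decorrelate the design matrix, then run a Lasso" can be sketched as follows. This is a simplified illustration, not the PPLasso package or its API: the decorrelation here uses the inverse Cholesky factor of the empirical correlation matrix, a plain Lasso via coordinate descent stands in for the generalized Lasso, and the data are synthetic.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain Lasso via cyclic coordinate descent (stand-in for the
    generalized Lasso used by PPLasso)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding feature j.
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r, n * lam) / col_sq[j]
    return beta

def decorrelate(X):
    """Transform the columns of X to remove their correlations, via the
    inverse Cholesky factor of the empirical correlation matrix."""
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0)
    R = (Xs.T @ Xs) / X.shape[0]       # empirical correlation matrix
    L = np.linalg.cholesky(R)
    return np.linalg.solve(L, Xs.T).T  # = Xs @ inv(L).T

rng = np.random.default_rng(0)
n, p = 200, 10
# Correlated design: a shared latent factor across all "biomarkers".
z = rng.normal(size=(n, 1))
X = 0.8 * z + 0.6 * rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + 0.1 * rng.normal(size=n)

Xt = decorrelate(X)
beta = lasso_cd(Xt, y - y.mean(), lam=0.1)
print(beta.round(2))
```

After the transformation the columns of `Xt` are empirically uncorrelated, which is what makes the subsequent sparse selection more reliable; PPLasso additionally maps the selected coefficients back to the original (prognostic/predictive) parametrization, a step omitted in this sketch.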
We consider the task of learning the parameters of a {\em single} component of a mixture model when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than those of solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and improved computational complexity compared to existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.