We present the results of a study conducted to validate concepts and methods introduced in (Johansen and Fischer-Hübner, 2020. "Making GDPR Usable: A Model to Support Usability Evaluations of Privacy." In IFIP AICT 576, 275-291). Our interview respondents are experts working across fields of relevance to these concepts, including law and data protection/privacy, certifications and standardization, and usability (as studied in the field of Human-Computer Interaction). We study the experts' opinions about four new concepts, namely: (i) a definition of Usable Privacy; (ii) 30 Usable Privacy Goals identified as excerpts from the GDPR (European General Data Protection Regulation); (iii) a set of 25 corresponding Usable Privacy Criteria together with their multiple measurable sub-criteria; and (iv) the Usable Privacy Cube model, which brings all of these together with the EuroPriSe certification criteria in order to make explicit several aspects of certification processes, such as orderings of criteria, interactions between them, different stakeholder perspectives, and context of use/processing. The expert opinions are varied, example-rich, and forward-looking, yielding an impressive list of open problems for which the above four concepts can serve as a foundation for further developments. We employed a critical qualitative research approach, using theory triangulation to analyze data from three groups of experts, categorized as 'certifications', 'law', and 'usability', drawn from both industry and academia. The results of our analysis show agreement among the experts on the need to evaluate and measure the usability of privacy, both to allow data subjects to exercise their rights and to assess the degree to which data controllers comply with the data protection principles.
Bipartite secret sharing schemes have a bipartite access structure, in which the set of participants is divided into two parts and all participants in the same part play an equivalent role. Such a bipartite scheme can be described by a \emph{staircase}: the collection of its minimal points. The complexity of a scheme is the maximal share size relative to the secret size, and the $\kappa$-complexity of an access structure is the best lower bound provided by the entropy method. An access structure is $\kappa$-ideal if it has $\kappa$-complexity 1. Motivated by the abundance of open problems in this area, we obtain the following main results. First, we give a new characterization of $\kappa$-ideal multipartite access structures, which offers a simple approach to describing ideal bipartite and tripartite access structures. Second, we determine the $\kappa$-complexity for a range of bipartite access structures, including those determined by two points, staircases with equal widths and heights, and staircases with all heights 1. Third, we present matching linear schemes for some non-ideal cases, including staircases where all heights are 1 and all widths are equal. Finally, finding the Shannon complexity of a bipartite access structure can be considered a discrete submodular optimization problem. We define an intriguing continuous version which might give further insight into the large-scale behavior of these optimization problems.
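To make the quantities above concrete, here is a sketch of the standard information-theoretic definitions; the notation ($\Sigma$, $\Gamma$, $H$) is the conventional one from the secret sharing literature, not necessarily the paper's.

```latex
% For a scheme \Sigma realizing access structure \Gamma, with secret S and
% share S_i held by participant i (H denotes Shannon entropy), the
% complexity of the scheme is the worst-case relative share size:
\sigma(\Sigma) = \max_{i} \frac{H(S_i)}{H(S)}
% The complexity of the access structure is the infimum over all schemes
% realizing it:
\sigma(\Gamma) = \inf_{\Sigma \,\text{realizes}\, \Gamma} \sigma(\Sigma)
% \kappa(\Gamma) denotes the best lower bound on \sigma(\Gamma) derivable
% from the Shannon inequalities alone (the entropy method); the structure
% \Gamma is \kappa-ideal when \kappa(\Gamma) = 1.
```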
A myriad of approaches have been proposed to characterise the mesoscale structure of networks, most often as a partition based on patterns variously called communities, blocks, or clusters. Clearly, distinct methods designed to detect different types of patterns may provide a variety of answers about a network's mesoscale structure. Yet even multiple runs of a given method can sometimes yield diverse and conflicting results, producing entire landscapes of partitions that potentially include multiple (locally optimal) mesoscale explanations of the network. Such ambiguity motivates a closer look at the ability of these methods to find multiple qualitatively different 'ground truth' partitions in a network. Here, we propose the stochastic cross-block model (SCBM), a generative model which allows two distinct partitions to be built into the mesoscale structure of a single benchmark network. We demonstrate a use case of the benchmark model by appraising the power of stochastic block models (SBMs) to detect implicitly planted coexisting bi-community and core-periphery structures of different strengths. Given our model design and experimental set-up, we find that the ability to detect the two partitions individually varies by SBM variant and that coexistence of both partitions is recovered only in a very limited number of cases. Our findings suggest that in most instances only one structure, dominant in some way, can be detected, even in the presence of other partitions. They underline the need to consider entire landscapes of partitions when different competing explanations exist and motivate future research to advance partition coexistence detection methods. Our model also contributes to the field of benchmark networks more generally by enabling further exploration of the ability of new and existing methods to detect ambiguity in the mesoscale structure of networks.
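The paper defines the SCBM precisely; as a rough illustration of what "two partitions built into one network" means, the following minimal sketch plants a bi-community and a core-periphery labeling in a single random graph. The function and parameter names are hypothetical and the parameterization is deliberately simplistic, not the SCBM's.

```python
import itertools
import random

def sample_two_partition_network(n, p_comm, p_core, base_p=0.05, seed=0):
    """Plant a bi-community and a core-periphery partition in one graph by
    combining two block-level edge propensities (illustrative only)."""
    rng = random.Random(seed)
    community = [i % 2 for i in range(n)]              # partition 1: two communities
    core = [1 if i < n // 4 else 0 for i in range(n)]  # partition 2: core vs. periphery
    edges = []
    for u, v in itertools.combinations(range(n), 2):
        p = base_p
        if community[u] == community[v]:
            p += p_comm    # within-community edges are more likely
        if core[u] or core[v]:
            p += p_core    # edges touching the core are more likely
        if rng.random() < min(p, 1.0):
            edges.append((u, v))
    return edges, community, core

edges, community, core = sample_two_partition_network(200, p_comm=0.10, p_core=0.15)
print(f"sampled {len(edges)} edges")
```

Varying `p_comm` relative to `p_core` mimics the "different strengths" of the two planted structures mentioned above.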
Recently, order-preserving pattern (OPP) mining, a new sequential pattern mining method, has been proposed to mine frequent relative orders in a time series. Although frequent relative orders can be used as features to classify a time series, the mined patterns do not reflect the differences between two classes of time series well. To effectively discover such differences, this paper addresses top-k contrast OPP (COPP) mining and proposes a COPP-Miner algorithm that discovers the top-k contrast patterns as features for time series classification, avoiding the problem of improper parameter setting. COPP-Miner is composed of three parts: extreme point extraction to reduce the length of the original time series, forward mining, and reverse mining to discover COPPs. Forward mining contains three steps: a group pattern fusion strategy to generate candidate patterns, a support rate calculation method to efficiently compute the support of a pattern, and two pruning strategies to further prune candidate patterns. Reverse mining follows the same process as forward mining but uses only one pruning strategy to prune candidate patterns. Experimental results validate the efficiency of the proposed algorithm and show that top-k COPPs can be used as features to obtain better classification performance.
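COPP-Miner's candidate generation and pruning are specific to the paper, but two basic ingredients it builds on, extreme point extraction and order-preserving (relative-order) support counting, can be sketched as follows. All names here are illustrative, not the paper's interfaces.

```python
def extreme_points(series):
    """Keep local maxima/minima plus the endpoints, shortening the series;
    a simple version of the preprocessing step described above."""
    out = [series[0]]
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            out.append(cur)
    out.append(series[-1])
    return out

def relative_order(window):
    """Rank pattern of a window, e.g. [3.1, 1.2, 2.5] -> (2, 0, 1)."""
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return tuple(ranks)

def support(series, pattern):
    """Number of windows whose relative order equals the given pattern."""
    m = len(pattern)
    return sum(relative_order(series[i:i + m]) == pattern
               for i in range(len(series) - m + 1))

s = extreme_points([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 1.5])
print(support(s, (0, 2, 1)))  # occurrences of the "low, high, middle" shape
```

A contrast pattern would then be one whose support differs strongly between the two classes of time series.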
We extend the recently introduced setting of coherent differentiation to account not only for differentiation but also for Taylor expansion in categories which are not necessarily (left-)additive. The main idea consists in extending summability into an infinitary functor which, intuitively, maps any object to the object of its countable summable families. This functor is endowed with a canonical structure of bimonad. In a linear logical categorical setting, Taylor expansion is then axiomatized as a distributive law between this summability functor and the resource comonad (aka the exponential), allowing us to extend the summability functor into a bimonad on the Kleisli category of the resource comonad: this extended functor computes the Taylor expansion of the (nonlinear) morphisms of the Kleisli category. We also show how this categorical axiomatization of Taylor expansion can be generalized to arbitrary cartesian categories, leading to a general theory of Taylor expansion formally similar to that of cartesian differential categories, although it does not require the underlying cartesian category to be left additive. We provide several examples of concrete categories which arise in denotational semantics and feature such analytic structures.
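As a point of reference, the classical formula that this categorical machinery abstracts is the Taylor expansion of an analytic map at $0$; the notation below is the classical one, not the paper's.

```latex
f(x) \;=\; \sum_{n=0}^{\infty} \frac{1}{n!}\, D^n\! f(0)\,(x, \dots, x)
% where D^n f(0) is the n-th derivative of f at 0, an n-linear map applied
% to n copies of x. The summability functor is what makes such countable
% sums of morphisms meaningful without assuming a left-additive structure.
```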
In recent research, non-orthogonal artificial noise (NORAN) has been proposed as an alternative to orthogonal artificial noise (AN). However, NORAN introduces additional noise into the channel, which reduces the capacity of the legitimate channel (LC). At the same time, selecting a NORAN design with ideal security performance from a large number of design options is also a challenging problem. To address these two issues, a novel NORAN based on a pilot information codebook is proposed in this letter. The codebook associates different suboptimal NORANs with pilot information as the key under different channel state information (CSI). The receiver queries the codebook using the pilot information to obtain the NORAN that the transmitter will transmit next, so that it can cancel the NORAN when receiving information. Therefore, a NORAN based on a pilot information codebook can improve the secrecy capacity (SC) of the communication system by directly using suboptimal NORAN design schemes without increasing the noise in the LC. Numerical simulations and analyses show that the introduction of NORAN with this novel codebook-based design significantly enhances the security and improves the SC of the communication system.
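A toy sketch of the codebook idea: both ends share a table keyed by pilot information, so the legitimate receiver can subtract the NORAN exactly while an eavesdropper without the key cannot. All names and the random placeholder waveforms are hypothetical; the actual scheme designs the codebook entries per channel state.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical pre-shared codebook: pilot key -> suboptimal NORAN waveform.
# Random complex placeholders stand in for CSI-dependent designs.
CODEBOOK = {k: rng.standard_normal(8) + 1j * rng.standard_normal(8)
            for k in range(4)}

def transmit(symbols, pilot_key):
    """Transmitter superimposes the codebook's NORAN for the current key."""
    return symbols + CODEBOOK[pilot_key]

def receive(observed, pilot_key):
    """Legitimate receiver looks up the same entry and cancels it exactly,
    so no extra noise remains in the legitimate channel."""
    return observed - CODEBOOK[pilot_key]

x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
y = transmit(x, pilot_key=2)
print(np.allclose(receive(y, pilot_key=2), x))  # True: NORAN fully removed
```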
This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches into three categories: ($i$) VLP for image-text tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding; ($ii$) VLP for core computer vision tasks, such as (open-set) image classification, object detection, and segmentation; and ($iii$) VLP for video-text tasks, such as video captioning, video-text retrieval, and video question answering. For each category, we present a comprehensive review of state-of-the-art methods, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies. In addition, for each category, we discuss advanced topics being actively explored in the research community, such as big foundation models, unified modeling, in-context few-shot learning, knowledge, robustness, and computer vision in the wild, to name a few.
In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) to a new era. Substantial work has shown that such pre-trained models are beneficial for downstream uni-modal tasks and avoid training a new model from scratch. So can such pre-trained models be applied to multi-modal tasks? Researchers have explored this problem and made significant progress. This paper surveys recent advances and new frontiers in vision-language pre-training (VLP), including image-text and video-text pre-training. To give readers a better overall grasp of VLP, we first review its recent advances from five aspects: feature extraction, model architecture, pre-training objectives, pre-training datasets, and downstream tasks. Then, we summarize the specific VLP models in detail. Finally, we discuss the new frontiers in VLP. To the best of our knowledge, this is the first survey on VLP. We hope that this survey can shed light on future research in the VLP field.
We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations). Our first contribution is to formulate geometric perception as an optimization problem that jointly optimizes the feature descriptor and the geometric models given a large corpus of visual measurements (e.g., images, point clouds). Under this optimization formulation, we show that two important streams of research in vision, namely robust model fitting and deep feature learning, correspond to optimizing one block of the unknown variables while fixing the other block. This analysis naturally leads to our second contribution -- the SGP algorithm that performs alternating minimization to solve the joint optimization. SGP iteratively executes two meta-algorithms: a teacher that performs robust model fitting given learned features to generate geometric pseudo-labels, and a student that performs deep feature learning under noisy supervision of the pseudo-labels. As a third contribution, we apply SGP to two perception problems on large-scale real datasets, namely relative camera pose estimation on MegaDepth and point cloud registration on 3DMatch. We demonstrate that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
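The alternation at the heart of SGP can be summarized in a short skeleton; `fit_robust_model` and `train_descriptor` are placeholders for, e.g., RANSAC-style fitting and descriptor network training, not the paper's actual interfaces.

```python
def sgp(measurement_pairs, features, fit_robust_model, train_descriptor, rounds=5):
    """Alternating-minimization skeleton of the SGP loop described above.
    The two callables stand in for the teacher and student meta-algorithms."""
    for _ in range(rounds):
        # Teacher step: fix the descriptor, robustly estimate a geometric
        # model (e.g., relative pose, rigid transform) per measurement pair.
        pseudo_labels = [fit_robust_model(pair, features)
                         for pair in measurement_pairs]
        # Student step: fix the pseudo-labels, retrain the descriptor under
        # this noisy supervision.
        features = train_descriptor(measurement_pairs, pseudo_labels)
    return features
```

Each step decreases (or at least does not increase) the shared joint objective, which is what makes the alternation a block-coordinate minimization.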
We present a simple self-training method that achieves 87.4% top-1 accuracy on ImageNet, which is 1.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 16.6% to 74.2%, reduces ImageNet-C mean corruption error from 45.7 to 31.2, and reduces ImageNet-P mean flip rate from 27.8 to 16.1. To achieve this result, we first train an EfficientNet model on labeled ImageNet images and use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo labeled images. We iterate this process by putting back the student as the teacher. During the generation of the pseudo labels, the teacher is not noised so that the pseudo labels are as good as possible. But during the learning of the student, we inject noise such as data augmentation, dropout, and stochastic depth into the student so that the noised student is forced to learn harder from the pseudo labels.
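The iterated loop described above is easy to state as a skeleton; `train`, `predict`, and `add_noise` are placeholders for EfficientNet training, clean-teacher inference, and the noise sources (augmentation, dropout, stochastic depth) respectively.

```python
def noisy_student(labeled, unlabeled, train, predict, add_noise, iterations=3):
    """Skeleton of the self-training loop: clean teacher, noised student,
    student promoted to teacher each round."""
    teacher = train(labeled, noise=None)  # initial teacher on labeled data only
    for _ in range(iterations):
        # Pseudo-label the unlabeled corpus with an un-noised teacher,
        # keeping the pseudo labels as accurate as possible.
        pseudo = [(x, predict(teacher, x)) for x in unlabeled]
        # Train a (typically larger) student on labeled + pseudo-labeled
        # data, with noise injected to force harder learning.
        student = train(labeled + pseudo, noise=add_noise)
        teacher = student  # the student becomes the next round's teacher
    return teacher
```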
We study the problem of embedding-based entity alignment between knowledge graphs (KGs). Previous works mainly focus on the relational structure of entities. Some further incorporate another type of features, such as attributes, for refinement. However, a vast number of entity features remain unexplored or are not treated equally, which impairs the accuracy and robustness of embedding-based entity alignment. In this paper, we propose a novel framework that unifies multiple views of entities to learn embeddings for entity alignment. Specifically, we embed entities based on the views of entity names, relations, and attributes, with several combination strategies. Furthermore, we design some cross-KG inference methods to enhance the alignment between two KGs. Our experiments on real-world datasets show that the proposed framework significantly outperforms the state-of-the-art embedding-based entity alignment methods. The selected views, cross-KG inference, and combination strategies all contribute to the performance improvement.
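As an illustration of one possible combination strategy (the paper evaluates several), the sketch below averages L2-normalized view-specific embeddings and aligns entities by nearest cosine neighbor; the names and the choice of weighted averaging are ours, not the paper's.

```python
import numpy as np

def combine_views(name_emb, rel_emb, attr_emb, weights=(1.0, 1.0, 1.0)):
    """Weighted average of L2-normalized view embeddings, one simple way to
    unify the name, relation, and attribute views (illustrative only)."""
    views = [name_emb, rel_emb, attr_emb]
    normed = [v / np.linalg.norm(v, axis=1, keepdims=True) for v in views]
    return sum(w * v for w, v in zip(weights, normed)) / sum(weights)

def align(emb_kg1, emb_kg2):
    """Greedy alignment: match each KG1 entity to its nearest KG2 entity
    by cosine similarity of the combined embeddings."""
    a = emb_kg1 / np.linalg.norm(emb_kg1, axis=1, keepdims=True)
    b = emb_kg2 / np.linalg.norm(emb_kg2, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)

# Example with random placeholder embeddings (5 and 7 entities, dim 16):
e1 = combine_views(*(np.random.rand(5, 16) for _ in range(3)))
e2 = combine_views(*(np.random.rand(7, 16) for _ in range(3)))
print(align(e1, e2))  # index of each KG1 entity's best match in KG2
```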