Most online communications rely on DNS to map domain names to their hosting IP address(es). Previous work has shown that DNS-based network interference is widespread due to the unencrypted and unauthenticated nature of the original DNS protocol. In addition to DNS, accessed domain names can also be monitored by on-path observers during the TLS handshake when the SNI extension is used. These lingering issues with exposed plaintext domain names have led to the development of a new generation of protocols that keep accessed domain names hidden. DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) hide the domain names of DNS queries, while Encrypted Server Name Indication (ESNI) encrypts the domain name in the SNI extension. We present DNEye, a measurement system built on top of a network of distributed vantage points, which we used to study the accessibility of DoT/DoH and ESNI, and to investigate whether these protocols are tampered with by network providers (e.g., for censorship). Moreover, we evaluate the efficacy of these protocols in circumventing network interference when accessing content blocked by traditional DNS manipulation. We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. At the same time, we discover that domain name encryption can help with unblocking more than 55% of censored domains in China and more than 95% in other countries where DNS-based filtering is heavily employed.
After a long period of lagging IPv6 network development, China is now accelerating IPv6 deployment. In this scenario, traffic and network structure are shifting substantially. However, amid the apparent long-term prosperity, the problems behind such bursts of traffic and performance improvements during accelerated deployment remain poorly understood. IPv6 development in some regions will face similar challenges in the future. To help address this problem, we propose a new measurement framework and conduct a 5-month passive measurement of the IPv6 network during the accelerated deployment in China. We combine 6 global-scale datasets to characterize the normal status of the IPv6 network, which we contrast with the accelerating status characterized by the passive traffic. Moreover, we compare against the traffic during World IPv6 Day 2011 and World IPv6 Launch 2012 to discuss the common nature of accelerated deployment. The results indicate that accelerated IPv6 deployment is often accompanied by an unbalanced network status. It exposes unresolved security issues, including challenges to user privacy and inappropriate access methods. Based on this investigation, we point out directions for IPv6 development after accelerated deployment.
Continuous-time measurements are instrumental for a multitude of tasks in quantum engineering and quantum control, including the estimation of dynamical parameters of open quantum systems monitored through the environment. However, such measurements do not extract the maximum amount of information available in the output state, so finding alternative optimal measurement strategies is a major open problem. In this paper we solve this problem in the setting of discrete-time input-output quantum Markov chains. We present an efficient algorithm for optimal estimation of one-dimensional dynamical parameters which consists of an iterative procedure for updating a `measurement filter' operator and determining successive measurement bases for the output units. A key ingredient of the scheme is the use of a coherent quantum absorber as a way to post-process the output after the interaction with the system. This is designed adaptively such that the joint system and absorber stationary state is pure at a reference parameter value. The scheme offers an exciting prospect for optimal continuous-time adaptive measurements, but more work is needed to find realistic practical implementations.
The recent success of distributed word representations has led to an increased interest in analyzing the properties of their spatial distribution. Several studies have suggested that contextualized word embedding models do not isotropically project tokens into vector space. However, current methods designed to measure isotropy, such as average random cosine similarity and the partition score, have not been thoroughly analyzed and are not appropriate for measuring isotropy. We propose IsoScore: a novel tool that quantifies the degree to which a point cloud uniformly utilizes the ambient vector space. Using rigorously designed tests, we demonstrate that IsoScore is the only tool available in the literature that accurately measures how uniformly distributed variance is across dimensions in vector space. Additionally, we use IsoScore to challenge a number of recent conclusions in the NLP literature that have been derived using brittle metrics of isotropy. We caution future studies against using existing tools to measure isotropy in contextualized embedding space, as the resulting conclusions will be misleading or altogether inaccurate.
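As an illustration of one of the existing metrics named above, the following is a minimal sketch of average random cosine similarity, assuming it is computed as the mean cosine similarity over randomly drawn pairs of points. The function names, the sample size, and the toy point clouds are hypothetical and not taken from the paper. The example shows one way such a score can depend on where a cloud sits relative to the origin rather than on how its variance is spread across dimensions; whether this is the failure mode the authors have in mind is an assumption.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_random_cosine_similarity(points, num_pairs=1000, seed=0):
    """Mean cosine similarity over randomly sampled pairs of distinct points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_pairs):
        u, v = rng.sample(points, 2)  # two distinct points from the cloud
        total += cosine_similarity(u, v)
    return total / num_pairs


# Hypothetical toy data: the same 2-D Gaussian cloud, once centred at the
# origin and once translated away from it. Translation does not change how
# the variance is spread across the two axes.
data_rng = random.Random(1)
centered = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
shifted = [[x + 10.0, y + 10.0] for x, y in centered]

score_centered = average_random_cosine_similarity(centered)
score_shifted = average_random_cosine_similarity(shifted)
```

In this toy setup the centred cloud scores close to zero while the translated copy scores close to one, even though both clouds have identical covariance.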
This paper makes the first attempt to apply the newly developed upwind generalized finite difference method (GFDM) to the meshless solution of two-phase porous flow equations. In the presented method, a node cloud is used to flexibly discretize the computational domain, avoiding complicated mesh generation. By combining moving least squares approximation with local Taylor expansion, the spatial derivatives of oil-phase pressure at a node are approximated by generalized difference operators in the local influence domain of the node. By introducing the first-order upwind scheme for phase relative permeability and incorporating the discrete boundary conditions, fully implicit GFDM-based nonlinear discrete equations of immiscible two-phase porous flow are obtained. These equations are solved by a nonlinear solver based on Newton iteration with automatic differentiation, which avoids the additional computational cost and possible computational instability caused by a sequentially coupled scheme. Two numerical examples are implemented to test the computational performance of the presented method. A detailed error analysis identifies two sources of calculation error and gives a rough study of the convergence order, finding that the low-order error of GFDM makes its convergence order lower than that of FDM when node spacing is small. The analysis also points out that the symmetry or uniformity of the node collocation in the node influence domain significantly affects the accuracy of the generalized difference operators, and that the radius of the node influence domain should be small to achieve high calculation accuracy, which is a significant difference between the studied hyperbolic two-phase porous flow problem and elliptic problems when GFDM is applied.
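The first-order upwind treatment of relative permeability mentioned above can be illustrated with a minimal sketch. It assumes a generic two-node flux form (a connection coefficient times mobility times potential difference) rather than the paper's GFDM discretization; the function names, the `transmissibility` coefficient, and all numbers are hypothetical.

```python
def upwind_value(value_i, value_j, potential_i, potential_j):
    """Return the property of the upstream node (flow runs from high to low potential)."""
    return value_i if potential_i >= potential_j else value_j


def phase_flux(transmissibility, kr_i, kr_j, viscosity, potential_i, potential_j):
    """Flux of one phase from node i to node j using upwinded relative permeability."""
    kr_upwind = upwind_value(kr_i, kr_j, potential_i, potential_j)
    return transmissibility * kr_upwind / viscosity * (potential_i - potential_j)


# Hypothetical numbers: node i has the higher potential, so its kr is used.
flux_forward = phase_flux(transmissibility=2.0, kr_i=0.8, kr_j=0.2,
                          viscosity=1.0, potential_i=10.0, potential_j=5.0)

# Reversing the potentials makes node j the upstream node instead.
flux_backward = phase_flux(transmissibility=2.0, kr_i=0.8, kr_j=0.2,
                           viscosity=1.0, potential_i=5.0, potential_j=10.0)
```

Taking the relative permeability from the upstream node, instead of averaging the two nodes, is the standard first-order choice for this kind of saturation-dependent coefficient.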
Crowd-sourcing is a powerful solution for finding correct answers to expensive and unanswered queries in databases, including those with uncertain and incomplete data. Using human workers to process these expensive queries has helped to provide accurate results by utilising the data available in the crowd. Crowd-sourcing database systems (CSDBs) combine the knowledge of the crowd with a relational database, using some variant of the latter with minor changes. This paper surveys the leading studies conducted in the area of query processing with regard to both traditional and preference queries in CSDBs. The focus of this work is on highlighting the strengths and weaknesses of each approach. A detailed discussion of current and future research trends associated with query processing in the area of CSDBs is also presented.
Attestation is a fundamental building block for establishing trust in software systems. When used in conjunction with trusted execution environments, it guarantees that genuine code is executed even when facing strong attackers, paving the way for adoption in several sensitive application domains. This paper reviews existing remote attestation principles and compares the functionalities of current trusted execution environments such as Intel SGX, Arm TrustZone and AMD SEV, as well as emerging RISC-V solutions.
In this paper we study the finite sample and asymptotic properties of various weighting estimators of the local average treatment effect (LATE), several of which are based on Abadie (2003)'s kappa theorem. Our framework presumes a binary endogenous explanatory variable ("treatment") and a binary instrumental variable, which may only be valid after conditioning on additional covariates. We argue that one of the Abadie estimators, which we show is weight normalized, is likely to dominate the others in many contexts. A notable exception is in settings with one-sided noncompliance, where certain unnormalized estimators have the advantage of being based on a denominator that is bounded away from zero. We use a simulation study and three empirical applications to illustrate our findings. In applications to causal effects of college education using the college proximity instrument (Card, 1995) and causal effects of childbearing using the sibling sex composition instrument (Angrist and Evans, 1998), the unnormalized estimates are clearly unreasonable, with "incorrect" signs, magnitudes, or both. Overall, our results suggest that (i) the relative performance of different kappa weighting estimators varies with features of the data-generating process; and that (ii) the normalized version of Tan (2006)'s estimator may be an attractive alternative in many contexts. Applied researchers with access to a binary instrumental variable should also consider covariate balancing or doubly robust estimators of the LATE.
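To make the distinction between unnormalized and normalized weighting concrete, the following is a minimal sketch of one common inverse-probability-weighting form of a LATE estimator and its weight-normalized counterpart. It assumes the instrument propensity score `p` (the probability that the instrument equals one given covariates) is already estimated; whether these two functions coincide with the specific estimators compared in the paper is an assumption, and all names and numbers are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def _weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def _ipw_weights(z, p):
    """Inverse-probability weights for the Z = 1 and Z = 0 groups."""
    w1 = [zi / pi for zi, pi in zip(z, p)]
    w0 = [(1 - zi) / (1 - pi) for zi, pi in zip(z, p)]
    return w1, w0


def late_unnormalized(y, d, z, p):
    """Ratio of weighted differences; weighted sums are divided by the sample size."""
    w1, w0 = _ipw_weights(z, p)
    num = _mean([yi * w for yi, w in zip(y, w1)]) - _mean([yi * w for yi, w in zip(y, w0)])
    den = _mean([di * w for di, w in zip(d, w1)]) - _mean([di * w for di, w in zip(d, w0)])
    return num / den


def late_normalized(y, d, z, p):
    """Same ratio, but each weighted sum is divided by the sum of its own weights."""
    w1, w0 = _ipw_weights(z, p)
    num = _weighted_mean(y, w1) - _weighted_mean(y, w0)
    den = _weighted_mean(d, w1) - _weighted_mean(d, w0)
    return num / den


# Hypothetical toy data: outcome y, treatment d, instrument z, propensity p.
y = [3.0, 2.0, 1.0, 0.5]
d = [1, 1, 0, 0]
z = [1, 1, 0, 0]
p = [0.8, 0.5, 0.5, 0.5]
y_shifted = [value + 10.0 for value in y]  # add a constant to every outcome
```

Adding a constant to every outcome leaves the normalized estimate unchanged, because each group's weights sum to one after normalization. The unnormalized estimate changes whenever the two groups' average weights differ, as they do in this toy data.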
Properly handling missing data is a fundamental challenge in recommendation. Most present works perform negative sampling from unobserved data to supply the training of recommender models with negative signals. Nevertheless, existing negative sampling strategies, whether static or adaptive, are insufficient to yield high-quality negative samples, i.e., samples that are both informative to model training and reflective of users' real needs. In this work, we hypothesize that an item knowledge graph (KG), which provides rich relations among items and KG entities, could be useful for inferring informative and factual negative samples. Towards this end, we develop a new negative sampling model, Knowledge Graph Policy Network (KGPolicy), which works as a reinforcement learning agent to explore high-quality negatives. Specifically, by conducting our designed exploration operations, it navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender. We test a matrix factorization (MF) model equipped with KGPolicy, and it achieves significant improvements over both state-of-the-art sampling methods such as DNS and IRGAN and KG-enhanced recommender models such as KGAT. Further analyses from different angles provide insights into knowledge-aware sampling. We release the code and datasets at //github.com/xiangwang1223/kgpolicy.
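For context, the static baseline that such work improves upon can be sketched as uniform negative sampling from a user's unobserved items. This is not KGPolicy and uses no knowledge graph; it only illustrates what sampling a negative from unobserved data means. The function name, item ids, and interaction set are hypothetical.

```python
import random


def sample_uniform_negative(user_positives, num_items, rng):
    """Draw one item id uniformly from the items the user has not interacted with."""
    candidates = [item for item in range(num_items) if item not in user_positives]
    if not candidates:
        raise ValueError("user has interacted with every item")
    return rng.choice(candidates)


# Hypothetical user who interacted with items 0, 3 and 4 out of 6 items.
rng = random.Random(0)
positives = {0, 3, 4}
negatives = [sample_uniform_negative(positives, num_items=6, rng=rng) for _ in range(20)]
```

Every unobserved item is equally likely here, regardless of how informative it would be for training, which is the limitation that adaptive and knowledge-aware samplers aim to address.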
While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by huge margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
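To show how shallow a metric like PSNR is, the following is a minimal sketch of the standard definition, 10 log10(MAX^2 / MSE), applied to two images flattened into equal-length lists of pixel intensities. The function name, the flattened-list representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "images".
reference = [10, 20, 30, 40]
distorted = [12, 18, 33, 40]
score = psnr(reference, distorted)
```

The score depends only on per-pixel squared differences, so two distortions with the same mean squared error receive the same PSNR no matter how different they look to a person.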