Scene flow estimation determines a scene's 3D motion field by predicting the motion of points in the scene, a capability that is particularly useful for autonomous driving tasks. Many networks that take large-scale point clouds as input rely on voxelization to create a pseudo-image and thereby run in real time. However, the voxelization process often causes the loss of point-specific features, and recovering those features for scene flow estimation is challenging. Our paper introduces DeFlow, which transitions from voxel-based features to point features using Gated Recurrent Unit (GRU) refinement. To further improve scene flow estimation, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task show that DeFlow achieves state-of-the-art results on large-scale point cloud data, outperforming existing networks in both accuracy and efficiency. The code is open-sourced at //github.com/KTH-RPL/deflow.
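To make the voxel-to-point transition concrete, here is a minimal PyTorch sketch of the idea: a GRU cell iteratively refines per-point flow from a coarse voxel feature fused with a point-specific feature. All layer sizes and names are invented for illustration; this is not the authors' DeFlow architecture.

```python
import torch
import torch.nn as nn

class VoxelToPointGRURefiner(nn.Module):
    """Sketch: recover point-level flow by fusing coarse voxel features
    with point-specific features and refining iteratively with a GRU."""
    def __init__(self, voxel_dim=64, point_dim=32, hidden_dim=64, iters=4):
        super().__init__()
        self.iters = iters
        self.fuse = nn.Linear(voxel_dim + point_dim, hidden_dim)
        self.gru = nn.GRUCell(input_size=hidden_dim, hidden_size=hidden_dim)
        self.flow_head = nn.Linear(hidden_dim, 3)  # per-point 3D flow update

    def forward(self, voxel_feat, point_feat):
        # voxel_feat: (N, voxel_dim) features gathered from each point's voxel
        # point_feat: (N, point_dim) point-specific features lost by voxelization
        x = torch.tanh(self.fuse(torch.cat([voxel_feat, point_feat], dim=-1)))
        h = torch.zeros_like(x)
        flow = x.new_zeros(x.shape[0], 3)
        for _ in range(self.iters):
            h = self.gru(x, h)                # GRU refinement step
            flow = flow + self.flow_head(h)   # accumulate residual flow updates
        return flow

refiner = VoxelToPointGRURefiner()
print(refiner(torch.randn(1000, 64), torch.randn(1000, 32)).shape)  # (1000, 3)
```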
Although there is mounting empirical evidence for the rise of affective polarization, few mechanistic models can explain its emergence at the population level. How such a phenomenon can emerge from a population's divergent opinions on an ideological issue remains an open question. In this paper, we establish that human normativity, that is, individual expression of normative opinions based on beliefs about the population, can lead to population-level polarization when ideological institutions distort those beliefs in pursuit of their objective of pushing expressed opinion toward one extreme. Using a game-theoretic model, we establish that individuals with more extreme opinions will employ more extreme rhetoric and hold greater misperceptions about their outgroup members. Our model also shows that when social recommendation systems mediate institutional signals, distinct institutional communities can form, each with its own structure and characteristics. Using the model, we identify practical strategies platforms can implement, such as reducing exposure to signals from ideological institutions and tailoring content moderation, both of which can mitigate affective polarization within their purview.
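As a toy illustration of the mechanism (not the paper's actual game-theoretic model), the following simulation lets each agent express an opinion that trades off fidelity to its private opinion against conformity with its institutionally distorted belief about the population mean. The distortion rule and all parameters here are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
opinions = rng.normal(0.0, 1.0, n)   # private opinions on an ideological axis

# Hypothetical distortion: the institution skews each agent's belief about
# the population mean toward that agent's own extreme.
distortion = 0.8
perceived_mean = distortion * opinions

# Expressed opinion = best response minimizing
#   (e - opinion)^2 + c * (e - perceived_mean)^2,
# whose closed-form solution is a weighted average of the two targets.
conformity = 0.5
expressed = (opinions + conformity * perceived_mean) / (1 + conformity)

# More extreme private opinions yield more extreme expressed rhetoric.
print(np.corrcoef(np.abs(opinions), np.abs(expressed))[0, 1])
```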
The fractal dimension of a surface quantifies its degree of roughness. However, little effort has been devoted to computing the fractal dimension of surfaces derived from the precisely known atomic coordinates produced by computational biomolecular and nanomaterial studies. This work proposes methods to estimate the fractal dimension of the surface of any 3D object composed of spheres by representing the surface as either a voxelized point cloud or a mathematically exact surface and computing its box-counting dimension. Sphractal is published as a Python package that provides these functionalities, and its utility is demonstrated on a set of simulated palladium nanoparticle data.
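A compact sketch of the box-counting estimate on a voxelized representation, assuming a binary occupancy grid as input; the grid construction and the exact-surface variant implemented in Sphractal are more involved.

```python
import numpy as np

def box_counting_dimension(occupancy, scales=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a binary 3D occupancy grid:
    count occupied boxes N(s) at each box size s, then fit
    log N(s) ~ -D * log s."""
    counts = []
    for s in scales:
        # Trim the grid so it tiles evenly into s x s x s boxes.
        t = occupancy[: occupancy.shape[0] // s * s,
                      : occupancy.shape[1] // s * s,
                      : occupancy.shape[2] // s * s]
        boxes = t.reshape(t.shape[0] // s, s, t.shape[1] // s, s,
                          t.shape[2] // s, s)
        counts.append(boxes.any(axis=(1, 3, 5)).sum())
    # Slope of log-counts vs. log-scales gives -D.
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return -slope

# Sanity check: a solid block has dimension close to 3; a voxelized
# smooth surface would come out near 2, and rough surfaces in between.
grid = np.ones((64, 64, 64), dtype=bool)
print(box_counting_dimension(grid))
```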
Instead of testing solely a precise hypothesis, it is often useful to enlarge it with alternatives deemed to differ from it only negligibly. For instance, in a bioequivalence study one might consider the hypothesis that the concentration of an ingredient is exactly the same in two drugs. In such a context, it may be more relevant to test the enlarged hypothesis that the difference in concentration between the drugs is of no practical significance. While this concept is not alien to Bayesian statistics, applications remain confined to parametric settings, and strategies for effectively harnessing experts' intuitions are often scarce or nonexistent. To address both issues, we introduce PROTEST, an accessible nonparametric testing framework that integrates seamlessly with Markov Chain Monte Carlo (MCMC) methods. We develop enlarged versions of the model adherence, goodness-of-fit, quantile, and two-sample tests. To demonstrate how PROTEST operates, we present examples and simulation studies (such as testing link functions in a binary regression setting and comparing the performance of PROTEST with the PTtest of Holmes et al., 2015), along with an application to data on neuron spikes. Furthermore, we address the crucial issue of selecting the threshold, which controls how much a hypothesis is to be enlarged, even when intuitions are limited or difficult to quantify.
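The core idea of testing an enlarged hypothesis is easy to sketch from posterior draws: rather than evaluating the precise hypothesis that two parameters are equal, estimate the posterior probability that their difference falls within a tolerance. A minimal illustration with stand-in MCMC samples follows (the PROTEST tests themselves are nonparametric and considerably richer).

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for MCMC draws of the mean concentration in each drug.
theta1 = rng.normal(10.00, 0.05, 5000)
theta2 = rng.normal(10.02, 0.05, 5000)

eps = 0.1  # threshold: differences below eps are practically negligible
posterior_prob = np.mean(np.abs(theta1 - theta2) < eps)
print(f"P(|theta1 - theta2| < {eps} | data) ~ {posterior_prob:.3f}")
# Accept the enlarged hypothesis when this probability is high enough.
```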
Augmented Reality (AR) provides a safe and low-cost option for hazardous safety training and allows the visualization of aspects that may otherwise be invisible, such as radiation. Effectively communicating such threats visually in the environment around the user is not straightforward. This work describes visually encoding radiation using the spatial-awareness mesh of an AR head-mounted display. We leverage the AR device's GPU to develop a real-time solution that accumulates multiple dynamic sources, uses stencils to prevent the environment from becoming oversaturated by the visualization, and supports explicitly encoding direction in the visualization. We perform a user study (25 participants) of different visualizations and gather user feedback. The results show complex interactions: while no visual representation was statistically superior or inferior, user opinions varied widely. We also discuss the evaluation approaches and provide recommendations.
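A sketch of the accumulation step, assuming a simple inverse-square falloff and hypothetical source and vertex arrays; the actual system evaluates this on the AR device's GPU against the spatial-awareness mesh.

```python
import numpy as np

def accumulate_intensity(vertices, sources, strengths, eps=1e-6):
    """Per-vertex radiation intensity from multiple dynamic point sources,
    using an inverse-square falloff (illustrative model only).
    vertices: (V, 3) mesh vertices; sources: (S, 3); strengths: (S,)"""
    diff = vertices[:, None, :] - sources[None, :, :]   # (V, S, 3)
    dist_sq = np.sum(diff * diff, axis=-1) + eps        # (V, S)
    return (strengths[None, :] / dist_sq).sum(axis=1)   # (V,)

verts = np.random.rand(10000, 3) * 10.0                 # mock mesh vertices
srcs = np.array([[2.0, 1.0, 3.0], [7.0, 4.0, 1.0]])     # two moving sources
intensity = accumulate_intensity(verts, srcs, np.array([5.0, 2.0]))
print(intensity.shape, intensity.max())
```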
In this work, we introduce DeepIPC, a novel end-to-end model tailored for autonomous driving, which seamlessly integrates perception and control tasks. Unlike traditional models that handle these tasks separately, DeepIPC innovatively combines a perception module, which processes RGBD images for semantic segmentation and generates bird's eye view (BEV) mappings, with a controller module that utilizes these insights along with GNSS and angular speed measurements to accurately predict navigational waypoints. This integration allows DeepIPC to efficiently translate complex environmental data into actionable driving commands. Our comprehensive evaluation demonstrates DeepIPC's superior performance in terms of drivability and multi-task efficiency across diverse real-world scenarios, setting a new benchmark for end-to-end autonomous driving systems with a leaner model architecture. The experimental results underscore DeepIPC's potential to significantly enhance autonomous vehicular navigation, promising a step forward in the development of autonomous driving technologies. For further insights and replication, we will make our code and datasets available at //github.com/oskarnatan/DeepIPC.
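A schematic of the two-module layout described above, with invented layer sizes and a placeholder perception backbone; it only illustrates how perception features and GNSS plus angular-speed measurements might be fused into waypoint predictions, not DeepIPC's actual networks.

```python
import torch
import torch.nn as nn

class PerceptionStub(nn.Module):
    """Placeholder: maps an RGBD image to a flat feature standing in for
    segmentation / BEV outputs (hypothetical sizes)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 5, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim))

    def forward(self, rgbd):
        return self.net(rgbd)

class ControllerStub(nn.Module):
    """Fuses perception features with GNSS (2) and angular speed (1)
    to predict K navigational waypoints."""
    def __init__(self, feat_dim=128, k_waypoints=5):
        super().__init__()
        self.k = k_waypoints
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, 64), nn.ReLU(),
            nn.Linear(64, k_waypoints * 2))

    def forward(self, feat, gnss, ang_speed):
        x = torch.cat([feat, gnss, ang_speed], dim=-1)
        return self.mlp(x).view(-1, self.k, 2)

perception, controller = PerceptionStub(), ControllerStub()
rgbd = torch.randn(2, 4, 256, 256)  # batch of RGBD images
wp = controller(perception(rgbd), torch.randn(2, 2), torch.randn(2, 1))
print(wp.shape)  # (2, 5, 2) waypoints in the ego frame
```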
Bilevel optimization refers to scenarios in which the optimal solution of a lower-level energy function serves as input features to an upper-level objective of interest. These optimal features typically depend on tunable parameters of the lower-level energy in such a way that the entire bilevel pipeline can be trained end-to-end. Although not generally presented as such, this paper demonstrates how a variety of graph learning techniques can be recast as special cases of bilevel optimization or simplifications thereof. In brief, building on prior work, we first derive a more flexible class of energy functions that, when paired with various descent steps (e.g., gradient descent, proximal methods, momentum, etc.), form graph neural network (GNN) message-passing layers; critically, we also carefully unpack where any residual approximation error lies with respect to the underlying constituent message-passing functions. We then probe several simplifications of this framework to derive close connections with non-GNN-based graph learning approaches, including knowledge graph embeddings, various forms of label propagation, and efficient graph-regularized MLP models. And finally, we present supporting empirical results that demonstrate the versatility of the proposed bilevel lens, which we refer to as BloomGML, referencing that BiLevel Optimization Offers More Graph Machine Learning. Our code is available at //github.com/amberyzheng/BloomGML. Let graph ML bloom.
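A concrete instance of the recast: gradient descent on the classic graph-regularized energy E(Y) = ||Y - X||_F^2 + lam * tr(Y^T L Y) yields an update that looks exactly like a GNN propagation layer, mixing each node's features with its neighbors' while staying anchored to the input. A minimal NumPy sketch (the paper's energy class and descent steps are more general):

```python
import numpy as np

def descent_layers(X, A, lam=1.0, step=0.1, layers=10):
    """Each gradient step on E(Y) = ||Y - X||_F^2 + lam * tr(Y^T L Y)
    acts as one message-passing layer over the graph."""
    D = np.diag(A.sum(axis=1))
    L = D - A                      # unnormalized graph Laplacian
    Y = X.copy()
    for _ in range(layers):
        grad = 2.0 * (Y - X) + 2.0 * lam * (L @ Y)
        Y = Y - step * grad        # one "layer" of propagation
    return Y

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # path graph
X = np.array([[1.0], [0.0], [-1.0]])                    # input node features
print(descent_layers(X, A))  # features smoothed across edges, anchored to X
```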
Graph clustering, which aims to divide the nodes of a graph into several distinct clusters, is a fundamental and challenging task. In recent years, deep graph clustering methods have been proposed in increasing numbers and have achieved promising performance. However, survey papers on this topic are scarce, and a summary of the field is overdue. Motivated by this, this paper presents the first comprehensive survey of deep graph clustering. First, we introduce a detailed definition of deep graph clustering and the important baseline methods. We then propose a taxonomy of deep graph clustering methods based on four criteria: graph type, network architecture, learning paradigm, and clustering method. In addition, through careful analysis of existing works, we summarize the challenges and opportunities from five perspectives. Finally, we present applications of deep graph clustering in four domains. It is worth mentioning that a collection of state-of-the-art deep graph clustering methods, including papers, codes, and datasets, is available on GitHub. We hope this work serves as a quick guide and helps researchers overcome challenges in this vibrant field.
Keeping pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing rapid spread and wide adoption in recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing and to propose benchmarks and evaluation metrics for measuring its effect on current deep learning architectures. We then take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule that alleviates catastrophic forgetting during adaptation.
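To convey the flavor of distributionally robust optimization in its simplest form (not the thesis's specific reformulations), the dual of a KL-constrained DRO objective re-weights training examples by a softmax of their losses, so the adversarial distribution concentrates on hard examples. A small NumPy sketch with a made-up loss vector:

```python
import numpy as np

def dro_weights(losses, temperature=1.0):
    """KL-constrained DRO in dual form: weight examples by
    softmax(loss / temperature), emphasizing high-loss examples."""
    z = losses / temperature
    z = z - z.max()               # numerical stability before exponentiation
    w = np.exp(z)
    return w / w.sum()

losses = np.array([0.1, 0.2, 2.5, 0.3])   # made-up per-example losses
w = dro_weights(losses, temperature=0.5)
robust_loss = np.dot(w, losses)           # weighted loss dominated by example 3
print(w.round(3), robust_loss.round(3))
```

Lowering the temperature makes the weighting more adversarial (approaching worst-case loss); raising it recovers ordinary uniform averaging.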
This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.