In this paper, we introduce the following new concept in graph drawing. Our task is to find a small collection of drawings such that they all together satisfy some property that is useful for graph visualization. We propose investigating a property where each edge is not crossed in at least one drawing in the collection. We call such collection uncrossed. Such property is motivated by a quintessential problem of the crossing number, where one asks for a plane drawing where the number of edge crossings is minimum. Indeed, if we are allowed to visualize only one drawing, then the one which minimizes the number of crossings is probably the neatest for the first orientation. However, a collection of drawings where each highlights a different aspect of a graph without any crossings could shed even more light on the graph's structure. We propose two definitions. First, the uncrossed number, minimizes the number of graph drawings in a collection, satisfying the uncrossed property. Second, the uncrossed crossing number, minimizes the total number of crossings in the collection that satisfy the uncrossed property. For both definitions, we establish initial results. We prove that the uncrossed crossing number is NP-hard, but there is an FPT algorithm parameterized by the solution size.
This paper presents a novel Learning from Demonstration (LfD) method that uses neural fields to learn new skills efficiently and accurately. It achieves this by utilizing a shared embedding to learn both scene and motion representations in a generative way. Our method smoothly maps each expert demonstration to a scene-motion embedding and learns to model them without requiring hand-crafted task parameters or large datasets. It achieves data efficiency by enforcing scene and motion generation to be smooth with respect to changes in the embedding space. At inference time, our method can retrieve scene-motion embeddings using test time optimization, and generate precise motion trajectories for novel scenes. The proposed method is versatile and can employ images, 3D shapes, and any other scene representations that can be modeled using neural fields. Additionally, it can generate both end-effector positions and joint angle-based trajectories. Our method is evaluated on tasks that require accurate motion trajectory generation, where the underlying task parametrization is based on object positions and geometric scene changes. Experimental results demonstrate that the proposed method outperforms the baseline approaches and generalizes to novel scenes. Furthermore, in real-world experiments, we show that our method can successfully model multi-valued trajectories, it is robust to the distractor objects introduced at inference time, and it can generate 6D motions.
In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising. Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility. To address these issues, we introduce a hybrid approach that combines the advantages of both models with a Spatial-Spectral Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight alternative to 3D convolution, which extracts more spatial-spectral correlated features while keeping the flexibility to tackle HSIs with an arbitrary number of bands. These features are then adaptively processed by GSSA which per-forms 3D self-attention across the spectral bands, guided by a set of learnable queries that encode the spectral signatures. This not only enriches our model with powerful capabilities for identifying global spectral correlations but also maintains linear complexity. Moreover, our SM-FFN proposes the self-modulation that intensifies the activations of more informative regions, which further strengthens the aggregated features. Extensive experiments are conducted on various datasets under both simulated and real-world noise, and it shows that our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead. Code is at https: //github.com/Zeqiang-Lai/HSDT.
In this paper, we propose a new generic method for detecting the number and locations of structural breaks or change points in piecewise linear models under stationary Gaussian noise. Our method transforms the change point detection problem into identifying local extrema (local maxima and local minima) through kernel smoothing and differentiation of the data sequence. By computing p-values for all local extrema based on peak height distributions of smooth Gaussian processes, we utilize the Benjamini-Hochberg procedure to identify significant local extrema as the detected change points. Our method can distinguish between two types of change points: continuous breaks (Type I) and jumps (Type II). We study three scenarios of piecewise linear signals, namely pure Type I, pure Type II and a mixture of Type I and Type II change points. The results demonstrate that our proposed method ensures asymptotic control of the False Discover Rate (FDR) and power consistency, as sequence length, slope changes, and jump size increase. Furthermore, compared to traditional change point detection methods based on recursive segmentation, our approach only requires a single test for all candidate local extrema, thereby achieving the smallest computational complexity proportionate to the data sequence length. Additionally, numerical studies illustrate that our method maintains FDR control and power consistency, even in non-asymptotic cases when the size of slope changes or jumps is not large. We have implemented our method in the R package "dSTEM" (available from //cran.r-project.org/web/packages/dSTEM).
In this paper, we introduce weight prediction into the AdamW optimizer to boost its convergence when training the deep neural network (DNN) models. In particular, ahead of each mini-batch training, we predict the future weights according to the update rule of AdamW and then apply the predicted future weights to do both forward pass and backward propagation. In this way, the AdamW optimizer always utilizes the gradients w.r.t. the future weights instead of current weights to update the DNN parameters, making the AdamW optimizer achieve better convergence. Our proposal is simple and straightforward to implement but effective in boosting the convergence of DNN training. We performed extensive experimental evaluations on image classification and language modeling tasks to verify the effectiveness of our proposal. The experimental results validate that our proposal can boost the convergence of AdamW and achieve better accuracy than AdamW when training the DNN models.
Comparative visualization of scalar fields is often facilitated using similarity measures such as edit distances. In this paper, we describe a novel approach for similarity analysis of scalar fields that combines two recently introduced techniques: Wasserstein geodesics/barycenters as well as path mappings, a branch decomposition-independent edit distance. Effectively, we are able to leverage the reduced susceptibility of path mappings to small perturbations in the data when compared with the original Wasserstein distance. Our approach therefore exhibits superior performance and quality in typical tasks such as ensemble summarization, ensemble clustering, and temporal reduction of time series, while retaining practically feasible runtimes. Beyond studying theoretical properties of our approach and discussing implementation aspects, we describe a number of case studies that provide empirical insights into its utility for comparative visualization, and demonstrate the advantages of our method in both synthetic and real-world scenarios. We supply a C++ implementation that can be used to reproduce our results.
It is shown that every $n$-vertex graph that admits a 2-bend RAC drawing in the plane, where the edges are polylines with two bends per edge and any pair of edges can only cross at a right angle, has at most $24n-26$ edges for $n\geq 3$. This improves upon the previous upper bound of $74.2n$; this is the first improvement in more than 12 years. A crucial ingredient of the proof is an upper bound on the size of plane multigraphs with polyline edges in which the first and last segments are either parallel or orthogonal.
Many complex engineering systems can be represented in a topological form, such as graphs. This paper utilizes a machine learning technique called Geometric Deep Learning (GDL) to aid designers with challenging, graph-centric design problems. The strategy presented here is to take the graph data and apply GDL to seek the best realizable performing solution effectively and efficiently with lower computational costs. This case study used here is the synthesis of analog electrical circuits that attempt to match a specific frequency response within a particular frequency range. Previous studies utilized an enumeration technique to generate 43,249 unique undirected graphs presenting valid potential circuits. Unfortunately, determining the sizing and performance of many circuits can be too expensive. To reduce computational costs with a quantified trade-off in accuracy, the fraction of the circuit graphs and their performance are used as input data to a classification-focused GDL model. Then, the GDL model can be used to predict the remainder cheaply, thus, aiding decision-makers in the search for the best graph solutions. The results discussed in this paper show that additional graph-based features are useful, favorable total set classification accuracy of 80\% in using only 10\% of the graphs, and iteratively-built GDL models can further subdivide the graphs into targeted groups with medians significantly closer to the best and containing 88.2 of the top 100 best-performing graphs on average using 25\% of the graphs.
In this paper, we present an accurate and scalable approach to the face clustering task. We aim at grouping a set of faces by their potential identities. We formulate this task as a link prediction problem: a link exists between two faces if they are of the same identity. The key idea is that we find the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors. By constructing sub-graphs around each instance as input data, which depict the local context, we utilize the graph convolution network (GCN) to perform reasoning and infer the likelihood of linkage between pairs in the sub-graphs. Experiments show that our method is more robust to the complex distribution of faces than conventional methods, yielding favorably comparable results to state-of-the-art methods on standard face clustering benchmarks, and is scalable to large datasets. Furthermore, we show that the proposed method does not need the number of clusters as prior, is aware of noises and outliers, and can be extended to a multi-view version for more accurate clustering accuracy.
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performed system by 1.65 on ROUGE-L. The codes to reproduce our results are available at //github.com/nlpyang/BertSum
Image segmentation is an important component of many image understanding systems. It aims to group pixels in a spatially and perceptually coherent manner. Typically, these algorithms have a collection of parameters that control the degree of over-segmentation produced. It still remains a challenge to properly select such parameters for human-like perceptual grouping. In this work, we exploit the diversity of segments produced by different choices of parameters. We scan the segmentation parameter space and generate a collection of image segmentation hypotheses (from highly over-segmented to under-segmented). These are fed into a cost minimization framework that produces the final segmentation by selecting segments that: (1) better describe the natural contours of the image, and (2) are more stable and persistent among all the segmentation hypotheses. We compare our algorithm's performance with state-of-the-art algorithms, showing that we can achieve improved results. We also show that our framework is robust to the choice of segmentation kernel that produces the initial set of hypotheses.