Recently, primitive-based rendering has been shown to achieve convincing results for modeling and rendering 3D dynamic scenes from 2D images. Nevertheless, in the context of novel view synthesis, each type of primitive has inherent limitations in representation ability. Meshes struggle to depict fuzzy geometry, while point-based splatting methods (e.g., 3D Gaussian Splatting) often produce artifacts or blurry pixels in areas with smooth geometry and sharp textures. As a result, it is difficult, if not impossible, to represent a complex, dynamic scene with a single type of primitive. To this end, we propose a novel approach, GauMesh, that bridges 3D Gaussians and meshes for modeling and rendering dynamic scenes. Given a sequence of tracked meshes as initialization, our goal is to simultaneously optimize the mesh geometry, color texture, opacity maps, a set of 3D Gaussians, and the deformation field. At a specific time, we perform $\alpha$-blending on the RGB and opacity values based on the merged and re-ordered z-buffers from the mesh and 3D Gaussian rasterizations. This produces the final rendering, which is supervised by the ground-truth image. Experiments demonstrate that our approach adapts the appropriate type of primitive to represent different parts of the dynamic scene and outperforms all baseline methods in both quantitative and qualitative comparisons without sacrificing rendering speed.
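As a rough, illustrative sketch of the compositing step described above (not the authors' implementation), the snippet below merges per-pixel fragments from a hypothetical mesh rasterizer and Gaussian splatter into a single depth-ordered buffer and alpha-blends them front to back; all function and variable names are placeholders.

```python
import numpy as np

def composite_fragments(mesh_frags, gauss_frags):
    """Front-to-back alpha-compositing of fragments from two rasterizers.

    Each argument is a list of (depth, rgb, alpha) tuples for one pixel.
    The lists are merged into a single z-ordered buffer before blending,
    mirroring the idea of re-ordering merged z-buffers (illustrative only).
    """
    frags = sorted(mesh_frags + gauss_frags, key=lambda f: f[0])  # near to far
    color = np.zeros(3)
    transmittance = 1.0
    for _, rgb, alpha in frags:
        color += transmittance * alpha * np.asarray(rgb, dtype=float)
        transmittance *= (1.0 - alpha)
        if transmittance < 1e-4:  # early termination once nearly opaque
            break
    return color, 1.0 - transmittance  # final RGB and accumulated opacity

# Example: one semi-transparent Gaussian fragment in front of an opaque mesh surface
mesh = [(2.0, (0.2, 0.6, 0.2), 1.0)]
gauss = [(1.0, (0.9, 0.1, 0.1), 0.4)]
print(composite_fragments(mesh, gauss))
```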
Measurement error occurs when a covariate influencing a response variable is corrupted by noise. This can lead to misleading inference outcomes, particularly in problems where accurately estimating the relationship between covariates and response variables is crucial, such as causal effect estimation. Existing methods for dealing with measurement error often rely on strong assumptions, such as knowledge of the error distribution or its variance, and the availability of replicated measurements of the covariates. We propose a Bayesian Nonparametric Learning framework that is robust to mismeasured covariates, does not require the preceding assumptions, and can incorporate prior beliefs about the error distribution. This approach gives rise to a general framework that is suitable for both Classical and Berkson error models via the appropriate specification of the prior centering measure of a Dirichlet Process (DP). Moreover, it offers flexibility in the choice of loss function depending on the type of regression model. We provide bounds on the generalization error based on the Maximum Mean Discrepancy (MMD) loss, which allows for generalization to non-Gaussian distributed errors and nonlinear covariate-response relationships. We showcase the effectiveness of the proposed framework versus prior art on real-world problems containing either Berkson or Classical measurement errors.
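Since the generalization bounds above are stated in terms of the MMD loss, a minimal sketch of how a squared-MMD estimate between observed data and model-generated samples might be computed is shown below; the RBF kernel, bandwidth choice, and variable names are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def mmd2_rbf(x, y, bandwidth=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy with an RBF kernel.

    x, y: arrays of shape (n, d) and (m, d). The bandwidth is a free choice
    here; in practice it is often set by the median heuristic.
    """
    def k(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

rng = np.random.default_rng(0)
obs = rng.normal(0.0, 1.0, size=(200, 1))   # observed (mismeasured) covariates
sim = rng.normal(0.3, 1.0, size=(200, 1))   # samples drawn from a candidate model
print(mmd2_rbf(obs, sim))
```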
With recent advancements in sixth-generation (6G) communication technologies, vertical industries are encountering increasingly diverse network services. Reducing energy consumption is critical to meeting the quality expectations of these diverse services. In particular, 6G involves a huge number of base stations with coupled, adjustable network parameters, which makes the problem complex, with multiple network objectives and parameters. Network intents are difficult to map to individual network elements and require enhanced automation capabilities. In this paper, we present a network intent decomposition and optimization mechanism for an energy-aware radio access network scenario. By characterizing the intent ontology with a standard template, we present a generic network intent representation framework. We then propose a novel intent modeling method using the Knowledge Acquisition in automated Specification language, which can model the network ontology. To clarify the number and types of network objectives and energy-saving operations, we develop a Softgoal Interdependency Graph-based network intent decomposition model and, on this basis, present a network intent decomposition algorithm. Simulation results demonstrate that the proposed algorithm outperforms its counterpart without conflict analysis in terms of intent decomposition time. Moreover, we design a deep Q-network-assisted intent optimization scheme to validate the performance gain.
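As a hedged illustration of the kind of update a deep Q-network-assisted optimizer relies on (the paper's actual state, action, and reward design is not reproduced here), the snippet below computes standard Bellman targets for a batch of transitions; all shapes and names are assumptions.

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Bellman targets r + gamma * max_a' Q(s', a') used to regress the
    online Q-network in a DQN-style optimizer (illustrative sketch).

    rewards: (batch,) rewards, e.g. negative energy consumption plus a
             quality-of-service penalty (hypothetical reward design)
    next_q_values: (batch, num_actions) Q-estimates from the target network
    dones: (batch,) 1.0 where the episode terminated, else 0.0
    """
    return rewards + gamma * (1.0 - dones) * next_q_values.max(axis=1)

rng = np.random.default_rng(0)
print(dqn_targets(rng.normal(size=4), rng.normal(size=(4, 3)), np.zeros(4)))
```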
We consider the estimation of rare-event probabilities using sample proportions output by naive Monte Carlo or computed from collected data. Unlike variance reduction techniques, this naive estimator does not come with an a priori relative efficiency guarantee. On the other hand, due to the recent surge of sophisticated rare-event problems arising in safety evaluations of intelligent systems, efficiency-guaranteed variance reduction may face implementation challenges which, coupled with the availability of computation or data collection power, motivate the use of such a naive estimator. In this paper, we study uncertainty quantification, namely the construction, coverage validity, and tightness of confidence intervals, for rare-event probabilities using only sample proportions. In addition to the well-known normality, Wilson's, and exact intervals, we investigate two new intervals derived from Chernoff's inequality and the Berry-Esseen theorem and compare them with the existing ones. Moreover, we generalize our results to the natural situation where sampling stops upon reaching a target number of rare-event hits. Our findings show that the normality and Wilson's intervals are not always valid, but they are close to the newly developed valid intervals in terms of half-width. In contrast, the exact interval is conservative but safely guarantees attainment of the nominal confidence level. Our new intervals, while more conservative than the exact interval, provide useful insights into the tightness of the considered intervals.
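For concreteness, a minimal sketch of two of the intervals discussed above, Wilson's score interval and the exact (Clopper-Pearson) interval, is given below for k rare-event hits out of n samples; these are textbook formulas rather than the paper's code, and the numbers in the example are made up.

```python
import numpy as np
from scipy import stats

def wilson_interval(k, n, alpha=0.05):
    """Wilson score interval for a binomial proportion."""
    z = stats.norm.ppf(1.0 - alpha / 2.0)
    p_hat = k / n
    center = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return center - half, center + half

def exact_interval(k, n, alpha=0.05):
    """Clopper-Pearson (exact) interval via beta quantiles."""
    lo = 0.0 if k == 0 else stats.beta.ppf(alpha / 2.0, k, n - k + 1)
    hi = 1.0 if k == n else stats.beta.ppf(1.0 - alpha / 2.0, k + 1, n - k)
    return lo, hi

# Rare-event setting: 3 hits in 10^6 Monte Carlo samples (illustrative numbers)
print(wilson_interval(3, 10**6))
print(exact_interval(3, 10**6))
```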
Autonomous driving has achieved significant milestones in research and development over the last two decades. There is increasing interest in the field, as the deployment of autonomous vehicles (AVs) promises safer and more ecologically friendly transportation systems. With rapid progress in computationally powerful artificial intelligence (AI) techniques, AVs can sense their environment with high precision, make safe real-time decisions, and operate reliably without human intervention. However, intelligent decision-making in such vehicles is not generally understandable by humans in the current state of the art, and this deficiency hinders the technology's social acceptance. Hence, aside from making safe real-time decisions, AVs must also explain their AI-guided decision-making process in order to be regulatory compliant across many jurisdictions. Our study sheds comprehensive light on the development of explainable artificial intelligence (XAI) approaches for AVs. In particular, we make the following contributions. First, we provide a thorough overview of state-of-the-art and emerging approaches for XAI-based autonomous driving. We then propose a conceptual framework that considers the essential elements of explainable end-to-end autonomous driving. Finally, we present prospective XAI-based directions and emerging paradigms that hold promise for enhancing the transparency, trustworthiness, and societal acceptance of AVs.
PageRank is a widely used centrality measure that "ranks" vertices in a graph by considering their connections and the importance of those connections. In this report, we first introduce one of the most efficient GPU implementations of Static PageRank, which recomputes PageRank scores from scratch. It uses a synchronous, pull-based, atomics-free PageRank computation, with the low- and high-in-degree vertices partitioned and processed by two separate kernels. Next, we present our GPU implementation of incrementally expanding (and contracting) Dynamic Frontier with Pruning (DF-P) PageRank, which processes only the subset of vertices likely to change ranks. It is based on Static PageRank and uses an additional partitioning between low- and high-out-degree vertices for incremental expansion of the set of affected vertices, handled by two additional kernels. On a server with an NVIDIA A100 GPU, our Static PageRank outperforms Hornet's and Gunrock's PageRank implementations by 31x and 5.9x, respectively. On top of this, DF-P PageRank outperforms Static PageRank by 2.1x on real-world dynamic graphs, and by 3.1x on large static graphs with random batch updates.
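To illustrate the pull formulation mentioned above (a CPU sketch only; the report's GPU kernels, degree partitioning, and frontier logic are not reproduced), each vertex below reads rank contributions from its in-neighbors rather than scattering to out-neighbors; the toy graph and names are assumptions, and dangling vertices are ignored for brevity.

```python
import numpy as np

def pull_pagerank(in_neighbors, out_degree, damping=0.85, tol=1e-10, max_iter=100):
    """Synchronous pull-based PageRank on a small graph (CPU sketch only).

    in_neighbors[v] lists the sources of edges pointing into v;
    out_degree[u] is the out-degree of u.
    """
    n = len(in_neighbors)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = np.empty(n)
        for v in range(n):
            # Pull: each vertex sums contributions from its in-neighbors.
            contrib = sum(rank[u] / out_degree[u] for u in in_neighbors[v])
            new_rank[v] = (1.0 - damping) / n + damping * contrib
        done = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if done:
            break
    return rank

# Tiny example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
in_nbrs = [[2], [0], [0, 1]]
out_deg = [2, 1, 1]
print(pull_pagerank(in_nbrs, out_deg))
```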
Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static; rather, they evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge as the KG grows. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention when learning on growing snapshots of a KG, without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and updating, an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms state-of-the-art inductive and lifelong embedding baselines.
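As a hedged sketch of the general idea behind an embedding regularization term against catastrophic forgetting (not the paper's exact regularizer), the snippet below adds an L2 penalty on the drift of embeddings for entities that existed in the previous KG snapshot; all names and the penalty weight are assumptions.

```python
import numpy as np

def regularized_loss(task_loss, new_emb, old_emb, old_ids, lam=0.1):
    """Add an L2 penalty keeping previously learned embeddings close to
    their old values, a common recipe against catastrophic forgetting
    (illustrative only).

    new_emb: current embedding matrix (num_entities, dim)
    old_emb: snapshot of embeddings from the previous KG snapshot
    old_ids: indices of entities that existed in the previous snapshot
    """
    drift = new_emb[old_ids] - old_emb[old_ids]
    return task_loss + lam * np.sum(drift ** 2)

rng = np.random.default_rng(0)
old = rng.normal(size=(100, 32))
new = old + 0.01 * rng.normal(size=(100, 32))   # embeddings after an update step
print(regularized_loss(task_loss=1.5, new_emb=new, old_emb=old, old_ids=np.arange(80)))
```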
Object detectors usually achieve promising results with the supervision of complete instance annotations. However, their performance is far from satisfactory with sparse instance annotations. Most existing methods for sparsely annotated object detection either re-weight the loss of hard negative samples or convert the unlabeled instances into ignored regions to reduce the interference of false negatives. We argue that these strategies are insufficient, since they can at most alleviate the negative effect caused by missing annotations. In this paper, we propose a simple but effective mechanism, called Co-mining, for sparsely annotated object detection. In Co-mining, the two branches of a Siamese network predict pseudo-label sets for each other. To enhance multi-view learning and better mine unlabeled instances, the original image and the corresponding augmented image are used as the inputs of the two branches, respectively. Co-mining can serve as a general training mechanism applicable to most modern object detectors. Experiments are performed on the MS COCO dataset with three different sparsely annotated settings using two typical frameworks: the anchor-based detector RetinaNet and the anchor-free detector FCOS. Experimental results show that Co-mining with RetinaNet achieves 1.4%~2.1% improvements compared with different baselines and surpasses existing methods under the same sparsely annotated setting.
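A conceptual sketch of the pseudo-label exchange between the two branches is shown below; the score threshold, data format, and names are illustrative assumptions, and the paper's actual mining and loss computation are more involved.

```python
def exchange_pseudo_labels(preds_a, preds_b, score_thresh=0.7):
    """Swap high-confidence detections between two Siamese branches so each
    branch treats the other's confident predictions as extra pseudo ground
    truth (conceptual sketch only).

    preds_a / preds_b: lists of (box, score) pairs produced by the branch
    that sees the original image and the branch that sees its augmented view.
    """
    pseudo_for_a = [box for box, score in preds_b if score >= score_thresh]
    pseudo_for_b = [box for box, score in preds_a if score >= score_thresh]
    return pseudo_for_a, pseudo_for_b

# Toy example with (x1, y1, x2, y2) boxes
branch_a = [((10, 10, 50, 50), 0.9), ((5, 5, 20, 20), 0.3)]
branch_b = [((12, 11, 49, 52), 0.8)]
print(exchange_pseudo_labels(branch_a, branch_b))
```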
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks. Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep, which limits model scalability. In this work, we propose a novel Attentive GNN to tackle these challenges by incorporating a triple-attention mechanism, i.e., node self-attention, neighborhood attention, and layer memory attention. We explain why the proposed attentive modules improve GNNs for few-shot learning with theoretical analysis and illustrations. Extensive experiments show that the proposed Attentive GNN outperforms state-of-the-art GNN-based methods for few-shot learning on the mini-ImageNet and Tiered-ImageNet datasets, in both inductive and transductive settings.
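A minimal sketch of one of the three attention components named above, self-attention over node features, is given below; learned query/key/value projections are omitted, and the shapes and names are assumptions rather than the paper's architecture.

```python
import numpy as np

def node_self_attention(node_feats):
    """Scaled dot-product self-attention over node features (minimal sketch).

    node_feats: (num_nodes, dim). Returns attention-weighted node features.
    A real model would apply learned query/key/value projections first.
    """
    d = node_feats.shape[1]
    scores = node_feats @ node_feats.T / np.sqrt(d)   # pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)           # row-wise softmax
    return attn @ node_feats

x = np.random.default_rng(0).normal(size=(5, 8))      # 5 nodes, 8-dim features
print(node_self_attention(x).shape)
```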
Verifiability is one of the core editing principles in Wikipedia, where editors are encouraged to provide citations for added statements. A statement can be any arbitrary piece of text, ranging from a sentence up to a paragraph. However, in many cases, citations are either outdated, missing, or link to non-existing references (e.g., dead URLs or moved content). In total, 20% of such citations refer to news articles, which represent the second most cited source. Even in cases where citations are provided, there are no explicit indicators of the span of text a citation covers. In addition to issues related to the verifiability principle, many Wikipedia entity pages are incomplete, missing relevant information that is already available in online news sources. Even for existing citations, there is often a delay between the news publication time and the reference time. In this thesis, we address the aforementioned issues and propose automated approaches that enforce the verifiability principle in Wikipedia and suggest relevant, missing news references to further enrich Wikipedia entity pages.
Inspired by recent developments in artificial satellites, remote sensing images have attracted extensive attention. Noticeable progress has recently been made in scene classification and target detection. However, it is still not clear how to describe remote sensing image content with accurate and concise sentences. In this paper, we investigate how to describe remote sensing images with accurate and flexible sentences. First, some annotation instructions are presented to better describe remote sensing images, considering their special characteristics. Second, in order to exhaustively exploit the contents of remote sensing images, a large-scale aerial image data set is constructed for remote sensing image captioning. Finally, a comprehensive review is presented on the proposed data set to fully advance the task of remote sensing captioning. Extensive experiments on the proposed data set demonstrate that the content of remote sensing images can be completely described by the generated language descriptions. The data set is available at //github.com/2051/RSICD_optimal