The core network is experiencing bandwidth capacity constraints as internet traffic grows. As a result, the multi-band flexible-grid optical network was introduced to extend the lifespan of the optical core network. In this paper, we use the C+L band for working traffic transmission and the S-band for protection against failure. Furthermore, we compare the proposed method with existing ones.
The advancement of artificial intelligence (AI) for organ segmentation and tumor detection is propelled by the growing availability of computed tomography (CT) datasets with detailed, per-voxel annotations. However, these AI models often struggle with flexibility for partially annotated datasets and extensibility for new classes due to limitations in the one-hot encoding, architectural design, and learning scheme. To overcome these limitations, we propose a universal, extensible framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes (e.g., organs/tumors). Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models, enriching semantic encoding compared with one-hot encoding. Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors and easing the addition of new classes. We train our Universal Model on 3,410 CT volumes assembled from 14 publicly available datasets and then test it on 6,173 CT volumes from four external datasets. Universal Model achieves first place on six CT tasks in the Medical Segmentation Decathlon (MSD) public leaderboard and leading performance on the Beyond The Cranial Vault (BTCV) dataset. In summary, Universal Model exhibits remarkable computational efficiency (6x faster than other dataset-specific models), demonstrates strong generalization across different hospitals, transfers well to numerous downstream tasks, and more importantly, facilitates the extensibility to new classes while alleviating the catastrophic forgetting of previously learned classes. Code, models, and datasets are available at //github.com/ljwztc/CLIP-Driven-Universal-Model
We consider the problem of ranking a set of objects based on their performance when the measurement of said performance is subject to noise. In this scenario, the performance is measured repeatedly, resulting in a range of measurements for each object. If the ranges of two objects do not overlap, then we consider one object as 'better' than the other, and we expect it to receive a higher rank; if, however, the ranges overlap, then the objects are incomparable, and we wish them to be assigned the same rank. Unfortunately, the incomparability relation of ranges is in general not transitive; as a consequence, in general the two requirements cannot be satisfied simultaneously, i.e., it is not possible to guarantee both distinct ranks for objects with separated ranges, and same rank for objects with overlapping ranges. This conflict leads to more than one reasonable way to rank a set of objects. In this paper, we explore the ambiguities that arise when ranking with ties, and define a set of reasonable rankings, which we call partial rankings. We develop and analyse three different methodologies to compute a partial ranking. Finally, we show how performance differences among objects can be investigated with the help of partial ranking.
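The non-transitivity of range overlap can be made concrete with a minimal sketch. The three measurement ranges and the helper names below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import combinations


def overlaps(a, b):
    """Two closed ranges (lo, hi) overlap if neither lies strictly below the other."""
    return a[0] <= b[1] and b[0] <= a[1]


def incomparable_pairs(ranges):
    """Return the set of unordered pairs of objects whose ranges overlap."""
    return {
        frozenset((x, y))
        for x, y in combinations(ranges, 2)
        if overlaps(ranges[x], ranges[y])
    }


def is_transitive(ranges):
    """Check whether 'x overlaps y' and 'y overlaps z' always imply 'x overlaps z'."""
    pairs = incomparable_pairs(ranges)
    names = list(ranges)
    for x in names:
        for y in names:
            for z in names:
                if len({x, y, z}) < 3:
                    continue
                if (frozenset((x, y)) in pairs
                        and frozenset((y, z)) in pairs
                        and frozenset((x, z)) not in pairs):
                    return False
    return True


# Hypothetical (min, max) measurement ranges for three objects.
ranges = {"A": (1.0, 3.0), "B": (2.5, 5.0), "C": (4.0, 6.0)}

print(overlaps(ranges["A"], ranges["B"]))  # A and B are incomparable
print(overlaps(ranges["B"], ranges["C"]))  # B and C are incomparable
print(overlaps(ranges["A"], ranges["C"]))  # A and C are separated
print(is_transitive(ranges))
```

Here A and B overlap, B and C overlap, yet A and C are separated, so no single assignment can give equal ranks to every overlapping pair and distinct ranks to every separated pair.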
While a number of promising uncertainty quantification methods have been proposed to address the prevailing shortcomings of deep neural networks like overconfidence and lack of explainability, quantifying predictive uncertainties in the context of joint semantic segmentation and monocular depth estimation has not been explored yet. Since many real-world applications are multi-modal in nature and, hence, have the potential to benefit from multi-task learning, this is a substantial gap in current literature. To this end, we conduct a comprehensive series of experiments to study how multi-task learning influences the quality of uncertainty estimates in comparison to solving both tasks separately.
The rapid growth of non-terrestrial communication necessitates its integration with existing terrestrial networks, as highlighted in 3GPP Releases 16 and 17. This paper analyses the concept of functional splits in 3D-Networks. To manage this complex structure effectively, the adoption of a Radio Access Network (RAN) architecture with Functional Split (FS) offers advantages in flexibility, scalability, and cost-efficiency. RAN achieves this by disaggregating functionalities into three separate units. Analogous to the terrestrial network approach, 3GPP is extending this concept to non-terrestrial platforms as well. This work presents a general analysis of the required Fronthaul (FH) data rate on the feeder link between a non-terrestrial platform and the ground station. Each split option is a trade-off between FH data rate and the respective complexity. Since flying nodes face more limitations regarding power consumption and on-board complexity than terrestrial ones, we investigate the split options between the lower and higher physical layer.
The study of behavioral diversity in Multi-Agent Reinforcement Learning (MARL) is a nascent yet promising field. In this context, the present work deals with the question of how to control the diversity of a multi-agent system. With no existing approaches to control diversity to a set value, current solutions focus on blindly promoting it via intrinsic rewards or additional loss functions, effectively changing the learning objective and lacking a principled measure for it. To address this, we introduce Diversity Control (DiCo), a method able to control diversity to an exact value of a given metric by representing policies as the sum of a parameter-shared component and dynamically scaled per-agent components. By applying constraints directly to the policy architecture, DiCo leaves the learning objective unchanged, enabling its applicability to any actor-critic MARL algorithm. We theoretically prove that DiCo achieves the desired diversity, and we provide several experiments, both in cooperative and competitive tasks, that show how DiCo can be employed as a novel paradigm to increase performance and sample efficiency in MARL. Multimedia results are available on the paper's website: //sites.google.com/view/dico-marl.
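The idea of fixing diversity through the policy architecture can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the diversity metric used here (mean pairwise Euclidean distance between agent outputs for a single observation) is a stand-in, and the function names and numeric values are hypothetical.

```python
import math
from itertools import combinations


def mean_pairwise_distance(vectors):
    """Stand-in diversity metric: average Euclidean distance over all agent pairs."""
    pairs = list(combinations(vectors, 2))
    return sum(math.dist(u, v) for u, v in pairs) / len(pairs)


def controlled_actions(shared, deviations, target):
    """Combine a shared output with rescaled per-agent deviations.

    Each agent's output is shared + scale * deviation. Because the shared
    term cancels in every pairwise difference, the diversity of the outputs
    equals scale * diversity(deviations), so choosing
    scale = target / diversity(deviations) yields exactly the target.
    Assumes the deviations are not all identical (non-zero diversity).
    """
    current = mean_pairwise_distance(deviations)
    scale = target / current
    return [[s + scale * d for s, d in zip(shared, dev)] for dev in deviations]


# Hypothetical outputs for one observation and three agents.
shared_output = [0.5, -0.2]
per_agent_deviations = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

actions = controlled_actions(shared_output, per_agent_deviations, target=0.3)
print(mean_pairwise_distance(actions))
```

Since the rescaling acts only on the outputs, the loss used to train the shared and per-agent components is untouched, which matches the abstract's point that the learning objective remains unchanged.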
In fMRI, capturing brain activation during a task is dependent on how quickly k-space arrays are obtained. Acquiring the full k-space arrays that make up volume images, which are reconstructed using the inverse Fourier transform (IFT), can take a considerable amount of scan time. Under-sampling k-space reduces the acquisition time, but results in aliased, or "folded," images. GeneRalized Autocalibrating Partial Parallel Acquisition (GRAPPA) is a parallel imaging technique that yields full images from subsampled arrays of k-space. GRAPPA uses localized interpolation weights, which are estimated per-scan and fixed over time, to fill in the missing spatial frequencies of the subsampled k-space. Hence, we propose a Bayesian approach to GRAPPA (BGRAPPA) where k-space measurement uncertainty is assessed from the a priori calibration k-space arrays. The prior information is utilized to estimate the missing spatial frequency values from the posterior distribution and reconstruct them into full field-of-view images. Our BGRAPPA technique successfully reconstructed both a simulated and an experimental single-slice image with fewer artifacts, reduced noise leading to an increased signal-to-noise ratio (SNR), and stronger power of task detection.
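The folding caused by under-sampling can be illustrated in one dimension. The sketch below is not GRAPPA or BGRAPPA; it only shows, for a hypothetical eight-sample signal, that keeping every other spatial frequency and applying the inverse transform superimposes the two halves of the original signal.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Naive inverse discrete Fourier transform of a sequence."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Hypothetical 1D "image" with eight samples.
signal = [0, 1, 2, 3, 0, 0, 0, 5]

kspace = dft(signal)          # fully sampled spatial frequencies
undersampled = kspace[::2]    # keep every other frequency (acceleration factor 2)
folded = idft(undersampled)   # half field-of-view reconstruction

# Each reconstructed sample is the sum of two samples half a field-of-view apart.
half = len(signal) // 2
expected = [signal[t] + signal[t + half] for t in range(half)]
print([round(v.real, 6) for v in folded])
print(expected)
```

Parallel imaging methods such as GRAPPA address this ambiguity by estimating the skipped frequencies before the inverse transform, using information from multiple receiver coils.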
In wireless networks assisted by intelligent reflecting surfaces (IRSs), jointly modeling the signal received over the direct and indirect (reflected) paths is a difficult problem. In this work, we show that the network geometry (locations of serving base station, IRS, and user) can be captured using the so-called triangle parameter $\Delta$. We introduce a decomposition of the effect of the combined link into a signal amplification factor and an effective channel power coefficient $G$. The amplification factor is monotonically increasing with both the number of IRS elements $N$ and $\Delta$. For $G$, since an exact characterization of the distribution seems infeasible, we propose three approximations depending on the value of the product $N\Delta$ for Nakagami fading and the special case of Rayleigh fading. For two relevant models of IRS placement, we prove that their performance is identical if $\Delta$ is the same for a given $N$. We also show that no gains are achieved from IRS deployment if $N$ and $\Delta$ are both small. We further compute bounds on the diversity gain to quantify the channel hardening effect of IRSs. Hence, non-trivial gains can be obtained only with a judicious selection of IRS placement and other network parameters.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
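The perceived link to ODEs can be sketched with a scalar toy model. The example assumes the idealized setting from the neural ODE literature, in which every layer applies the same smooth function and the residual branch is scaled by 1/depth; the abstract reports that trained weights need not follow this scaling. The layer function and all names below are hypothetical.

```python
import math


def layer_function(x):
    """Hypothetical residual branch, identical in every layer."""
    return -x


def resnet_forward(x, depth):
    """Scalar residual network with 1/depth scaling: x <- x + f(x) / depth.

    Under this scaling each layer is one explicit Euler step of size
    1/depth for the ODE dx/dt = f(x) on the interval [0, 1].
    """
    for _ in range(depth):
        x = x + layer_function(x) / depth
    return x


# Exact solution of dx/dt = -x at t = 1, starting from x(0) = 1.
ode_solution = math.exp(-1.0)

for depth in (10, 100, 1000):
    output = resnet_forward(1.0, depth)
    print(depth, output, abs(output - ode_solution))
```

In this idealized case the gap to the ODE solution shrinks as depth grows; the experiments described above ask whether weights trained by stochastic gradient descent actually behave this way.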
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.
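One plausible reading of the soft-attention problem, which the abstract does not spell out, is that normalized attention weights sum to one, so the attended feature is a weighted average that discards how many proposals matched. The sketch below illustrates this with hypothetical scores and feature vectors; it is not the proposed counting component.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(scores, features):
    """Soft attention: weighted average of feature vectors using softmax weights."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Hypothetical feature vector for a "cat" object proposal.
cat = [1.0, 0.0]

# Image with one cat versus image with two identical cats, same attention score each.
one_cat = attend([2.0], [cat])
two_cats = attend([2.0, 2.0], [cat, cat])

print(one_cat)
print(two_cats)
```

Both images yield essentially the same attended feature, so a downstream classifier receives no signal that distinguishes a count of one from a count of two.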