Introduction
Recent advances in computing, caching, and communication have fueled a wave of innovations with a profound impact on the development of artificial intelligence (AI) applications, such as facial recognition, speech understanding, gaming, and industrial automation. Given the long distance between cloud data centers and the data generated at the network edge, edge computing has emerged as a competitive computing paradigm for delay-sensitive AI applications. Pushing the resource-intensive AI frontier into the edge ecosystem, however, is highly challenging due to concerns about performance and cost. Recent breakthroughs in lightweight AI methods have given rise to a new research area, namely edge intelligence (EI) [1], which leverages widespread edge resources to deliver AI insights. Many enabling technologies have been widely adopted for model training (e.g., federated learning and DNN splitting) and inference (e.g., model partitioning and model compression) on edge devices (e.g., base stations, vehicles, and drones) [1].