

Digital twin technology is increasingly applied in mechanical engineering, including crane operation and maintenance. We propose a digital twin-based cloud mobile virtual reality (VR) framework for cranes. The framework combines cloud rendering, digital twin technology, and panoramic stereo imaging to deliver high-quality, interactive, stereoscopic 3D crane simulation scenes on smartphones. In the context of digital twins, improving the accuracy, real-time performance, and resource utilization efficiency of crane operation and maintenance scene simulation is a key research problem. The study optimizes the digital twin model using a multi-scale feature extraction algorithm (PMFEA) together with pyramid convolution and a grouped channel attention mechanism. Experimental results show that the improved PMFEA performs well across several practical application scenarios. In static crane recognition, PMFEA achieves an accuracy of 0.93, compared with 0.90 for ResNet50 and 0.87 for VGG16. In dynamic crane management, PMFEA reaches an accuracy of 0.89, ahead of MobileNetV2 (0.84) and InceptionV3 (0.85). PMFEA also achieves an accuracy of 0.85 in crane driving simulation and 0.90 in smart grid management, outperforming the other compared algorithms in both scenarios. In equipment diagnosis, PMFEA attains the highest accuracy among the compared algorithms, 0.94, which indicates its potential in fields that demand high precision. The results indicate that the framework can use digital twin technology to display high-precision, high-quality, interactive 3D crane VR simulation scenes stereoscopically in smartphone web browsers at low computational cost. It also gives users an immersive experience through VR glasses.