Abstract:
In image and video colorization, existing research employs convolutional neural networks (CNNs) to extract information from each video frame. However, because a convolution kernel is local, it is difficult for a CNN to capture the relationships between each pixel and the rest of the image, which leads to inaccurate colorization. To address this issue, we introduce Vitexco, an end-to-end network for colorizing videos. Vitexco uses a Vision Transformer (ViT) to capture the relationships among all pixels in a frame, providing a more effective way to colorize video frames. We evaluate our approach on the DAVIS datasets and demonstrate that it outperforms state-of-the-art methods in color accuracy and visual quality. Our findings suggest that using a ViT can significantly enhance the performance of video colorization.
Published in: 2023 14th International Conference on Information and Communication Technology Convergence (ICTC)
Date of Conference: 11-13 October 2023
Date Added to IEEE Xplore: 23 January 2024