
Joint Clothes Detection and Attribution Prediction via Anchor-free Framework with Decoupled Representation Transformer

Authors:

Lu Cheng

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management

Pages 2444 - 2454

https://doi.org/10.1145/3511808.3557369

Published: 17 October 2022


Abstract

Clothes attribution prediction is a key technology for automatically describing clothing characteristics. Most current methods first detect the clothes in an image, then crop each item and feed it to a separate network for attribution prediction. This two-stage approach is time- and resource-consuming. A one-stage approach, by contrast, can be both effective and efficient because it integrates clothes detection and attribution prediction into an end-to-end framework. However, existing one-stage approaches tend to rely on anchor-based detectors, which are highly sensitive to hyperparameters and incur high computational complexity from dense anchors. They may also face an optimization contradiction during training, because the clothes detection and attribution prediction branches demand different optimization. In this work, to handle these problems, we develop an end-to-end anchor-free framework with an additional branch for joint clothes detection and attribution prediction. To handle the optimization contradiction between the two branches, we encode the backbone feature map as pixel-level dense queries and decode them via a deformable transformer into output features that are fed into the detection and prediction branches, respectively. In this way, the features of the detection and prediction branches are decoupled and the optimization contradiction is naturally resolved. To further enhance prediction accuracy, we also develop, in the prediction branch, a special attention strategy and loss function that adaptively integrate the relationships among peer attributes into feature learning and avoid mutual suppression among hierarchical attributes. Extensive simulation results verify the effectiveness of the proposed work.
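
The decoupling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the deformable transformer decoder is replaced here by two plain linear projections with random weights, and all names (flatten_to_queries, make_projection, decouple) and sizes are hypothetical. The sketch only shows the data flow, in which one shared set of pixel-level queries yields two separate feature sets, one per branch.

```python
import random


def flatten_to_queries(feature_map):
    """Turn an H x W x C feature map into H*W queries, one per pixel."""
    return [pixel for row in feature_map for pixel in row]


def make_projection(in_dim, out_dim, rng):
    """Random weights standing in for a learned decoder (assumption, not the paper's)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, query):
    """Apply one linear projection to a single query vector."""
    return [sum(w * x for w, x in zip(row, query)) for row in weights]


def decouple(queries, det_weights, attr_weights):
    """Produce separate per-pixel features for the detection and attribute branches."""
    det_features = [project(det_weights, q) for q in queries]
    attr_features = [project(attr_weights, q) for q in queries]
    return det_features, attr_features


rng = random.Random(0)
H, W, C, D = 2, 3, 4, 5  # toy sizes chosen for illustration only

# Toy backbone output: H rows, W columns, C channels per pixel.
feature_map = [[[rng.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]

queries = flatten_to_queries(feature_map)
det_w = make_projection(C, D, rng)   # stand-in for the detection-side decoding
attr_w = make_projection(C, D, rng)  # stand-in for the attribute-side decoding
det_feats, attr_feats = decouple(queries, det_w, attr_w)

print(len(queries), len(det_feats[0]), len(attr_feats[0]))
```

Because each branch reads its own feature set, a gradient update driven by one branch's loss would not directly overwrite the features the other branch depends on, which is the intuition behind the decoupling described above.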

Supplementary Material

MP4 File (CIKM22-fp0457.mp4)

This video illustrates our work on clothes attribute prediction, a basic building block for the automatic description of clothing that plays an important role in clothing analysis, retrieval, recommendation, voice interaction, etc. We abandon the traditional two-stage approach, i.e., detection first, then cropping, and finally attribute prediction. Through a series of improvements, such as adopting the anchor-free paradigm, introducing feature decoupling, and adding an attention mechanism, we successfully apply an end-to-end network to this task and achieve good results. The video shows the results of our quantitative and qualitative experiments.


