Abstract
The paper focuses on implementation details of ALFA – an agglomerative late fusion algorithm for object detection. ALFA agglomeratively clusters detector predictions while taking into account bounding box locations and class scores. We discuss the source code of ALFA and another late fusion algorithm – Dynamic Belief Fusion (DBF). The workflow and the hyperparameters necessary to reproduce the published results are presented. We also provide a framework for evaluation of late fusion algorithms like ALFA, DBF and Non-Maximum Suppression with arbitrary object detectors.
1 Introduction
Object detection is an important and challenging computer vision problem. State-of-the-art object detectors, such as Faster R-CNN, YOLO, SSD and DeNet, rely on deep convolutional neural networks and show remarkable results in terms of accuracy and speed. Fusing the results of several object detection methods is a common way to increase detection accuracy. In the companion paper [1], a new late fusion algorithm for object detection called ALFA was proposed. ALFA relies on agglomerative clustering and shows state-of-the-art results on the PASCAL VOC 2007 and 2012 object detection datasets.
We also implemented Dynamic Belief Fusion (DBF), a state-of-the-art late fusion algorithm for object detection proposed in [2], as our baseline, since the authors' implementation is not available.
Here we describe our implementation of ALFA and DBF and provide pseudocode for the key functions of these methods. We also provide the hyperparameter values required to reproduce the results from [1] on the PASCAL VOC 2012 dataset. Results on PASCAL VOC 2007 are not reproducible due to the randomness of the cross-validation procedure.
Link to our implementation: http://github.com/IuliiaSaveleva/ALFA. All the details required to successfully run the code are provided in README.md.
2 Implementation
Assume an object detection task with K classes and N trained object detectors \(D_1, D_2, ..., D_N\). Given an image I, object detector \(D_i\) produces a set of predictions:
$$\begin{aligned} D_i(I) = \{p_1, ..., p_{m_i}\}, \quad p = (r, c), \end{aligned}$$
where \(m_i\) is the number of detected objects, r represents the four coordinates of the axis-aligned bounding box and c is a tuple of class scores of size \((K + 1)\), including the “no object” score \(c^{(0)}\).
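As a rough illustration of this notation, a single prediction with its box r and class scores c can be held in a small record. The class name `Prediction`, its field names, the (x_min, y_min, x_max, y_max) corner layout and the rule that the label is the highest-scoring object class are assumptions made for this sketch; they are not taken from the ALFA repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Prediction:
    """One detector output with box r and scores c (hypothetical names)."""

    # r: assumed corner layout (x_min, y_min, x_max, y_max)
    box: Tuple[float, float, float, float]
    # c: K + 1 class scores, index 0 is the "no object" score
    scores: Tuple[float, ...]

    def label(self) -> int:
        """Index in 1..K of the highest-scoring object class (assumed rule)."""
        object_scores = self.scores[1:]
        best = max(range(len(object_scores)), key=object_scores.__getitem__)
        return best + 1


# The output of one detector on one image is then a list of m_i predictions.
detections: List[Prediction] = [
    Prediction(box=(10.0, 20.0, 110.0, 220.0), scores=(0.1, 0.7, 0.2)),
    Prediction(box=(300.0, 40.0, 380.0, 90.0), scores=(0.3, 0.1, 0.6)),
]
```

Here K = 2, so every score tuple has three entries and `len(detections)` plays the role of \(m_i\).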
2.1 ALFA Implementation
The steps of ALFA are given below.
2.1.1 Agglomerative Clustering of Base Detectors Predictions
We assume that the bounding box \(r_i\) and class scores \(c_i\) of one prediction should be similar to the bounding box \(r_j\) and class scores \(c_j\) of another prediction if they correspond to the same object. Let \(C_i\) and \(C_j\) be two clusters and let \(\sigma (p, \tilde{p})\) be a similarity score function between predictions p and \(\tilde{p}\). We define the following similarity score function with hyperparameter \(\tau \) for prediction clusters:
We propose the following measure of similarity between predictions:
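where \(\gamma \in [0, 1]\) is a hyperparameter and BC is the Bhattacharyya coefficient, a measure of similarity between class scores (\(\bar{c}\) is obtained from the class score tuple c by omitting the zeroth “no object” component and renormalizing):
$$\begin{aligned} BC(\bar{c}, \bar{\tilde{c}}) = \sum _{k = 1}^{K} \sqrt{\bar{c}^{(k)} \, \bar{\tilde{c}}^{(k)}}, \end{aligned}$$
and IoU is the intersection over union coefficient, which is widely used as a measure of similarity between bounding boxes:
$$\begin{aligned} IoU(r, \tilde{r}) = \frac{|r \cap \tilde{r}|}{|r \cup \tilde{r}|}. \end{aligned}$$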
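The Bhattacharyya coefficient and IoU named above can be computed as in the following minimal sketch, which uses their standard definitions. The function names and the (x_min, y_min, x_max, y_max) box layout are assumptions for illustration, not identifiers from the ALFA source code.

```python
import math
from typing import List, Sequence


def renormalize(c: Sequence[float]) -> List[float]:
    """Drop the zeroth "no object" score and rescale the rest to sum to 1."""
    object_scores = list(c[1:])
    total = sum(object_scores)
    if total <= 0.0:
        raise ValueError("object class scores must have a positive sum")
    return [s / total for s in object_scores]


def bhattacharyya(c1: Sequence[float], c2: Sequence[float]) -> float:
    """Bhattacharyya coefficient between two renormalized score tuples."""
    c1_bar, c2_bar = renormalize(c1), renormalize(c2)
    return sum(math.sqrt(a * b) for a, b in zip(c1_bar, c2_bar))


def iou(r1: Sequence[float], r2: Sequence[float]) -> float:
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max).
    """
    inter_w = max(0.0, min(r1[2], r2[2]) - max(r1[0], r2[0]))
    inter_h = max(0.0, min(r1[3], r2[3]) - max(r1[1], r2[1]))
    inter = inter_w * inter_h
    area1 = (r1[2] - r1[0]) * (r1[3] - r1[1])
    area2 = (r2[2] - r2[0]) * (r2[3] - r2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0.0 else 0.0
```

Both functions return values in [0, 1]: identical score distributions give a coefficient of 1, and disjoint boxes give an IoU of 0.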
See Algorithm 1.
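The clustering loop can be sketched roughly as follows. This is not the authors' Algorithm 1: the function names, the use of the minimum pairwise similarity as the cluster-level score, and the rule of merging while the best score is at least \(\tau \) are all assumptions made for illustration. The prediction-level similarity \(\sigma \) is passed in as a parameter rather than fixed.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")


def cluster_similarity(
    a: Sequence[P], b: Sequence[P], similarity: Callable[[P, P], float]
) -> float:
    """Assumed linkage: the least similar cross-cluster pair decides."""
    return min(similarity(p, q) for p in a for q in b)


def agglomerative_cluster(
    items: Sequence[P], similarity: Callable[[P, P], float], tau: float
) -> List[List[P]]:
    """Greedily merge the most similar clusters until none reaches tau."""
    clusters: List[List[P]] = [[p] for p in items]
    while len(clusters) > 1:
        best_score, best_pair = tau, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = cluster_similarity(clusters[i], clusters[j], similarity)
                if score >= best_score:
                    best_score, best_pair = score, (i, j)
        if best_pair is None:
            break  # no pair of clusters is similar enough to merge
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters


# Toy usage: integers stand in for predictions, closeness for similarity.
toy = agglomerative_cluster(
    [0, 1, 10, 11], lambda a, b: 1.0 / (1.0 + abs(a - b)), tau=0.4
)
```

With this toy similarity the two nearby pairs are merged and the two resulting clusters stay apart, because their least similar members fall below the threshold.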
2.1.2 Class Scores Aggregation
Assume that predictions from detectors \(D_{i_1}, D_{i_2}, ..., D_{i_s}\) were assigned to object proposal \(\pi \). We assign an additional low-confidence class scores tuple to this object proposal for every detector that missed:
where \(\varepsilon \) is a hyperparameter.
Each method uses one of two class scores aggregation strategies:
- Averaging fusion:
  $$\begin{aligned} c_{\pi }^{(k)} = \frac{1}{N} \left( \sum _{d = 1}^s c_{i_d}^{(k)} + (N - s)\cdot c_{lc}^{(k)} \right) , k = 0, ..., K. \end{aligned}$$ (6)
- Multiplication fusion:
  $$\begin{aligned} c_{\pi }^{(k)} = \frac{\tilde{c}_{\pi }^{(k)}}{\sum _{i} \tilde{c}_{\pi }^{(i)}}, \quad \tilde{c}_{\pi }^{(k)} = \left( c_{lc}^{(k)} \right) ^{N - s} \prod _{d = 1}^s c_{i_d}^{(k)}, \quad k = 0, ..., K. \end{aligned}$$ (7)
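Equations (6) and (7) translate into code almost line by line. The sketch below is illustrative only: the function names are hypothetical, and the low-confidence tuple \(c_{lc}\) is taken as an input, so no particular form of it is assumed.

```python
from typing import List, Sequence


def _missed(cluster_scores: Sequence[Sequence[float]], n_detectors: int) -> int:
    """Number of detectors (N - s) that contributed no prediction."""
    missed = n_detectors - len(cluster_scores)
    if missed < 0:
        raise ValueError("more predictions in the cluster than detectors")
    return missed


def average_fusion(
    cluster_scores: Sequence[Sequence[float]],
    c_lc: Sequence[float],
    n_detectors: int,
) -> List[float]:
    """Eq. (6): mean over N detectors, with c_lc standing in for misses."""
    missed = _missed(cluster_scores, n_detectors)
    return [
        (sum(c[k] for c in cluster_scores) + missed * c_lc[k]) / n_detectors
        for k in range(len(c_lc))
    ]


def multiplication_fusion(
    cluster_scores: Sequence[Sequence[float]],
    c_lc: Sequence[float],
    n_detectors: int,
) -> List[float]:
    """Eq. (7): per-class product of scores, renormalized to sum to 1."""
    missed = _missed(cluster_scores, n_detectors)
    unnormalized = []
    for k in range(len(c_lc)):
        value = c_lc[k] ** missed
        for c in cluster_scores:
            value *= c[k]
        unnormalized.append(value)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all fused scores are zero; cannot normalize")
    return [v / total for v in unnormalized]
```

For example, with N = 3 detectors, two predictions with scores (0.2, 0.8) and (0.4, 0.6), and a made-up low-confidence tuple (0.9, 0.1), averaging gives (0.5, 0.5), while multiplication gives (0.6, 0.4).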
2.1.3 Bounding Box Aggregation
All methods have the same bounding box aggregation strategy:
Best ALFA parameters are provided in Table 1:

2.2 DBF Implementation
Our implementation of DBF consists of the following steps:
1. Compute PR-curves \(PR^k_i\) for each class k and each detector \(D_i\), \(i = 1, ..., N\);
2. Construct detection vectors for each \(p \in D_i(I)\), \(i = 1, ..., N\), and calculate the basic probabilities of the hypotheses according to label l and \(PR^k_i\). See Algorithm 2;
3. Join the basic probabilities by the Dempster-Shafer combination rule:
   $$ m_f(A) = \frac{1}{N}\sum _{X_1 \cap X_2 \cap \dots \cap X_K = A} \prod _{i = 1}^{K}m_i(X_i), $$
   where \(N = \sum _{X_1 \cap X_2 \cap \dots \cap X_K \ne \varnothing } \prod _{i = 1}^{K}m_i(X_i)\), to determine the fused basic probabilities \(m_f(T)\) and \(m_f(\lnot {T})\);
4. Get the fused score as \(\bar{s} = m_f(T) - m_f(\lnot {T})\);
5. Apply NMS to bounding boxes r and scores \(\bar{s}\). To make the NMS step more effective for DBF, we sort detections by score \(\bar{s}\) and break ties between detections with equal \(\bar{s}\) by the precision taken from \(PR^k_i\) with k = l.
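Steps 3 and 4 can be illustrated with a small generic implementation of the combination rule. This is a sketch under stated assumptions, not the code from the repository: the names `combine` and `fused_score` are hypothetical, hypotheses are represented as frozensets, and each source is assumed to assign mass to T, \(\lnot T\) and the undecided set containing both.

```python
from itertools import product
from typing import Dict, FrozenSet, List

Mass = Dict[FrozenSet[str], float]

TARGET = frozenset({"T"})
NOT_TARGET = frozenset({"notT"})
EITHER = TARGET | NOT_TARGET  # assumed "undecided" hypothesis


def combine(masses: List[Mass]) -> Mass:
    """Dempster-Shafer rule: multiply masses, keep non-empty intersections."""
    if not masses:
        raise ValueError("need at least one basic probability assignment")
    fused: Mass = {}
    for combo in product(*(m.items() for m in masses)):
        intersection = frozenset.intersection(*(focal for focal, _ in combo))
        if not intersection:
            continue  # conflicting combination, excluded from the normalizer
        weight = 1.0
        for _, mass in combo:
            weight *= mass
        fused[intersection] = fused.get(intersection, 0.0) + weight
    norm = sum(fused.values())  # the normalizing constant of the rule
    if norm == 0.0:
        raise ValueError("sources are in total conflict")
    return {focal: weight / norm for focal, weight in fused.items()}


def fused_score(masses: List[Mass]) -> float:
    """Step 4: fused belief in T minus fused belief in not-T."""
    m_f = combine(masses)
    return m_f.get(TARGET, 0.0) - m_f.get(NOT_TARGET, 0.0)


# Made-up basic probabilities for two detectors, for illustration only.
m1 = {TARGET: 0.6, NOT_TARGET: 0.1, EITHER: 0.3}
m2 = {TARGET: 0.5, NOT_TARGET: 0.2, EITHER: 0.3}
score = fused_score([m1, m2])
```

In this example the non-conflicting products sum to 0.83, so the fused masses are 0.63/0.83 for T and 0.11/0.83 for \(\lnot T\), and the fused score is 0.52/0.83.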

Best DBF parameters are provided in Table 2:
3 Conclusion
This paper has presented implementation details of the ALFA and DBF late fusion methods for object detection. We provide source code and hyperparameter values that allow one to reproduce the results from [1] on PASCAL VOC 2012.
References
1. Razinkov, E., Saveleva, I., Matas, J.: ALFA: agglomerative late fusion algorithm for object detection. In: 2018 24th International Conference on Pattern Recognition (ICPR), pp. 2594–2599. IEEE (2018)
2. Lee, H., Kwon, H., Robinson, R., Nothwang, W., Marathe, A.: Dynamic belief fusion for object detection. In: 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1–9. IEEE (2016)
Acknowledgment
I. Saveleva was funded by the Russian Government support of the Program of Competitive Growth of Kazan Federal University among World’s Leading Academic Centers and by Russian Foundation of Basic Research, project number 16-01-00109a.
© 2019 Springer Nature Switzerland AG
Saveleva, I., Razinkov, E. (2019). On the Implementation of ALFA – Agglomerative Late Fusion Algorithm for Object Detection. In: Kerautret, B., Colom, M., Lopresti, D., Monasse, P., Talbot, H. (eds) Reproducible Research in Pattern Recognition. RRPR 2018. Lecture Notes in Computer Science(), vol 11455. Springer, Cham. https://doi.org/10.1007/978-3-030-23987-9_9
DOI: https://doi.org/10.1007/978-3-030-23987-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23986-2
Online ISBN: 978-3-030-23987-9
eBook Packages: Computer Science, Computer Science (R0)