
1 Introduction

Object detection is an important and challenging computer vision problem. State-of-the-art object detectors, such as Faster R-CNN, YOLO, SSD, and DeNet, rely on deep convolutional neural networks and show remarkable results in terms of both accuracy and speed. Fusing the outputs of several object detection methods is a common way to increase detection accuracy further. The companion paper [1] proposed a new late fusion algorithm for object detection called ALFA. ALFA relies on agglomerative clustering and achieves state-of-the-art results on the PASCAL VOC 2007 and 2012 object detection datasets.

We also implemented Dynamic Belief Fusion (DBF), a state-of-the-art late fusion algorithm for object detection proposed in [2], as our baseline, since the authors' implementation is not publicly available.

Here we describe our implementation of ALFA and DBF, providing pseudocode for the key functions of these methods. We also provide the hyperparameter values required to reproduce the results from [1] on the PASCAL VOC 2012 dataset. The results on PASCAL VOC 2007 are not exactly reproducible due to the randomness of the cross-validation procedure.

Our implementation is available at http://github.com/IuliiaSaveleva/ALFA. All the details required to successfully run the code are provided in README.md.

2 Implementation

Assume an object detection task with K classes and N trained object detectors \(D_1, D_2, ..., D_N\). Given an image I, detector \(D_i\) produces a set of predictions:

$$ D_i(I) = \{p_1, ..., p_{m_i}\}, \quad p = (r, c), $$

where \(m_i\) is the number of detected objects, r represents the four coordinates of an axis-aligned bounding box, and c is a class-scores tuple of size \((K + 1)\) that includes the “no object” score \(c^{(0)}\).
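As a concrete illustration of this representation (the variable names here are ours, not taken from the reference code), a single prediction can be stored as a box array plus a class-score array:

```python
import numpy as np

# One prediction: an axis-aligned box (x_min, y_min, x_max, y_max) and a
# class-score tuple of size K + 1, where index 0 is the "no object" score.
K = 20  # e.g. PASCAL VOC has 20 object classes

r = np.array([48.0, 32.0, 210.0, 180.0])  # bounding box coordinates
c = np.full(K + 1, 0.01)                  # small scores for most classes
c[0], c[7] = 0.10, 0.71                   # "no object" score and one strong class
c /= c.sum()                              # scores form a distribution over K + 1 entries
```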

2.1 ALFA Implementation

The steps of ALFA are given below.

2.1.1 Agglomerative Clustering of Base Detectors Predictions

We assume that a prediction with bounding box \(r_i\) and class scores \(c_i\) should be similar to another prediction with bounding box \(r_j\) and class scores \(c_j\) if the two correspond to the same object. Let \(C_i\) and \(C_j\) be two clusters and \(\sigma (p, \tilde{p})\) be the similarity score between predictions p and \(\tilde{p}\). We define the following similarity score for prediction clusters, merging clusters greedily while the maximal cluster similarity is at least the hyperparameter \(\tau \):

$$\begin{aligned} \sigma (C_i, C_j) = \min _{p \in C_i, \tilde{p} \in C_j} \sigma (p, \tilde{p}), \quad \text {while} \quad \max _{i, j} \sigma (C_i, C_j) \ge \tau . \end{aligned}$$
(1)

We propose the following measure of similarity between predictions:

$$\begin{aligned} \sigma (p_i, p_j) = IoU(r_i, r_j)^\gamma \cdot BC(\bar{c}_i, \bar{c}_j) ^{1 - \gamma }, \end{aligned}$$
(2)

where \(\gamma \in [0, 1]\) is a hyperparameter and BC is the Bhattacharyya coefficient, used as a measure of similarity between class scores (\(\bar{c}\) is obtained from the class-score tuple c by omitting the zeroth “no object” component and renormalizing):

$$\begin{aligned} BC(\bar{c}_i, \bar{c}_j) = \sum _{k = 1}^K \sqrt{\bar{c}_i^{(k)}\bar{c}_j^{(k)}}, \quad \bar{c}^{(k)} = \frac{c^{(k)}}{1 - c^{(0)}}, \quad k = 1, \ldots , K, \end{aligned}$$
(3)

and IoU is the intersection-over-union coefficient, widely used as a measure of similarity between bounding boxes:

$$\begin{aligned} IoU(r_i, r_j) = \frac{r_i \cap r_j}{r_i \cup r_j}. \end{aligned}$$
(4)
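The similarity measure of Eqs. (2)-(4) can be sketched directly from these definitions (a minimal sketch with our own function names; the reference code may organize this differently):

```python
import numpy as np

def iou(r_i, r_j):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max), Eq. (4)."""
    x1, y1 = max(r_i[0], r_j[0]), max(r_i[1], r_j[1])
    x2, y2 = min(r_i[2], r_j[2]), min(r_i[3], r_j[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(r_i) + area(r_j) - inter
    return inter / union if union > 0 else 0.0

def bc(c_i, c_j):
    """Bhattacharyya coefficient over renormalized class scores, Eq. (3)."""
    bar_i = c_i[1:] / (1.0 - c_i[0])  # drop the "no object" score and renormalize
    bar_j = c_j[1:] / (1.0 - c_j[0])
    return float(np.sum(np.sqrt(bar_i * bar_j)))

def similarity(p_i, p_j, gamma):
    """Prediction similarity of Eq. (2): IoU^gamma * BC^(1 - gamma)."""
    (r_i, c_i), (r_j, c_j) = p_i, p_j
    return iou(r_i, r_j) ** gamma * bc(c_i, c_j) ** (1.0 - gamma)
```

Two identical predictions get similarity 1, and spatially disjoint boxes get similarity 0 for any \(\gamma > 0\).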

See Algorithm 1.
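Since the pseudocode of Algorithm 1 is given as a figure, the clustering step of Eq. (1) can also be sketched as follows; this is a naive quadratic-pairs sketch of ours that takes the prediction similarity function as a parameter, not the optimized reference implementation:

```python
def agglomerative_clustering(predictions, sigma, tau):
    """Greedy agglomerative clustering per Eq. (1): cluster similarity is the
    minimum pairwise prediction similarity, and the most similar pair of
    clusters is merged while that similarity is at least tau."""
    clusters = [[p] for p in predictions]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = min(sigma(p, q) for p in clusters[i] for q in clusters[j])
                if s > best:
                    best, best_pair = s, (i, j)
        if best < tau:
            break                      # no pair reaches the threshold: stop
        i, j = best_pair
        clusters[i] += clusters.pop(j)  # merge the most similar pair
    return clusters
```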

2.1.2 Class Scores Aggregation

Assume that predictions from detectors \(D_{i_1}, D_{i_2}, ..., D_{i_s}\) were assigned to an object proposal \(\pi \). For every detector that missed the object, we assign an additional low-confidence class-scores tuple to this proposal:

$$\begin{aligned} c_{lc} = \left( 1 - \varepsilon , \frac{\varepsilon }{K}, \frac{\varepsilon }{K}, ..., \frac{\varepsilon }{K} \right) , \end{aligned}$$
(5)

where \(\varepsilon \) is a hyperparameter.

Each method uses one of two class scores aggregation strategies:

  • Averaging fusion:

    $$\begin{aligned} c_{\pi }^{(k)} = \frac{1}{N} \left( \sum _{d = 1}^s c_{i_d}^{(k)} + (N - s)\cdot c_{lc}^{(k)} \right) , k = 0, ..., K. \end{aligned}$$
    (6)
  • Multiplication fusion:

    $$\begin{aligned} c_{\pi }^{(k)} = \frac{\tilde{c}_{\pi }^{(k)}}{\sum _{i} \tilde{c}_{\pi }^{(i)}}, \quad \tilde{c}_{\pi }^{(k)} = \left( c_{lc}^{(k)} \right) ^{N - s} \prod _{d = 1}^s c_{i_d}^{(k)}, \quad k = 0, ..., K. \end{aligned}$$
    (7)
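Both strategies of Eqs. (5)-(7) fit in a few lines (a sketch with our own naming; `strategy` is a hypothetical switch, not a parameter of the reference code):

```python
import numpy as np

def aggregate_scores(cluster_scores, N, eps, strategy="avg"):
    """Fuse the class scores of one object proposal.

    cluster_scores: list of s score tuples (length K + 1) assigned to the
    proposal; N: total number of detectors; eps: low-confidence hyperparameter."""
    K = len(cluster_scores[0]) - 1
    s = len(cluster_scores)
    # Eq. (5): low-confidence scores stand in for the N - s missing detectors.
    c_lc = np.full(K + 1, eps / K)
    c_lc[0] = 1.0 - eps
    if strategy == "avg":                          # averaging fusion, Eq. (6)
        return (np.sum(cluster_scores, axis=0) + (N - s) * c_lc) / N
    elif strategy == "mul":                        # multiplication fusion, Eq. (7)
        c = np.prod(cluster_scores, axis=0) * c_lc ** (N - s)
        return c / c.sum()
    raise ValueError(strategy)
```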

2.1.3 Bounding Box Aggregation

All methods have the same bounding box aggregation strategy:

$$\begin{aligned} r_{\pi } = \frac{1}{\sum _{i \in \pi } c_{i}^{(l)}} \sum _{i \in \pi } c_{i}^{(l)} \cdot r_{i}, \quad \text {where} \quad l = \displaystyle \mathop {\text {argmax}}_{k \ge 1} c_{\pi }^{(k)}. \end{aligned}$$
(8)
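Equation (8) is a weighted average of the clustered boxes, with each prediction's score for the winning class l as its weight. A minimal sketch (our naming):

```python
import numpy as np

def aggregate_boxes(boxes, scores, fused_scores):
    """Fuse bounding boxes of one proposal per Eq. (8).

    boxes: (s, 4) array of boxes in the cluster; scores: (s, K + 1) array of
    the corresponding class scores; fused_scores: fused scores of the proposal."""
    l = 1 + int(np.argmax(fused_scores[1:]))  # winning object class, l >= 1
    w = scores[:, l]                          # each prediction's score for class l
    return (w[:, None] * boxes).sum(axis=0) / w.sum()
```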

The best ALFA parameters are provided in Table 1:

Table 1. Best ALFA parameters.

2.2 DBF Implementation

Our implementation of DBF consists of the following steps:

  1. Compute PR-curves \(PR^k_i\) for each class k and each detector \(D_i\), \(i = 1, ..., N\);

  2. Construct detection vectors for each \(p \in D_i(I)\), \(i = 1, ..., N\), and calculate the basic probabilities of the hypotheses according to the label l and \(PR^k_i\). See Algorithm 2;

  3. Combine the basic probabilities by the Dempster-Shafer combination rule:

     $$ m_f(A) = \frac{1}{Z}\sum _{X_1 \cap X_2 \cap \dots \cap X_N = A} \prod _{i = 1}^{N}m_i(X_i), $$

     where \(Z = \sum _{X_1 \cap X_2 \cap \dots \cap X_N \ne \varnothing } \prod _{i = 1}^{N}m_i(X_i)\) is the normalization constant, to determine the fused basic probabilities \(m_f(T)\) and \(m_f(\lnot {T})\);

  4. Obtain the fused score as \(\bar{s} = m_f(T) - m_f(\lnot {T})\);

  5. Apply NMS to the bounding boxes r and scores \(\bar{s}\). To help DBF at the NMS step, detections with equal \(\bar{s}\) values are additionally sorted by the precision taken from \(PR^k_i\) with k = l.
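The combination rule of step 3 can be sketched for basic probability assignments over the frame \(\{T, \lnot T\}\), with the ignorance hypothesis \(I = T \cup \lnot T\); this is our own illustrative sketch (dict-based BPAs, our naming), not the reference implementation:

```python
from functools import reduce

def combine_two(m1, m2):
    """Dempster's rule for two BPAs over hypotheses "T", "notT" and "I"."""
    def inter(a, b):
        if a == "I":
            return b
        if b == "I":
            return a
        return a if a == b else None        # None marks an empty intersection
    fused = {"T": 0.0, "notT": 0.0, "I": 0.0}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            x = inter(a, b)
            if x is None:
                conflict += wa * wb         # mass on conflicting hypotheses
            else:
                fused[x] += wa * wb
    z = 1.0 - conflict                      # normalization constant
    return {h: w / z for h, w in fused.items()}

def fuse(bpas):
    """Combine the detectors' BPAs and return s_bar = m_f(T) - m_f(notT)."""
    m = reduce(combine_two, bpas)
    return m["T"] - m["notT"]
```

Folding `combine_two` over the detectors' BPAs gives the fused masses; two agreeing detectors reinforce each other, pushing \(\bar{s}\) above either individual belief.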


The best DBF parameters are provided in Table 2:

Table 2. Best DBF parameters.

3 Conclusion

This paper presented the implementation details of the ALFA and DBF late fusion methods for object detection. We provide source code and hyperparameter values that allow one to reproduce the results from [1] on PASCAL VOC 2012.