A design-of-experiment based statistical technique for detection of key-frames

Mukherjee, Snehasis; Mukherjee, Dipti Prasad

doi:10.1007/s11042-011-0882-2


Published: 01 October 2011

Volume 62, pages 847–877, (2013)
Multimedia Tools and Applications

Snehasis Mukherjee¹ &
Dipti Prasad Mukherjee¹


Abstract

In this paper, decision variables for the key-frame detection problem in a video are evaluated using statistical tools derived from the theory of design of experiments. The pixel-by-pixel intensity difference of consecutive video frames is used as the factor, or decision variable, for designing an experiment for key-frame detection. The determination of a key-frame is correlated with the different values of the factor. A novel concept of meaningfulness of a video key-frame is also introduced to select the representative key-frame from a set of possible key-frames. The use of design of experiments and the meaningfulness property to summarize a video is tested on a number of videos taken from the MUSCLE-VCD-2007 dataset. The performance of the proposed approach in detecting key-frames is found to be superior to that of competing approaches such as the PME-based method (Liu et al., IEEE Trans Circuits Syst Video Technol 13(10):1006–1013, 2003; Mukherjee et al., IEEE Trans Circuits Syst Video Technol 17(5):612–620, 2007; Panagiotakis et al., IEEE Trans Circuits Syst Video Technol 19(3):447–451, 2009).


References

Adjeroh D, Lee MC, Banda N, Kandaswamy U (2009) Adaptive edge-oriented shot boundary detection. J Image Video Process 2009(5):5:1–5:13
Calic J, Izquierdo E (2002) Efficient key-frame extraction and video analysis. In: Proc. IEEE international conference on information technology: coding and computing, Washington, DC, USA, pp 28–33
Chasanis VT, Likas AC, Galatsanos NP (2009) Scene detection in videos using shot clustering and sequence alignment. IEEE Trans Multimedia 11(1):89–100
Desolneux A, Moisan L, Morel J (2003) A grouping principle and four applications. IEEE Trans Pattern Anal Mach Intell 25(4):508–513
Desolneux A, Moisan L, Morel J (2008) From gestalt theory to image analysis: a probabilistic approach. In: Interdisciplinary applied mathematics, vol 34. Springer, New York
Gao Y, Tang J, Xie X (2009) Key frame vector and its application to shot retrieval. In: Proc. 1st international workshop on interactive multimedia for consumer electronics, Beijing, China, pp 27–34
Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301):13–30
Law-To J, Joly A, Boujemaa N (2007) Muscle-VCD-2007: a live benchmark for video copy detection. http://www-rocq.inria.fr/imedia/civr-bench/. Accessed May 2010
Lienhart R (2001) Reliable transition detection in videos: a survey and practitioner’s guide. Int J Image Graph 1(3):469–486
Liu TM, Zhang HJ, Qi FH (2003) A novel key-frame extraction algorithm based on perceived motion energy model. IEEE Trans Circuits Syst Video Technol 13(10):1006–1013
Mills M (1992) A magnifier tool for video data. In: Proc. ACM conference on human factors in computing systems, Monterey, California, USA, pp 93–98
Mukherjee DP, Das SK, Saha S (2007) Key-frame estimation in video using randomness measure of feature point pattern. IEEE Trans Circuits Syst Video Technol 17(5):612–620
Ouyang J, Li J, Tang H (2006) Interactive key frame selection model. J Vis Commun Image Represent 17(6):1145–1163
Panagiotakis C, Doulamis A, Tziritas G (2009) Equivalent key frames selection based on iso-content principles. IEEE Trans Circuits Syst Video Technol 19(3):447–451
Park SH (1996) Robust design and analysis for quality engineering. Chapman & Hall, London
Pickering MJ, Rüger SM, Sinclair D (2002) Video retrieval by feature learning in key frames. In: Proc. international conference on image and video retrieval, pp 309–317
Pye D, Hollinghurst NJ, Mills TJ, Wood KR (1998) Audio-visual segmentation for content-based retrieval. In: Proc. international conference on spoken language processing
Rasheed Z, Shah M (2005) Detection and representation of scenes in videos. IEEE Trans Multimedia 7(6):1097–1105
Richard GL (2007) Statistical concepts: a second course. Lawrence Erlbaum Associates, Mahwah
Roy RK (2001) Design of experiments using the Taguchi approach. Wiley, New York
Smeaton AF, Over P, Doherty AR (2010) Video shot boundary detection: seven years of TRECVid activity. Comput Vis Image Underst 114(4):411–418
Song X, Fan G (2005) Joint key-frame extraction and object-based video segmentation. In: Proc. IEEE workshop on motion and video computing, vol 2. Breckenridge, Colorado, pp 126–131
Spyrou E, Tolias G, Mylonas P, Avrithis Y (2009) Concept detection and keyframe extraction using a visual thesaurus. Multimedia Tools Appl 41(3):337–373
Valdes V, Martinez JM (2010) A framework for video abstraction systems analysis and modelling from an operational point of view. Multimedia Tools Appl 49(1):7–35
Wolf W (1996) Key frame selection by motion analysis. In: Proc. IEEE international conference on acoustics, speech and signal processing, vol 2, Washington, DC, USA, pp 1228–1231
Yeung MM, Yeo BL (1997) Video visualization for compact presentation and fast browsing of pictorial content. IEEE Trans Circuits Syst Video Technol 7(5):771–785
Zhuang Y, Rui Y, Huang TS, Mehrotra S (1998) Adaptive key frame extraction using unsupervised clustering. In: Proc. IEEE international conference on image processing, Chicago, USA, pp 866–870


Author information

Authors and Affiliations

Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, India
Snehasis Mukherjee & Dipti Prasad Mukherjee


Corresponding author

Correspondence to Snehasis Mukherjee.

Appendices

Appendix A: Obtaining (13) from [7]

Hoeffding’s inequality [7]: In our problem, $m_i$ is the number of frames in the ith unit. We can then formulate the problem with a sequence of i.i.d. random variables $\{X_q\}_{q=1,2,3,\ldots,m_i}$ such that $0 \leq X_q \leq 1$. Let us define $X_q$ as

$$ X_q =\left\{ \begin{array}{ll} 1 & \mathrm{when}~~ l_q > \eta\\ 0 & \mathrm{otherwise}, \end{array}\right. $$

(19)

for a given η, where $l_q$ is the l-ratio value of the qth frame of the ith unit. We set $S_{m_i}=\sum_{q=1}^{m_i}X_q$ (i.e., the number of frames of the ith unit having l-ratio greater than η) and $\nu m_i=E\left[S_{m_i}\right]$. Then for $\nu m_i < t < m_i$ (since ν is a probability value less than 1), putting $\sigma=\frac{t}{m_i}$ as in [5], according to Hoeffding’s inequality,

$$P_i^{\eta}=P\left(S_{m_i}\geq t\right)\leq e^{-m_i\left(\sigma \log {\frac{\sigma}{\nu}}+(1-\sigma) \log {\frac{1-\sigma}{1-\nu}}\right)}. $$

(20)

In addition, the right hand term of this inequality satisfies,

$$ e^{-m_i\left(\sigma \log {\frac{\sigma}{\nu}}+(1-\sigma) \log {\frac{1-\sigma}{1-\nu}}\right)}\leq e^{-m_i\left(\sigma-\nu\right)^2H(\nu)}\leq e^{-2m_i\left(\sigma-\nu\right)^2}, $$

(21)

where

$$ H(\nu) =\left\{ \begin{array}{ll} \dfrac{1}{1-2\nu}\log {\dfrac{1-\nu}{\nu}} & \mathrm{when}~~ 0<\nu<\dfrac{1}{2}\\\\ \dfrac{1}{2\nu(1-\nu)} & \mathrm{when}~~ \dfrac{1}{2}\leq \nu<1 \end{array}\right. $$

(22)

This is Hoeffding’s inequality. We then apply it to find the sufficient condition for ϵ-meaningfulness. If $t\geq \nu m_i+\sqrt{\frac{\log {\lambda} - \log {\epsilon}}{H(\nu)}}\sqrt{m_i}$, then dividing by $m_i$ and putting $\sigma=\frac{t}{m_i}$ gives $\sigma-\nu\geq \sqrt{\frac{\log {\lambda} - \log {\epsilon}}{H(\nu)\,m_i}}$, and squaring both sides we get

$$ m_i\left(\sigma-\nu\right)^2 \geq \frac{\log {\lambda}-\log {\epsilon}}{H(\nu)}. $$

(23)

Then using (20), (21) and (23) we get

$$ P_i^{\eta}\leq e^{-m_i(\sigma-\nu)^2H(\nu)}\leq e^{-\log {\lambda}+\log{\epsilon}}=\frac{\epsilon}{\lambda}. $$

(24)

By the definition of meaningfulness in (11), this means that the cut-off η is meaningful.

Since H(ν) ≥ 2 for ν in (0,1) (according to (22)), we obtain from (24) the sufficient condition of meaningfulness (13).
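The chain of bounds in (20)–(22) and the threshold on t used to reach (24) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and `lam` and `eps` simply stand for λ and ϵ as they appear in the condition on t above.

```python
import math


def H(nu):
    """H(nu) as defined in (22), for 0 < nu < 1."""
    if not 0.0 < nu < 1.0:
        raise ValueError("nu must lie in (0, 1)")
    if nu < 0.5:
        return math.log((1.0 - nu) / nu) / (1.0 - 2.0 * nu)
    return 1.0 / (2.0 * nu * (1.0 - nu))


def hoeffding_bounds(m, t, nu):
    """Return the three upper bounds on P(S_m >= t) from (20) and (21).

    The tuple is ordered from tightest to loosest.
    """
    sigma = t / m
    if not nu < sigma < 1.0:
        raise ValueError("need nu * m < t < m")
    # Exponent of (20): relative entropy between sigma and nu.
    kl = (sigma * math.log(sigma / nu)
          + (1.0 - sigma) * math.log((1.0 - sigma) / (1.0 - nu)))
    gap2 = (sigma - nu) ** 2
    return (math.exp(-m * kl),
            math.exp(-m * gap2 * H(nu)),
            math.exp(-2.0 * m * gap2))


def sufficient_threshold(m, nu, lam, eps=1.0):
    """Smallest t satisfying the condition stated before (23)."""
    return nu * m + math.sqrt((math.log(lam) - math.log(eps)) / H(nu)) * math.sqrt(m)


if __name__ == "__main__":
    m, nu, lam, eps = 100, 0.2, 100.0, 1.0  # illustrative values only
    t = sufficient_threshold(m, nu, lam, eps)
    print("threshold t:", t)
    print("bounds at t:", hoeffding_bounds(m, t, nu))
    print("eps / lam  :", eps / lam)
```

At the threshold returned by `sufficient_threshold`, the middle bound equals ϵ/λ up to rounding, which is the equality case of (24); any larger t only decreases it.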

Appendix B: Algorithm of the proposed approach

(1) Input the video sequence with speed X fps.
(2) Find the Euclidean distance of color values of each pair of consecutive frames.
(3) For γ = 1 to R (R is the maximum bound of color values) do
  (a) Give a binary value to each pixel using (2).
  (b) Find the matrix $\beta_d$ for each frame d.
  (c) Calculate the p-ratio $p_d$ using (3).
  (d) For κ = 0 to 1 step $\delta_\kappa$ do
    (i) Find all the frames with $p_d > \kappa$.
    (ii) If (any selected frame $f_i$ is less than X frames apart from $f_{i+1}$) do
      (A) Find the temporal distance between $f_{i+1}$ and $f_{i+2}$, $f_{i+2}$ and $f_{i+3}$, and so on until the temporal distance between $f_{i+m}$ and $f_{i+m+1}$ is greater than X.
      (B) Take the $f_{i+m}$ frame as the boundary of the group starting at frame $f_{i-1}$.
    (iii) End if
    (iv) Find the F-ratio using (4)–(6).
  (e) End for κ
(4) End for γ
(5) Find $F_{\max}$ = max(F-ratio) and the corresponding values of γ and κ.
(6) If $F_{\max} < F_{\mathrm{critical}}$
  (a) Consider the set of frames having p-ratio greater than κ as unit boundaries.
  else
  (b) Consider the whole video as a single unit.
(7) End if
(8) Find the l-ratio of each frame by (8).
(9) For each unit i do
  (a) For $\eta=0 ~ \mathrm{to} ~ 1 ~ \mathrm{step} ~ \frac{1}{\lambda}$ do
    (i) Find the probability ν using (9).
    (ii) Find the probability $P_{i}^{\eta}$ using (10).
    (iii) Find the NFA using (11).
    (iv) If NFA < ϵ (with ϵ = 1),
      (A) $\eta^{\prime} = \eta$.
      (B) Exit from the for loop of η.
      else
      (C) Continue the for loop of η.
    (v) End if
  (b) End for η
  (c) For $\xi=\eta^{\prime} ~ \mathrm{to} ~ 1 ~ \mathrm{step} ~ \frac{1}{\lambda}$ do
    (i) Find $r_{i}(\eta^{\prime})$ using (14).
    (ii) Find the c-value using (15).
  (d) End for ξ
  (e) Find the ξ satisfying (16).
  (f) Select the frames having l-ratio greater than ξ as key-frames.
(10) End for unit
(11) Display all the selected frames as key-frames.
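Steps (2) and (3)(a)–(d)(ii) of the algorithm can be sketched as follows. This is only an illustration under stated assumptions, since equations (2) and (3) are not reproduced in this appendix: the binarisation rule (a pixel is set to 1 when its color distance exceeds γ) and the p-ratio (the fraction of pixels set to 1) are assumed forms, and the function names `p_ratios` and `group_boundaries` are hypothetical. Gaps exactly equal to X are treated here as separating two groups, which the text leaves open.

```python
import numpy as np


def p_ratios(frames, gamma):
    """Assumed p-ratio for each pair of consecutive frames.

    frames: array-like of shape (n_frames, height, width, channels).
    Returns an array of length n_frames - 1.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Step (2): per-pixel Euclidean distance between the color vectors
    # of consecutive frames.
    dist = np.linalg.norm(frames[1:] - frames[:-1], axis=-1)
    # Step (3)(a), ASSUMED form of (2): binary value 1 where the
    # distance exceeds gamma.
    beta = dist > gamma
    # Step (3)(c), ASSUMED form of (3): fraction of pixels set to 1.
    return beta.reshape(beta.shape[0], -1).mean(axis=1)


def group_boundaries(candidates, X):
    """Step (3)(d)(ii): chain candidates less than X frames apart and
    keep the last frame of each chain as the group boundary."""
    boundaries = []
    prev = None
    for idx in sorted(candidates):
        if prev is not None and idx - prev < X:
            boundaries[-1] = idx  # still inside the current group
        else:
            boundaries.append(idx)  # a new group starts here
        prev = idx
    return boundaries


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.integers(0, 256, size=(60, 8, 8, 3))  # synthetic stand-in
    p = p_ratios(video, gamma=50)
    kappa = 0.9
    # Index d + 1 refers to the later frame of the pair (d, d + 1).
    selected = (np.flatnonzero(p > kappa) + 1).tolist()
    print(group_boundaries(selected, X=25))
```

The F-ratio of step (3)(d)(iv) is deliberately left out, because it depends on (4)–(6), which are given only in the body of the article.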


About this article

Cite this article

Mukherjee, S., Mukherjee, D.P. A design-of-experiment based statistical technique for detection of key-frames. Multimed Tools Appl 62, 847–877 (2013). https://doi.org/10.1007/s11042-011-0882-2


Published: 01 October 2011
Issue Date: February 2013
DOI: https://doi.org/10.1007/s11042-011-0882-2
