On the Relation Between the Relative Earth Mover Distance and the Variation Distance (an Exposition)

Computational Complexity and Property Testing

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12050)

Abstract

The “relative earth mover distance” is a technical term introduced by Valiant and Valiant (43rd STOC, 2011), and extensively used in their work. They claimed that, for every two distributions, the relative earth mover distance upper-bounds the variation distance up to relabeling, but this claim was not used in their work. The claim appears as a special case of a result proved by Valiant and Valiant in a later work (48th STOC, 2016), but we found their proof too terse. The proof presented here is merely an elaboration of (this special case of) their proof.

Notes

  1. See http://www.wisdom.weizmann.ac.il/~oded/p_remd.html.

  2. Specifically, for two distributions presented by their probability functions \(p,q:D\rightarrow [0,1]\), their variation distance equals \(0.5\cdot \sum _{i\in D}|p(i)-q(i)|\), which in turn equals \(\max _{S\subseteq D}\{p(S)-q(S)\}\), where \(p(S)=\sum _{i\in S}p(i)\). The set S may be viewed as the set of samples on which an observer (as discussed next) outputs the verdict 1.

  3. Here and in the sequel, the logarithm is to base 2. The proof of Theorem 3.1 as presented in Sect. 3 remains valid for any base \(b\in (1,e]\); our only reference to this base is that it (i.e., b) should satisfy \(\log _bz>1-(1/z)\) for every \(z>1\). It seems that Valiant and Valiant do mean to take \(b=2\) (although other parts of their text suggest \(b=e\)). Indeed, both \(b=2\) and \(b=e\) seem natural choices.

  4. As stated in Footnote 3, we assume that the logarithm is to base \(b\in (1,e]\). Indeed, here we use \(\log _b z+(1/z)>1\) for all \(z>1\), and this is the only place in the proof in which the choice of b matters.

  5. That is, letting \(\pi _p\) and \(\pi _q\) be permutations over [n] such that \(p(\pi _p(j))\le p(\pi _p(j+1))\) and \(q(\pi _q(j))\le q(\pi _q(j+1))\) for every \(j\in [n-1]\), in the \(i^\mathrm{th}\) iteration we transport one unit from location \(p(\pi _p(i))\) of \(h_p\) to location \(q(\pi _q(i))\) of \(h_q\).

  6. To see that (1) holds, note that the cost of \(\ell '\) equals the cost of \(\ell \) plus \(c\cdot |x^*-y^*|-c\cdot |x'-y^*|-c\cdot |x^*-y'|+c\cdot |x'-y'|\). Hence, we need to verify that the added value is not positive; equivalently, that \(|x^*-y^*|-|x^*-y'|\le |x'-y^*|-|x'-y'|\). Consider the following cases:

    1. The diagonal line \(y=x\) does not cross the rectangle spanned by \((x^*,y^*)\) and \((x',y')\) (i.e., either \(y'\le x^*\) or \(y^*\ge x'\)). If \(y'\le x^*\), then \(|y^*-x^*|-|y'-x^*|=y'-y^*=|y^*-x'|-|y'-x'|\), and otherwise \(|y^*-x^*|-|y'-x^*|=-(y'-y^*)=|y^*-x'|-|y'-x'|\).

    2. The diagonal line \(y=x\) separates one corner-point of the rectangle from the other three corner-points (e.g., \(y'>x^*\) but \(y<x\) for \((x,y)\in \{(x^*,y^*),(x',y^*),(x',y')\}\)). If \(y'>x^*\), then \(|y^*-x^*|-|y'-x^*| < y'-y^* = |y^*-x'|-|y'-x'|\), and similarly for the case that \((x',y^*)\) is separated.

    3. The diagonal line \(y=x\) crosses both horizontal lines of the rectangle (i.e., \(y^*,y'\in [x^*,x']\)). In this case, \(|y^*-x^*|-|y'-x^*|=-(y'-y^*)\) and \(|y^*-x'|-|y'-x'|=y'-y^*\).

    4. The diagonal line \(y=x\) crosses both vertical lines of the rectangle (i.e., \(x^*,x'\in [y^*,y']\)). In this case \(|y^*-x^*|-|y'-x^*|<|y^*-x'|-|y'-x'|\), since \(|y^*-x^*|<|y^*-x'|\) and \(|y'-x^*|>|y'-x'|\).

    To see that (2) holds, recall that \(\ell '(x,y)=\ell (x,y)=m(x,y)\) for every \((x,y)<(x^*,y^*)\).
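The four-case analysis in Note 6 can also be sanity-checked numerically. The following is a minimal Python sketch and not part of the original proof: it assumes the ordering \(x^*\le x'\) and \(y^*\le y'\) that the cases implicitly rely on, and the function names `exchange_gap` and `check_grid`, as well as the choice of an integer grid, are illustrative.

```python
from itertools import product


def exchange_gap(x_star, x_prime, y_star, y_prime):
    """Return (|x'-y*| - |x'-y'|) - (|x*-y*| - |x*-y'|).

    Note 6 argues that this quantity is non-negative; this sketch
    assumes x_star <= x_prime and y_star <= y_prime.
    """
    lhs = abs(x_star - y_star) - abs(x_star - y_prime)
    rhs = abs(x_prime - y_star) - abs(x_prime - y_prime)
    return rhs - lhs


def check_grid(n=8):
    """Exhaustively test the inequality on the integer grid {0,...,n-1}^4."""
    for x_star, x_prime, y_star, y_prime in product(range(n), repeat=4):
        if x_star <= x_prime and y_star <= y_prime:
            if exchange_gap(x_star, x_prime, y_star, y_prime) < 0:
                return False  # a counterexample would contradict Note 6
    return True


if __name__ == "__main__":
    print(check_grid())
```

A grid check of this kind is no substitute for the case analysis, but it covers all four configurations of the diagonal relative to the rectangle, including the boundary cases where some coordinates coincide.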

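The two expressions for the variation distance in Note 2, namely half the \(\ell _1\) distance and the largest value of \(p(S)-q(S)\) over subsets \(S\subseteq D\), can be compared on a small example. This is a minimal Python sketch under stated assumptions: the distributions `P_EXAMPLE` and `Q_EXAMPLE` are made up for illustration, the function names are hypothetical, and the brute-force enumeration of subsets is feasible only for tiny domains.

```python
from fractions import Fraction
from itertools import combinations


def half_l1(p, q):
    """Half the l1 distance between p and q (dicts over the same domain)."""
    return sum(abs(p[i] - q[i]) for i in p) / 2


def best_subset_gap(p, q):
    """Largest value of p(S) - q(S) over all subsets S of the domain."""
    domain = list(p)
    best = Fraction(0)  # the empty set gives p(S) - q(S) = 0
    for size in range(1, len(domain) + 1):
        for subset in combinations(domain, size):
            gap = sum(p[i] - q[i] for i in subset)
            best = max(best, gap)
    return best


# Illustrative distributions over a three-element domain (an assumption).
P_EXAMPLE = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
Q_EXAMPLE = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

if __name__ == "__main__":
    print(half_l1(P_EXAMPLE, Q_EXAMPLE))
    print(best_subset_gap(P_EXAMPLE, Q_EXAMPLE))
```

Exact rational arithmetic is used so that the two quantities can be compared for equality without rounding concerns. A maximizing set is \(S=\{i\in D: p(i)>q(i)\}\), which corresponds to the observer outputting 1 exactly on the samples that are more likely under p.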
References

  1. Goldreich, O.: Introduction to Property Testing. Cambridge University Press, Cambridge (2017)

  2. Goldreich, O., Ron, D.: On sample-based testers. In: 6th Innovations in Theoretical Computer Science, pp. 337–345 (2015)

  3. Valiant, G., Valiant, P.: Estimating the unseen: an \(n/\log (n)\)-sample estimator for entropy and support size, shown optimal via new CLTs. In: 43rd ACM Symposium on the Theory of Computing, pp. 685–694 (2011). See ECCC TR10-180 for the algorithm, and TR10-179 for the lower bound

  4. Valiant, G., Valiant, P.: Instance optimal learning. CoRR abs/1504.05321 (2015)

  5. Valiant, G., Valiant, P.: Instance optimal learning of discrete distributions. In: 48th ACM Symposium on the Theory of Computing, pp. 142–155 (2016)

Author information

Correspondence to Oded Goldreich or Dana Ron.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Goldreich, O., Ron, D. (2020). On the Relation Between the Relative Earth Mover Distance and the Variation Distance (an Exposition). In: Goldreich, O. (eds) Computational Complexity and Property Testing. Lecture Notes in Computer Science, vol 12050. Springer, Cham. https://doi.org/10.1007/978-3-030-43662-9_9

  • DOI: https://doi.org/10.1007/978-3-030-43662-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43661-2

  • Online ISBN: 978-3-030-43662-9

  • eBook Packages: Computer Science, Computer Science (R0)
