Elsevier

Information Processing Letters

Volume 142, February 2019, Pages 27-29
Sets of binary sequences with small total Hamming distances

https://doi.org/10.1016/j.ipl.2018.10.005

Highlights

  • Sets of binary sequences that are close in terms of Hamming distance.

  • Minimum total Hamming distance of sets of binary sequences.

  • Explicit sets of n sequences, for each positive integer n, that are as close as possible.

Abstract

The sum of the Hamming distances between pairs of binary sequences in a set is considered. It is shown that this sum is at least 8 and 48 for sets of four and eight sequences, respectively, and is at least $(n-1)^2$ for sets of $n$ sequences where $n$ is not equal to 4 or 8. Sets meeting this minimum are explicitly specified.

Introduction

In order to save power and increase speed of data processing, it is preferable to represent data in the form of sequences that are close together. To quantify closeness, we may consider the number of pairs of sequences at distance 1 from each other [1]. Other criteria for closeness such as the diameter, connectivity, and neighborhood of a hypercube representing the data are also considered [2], [3], [4]. In this paper, we consider a different criterion, namely, the total Hamming distance between all pairs of sequences and specify sets of binary sequences as close as possible under this criterion. Although in practice, data is represented by finite sequences, in order to simplify notation and not restrict the lengths of sequences, we consider infinite sequences. Actually, to construct a set of $n$ binary sequences with minimum total Hamming distance, it suffices to consider sequences of lengths no more than $n-1$.

Let $s=(s_1 s_2 \ldots)$ and $s'=(s'_1 s'_2 \ldots)$ be binary sequences. The Hamming distance, $d(s,s')$, between $s$ and $s'$ is the number of positions $i$ for which $s_i \neq s'_i$ [3]. Henceforth we refer to the Hamming distance simply as distance. Since $s$ and $s'$ are binary sequences, $d(s,s') = \sum_i |s_i - s'_i|$.
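As an illustration of this definition, the following minimal sketch computes the distance between two sequences stored as finite tuples. The function name `hamming_distance` and the tuple representation are assumptions made for this example; padding the shorter tuple with 0's stands in for infinite sequences that are eventually all 0.

```python
from itertools import zip_longest


def hamming_distance(s, t):
    """Distance between two binary sequences given as finite tuples.

    The shorter tuple is padded with 0's (an assumption of this sketch),
    so (1, 0) and (1, 0, 0) are treated as the same sequence.
    """
    # For binary entries, |a - b| is 1 exactly when the entries differ.
    return sum(abs(a - b) for a, b in zip_longest(s, t, fillvalue=0))


print(hamming_distance((1, 0, 0), (0, 1)))  # 2: positions 1 and 2 differ
```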

Let $S$ be a finite set of binary sequences. We define the total distance of $S$, $d(S)$, to be the sum of the distances between pairs of sequences in $S$, i.e.,

$$d(S) = \sum_{\{s,s'\} \subseteq S} d(s,s').$$
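The total distance can be sketched directly from this definition by summing over unordered pairs. The name `total_distance` and the representation of sequences as finite tuples (padded with 0's where lengths differ) are assumptions of this example, not notation from the paper.

```python
from itertools import combinations, zip_longest


def total_distance(S):
    """Sum of pairwise distances over all unordered pairs of sequences in S."""
    return sum(
        # Distance of one pair; shorter tuples are padded with 0's.
        sum(abs(a - b) for a, b in zip_longest(s, t, fillvalue=0))
        for s, t in combinations(S, 2)
    )


# Three sequences: all-0's and two sequences with a single 1.
example = [(0, 0), (1, 0), (0, 1)]
print(total_distance(example))  # 1 + 1 + 2 = 4
```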

Given a positive integer $n$, we are interested in finding the minimum possible total distance, denoted by $d_{\min}(n)$, among all sets of $n$ binary sequences, and the sets that achieve this minimum. For each $n \geq 1$, we define a set of $n$ binary sequences, $S_n$, which plays an important role in our investigation. This set consists of the all-0's sequence and the $n-1$ sequences that each have a single 1, located in one of the first $n-1$ positions.
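The set described above can be built explicitly for small n. The sketch below is only an illustration: the helper names `build_S` and `total_distance` are hypothetical, and each sequence is truncated to its first n - 1 positions, since every later position is 0.

```python
from itertools import combinations


def build_S(n):
    """All-0's sequence plus the n - 1 sequences with a single 1 in the first n - 1 positions."""
    length = n - 1
    zero = (0,) * length
    units = [
        tuple(1 if j == i else 0 for j in range(length))
        for i in range(length)
    ]
    return [zero] + units


def total_distance(S):
    """Sum of pairwise Hamming distances for equal-length tuples."""
    return sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(S, 2)
    )


for n in range(1, 8):
    print(n, total_distance(build_S(n)), (n - 1) ** 2)
```

The two printed values agree for each n: the all-0's sequence is at distance 1 from each of the other n - 1 sequences, and any two single-1 sequences are at distance 2, so the total is (n - 1) + 2 * (n - 1)(n - 2)/2 = (n - 1)^2.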

The main result is stated in the next section which gives an expression for dmin(n) and sets that achieve this minimum total distance. The proof is provided in Section 3.

Result

Our main result is the following theorem.

Theorem 1

For $n=4$, $d_{\min}(4)=8$ which is achieved by the set composed of the four sequences $(000\cdots)$, $(100\cdots)$, $(010\cdots)$, and $(110\cdots)$, where the dots stand for 0's. For $n=8$, $d_{\min}(8)=48$ which is achieved by the set composed of the eight sequences $(0000\cdots)$, $(1000\cdots)$, $(0100\cdots)$, $(0010\cdots)$, $(1100\cdots)$, $(1010\cdots)$, $(0110\cdots)$, and $(1110\cdots)$. For all other values of $n \geq 1$, $d_{\min}(n)=(n-1)^2$ which is achieved by $S_n$.
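The values in Theorem 1 can be checked numerically for very small n. The sketch below is an exhaustive search, not the paper's proof technique; it relies on the statement in the Introduction that sequences of length at most n - 1 suffice, and the names `total_distance` and `brute_force_dmin` are hypothetical. The search space grows very quickly, so it is only practical for n up to about 5.

```python
from itertools import combinations, product


def total_distance(S):
    """Sum of pairwise Hamming distances for equal-length tuples."""
    return sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(S, 2)
    )


def brute_force_dmin(n):
    """Minimum total distance over all n-subsets of binary tuples of length n - 1."""
    universe = list(product((0, 1), repeat=n - 1))
    return min(total_distance(S) for S in combinations(universe, n))


# The explicit sets from Theorem 1, truncated to their leading positions.
four = [(0, 0), (1, 0), (0, 1), (1, 1)]
eight = list(product((0, 1), repeat=3))

print(total_distance(four))                         # 8
print(total_distance(eight))                        # 48
print([brute_force_dmin(n) for n in range(1, 6)])   # [0, 1, 4, 8, 16]
```

For n = 4 the search returns 8 rather than (4 - 1)^2 = 9, matching the exceptional case in the theorem.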

Proof of Theorem 1

Interestingly, as shown next, the total distance of a set $S$ of binary sequences can be determined easily from the number of sequences and their sum over the real numbers. Define the set sum of $S$ to be $\sigma(S) = \sum_{s \in S} s$.

Lemma 1

Let $S$ be a set of $n$ binary sequences with set sum $\sigma(S) = (\sigma_1 \sigma_2 \ldots)$. Then, the total distance of $S$ is given by

$$d(S) = \sum_i \sigma_i (n - \sigma_i).$$
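A quick numerical check of Lemma 1 is sketched below: it compares the pairwise definition of total distance with the set-sum formula on one example set. The function names and the equal-length tuple representation are assumptions of this illustration.

```python
from itertools import combinations


def total_distance_pairwise(S):
    """Total distance computed directly from the pairwise definition."""
    return sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(S, 2)
    )


def total_distance_by_set_sum(S):
    """Total distance via Lemma 1: sum over positions of sigma_i * (n - sigma_i)."""
    n = len(S)
    # sigma_i is the number of sequences with a 1 in position i.
    sigma = [sum(column) for column in zip(*S)]
    return sum(x * (n - x) for x in sigma)


S = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
print(total_distance_pairwise(S), total_distance_by_set_sum(S))  # 18 18
```

Position i contributes one unit for every pair in which exactly one sequence has a 1 there, and there are sigma_i * (n - sigma_i) such pairs. The set-sum form needs one pass over the entries instead of a pass over all pairs.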

Proof

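From the definitions of the distance between two sequences and the total distance of a set, we have

$$d(S) = \sum_{\{(s_1 s_2 \ldots),(s'_1 s'_2 \ldots)\} \subseteq S} \sum_i |s_i - s'_i| = \sum_i \sum_{\{(s_1 s_2 \ldots),(s'_1 s'_2 \ldots)\} \subseteq S} |s_i - s'_i|$$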
