Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spamming attacks users in many ways, and these attacks are usually countered by having a server filter out the spammers. This paper presents a fully distributed pattern recognition system within P2P networks that uses the distributed associative memory tree (DASMET) algorithm to detect spam; the system is cost-efficient and, unlike server-based systems, not prone to a single point of failure. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similar occurring patterns. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BPNN and RBFN) and distributed P2P-based algorithms (Ivote-DPV, ensemble k-NN, ensemble naive Bayes, and P2P-GN). The experimental results show that our method is highly accurate, with a 98 to 99% accuracy rate, and incurs a small number of messages: in the best case, it requires only two messages per recall test. In summary, our experimental results show that, compared to the other distributed methods, DASMET performs best for spam detection while using a relatively small amount of resources.

The term feature is used interchangeably with the term attribute in this paper.
Recall refers to classification or prediction in this context.
The input for generating a bias identifier at a leaf node is a raw sub-pattern, while the input at an internal node and at the root node is a sequence of the combined identifiers of its child nodes.
The lookup process involves only a single-hop message.
True positive rate is equivalent to recall in the content retrieval context. However, we do not use the term recall here since, in the context of this paper, recall refers to prediction.
F-measure is also known as F-score or \(F_{1}\) score.
Every peer knows the location of all the others, so that direct connections among them can be established.
The research reported in this paper is supported by the Research Acculturation Grant Scheme (RAGS) 9018-00080. The authors would also like to express their gratitude to the Malaysian Ministry of Higher Education (MOHE) and University Malaysia Perlis (UniMAP) for the facilities provided.
Appendix A: Other Algorithms
A.1 Tree construction
Algorithm 3 is executed to generate a logical DASMET tree. The process of constructing the logical tree is recursive and starts from the root node. Let level be 0, \(\widehat {X}=\{\hat {x}_{i}\}_{i=1}^{n_{h}}\) be a set of sub-patterns, w be the number of sub-patterns (\(n_{h}\)), m be the maximum number of children of each node and \(d_{s}\) be the segment size at a leaf node. Note that m in this algorithm is equal to \(\varphi_{s}\). All segments in \(\widehat {X}\) are initially assigned to the root node V, which then executes the steps of the function constructTree(level, \(w,\widehat {X},m\), H) given in Algorithm 3.

The node first determines whether it should expand the tree. If w is less than or equal to m, the node creates w leaf nodes and assigns one segment to each leaf node; this completes the process. Otherwise, it determines the number of children \(n_{c}\) using (9).
Next, it creates \(n_{c}\) child nodes and distributes the available segments among them using a greedy approach. Upon receiving its w segments from its parent, every child node then executes Algorithm 3 in turn. This process is repeated recursively until w ≤ m.
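The recursion described above can be illustrated with a minimal Python sketch. It is not a reproduction of Algorithm 3. The `Node` class, the helper names, and the simplified signature (which omits the w and H arguments of constructTree) are hypothetical. Two rules are assumptions made only so that the sketch runs: `number_of_children` is a stand-in for (9), which is not reproduced in this section, and `distribute` is one possible reading of the greedy distribution step.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List, Sequence


@dataclass
class Node:
    """Hypothetical logical tree node; only leaves hold a segment."""
    level: int
    segments: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def number_of_children(w: int, m: int) -> int:
    # ASSUMPTION: placeholder for Eq. (9), which is not given here.
    # For w > m >= 2 it returns a value between 2 and m.
    return min(m, ceil(w / m))


def distribute(segments: Sequence[str], n_c: int) -> List[List[str]]:
    # ASSUMPTION: one greedy rule; each child in turn takes
    # ceil(remaining segments / remaining children) contiguous segments.
    groups, start = [], 0
    for k in range(n_c):
        size = ceil((len(segments) - start) / (n_c - k))
        groups.append(list(segments[start:start + size]))
        start += size
    return groups


def construct_tree(level: int, segments: Sequence[str], m: int) -> Node:
    """Recursively build the logical tree for the given segments."""
    if m < 2:
        raise ValueError("m must be at least 2 for the recursion to terminate")
    w = len(segments)
    node = Node(level)
    if w <= m:
        # Base case: create w leaf nodes, one segment per leaf.
        node.children = [Node(level + 1, [s]) for s in segments]
        return node
    # Otherwise expand: create n_c children and recurse on each share.
    n_c = number_of_children(w, m)
    for group in distribute(segments, n_c):
        node.children.append(construct_tree(level + 1, group, m))
    return node


def leaf_segments(node: Node) -> List[str]:
    """Collect the segments stored at the leaves, left to right."""
    if not node.children:
        return list(node.segments)
    collected: List[str] = []
    for child in node.children:
        collected.extend(leaf_segments(child))
    return collected
```

Under these assumptions, calling `construct_tree(0, segments, 3)` on ten segments yields a root with three children holding four, three and three segments; the first child expands once more because 4 > m, while the other two create their leaves directly.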
A.2 Generate Identifier

