Collaborative filtering has become a popular technique for recommendation systems. However, as the volume of data grows rapidly, constructing the similarity matrix becomes a performance bottleneck in these systems. The MapReduce framework proposed by Google has recently been widely used for data-intensive applications. In this work, we therefore propose ConSimMR, an efficient parallel algorithm for constructing a similarity matrix using MapReduce. We first partition the set of items into disjoint groups such that items rated by similar users tend to fall in the same group. We next compute the similarity of every pair of items belonging to the same group. Finally, we calculate the similarity of every pair of items belonging to different groups. In this step, by using the rating list of each user rather than that of each item, we can compute the similarities in parallel, which improves performance. We conducted experiments comparing ConSimMR with previous algorithms on real-life data sets and confirmed both the efficiency and the scalability of ConSimMR.
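
The idea of working from each user's rating list rather than each item's can be illustrated with a minimal in-memory sketch. This is not the ConSimMR algorithm itself: the grouping step is omitted, cosine similarity is assumed as the measure (the abstract does not name one), ratings are assumed to be nonzero, and the function names `map_user`, `reduce_sum`, and `cosine_similarities` are hypothetical. Each user's list is processed independently in the map step, which is what would allow the work to be spread across mappers.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def map_user(ratings):
    """Map step (hypothetical): from one user's rating list,
    emit partial products keyed by item pair."""
    items = sorted(ratings)
    for item in items:
        # Key (item, item) carries the squared rating, used later for the norm.
        yield (item, item), ratings[item] ** 2
    for a, b in combinations(items, 2):
        # Key (a, b) carries this user's contribution to the dot product.
        yield (a, b), ratings[a] * ratings[b]


def reduce_sum(pairs):
    """Reduce step (hypothetical): sum all values emitted for the same key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def cosine_similarities(user_ratings):
    """Combine the summed dot products and norms into cosine similarities.

    Assumes cosine similarity and nonzero ratings; both are assumptions
    of this sketch, not statements about the paper's method.
    """
    emitted = (kv for ratings in user_ratings.values() for kv in map_user(ratings))
    totals = reduce_sum(emitted)
    sims = {}
    for (a, b), dot in totals.items():
        if a != b:
            sims[(a, b)] = dot / (sqrt(totals[(a, a)]) * sqrt(totals[(b, b)]))
    return sims
```

For example, calling `cosine_similarities({"u1": {"i1": 1.0, "i2": 1.0}, "u2": {"i1": 1.0}})` yields a single entry for the pair `("i1", "i2")`. Because a user emits only the pairs of items that user actually rated, item pairs with no common rater never appear in the output.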

Kim, S., Kim, H. & Min, JK. An efficient parallel similarity matrix construction on MapReduce for collaborative filtering. J Supercomput 75, 123–141 (2019). https://doi.org/10.1007/s11227-018-2271-3
