Abstract
Many malpractices in stock market trading, such as circular trading and price manipulation, rely on collusion as their modus operandi. Informally, a set of traders is a candidate collusion set when the traders trade heavily among themselves relative to their trading with others. We formalize the problem of detecting collusion sets, if any, in a given trading database. We show that naïve approaches are inefficient for real-life situations. We adapt and apply two well-known graph clustering algorithms to this problem. We also propose a new graph clustering algorithm specifically tailored for detecting collusion sets. A novel feature of our approach is the use of the Dempster–Shafer theory of evidence to combine the candidate collusion sets detected by the individual algorithms. Treating individual experiments as evidence, this approach allows us to quantify the confidence (or belief) in the candidate collusion sets. We present detailed simulation experiments to demonstrate the effectiveness of the proposed algorithms.
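The abstract does not spell out how evidence from the individual algorithms is combined. As a minimal sketch, assuming the standard Dempster rule of combination over a two-element frame of discernment, the idea of turning two detectors' outputs into a single belief value might look as follows. The function names `combine` and `belief`, the hypothesis labels, and the mass values are all illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def belief(m, hypothesis):
    """Belief in a hypothesis: total mass of all subsets of it."""
    return sum(w for s, w in m.items() if s <= hypothesis)


# Hypothetical frame: is a given candidate set a collusion set or not?
C = frozenset({"collusion"})
N = frozenset({"no_collusion"})
FRAME = C | N

# Illustrative (made-up) evidence from two clustering runs.
m_a = {C: 0.6, FRAME: 0.4}
m_b = {C: 0.7, N: 0.1, FRAME: 0.2}

m = combine(m_a, m_b)
print(round(belief(m, C), 4))
```

With these made-up masses the combined belief in "collusion" is about 0.87, higher than either source alone, which is the qualitative behaviour one expects when two independent sources largely agree.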
References
Bapeswara Rao VV and Sankara Rao K (1985). Enumeration of Hamiltonian circuits in digraphs. Proc. IEEE 73: 1524–1525
Gowda KC and Krishna G (1978). Agglomerative clustering using the concept of mutual nearest neighborhood. Pattern Recogn 10: 105–112
Honkanen PA (1978) Circuit enumeration in an undirected graph. In: Proceedings of the 16th ACM southeast regional conference, pp 49–53
Jain AK, Duin RPW and Mao J (2000). Statistical pattern recognition: a review. IEEE Trans Pattern Anal Machine Intelligence 22(1): 4–37
Jain AK, Murty MN and Flynn PJ (1999). Data clustering: a review. ACM Comput Surv 31(3): 264–323
Jarvis RA and Patrick EA (1973). Clustering using a similarity measure based on shared nearest neighbors. IEEE Trans Comput C-22(11): 1025–1034
Le Hegarat-Mascle S, Richard D and Ottle C (2003). Multi-scale data fusion using Dempster–Shafer evidence theory. Integr Comput-Aid Eng 10: 9–22
Palshikar GK, Bahulkar A (2000) Fuzzy temporal patterns for analysing stock market databases. In: Proceedings of the international conference on advances in data management (COMAD-2000), Pune, India, Tata-McGraw Hill, pp 135–142
Palshikar GK, Apte MM (2005) Collusion set detection using graph clustering. In: Proceedings of the conference on management of data (COMAD 2005b), Hyderabad, India, Computer Society of India, pp 101–111
Rich E, Knight D (1995) Artificial intelligence, 2/e. McGraw-Hill
Rubin F (1974). A search procedure for Hamilton paths and circuits. J ACM 21(4): 576–580
SEBI Order against DSQ Holdings dated 10th December 2004. Order No. CO/109/ISD/12/2004. http://www.sebi.gov.in
Shafer G (1976) A mathematical theory of evidence. Princeton University Press
Tarjan RE and Read RC (1975). Bounds on backtrack algorithm for listing cycles, paths and spanning trees. Networks 5: 237–252
Additional information
Responsible editor: Charu Aggarwal.
A preliminary version of this paper was published as Palshikar and Apte (2005).
Cite this article
Palshikar, G.K., Apte, M.M. Collusion set detection using graph clustering. Data Min Knowl Disc 16, 135–164 (2008). https://doi.org/10.1007/s10618-007-0076-8
DOI: https://doi.org/10.1007/s10618-007-0076-8