Algorithms for the boundary selection problem

Bhuyan, J. N.; Deogun, J. S.; Raghavan, V. V.

doi:10.1007/BF02522823

Published: February 1997

Algorithmica, Volume 17, pages 133–161 (1997)

Abstract

User-oriented clustering schemes classify documents according to the user's perception of the similarity between documents, rather than according to some similarity function presumed by the designer to represent the user's criteria. An earlier paper showed that such a classification scheme can be developed in two stages. The first stage accumulates the relevance judgements provided by users, vis-à-vis past query instances, into a suitable structure. The second stage consists of cluster identification. When the structure chosen in the first stage for accumulating the corelevance characteristics of documents is a straight line, the second stage can be formulated as a function optimization problem termed the Boundary Selection Problem (BSP). A branch-and-bound algorithm with a good bounding function is developed for the BSP. Although the bounding function achieves significant pruning, the complexity remains high for large problem instances. For such instances, a heuristic is developed that divides the problem into a number of subproblems, each solved by a branch-and-bound approach. The overall problem is then mapped to an integer knapsack problem and solved by dynamic programming. The tradeoff between accuracy and complexity can be controlled, letting the user favor one over the other. Assuming that the heuristic which divides the overall problem introduces no errors, the branch and bound with dynamic programming (BBDP) approach will converge to the optimal solution if given sufficient time. Two other heuristic approaches are also proposed for the BSP, one applying a polynomial dynamic programming algorithm and the other working in a greedy way, and an experimental comparison of all these approaches is provided. Experimental results indicate that all of the proposed algorithms perform better than the existing algorithm.
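
The abstract does not say how the BSP is mapped to an integer knapsack problem, so the following is only a textbook sketch of the dynamic programming technique it names, not the authors' formulation. The function name `integer_knapsack` and the parameters `weights`, `values`, and `capacity` are illustrative assumptions; each item may be chosen any non-negative integer number of times, and all weights are assumed to be positive integers.

```python
def integer_knapsack(weights, values, capacity):
    """Textbook integer (unbounded) knapsack solved by dynamic programming.

    Hypothetical illustration only: the item weights and values here do not
    come from the paper. Returns the largest total value of a multiset of
    items whose total weight does not exceed `capacity`.
    """
    # best[c] holds the optimal value achievable with capacity exactly c or less.
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= c:
                # Either keep the current best, or add one more copy of this item.
                best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


if __name__ == "__main__":
    print(integer_knapsack([2, 3, 4], [3, 4, 6], 7))
```

The table has `capacity + 1` entries and each entry inspects every item once, so the running time is proportional to the capacity times the number of items. This is pseudo-polynomial in the input size.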

Author information

Authors and Affiliations

Department of Computer Science, Tuskegee University, 36088, Tuskegee, AL, USA
J. N. Bhuyan
Department of Computer Science and Engineering, University of Nebraska-Lincoln, 68588-0115, Lincoln, NE, USA
J. S. Deogun
The Center for Advanced Computer Studies, University of Southwestern Louisiana, 70504-4330, Lafayette, LA, USA
V. V. Raghavan

Additional information

Communicated by C. L. Liu.

About this article

Cite this article

Bhuyan, J.N., Deogun, J.S. & Raghavan, V.V. Algorithms for the boundary selection problem. Algorithmica 17, 133–161 (1997). https://doi.org/10.1007/BF02522823

Received: 28 September 1994
Revised: 10 April 1995
Issue Date: February 1997
DOI: https://doi.org/10.1007/BF02522823
