High-dimensionality priority selection scheme of bioinformatics information using Bernoulli distribution

Cluster Computing

Abstract

Recently, as the volume of genetic information has grown following the completion of the Human Genome Project, bioinformatics information management has come to the fore. However, because bioinformatics information comprises diverse kinds of genetic information, users cannot easily access and use it. This paper proposes a high-dimensionality information management scheme that uses the Bernoulli distribution to select the most frequently used pieces of bioinformatics information, so that users can easily reach the information they prefer. The proposed scheme is an approach to high-dimensionality priority selection that requires the presentation of two or more pieces of bioinformatics information. In addition, because the priority order of information is determined from the kinds, functions, and characteristics of bioinformatics information, users can easily access the information that matches their purpose of use. In experiments, the proposed scheme achieved a success rate in bioinformatics information searches 11.6% higher than that of existing schemes, and the delay time of bioinformatics information services used by independent users was 17.3% shorter than that of existing schemes.
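
The abstract does not give the scheme's formulas, so the following is only a minimal sketch of the general idea of ranking items by how often they are used, under stated assumptions. It treats each user session as one Bernoulli trial per item (success if the item was used), estimates the success probability with Laplace smoothing, and returns the highest-ranked items. The function name `rank_by_bernoulli_rate`, the session log format, the smoothing choice, and the item names are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def rank_by_bernoulli_rate(access_log, catalog, top_k=2):
    """Rank catalog items by an estimated Bernoulli usage probability.

    Assumption (not from the paper): each session is one Bernoulli trial
    per item, counted as a success when the item appears in that session.
    The estimate uses Laplace smoothing: (successes + 1) / (trials + 2).
    """
    trials = len(access_log)
    successes = Counter()
    for session in access_log:
        # set() ensures an item counts at most once per session (one trial).
        for item in set(session):
            if item in catalog:
                successes[item] += 1

    scores = {item: (successes[item] + 1) / (trials + 2) for item in catalog}
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(catalog, key=lambda item: (-scores[item], item))
    return [(item, scores[item]) for item in ranked[:top_k]]


# Hypothetical session log: each inner list is the items used in one session.
sessions = [
    ["gene_seq", "protein"],
    ["gene_seq"],
    ["gene_seq", "pathway"],
    ["protein"],
]
items = ["gene_seq", "protein", "pathway", "snp"]
top = rank_by_bernoulli_rate(sessions, items, top_k=2)
```

The default `top_k=2` mirrors the abstract's statement that two or more pieces of information are presented. A ranking that also weighs the kinds, functions, and characteristics of the information, as the abstract describes, would need additional features that this sketch does not model.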

Acknowledgments

This research was supported by the Tongmyong University Research Grants 2016.

Author information

Corresponding author

Correspondence to Seung-Soo Shin.

About this article

Cite this article

Jeong, YS., Shin, SS. & Han, KH. High-dimensionality priority selection scheme of bioinformatics information using Bernoulli distribution. Cluster Comput 20, 539–546 (2017). https://doi.org/10.1007/s10586-016-0622-5

  • DOI: https://doi.org/10.1007/s10586-016-0622-5
