DOI: 10.1145/3162957.3163026
research-article

RT-PUSH: a VM fault detector for deadline-based tasks in cloud

Published: 24 November 2017

ABSTRACT

The cloud has become an important computing paradigm because of its cost-efficiency, scalability, availability, and high resource utilization. An increasing number of deadline-based (real-time) applications are moving to the cloud for high availability and low latency. Applications such as financial transactions, health-care systems, and scientific workflows are deadline-bound by nature and need computing services to be provisioned even in the presence of failures. A cloud running deadline-based applications therefore needs a fault-tolerant framework that can assure availability and responsiveness. Faults can occur in any layer of the cloud (physical, virtual, or application) and can be handled by various fault management methods, including fault avoidance, fault detection, and fault recovery. Here, our primary focus is on Virtual Machine (VM) failure and its detection. We show the behavior of a VM in its normal working condition and in the failed state through a state transition diagram. We propose RT-PUSH, a timeout-based VM fault detector for clouds running real-time applications. We use success ratio and execution drop rate as performance metrics to evaluate the effectiveness of the proposed fault detector.
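The abstract does not spell out how RT-PUSH works internally, so the following is only a minimal sketch of a generic push-style, timeout-based failure detector, not the authors' algorithm. It assumes that each VM pushes heartbeats to a monitor and that a fixed timeout is used; the class name `PushTimeoutDetector`, its methods, and the VM identifiers are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PushTimeoutDetector:
    """Generic push-style detector (illustrative, not the RT-PUSH algorithm)."""

    timeout: float  # assumed fixed: seconds of silence before a VM is suspected
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, vm_id: str, now: float) -> None:
        # Record the arrival time of the latest heartbeat pushed by the VM.
        self.last_seen[vm_id] = now

    def suspected(self, now: float) -> List[str]:
        # A VM is suspected when no heartbeat arrived within the timeout window.
        return sorted(
            vm for vm, seen in self.last_seen.items() if now - seen > self.timeout
        )


detector = PushTimeoutDetector(timeout=3.0)
detector.heartbeat("vm-1", now=0.0)
detector.heartbeat("vm-2", now=0.0)
detector.heartbeat("vm-1", now=2.5)  # vm-1 keeps reporting, vm-2 goes silent
failed = detector.suspected(now=4.0)
print(failed)  # ['vm-2']
```

In a deadline-based setting the timeout would presumably have to be short relative to task deadlines, so that a failed VM is noticed while there is still time to recover; how RT-PUSH chooses that value is not stated in the abstract.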

Published in

ICCIP '17: Proceedings of the 3rd International Conference on Communication and Information Processing
November 2017
545 pages
ISBN: 9781450353656
DOI: 10.1145/3162957

Copyright © 2017 ACM

© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 61 of 301 submissions, 20%