Optimizing task allocation in multi-query edge analytics

Cluster Computing

Abstract

Edge analytics is attracting ever-increasing interest, since processing streaming data close to where they are produced, rather than transferring them to the cloud, ensures lower latency while also addressing data privacy issues. In this work, we deal with the placement of analytic tasks on heterogeneous geo-distributed edge devices while targeting three objectives, namely latency, quality of results, and resource utilization. In addition, we investigate this multi-objective problem in a multi-query setting, where we jointly optimize multiple analytic jobs while dynamically adjusting task placement decisions. We explore multiple solutions; interestingly, in a multi-query setting, our proposals can improve all three objectives simultaneously in many cases. Furthermore, we develop a proof-of-concept prototype using Apache Storm. Our solutions are thoroughly evaluated and shown to yield improvements of more than 50% compared to advanced baselines targeting only latency. Moreover, our software prototype achieves speedups of up to 6\(\times\) over the Resource Aware Apache Storm scheduler, with an average speedup of 2.76\(\times\), when deployed over a small-scale infrastructure.

Data availability

The scripts to generate the datasets analysed during the current study are publicly available in the following GitHub repository: https://github.com/annavalentina/Multi-Query-Optimization.

Notes

  1. In the remainder, we refer to edge computing as covering both edge and fog devices, although in general fog computing may be deemed distinct from edge computing [3].

  2. It is trivial to extend our notation and solution to consider task types, with the requirements referring to task types rather than individual tasks; for simplicity and to reduce notation, we do not consider task types.

  3. Again, for simplicity, we assume that tuples in all queries are of the same length; it is straightforward to account for variable-length tuples across different task outputs.

  4. https://storm.apache.org/

  5. https://flink.apache.org/

  6. https://github.com/apache/incubator-heron

  7. https://github.com/thombashi/tcconfig

References

  1. Shi, W., Cao, J., Zhang, Q., Li, Y., Xu, L.: Edge computing: vision and challenges. IEEE Internet Things J. 3(5), 637–646 (2016). https://doi.org/10.1109/JIOT.2016.2579198

  2. Asghari, P., Rahmani, A.M., Javadi, H.H.S.: Internet of things applications: a systematic review. Comput. Netw. 148, 241–261 (2019). https://doi.org/10.1016/j.comnet.2018.12.008

  3. Iorga, M., Feldman, L., Barton, R., Martin, M.J., Goren, N.S., Mahmoudi, C., et al.: Fog computing conceptual model. NIST Special Publication 500-325 (2018)

  4. Zhao, G., Xu, H., Zhao, Y., Qiao, C., Huang, L.: Offloading dependent tasks in mobile edge computing with service caching. In: 39th IEEE Conference on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020, pp. 1997–2006 (2020). https://doi.org/10.1109/INFOCOM41043.2020.9155396

  5. Nardelli, M., Cardellini, V., Grassi, V., Presti, F.L.: Efficient operator placement for distributed data stream processing applications. IEEE Trans. Parallel Distrib. Syst. 30(8), 1753–1767 (2019). https://doi.org/10.1109/TPDS.2019.2896115

  6. Michailidou, A., Gounaris, A., Symeonides, M., Trihinas, D.: EQUALITY: quality-aware intensive analytics on the edge. Inf. Syst. 105, 101953 (2022). https://doi.org/10.1016/j.is.2021.101953

  7. Khan, W.Z., Ahmed, E., Hakak, S., Yaqoob, I., Ahmed, A.: Edge computing: a survey. Future Gen. Comput. Syst. 97, 219–235 (2019). https://doi.org/10.1016/j.future.2019.02.050

  8. Ahmed, E., Rehmani, M.H.: Mobile edge computing: opportunities, solutions, and challenges. Future Gen. Comput. Syst. 70, 59–63 (2017). https://doi.org/10.1016/j.future.2016.09.015

  9. Mao, Y., Zhang, J., Song, S., Letaief, K.B.: Power-delay tradeoff in multi-user mobile-edge computing systems. In: 2016 IEEE Global Communications Conference, GLOBECOM 2016, Washington, DC, USA, December 4–8, 2016, pp. 1–6 (2016). https://doi.org/10.1109/GLOCOM.2016.7842160

  10. Motaghare, O., Pillai, A.S., Ramachandran, K.I.: Predictive maintenance architecture. In: 2018 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), pp. 1–4 (2018). https://doi.org/10.1109/ICCIC.2018.8782406

  11. Amruthnath, N., Gupta, T.: A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance. In: 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), pp. 355–361 (2018). https://doi.org/10.1109/IEA.2018.8387124

  12. Zhao, P., Kurihara, M., Tanaka, J., Noda, T., Chikuma, S., Suzuki, T.: Advanced correlation-based anomaly detection method for predictive maintenance. In: 2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017, Dallas, TX, USA, June 19–21, 2017, pp. 78–83 (2017). https://doi.org/10.1109/ICPHM.2017.7998309

  13. Albers, T., Lazovik, E., Yousefi, M.H.N., Lazovik, A.: Adaptive on-the-fly changes in distributed processing pipelines. Front. Big Data 4, 666174 (2021). https://doi.org/10.3389/fdata.2021.666174

  14. Hiessl, T., Karagiannis, V., Hochreiner, C., Schulte, S., Nardelli, M.: Optimal placement of stream processing operators in the fog. In: 3rd IEEE International Conference on Fog and Edge Computing, ICFEC 2019, Larnaca, Cyprus, May 14–17, 2019, pp. 1–10 (2019). https://doi.org/10.1109/CFEC.2019.8733147

  15. Skarlat, O., Nardelli, M., Schulte, S., Dustdar, S.: Towards qos-aware fog service placement. In: 1st IEEE International Conference on Fog and Edge Computing, ICFEC 2017, Madrid, Spain, May 14–15, 2017, pp. 89–96 (2017). https://doi.org/10.1109/ICFEC.2017.12

  16. Xu, X., Cao, H., Geng, Q., Liu, X., Dai, F., Wang, C.: Dynamic resource provisioning for workflow scheduling under uncertainty in edge computing environment. Concurr. Comput. Pract. Exp. 34(14), e5674 (2022). https://doi.org/10.1002/cpe.5674

  17. Renart, E.G., Veith, A.D.S., Balouek-Thomert, D., de Assunção, M.D., Lefèvre, L., Parashar, M.: Distributed operator placement for iot data analytics across edge and cloud resources. In: 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2019, Larnaca, Cyprus, May 14–17, 2019, pp. 459–468 (2019). https://doi.org/10.1109/CCGRID.2019.00060

  18. Trihinas, D., Pallis, G., Dikaiakos, M.D.: Adam: an adaptive monitoring framework for sampling and filtering on iot devices. In: 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29–November 1, 2015, pp. 717–726 (2015). https://doi.org/10.1109/BigData.2015.7363816

  19. Wen, Z., Quoc, D.L., Bhatotia, P., Chen, R., Lee, M.: Approxiot: approximate analytics for edge computing. In: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018, Vienna, Austria, July 2–6, 2018, pp. 411–421 (2018). https://doi.org/10.1109/ICDCS.2018.00048

  20. Cardellini, V., Grassi, V., Presti, F.L., Nardelli, M.: Optimal operator placement for distributed stream processing applications. In: Proceedings of the 10th ACM International Conference on Distributed and Event-Based Systems, DEBS ’16, Irvine, CA, USA, June 20–24, 2016, pp. 69–80 (2016). https://doi.org/10.1145/2933267.2933312

  21. Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2022). https://www.gurobi.com

  22. Li, J., Deshpande, A., Khuller, S.: Minimizing communication cost in distributed multi-query processing. In: Ioannidis, Y.E., Lee, D.L., Ng, R.T. (eds.) Proceedings of the 25th International Conference on Data Engineering, ICDE 2009, March 29 2009–April 2 2009, Shanghai, China, pp. 772–783 (2009). https://doi.org/10.1109/ICDE.2009.85

  23. Kougka, G., Gounaris, A., Tsichlas, K.: Practical algorithms for execution engine selection in data flows. Future Gen. Comput. Syst. 45, 133–148 (2015). https://doi.org/10.1016/j.future.2014.11.011

  24. Peng, B., Hosseini, M., Hong, Z., Farivar, R., Campbell, R.H.: R-storm: resource-aware scheduling in storm. In: Lea, R., Gopalakrishnan, S., Tilevich, E., Murphy, A.L., Blackstock, M. (eds.) Proceedings of the 16th Annual Middleware Conference, Vancouver, BC, Canada, December 07–11, 2015, pp. 149–161 (2015). https://doi.org/10.1145/2814576.2814808

  25. Bordin, M.V., Griebler, D., Mencagli, G., Geyer, C.F.R., Fernandes, L.G.L.: Dspbench: a suite of benchmark applications for distributed data stream processing systems. IEEE Access 8, 222900–222917 (2020). https://doi.org/10.1109/ACCESS.2020.3043948

  26. Aït-Salaht, F., Desprez, F., Lebre, A.: An overview of service placement problem in fog and edge computing. ACM Comput. Surv. 53(3), 65:1–65:35 (2020). https://doi.org/10.1145/3391196

  27. Sonkoly, B., Czentye, J., Szalay, M., Németh, B., Toka, L.: Survey on placement methods in the edge and beyond. IEEE Commun. Surv. Tutor. 23(4), 2590–2629 (2021). https://doi.org/10.1109/COMST.2021.3101460

  28. Zhu, C., Pastor, G., Xiao, Y., Li, Y., Ylä-Jääski, A.: Fog following me: latency and quality balanced task allocation in vehicular fog computing. In: 15th Annual IEEE International Conference on Sensing, Communication, and Networking, SECON 2018, Hong Kong, China, June 11–13, 2018, pp. 298–306 (2018). https://doi.org/10.1109/SAHCN.2018.8397129

  29. Benamer, A.R., Teyeb, H., Hadj-Alouane, N.B.: Latency-aware placement heuristic in fog computing environment. In: Panetto, H., Debruyne, C., Proper, H.A., Ardagna, C.A., Roman, D., Meersman, R. (eds.) On the Move to Meaningful Internet Systems. OTM 2018 Conferences - Confederated International Conferences: CoopIS, C&TC, and ODBASE 2018, Valletta, Malta, October 22–26, 2018, Proceedings, Part II. Lecture Notes in Computer Science, vol. 11230, pp. 241–257 (2018). https://doi.org/10.1007/978-3-030-02671-4_14

  30. Xia, Y., Etchevers, X., Letondeur, L., Coupaye, T., Desprez, F.: Combining hardware nodes and software components ordering-based heuristics for optimizing the placement of distributed iot applications in the fog. In: Haddad, H.M., Wainwright, R.L., Chbeir, R. (eds.) Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09–13, 2018, pp. 751–760 (2018). https://doi.org/10.1145/3167132.3167215

  31. Wan, Z., Deng, X., Cao, Z., Zhang, H.: Mobile resource aware scheduling for mobile edge environment. In: 2018 IEEE International Conference on Communications, ICC 2018, Kansas City, MO, USA, May 20–24, 2018, pp. 1–6 (2018). https://doi.org/10.1109/ICC.2018.8422631

  32. Sajjad, H.P., Danniswara, K., Al-Shishtawy, A., Vlassov, V.: Spanedge: towards unifying stream processing over central and near-the-edge data centers. In: IEEE/ACM Symposium on Edge Computing, SEC 2016, Washington, DC, USA, October 27–28, 2016, pp. 168–178 (2016). https://doi.org/10.1109/SEC.2016.17

  33. Filatov, M., Kantere, V.: Multi-workflow optimization in PAW. In: Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017, Venice, Italy, March 21–24, 2017, pp. 566–569 (2017). https://doi.org/10.5441/002/edbt.2017.71

  34. Jonathan, A., Chandra, A., Weissman, J.B.: Multi-query optimization in wide-area streaming analytics. In: Proceedings of the ACM Symposium on Cloud Computing, SoCC 2018, Carlsbad, CA, USA, October 11–13, 2018, pp. 412–425 (2018). https://doi.org/10.1145/3267809.3267842

  35. Dökeroglu, T., Bayir, M.A., Cosar, A.: Robust heuristic algorithms for exploiting the common tasks of relational cloud database queries. Appl. Soft Comput. 30, 72–82 (2015). https://doi.org/10.1016/j.asoc.2015.01.026

  36. Michiardi, P., Carra, D., Migliorini, S.: In-memory caching for multi-query optimization of data-intensive scalable computing workloads. In: Proceedings of the Workshops of the EDBT/ICDT 2019 Joint Conference, EDBT/ICDT 2019, Lisbon, Portugal, March 26, 2019. CEUR Workshop Proceedings, vol. 2322 (2019). http://ceur-ws.org/Vol-2322/DARLIAP_2.pdf

  37. Zhao, H., Sakellariou, R.: Scheduling multiple dags onto heterogeneous systems. In: 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Proceedings, 25–29 April 2006, Rhodes Island, Greece (2006). https://doi.org/10.1109/IPDPS.2006.1639387

  38. Liu, B., Xu, X., Qi, L., Ni, Q., Dou, W.: Task scheduling with precedence and placement constraints for resource utilization improvement in multi-user MEC environment. J. Syst. Archit. 114, 101970 (2021). https://doi.org/10.1016/j.sysarc.2020.101970

  39. Hülsmann, J., Traub, J., Markl, V.: Demand-based sensor data gathering with multi-query optimization. Proc. VLDB Endow. 13(12), 2801–2804 (2020). https://doi.org/10.14778/3415478.3415479

  40. Georgiou, Z., Symeonides, M., Trihinas, D., Pallis, G., Dikaiakos, M.D.: Streamsight: a query-driven framework for streaming analytics in edge computing. In: Sill, A., Spillner, J. (eds.) 11th IEEE/ACM International Conference on Utility and Cloud Computing, UCC 2018, Zurich, Switzerland, December 17–20, 2018, pp. 143–152 (2018). https://doi.org/10.1109/UCC.2018.00023

  41. Rabkin, A., Arye, M., Sen, S., Pai, V.S., Freedman, M.J.: Aggregation and degradation in jetstream: streaming analytics in the wide area. In: Mahajan, R., Stoica, I. (eds.) Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2014, Seattle, WA, USA, April 2–4, 2014, pp. 275–288 (2014)

  42. Li, Y., Chen, Y., Lan, T., Venkataramani, G.: Mobiqor: Pushing the envelope of mobile edge computing via quality-of-result optimization. In: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 1261–1270 (2017). https://doi.org/10.1109/ICDCS.2017.54

  43. Aniello, L., Baldoni, R., Querzoni, L.: Adaptive online scheduling in storm. In: Chakravarthy, S., Urban, S.D., Pietzuch, P.R., Rundensteiner, E.A. (eds.) The 7th ACM International Conference on Distributed Event-Based Systems, DEBS ’13, Arlington, TX, USA, June 29–July 03, 2013, pp. 207–218. ACM (2013). https://doi.org/10.1145/2488222.2488267

  44. Xu, J., Chen, Z., Tang, J., Su, S.: T-storm: traffic-aware online scheduling in storm. In: IEEE 34th International Conference on Distributed Computing Systems, ICDCS 2014, Madrid, Spain, June 30–July 3, 2014, pp. 535–544 (2014). https://doi.org/10.1109/ICDCS.2014.61

  45. Liu, X., Buyya, R.: D-storm: dynamic resource-efficient scheduling of stream processing applications. In: 23rd IEEE International Conference on Parallel and Distributed Systems, ICPADS 2017, Shenzhen, China, December 15–17, 2017, pp. 485–492 (2017). https://doi.org/10.1109/ICPADS.2017.00070

  46. Eskandari, L., Mair, J., Huang, Z., Eyers, D.M.: T3-scheduler: a topology and traffic aware two-level scheduler for stream processing systems in a heterogeneous cluster. Future Gen. Comput. Syst. 89, 617–632 (2018). https://doi.org/10.1016/j.future.2018.07.011

  47. Nasiri, H., Nasehi, S., Divband, A., Goudarzi, M.: A scheduling algorithm to maximize storm throughput in heterogeneous cluster. J. Big Data 10, 103 (2023). https://doi.org/10.1186/s40537-023-00771-y

  48. Hadian, H., Farrokh, M., Sharifi, M., Jafari, A.: An elastic and traffic-aware scheduler for distributed data stream processing in heterogeneous clusters. J. Supercomput. 79, 461–498 (2023). https://doi.org/10.1007/s11227-022-04669-z

  49. Hadian, H., Sharifi, M.: GT-scheduler: a hybrid graph-partitioning and tabu-search based task scheduler for distributed data stream processing systems. Clust. Comput. (2024). https://doi.org/10.1007/s10586-023-04260-y

Acknowledgements

The research work was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment” grant (Project Number: 1052, Project Name: DataflowOpt).

Author information

Contributions

A-V.M. conducted the main research, which was supervised by A.G. The system implementation was performed by A-V.M. and C.B. A-V.M. and A.G. both contributed to the writing. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Anna-Valentini Michailidou, Christos Bellas or Anastasios Gounaris.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Michailidou, AV., Bellas, C. & Gounaris, A. Optimizing task allocation in multi-query edge analytics. Cluster Comput 27, 8289–8306 (2024). https://doi.org/10.1007/s10586-024-04427-1

  • DOI: https://doi.org/10.1007/s10586-024-04427-1
