Task reduction using regression-based missing data imputation in sparse mobile crowdsensing

The Journal of Supercomputing

Abstract

Mobile Crowd Sensing (MCS) allocates sensing tasks associated with an area of interest to a crowd of participants over time. Consequently, the collective time and energy spent on sensing can be quite large. Sparse Mobile Crowd Sensing (Sparse-MCS) aims to reduce this overhead by reducing the number of sensing tasks, so that sensed values are obtained for only some portions of the area or time. Values for the uncovered portions are then inferred from the collected ones, which makes missing data inference an integral part of Sparse-MCS. This study has two phases. First, we explore the viability of using machine learning, specifically regression, for missing data inference in Sparse-MCS. We evaluate several representative regression algorithms: Linear Regression, LASSO, Elastic Net, Ridge, Decision Tree (DT), Random Forest (RF) and K-Nearest Neighbors (KNN). Using two real datasets, we conclude that some algorithms, such as DT and RF, perform well (giving a normalized mean absolute error much less than 0.1 most of the time), whereas the rest do not. We also compare these techniques, through simulation, with a state-of-the-art missing data inference method known as Compressive Sensing. Second, we propose a divide-and-conquer, polynomial-time algorithm for task reduction that is based on the proposed inference approach, and we analyze it in terms of (i) its time complexity and (ii) lower and upper bounds on task reduction.
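To make the idea of regression-based missing data inference concrete, the following Python sketch may help. It is an illustration only, not the authors' implementation: the data are synthetic, the function names `knn_impute` and `normalized_mae` are hypothetical, adjacency between cells is simplified to a one-dimensional cell index, and normalizing the mean absolute error by the range of the true values is an assumption (the abstract does not state how the error is normalized). A plain k-nearest-neighbor mean stands in for the KNN regressor named in the abstract.

```python
import math
import random


def knn_impute(grid, k=3):
    """Return a copy of grid in which every None entry is replaced by the
    mean of the k nearest sensed entries (Euclidean distance on indices)."""
    sensed = [(r, t, v)
              for r, row in enumerate(grid)
              for t, v in enumerate(row)
              if v is not None]
    filled = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for t, v in enumerate(row):
            if v is None:
                nearest = sorted(
                    sensed, key=lambda s: math.hypot(s[0] - r, s[1] - t)
                )[:k]
                filled[r][t] = sum(s[2] for s in nearest) / len(nearest)
    return filled


def normalized_mae(truth, estimate, entries):
    """MAE over the given (cell, cycle) entries, divided by the range of the
    true values. The normalization is an assumption made for this sketch."""
    values = [v for row in truth for v in row]
    value_range = max(values) - min(values)
    mae = sum(abs(truth[r][t] - estimate[r][t]) for r, t in entries) / len(entries)
    return mae / value_range


# Synthetic ground truth: rows are cells of the sensing area, columns are
# sensing cycles. The smooth linear field is invented for illustration.
n_cells, n_cycles = 8, 12
truth = [[20.0 + 0.5 * r + 0.3 * t for t in range(n_cycles)]
         for r in range(n_cells)]

# Skip half of the sensing tasks at random (fixed seed for repeatability).
rng = random.Random(0)
all_entries = [(r, t) for r in range(n_cells) for t in range(n_cycles)]
skipped = rng.sample(all_entries, len(all_entries) // 2)
skipped_set = set(skipped)
sparse = [[None if (r, t) in skipped_set else truth[r][t]
           for t in range(n_cycles)]
          for r in range(n_cells)]

# Infer the skipped values from the sensed ones and score the result.
inferred = knn_impute(sparse, k=3)
nmae = normalized_mae(truth, inferred, skipped)
print(f"skipped {len(skipped)} of {len(all_entries)} tasks, "
      f"normalized MAE = {nmae:.3f}")
```

In this toy setting, every skipped entry corresponds to a sensing task that was not assigned, so the error on those entries indicates how much accuracy is traded for the reduction in tasks.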





Acknowledgements

This work is partially supported by DST-SERB, Government of India under grant EEQ/2017/000083.

Author information

Corresponding author

Correspondence to Ningrinla Marchang.


About this article


Cite this article

Marchang, N., Meitei, G.M. & Thakur, T. Task reduction using regression-based missing data imputation in sparse mobile crowdsensing. J Supercomput 78, 15995–16028 (2022). https://doi.org/10.1007/s11227-022-04518-z


  • DOI: https://doi.org/10.1007/s11227-022-04518-z
