Cost-aware prediction service pricing with incomplete information

  • Regular Paper
  • The VLDB Journal

Abstract

Trading machine learning-based prediction services is an emerging option for individuals and small companies. Such a service delivers predictions (e.g., classifications) directly to consumers who lack domain knowledge. Existing prediction service pricing methods rely on the strong assumption that service quality and consumers’ valuations are completely known. In this paper, we study, for the first time, the profit maximization problem of pricing prediction services under incomplete information. We propose a novel Service Market model, named SMELT, that considers multiple types of customers with dEmand- and quaLity-aware valuaTions. We first derive the theoretically optimal solution that maximizes service profit under complete information. We then develop an effective framework, PSPricer, which solves the profit maximization problem under incomplete information with a profit ratio guarantee. PSPricer both efficiently finds a sub-optimal service price with bounded revenue loss and learns the service quality function via maximum likelihood estimation. Moreover, because machine learning model inference is resource-intensive and costly, we extend the SMELT model to account for the inference cost incurred in service trading. We formulate a novel cost-aware profit maximization problem and derive its general optimal solution. We tailor the PSPricer framework with an effective heuristic that maximizes the cost-aware profit while retaining the theoretical profit ratio guarantee. Extensive experiments on real-life datasets confirm our theoretical findings and demonstrate the effectiveness and efficiency of PSPricer in various settings, compared with state-of-the-art approaches.
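To make the notion of cost-aware profit concrete, the following is a minimal sketch of a toy posted-price market. It is not the SMELT model or the PSPricer framework, whose definitions are not given in this abstract. The assumptions are ours: each consumer type has a single known valuation, buys one unit whenever the price does not exceed that valuation, and each sale incurs a fixed per-unit inference cost. The function name `best_posted_price` and its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def best_posted_price(
    valuations: Sequence[float],
    counts: Sequence[int],
    unit_cost: float = 0.0,
) -> Tuple[float, float]:
    """Return the (price, profit) pair maximizing (price - unit_cost) * buyers.

    Toy assumptions (not taken from the paper): valuations are known,
    a consumer buys exactly one unit if price <= valuation, and every
    sale costs the seller `unit_cost`.
    """
    best_price, best_profit = 0.0, 0.0
    # Under these assumptions an optimal price is one of the valuations:
    # raising the price up to the next valuation loses no buyers.
    for price in sorted(set(valuations)):
        buyers = sum(n for v, n in zip(valuations, counts) if v >= price)
        profit = (price - unit_cost) * buyers
        if profit > best_profit:
            best_price, best_profit = price, profit
    return best_price, best_profit


if __name__ == "__main__":
    vals = [2.0, 5.0, 8.0]   # hypothetical valuations per consumer type
    nums = [10, 5, 2]        # hypothetical number of consumers per type
    print(best_posted_price(vals, nums, unit_cost=0.0))
    print(best_posted_price(vals, nums, unit_cost=4.5))
```

In this toy setting, a higher per-unit cost pushes the profit-maximizing price upward, which illustrates why an inference cost changes the pricing problem. With incomplete information the valuations would have to be estimated rather than given.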

Acknowledgements

This work is partly supported by the National Natural Science Foundation of China under Grants No. 62372404, No. 62125206, and No. 61825205, the Leading Goose R&D Program of Zhejiang under Grant No. 2024C01109, the Project of National Office for Philosophy and Social Sciences under Grant No. 22&ZD154, and the Fundamental Research Funds for the Central Universities under Grant No. 226-2024-00030. Jinshan Zhang was supported by the National Natural Science Foundation Project of China under Grant No. 62472376, the Key Research and Development Jianbing Program of Zhejiang Province under Grant No. 2023C01002, the Hangzhou Major Project and Development Program under Grant No. 2022AIZD0140, and the Yongjiang Talent Introduction Program under Grant No. 2022A-236-G. Xiaoye Miao is the corresponding author of this work.

Author information

Corresponding author

Correspondence to Xiaoye Miao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, H., Miao, X., Zhang, J. et al. Cost-aware prediction service pricing with incomplete information. The VLDB Journal 34, 28 (2025). https://doi.org/10.1007/s00778-025-00909-9

  • DOI: https://doi.org/10.1007/s00778-025-00909-9
