
GoAT: File Geolocation via Anchor Timestamping

  • Conference paper
Financial Cryptography and Data Security (FC 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14745))


Abstract

Decentralized storage systems are a crucial component of the rapidly growing blockchain ecosystem. They aim to achieve robustness by proving that they store multiple replicas of every file. They have a serious limitation, though: They cannot prove that file replicas are spread across distinct systems, e.g., different hard drives. Consequently, files are vulnerable to loss in a single, locally catastrophic event.

We introduce a new primitive, Proof of Geo-Retrievability or PoGeoRet, that proves that a file is located within a strict geographic boundary. Using PoGeoRet, one can, for example, prove that a file is spread across several distinct geographic regions—and by extension across multiple systems, e.g., hard drives. We define what it means for a PoGeoRet scheme to be complete and sound, extending prior formalism in key ways.

We also propose GoAT, a practical PoGeoRet scheme to prove file geolocation. Unlike previous geolocation systems that only offer nominal geolocation guarantees and require dedicated anchors, GoAT geolocates provers using any timestamping server on the internet with a fixed, known location as a geolocation anchor. GoAT’s geolocation guarantees directly depend on the physical constraints of the internet, making them very reliable. GoAT internally uses a communication-efficient Proof-of-Retrievability (PoRet) scheme in a novel way to achieve constant-size PoRet-component in its proofs.

We validate GoAT’s practicality by conducting an initial measurement study to find usable anchors and perform a real-world experiment. The results show that a significant fraction of the internet can be used as anchors and that GoAT achieves geolocation radii as low as 500 km.


Notes

  1. Some nations only require that a copy of data be stored locally, whereas stricter laws make transferring data abroad illegal [26]. Our techniques suffice for the former, but the latter would additionally require the use of trusted hardware.

  2. TLS 1.2 is still almost universally supported. Roughtime is an emerging protocol adopted by Google, Cloudflare, etc.

  3. These databases are known to have some errors [24], and a rigorous geolocation experiment like [42] would have to be done before deploying GoAT.

  4. We use a 100 IOPS, 30 GB gp2 SSD instead of an io2 SSD, as the latter is more expensive. We did not find a significant impact on commit times due to this.

  5. AWS bandwidth costs vary by region, ranging from \(\$20\) to \(\$100\) per TB transferred [8]. S3 charges also vary by region; we use the maximum above.

  6. In theory, also works as . But for perfect divisors, e.g., ms, this can only be achieved with perfect time synchronization and ideal network conditions, making it impossible in practice.

References

  1. Alexa top sites. https://www.alexa.com/topsites. Accessed Apr 2021

  2. Coinmarketcap, cryptocurrency market prices. https://coinmarketcap.com/. Accessed Aug 2022

  3. Filecoin aims to use blockchain to make decentralized storage resilient and hard to censor (2021). https://www.infoq.com/news/2021/02/filecoin-blockchain-storage/. Accessed July 2022

  4. Linux-native asynchronous i/o access library (2021). https://pagure.io/libaio. Accessed July 2022

  5. Metric space (2021). https://en.wikipedia.org/wiki/Metric_space. Accessed July 2022

  6. Ssd userbenchmarks - 1058 solid state drives compared (2021). https://ssd.userbenchmark.com/. Accessed Apr 2021

  7. Malhotra, A., Langley, A., Ladd, W.: Roughtime (2020). https://datatracker.ietf.org/doc/html/draft-roughtime-aanchal

  8. Amazon: Aws ec2 costs (2021). https://aws.amazon.com/ec2/pricing/on-demand/. Accessed Apr 2021

  9. Amazon: Aws s3 (2021). https://aws.amazon.com/s3/

  10. Aranha, D.F., Gouvêa, C.P.L., Markmann, T., Wahby, R.S., Liao, K.: RELIC is an Efficient LIbrary for Cryptography. https://github.com/relic-toolkit/relic

  11. Ateniese, G., et al.: Provable data possession at untrusted stores. In: ACM CCS, pp. 598–609 (2007)


  12. Bellare, M., Palacio, A.: The knowledge-of-exponent assumptions and 3-round zero-knowledge protocols. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 273–289. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28628-8_17


  13. Benet, J.: IPFS - content addressed, versioned, P2P file system. CoRR (2014). http://arxiv.org/abs/1407.3561

  14. Benet, J., Greco, N.: Filecoin: a decentralized storage network. Protocol Labs, pp. 1–36 (2018)


  15. Benet, J., Dalrymple, D., Greco, N.: Proof of replication. Protocol Labs, p. 20, July 2017


  16. Benson, K., Dowsley, R., Shacham, H.: Do you know where your cloud files are? In: Proceedings of the 3rd ACM Workshop on Cloud Computing Security Workshop, pp. 73–82 (2011)


  17. Bozkurt, I.N., et al.: Why is the internet so slow?! In: International Conference on Passive and Active Network Measurement. pp. 173–187. Springer, Cham (2017)


  18. Cecchetti, E., Fisch, B., Miers, I., Juels, A.: Pies: public incompressible encodings for decentralized storage. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1351–1367 (2019)


  19. Clark, M.: Nfts, explained, 11 March 2021. https://www.theverge.com/22310188/nft-explainer-what-is-blockchain-crypto-art-faq. Accessed July 2022

  20. Dierks, T., Rescorla, E.: TLS 1.2 RFC 5246 (2008). https://tools.ietf.org/html/rfc5246

  21. League of Entropy: Drand: distributed randomness beacon. drand.love. Accessed Aug 2022


  22. Fisch, B.: PoReps: proofs of space on useful data. IACR Cryptol. ePrint Arch. 2018, 678 (2018)


  23. Fisch, B., Bonneau, J., Greco, N., Benet, J.: Scaling proof-of-replication for filecoin mining. Technical report, Stanford University (2018)

  24. Gill, P., Ganjali, Y., Wong, B., Lie, D.: Dude, where’s that IP? Circumventing measurement-based IP geolocation. In: Proceedings of the 19th USENIX Conference on Security, pp. 16–16 (2010)


  25. Hanser, C., Slamanig, D.: Efficient simultaneous privately and publicly verifiable robust provable data possession from elliptic curves. In: 2013 International Conference on Security and Cryptography (SECRYPT), pp. 1–12. IEEE (2013)


  26. Harding, E.L., Acevedo, L.J., Dailey, L.R.: Data localization and data transfer restrictions, August 2021. https://www.natlawreview.com/article/data-localization-and-data-transfer-restrictions/. Accessed July 2022

  27. Høiland-Jørgensen, T., Ahlgren, B., Hurtig, P., Brunstrom, A.: Measuring latency variation in the internet. In: Proceedings of the 12th International on Conference on emerging Networking EXperiments and Technologies, pp. 473–480 (2016)


  28. Jeon, K.E., She, J., Soonsawad, P., Ng, P.C.: BLE beacons for Internet of Things applications: survey, challenges, and opportunities. IEEE Internet Things J. (2018). https://doi.org/10.1109/JIOT.2017.2788449


  29. Juels, A., Kaliski Jr., B.S.: Pors: proofs of retrievability for large files. In: Proceedings of the 14th ACM Conference on Computer and Communications Security, pp. 584–597 (2007)


  30. Katz-Bassett, E., John, J.P., Krishnamurthy, A., Wetherall, D., Anderson, T., Chawathe, Y.: Towards IP geolocation using delay and topology measurements. In: Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, pp. 71–84 (2006)


  31. Kohls, K., Diaz, C.: VerLoc: verifiable localization in decentralized systems. In: 31st USENIX Security Symposium (USENIX Security 2022), pp. 2637–2654. USENIX Association, Boston, MA, August 2022. https://www.usenix.org/conference/usenixsecurity22/presentation/kohlsD

  32. Labs, P.: Filecoin: a decentralized storage network, 19 July 2017. https://filecoin.io/filecoin.pdf. Accessed July 2022

  33. Labs, S.: Storj: a decentralized cloud storage network framework, 30 October 2018. https://www.storj.io/storjv3.pdf. Accessed July 2022

  34. Mellor, C.: Ssds will crush hard drives in the enterprise, bearing down the full weight of wright’s law, 25 January 2021. https://blocksandfiles.com/2021/01/25/wikibon-ssds-vs-hard-drives-wrights-law/. Accessed July 2022

  35. Patton, C.: Roughtime: securing time with digital signatures (2018). https://blog.cloudflare.com/roughtime/. Accessed July 2022

  36. Qualys: Ssl pulse (2022). https://www.ssllabs.com/ssl-pulse/. Accessed July 2022

  37. Reinheimer, P., Roberts, W.: Global ping statistics: Manhattan. https://wondernetwork.com/pings/Manhattan. Accessed Apr 2021

  38. Shacham, H., Waters, B.: Compact proofs of retrievability. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 90–107. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89255-7_7


  39. Stock, S.: Roughenough (2021). https://github.com/int08h/roughenough. Accessed July 2022

  40. Suica, E.: Single c file tls 1.2/1.3 implementation (2021). https://github.com/eduardsui/tlse/. Accessed July 2022

  41. Vorick, D., Champine, L.: Sia: simple decentralized storage, 29 November 2014. https://sia.tech/sia.pdf. Accessed July 2022

  42. Wang, Y., Burgener, D., Flores, M., Kuzmanovic, A., Huang, C.: Towards street-level client-independent IP geolocation. In: NSDI, vol. 11, p. 27 (2011)


  43. Watson, G.J., Safavi-Naini, R., Alimomeni, M., Locasto, M.E., Narayan, S.: Lost: location based storage. In: Proceedings of the 2012 ACM Workshop on Cloud Computing Security Workshop, pp. 59–70 (2012)



Author information


Corresponding author

Correspondence to Deepak Maram.


Appendices

A Formalism Details

Storage Devices: To model an adversary that can place files in several distinct locations, we introduce a model for (storage) devices. We denote a device by . In our security experiments (for soundness), all devices are under the control of the adversary. The adversary can place devices in locations of its choice but those locations remain fixed throughout the experiment. Devices have access to unlimited storage memory. The adversary cannot execute any function outside the device environment, a requirement that simplifies our model w.l.o.g., as the adversary can transmit files freely between storage devices placed in locations of its choice. Formally, we model all devices by way of an oracle \(\mathcal {O}_\textsf{dev}\) presented in Appendix A.1.

Modeling Time: As noted before, in all PoGeoRet schemes, the verifier uses an internal clock to time interactions with the prover . must also know an upper bound on the computation time required by an honest .

We allow the adversary to communicate messages (of any size) between devices with speed \(S_\textsf{max}\).

A.1 Soundness

Our security definition for soundness has many similarities to that of a PoRet. The PoRet soundness definition involves two experiments: setup and challenge. The setup experiment lets the adversary set up its devices and pick a file F for the challenge-response interactions in the challenge experiment. The challenge experiment corresponds to interactions with a real-world verifier, and requires that an adversary responds to \(\epsilon \)-fraction of queries correctly. The challenge experiment interface is reused for extraction, in which a verifier tries to reconstruct F from file fragments obtained in the protocol. A PoRet scheme is said to be sound if success in the challenge experiment implies that extraction succeeds.

The PoGeoRet soundness definition includes these requirements, but also that success means the file F is inside the region \(R^\textsf{rou}\). To capture this requirement, we introduce into PoGeoRet a commitment oracle \(\mathcal {O}^\textsf{rou}_\textsf{com}\) that models the function in a localized way. At a high level, \(\mathcal {O}^\textsf{rou}_\textsf{com}\) models the necessity of computing in a real-world execution similar to how a random oracle models that of a hash function. In particular, \(\mathcal {O}^\textsf{rou}_\textsf{com}\) models running on a storage device inside \(R^\textsf{rou}\) and consequently captures the location where the file fragment \({\boldsymbol{\mu }}\) input to is stored. Our modeling of a commitment oracle is similar to how a random oracle models a hash function, but generalized to a generic function and localized to a geographic region. (In one of the GoAT protocols, is a hash function, and the commitment oracle then becomes a localized random oracle.) Note that the adversary can run on any device it wants, either inside or outside \(R^\textsf{rou}\), but \(\mathcal {O}^\textsf{rou}_\textsf{com}\) only tracks executions inside \(R^\textsf{rou}\). \(\mathcal {O}^\textsf{rou}_\textsf{com}\) is a subroutine of the device oracle \(\mathcal {O}_\textsf{dev}\).

Extraction, now called “geo-extraction,” is deemed successful only if F can be computed from a set of file fragments \(\mu ^\textsf{all}\) such that every fragment \({\boldsymbol{\mu }}\in \mu ^\textsf{all}\) was previously seen in a query to \(\mathcal {O}^\textsf{rou}_\textsf{com}\). The idea is that, if \(\mathcal {O}^\textsf{rou}_\textsf{com}\) was queried on a fragment, the query happened inside \(R^\textsf{rou}\). Consequently, if enough file fragments are inside \(R^\textsf{rou}\), then the file \(F\) itself is in \(R^\textsf{rou}\). The rest of the PoGeoRet soundness definition is the same as for a PoRet, as explained in more detail below.

Corresponding to the setup and challenge experiments, the adversary \(\mathcal {A}\) consists of two parts, \(\mathcal {A}_\textsf{setup}\) and \(\mathcal {A}_\textsf{chal}\), each involved only in its respective experiment.

\(\mathcal {A}_\textsf{setup}\) may interact arbitrarily with the verifier; it may create files and run on them; it may also undertake challenge-response interactions with the verifier and observe if the verifier accepts or not. \(\mathcal {A}_\textsf{setup}\) is allowed to place any number of devices at locations of its choice and decide what to store in their memories. Device locations are fixed after creation.

The setup experiment runs on a file \(F\) picked by the adversary. The resulting output \(F^*\) is challenged in the challenge experiment.

During the challenge experiment, challenges are issued to the second adversary component \(\mathcal {A}_\textsf{chal}\) and success is based on whether the proof verifies.

Geo-extraction is the crux of the soundness definition. Just as a proof of knowledge has knowledge-soundness if success with the verifier implies an ability to extract a witness, a PoGeoRet is sound only if success with the verifier implies an ability to extract the target file from a target geographical region—which, again, we refer to as “geo-extraction.”

Geo-extraction consists of three steps. First derives file fragments \(\mu ^\textsf{all}\) from interactions with \(\mathcal {A}_\textsf{chal}\), i.e., the same adversary component as in the challenge experiment. We allow the adversary to be rewound in this step, as is standard in the PoRet literature, e.g., [29, 38]. Second, tries to recompute the file from the derived file fragments \(\mu ^\textsf{all}\). does not interact with the adversary. The third and final step is verifying if all the fragments \(\mu ^\textsf{all}\) were seen inside \(R^\textsf{rou}\), i.e., as inputs to the commitment oracle \(\mathcal {O}^\textsf{rou}_\textsf{com}\). We say that geo-extraction succeeds only if both succeeds and all file fragments were in \(R^\textsf{rou}\). A PoGeoRet scheme is said to be sound if adversarial success in the challenge experiment implies that geo-extraction succeeds w.h.p.

From the point of view of an adversary whose goal is to “cheat” a verifier, \(\mathcal {A}\) wants to create an environment in which believes the file is in \(R^\textsf{rou}\), but it isn’t. Thus the aim of \(\mathcal {A}_\textsf{setup}\) is to set up devices in such a way that: (1) accepts responses from \(\mathcal {A}_\textsf{chal}\) in the challenge experiment and (2) cannot geo-extract the file \(F\), i.e., fails to recompute \(F\) from the file fragments input to \(\mathcal {O}^\textsf{rou}_\textsf{com}\).

We now present the device oracle formally, followed by our detailed security experiments and soundness definition.

Device Oracle. We model device actions in our experiments via the device oracle \(\mathcal {O}_\textsf{dev}\) specified in Fig. 3.

  • \(\mathcal {O}_\textsf{dev}.\textsf{init}\) is used in the setup experiment for initialization.

  • \(\mathcal {O}_\textsf{dev}.\textsf{createDevice}\) allows the adversary to spawn devices at any location (both inside and outside \(R^\textsf{rou}\)) and \(\mathcal {O}_\textsf{dev}.\textsf{setStorage}\) allows changes to device storage.

  • \(\mathcal {O}_\textsf{dev}.\textsf{exec}\) allows the adversary to execute a function \(\textsf{func}\) on a device of its choice, including any function in the PoGeoRet API except . \(\mathcal {O}_\textsf{dev}.\textsf{execComFrag}\) models the commitment oracle \(\mathcal {O}^\textsf{rou}_\textsf{com}\) and tracks all calls to . \(\mathcal {O}_\textsf{dev}.\textsf{exec}\) can also communicate with other devices through \(\mathcal {O}_\textsf{dev}.\textsf{sendTo}\) or execute code on a different device.

  • \(\mathcal {O}_\textsf{dev}.\textsf{seenInROU}\) is used only in the soundness definition to check if a given set of inputs were previously seen in a call to \(\mathcal {O}^\textsf{rou}_\textsf{com}\).
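The device API listed above can be illustrated with a minimal Python sketch. This is not the specification from Fig. 3: the class and method names, the flat 2D geometry, the disc-shaped region of uncertainty, and the use of SHA-256 as the commitment function are all assumptions made for illustration. The sketch only shows the bookkeeping idea, namely that fragments are recorded when a commitment is executed on a device inside the region.

```python
import hashlib
import math
from dataclasses import dataclass, field


@dataclass
class Device:
    location: tuple                      # (x, y) in km on a flat plane (simplification)
    storage: dict = field(default_factory=dict)


class DeviceOracle:
    """Toy stand-in for O_dev; the region of uncertainty is a disc (assumption)."""

    def __init__(self, rou_center, rou_radius_km, commit):
        # Plays the role of init: fix the region and the commitment function.
        self.rou_center = rou_center
        self.rou_radius_km = rou_radius_km
        self.commit = commit
        self.devices = {}
        self.seen_in_rou = set()         # fragments committed to inside the region

    def create_device(self, dev_id, location):
        # Device locations are fixed after creation.
        if dev_id in self.devices:
            raise ValueError("device already exists; its location is fixed")
        self.devices[dev_id] = Device(location)

    def set_storage(self, dev_id, key, value):
        self.devices[dev_id].storage[key] = value

    def _inside_rou(self, location):
        return math.dist(location, self.rou_center) <= self.rou_radius_km

    def exec_com_frag(self, dev_id, key):
        # Commitment oracle: run the commitment on a stored fragment and
        # record the fragment only if the device lies inside the region.
        device = self.devices[dev_id]
        fragment = device.storage[key]
        if self._inside_rou(device.location):
            self.seen_in_rou.add(fragment)
        return self.commit(fragment)

    def seen_in_rou_check(self, fragments):
        # Used only by the soundness definition.
        return all(f in self.seen_in_rou for f in fragments)


oracle = DeviceOracle((0.0, 0.0), 500.0, lambda f: hashlib.sha256(f).hexdigest())
oracle.create_device("inside", (100.0, 0.0))
oracle.create_device("outside", (900.0, 0.0))
oracle.set_storage("inside", "frag0", b"fragment-0")
oracle.set_storage("outside", "frag1", b"fragment-1")
oracle.exec_com_frag("inside", "frag0")
oracle.exec_com_frag("outside", "frag1")
```

In this toy run, only the fragment committed on the device inside the disc is tracked, so a check over both fragments fails even though both commitments were computed.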

Soundness Experiments and Definition. In all experiments, the adversary has complete freedom to call any device function. Both the experiments are in Fig. 4.

In the setup experiment \(\mathsf{\textbf{Exp}}_{\mathcal {A}}^\textsf{setup}\), is run over a file \(F\) and the output given to \(\mathcal {A}\), who decides where to place the file. \(\mathcal {A}_\textsf{setup}\) outputs state \(s\) that is given to \(\mathcal {A}_\textsf{chal}\) as initial input.

In the challenge experiment \(\mathsf{\textbf{Exp}}_{\mathcal {A}}^\textsf{chal}\), \(\mathcal {A}_\textsf{chal}\) responds to a challenge issued by the verifier. Note that we issue one PoGeoRet challenge which internally comprises one PoRet challenge. The success probability for the challenge experiment is defined as:

$$ \mathsf{\textbf{Succ}}_{\mathcal {A}}^\textsf{chal}(\eta , R, \textit{pp}, s) = \operatorname {Pr}\left[ {\mathsf{\textbf{Exp}}_{\mathcal {A}}^\textsf{chal}(\eta , R, \textit{pp}, s) = 1}\right] . $$

Fig. 3. The device API.

Fig. 4. Setup and Challenge Experiments.

Definition 1

(Soundness). A PoGeoRet scheme is \((\epsilon ,p)\)-sound w.r.t. a target region \(R^\textsf{target}\) achieving a region of uncertainty \(R^\textsf{rou}\) if for all poly-time \(\mathcal {A}\):

figure el

This definition states that a PoGeoRet scheme is \((\epsilon , p)\)-sound if, for every adversary that succeeds in the challenge experiment with probability \(\epsilon \), geo-extraction also succeeds with probability p. Sometimes we omit p and say that a PoGeoRet scheme is \(\epsilon \)-sound; in such cases we mean that p is negligibly close to 1, i.e., \(p = 1-\textsf{negl}(\lambda )\).

A.2 Completeness

Completeness requires that a valid prover using a device inside the target region \(R^\textsf{target}\) can successfully prove operation inside \(R^\textsf{rou}\).

figure en

Definition 2

(Completeness). A PoGeoRet scheme is complete w.r.t a target region \(R^\textsf{target}\) achieving a region of uncertainty \(R^\textsf{rou}\), if for any \(L \in R^\textsf{target}\), \(\operatorname {Pr}\left[ {\textbf{Exp}^\textsf{comp}(L, R^\textsf{rou}) = 1}\right] > 1 - \textsf{negl}(\lambda )\).

Note that we use the device oracle \(\mathcal {O}_\textsf{dev}\) to spawn a device at the target location L and run on this device.

Remark: For simplified presentation, the above definitions of soundness and completeness assume an interactive protocol between the prover and verifier. Our main goal, though, is for GoAT to operate non-interactively. Due to lack of space, we relegate non-interactive definitions to Appendix F.1 (they only require minor modifications).

A.3 Extension to Flexible-Challenge Model

As before, the setup experiment has the adversary pick a file F and initialize several devices. But then, \(I\) challenge experiments take place, one per interval. After the epoch (or \(I\) intervals) ends, the challenge responses are verified. Geo-extraction takes place after that.

The device oracle \(\mathcal {O}_\textsf{dev}\) now maintains a record of the commitment oracle queries made in each interval; let \(\mu ^\textsf{rec}_i\) denote the list of fragments queried in the ith interval. In each interval, the adversary requests a challenge at a time of its choosing. After the epoch (or \(I\) intervals) ends, we extract the file \(I\) times by running . Geo-extraction in the ith interval succeeds if the file can be assembled from the fragments in \(\mu ^\textsf{rec}_i\). Soundness is defined in the same way as before except that we now require geo-extraction succeed in all \(I\) intervals.
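The per-interval success condition can be sketched as follows. This is a simplified illustration, not the extraction algorithm itself: it assumes that fragments are identified by the index of the encoded block they contain, and that the file can be assembled whenever at least a \(\rho \) fraction of the encoded blocks is available (the rate-\(\rho \) erasure-code property used in Appendix D). The function names are hypothetical.

```python
def geo_extract_interval(recorded_block_ids, n_blocks, rho):
    """True if one interval's recorded fragments cover >= rho of the blocks."""
    covered = {b for b in recorded_block_ids if 0 <= b < n_blocks}
    return len(covered) >= rho * n_blocks


def geo_extract_epoch(records, n_blocks, rho):
    """Soundness requires geo-extraction to succeed in every interval."""
    return all(geo_extract_interval(rec, n_blocks, rho) for rec in records)


# Two epochs over a 4-block encoded file with rho = 0.5.
good_epoch = [[0, 1, 2], [0, 1]]   # every interval covers at least 2 blocks
bad_epoch = [[0, 1, 2], [3]]       # the second interval covers only 1 block
```

The design point is that a single failing interval makes the whole epoch fail, which mirrors the requirement that geo-extraction succeed in all \(I\) intervals.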

Economic Argument: Note that for short intervals, typically \(\phi \ll 1\), as we now show in an example. Consider the bandwidth and storage costs currently charged by Amazon. (We take storage cost as a proxy for revenue.) Suppose we set the interval length \(\beta =30\) mins. If an encoded file size is \(|F^*|\) = 1 TB, then a prover’s storage revenue is at most \(\$0.02\) per interval on Amazon S3 [9]. On the other hand, AWS bandwidth costs start from \(\$20\) per TB (see Note 5). So downloading 1 GB would cost the same as the revenue obtained by storing 1 TB. Therefore \(\phi = 1/1000\).
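The arithmetic in this example can be reproduced with a short Python sketch. It reads \(\phi \) as the fraction of the encoded file that can be downloaded for one interval's storage revenue, which matches the numbers above; the formal definition of \(\phi \) is given in the main body, and the function name here is hypothetical.

```python
import math


def phi(revenue_per_interval_usd, bandwidth_usd_per_tb, file_size_tb):
    """Fraction of the file downloadable for one interval's storage revenue."""
    downloadable_tb = revenue_per_interval_usd / bandwidth_usd_per_tb
    return downloadable_tb / file_size_tb


# Figures from the text: $0.02 per 30-minute interval for 1 TB, $20 per TB transferred.
example_phi = phi(0.02, 20.0, 1.0)
print(example_phi, math.isclose(example_phi, 1 / 1000))
```

Longer intervals or cheaper bandwidth raise \(\phi \), which is why the argument is stated for short intervals.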

B GoAT-H

\(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) Version: The key difference in \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) is the use of a hash function as the vector commitment. This results in larger proofs and extra computational steps in and . is the same as before except the change in the PoRet commitment function, i.e. instead of . PoRet computation ( ) involves naïvely running \(I\) times because the aggregation tricks do not work anymore. If \(\mathcal {S}_j\) denotes the jth set of challenges, compute ; the final proof is \(\pi ^\textsf{PoRet}\leftarrow \{\pi ^\textsf{PoRet}_1, \pi ^\textsf{PoRet}_2, \ldots , \pi ^\textsf{PoRet}_{I}\}\). Accordingly, verification involves running  \(I\) times. The proof sizes in \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) are asymptotically the same as \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \), but concretely about 3\(\times \) larger. Geolocation accuracy remains the same.

A summary of both the GoAT protocols can be found in Fig. 11.

C GoAT Security

Both \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) and \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) achieve the same security but in different models. \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) operates in the random oracle model and its security proof relies on the commonly used “knowledge of queries” technique. On the other hand, \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \)’s security relies on a new assumption that we call the KEV Assumption (KEVA). The proof sketches for both protocols are in Appendix D.

\(\text {KEVA}_s\) extends the commonly used KEA1 [12] for an s-sized vector of elements. In particular, if \(s=1\) it reduces to KEA1. It states that if \(\mathcal {A}\) takes two correlated sets of bases \(({\textbf {h}}_1, {\textbf {h}}_2= {\textbf {h}}_1^b)\) as input and outputs \((c_1, c_2)\) s.t. \(c_2= c_1^b\), then there exists an extractor \(\mathcal {E}_\mathcal {A}\) that can output a pre-image \(\textbf{x}\) s.t. the Pedersen commitment of \(\textbf{x}\) with \({\textbf {h}}_1\) is \(c_1\), i.e., \({\textbf {h}}_1^\textbf{x}= c_1\) while using the same inputs as before. This is saying that the only way of computing \((c_1, c_2)\) is by picking a pre-image \(\textbf{x}\) and computing its Pedersen commitment.
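The honest way of producing such a pair \((c_1, c_2)\) can be illustrated with a toy Python sketch. The group (squares modulo the small safe prime 2039), the vector length, and all function names are assumptions chosen for readability; the parameters are far too small for real use, and the sketch says nothing about the extractor itself. It only shows that committing to the same \(\textbf{x}\) under both base sets yields \(c_2 = c_1^b\).

```python
import secrets

P = 2039   # toy safe prime, P = 2*Q + 1 (illustration only)
Q = 1019   # prime order of the subgroup of squares modulo P


def random_base():
    # Squaring maps into the order-Q subgroup; reject the identity.
    while True:
        h = pow(secrets.randbelow(P - 2) + 2, 2, P)
        if h != 1:
            return h


def distinct_bases(s):
    bases = set()
    while len(bases) < s:
        bases.add(random_base())
    return sorted(bases)


def commit(bases, x):
    # Pedersen-style vector commitment without blinding: prod_i bases[i]^x[i].
    c = 1
    for h, xi in zip(bases, x):
        c = (c * pow(h, xi, P)) % P
    return c


s = 4
b = secrets.randbelow(Q - 1) + 1             # secret exponent linking the base sets
h1 = distinct_bases(s)
h2 = [pow(h, b, P) for h in h1]              # h2 = h1^b, element-wise
x = [secrets.randbelow(Q) for _ in range(s)]
c1, c2 = commit(h1, x), commit(h2, x)
print(c2 == pow(c1, b, P))
```

The assumption states the converse direction: any efficient adversary that outputs a pair with this relation must "know" such an \(\textbf{x}\).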

Definition 3

(\(\text {KEVA}_s\)). Given any set of distinct bases \({\textbf {h}}_1\in \mathbb {G}^s\), for any PPT \(\mathcal {A}\), there exists a PPT extractor \(\mathcal {E}_\mathcal {A}\) s.t.

figure ez

Say the target region is a single location, \(R^\textsf{target}= (L; 0)\). Then the region of uncertainty achieved by \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) and \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) is a circle centered at anchor’s location with radius . In practice, the target region might have a small diameter, \(R^\textsf{target}= (L; \delta ')\). As long as \(\delta '\) is small, we can approximate and define the region of uncertainty as where \(\delta '' = \max _{\{L' \in R^\textsf{target}\}} \delta _{L'}\).

Theorem 1

Let \(w = \big (\rho + \phi + 1 - 2^{\frac{-\lambda -\alpha }{l}}\big )^l\). For any \(\epsilon \le 1\) s.t. \(\epsilon - w\) is positive and non-negligible, and assuming that \(\text {KEVA}_s\) holds and that the CDH problem is hard in bilinear groups, \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) is \((\epsilon , p)\)-sound at a target geographic region \(R^\textsf{target}=(L; \delta ')\) achieving a geolocation guarantee of under the flexible challenge model and the random oracle model.
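To get a feel for the quantity \(w\) in the theorem, the following Python sketch evaluates the formula directly. The parameter values are placeholders picked for illustration, not values taken from the paper, and the symbols carry the meanings assigned to them in the main body.

```python
def soundness_threshold(rho, phi, lam, alpha, l):
    """Evaluate w = (rho + phi + 1 - 2^((-lam - alpha) / l))^l."""
    base = rho + phi + 1 - 2 ** ((-lam - alpha) / l)
    return base ** l


# Placeholder parameters (assumptions): rho = 0.5, phi = 0.001,
# lambda = 128, alpha = 10, l = 300.
w = soundness_threshold(rho=0.5, phi=0.001, lam=128, alpha=10, l=300)
print(w)
```

With these placeholders the base of the power is below 1, so \(w\) shrinks rapidly as \(l\) grows; the theorem then applies to any \(\epsilon \) that exceeds \(w\) by a non-negligible amount.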

D Security Proof Sketches

We now provide a proof sketch for Theorem 1. We primarily focus on \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) with a Roughtime anchor, adding notes where needed about how the proof extends to \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) or to TLS anchors.

Recall that the \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) proof \(\pi ^{\textsf{geo}}\) consists of \(I\) geo-commitments and a PoRet proof. Each geo-commitment \(C^\textsf{geo}\) consists of \(a+ 1\) anchor transcripts and all but the first transcript contain a PoRet commitment. In total, \(N = Ia\) PoRet commitments are in a proof. Similarly, the \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) proof consists of \(N = Ia\) PoRet proofs and \(I\) geo-commitments.

We prove soundness of GoAT in four steps.

  1. Prove that the N PoRet commitments and the PoRet proof(s) are correctly computed, i.e., PoRet verification ( ) part of must detect otherwise.

  2. Use a combination of timing-based and knowledge-based arguments to prove that the operation is run on a device inside \(R^\textsf{in}\), i.e., prove that all file fragments that are part of a correct proof must have been queried to the commitment oracle \(\mathcal {O}^\textsf{rou}_\textsf{com}\).

  3. Prove that the extraction algorithm can efficiently reconstruct a \(\rho \) fraction of file blocks from the fragments in each of the \(I\) snapshots \(\{S_i\}_{i=1}^I\).

  4. Prove that the file can be reconstructed from any \(\rho \) fraction.

The proof for part 4 follows directly from the properties of a rate-\(\rho \) erasure code, so we do not expand on it further.

D.1 Part-Two Proof

For this part, we need to prove that the commitment oracle \(\mathcal {O}^\textsf{rou}_\textsf{com}\) receives all file fragments that are part of a correct PoRet commitment and proof. (Correctness of the commitment and proof is guaranteed by the part-one proof provided later.)

We proceed in two steps. First we argue that the only way of computing a valid PoRet commitment is by computing on valid file fragments. This relies on the KEV assumption (See Definition 3) for \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) and the ROM for \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \). Next we argue that all calls to must take place from within the desired target region \(R^\textsf{in}\). This relies on a timing based argument. Overall, this proves that if correct PoRet commitments and proofs are computed, then the commitment oracle (\(\mathcal {O}^\textsf{rou}_\textsf{com}\)) records all the corresponding file fragments.

The proof for the first step is as follows. Given a valid PoRet commitment \(C_{{\boldsymbol{\mu }}}\) for SW-P, we need to prove the existence of a valid pre-image \({\boldsymbol{\mu }}\). But the KEVA directly offers this. We can use the extractor provided by the assumption to efficiently extract \({\boldsymbol{\mu }}\) for every valid PoRet commitment.

The proof for the second step is given below. We provide two arguments based on whether a high-resolution or low-resolution anchor is used, beginning with the former setting.

As noted before, we assume that the clock offsets of all anchors are observed a priori and that clock drift is negligible. So we can safely assume that the anchor timestamps lie inside the expected interval, as otherwise the geo-commit verification would detect the deviation.

High-Resolution Anchors (\(\boldsymbol{a= 1}\) ). Fixing some notation, assume that the storage provider is at a location \(L_p \in R^\textsf{target}\) and that the anchor assigned to \(L_p\) is , located at \(L_1\). Recall that the target region in GoAT is a spherical circle centered at \(L_1\) with radius \(\delta = \varDelta (L_p,L_1) \cdot S_\textsf{max}/ 2\), i.e., the region \(R^\textsf{in}=(L_1; \delta )\). Expanding the radius further we have, \(\delta = (t_\textsf{com}+ \mathsf{rtt_{max}}(L_p, L_1) + t_\textsf{proc}) \cdot (S_\textsf{max}/ 2)\).

Recall that in the case of high-resolution anchors, the prover computes one PoRet commitment per interval. We want to prove that all the \(I\) PoRet commitments are computed on some device in \(R^\textsf{in}\). Assume the contrary, i.e., say there exists a device situated at \(L_2 \in R^\textsf{out}\) on which one of the PoRet commitments is computed. By definition we have \(\textsf{dist}(L_1, L_2) > \delta \).

Without loss of generality, assume that a copy of the encoded file \(F^*\) (generated during the setup experiment) exists in its entirety in the memory of , and therefore the time taken to compute commitment on is negligible, i.e., \(t_\textsf{com}^\mathcal {A}= 0\). We also set the anchor processing time \(t_\textsf{proc}^\mathcal {A}= 0\).

The time taken to receive and respond from during the geo-commitment protocol with is given by \(z = 2\textsf{dist}(L_1, L_2) / S_\textsf{max}\). This is because in Fig. 7 we derive challenges from anchor signatures, i.e., they arise at \(L_1\) and must reach \(L_2\). We can assume that the adversary's probability of guessing these challenges is negligible, since guessing requires breaking the selective unforgeability of the signature scheme used by the anchor, which happens only with negligible probability.

Note in particular that this value is irrespective of any other factors, e.g., the adversary’s strategy might be to place a device exactly at the anchor location \(L_1\), and initiate the protocol from with challenges forwarded to . Moreover, we do not include any startup cost when the adversary is sending messages between devices, so \(t_\textsf{setup}^\mathcal {A}= 0\).

For the geo-commitment verification to succeed, it must be that \(z \le \varDelta (L_p,L_1)\). (See last step in Fig. 2 when \(a=1\)).

But we have a contradiction, as z must also satisfy \(z > 2\delta / S_\textsf{max}\) because \(\textsf{dist}(L_1, L_2) > \delta \). Substituting for \(\delta \) we get \(z > \varDelta (L_p,L_1)\). Hence proved.    \(\square \)
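The contradiction can be checked numerically. The following is a minimal sketch, assuming a signal speed of \(S_\textsf{max} = 200\) km/ms (roughly two-thirds of the speed of light; this value and all function names are illustrative assumptions, not taken from this section).

```python
# Illustrative check of the high-resolution soundness argument.
# S_MAX_KM_PER_MS is an assumed value, not stated in this section.
S_MAX_KM_PER_MS = 200.0


def radius_km(delta_ms: float, s_max: float = S_MAX_KM_PER_MS) -> float:
    """Radius delta = Delta * S_max / 2 of the region R^in."""
    return delta_ms * s_max / 2.0


def min_round_trip_ms(dist_km: float, s_max: float = S_MAX_KM_PER_MS) -> float:
    """Lower bound z = 2 * dist / S_max for a device dist_km from the anchor."""
    return 2.0 * dist_km / s_max


def can_pass_verification(dist_km: float, delta_ms: float) -> bool:
    """Geo-commitment verification requires z <= Delta."""
    return min_round_trip_ms(dist_km) <= delta_ms


if __name__ == "__main__":
    delta_ms = 10.0                      # hypothetical time bound Delta
    r = radius_km(delta_ms)              # radius of R^in in km
    print(r, can_pass_verification(r, delta_ms))        # on the boundary
    print(can_pass_verification(r + 1.0, delta_ms))     # just outside R^in
```

Any device strictly outside the radius yields \(z > \varDelta \) even with zero compute, processing, and startup time, which is exactly the case the proof rules out.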

The proof for low-resolution anchors is similar and relegated to the full paper due to lack of space.

D.2 Remaining Proofs

We now prove the remaining parts, part-one and part-three.

Part-One Proof. For this we reuse the proof of Theorem 4.2 in [38], which provides a series of games proving that, except with negligible probability, no adversary ever causes a verifier to accept in a PoRet instance, except by responding with values \(\{{\mu _j}\}, \sigma \) that are computed correctly (under the assumption that the computational Diffie-Hellman problem is hard in bilinear groups). This directly proves that if the challenger provides a challenge set \(\mathcal {S}^{*}\), then the correctly computed output containing \(\{{C_{\boldsymbol{\mu }}, \boldsymbol{\mu }, \sigma }\}\) must be accepted by the verification algorithm. The only change we made is the extra vector commitment. Assuming that the binding property of the vector commitment scheme holds, this directly follows.

The remaining thing to be proved is that all the individual PoRet commitments used to compute \(C = C_{\boldsymbol{\mu }}\) are correctly computed. Assume for contradiction that some of them are not computed correctly. Observe that we derive random coefficients \(r_j\) from the final PoRet commitment \(\textsf{com}_N\). These coefficients are used during verification to compute C as follows: \(C = \prod _{j=1}^N (\textsf{com}_j)^{r_j}\). Under the random oracle model, we can assume that the probability of the prover guessing these coefficients beforehand is negligible. Note the two checks in verification: the commitment check (\(\mathsf{\textsf{VC}.Verify}\)) and the pairing equation check. Assuming that the latter succeeds, that is, the final commitment C is the same as that computed by an honest prover, the only way the prover can make \(\mathsf{\textsf{VC}.Verify}\) succeed is by guessing the random coefficients correctly or by breaking commitment binding, both of which happen with negligible probability. Grinding concerns are discussed in the main body.

Part-Three Proof. We re-purpose the extraction algorithm provided in the proof of Theorem 4.3 in [38]. [38] provides an extraction algorithm that, given an adversary that answers \(\epsilon \) fraction of the queries correctly, can extract \(\rho \) fraction of the encoded file blocks provided that \(\epsilon - (\rho n)^l/ (n - l+ 1)^l\) is positive and non-negligible.

Recall that our extraction algorithm is composed of two sub-algorithms. The extraction algorithm of [38] already follows this additional structure that we impose: querying the adversary corresponds to the first, and assembling the file from the query responses corresponds to the second.

The only change now is that extraction must succeed in every interval, i.e., \(\mathcal {O}_\textsf{dev}.\textsf{seenInROU}(\mu ^\textsf{all}_i) = 1\) must hold \(\forall i \in \{1,2, \ldots , I\}\). The key question is how the new bandwidth constraint \(\phi \) and grinding attacks (discussed in Sect. 4) impact the above theorem.

Recall that the size of the encoded file is \(|F^*|\). Of this, due to grinding, possibly only a \(g= (1 - (1 - 2^{-\lambda }) ^{1/\alpha })^{1/l}\) fraction is stored inside \(R^\textsf{in}\), and hence only that much is available for extraction (\(\alpha \) is the grinding cap). Further, up to \(\phi \) bytes (the bandwidth cap) of the \(g\)-sized fraction can be downloaded and are hence unavailable.

The idea in the proof of Theorem 4.3 of [38] is to query enough times and use linear algebraic techniques to recover file blocks from query responses. Queries are made randomly. Three types of queries are listed, and the fraction of type-1 queries (the useful ones that help recover file blocks) is \(\epsilon - w\) where \(w = (\rho n)^l/ (n - l+ 1)^l\) (omitting the negligible part of the equation caused by type-2 queries). The extractor needs \(\rho n \le n\) type-1 queries to succeed, which happens in \(O(n / (\epsilon - w))\) time.

The maximum number of blocks unavailable inside \(R^\textsf{in}\) is given by \(\gamma = (\frac{n\phi }{|F^*|}) + n(1 - g)\). Therefore the extractor needs more type-1 queries to succeed, \((\rho n + \gamma )\). Note that we assume if a query challenges a block that belongs to the unavailable portion in \(S_1\), a special symbol “\(-1\)” is used in place of the file block, and the challenge response is computed. And by extracting \((\rho n + \gamma )\) blocks, we are guaranteed to have at least \(\rho n\) actual file blocks (removing the \(-1\)’s).

The useful fraction of queries now is \(\epsilon - w\) where \(w = (\rho n + \gamma )^l/ (n - l+ 1)^l\). And assuming \(\rho n + \gamma \le n\), extraction happens in \(O(n / (\epsilon - w))\) time, i.e., same order as before. One constraint we get is \(\frac{\phi }{|F^*|} \le g- \rho \) (Fig. 10).

We want \(\epsilon - w\) to be positive and non-negligible. Therefore w needs to be negligibly small. Meaning \((\rho + \gamma / n)^l\) (or) \((\rho + \frac{\phi }{|F^*|} + 1 - g)^l\) needs to be negligible. As noted above, the number of interactions required and the time to extract is the same order as in [38].    \(\square \)
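The quantities used in this argument are easy to evaluate for concrete parameters. The sketch below is only an illustration of the formulas for \(\gamma \), \(w\), and the constraint \(\frac{\phi }{|F^*|} \le g- \rho \); the function names and the numeric values in the usage example are hypothetical.

```python
def unavailable_blocks(n: int, phi: float, encoded_size: float, g: float) -> float:
    """gamma = n * phi / |F*| + n * (1 - g): blocks unavailable inside R^in."""
    return n * phi / encoded_size + n * (1.0 - g)


def wasted_query_fraction(n: int, l: int, rho: float, gamma: float) -> float:
    """w = (rho * n + gamma)^l / (n - l + 1)^l."""
    return ((rho * n + gamma) / (n - l + 1)) ** l


def bandwidth_constraint_holds(phi: float, encoded_size: float,
                               g: float, rho: float) -> bool:
    """Constraint phi / |F*| <= g - rho, i.e., rho * n + gamma <= n."""
    return phi / encoded_size <= g - rho


if __name__ == "__main__":
    # Hypothetical values: 1000 blocks, 200 challenges, code rate 0.5,
    # stored fraction 0.9, and a bandwidth cap of 10% of the encoded file.
    n, l, rho, g = 1000, 200, 0.5, 0.9
    phi, encoded_size = 100.0, 1000.0
    gamma = unavailable_blocks(n, phi, encoded_size, g)
    print(gamma, wasted_query_fraction(n, l, rho, gamma))
    print(bandwidth_constraint_holds(phi, encoded_size, g, rho))
```

With these illustrative values, \(w\) is tiny, so \(\epsilon - w\) stays positive for any non-negligible \(\epsilon \).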

Fig. 5. Proof of Geo-Retrievability (PoGeoRet) API.

Fig. 6. NIPoGeoRet API.

Fig. 7. Geo-commitment protocols.

Fig. 8. The Shacham-Waters PoRet schemes with an extra commitment step. SW-H, SW-P differ in the choice of VC scheme.

E Supporting TLS 1.2 Anchors

E.1 Low-Resolution Anchors

This section deals more broadly with supporting low-resolution anchors.

Chaining of the two operations is done in a similar fashion to before. In total, \(a\) PoRet commitment computations and \(a+ 1\) anchor pings take place. We refer to \(a\) as the amplification factor. Note that this modification applies to both GoAT variants, \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) and \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \).

The value \(a\) is set based on the exact resolution offered by an anchor. For example, if the anchor resolution is in seconds and the time difference is 50 ms, then 20 consecutive proofs (when started at a one-second boundary in the anchor’s clock) will have the same timestamp, so \(a= 19\) (since \(a+1\) pings are needed). More generally, \(a\) is one less than the number of time differences that fit within one resolution tick of the anchor. Below, we explain how to time proof execution in order to ensure receipt of \(a+1\) transcripts with the same timestamp.

Fig. 9. Publicly verifiable PoRet API. The commitment function is the only addition compared to prior modeling [29].

Fig. 10. Pedersen and Hash-based Vector Commitment scheme.

Fig. 11. The \(\textsf{GoAT} \) proof of geo-retrievability schemes. It includes both the \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) and \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) variants that internally use SW-H and SW-P PoRet schemes respectively.

The prover computes a single PoRet similar to before, leveraging the aggregability of SW. We also make a change to verification: instead of checking the difference between timestamps, the verifier checks whether \(a+1\) anchor transcripts have the same timestamp. Other steps are similar to before.

A general protocol for any anchor, low- or high-resolution, is specified in Fig. 7, in the paper appendix.

Effect on Geolocation: The use of amplification has a small effect on the radius of the ROU, which we explain through an example. Suppose the time difference is 250 ms and the anchor resolution is 1 s. Applying the above formula, we get \(a= 3\), i.e., 4 pings are needed. But this leaves some “extra time”: for example, if the anchor’s clock times at the moment of receipt of the 4 \(\mathsf{GetAuthTime}\) requests are x, \(x+250\), \(x+500\), \(x+750\) (all in ms, assuming x is a second boundary), then an adversary still has about 250 ms left at the end. So an adversary can spend an extra \(250/a= 83.33\) ms on each of the \(a\) PoRet commitment computations and thus position the file further from the target location than with no amplification. Such manipulation will go undetected because the difference between the last and first anchor clock times is still within a resolution tick.

The precise extra time \(e\) available due to amplification is what remains of a resolution tick after \(a\) time differences have elapsed. Distributing it equally leads to an extra \(e / a\) time per commitment computation. For practical values, the extra time is small and hence its impact is minimal. For example, if the time difference is 50 ms and the anchor resolution is 1 s, then \(e = 50\) ms and \(e / a = 50 / 19 \approx 2.6\) ms, causing about a 260 km increase compared to that without amplification.
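The two worked examples can be reproduced with a few lines of code. This is a sketch under stated assumptions: the amplification factor is taken to be one less than the number of time differences that fit in one resolution tick, the extra time is the remainder of the tick, and \(S_\textsf{max}\) is assumed to be 200 km/ms (the value implied by 2.6 ms corresponding to about 260 km). Function names are hypothetical.

```python
from typing import Tuple

S_MAX_KM_PER_MS = 200.0  # assumed signal speed, inferred from the example


def amplification(resolution_ms: int, time_diff_ms: int) -> Tuple[int, int]:
    """Return (a, e): amplification factor and leftover time in ms."""
    a = resolution_ms // time_diff_ms - 1
    if a < 1:
        raise ValueError("time difference too large for this resolution")
    e = resolution_ms - a * time_diff_ms
    return a, e


def extra_radius_km(resolution_ms: int, time_diff_ms: int,
                    s_max: float = S_MAX_KM_PER_MS) -> float:
    """Radius increase when e is spread equally over a commitments."""
    a, e = amplification(resolution_ms, time_diff_ms)
    return (e / a) * s_max / 2.0


if __name__ == "__main__":
    print(amplification(1000, 250))    # first example: a = 3, e = 250 ms
    print(amplification(1000, 50))     # second example: a = 19, e = 50 ms
    print(extra_radius_km(1000, 50))   # roughly 260 km
```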

GoAT Security: Considering both high-resolution and low-resolution anchors, the following equation describes GoAT’s geolocation radii. Say the target region is a single location, \(R^\textsf{target}= (L; 0)\). Then the region of uncertainty achieved by GoAT (both \(\mathsf{\textsf{GoAT}}\text {-}\textsf{H} \) and \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \)) is a circle centered at anchor’s location with radius \(\delta _L\) given by:

figure gr

E.2 Changes to

One other change needs to be made to support TLS. In \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \), the vector commitment has two elements and won’t fit into the nonce field of the TLS handshake for commonly used algebraic groups. So we include a hash of the commitment and reveal the underlying commitment as part of the proof. Details below.

For example, the size of each group element in our implementation is 20 bytes, so the SW-P PoRet commitment is 40 bytes whereas the TLS nonce is 32 bytes only.

So we modify the PoRet commitment protocol by hashing the previous commitment to fit in the nonce field (which is essentially in turn modifying the protocol). The output of the protocol, i.e., the PoGeoRet proof will now include all the PoRet commitments generated during the epoch.

If the number of intervals is 1, the proof will consist of a proof-of-retrievability, \(a\) PoRet commitments and \(a+1\) anchor transcripts.

F Formalism Extensions

F.1 Non-interactive Proofs of Geo-Retrievability

NIPoGeoRet allows any newcomer to verify that the prover indeed had the file inside the region of uncertainty (ROU) during a specified time duration. The NIPoGeoRet API (Fig. 6) is almost the same as the PoGeoRet one, except that the challenge function is removed. We attach the prefix NI to the other API functions.

Relation to GoAT: Recall that GoAT is a non-interactive protocol. So the API in Fig. 6 maps to the GoAT protocol specified in Fig. 11.

For ease of explaining GoAT, we divide proof generation into two sub-functions. The first specifies the interaction with the anchor to derive challenges; for GoAT, it is specified in Fig. 7.

Modeling Time: In our previous modeling for interactive PoGeoRet, we relied on the verifier to keep track of time during the security experiments. Instead, we now introduce a notion of time into the definition. Each system entity maintains an internal clock. Clocks need not be synchronous, but we assume that clock drift is negligible. If the true time is given by \(\mathsf{true\_time}\), then the clock offset of an entity, say an anchor, is the difference between its clock time and \(\mathsf{true\_time}\). The offsets of all anchors are assumed to be public (in practice, they can be observed once during a setup phase). Note that the verification function relies on these clock offsets to judge whether the proof is valid.

Security Properties: The completeness definition is the same as before, except that no challenges are issued by the verifier.

The changes to the security experiments related to soundness are also minimal. The setup experiment is the same as before, except that the public information \(\textit{pp}\) could also contain extra information such as anchor public keys. The challenge experiment now does not involve sending challenges to the prover. Instead, the prover computes NIPoGeoRet proofs itself and submits a proof at the end of an epoch, which is then verified. The soundness definition is the same as before.

G Miscellaneous

G.1 Practical Considerations

Grinding Attacks: Since the protocol is prover-initiated, an adversarial prover can exploit this by re-running the protocol. For example, an adversary could save on storage by storing only a portion of the file and repeatedly querying the anchor until all the challenges lie in the stored part.

Table 3. Reed Solomon encoding time with a symbol size of 32 bytes. Averaged over 10 runs.

Let \(g\) be the stored fraction. To model practical constraints, we assume that a prover can make up to \(2^\alpha \) \(\mathsf{GetAuthTime}\) API calls per interval (this number needs to be set based on the actual API call costs). The success probability after \(2^\alpha \) API calls is \(p = 1 - (1-g^l)^{2^\alpha }\). The adversary needs to choose the file fraction \(g\) such that p is non-negligible, i.e., \(g\ge (1 - (1 - 2^{-\lambda }) ^{2^{-\alpha }})^{1/l} \), or approximately \(g> 2^{\frac{-\lambda -\alpha }{l}}\) (via binomial expansion). Intuitively, as the number of challenges \(l\) is raised, the adversary is forced to store more. We derive an exact constraint involving \(l\) and \(\alpha \) in our security proofs.
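The bound on \(g\) can be evaluated numerically. The sketch below uses `log1p` and `expm1` because a direct evaluation of \((1 - 2^{-\lambda })^{2^{-\alpha }}\) rounds to 1 in floating point. The parameter values in the usage example (\(\lambda = 128\), \(\alpha = 20\), \(l = 200\)) and the function names are illustrative assumptions.

```python
import math


def min_stored_fraction(lam: int, alpha: int, l: int) -> float:
    """Smallest g with 1 - (1 - g^l)^(2^alpha) >= 2^-lam."""
    # g^l >= 1 - (1 - 2^-lam)^(2^-alpha), evaluated in a numerically stable way.
    log_term = math.log1p(-(2.0 ** -lam)) * (2.0 ** -alpha)
    g_pow_l = -math.expm1(log_term)
    return g_pow_l ** (1.0 / l)


def grinding_success_probability(g: float, l: int, alpha: int) -> float:
    """p = 1 - (1 - g^l)^(2^alpha) for stored fraction g."""
    if g >= 1.0:
        return 1.0  # the whole file is stored, every challenge is answerable
    return -math.expm1((2.0 ** alpha) * math.log1p(-(g ** l)))


if __name__ == "__main__":
    g = min_stored_fraction(lam=128, alpha=20, l=200)
    print(g)                            # close to 2^(-(128 + 20) / 200)
    print(2.0 ** (-(128 + 20) / 200))   # the binomial-expansion approximation
```

Doubling the number of challenges in this sketch raises the minimum stored fraction, matching the intuition stated above.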

Coefficient Randomization: Randomization at the end of an epoch is necessary to ensure that the PoRet commitments \(\{\textsf{com}_i\}\) are correctly computed in all intervals. If the ratio between any two random coefficients was predictable, e.g., say \(\tau = r_i / r_j\) was known for some \(i < j\), then an adversary could cheat by postponing file access required to be done in the ith interval to the jth interval. Simply set \(\textsf{com}_i\) to random and \(\textsf{com}_j\) in a way that the verification equation checks out, i.e., \(\textsf{com}_j = (H_i (\textsf{com}_i)^{-1})^\tau H_j\). \(H_i\) and \(H_j\) are the actual ith and jth PoRet commitments that the adversary computes in the jth interval. More formally, we later show that an adversary that skips PoRet commitments cannot succeed in verification, as it is equivalent to breaking commitment binding, which can happen with negligible probability.

We ensure a negligible likelihood of guessing the random coefficients \(\{r_j\}\) a priori by deriving them from the final PoRet commitment \(\textsf{com}_I\). This still leaves possible grinding attacks. The best strategy for an adversary is to randomly choose the commitments (or random coefficients) and check if the verification equation succeeds. The probability of success is \(2^{-\lambda }\) (as \(2^{\lambda }\) is the size of the group used). With grinding, the probability increases to \(2^{-\lambda +\alpha }\), which is still negligible for practical parameters. One way to avoid grinding is to obtain random coefficients from public randomness beacons, e.g., [21].
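The idea of deriving all coefficients from the final commitment can be illustrated with a toy example. The sketch below is not the GoAT construction: it assumes a multiplicative group modulo the Mersenne prime \(2^{127} - 1\) in place of a pairing-friendly group, and SHA-256 as the random oracle; all names are hypothetical.

```python
import hashlib
from typing import List

P = 2**127 - 1  # toy prime modulus (assumption; GoAT uses a pairing group)


def derive_coefficients(final_commitment: int, count: int) -> List[int]:
    """Derive r_1..r_count by hashing the final commitment with an index."""
    seed = final_commitment.to_bytes(16, "big")
    coeffs = []
    for j in range(count):
        digest = hashlib.sha256(seed + j.to_bytes(4, "big")).digest()
        coeffs.append(int.from_bytes(digest, "big") % (P - 1))
    return coeffs


def aggregate(commitments: List[int]) -> int:
    """Compute C = prod_j com_j^{r_j} with r_j derived from the last com."""
    coeffs = derive_coefficients(commitments[-1], len(commitments))
    result = 1
    for com, r in zip(commitments, coeffs):
        result = (result * pow(com, r, P)) % P
    return result


if __name__ == "__main__":
    coms = [3, 5, 7]  # stand-ins for per-interval PoRet commitments
    print(aggregate(coms))
```

Because every coefficient depends on the last commitment, a prover in this toy model cannot know any ratio \(r_i / r_j\) before fixing all commitments of the epoch.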

Parameterization Trade-Offs: We discuss various trade-offs arising in GoAT parameterization now.

The number of sectors s impacts the proof sizes, geolocation quality and the storage overhead. Higher s leads to reduced storage overhead but at the cost of relatively poorer geolocation and worse proof size. Note that higher s leads to increased PoRet commit times and thereby worse geolocation (Eq. ()).

The number of challenges k and the code rate \(\rho \) need to be set following the constraint given in Theorem 1. As shown in Sect. 5, for practical values of \(\rho \), the number of challenges is around 200. k and \(\rho \) impact geolocation quality and storage overhead respectively. There is a direct trade-off between the two—higher code rate (\(\rho \)) leads to less storage overhead but requires setting a higher number of challenges (k), which leads to higher PoRet commit times and worse geolocation.

Fig. 12. (Left) GoAT proof sizes. e is the size of elements in \(\mathbb {G}\) or \(\mathbb {Z}_p\) of the Shacham-Waters PoRet scheme. (Right) Proof size per interval of \(\mathsf{\textsf{GoAT}}\text {-}\textsf{P} \) with a Roughtime (RT) anchor against the epoch length. Interval length \(\beta = 1\) hr. Dashed line (720 B) corresponds to the size of two RT transcripts.

G.2 Implementation Details

Anchor Processing Times: Many TLS servers take a non-negligible amount of time to compute the response, called the anchor processing time (\(t_\textsf{aproc}\)). We measure it by pinging 114 servers at repeated intervals over two weeks, both via TLS (with TCP connections established a priori) and via ICMP (for raw RTT). The processing time is defined as the difference between the two. We compute the average processing time for each server, and then the 75th percentile over all the servers, which is \(t^\textsf{tls}_\textsf{aproc}=6.5\) ms. Anchors above this 75th percentile, i.e., the slowest 25%, are discarded. Note that setting a somewhat high value of 6.5 ms for all TLS servers is conservative; a better approach is to set anchor-specific values.

For Roughtime, we find that the processing times are almost negligible, so we set \(t^\textsf{rt}_\textsf{aproc}=2\) ms. This could be due to a combination of several factors, e.g., less load, faster transport layer (UDP) [17] and faster signature scheme (EdDSA).
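The estimation procedure for TLS anchors can be sketched as follows. The sample data layout, the function names, and the nearest-rank percentile method are assumptions made for illustration; the measurement code used in the study is not described at this level of detail.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple


def nearest_rank_percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def processing_time_threshold(
    samples: Dict[str, List[Tuple[float, float]]], pct: float = 75.0
) -> Tuple[float, List[str]]:
    """samples maps server -> list of (tls_rtt_ms, icmp_rtt_ms) pairs.

    Returns the percentile of per-server average processing times and
    the servers kept as anchors (those at or below that threshold).
    """
    per_server = {
        name: mean(tls - icmp for tls, icmp in pairs)
        for name, pairs in samples.items()
    }
    threshold = nearest_rank_percentile(list(per_server.values()), pct)
    kept = sorted(n for n, avg in per_server.items() if avg <= threshold)
    return threshold, kept


if __name__ == "__main__":
    # Hypothetical measurements for four servers, in milliseconds.
    data = {
        "a": [(11.0, 10.0), (13.0, 12.0)],
        "b": [(22.0, 20.0)],
        "c": [(33.0, 30.0)],
        "d": [(44.0, 40.0)],
    }
    print(processing_time_threshold(data))
```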

Remaining Parameters: Two more parameters remain to be set: \(t_\textsf{proc}\) and \(t_\textsf{com}\) (see Eq. ()). \(t_\textsf{proc}\) is separated into client (\(t_\textsf{cproc}\)) and anchor (\(t_\textsf{aproc}\)) components, with the latter discussed before. \(t_\textsf{cproc}\) corresponds to the time spent in handling the anchor response. We set \(t_\textsf{cproc}=1.5\) ms and \(t_\textsf{com}=2\) ms based on code benchmarks (the latter is discussed below).

Erasure Codes: Recall that a PoRet encoding \(F^*\) of file \(F\) incorporates an erasure code (to amplify soundness). Our implementation omits this part as it is standard to all PoRet implementations.

An (N, K)-erasure code encodes a message consisting of K symbols into another message of N symbols such that the original message can be recovered from a subset of the N symbols. The code rate \(\rho \) is equal to K/N. Note that a symbol here is the same as that defined in the explanation of the Shacham-Waters scheme in Sect. 4. In our implementation, the symbol size is 32 bytes. For a fixed \(\rho \) and file size |F|, observe that we want an erasure code with \(K = |F| / 32\) and \(N = K / \rho \).

RS encoding is expected to run in \(O(n \log {n})\) time. We use an off-the-shelf library that implements Reed-Solomon codes with large message and block lengths. Table 3 presents the execution results from running RS encoding on a c5.4xlarge AWS machine (like before). We set the symbol size to 32 bytes and modify the field appropriately to allow for larger N values. It takes about 8 s to encode a 1 GB file, and more generally, a slightly superlinear growth can be observed in the timing numbers, matching the expected shape of \(O(n \log {n})\).
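The parameter choice \(K = |F| / 32\) and \(N = K / \rho \) can be computed as in the sketch below. Rounding up to whole symbols and the code rate used in the example are assumptions for illustration; the function name is hypothetical.

```python
import math
from fractions import Fraction
from typing import Tuple

SYMBOL_SIZE_BYTES = 32  # symbol size used in the implementation


def erasure_code_parameters(file_size_bytes: int, rho: Fraction) -> Tuple[int, int]:
    """Return (N, K) for a file of the given size and code rate rho = K/N."""
    if not 0 < rho <= 1:
        raise ValueError("code rate must lie in (0, 1]")
    # K message symbols, rounding up for a partially filled last symbol.
    k = math.ceil(Fraction(file_size_bytes, SYMBOL_SIZE_BYTES))
    # N encoded symbols, rounding up so that K/N does not exceed rho.
    n = math.ceil(k / rho)
    return n, k


if __name__ == "__main__":
    # A 1 GiB file with an illustrative code rate of 1/2.
    print(erasure_code_parameters(2**30, Fraction(1, 2)))
```

Using `Fraction` for the rate avoids floating-point rounding when N is computed for very large files.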

Some directions of future work include exploring optimizations that can reduce encoding times for larger file sizes or exploring settings where the user can offload erasure coding to another entity.

G.3 Ways to Improve Geolocation Accuracy

One set of ideas is related to improving the network model. Our current network model is unified, i.e., for simplicity it assumes that network conditions across the globe are the same. Taking endpoint locations into account can improve geolocation quality in areas with better connectivity. Moreover, a network model that avoids the blanket use of a startup cost \(t_\textsf{setup}\) (we set it to 5 ms) is desirable, given that it causes up to 2x worse geolocation for nearby anchors. In our small measurement study, we found a lot of variance in the round trip times for nearby locations. But since GoAT can deal with short-lived network variances better due to the use of the flexible-challenge model, a smaller value for \(t_\textsf{setup}\) could be used. More experiments are needed to understand whether this idea can be used in practice.

The network model could also be more nuanced, for example nearby locations are known to have higher latencies due to long routing paths. In this case, choosing a different model based on how close the two locations are would be better.

Another idea is to optimize the PoRet commit compute time (we set it to 2 ms). This can for example be done by finding a pairing-friendly curve that has faster vector commit times and optimizing code runtime.

With regards to the choice of anchors, using Roughtime servers is clearly beneficial if possible due to their low processing times. Otherwise finding TLS servers that respond quickly is suggested, i.e., have low processing times. Overall Roughtime is a better choice of anchor, both from a performance perspective and an ethical standpoint since our use of TLS might be seen as abusing it. We hope that Roughtime gains more adoption in the future.

Several other optimization opportunities exist: reducing the client processing time by optimizing client-code (we allocate 1.5 ms which could potentially be reduced to almost zero), using an anchor-specific model for processing times, and perhaps even deploying new anchors with fast connectivity and low processing times.

Copyright information

© 2025 International Financial Cryptography Association

Cite this paper

Maram, D., Kelkar, M., Bentov, I., Juels, A. (2025). GoAT: File Geolocation via Anchor Timestamping. In: Clark, J., Shi, E. (eds) Financial Cryptography and Data Security. FC 2024. Lecture Notes in Computer Science, vol 14745. Springer, Cham. https://doi.org/10.1007/978-3-031-78679-2_3

  • DOI: https://doi.org/10.1007/978-3-031-78679-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78678-5

  • Online ISBN: 978-3-031-78679-2
