DancerFly: An Order-Aware Network-on-Chip Router On-the-Fly Mitigating Multi-path Packet Reordering

International Journal of Parallel Programming

Abstract

Adaptive routing algorithms can improve performance by balancing load across network channels under non-uniform traffic patterns. However, because adaptive routing sends packets over multiple paths, packets can arrive out of order and must be reordered at the destination before they are absorbed. As the network grows, the arrival time of a packet at its destination becomes highly uncertain under adaptive routing, so a large reorder buffer is required, which can exceed the available design space. The challenge is therefore to balance the trade-off between multi-path transmission and packet reordering. In this paper, we propose a novel packet reordering metric, OOD, to quantify the degree of out-of-order delivery. To minimize the OOD of packets, we propose DancerFly, an order-aware network-on-chip router that mitigates the out-of-order packets caused by adaptive routing. DancerFly achieves this goal through two-level reordering. First, it performs in-buffer reordering of the packets queued in the input buffer. Second, it reorders packets from different input ports before they traverse the router. Our evaluation shows that DancerFly reduces the OOD by 36.3% with performance comparable to the baseline.
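The two reordering levels described above can be illustrated with a toy model. The following Python sketch is not the DancerFly microarchitecture and does not compute the paper's OOD metric, whose definition is not given in this abstract. It assumes a hypothetical `Packet` type with a per-flow sequence number, uses a simple stand-in disorder measure (`late_fraction`), and simplifies the second level to a single flow shared by all input ports.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    flow: str  # hypothetical flow id, e.g. "src->dst"
    seq: int   # hypothetical per-flow sequence number set at the source


def late_fraction(arrivals):
    """Stand-in disorder measure (NOT the paper's OOD): fraction of packets
    that arrive after a later-sequenced packet of the same flow."""
    highest = {}
    late = 0
    for p in arrivals:
        if p.seq < highest.get(p.flow, -1):
            late += 1
        else:
            highest[p.flow] = p.seq
    return late / len(arrivals) if arrivals else 0.0


def reorder_in_buffer(buffer):
    """Level 1 (assumed behaviour): within one input buffer, sort each
    flow's packets by seq, leaving slots held by other flows untouched."""
    slots = defaultdict(list)
    for i, p in enumerate(buffer):
        slots[p.flow].append(i)
    out = list(buffer)
    for idxs in slots.values():
        ordered = sorted((buffer[j] for j in idxs), key=lambda q: q.seq)
        for i, p in zip(idxs, ordered):
            out[i] = p
    return out


def drain_ports(ports):
    """Level 2 (assumed behaviour, single-flow simplification): repeatedly
    forward the head packet with the lowest seq among all input ports."""
    queues = [list(q) for q in ports]
    out = []
    while any(queues):
        q = min((q for q in queues if q), key=lambda q: q[0].seq)
        out.append(q.pop(0))
    return out


# Two input ports holding packets of one flow that arrived out of order.
port_a = [Packet("f", 2), Packet("f", 0)]
port_b = [Packet("f", 3), Packet("f", 1)]

unordered = port_a + port_b
reordered = drain_ports([reorder_in_buffer(port_a), reorder_in_buffer(port_b)])
print(late_fraction(unordered), late_fraction(reordered))
```

In this toy case the two levels together restore full sequence order. A real router can only reorder the packets that happen to be buffered at the same time, which is consistent with the abstract reporting a reduction in OOD rather than its elimination.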

Notes

  1. A set of paths is defined to be non-intersecting if the paths originate from the same source vertex but do not intersect each other in the network, except at the destination vertex.

Acknowledgements

We thank the anonymous reviewers for their valuable feedback. We also thank the members of the Tianhe interconnect group at NUDT for many inspiring conversations early in the project. This project was partially supported by the National Science and Technology Major Projects on Core Electronic Devices, High-End Generic Chips and Basic Software under grants No. 2018ZX01028101 and No. 2017ZX01038104-002.

Author information

Corresponding author

Correspondence to Dezun Dong.

About this article

Cite this article

Jin, K., Dong, D., Li, C. et al. DancerFly: An Order-Aware Network-on-Chip Router On-the-Fly Mitigating Multi-path Packet Reordering. Int J Parallel Prog 48, 730–749 (2020). https://doi.org/10.1007/s10766-019-00648-9

  • DOI: https://doi.org/10.1007/s10766-019-00648-9
