A repository of Unix history and evolution

Empirical Software Engineering

Abstract

The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1972 as a five thousand line kernel, to 2016 as a widely-used 27 million line system. The 1.1 GB repository contains 496 thousand commits and 2,523 branch merges. The repository employs the commonly used Git version control system for its storage, and is hosted on the popular GitHub archive. It has been created by synthesizing with custom software 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team, two legacy repositories, and the modern repository of the open source FreeBSD system. In total, 973 individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.

Notes

  1. https://github.com/dspinellis/unix-history-repo.

  2. Updates may add or modify material. To ensure replicability the repository’s users are encouraged to fork it on GitHub or archive it.

  3. https://archive.org/details/git-history-of-linux.

  4. The dates provided here are given by Salus (1994, p. 43).

  5. http://www.tuhs.org/Archive/PDP-11/Distributions/research/1972_stuff/.

  6. https://github.com/dspinellis/unix-history-make.

  7. https://www.mckusick.com/csrg/.

  8. http://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/share/misc/bsd-family-tree.

  9. http://unix.stackexchange.com/questions/64025/who-are-these-bsd-unix-contributors.

  10. ftp://ftp.tuhs.org.ua/PDP-11/Tools/Tapes/newoldar.c.

  11. https://github.com/jonathangray/csrg-git-patches/.

References

  • Aho A V, Kernighan B W, Weinberger P J (1979) Awk—a pattern scanning and processing language. Softw Pract Exper 9(4):267–280

  • Babaoğlu O, Joy W (1981) Converting a swap-based system to do paging in an architecture lacking page-referenced bits. In: Proceedings of the Eighth ACM symposium on operating systems principles SOSP ’81. ACM, New York, pp 78–86

  • Bashkow TR (1972) Study of UNIX. Bell Laboratories memo MH-8234-TRB-mbh. Available online at http://bitsavers.informatik.uni-stuttgart.de/pdf/bellLabs/unix/PreliminaryUnixImplementationDocument_Jun72.pdf. Current September 2015

  • Bird C, Gourley A, Devanbu P, Gertz M, Swaminathan A (2006) Mining email social networks. In: Proceedings of the 2006 International Workshop on Mining Software Repositories, ACM, New York, NY, USA, MSR ’06, pp 137–143. doi:10.1145/1137983.1138016

  • Bourne S R (1978) The UNIX shell. Bell Syst Tech J 57(6):1971–1990

  • Bourne SR (1979) An introduction to the UNIX shell. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Dolotta T A, Haight R C, Mashey J R (1978) The programmer’s workbench. Bell Syst Tech J 57(6):2177–2200

  • Feldman S I (1979) Make—a program for maintaining computer programs. Softw Pract Exper 9(4):255–265

  • FreeBSD (2015) FreeBSD Handbook. The FreeBSD Documentation Project, revision 47376 edn, available online, https://www.freebsd.org/doc/handbook/index.html

  • Gall H, Menzies T, Williams L, Zimmermann T (2014) Software Development Analytics (Dagstuhl Seminar 14261). Dagstuhl Reports 4(6):64–83. doi:10.4230/DagRep.4.6.64. http://drops.dagstuhl.de/opus/volltexte/2014/4763

  • Gehani N (2003) Bell labs: life in the crown jewel. Silicon Press, Summit

  • Johnson S C (1975) Yacc—yet another compiler-compiler. Computer Science Technical Report 32. Bell Laboratories, Murray Hill

  • Johnson S C (1977) Lint, a C program checker. Computer Science Technical Report 65. Bell Laboratories, Murray Hill

  • Johnson S C, Lesk M E (1978) Language development tools. Bell Syst Tech J 57(6):2155–2176

  • Johnson S C, Ritchie D M (1978) Portability of C programs and the UNIX system. Bell Syst Tech J 57(6):2021–2048

  • Jolitz W F, Jolitz L G (1991) Porting UNIX to the 386: a practical approach. Designing a software specification. Dr Dobb’s J 16(1)

  • Kernighan B, Lesk M, Ossanna J J (1978) UNIX time-sharing system: Document preparation. Bell Syst Tech J 57(6):2115–2135

  • Kernighan B W (1982) A typesetter-independent TROFF. Computer Science Technical Report 97. Bell Laboratories, Murray Hill, available online at http://cm.bell-labs.com/cm/cs/cstr/97.ps.gz

  • Kernighan B W, Cherry L L (1974) A system for typesetting mathematics. Computer Science Technical Report 17. Bell Laboratories, Murray Hill

  • Kernighan BW, Ritchie DM (1979) The M4 macro processor. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Lesk M (1979a) Some applications of inverted indexes on the Unix system. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Lesk M E (1975) Lex—a lexical analyzer generator. Computer Science Technical Report 39. Bell Laboratories, Murray Hill

  • Lesk ME (1979b) TBL—a program to format tables. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Lewis A (1956) AT&T settles antitrust case; shares patents. New York Times 16:1

  • Libes D, Ressler S (1989) Life with UNIX. Prentice Hall, Englewood Cliffs

  • Lions J (1996) Lions’ commentary on Unix 6th edition with source code. Annabooks, Poway

  • Mashey JR, Smith DW (1976) Documentation tools and techniques. In: Proceedings of the 2nd international conference on software engineering ICSE ’76. IEEE Computer Society Press, Los Alamitos, pp 177–181

  • McIlroy M D, Pinson E N, Tague B A (1978) UNIX time-sharing system: foreword. Bell Syst Tech J 57(6):1899–1904

  • McKusick M K (1999) Twenty years of Berkeley Unix: from AT&T-owned to freely redistributable. In: DiBona C, Ockman S, Stone M (eds) Open sources: voices from the open source revolution, O’Reilly, pp 31–46

  • McKusick M K, Neville-Neil G V (2004) The design and implementation of the FreeBSD operating system. Addison-Wesley, Reading

  • McMahon LE (1979) SED—a non-interactive text editor. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Nowitz DA, Lesk ME (1979) A dial-up network of UNIX systems. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Ossanna JF (1979) NROFF/TROFF user’s manual. In: UNIX Programmer’s Manual. Volume 2–Supplementary Documents, 7th edn. Bell Telephone Laboratories, Murray Hill

  • Pike R, Kernighan B W (1984) Program design in the UNIX system environment. AT&T Bell Lab Tech J 63(8):1595–1606

  • Quarterman J S, Hoskins J C (1986) Notable computer networks. Commun ACM 29(10):932–971

  • Raymond ES (2003) The art of Unix programming. Addison-Wesley

  • Resnick P (2008) Internet message format. RFC 5322, RFC Editor. doi:10.17487/RFC5322. http://www.rfc-editor.org/rfc/rfc5322.txt

  • Ritchie D M (1978) A retrospective. Bell Syst Tech J 57(6):1947–1969

  • Ritchie D M (1984) The evolution of the UNIX time-sharing system. AT&T Bell Lab Tech J 63(8):1577–1593

  • Ritchie DM (1993) The development of the C language. ACM SIGPLAN Not 28(3):201–208. Preprints of the History of Programming Languages Conference (HOPL-II)

  • Ritchie D M, Thompson K (1974) The UNIX time-sharing system. Commun ACM 17(7):365–375

  • Ritchie D M, Thompson K (1978) The UNIX time-sharing system. Bell Syst Tech J 57(6):1905–1929

  • Ritchie D M, Johnson S C, Lesk M E, Kernighan B W (1978) The C programming language. Bell Syst Tech J 57(6)

  • Rochkind M J (1975) The source code control system. IEEE Trans Softw Eng SE-1(4):364–370

  • Rosler L (1984) The evolution of C — past and future. Bell Syst Tech J 63(8)

  • Salus P H (1994) A quarter century of UNIX. Addison-Wesley, Boston

  • Spinellis D (2015) A repository with 44 years of Unix evolution. In: MSR ’15: Proceedings of the 12th working conference on mining software repositories. IEEE, pp 462–465. doi:10.1109/MSR.2015.6. http://www.dmst.aueb.gr/dds/pubs/conf/2015-MSR-Unix-History/html/Spi15c.html. Best Data Showcase Award

  • Spinellis D, Louridas P, Kechagia M (2015) An exploratory study on the evolution of C programming in the Unix operating system. In: Wang Q, Ruhe G (eds) ESEM ’15: 9th International symposium on empirical software engineering and measurement. IEEE, pp 54–57. http://www.dmst.aueb.gr/dds/pubs/conf/2015-ESEM-CodeStyle/html/SLK15.html

  • Spinellis D, Louridas P, Kechagia M (2016) The evolution of C programming practices: a study of the Unix operating system. In: Visser W, Williams L (eds) ICSE ’16: Proceedings of the 38th international conference on software engineering. doi:10.1145/2884781.2884799 (to appear in print). Association for Computing Machinery, New York, pp 1973–2015

  • Stevens W R (1990) UNIX network programming. Prentice Hall, Englewood Cliffs

  • Stroustrup B (1984) Data abstraction in C. Bell Syst Tech J 63(8):1701–1732

  • Stroustrup B (1994) The design and evolution of C++. Addison-Wesley, Boston

  • Takahashi N, Takamatsu T (2013) UNIX license makes Linux the last missing piece of the puzzle. Ann Bus Admin Sci 12:123–137

  • Tichy WF (1982) Design, implementation, and evaluation of a revision control system. In: Proceedings of the 6th international conference on software engineering. IEEE

  • Toomey W (2009) The restoration of early UNIX artifacts. In: Proceedings of the 2009 USENIX annual technical conference USENIX’09. USENIX Association, Berkeley, pp 20–26

  • Toomey W (2010) First edition Unix: its creation and restoration. IEEE Ann Hist Comput 32(3):74–82. doi:10.1109/MAHC.2009.55

  • Wall L, Schwartz R L (1990) Programming Perl. O’Reilly and Associates, Sebastopol

  • Yoo A B, Jette M A, Grondona M (2003) SLURM: Simple Linux utility for resource management. In: Feitelson D, Rudolph L, Schwiegelshohn U (eds) JSSPP 03: 9th International workshop on job scheduling strategies for parallel processing. doi:10.1007/10968987_3 (to appear in print). Lecture Notes in Computer Science, volume 2862. Springer, Berlin Heidelberg, pp 44–60

Acknowledgments

The author thanks the many individuals who contributed, directly or indirectly, to the effort. John Cowan, Brian W. Kernighan, Larry McVoy, Doug McIlroy, Jeremy C. Reed, Aharon Robbins, and Marc Rochkind helped with Bell Labs login identifiers. Clem Cole, John Cowan, Era Eriksson, Mary Ann Horton, Warner Losh, Kirk McKusick, Jeremy C. Reed, Ingo Schwarze, Anatole Shaw, and Norman Wilson helped with BSD login identifiers and code authorship information. The historical and current material used in the repository was made available thanks to efforts by the FreeBSD Project, Lynne Greer Jolitz, William F. Jolitz, Kirk McKusick, and the Unix Heritage Society. The early Unix editions were released under a BSD-style license thanks to the efforts of Bill Broderick, Paul Hatch, Dion L. Johnson II, Ransom Love, and Warren Toomey. The BSD SCCS import code is based on work by H. Merijn Brand and Jonathan Gray. The newoldar program is a result of work by Brandon Creighton and Dan Frasnelli. The First Research Edition Unix was restored by Johan Beiser, Tim Bradshaw, Brantley Coile, Christian David, Alex Garbutt, Hellwig Geisse, Cyrille Lefevre, Ralph Logan, James Markevitch, Doug Merritt, Tim Newsham, Brad Parker, and Warren Toomey.

Author information

Corresponding author

Correspondence to Diomidis Spinellis.

Additional information

Communicated by: Romain Robbes, Martin Pinzger and Yasutaka Kamei

The work has been partially funded by the Research Centre of the Athens University of Economics and Business, under the Original Scientific Publications framework (project code EP-2279-01) and supported by computational time granted from the Greek Research & Technology Network (GRNET) in the National HPC facility, ARIS, under project ID pa003005-cdolpot.

About this article

Cite this article

Spinellis, D. A repository of Unix history and evolution. Empir Software Eng 22, 1372–1404 (2017). https://doi.org/10.1007/s10664-016-9445-5

  • DOI: https://doi.org/10.1007/s10664-016-9445-5
