Abstract
The research activities in the area of data base systems are reviewed. Most of the issues considered by research institutes center around models of information, interactive data manipulation, system aspects, implementation techniques and modelling and analysis. Comparison with industry activities and documented user requirements shows differences of emphasis between research and development. Conclusions are drawn with respect to established and potentially emerging principles in the area of data base design and architecture and with respect to potential future trends in data base research.
8.2. References
Abrial, J.R. Data Semantics. Data Base Management, Proc. of IFIP Work. Conf., Cargese, Corsica April 1974. North Holland, Amsterdam 1974. The paper is mathematical and philosophical with a scope exceeding the data base management area. It describes and advocates a data model with binary relations between entities.
ANSI/X3/SPARC. Interim Report: Study Committee on Data Base Management Systems. ACM SIGMOD Newsletter, 1975.
Ash, W. L., and Sibley, E. H. TRAMP: An Interpretive Associative Processor with Deductive Capabilities. 1968 ACM Natl. Conference, 144–156 (1968). TRAMP is the implementation of a binary relational model in a question answering system. It accepts definitions of relations in terms of other relations (e.g. the grandfather as a function of father and mother), which leads to deductive capabilities.
Ashany, R. Concepts of Data Manipulation. The Connection Matrix Method. IBM System Development Division, Poughkeepsie, T.R. 00.2200, June 1971. Information is represented as a binary matrix, where the rows represent entities, the columns represent attributes, and a 1 in a position indicates that the attribute is true with respect to the entity, otherwise false. Sparse matrix techniques have to be applied to reduce storage requirements.
Astrahan, M. M., and Ghosh, S. P. A Search Path Selection Algorithm for the Data Independent Accessing Model (DIAM). Proc. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. A heuristic algorithm is described which constructs a DIAM access path for a given query in RIL (Fehder).
Astrahan, M. M., and Chamberlin, D. D. Implementation of a Structured English Query Language. CACM 18, 580–588 (1975). Describes essentially the SEQUEL interpreter and the reduction algorithm employed by it to make use of secondary indexes for "minimization" of data accessing operations.
Bachman, C. W. The Programmer as Navigator. CACM 16, 653–658 (1973). C. W. Bachman's famous 1973 ACM Turing Award Lecture.
Bachman, C. W. Trends in Data Base Management. AFIPS NCC 1975 Proc. vol. 44, 569–576 (1975). Trends are: 1. The evolution of a tripartite data description (conceptual, internal, external) as used by ANSI/X3/SPARC. 2. The current debate between the data structured model (graph, network) and the relational model, which contributes to the understanding of the nature of data. 3. The introduction of new hardware to support data base algorithms.
Bayer, R., and McCreight, E. Organization and Maintenance of Large Ordered Indexes. Acta Informatica 1, 173–189 (1972). The described hierarchical index organization (B-tree) has become a standard storage structure. Logarithmic search and efficient insert, delete are characteristics of the method.
Bayer, R. Symmetric Binary B-trees, Data Structure and Maintenance Algorithms. Acta Informatica 1, 290–306 (1972). Symmetric Binary B-trees are a modification of the storage structure described by Bayer and McCreight.
Bayer, R. Storage Characteristics and Methods for Searching and Addressing. Information Processing 74, 440–444, North Holland, Amsterdam, 1974. The paper contains a discussion of hashing and B-trees in random access, pseudo random access (i.e. indexed sequential) and virtual memories.
Bennet, B. T., and Kruskal, V. J. Stack Processing for Data Base Systems. To appear in IBM J. of Res. and Dev. (1975). Traditional stack processing algorithms are inefficient for large average stack distances as they appear in the case of a large number of distinct pages. The authors describe a new algorithm to handle this situation with drastically improved efficiency.
Bergen, M., Erbe, R., Pistor, P., Schauer, U., and Walch, G. An Environment for the Interactive Evaluation of Scientific Data and its Application in Computer Aided Design. Proc. Workshop on data bases for interactive design (W. M. Cleemput and J. G. Linders, editors), Waterloo, Canada, September 15–16, 1975, available from ACM. See also Schauer /149/.
Biller, H., and Neuhold, E. J. Formal View on Schema-Subschema Correspondence. Information Processing 74, Proc. of IFIP Congress, North Holland, Amsterdam, 1974.
Bjork, L. A. Recovery Scenario for a DB/DC System. 1973 ACM National Conf. Proc., 142–146 (1973). This paper is the second of two papers describing a recovery concept in a data base system. See C. T. Davies for the first of the two papers.
Bjorner, D., Codd, E. F., Decker, K. L., Traiger, I. L. The Gamma-Zero n-ary Relational Data Base Interface: Specifications of Objects and Operations. IBM Research Report RJ 1200, 1973. A detailed description of a low level query language accessing a relational data base.
Bobrow, R. J. An Experimental Data Management System. In Data Base Systems (R. Rustin editor), Prentice-Hall, Englewood Cliffs, 1972. The paper describes an experimental system implemented in LISP. It contains a brief but excellent discussion of the EDMS (hierarchy or network) approach vs. Codd's relational approach.
Boyce, R. F., Chamberlin, D. D., King, W. F., and Hammer, M. M. Specifying Queries as Relational Expressions: SQUARE. Data Base Management, Proc. of IFIP Work. Conf. Cargese, Corsica, April 1974, North Holland, Amsterdam, 1974. SQUARE is a syntactically terse, set oriented, high level query language based on the so-called "concept of mapping". See also Chamberlin/Boyce for "SEQUEL".
Bracchi, G., Fedeli, A., and Paolini, P. A Relational Data Base Management System. Laboratorio di Calcolatori, Istituto di Elettrotecnica, Politecnico di Milano, Internal Report No. 72-5, 1972. MORIS is a Codd relational system with a calculus oriented manipulation language. The user's view (external schema) may include hierarchical structures (i.e. unnormalized data).
Bracchi, G., Fedeli, A., and Paolini, P. A Multilevel Relational Model for Data Base Management Systems. In Data Base Management, Proc. of IFIP Work. Conf., Cargese, Corsica, April 1974, North Holland, Amsterdam, 1974. Advocates the binary relational model (graph model) for the conceptual schema and many models for the external schema (hierarchical, Codd relational, etc.) as well as internal schema.
Buzen, J. P., and Chen, P. P.-S. Optimal Load Balancing in Memory Hierarchies. Information Processing 74, 271–275. North Holland, Amsterdam, 1974. A queuing model for the access to data sets in a memory hierarchy is used to analyze the allocation of data sets. The paper offers a generalization of Chen's results.
Cardenas, A. F. Evaluation and Selection of File Organization — a Model and System. CACM 16, 540–548, 1973. A program is described, which may be used to estimate total storage costs and average access time given the data organization and device related specifications.
Cardenas, A. F. Analysis and Performance of Inverted Data Base Structures. CACM 18, 253–263, 1975. See also King, Farley/Stewart, Schkolnick and Yue/Wong for recent treatments of this subject.
Cardenas, A. F., and Sagamang, J. P. Modeling and Analysis of Data Base Organization: The Doubly Chained Tree Structure. Inform. Systems 1, 57–67, 1975.
Carlson, E. P., Bennet, J. L., Giddings, G. M., and Mantey, P. E. The Design and Evaluation of an Interactive Analysis and Display System. Information Processing 74, 1055–1061, North Holland, Amsterdam, 1974. GADS is an interactive graphics system for data related to geographic locations and intended as a tool to be used by non-programmers. It provides a data extraction technique for accessing data stored in a variety of files. The paper discusses experience gained with GADS and the requirements, which must be met by a system of this kind.
Casey, R. G. Allocations of Copies of a File in an Information Network. AFIPS SJCC 1972 Proc., vol. 40, 617–625, 1972. The author gives an exact and a heuristic solution to the problem of allocating data sets within a network of computers, given the costs of storage at nodes and of transmission between them.
Casey, R. G. Design of Tree Networks for Distributed Data. AFIPS NCC 1973 Proc. vol. 42, 251–257, 1973.
Chamberlin, D. D., and Boyce, R. F. SEQUEL — a Structured English Query Language. ACM SIGFIDET Workshop 1974, ACM, New York, 1974. SEQUEL is a language with semantics very similar to those of SQUARE, but with a syntax closer to natural English. See Boyce/Chamberlin for SQUARE.
Chamberlin, D. D., Boyce, R. F., and Traiger, I. L. A Deadlock Free Scheme for Resource Locking in a Data Base System. Information Processing 74, 340–343. North Holland, Amsterdam, 1974. The authors propose to use deadlock-detection and backout of processes in case of deadlocks. Their algorithm avoids indefinite delays of a process.
Chamberlin, D. D., Gray, J. N., Traiger, I. L. Views, Authorization and Locking in a Relational Data Base System. 1975 AFIPS NCC Proc. vol. 44, 425–430, 1975. A view is a virtual relation derived from other relations via the query language SEQUEL. The problem of updating views is discussed. Views can be used for authorization. Locks temporarily restrict the access to a view for the exclusive use of one user.
Chandra, A. K., and Wong, C. K. Worst Case Analysis of a Placement Algorithm related to Storage Allocation. To appear in SIAM Journal on Computing. The authors specify a heuristic algorithm to allocate data sets to disk drives such that the probability of simultaneous access of one disk drive is minimized. The worst case performance of the algorithm is analyzed. See also Easton/Wong.
Chang, S.K. Data Base Decomposition in a Hierarchic Computer System. ACM SIGMOD 1975 Int. Conf. on Mgmt. of Data, San Jose, 1975. The author has extended Casey's results by allowing a non-linear cost function.
Chen, P. S. Optimal File Allocation in Multilevel Storage System. 1973 AFIPS NCC Proc. vol. 42, 277–282, 1973. A treatment of the hierarchy allocation problem taking queuing effects into consideration. See also Buzen/Chen.
CODASYL Development Committee. Language Structure Group. An Information Algebra. CACM 5, 190–204, 1962. An "oldtimer" and source for many ideas. Contains, for example, the definition of an entity or the idea that files may be interpreted as sets of n-tuples on which then joins, union and intersection can be performed.
CODASYL Programming Language Committee. DBTG-Report. 1971. Available from ACM. The original DBTG proposal.
CODASYL Programming Language Committee. DBLTG proposal, February 1973. Contains the COBOL data manipulation and subschema data definition language. The languages are essentially those of ref. 35.
CODASYL Data Description Language Committee. Data Description Language. Journal of Development, June 1973. Essentially the same data definition language as in 35.
CODASYL Systems Committee. Feature Analysis of Generalized Data Base Management Systems. Technical Report, May 1971. Available from ACM. Primarily compares commercially available systems, contains also a network data model.
Codd, E. F. A Relational Model of Data for Large Shared Data Banks. CACM 13, 377–387, 1970. The paper in which Codd introduced the (Codd) relational model of data.
Codd, E. F. A Data Base Sublanguage Founded on the Relational Calculus. 1971 ACM SIGFIDET Workshop, ACM, New York, 1971.
Codd, E. F. Further Normalization of the Data Base Relational Model, and Relational Completeness of Data Base Sublanguages. In Data Base Systems (R. Rustin editor). Prentice-Hall, Englewood Cliffs, 1971.
Codd, E. F. Seven Steps to Rendezvous with the Casual User. In Data Base Management, Proc. of IFIP Work. Conf. Cargese, Corsica, April 1974, North Holland, Amsterdam, 1974. The description of seven steps to a proposed natural language question answering system. The steps are: simple data model, high level internal logic, clarification dialogue, query restatement, declarative query, multiple choice interrogation and a definition capability.
Codd, E. F. Recent Investigations in Relational Data Base Systems. Information Processing 74, 1017–1021, North Holland, Amsterdam, 1974. A brief survey of Codd's relational model including a discussion of normalization and data sublanguage types. The author lists concurrency, performance, superimposition and storage access theory among the topics needing investigation.
Conway, R. W., Maxwell, W. L., and Morgan, H. L. On the Implementation of Security Measures in Information Systems. CACM 15, 211–220, 1972. The main idea in this paper is to perform checking of security only "once at compile time", an approach which is conscious of the CPU as a resource. The paper contains also a discussion of security systems implemented by 1972.
Conway, R. W., Maxwell, W. L., and Morgan, H. L. A Technique for File Surveillance. Information Processing 74, 988–992. North Holland, Amsterdam, 1974. A technique implemented by the authors in their system ASAP is described. Each file has associated with it a set of function declarations, which are compiled into a file surveillance program. All accesses to the file have to pass through the surveillance program, which can then be used to perform certain automatic functions.
Dana, C., and Presser, L. An Information Structure for Data Base and Device Independent Report Generation. AFIPS FJCC 1972 Proc. vol. 41, 1111–1116, 1972. The paper describes high level elements for the generation and manipulation of reports.
Date, C. J., and Hopewell, P. Storage Structure and Physical Data Independence. 1971 ACM SIGFIDET Workshop, ACM, New York, 1971.
Date, C. J., and Hopewell, P. File Definition and Logical Data Independence. 1971 ACM SIGFIDET Workshop, ACM, New York, 1971.
Date, C. J. An Introduction to Data Base Systems. Addison-Wesley, Reading, Massachusetts, 1975. Similar to Wedekind's book, one of the first attempts at a comprehensive introduction to data base systems. Many annotated references.
Davies, C. T. Recovery Semantics for a DB/DC System. 1973 ACM Natl. Conf. Proc., 136–141, 1973. Together with Bjork's paper an easy to understand introduction to a recovery concept.
Dearnley, P. A. Operation of a Model Self Organizing Data Management System. Comp. Journal 17, 205–210, 1974. Among other things, the system observes patterns of usage and restructures the data accordingly. Simulation results are reported.
Delobel, C., and Casey, R. G. Decomposition of a Data Base and the Theory of Boolean Switching Functions. IBM J. Res. Develop. 17, 374–386, 1973. Deals with the problem of decomposing a flat file with (enormous) redundancy into a set of flat files having the minimal cover property, i.e. a set from which the same information can be derived as from the original file and which cannot be decomposed further.
Di Paola, R. A. The Solvability of the Decision Problem for Classes of Proper Formulas and Related Results. Rand Corp., Santa Monica, Calif. Technical Report R-803-PR, August 1971. The paper deals with the solvability of the decision problem for a class of questions to be processed by Rand's Relational Data File. See Levien/Maron.
D'Imperio, M. E. Data Structures and their Representation in Storage. Annual Review in Automatic Programming, vol. 5, Pergamon Press, 1969.
Dittmann, E. L. Klassifizierung von Datenunabhaengigkeit fuer den System-Entwurf. Technische Hochschule Darmstadt. Berichte der Informatik-Forschungsgruppen DV75-1.
Doerrscheidt, A. Das Konzept des Objektbeschreibungsbaumes als Grundstruktur eines graphenorientierten Datenbankmodells. Lecture Notes in Computer Science 26, 532–541, Springer Verlag, Berlin, 1975. Describes a typically graph oriented data model based on LISP ideas.
Durchholz, R., and Richter, G. Concepts for Data Base Management Systems. Data Base Management Proc. IFIP Work. Conf. Cargese, Corsica, April 1974. North Holland, Amsterdam, 1974. Influenced by the data model of the "CODASYL Feature Analysis" the authors discuss a hierarchical data model and schema.
Earley, J. Towards an Understanding of Data Structures. CACM 14, 617–628, 1971. Sketches some ideas related to a theory of data structures similar to the available theory of formal string languages.
Earley, J. Relational Level Data Structures for Programming Languages. Acta Informatica 2, 293–309, 1973. A proposal for the incorporation of relational level data structures into ALGOL like languages.
Easton, M. C., and Wong, C. K. The Effect of Capacity Constraints on the Minimal Cost of a Partition. JACM, 22, 441–449, 1975. A new algorithm for the problem considered by Chandra/Wong is proposed, which accepts capacity constraints.
Easton, M. C. Model for Interactive Data Base Reference String. IBM Research Report RC 5050, Sept. 1974. Describes a modification of the independent references model, which describes measured behaviour well. An advantage of the model is its analytical tractability under working set assumptions.
Edelberg, M. Data Base Contamination and Recovery. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. The paper describes an algorithm, which for a given error and a given set of data transfers (i.e. log) determines the error propagation into processes and data blocks. A recovery algorithm is also described, which restores blocks and reruns processes.
Ehrich, H. D. Grundlagen einer Theorie der Datenstrukturen. Acta Informatica 4, 201–211, 1975. A graph oriented data model and graph oriented schemata within the model are investigated from a more mathematical point of view.
Engles, R. W. A Tutorial on Data Base Organization. Annual Review in Automatic Programming vol. 7 part 1, Pergamon Press, 1972.
Eswaran, K. P., Gray, J. N., Lorie, R. A., and Traiger, I. L. On the Notions of Consistency and Predicate Locks in a Data Base System. IBM Research Report RJ 1487, December 1974. The paper defines the notion of transaction, consistency within concurrency, and predicate locks and their consequences. A language for predicate specification is proposed, and an algorithm is presented which determines whether two such predicates overlap.
Eswaran, K. P., and Chamberlin, D. D. Functional Specifications of a Subsystem for Data Base Integrity. IBM Research Report RJ 1601, 1975. Contains a classification of consistency rules. Consistency rules are interpreted as routines to be invoked after changes of the data base.
Everest, G. C. Concurrent Update Control and Data Base Integrity. Data Base Management, 241–270, Proc. IFIP Work. Conf. Cargese, Corsica, April 1974. North Holland, Amsterdam, 1974. Preclaiming of resources to prevent deadlocks is advocated by the author.
Falkenberg, E., Meyer, B., and Schneider, J. Resultatspezifizierende Handhabung von Datensystemen. Lecture Notes in Computer Science 1, Springer Verlag, Heidelberg, 1973. Informal discussion of the "Gegenstandsmodell", a data model, and of a high level manipulation language for it.
Falkenberg, E. Time-Handling in Data Base Management Systems. University of Stuttgart, Institut fuer Informatik, Internal CIS-Report 07/74, 1974. Adds the dimension of time to stored relations (for example: A is employee of B from T1 to T2) and extends a data manipulation language to cope with the time dimension.
Falkenberg, E. Strukturierung und Darstellung von Information an der Schnittstelle zwischen Datenbankbenutzer und Datenbank-Management-System. Thesis, University of Stuttgart, 1975. A detailed description of a data model and a data manipulation language, both closely related to concepts in natural language. The model is graph oriented, though it allows for n-ary relations which can be (and graphically are) interpreted as joins of binary relations.
Farley, J. H. G., and Stewart, S. A. Query Execution and Index Selection for Relational Data Bases. Technical Report CSRG-53, University of Toronto, March 1975. See also Cardenas for recent investigations into this subject.
Fehder, P. L. The Representation Independent Language. IBM Research Reports RJ 1121 (1972) and RJ 1251 (1973). The papers describe RIL, the data manipulation language to the DIAM system.
Fehder, P. L. The Hierarchic Query Language (HQL) part 1. IBM Research Report RJ 1307, Nov. 1973. Describes a query language to operate on IMS like hierarchic data.
Feldman, J. A., and Rovner, P. P. An ALGOL based Associative Language. CACM 12, 439–449, 1969. The high level, ALGOL like programming language LEAP is based on binary associations, which are implemented using a hash coding technique.
Fernandez, E. B., Summers, R. C., and Coleman, C. P. An Authorization Model for a Shared Data Base. ACM SIGMOD 1975 Intl. Conf. on Mgmt. of Data, San Jose, 1975. Authorization is governed by predicates over applications and data base contents and enforced primarily at compile time.
Fiedler, H. Datenschutz und Gesellschaft. Lecture Notes in Computer Science, vol. 26, 1975. A survey of the discussions on privacy.
Finkel, R. A., and Bentley, J. L. Quad-trees: a Data Structure for Retrieval on Composite Keys. Acta Informatica 4, 1–9, 1974. A generalization of binary trees for the search on composite keys.
Florentin, J. J. Consistency Auditing of Data Bases. Comp. Journal 17, 52–58, 1974. Consistency rules are predicate calculus expressions over the data base contents. Problems of their implementation are discussed.
Frank, R. L., and Sibley, E. H. The DBTG Report: An Illustrative Example. University of Michigan, ISDOS — working paper — 7. Shows in detail the steps, which have to be made to get a COBOL application program running in the DBTG approach.
Frank, R. L., and Yamaguchi, K. A Method for a Generalized Data Access Method. 1974 AFIPS NCC Proc. vol. 43, 45–52, 1974. Describes ideas and a keyword oriented language to tailor access methods to the user's specifications.
Fraser, A. G. Integrity of a Mass Storage Filing System. Comp. Journal 12, 1–5, 1969. Describes the recovery in MULTICS.
Frasson, C. A System to Increase Data Independence in an Hierarchical Structure. Lecture Notes in Computer Science, vol. 34 (GI 1975), Springer Verlag, Heidelberg, 1975. Describes how IMS structures can be accessed independent of their position in the hierarchy.
Genton, A. Recovery Procedures for Direct Access Commercial Systems. Comp. Journ. 13, 123–126, 1970. Describes elementary checkpointing and journaling techniques.
Ghosh, S. P., and Senko, M. E. String Path Search Procedures for Data Base Systems. IBM J. Res. Dev. 18, 408–422, 1974. Within DIAM the reduction of queries to access paths in a network is considered. An algorithm is given, which is claimed to yield an access path of minimum "path cardinality".
Ghosh, S. P., and Lum, V. Y. Analysis of Collision when Hashing by Division. Inform. Systems 1, 15–22, 1975. It is analytically shown that "hashing by division" is in general best.
Ghosh, S. P., and Tuel, W. G. A Design of an Experiment to Model Data Base System Performance. IBM Research Report RJ 1482, Dec. 1974. The authors construct a linearized performance model and evaluate the model by comparison with measurements in an IMS system.
Goldstein, R. C., and Strnad, A. J. The MacAims Data Management System. 1970 ACM SIGFIDET Workshop, ACM, New York, 1970. MacAims is an early relational system.
Gorenstein, S., and Galati, G. Data Base Reorganization for a Storage Hierarchy. IBM Research Report RC 5063, Oct. 1974. The problem considered is that of clustering records into blocks (i.e. units of transfer) in such a way as to minimize the number of transfers necessary.
Greenfeld, N. R. Quantification in a Relational Data System. 1974 AFIPS NCC Proc. vol. 43, 71–75, 1974. Discusses optimization techniques for a relational system like LEAP (see Feldman/Rovner).
Haerder, T. Die Implementierung von Zugriffspfaden durch Bitlisten. Technische Hochschule Darmstadt, Berichte der Informatik-Forschungsgruppen DV74-2. The author proposes bit lists as an index organization and investigates when bit lists are superior to conventional methods of indexing.
Haerder, T. Zugriffszeitverhalten bei der Auswahl von Saetzen aus einer Datenbank. Technische Hochschule Darmstadt, Berichte der Informatik-Forschungsgruppen DV 74-3. Analysis of access with the help of simulation. Includes a comparison of storage structures for indexes.
Hall, P. A. V. Common Subexpression Identification in General Algebraic Systems. IBM UK Report UKSC0060, Nov. 1974.
Held, G. D., Stonebraker, M. R., and Wong, E. INGRES — a Relational Data Base System. 1975 AFIPS NCC Proc. vol. 44, 409–416, 1975. INGRES is a relational data management system with calculus based QUEL as its high level query language. An interesting plan of the authors is to incorporate access control and integrity assurance via query modification at preprocessing time.
Hoffmann, L. J. (editor). Security and Privacy in Computer Systems. Melville Publishing Company, Los Angeles, 1973.
Housel, B. C., Smith, D. P., Shu, N. C., and Lum, V. Y. DEFINE: A Nonprocedural Data Description Language for Defining Information Easily. Proc. of ACM Pacific, San Francisco, April 1975, ACM, New York, 1975. Describes a language DEFINE to map graph structures to a linear form, which is then referenced by (and processed according to) a translation specification, written in the language CONVERT. See Shu et al.
Inglis, J. Inverted Indexes and Multilist Structures. Comp. Journ. 17, 59–63, 1974. Discusses how to use multilist structures in order to maintain inverted files.
Karp, R. M., McKellar, A. C., and Wong, C. K. Near-optimal solutions to a 2-dimensional placement problem. IBM Research Report RC 4740, also to appear in SIAM Journal of Computing. The problem considered is the placement of records in a 2-dimensional storage array, so that the expected distance between two consecutive references is minimized.
King, W. F. On the Selection of Indices for a File. IBM Research Report RJ 1341, January 1974. See also Cardenas for recent research in this area.
Knuth, D. E. The Art of Computer Programming, vol. 1: Fundamental Algorithms. Addison-Wesley, Reading, Massachusetts, 1968.
Knuth, D. E. The Art of Computer Programming, vol. 3: Sorting and Searching. Addison-Wesley, Reading, Massachusetts, 1973.
Kogon, R., Lattermann, D., Lehmann, H., Ott, N., and Zoeppritz, M. User Specialty Languages: General Information. IBM Germany, Scientific Center Heidelberg, Technical Report 75.08.007, 1975. An interactive system is introduced, designed around a data manipulation language which is very close to natural language.
Kraegeloh, K. P., and Lockemann, P. C. Retrieval in a set-theoretically Structured Data Base: Concepts and Practical Considerations, Proc. of International Computing Symposium 1973, 531–539. North Holland, Amsterdam, 1973. The described system has a natural language like query language, which is translated into a "set theoretic" intermediate language suitable for interpretation.
Lavenberg, S. S., and Shedler, G. S. A Queuing Model of the DL/I Component of IMS. IBM Research Report RJ 1561, 1975. A simplified, analytically tractable queuing model of the processes during data base access.
Lefkovitz, D. File Structures for On-Line Systems. Spartan Books, 1969.
Levien, R. E., and Maron, M. E. A Computer System for Inference Execution and Data Retrieval. CACM 10, 715–721, 1967. Introduces the Relational Data File, a system based on binary relations (see also Di Paola).
Levitt, G., Stewart, D. H., and Yormark, B. A Prototype System for Interactive Data Analysis. 1974 AFIPS NCC Proc. vol. 43, 63–69, 1974. Describes an implemented system for analysis of measurement data relying on standard analytic procedures. It makes heavy use of graphics and statistical methods.
Lewis, P. A. W., and Shedler, G. S. Statistical Analysis of Transaction Processing in a Data Base System. IBM Research Report RJ 1629, August 1975. Describes the modeling of a transaction stream as a Poisson process with a time varying rate.
Liu, S., and Heller, J. A Record Oriented, Grammar Driven Data Translation Model. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. Grammars may be taken as mappings of a string to a tree. Two grammars mapping different strings to equivalent trees are used as a string to string mapping specification.
Lockemann, P. C., and Knutsen, W. D. A Multiprogramming Environment for Online Data Acquisition and Analysis. CACM 10, 758–764, 1967. An earlier approach to the problem of measurement data. Prefabricated programs may be assembled communicating via data sets and parameters.
Lorie, R. A., and Symonds, A. J. A Schema for Describing a Relational Data Base. Proc. 1970 ACM SIGFIDET Workshop, ACM, New York, 1970. Describes RAM — a data base management system based on binary relations (in some sense like LEAP of Feldman/Rovner).
Lorie, R. A. XRM — an Extended (n-ary) Relational Memory. IBM Scientific Center Report G 320 — 2096, Cambridge, Massachusetts, January 1974. XRM implements homogeneous flat files on top of RAM (see Lorie/Symonds).
Lum, V. Y. Multi-attribute Retrieval with Combined Indexes. CACM 13, 660–665, 1970.
Lum, V. Y., Yuen, P. S. T., and Dodd, M. Key to Address Transform Techniques, a Fundamental Performance Study on Large Existing Formatted Files. CACM 14, vol. 4, 1971. Contains a survey and evaluations of hashing techniques as applied to large data sets.
Lum, V. Y., and Ling, H. An Optimization Problem on the Selection of Secondary Keys. Proc. 1971 ACM Natl. Conf., vol. 26, 349–356, 1971. One of the earlier investigations into the problem considered by Cardenas and others.
Lum, V. Y. General Performance Analysis of Key-To-Address Transformation Methods Using an Abstract File Concept. CACM 16, 603–612, 1973.
Lum, V. Y., Senko, M. E., Wang, C. P., and Ling, H. A Cost Oriented Algorithm for Data Set Allocation in Storage Hierarchies. CACM 18, 318–322, 1975. A cost function combining the cost of storage, CPU, channel etc. is defined and an algorithm for data set allocation is outlined, which minimizes this cost.
Maruyama, K., and Smith, S. E. Analysis of Design Alternatives for Virtual Memory Indexes. IBM Research Report RC 5087, Oct. 1974. A number of implementation alternatives for indexes organized as B-trees are analyzed, resulting in formulas which are numerically evaluated.
Maurer, W. D., and Lewis, T. G. Hash Table Methods. ACM Computing Surveys 7, 5–19, 1975.
McDonald, N., and Stonebraker, M. CUPID — the Friendly Query Language. ACM Pacific Conference, San Francisco, April 1975, ACM, New York, 1975. CUPID is a graphic, data flow diagram-like language for the INGRES system. See also Held.
McGee, W. C. Generalized File Processing. Annual Review in Automatic Programming vol. 5, Pergamon Press, 1969.
McGee, W. C. File Structures for Generalized Data Management. Information Processing 68, 1233–1239, North Holland, Amsterdam, 1968. Introduces graphs as conceptual models for stored information.
McGee, W. C. A Contribution to the Study of Data Equivalence. Data Base Management. Proc. IFIP Work. Conf. Cargese, Corsica, April 1974, North Holland, Amsterdam, 1974. The author presents a number of equivalent organizations in the class of homogeneous flat file (CRM) organizations and of data description language (DBTG) organizations.
McGee, W. C. File Level Operations on Network Data Structures. ACM SIGMOD 1975 Intl. Conference, Proc., ACM, New York, 1975. The paper outlines requirements and a proposal for a data manipulation language operating on network data structures.
Mealy, G. H. Another Look at Data. Proc. AFIPS 1967 FJCC 525–534, 1967. One of the earlier papers proposing to view information as sets and relations between sets.
Mehl, J. W., and Wang, C. P. A Study of Order Transformations of Hierarchic Structures in IMS Data Bases. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. A proposal to increase the data independence supported by IMS with the help of compiled routines, which intercept the communication between application program and data management.
Merten, A. G., and Fry, J. P. A Data Description Language Approach to File Translation. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. Describes the idea and design behind the University of Michigan data translation project.
Merten, A. G., and Severance, D. G. Performance Evaluation of File Organizations through Modeling. Proc. ACM 1972 Natl. Conf., ACM, New York, 1972.
Meyer, B., and Schneider, H. J. Predicate Logic and Data Base Technology. Course Notes, University of Berlin, available from the authors. Reviews predicate logic and its use as a model for the man-machine interface, as in Codd's work and in natural language question-answering systems.
Minsky, N. On Interaction with Data Bases. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. The author discusses concepts, integrity rules, user views etc. He proposes a constructive approach to integrity by defining "consistent operators" to be used as primitives for more complex operations.
Mullin, J. K. An Improved Index Sequential Access Method using Hashed Overflow. CACM 15, 301–307, 1972.
Mylopoulos, J., Schuster, S., and Tsichritzis, D. A Multilevel Relational System. 1975 AFIPS NCC Proc. vol. 44, 403–408, 1975. The mechanisms used in the development of the prototype system ZETA/TORUS are described. ZETA is a relational data management system with a definition capability to define a high level query language on top of lower level primitives. TORUS is built on ZETA as an "intelligent" natural language interface.
Nakamura, F., Yoshida, I., and Kondo, H. A Simulation Model for Data Base System Performance Evaluation. 1975 AFIPS NCC Proc. vol. 44, 459–463, 1975. Description of experiments simulating the processes within a data base management system in a conventional simulation package.
Navathe, S. B., and Merten, A. G. Investigation into the Application of the Relational Model to Data Translation. ACM SIGMOD 1975 Intl. Conf. Proc., 123–138. The paper concludes that Codd's relational model "... poses serious problems when used in the context of data translation as a vehicle for more powerful restructuring".
Neuhold, E. J. Data Mapping: A Formal Hierarchical and Relational View. University of Karlsruhe, Forschungsberichte, Bericht 10, February 1973. The paper compares hierarchical and relational data models in formal notation. In particular, it makes clear that the relational model is a special case of the hierarchical model.
Nievergelt, J. Binary Search Trees and File Organization. ACM Computing Surveys 6, 3, 1974.
Notley, M. G. The Peterlee IS/1 System. IBM UK, Peterlee, Report UK-SC 0018. Describes IS/1, one of the earlier Codd relational implementations.
Olson, C. A. Random Access File Organization for Indirectly Accessed Records. Proc. of 1969 ACM Natl. Conf. ACM, New York, 1969.
Owens, P. J. Phase II — a Data Base Management Modeling System. Information Processing 71, 827–832, North Holland, Amsterdam, 1972. Phase II is a modeling tool designed specifically for data management evaluation.
Palermo, F. P. A Quantitative Approach to the Selection of Secondary Indexes. IBM Research Report RJ 0730, July 1970. One of the earlier papers on index selection. See Cardenas for recent results.
Palermo, F. P. A Data Base Search Problem. IBM Research Report RJ 1072, July 1972. The paper contains one of the earlier optimizing reduction algorithms for queries in predicate calculus form.
Petrick, S. R. Semantic Interpretation in the REQUEST system. IBM Research Report RC 4457, July 1973. REQUEST is an experimental, natural language question answering system.
Ramirez, J. A., Rin, N. A., and Prywes, N. S. Automatic Generation of Data Conversion Programs using a Data Description Language. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. Describes an implementation of a data definition language (due to D. P. Smith), which compiles data definitions into data translating programs.
Reisner, P., Boyce, R. F., and Chamberlin, D. D. Human Factors Evaluation of two Data Base Query Languages — SQUARE and SEQUEL. 1975 AFIPS NCC Proc. vol. 44, 447–452, 1975. A psychological experiment with 64 subjects is described and analyzed. Only nonprogrammers show a slight but statistically significant dependency on the languages, which differ primarily in syntax.
Reiter, A. Data Models for Secondary Storage Representation. University of Wisconsin, MRC Report no. 1554, May 1975. The data models are designed with the objective to be used for the performance evaluation of different implementations.
Rodriguez-Rosell, J., and Hildebrand, D. A Framework for Evaluation of Data Base Systems. Proc. of ACM European Chapters International Computing Symposium 1975. An implemented framework for the measurement and evaluation of sequences of events at different levels of a data base system is presented. The different levels involve commands issued in the application program at the high end, and disk address reference traces at the low end.
Rothnie, J. B., and Lozano, T. Attribute Based File Organization in a Paged Memory Environment. CACM 17, 63–69, 1974. A combination of "multiple key hashing" and inverted file technique allowing for a reduction of the number of page faults for multi-key-retrieval.
Rothnie, J. B. Evaluating Inter-Entry Retrieval Expressions in a Relational Data Base Management System. 1975 AFIPS NCC Proc. vol. 44, 417–423, 1975. The employed strategy attempts to utilize the information gained with every tuple-access for the purpose of optimization.
Sayani, H. H. Restart and Recovery in a Transaction Oriented Information Processing System. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. Restart and recovery policies are defined and discussed. The author puts emphasis on performance.
Schauer, U. Ein System zur interaktiven Bearbeitung umfangreicher Messdaten. IBM Germany, Informatik Symposium 1975, Bad Homburg. To appear as Lecture Notes in Computer Science, Springer Verlag, Heidelberg. Introduces an interactive measurement data base system combining interactive computational facilities (APL), a relational data storage, a graphics oriented data manipulation language (like "query by example", see Zloof) with access to an open ended library of PL/I or FORTRAN subroutines. See also /13/.
Schkolnick, M. Secondary Index Optimization. ACM SIGMOD 1975 Intern. Conf. on Mgmt. of Data, San Jose, 1975. See also Cardenas for similar research.
Schmid, H. A., and Swenson, J. R. On the Semantics of the Relational Data Model. ACM SIGMOD 1975 Intl. Conf. on Mgmt. of Data, San Jose, 1975. The authors are concerned with the gap between the pure formalism of Codd's relational model and the modelled part of the real world. The authors employ a kind of graph model to fill the gap.
Schmutz, H. Parenthesis Regular Languages and Relations. IBM Germany, Heidelberg Scientific Center, Technical Report 74.10.004, Oct. 1974. A special form of context-free grammars is used to describe the schema to a hierarchical data model. Pair grammars are used to describe the mapping between conceptual and internal or external view. The described system is a model for a theoretical treatment of important problems in data base systems.
Schneider, G. M., and Desautels, E. J. Creation of a File Translation Language for Networks. Information Systems 1, 23–31, 1975. The authors propose a language for data translation in a network such as the ARPA network.
Senko, M. E., Lum, V. Y., and Owens, P. J. A File Organization Evaluation Model (FOREM). Information Processing 68, 514–519, 1968. North Holland, Amsterdam, 1969. FOREM is an evaluation and simulation tool specifically designed to evaluate data management systems.
Senko, M. E., Altman, E. B., Astrahan, M. M., and Fehder, P. L. Data Structures and Accessing in Data Base Systems. IBM Systems Journ. 12, 30–93, 1973. This paper describes the thoughts and ideas behind the DIAM system, one of the earlier comprehensive approaches to data base research systems.
Senko, M. E. Information Systems: Records, Relations, Sets, Entities and Things. Inform. Systems 1, 3–13, 1975.
Senko, M. E. Data Description Language in the Context of a Multilevel Structured Description — DIAM II with FORAL. IBM Research Report RC 5073, Oct. 1973.
Senko, M. E. An Introduction to FORAL for Users. IBM Research Report RC 5263, 1975.
Senko, M. E. Specification of Stored Data Structures and Desired Output Results in DIAM II with FORAL. Proc. of the Int. Conference on Very Large Data Bases, Boston, 1975, available from ACM. The last three references introduce DIAM II, a proposed system, which is based on binary associations and has FORAL as its query language.
Severance, D. G. Identifier Search Mechanism: A Survey and Generalized Model. ACM Computing Surveys 6, 3, 1974.
Severance, D. G. A Parametric Model of Alternative File Structures. Inform. Systems 1, 51–55, 1975. A scheme is described, which maps a "two dimensional space of parameters" to a set of data organizations including well-known conventional organizations as special cases.
Shneiderman, B. Optimum Data Base Reorganization Points. CACM 16, 362–365, 1973.
Shneiderman, B., and Scheuermann, P. Structured Data Structures. CACM 17, 566–577, 1974. The paper describes an approach to deal with integrity in case of certain classes of data structures.
Shneiderman, B. A Model for Optimizing Indexed File Structures. IJCIS 3, 93–103, 1974. The paper is concerned with the selection of index size at different levels to improve performance.
Shu, N. C., Housel, B. C., and Lum, V. Y. CONVERT a High Level Translation Definition Language for Data Conversion. CACM 18, 557–567, 1975. A companion paper to Housel et al.
Sibley, E. H., and Taylor, R. W. A Data Definition and Mapping Language. CACM 16, 750–759, 1973. The paper discusses goals of a data definition language and illustrates data definition and mapping by examples.
Sibley, E. H. On the Equivalence of Data Based Systems. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. The two philosophical directions, "relational" (Codd) and the "data structured" or "procedural" (DBTG) are compared. Also data translation with its connection to data restructuring and data independence is discussed.
Sibley, E. H., and Sayani, H. H. Data Element Dictionaries for the Information Systems Interface. NBS-Report, 1974. A discussion of the need for and objectives of a Data Dictionary capability.
Smith, D. P. An Approach to Data Description and Conversion. PH. D. dissertation, University of Pennsylvania, 1971. One of the earlier data definition and mapping languages. See also Ramirez.
Smith, S. E., and Mommens, J. H. Automatic Generation of Physical Data Base Structures. ACM SIGMOD 1975 Intl. Conf. San Jose, 1975. A prototype design aid is described which generates IMS physical data structure definitions from descriptive input, taking into account constraints and objective functions.
Stahl, F. A. A Homophonic Cipher for Computational Cryptography. AFIPS NCC Proc. vol. 42, 565–568, 1973.
Steel, T. B. Data Base Standardization — A Status Report. ACM SIGMOD 1975 Intl. Conf. on Mgmt. of Data, San Jose, 1975.
Steuert, J., and Goldman, J. The Relational Data Management System: A Perspective. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974. An introductory description of RDMS, a system being used at MIT and based on Codd's relational model.
Stonebraker, M. The Choice of Partial Inversions and Combined Indices. IJCIS 3, 167–188, 1974. See also Cardenas for research on this topic.
Stonebraker, M. A Functional View of Data Independence. 1974 ACM SIGFIDET Workshop Proc., ACM, New York, 1974. The paper first analyzes the problem with a promising formal approach, which unfortunately is not carried through to the end. It describes the types of data independence to be provided in INGRES.
Stonebraker, M. Implementation of Integrity Constraints and Views by Query Modification. ACM SIGMOD 1975 Intl. Conf. Proc., San Jose, 1975. Describes the INGRES approach to integrity in more detail. See also Held et al.
Su, S. Y. W., and Lam, H. A Semiautomatic Data Base Translation System for Achieving Data Sharing in a Network Environment. 1974 ACM SIGFIDET Workshop, ACM, New York, 1974.
Sundgren, B. Conceptual Foundation of the Infological Approach to Data Base. Data Base Management Proc. of IFIP Work. Conf. Cargese, Corsica, April 1974. North Holland, Amsterdam, 1974. The infological approach is a kind of a conceptual data model philosophy. It may have a corresponding datalogical approach associated with it, which deals with internal data forms.
Taylor, R. W. Generalized Data Base Management System Data Structures and their Mapping to Physical Storage. Ph. D. dissertation, University of Michigan, Ann Arbor, 1971. Contains a proposal for a data definition and mapping language, which is being used in the Michigan data translation experiments. See Merten/Fry.
Taylor, R. W. Data Administration and the DBTG Report. 1974 ACM SIGFIDET Workshop Proc., ACM, New York, 1974. Among other things, the author proposes to use a preprocessor to obtain data independence at precompile time.
Taylor, R. W., and Stemple, D. W. On the Development of Data Base Editions. Data Base Management, Proc. of IFIP Work. Conf. Cargese, Corsica, April 1974. North Holland, Amsterdam, 1974. The authors' concern is the evolution of a data base at a user installation and its impact on programs.
Teichroew, D. An Approach to Research in File Organization. Proc. of the 1971 SIGIR Symposium on Information, Storage and Retrieval, ACM, New York, 1971. The essential message in this paper: there is no absolutely best representation of information. Changes as a function of knowledge about the future use of the data have to be made with assistance of the computer.
Thomas, J. C., and Gould, J. P. A Psychological Study of Query by Example. 1975 AFIPS NCC Proc. vol. 44, 439–445, 1975. Reports the results of an experiment with 35 subjects, who were given questions in English to be translated into query by example (see Zloof).
Thompson, F. B., Lockemann, P. C., Dostert, B., and Deverill, R. S. REL: A Rapidly Extensible Language System. ACM 1969 Natl. Conf. Proc., 399–417, 1969.
Todd, S. J. P. PRTV: A Technical Overview. IBM UKSC Peterlee, Technical Report UKSC 0075, 1975. A new description of the experimental system IS/1.
Tsichritzis, D. A Network Framework for Relation Implementation. University of Toronto, Technical Report CSRG-49, February 1975. Discusses how Codd's relational model can be implemented on top of physical networks (i.e. linked structures).
Turn, R., and Shapiro, N. Z. Privacy and Security in Data Bank Systems — Measures of Effectiveness, Costs and Protection-Intruder Interactions. AFIPS 1972 FJCC, vol. 41, 435–444.
Van der Pool, J. A. Optimum Storage Allocation for a File in Steady State. IBM J. Res. Dev. 17, 27–38, 1973. Files with key-to-address transformations (hashing) and with overflow areas are analyzed. Storage utilization, overflow rate and other relevant factors are given for the steady state.
Vose, M. R., and Richardson, J. S. An Approach to Inverted Index Maintenance. Comp. Bull. 16, May 1972.
Wang, C. P., and Wedekind, H. H. Segment Synthesis in Logical Data Base Design. IBM J. Res. Dev. 19, 71–77, 1975. The authors specify a minimal cover algorithm, which calculates a set of minimal covers to a given set of relations with transitive dependencies. Each minimal cover is again a set of relations without transitive dependencies. Given the minimum cover, a set of relations in Codd's third normal form can easily be constructed.
Wedekind, H. Datenorganisation. de Gruyter, Berlin, 1972.
Wedekind, H. Datenbanksysteme I. Bibliographisches Institut Mannheim, 1974.
Wedekind, H. On the Selection of Access Paths in a Data Base System. Data Base Management, Proc. of IFIP Work. Conf., Cargese, Corsica, April 1974. North Holland, Amsterdam, 1974. The paper's concern is modeling and analysis for the determination of efficient access paths.
Wellis, M. E., Katke, W., Olson, J., and Yang, S. C. SIMS — an Integrated, User-Oriented Information System. AFIPS FJCC 1972, vol. 41, 1117–1131, 1972. SIMS is interesting for a number of reasons. It offers a high level data definition, mapping and manipulation language. Data on normal files may be mapped to a conceptual high level hierarchical form and used in the query language. Particular attention has been paid to transferability of data and programs.
Wong, E., and Chiang, T. C. Canonical Structure in Attribute Based File Organizations. CACM 14, 593–597, 1971. Each query is assumed to be a boolean expression over elementary queries. In this case the data base can be organized according to the elementary queries and access becomes essentially the problem of putting a boolean expression into some standard form.
Yao, S. B. Evaluation and Optimization of File Organization through Analytic Modeling. Ph. D. dissertation, University of Michigan, 1974.
Yue, P. C., and Wong, C. K. Storage Cost Considerations in Secondary Index Selection. IBM Research Report RC 5070, to appear also in IJCIS. For other recent results in this area of research see Cardenas.
Zloof, M. M. Query By Example. 1975 AFIPS NCC Proc. vol. 44, 431–437, 1975. The basic features of query by example are illustrated. The user's perception of data processing in this query language is that of manipulating tables in a graphically pre-established frame of reference consisting of table skeletons, into which the user fills information.
© 1976 Springer-Verlag
Blaser, A., Schmutz, H. (1976). Data base research: A survey. In: Hasselmeier, H., Spurth, W.G. (eds) Data Base Systems. IBM 1975. Lecture Notes in Computer Science, vol 39. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-07612-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-07612-4
Online ISBN: 978-3-540-38130-3
eBook Packages: Springer Book Archive