1 Introduction

Cultural heritage and historical records are treasures awaiting more advanced approaches to reveal the secrets hidden within them. Museums, galleries, libraries and archives have built rich collections and datasets over many years, organized by domain experts. These datasets are valuable resources to integrate into databases and information systems that can be accessed online by all users. Given the development of recent techniques and tools, it is essential for modern applications to digitize and manage information from diverse cultural contexts in a more systematic and user-friendly way.

In this paper, a modern prototype information system on ancient Roman Empire coins and the historical figures depicted on them is proposed, aiming at a better user experience, richer modern application features and more intuitive data management for cultural information applications. Additionally, a complete construction process for the prototype system is presented, including fetching Roman Empire numismatic data as well as Wiki data on historical figures, constructing a graph database of linked data with the necessary data elements, and building a working progressive web app. The system focuses on the information of each coin and the historical figures shown on its sides.

2 Status

Digitizing cultural heritage seems promising, but several problems remain. Most cultural heritage items are easy to search for online, yet standard, systematic information or datasets are hard to collect. Although some highly structured datasets can be found, not all of them are free, open or shareable. A lack of technical skills, especially in small museums and developing countries, makes essential data hard to exploit because of poor data organization, poor user interface (UI) design, complex operations, low performance and outdated system architectures. Moreover, there are very few complete, standard strategies or approaches for data management in the cultural and humanities fields, so people must wrestle with datasets in complex and diverse metadata formats, or even piles of unstructured data. In addition, most cultural information systems rely on older techniques; they work, but not well, because those techniques can be unfriendly, hard to use, slow, out of date or even deprecated. Given these concerns, cultural heritage data collection, knowledge transformation and standardization, information retrieval technologies, and data visualization are critical tasks for the future.

3 Technology Trend

3.1 Progressive Web App (PWA)

PWAs [3, 7, 8] are a category of web applications that fuse recent techniques, friendly design and better performance to meet modern application requirements. Such apps are safe, responsive, installable, discoverable, linkable, re-engageable, always up to date and connectivity-independent, offering an interactive, native-like feel that improves the user experience in multiple respects.

Not all web apps are PWAs, so there is a baseline for judging them [9]. A PWA should be responsive on multiple devices with different viewports, and of course work across browsers. The app should be served over HTTPS (Hyper Text Transfer Protocol Secure), and each page should be linkable with a valid URL (Uniform Resource Locator). Smooth page transitions and a fast first load, even on a 3G network, are required. Offline access to certain content and the ability to add the app to the home screen make it feel “native”. For a large, complex and complete PWA, more features need consideration, such as the History API (Application Programming Interface), canonical URLs, metadata for searching and sharing, caching strategies, app notifications, friendly UI design, the Credential Management API for login, the Payment Request API and so on.

3.2 Graph Database

A graph database is, literally, a database built on graph structures. The recent rapid growth of graph databases reflects how well graphs depict relations, and the benefits of organizing datasets into graph structures for diverse application requirements. Compared with traditional databases, a graph database performs better on queries and analytics thanks to the inherently indexed data structure of the graph model, in which a traversal never touches irrelevant data. Graph databases are also more intuitive and natural for data modeling: without the strict rules of relational databases or the complex data organization strategies of some NoSQL databases, vertices for objects and edges for relations yield a friendly, semantic model of the data. Additionally, the flexibility of the data structure lets the database adapt to dynamic use cases, with changing schemas, attributes and relations allowing the data model to expand or shrink elastically. For real-time data streams, graph databases can even support simultaneous updates and queries.
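The performance claim above rests on index-free adjacency: each vertex stores direct references to its neighbors, so a traversal touches only the data along its path. A minimal in-memory sketch in plain JavaScript (an illustration of the idea, not a real database engine; all identifiers are hypothetical) looks like this:

```javascript
// Minimal adjacency-list graph: each vertex holds direct references to its
// outgoing edges, so a traversal never scans unrelated records.
class Graph {
  constructor() { this.vertices = new Map(); }
  addVertex(id, data) { this.vertices.set(id, { id, data, out: [] }); }
  addEdge(from, to, label) {
    this.vertices.get(from).out.push({ to, label });
  }
  // Follow edges with the given label from a start vertex.
  neighbors(id, label) {
    return this.vertices.get(id).out
      .filter(e => e.label === label)
      .map(e => this.vertices.get(e.to).data);
  }
}

const g = new Graph();
g.addVertex('coin:1', { denomination: 'denarius' });
g.addVertex('person:augustus', { name: 'Augustus' });
g.addEdge('coin:1', 'person:augustus', 'portrays');

console.log(g.neighbors('coin:1', 'portrays')); // [ { name: 'Augustus' } ]
```

A production graph database such as ArangoDB applies the same principle at storage level, which is why traversal cost scales with the size of the result rather than the size of the whole dataset.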

Graph databases fall into five categories [10]: operational graph databases for a broad range of transactions and operational analytics, like JanusGraph, OrientDB and Neo4j; knowledge graph/RDF databases, which are suitable for operational contexts but add inferencing capabilities and index requirements, such as AllegroGraph, Virtuoso, Blazegraph, Stardog and GraphDB; multi-modal graph databases that support different model types for compound requirements, including Microsoft Azure Cosmos DB, ArangoDB and Sqrrl; analytic graph databases focusing on ‘known knowns’, ‘known unknowns’ and even ‘unknown unknowns’ problems, for example Apache Giraph; and real-time big-graph databases that handle massive data volumes and high data creation rates while providing real-time analytics, such as TigerGraph.

3.3 Linked Data and JSON-LD

According to the standards of the World Wide Web Consortium (W3C), linked data [1, 2] is a concept under the Semantic Web. Linked data is the collection of reachable, interrelated data on the web. It empowers people who publish and use information on the web, providing a way to create a network of standards-based, machine-readable data across the web. JSON-LD [4,5,6] is a lightweight, JSON-based serialization for linked data standardized by the W3C. It is easy for humans to read and write, builds on the JSON format, and helps JSON data interoperate at web scale. JSON-LD is an ideal data format for programming environments, REST (REpresentational State Transfer) web services, and document-based NoSQL databases. Compared with other formats or standards for linked data, JSON-LD is easier to transfer, transform, interoperate with and store.
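A JSON-LD document maps plain JSON keys to IRIs through its `@context`. A small illustrative coin record might look like the following (the `@id`, property names and values are hypothetical, not copied from the actual datasets):

```json
{
  "@context": {
    "schema": "http://schema.org/",
    "nmo": "http://nomisma.org/ontology#",
    "name": "schema:name"
  },
  "@id": "http://example.org/coins/1",
  "name": "Denarius of Augustus",
  "nmo:hasMaterial": { "@id": "http://nomisma.org/id/ar" },
  "nmo:hasPortrait": { "@id": "http://dbpedia.org/resource/Augustus" }
}
```

To any JSON tool this is an ordinary document, while a linked data processor can expand `name` to `http://schema.org/name` and follow the `@id` links to external resources, which is exactly the interoperability property exploited in this project.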

4 Implementation

4.1 Data Sources and Tools

The data used in this project comes from three sources: MANTIS (A Numismatic Technologies Integration Service) for coin entities, OCRE (Online Coins of the Roman Empire) for coin series, and DBpedia for historical figure information from Wikipedia. There are over 44,000 coin entities, more than 11,000 coin series types and more than 150 historical figures.

The whole system is built with Node.js, the JavaScript runtime. ArangoDB is the database that stores the datasets of coins, historical figures and their relations as a graph. Koa powers the prototype server. Vue.js provides the web components used on the page, and Element-UI supplies the web UI widgets and design. A Fetch API polyfill is used for sending requests to the server. Webpack is the module bundler that packages the Vue and Element-UI code. Lighthouse checks the quality of web pages for PWAs, and Let’s Encrypt is the certificate authority used to configure HTTPS.

4.2 System Architecture

As a prototype system, it consists of three main parts: front end pages, a back end server and the database. Apart from these, there is a mini program for data integration; this part is currently standalone because the datasets are not real-time. The front end pages are the client part, interacting with users through a clean UI design and responsive layout. The back end server is small and simple, handling incoming requests, sending responses back and performing operations on the database. The database, operated by the logic in the server, stores the graph of coins, historical figures and the relations among them. The mini program populates the elements of this graph (Fig. 1).

Fig. 1. System architecture

4.3 Data Integration

As the complexity of data integration is not high, the strategy is simple batch processing rather than real-time processing. The goal of data integration is to obtain all necessary datasets, synthesize data fields from both the coin entity type and the coin series type into a more general, refined coin type, simplify the data fields of the historical figure datasets from DBpedia, and rebuild relations between the refined coin type and the simplified historical figure type. There are three steps to acquire the needed dataset: fetching, preprocessing and aggregating. Data fetching downloads the data files, since MANTIS, OCRE and DBpedia all provide APIs serving structured datasets. All downloaded files are in JSON or JSON-LD format: MANTIS and OCRE both have JSON-LD APIs, and DBpedia has a JSON API. Data preprocessing selects the required fields from each JSON structure, which is the core part of the integration. Data aggregating combines data fields from both the coin entity type and the coin series type into the coin type, based on relations in the JSON-LD files from MANTIS and OCRE, and recreates relations between the coin type and the historical figure type according to the data files from OCRE and DBpedia. The last step of aggregation is to add schemas to the context and standardize all key terms into IRIs following the JSON-LD standard. The JSON-LD standard used is version 1.0, and the schemas come from multiple sources: Schema.org for some general annotations, Nomisma.org for professional numismatic descriptions, and some W3C schemas as complements where needed. Because of the relationships among coins, coin series and historical figures in the fetched data files, the coin datasets must be downloaded first; the coin series datasets next, after the related coin series IDs are extracted from the coin data; and finally the historical figure datasets, once the related historical figure IDs are obtained from the coin series data (Fig. 2).
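The aggregation step can be sketched as a pure function that merges a coin entity with its coin series into the refined coin type and attaches the JSON-LD context. This is a simplified illustration: the field names (`id`, `title`, `material`, `figureIds`, `portrays`) are hypothetical stand-ins, and the real MANTIS/OCRE structures are richer.

```javascript
// Hypothetical simplified input shapes; the real MANTIS/OCRE JSON-LD differs.
function aggregateCoin(coinEntity, coinSeries) {
  return {
    // Add the schemas to the context so plain keys become IRIs.
    '@context': {
      schema: 'http://schema.org/',
      nmo: 'http://nomisma.org/ontology#'
    },
    '@id': coinEntity.id,
    'schema:name': coinSeries.title,
    'nmo:hasMaterial': coinSeries.material,
    // Recreated relation to the simplified historical figure type.
    portrays: coinSeries.figureIds
  };
}

const coin = aggregateCoin(
  { id: 'coin:42' },
  { title: 'Denarius of Trajan', material: 'silver', figureIds: ['person:trajan'] }
);
console.log(coin['schema:name']); // Denarius of Trajan
```

Running such a function over every fetched coin entity and its matching series yields the refined coin documents that are then loaded into the graph database.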

Fig. 2. Data integration flow

4.4 PWA Construction

The general workflow for building a PWA is to build a responsive web application first, then add the necessary (and some optional) PWA features based on the application requirements, and finally use Lighthouse to check how well the application works, guiding later fixes and optimizations (Fig. 3).

Fig. 3. PWA conversion chart

A simple prototype web application can easily be constructed as a single-page application with web components, so it is very convenient to use Vue.js and Element-UI to build a clean page. These tools rely on Webpack to build the target webpage. The server can be constructed with Koa and the latest JavaScript features in a relatively short time, providing a basic router, response codes and database operations. Interactions between the page and the server go through the Fetch API.

For a PWA, HTTPS is required, so a valid certificate is necessary when configuring the application. This system is not very complex, so after HTTPS three main parts remain. The first is the responsive page; the solution is the grid layout combined with the flexbox layout: grid for the overall page layout, flexbox for the components of the page. The key is to adjust the layout for different viewports to adapt to diverse device screens. The second is the caching mechanism for offline use. The core is the service worker, and all cached content and datasets rely on the IndexedDB API.

Using a service worker to intercept requests and save the necessary data into IndexedDB on the client side satisfies the offline requirement well. The last part is the web app manifest, which allows the web app to be added to the device’s home screen. A manifest must be configured and then added to the app. For a better user experience, extra effort on transitions and loading indicators is worthwhile. For effective indexability and shareability, it is also advisable to edit the meta tags of the page. Finally, Lighthouse can be applied to check whether the web app is progressive and how well it performs.
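A minimal web app manifest for such an app might look like the following (the names, colors and icon paths are illustrative, not the prototype's actual configuration):

```json
{
  "name": "Roman Coins Information System",
  "short_name": "RomanCoins",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#3f51b5",
  "icons": [
    { "src": "/icons/icon-192.png", "sizes": "192x192", "type": "image/png" },
    { "src": "/icons/icon-512.png", "sizes": "512x512", "type": "image/png" }
  ]
}
```

The manifest is referenced from the page head with `<link rel="manifest" href="/manifest.json">`; together with a registered service worker over HTTPS, it makes the app installable on the home screen.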

5 Conclusion

With this PWA workflow, a prototype of a modern cultural information system has been completed, and the app indeed exhibits the good features of PWAs, which is one way to solve the problems web applications face across multiple platforms. Web components increase reusability and reduce duplicated development. The module bundler partitions large modules into smaller ones and clarifies the hierarchy of the system. The Fetch API is still maturing as a full replacement for traditional AJAX, so a polyfill library is applied. Graph databases still need more attention and applications during the transition from NoSQL to NewSQL. For this system, many new techniques, skills and tools had to be acquired and mastered, which can be a challenge in the humanities.

For larger cultural information systems, the architecture can be more complex. For huge relation graphs and real-time use, TigerGraph can be a good choice. Diverse datasets can be transformed into linked data using advanced methods from data mining or artificial intelligence together with W3C Semantic Web standards, and further presented as knowledge graphs. Server-side caching can be useful for data reuse. Large distributed servers can support enterprise-level applications, and an excessive number of functions can be spread across individual modules of a large microservice application. There remain many promising strategies and approaches for humanities information systems.