1 Introduction

In today’s highly dynamic digital era, a curriculum vitae (CV), a written overview of someone’s life’s work including a complete career record, academic background, publications, qualifications, etc., may no longer be an optimal and reliable source of information about a future employee. First, information can be falsified [4, 13], and it is time-consuming, often impossible, to obtain reliable confirmation of the applicant’s qualifications, due to, for instance, the original language of an education certificate, or the difficulty of reaching a previous employer. Second, the process of matching a candidate’s CV with the job description is performed by the hiring company and can be very costly and inefficient. One reason is the limited capacity of HR personnel regarding the number of applications they can analyse, combined with the attempts of some applicants to maximise their job opportunities by submitting many applications simultaneously. At the same time, an applicant with a perfectly matching profile may simply not be aware of a job opening, due to the variety of open positions currently available and the different platforms on which the openings can be listed. Third, as HR tasks are at present usually conducted by personnel, any decision taken, especially during the selection and matching process, can be subjective and can introduce bias regarding gender, age, ethnicity, etc. Moreover, unintentional human mistakes can also occur. In our approach, these aspects are mainly considered for the pre-selection phase, in order to ensure a fair panel of candidates for the human task. All of the above currently make filling job openings an ineffective and inefficient task.

This leads us to the question of how to structurally address these inefficiencies by using approaches from distributed ledgers, artificial intelligence, and data science. The main initial challenges identified are as follows: What other data sources can be of interest to companies when looking for new employees? Are there available technologies that can be used to optimise candidate selection and the matching of candidate profiles to the job description? Is it possible to remove the bias of human decision-making, and if yes, what is the impact of such automatisation? How can a partially automated solution be integrated into the intrinsically human-based task of balancing soft and hard skills against the frequently unwritten or unsaid job opening constraints and soft/organisational requirements?

Companies often use multiple third-party providers (e.g., LinkedIn, Glassdoor) to advertise open career opportunities and attract applicants. While these providers are very helpful for applicants searching for a job in specific areas, and even provide interfaces for data extraction from a CV (“Easy Apply” in LinkedIn, CV upload in Glassdoor), the main benefit for the company lies in the increased visibility of an advertisement; nevertheless, no proof of data authenticity and accuracy can be provided, and the resulting increase in received applications requires further matching effort and more manpower. Some partial solutions already exist for supporting the tasks of employee selection and management, e.g. people analytics, and for simplifying and promoting the interaction with complex information, such as gamification. In fact, techniques used to mine consumer and industry data can help managers and executives to make decisions about their employees. By applying data science and machine learning techniques to large sets of talent data, people analytics can result in better decision-making. Gamification approaches, on the other hand, allow data scientists to collect focused information and aim to build a picture of individual employees’ personalities and cognitive skills [17]. However, the use of such third-party service providers and people analytics, apart from bringing some functional benefits, can create serious threats to the individual’s privacy. Additionally, such solutions provide no support for information validation and accuracy checking.

Blockchain technology (BCT) has recently been proposed to ensure the authenticity of human resource information, taking into account privacy concerns [16]. There also exist approaches that employ blockchain technology for verifying the authenticity of diplomas and certificates [4], and for setting up a global higher education credit platform [13]. Michaelides provides an interesting general discussion about how combining BCT with artificial intelligence (AI) techniques will enable more accurate approaches to hiring employees [10]. However, no specific solution or implementation of a BCT-based system that is able to address all the concerns mentioned above has been proposed yet.

The use of BCT is of paramount importance here to enforce trust creation inside an otherwise untrusted network, composed of companies, universities, students and applicants. Even though BCT alone is not enough to guarantee the truthfulness of the data, the trust it creates is one of the preconditions for it. For the additional requirements, we point the reader to the paragraph about the Oracle in the next section. Building on existing generic ideas, such as combining BCT and AI [5] and current practices of HR analytics [17], and taking into account the identified problems, this work envisions a blockchain-based solution for a trusted support platform for the job placement task.

We propose to employ matching algorithms over a subset of the verifiable data in distributed settings, while preserving the privacy of the applicants. In the next section, we analyse the need for an efficient and trustworthy HR information management system, define its design goals, and present a potential solution sketch. Section 3 provides a short overview of the related work, whereas Sect. 4 discusses the proposed architecture and considers some expected effects and some foreseeable issues of its application, from technological, legal, and social perspectives. Finally, Sect. 5 concludes the paper by wrapping up the idea and the next steps of this work.

2 Requirements and Solution Sketch

In this section, we first formulate the problem statement and, based on it, define the desirable design goals of the system. Second, we present the design of our proposed solution.

2.1 Problem Statement and Design Goals

The goal of this work is to address inefficiencies and limits in the current process of selecting and matching candidates for a job opening, from legal and economic perspectives. In particular, we focus on the following research questions (RQs):

  • RQ-1: How to ensure trust in the data that are provided by the applicants?

  • RQ-2: How to optimise the process of selecting candidates for the companies?

  • RQ-3: How to facilitate the process of targeted dissemination of the information about the candidate, such as skills, education, and previous work experience?

  • RQ-4: How to ensure the candidate’s privacy and data security?

  • RQ-5: Is it possible to remove possible discrimination and bias from human-made decisions? What would be the possible consequences of such an automatisation?

In order to address the questions defined above, we formulate the properties that the system we intend to design should hold, in the form of design goals (DGs). For each DG, we also specify which research question(s) the property aims to address:

  • DG-1: Distributed architecture with direct involvement of the trustworthy data sources (RQ-1).

  • DG-2: Traceability and verifiability of the data - namely, each single atomic entry of a CV - provided by the applicants (RQ-1, RQ-2).

  • DG-3: Flexible and interoperable data-expressiveness layer for (I) the definition of the desired skills (singletons expressing expertise) on the company side and (II) the data model for management of the applicants’ data (RQ-3).

  • DG-4: User-centric control over the data provided by the applicants, data privacy and security (RQ-4).

  • DG-5: Discrimination- and bias-free system that provides efficiency in selection of candidates and their matching to the corresponding job description (RQ-2, RQ-5).

2.2 Solution Sketch

Striving to ensure the desired properties, we propose to employ blockchain technology and artificial intelligence techniques to enhance how “HR tech” is currently approached.

Fig. 1. The design of the proposed solution towards a trusted support platform for the job placement task.

Figure 1 presents the design of the proposed solution. Below, we describe each component and the step-by-step process flow. Then, we discuss possibilities (alternative approaches) for employing gamification, a distributed matching algorithm, identity management, and distributed ledger technology/blockchain in our solution.

Applicant is a user who is looking for employment. Applicants register in the system and provide data about themselves, which will be verified later on and will be used to extract some structured information to be hashed and stored on the blockchain. All the entries composing the personal records are provided solely by the users; they are the only subjects entitled to create new information about themselves.

Company is another type of actor, with the aim of finding/selecting employees. Additionally, the firm is also in charge of validating the applicants’ data whenever a working experience entry is related to a job position hosted by it. To build trustworthy data sources, a company can be a participant of the permissioned blockchain network (such as Hyperledger Fabric [1]), set up to make it possible to verify the information about the applicants, including work and education history, using the hashes generated from the structured anonymous data and stored on the blockchain. This actor also needs to provide a well-suited description of the position offered, by means of a ranked and weighted set of skills. Universities and training institutions are specific types of company, and are also expected to validate the parts of the applicant records related to education.

Oracle is a verification platform that can perform proofing tasks to establish the truth. Different approaches for validating the applicant-provided data can be employed: (i) the company where the applicant worked previously validates the data, (ii) validation is outsourced to a third party, (iii) a crowd-based method, where other actors in the system (peers) can support the applicant’s claim. These types of validation carry decreasing reputation levels: from full credibility, when the confirmation of an experience entry and its evaluation is performed by the company itself, through intermediate trustworthiness, achieved through the use of paid third-party validation services, down to a basic validation provided by majority voting amongst a large enough group of peers. The latter is the fastest approach and can provide a solution for the cold-start problem, even if it results in the lowest guarantee of data authenticity.
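The three validation types and their decreasing reputation levels can be illustrated with a minimal sketch. The names `ValidationLevel`, `peer_vote`, and `best_validation`, the numeric ordering, and the minimum number of voters are assumptions made for illustration only; the paper does not prescribe them.

```python
from collections import Counter
from enum import IntEnum


class ValidationLevel(IntEnum):
    """Hypothetical ordering of trust levels; a higher value is more credible."""
    PEER_VOTE = 1     # majority voting amongst peers
    THIRD_PARTY = 2   # paid third-party validation service
    ISSUER = 3        # the company or institution itself


def peer_vote(votes, min_voters=5):
    """Return True or False when a strict majority exists among enough
    voters, and None when the group is too small or the vote is tied."""
    if len(votes) < min_voters:
        return None
    tally = Counter(votes)
    confirms, rejects = tally[True], tally[False]
    if confirms == rejects:
        return None
    return confirms > rejects


def best_validation(confirmations):
    """Return the highest level among positive confirmations of an entry,
    or None if the entry has not been validated at all."""
    return max(confirmations, default=None)
```

Under these assumptions, an entry confirmed by peers can later be upgraded when the previous employer confirms it, which matches the role of peer voting as a cold-start mechanism.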

Matching algorithm is the application of AI techniques to provide a ranked list of matches to the declared skillset of a job opening. Given the relative stability and low frequency of job postings, it is possible to pre-compute a suitable representation of all the available openings, to make the solution scalable. We expect to be able to compute the matching between the applicants’ data and job profiles in distributed settings. The matching algorithm will be based on natural language processing (NLP) and document semantic similarity estimation [9], and will also benefit from some AI-based improvements in concept identification for multi-grams [15]. This approach can support fairness by construction and bias-free results, at least with respect to the prejudices of individual persons.
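As a rough illustration of ranking applicants against a weighted skill profile, the sketch below uses a plain bag-of-words cosine similarity. This is a deliberately simple stand-in, not the semantic similarity method of [9] or the multi-gram concept identification of [15]; all function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case whitespace tokenisation; a real system would use NLP tooling."""
    return text.lower().replace(",", " ").split()


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as mappings."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def job_vector(weighted_skills):
    """Build a vector from a mapping of skill phrase to importance weight.
    It depends only on the job profile, so it can be pre-computed."""
    vec = Counter()
    for skill, weight in weighted_skills.items():
        for token in tokenize(skill):
            vec[token] += weight
    return vec


def applicant_vector(entries):
    """Build a term-count vector from the verified textual CV entries."""
    vec = Counter()
    for entry in entries:
        vec.update(tokenize(entry))
    return vec


def rank(weighted_skills, applicants):
    """Return pseudonymous applicant ids ordered by decreasing similarity."""
    jv = job_vector(weighted_skills)
    scored = [(cosine(jv, applicant_vector(entries)), pid)
              for pid, entries in applicants.items()]
    return [pid for score, pid in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

Note that the ranking only sees pseudonymous identifiers and entry text, so attributes such as name, gender, or age never enter the score in this sketch.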

Web-Interfaces. We define the following web-interfaces: (i) an interface for the applicant to register and input the information related to his education and employment history (after verification, these data will be used as an input to the matching algorithm); (ii) an interface for the company to define and adjust the desired profile of the future employee in terms of the skills and their importance (this information will also be used as an input to the matching algorithm), as well as for after-matching selection of the candidates; (iii) an interface for the oracle (i.e., crowd-based or for the previous employers/institutions) for the verification of the data provided by the applicant. Via these interfaces it will also be possible to query the information stored on the blockchain, to verify the authenticity of the applicant data. We will also leverage gamification techniques for the users, in order to simplify the initial matching process between applicants and companies (more details will be provided further in this section).

In the following step-by-step process description, we present our vision on the system behaviour:

  1. Applicant registration. An applicant needs to be registered in the system in order to provide his education and employment history, and therefore to be able to release his data for potential job posting matching. At the registration phase, the system verifies whether the applicant already has an account. Companies need to be registered as well, in order to guarantee their unique identification and their responsibility for the job postings provided. A Self-Sovereign Identity (SSI) approach is a good candidate for this function, as it can naturally manage distributed data while helping to deal with the increased difficulties stemming from full conformity to the GDPR.

  2. Data authenticity verification. In order to ensure that the provided data are trustworthy, a verification platform (Oracle) is employed.

  3. Information extraction. After the data verification, the factual structured data about the participants are extracted. The sources of such data must be the companies and education institutions.

  4. Ledger updates. The ledger can then be updated directly by the data sources, ensuring authenticity of the data and of the source by using a hash function and a digital signature for each transaction.

  5. Job profile definition and adjustment. To announce a new job position, a registered company must define a job profile by specifying the expected skills and their importance.

  6. Matching algorithm. The input to the algorithm is formed by the structured data extracted from the verified information about the applicant and the job profile generated by a company. When a match happens, both sides should be notified, still without revealing personal information.

  7. Creation of the channel between the company and the applicant. A channel can be created automatically, or initiated by the company/applicant based on the top results of the output of the matching algorithm. Before this channel is established, the identity of the applicant is hidden from the company, yet the authenticity of the matching data is guaranteed.
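Steps 6 and 7 can be illustrated with a small sketch in which a company sees only a pseudonym and a score until a channel exists. The `Match` record, the `company_view` helper, and the rule that a channel opens once both parties consent are assumptions for illustration; the paper also allows channels to be created automatically.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    """A pseudonymous match between an applicant and a job opening."""
    pseudonym: str
    job_id: str
    score: float
    consents: set = field(default_factory=set)

    def consent(self, party):
        """Record that 'applicant' or 'company' agrees to open a channel."""
        if party not in ("applicant", "company"):
            raise ValueError(f"unknown party: {party}")
        self.consents.add(party)

    @property
    def channel_open(self):
        # Assumption: the channel exists only after both parties consent.
        return self.consents == {"applicant", "company"}


def company_view(match, identities):
    """Return what the company may see; the applicant's identity is
    included only once the channel is open."""
    view = {"pseudonym": match.pseudonym,
            "job_id": match.job_id,
            "score": match.score}
    if match.channel_open:
        view["identity"] = identities[match.pseudonym]
    return view
```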

As already mentioned above, blockchain technology can be used to ensure the authenticity and traceability of the HR information. However, choosing a technical solution that can effectively support the legal requirements in this domain is of paramount importance: a DLT (distributed ledger technology) should provide support for information revocation. Additionally, the economic aspect (cost of transactions) should also be considered, for real scalability of the solution. Therefore, we propose to employ a permissioned blockchain technology, in order to create a network of peers that will serve as a source of trustworthy data related to the employment history of the candidate. Based on the properties of permissioned blockchain technology, the access to the network and to the data stored on the ledger is governed by the membership service; thus, we can ensure that only legitimate companies can participate in the network. Moreover, we intend to store only hashes of the structured data about the candidates, to enable verification and to ensure privacy (anonymity) of the candidates. A digital signature on every transaction that submits a hash as an update of the ledger will ensure the authenticity of the data source. We envision using Hyperledger Fabric, an implementation of permissioned blockchain technology, to benefit from the maturity of the framework, its flexibility in terms of the consensus protocol choice and the identity management approaches, and the built-in possibility to employ privacy-preserving mechanisms (channels, private collections, etc.) [1]. We will confirm our choice based on the business requirements of the use case.
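The idea of storing only per-entry hashes can be sketched as follows. The in-memory `Ledger` class is a hypothetical stand-in for the permissioned ledger, and the `source_id` field merely stands in for the digital signature of the data source, which is omitted here; none of this reflects the Hyperledger Fabric API.

```python
import hashlib
import json


def entry_digest(entry):
    """SHA-256 over a canonical JSON serialisation of a single CV entry,
    so that key order does not change the digest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Ledger:
    """In-memory stand-in for the ledger; it stores digests, never raw data."""

    def __init__(self):
        self._records = {}

    def record(self, entry, source_id):
        """Called by the data source (company or institution) for one entry."""
        digest = entry_digest(entry)
        self._records[digest] = source_id
        return digest

    def verify(self, entry):
        """Return the source that recorded this exact entry, or None."""
        return self._records.get(entry_digest(entry))
```

A verifier who receives an entry from the applicant recomputes the digest and looks it up; any change to the entry yields a different digest and the lookup fails. Plain hashes of low-entropy entries can be guessed by enumeration, so a production design would likely add a per-entry secret value, which this sketch leaves out.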

Identity Management. To ensure that an applicant can create only one account in the system, identity verification and management mechanisms must be put in place. However, the account management has to be flexible to reflect multiple affiliations and guarantee the user’s privacy. For this, one can employ a distributed self-sovereign identity (SSI) management approach. SSI systems combine distributed ledgers and cryptographic primitives to create immutable identity records. The individual maintains a number of claims or attributes (that define the identity in use) received from any number of organizations, including the state, in a networked ecosystem that is open to any organization to participate (e.g., to issue credentials) [12]. Each organization can decide whether to trust specific credentials based on which organization verified or attested them. The difference from other identity management approaches, such as centralized or federated identity management solutions, is that once the claims are generated, the user controls what to reveal, and can be authorized without involving an intervening authority every time and without being tied to a single provider. Anonymous credential systems [3], such as the ones that are based on zero-knowledge proofs (ZKP) [2], are already integrated in the suggested blockchain technology implementation [1] and can provide strong protection of user privacy.
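The notion that the user controls what to reveal can be illustrated with salted hash commitments. This is a much simpler technique than the ZKP-based anonymous credentials of [2, 3] and is used here only as a sketch; the function names are hypothetical, and the issuer’s signature over the set of commitments is omitted.

```python
import hashlib
import json
import secrets


def commit(name, value, salt):
    """Commitment to one claim: a hash over the salt, claim name and value."""
    data = json.dumps([salt, name, value]).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def issue(claims):
    """Issuer side: returns the commitments (which the issuer would sign and
    a verifier can check) and the disclosures kept privately by the holder."""
    disclosures = {name: (value, secrets.token_hex(16))
                   for name, value in claims.items()}
    commitments = {commit(n, v, s) for n, (v, s) in disclosures.items()}
    return commitments, disclosures


def present(disclosures, reveal):
    """Holder side: select only the claims to be revealed."""
    return {name: disclosures[name] for name in reveal}


def verify(commitments, presentation):
    """Verifier side: every revealed claim must match an issued commitment."""
    return all(commit(n, v, s) in commitments
               for n, (v, s) in presentation.items())
```

In this sketch the holder can prove a degree without exposing, for example, a birth year, because unrevealed claims appear to the verifier only as salted digests.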

Distributed Matching Algorithm. Executing a task with a centralized approach (i.e., executing a piece of code on a single machine) often introduces a single point of failure in a system and can represent a bottleneck for its scalability. Blockchain technology aims at addressing these issues by executing code/smart contracts independently on multiple nodes. Performing the computation on every node makes it possible to ensure trustworthy and reliable execution of the matching algorithm and the elimination of bias, but can be highly inefficient depending on the complexity of the task, the data volume, and the network structure. We therefore plan to leverage a multi-agent systems approach to define an adaptive mechanism that chooses a fraction of nodes to perform the computational task independently, in order to avoid a single point of failure and to ensure a certain confidence level in the task execution in an efficient manner.
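One possible way to reason about the fraction of nodes is a simple probabilistic model: assuming each node returns a wrong result independently with probability p, choose the smallest odd committee whose majority is correct with the required confidence. This model and the function names below are assumptions for illustration; they are not the multi-agent mechanism the paper plans to define.

```python
import math
import random


def majority_failure_prob(k, p):
    """Probability that more than half of k independent nodes are faulty,
    each being faulty with probability p (binomial tail)."""
    need = k // 2 + 1
    return sum(math.comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(need, k + 1))


def committee_size(total_nodes, p, confidence):
    """Smallest odd committee size whose majority is correct with at least
    the given confidence; falls back to all nodes if none qualifies."""
    for k in range(1, total_nodes + 1, 2):  # odd sizes avoid ties
        if 1.0 - majority_failure_prob(k, p) >= confidence:
            return k
    return total_nodes


def pick_committee(nodes, p, confidence, rng=None):
    """Randomly sample the nodes that will run the matching task."""
    rng = rng or random.Random()
    return rng.sample(nodes, committee_size(len(nodes), p, confidence))
```

For example, with 20 nodes, p = 0.1 and a target confidence of 0.99, this model selects 5 nodes instead of all 20.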

Gamification. This approach can be used to foster applicant engagement and promote participation in the system. A very simple interface can let applicants express their interest in a well-matching opening, which can also be used for automatic channel creation. The applicants can additionally be rewarded for providing trustworthy information, for example by having more possibilities for channel creation or by being allowed to explicitly initiate a channel with a company for an open position, even if there is not a perfect match.

3 State of the Art

Solutions that partially address the problems listed in the previous section are not entirely novel. HR tech already exists and aims at optimising the internal HR processes in the companies, as well as automatising the matching algorithms.

People analytics methods apply big data procedures to classify and better match the skills held by a current or potential employee with open positions and critical areas of development [7]. This approach is internal to the company, and it is currently seen as an extension of the competences of the HR department. Although it brings a series of advantages for people management, such as finding better applicants, supporting smarter hiring decisions, promoting employee performance and increasing expertise retention, it normally requires companies to hire people well-trained in the domain, and the hiring department may lack in-depth knowledge of the specific domains.

On another level, leveraging the vast amount of data they own, platforms such as Glassdoor (footnote 1) and LinkedIn (footnote 2) offer companies an external one-stop solution, providing as a service an attempt to optimise HR processes and increase the reach of job openings. They support matches between the skills defined in the job description and the skills of the candidate (defined based on his own input or the recommendations of his peers), while providing automatic data extraction from CVs uploaded to the platform, a feedback management system for employees (Glassdoor), and an “Easy Apply” service that makes it possible to apply directly for a position using only the data already uploaded to the platform.

These concrete approaches, despite already being in use, address only a few of the research questions defined before. Such third parties store and process sensitive information about individuals; therefore, they are required to comply with a number of international and local laws and regulations regarding personal data management. In addition, the privacy of individuals can be violated. For instance, LinkedIn offers a service that allows candidates to indicate that they are actively searching for new employment, while there is no guarantee that this information will not become known to the current employer. In addition, from our point of view, one of the most important limitations is the lack of independent validation of the data submitted by the user or his peers.

There already exist some proposals that aim at automating and improving the process of validating education certificates, including some blockchain-based approaches [6]. To address the need of employers to have all the diplomas that confirm the education history of the candidate manually verified by the corresponding issuer, Gresch et al. propose a blockchain-based system for managing diplomas. The authors review existing initiatives and propose to use the public Ethereum blockchain to store the hashes generated over the diplomas in PDF format, issued by the corresponding institutions. While using a public permissionless blockchain network allows for the availability and automated verification of diplomas, the fundamental role of a mechanism for verifying an institution’s eligibility to issue such certificates was only mentioned in the proposed solution, yet not addressed. Additionally, all the solutions existing in the literature aim at validating either a single diploma or certificate, or a full profile, as an elementary unit of information (meaning an image or an encrypted PDF). Instead, we propose a different approach: (i) we consider each entry as a single experience stored in a raw textual format, as a part of the user record, (ii) a hash is generated separately for each single entry and stored for validation purposes on the distributed ledger. This enables more effective integration and usage of the matching algorithms.

EduCTX [13], a blockchain-based higher education credit and grading platform, addresses the aforementioned issue of validating a new network node by requiring the members of the EduCTX distributed ledger network, on receiving a registration (joining) request from a node (institution), to verify official information about this institution. Yet, there are no further details regarding the process of confirming the eligibility to issue education certificates, or the number of network members that must perform the verification.

BCDiploma (footnote 3) initiated the creation of a new global certification standard, with the first use case dedicated to the certification of diplomas. BlockFactory (footnote 4) proposes a solution for increasing diploma security using blockchain technology. However, no information about either of these solutions can be found in scientific publications; therefore, a rigorous analysis of such platforms is missing.

4 Discussion

Before the system design presented in Sect. 2 can be implemented and integrated, multiple matters have to be considered from the technical, legal, and social points of view. In what follows, we highlight specific questions to be addressed.

HR Tech and Blockchain-Based Intelligent Data Management. Further developing HR tech, employing people analytics and using intelligent automated approaches to filter potential candidates and execute matching tasks all aim at ensuring optimised, fair, and bias-free HR processes. Because the algorithms are normally designed and reviewed by multiple actors, the bias of any single person tends to be limited, which reduces the effect of bias in automatic matching compared to a single-view approach. However, the models and logic employed in these approaches are still developed and implemented by humans; thus, it is not possible to rule out accidental mistakes and malicious behaviour that can influence the desired fairness of the aforementioned approaches. The same considerations are also relevant for blockchain technology, in particular for smart contracts. Regardless of the fact that they are deployed and executed on multiple nodes, it can be very challenging to ensure that the implementation corresponds to the desired logic of the contract.

It is challenging to define the right format and granularity of the data, such as evaluations provided by the previous employer, so that a sufficient level of expressiveness and the desired level of anonymity are ensured simultaneously. Additionally, the evaluation of previous working experience currently requires de-fuzzification approaches, especially when candidates are compared. Having access to appropriately formatted structured data will make it possible to avoid de-fuzzification, which often introduces bias into the evaluation.

Compliance and Other Legal Aspects. The EU General Data Protection Regulation (GDPR) [11] aims at regulating the way personally identifying data are gathered and consumed, and at defining the legal rights of people regarding the use of their data. The data related to education and working history, especially when they include an evaluation of the candidate’s skills, fall into the category of personally identifiable data [8]. Therefore, compliance with the GDPR is required. Management of sensitive data is expensive, as it requires dedicated infrastructure and security and compliance specialists, and in case of data breaches, a company’s reputation can be damaged, which can result in additional financial losses. Companies often look to outsource the data management, which is also expensive and can lead to even more challenges when trying to ensure practical compliance with the GDPR.

Employing blockchain technology can bring certain benefits, such as keeping a complete history of all the actions performed over the data (the obligation to keep records of processing activities, Article 30). However, in the case of blockchain technology, as well as other types of machine data processing algorithms, including the use of artificial intelligence techniques, it is not straightforward to ensure the right to erasure (Article 17), which states that the data controller (the entity in charge of processing personal data) must erase the data without undue delay if requested by the data owner [14]. Mechanisms for data extraction, encryption and anonymisation are some of the measures required to ensure GDPR compliance, together with the adoption of an SSI approach and the use of hash functions to avoid storing raw data on the blockchain.

Social Implications. While attempting to attain fair and bias-free matching between candidates and job profiles and employing intelligent approaches for the data management, we do not aim to devalue and/or completely remove the social component of HR processes. The goal of a framework, such as the one proposed in this paper, is to optimise the process of matching and selection, thus enabling more quality time for the personal assessment of the chosen candidate, i.e., person-to-person interaction between the candidate and HR employee.

In order to benefit from the added value brought by such a framework, it is important to ensure wide adoption, which is only possible if ease of use is guaranteed. Thus, developing intuitive interfaces that make it simple to express requirements and provide the required information is of paramount importance. The algorithms may need to be adapted to ensure proper reachability and matching without overwhelming the candidates and the companies with too many job offers or applicants to consider, respectively.

Blockchain technology enables trustworthiness of the data regarding the evaluation of an employee by the company. However, it is important to provide a possibility for the employee to express his opinion about the company as well. The reviews provided by the employees can then be used in order to build the reputation of the companies and weight their validations. Providing such functionality, however, is one of the directions of our future work.

5 Conclusion

This paper proposes a high-level architecture of a framework providing trusted support for the job placement task. Relying on the properties of intelligent data analysis approaches, blockchain technology, and self-sovereign identity management, we aim at optimising the process of candidate selection and limiting bias in it, while ensuring the privacy of the candidate’s data. We aim at achieving optimisation by removing the need to search in multiple sources and by making use of structured verified data and a distributed matching algorithm; and fairness and privacy by employing an SSI system and by relying on a matching algorithm that enables reproducibility and bias-free execution (thus ensuring objectivity of the algorithm, and therefore of candidate selection). However, as discussed in the previous section, there are still a number of open questions to be considered before such a system can be implemented and widely adopted.