Online job boards have become a major resource for talent acquisition and job search. These portals raise difficult matching and search problems. The basic match is between the skills, occupations and qualifications required for a position and those present in a candidate's profile. Because the list of job skills is long and a single skill can go by many names, direct keyword matching is ineffective: results are substandard and can miss a closely matched candidate whose profile does not use the exact skill terms. A measure of semantic similarity between skills is therefore needed to improve the relevance of the results.
This article proposes a measure of semantic similarity between skills and jobs through an approach based on business knowledge. This knowledge can be represented by an ontology graph.
An ontology in computer science is a data model that represents a set of concepts in a domain and the relationships between these concepts. For example, in artificial intelligence, software engineering, biomedical informatics, and information architecture, an ontology is defined as a form of representation of knowledge about the world.
Sense4data proposes the use of ontology graphs in its job seeker/job offer matching software in order to address the above-mentioned issues. This approach relies on existing job ontologies such as ESCO [1] and O*NET [2], enriched with data collected from online job boards such as LinkedIn, to derive a similarity score between skills and jobs. Enrichment is necessary so that our ontology can take into account changes in the labor market as it evolves. The ontology also helps to solve the problem of mapping the multiple representations of a skill or occupation to a single standard form.
The constructed graph has different types of edges, which define the types of relationships between the entities at the nodes: whether a skill is necessary or optional for a certain occupation, whether one occupation is a sub-occupation of another, whether one skill is a sub-skill of another, and so on. I recommend that you go through the ontologies mentioned in this article in order to have a clearer understanding of the different types of nodes and relationships that we could have on our graph.
This graph through its different types of nodes and relationships works like a dictionary, describing, identifying and classifying occupations, skills and professional qualifications relevant to the labor market.
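To make the idea of typed nodes and typed edges concrete, here is a minimal sketch in Python. It is not the Sense4data implementation: the class `OntologyGraph`, the node types, the relation names (`essential_skill`, `optional_skill`, `sub_skill_of`) and the sample entries are all hypothetical and only loosely mirror the kind of relations found in ESCO.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A typed, directed relationship between two nodes of the ontology."""
    source: str
    relation: str
    target: str


class OntologyGraph:
    """Toy ontology graph: typed nodes plus typed, directed edges."""

    def __init__(self):
        self.node_types = {}                # label -> "occupation" or "skill"
        self._outgoing = defaultdict(list)  # label -> list of outgoing Edge

    def add_node(self, label, node_type):
        self.node_types[label] = node_type

    def add_edge(self, source, relation, target):
        # Refuse edges whose endpoints were never declared as nodes.
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        self._outgoing[source].append(Edge(source, relation, target))

    def targets(self, source, relation):
        """Labels reachable from `source` through edges of type `relation`."""
        return [e.target for e in self._outgoing.get(source, [])
                if e.relation == relation]


# Hypothetical sample data, for illustration only.
graph = OntologyGraph()
graph.add_node("data scientist", "occupation")
graph.add_node("machine learning", "skill")
graph.add_node("deep learning", "skill")
graph.add_node("sql", "skill")
graph.add_edge("data scientist", "essential_skill", "machine learning")
graph.add_edge("data scientist", "optional_skill", "sql")
graph.add_edge("deep learning", "sub_skill_of", "machine learning")

essential = graph.targets("data scientist", "essential_skill")
optional = graph.targets("data scientist", "optional_skill")
```

Used this way, the graph behaves like the dictionary described above: asking for the `essential_skill` targets of an occupation returns the skills recorded as necessary for it, and the `optional_skill` targets return the optional ones.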
To compute the similarity between two entries (skills or occupations), we first vectorize them with state-of-the-art embedding models such as BERT [3] or fastText [4]. We then project each entry onto our ontology graph, using the cosine similarity [5] computed between the entry's vector and the previously computed vectors of all the nodes of the graph.
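The projection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the production code: the three-dimensional vectors are made-up toy values standing in for real BERT or fastText embeddings, the function names `cosine_similarity` and `project_onto_graph` are hypothetical, and keeping only the single best node is a simplification.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def project_onto_graph(entry_vector, node_vectors):
    """Return (label, score) of the node most similar to the entry."""
    scores = {label: cosine_similarity(entry_vector, vector)
              for label, vector in node_vectors.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Toy vectors (assumption): real embeddings have hundreds of dimensions.
node_vectors = {
    "python programming": [0.9, 0.1, 0.0],
    "data analysis": [0.6, 0.7, 0.1],
    "nursing care": [0.0, 0.1, 0.9],
}
entry_vector = [0.8, 0.2, 0.0]  # stands in for an embedded free-text skill

label, score = project_onto_graph(entry_vector, node_vectors)
```

With these toy values, the entry lands on the "python programming" node, because its vector points in almost the same direction. In practice the node vectors would be computed once and cached, since only the entry changes between queries.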
Once each entry is positioned on the nodes with the highest cosine similarity scores, we calculate the distance between these nodes within the graph using artificial intelligence-based graph algorithms.
This distance represents both the lexical and semantic proximity and the proximity from a labor market point of view.
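The article does not say which graph algorithm is used, so the sketch below swaps in a plain weighted shortest path (Dijkstra's algorithm) purely to illustrate what a distance between two nodes can look like. The relation costs, the sample edges and the conversion `1 / (1 + distance)` are all assumptions made for this example.

```python
import heapq

# Hypothetical cost per relation type: a lower cost means a closer relationship.
RELATION_COST = {"essential_skill": 1.0, "optional_skill": 2.0, "sub_skill_of": 1.0}

# Hypothetical edges: (source, target, relation type).
EDGES = [
    ("data scientist", "machine learning", "essential_skill"),
    ("data scientist", "sql", "optional_skill"),
    ("data analyst", "sql", "essential_skill"),
    ("deep learning", "machine learning", "sub_skill_of"),
]


def build_adjacency(edges):
    """Undirected weighted adjacency list built from typed edges."""
    adjacency = {}
    for source, target, relation in edges:
        cost = RELATION_COST[relation]
        adjacency.setdefault(source, []).append((target, cost))
        adjacency.setdefault(target, []).append((source, cost))
    return adjacency


def shortest_distance(adjacency, start, goal):
    """Dijkstra's algorithm; returns infinity when no path exists."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, cost in adjacency.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")


def similarity_from_distance(distance):
    """Map a distance in [0, inf) to a score in (0, 1]; one possible choice."""
    return 1.0 / (1.0 + distance)


adjacency = build_adjacency(EDGES)
distance = shortest_distance(adjacency, "data scientist", "data analyst")
score = similarity_from_distance(distance)
```

In this toy graph the two occupations are connected only through the shared skill "sql", optional for one and essential for the other, so the path cost reflects how strongly each occupation depends on the skill they have in common.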
Beyond providing a representative similarity score between two entities of the labor market lexicon, this ontology, through its continuous enrichment, will make it possible to follow the evolution of the labor market, such as:
In addition to meeting the requirements of the advertised position, other factors such as the preferences of job seekers and recruiters, cultural fit, ability to adapt to the company's market and ability to evolve with the organization play an important role in employee selection. Finally, based on the knowledge embedded in our ontology, a training recommendation feature has been implemented to help candidates acquire new skills, whether to move towards other career opportunities or to change careers entirely.
At Sense4data, we favor the use of graph theory. Given a set of nodes and connections that abstracts the problem, graph theory provides a useful tool to quantify and simplify the many moving parts of dynamic systems. Studying graphs within such a framework provides answers to many layout, networking, optimization, matching and ranking problems. Graph theory can be used to model many types of relationships and processes in physical, biological, social and information systems, and has a wide range of useful applications such as:
Article written by Adnane El-Mansouri, Data scientist.
[1] ESCO, European Commission. ec.europa.eu/esco/portal/home?resetLanguage=true&newLanguage=fr
[2] O*NET OnLine. onetonline.org
[3] Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing. ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
[4] fastText. fasttext.cc
[5] Cosine Similarity: Understanding the math and how it works. machinelearningplus.com/nlp/cosine-similarity