Good job? Work the data!

Contributed by: Michael Reusens, Wilfried Lemahieu, Luc Sels, Bart Baesens

This article first appeared in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to receive our feature articles, or follow us @DataMiningApps. Do you also wish to contribute to Data Science Briefings? Shoot us an e-mail and let’s get in touch!

Lowering unemployment is a key topic on the agendas of most governments. In Flanders (7.44% unemployment in June 2015 [1]), for example, the government has recently decided to raise the retirement age, making it ever more important for people to find interesting jobs. The employment services are an important party in this context. They fulfill the role of labor market mediators, whose task is to bridge the gap between job seekers and employers.

This short article gives an overview of one of our ongoing research projects in collaboration with the Flemish employment services (VDAB), in which we investigate how data mining techniques can be adapted and applied to help the employment services fulfill their role as labor market mediator. The research project that we discuss here investigates which vacancies are best recommended to which job seekers.

Finding a job is no easy task. With over 163,000 new vacancies posted last year at the Flemish employment services, it is impossible to go through all of them to find those that match your interests. This problem, in which there is too much information to process yourself, is called the information overload problem. As a solution, research has suggested systems that select those items that are likely to be most relevant for the user. These systems are called recommender systems. Recommender systems have been successfully deployed in contexts such as online retail, in which they can increase sales and improve customer satisfaction. Recommender systems have also been suggested in the context of job search.

The Recommender systems handbook [2] classifies the job search domain as a domain with a high degree of risk, hence requiring good scrutability. A high-risk domain is one in which users tend to be intolerant of bad recommendations. People complaining about poor job recommendations from the employment services are not hard to find. The level of scrutability refers to how well the reasoning behind the selection of the recommended items is explained. Being able to explain very clearly to the user why the system selected these specific items will increase the appreciation for the system, thereby reducing the risk. Because of this contextual requirement, content-based recommender systems are widely adopted by employment services, among them the Flemish employment services.

Content-based job recommender systems work by finding those vacancies whose description (function, location, required competences, …) best matches the user’s profile (desired function, location, possessed competences, …). The advantage of this type of recommender system is that it produces stable recommendations: it will usually only recommend vacancies that are sufficiently close to the desires the user indicated in his/her profile. Because of this, its recommendations are easy to explain, which makes the system suitable for a high-risk recommendation context such as job recommendation.

However, this type of system has several important drawbacks. First of all, it is very tedious for users to fill in and maintain extensive profiles, and for employers to fill in an elaborate structured description for each vacancy they publish. Just think about how many questions a service such as LinkedIn asks you on a daily basis! The requirement of extensive user input often leads to poorly filled-in user profiles and vacancies, causing poor recommendations: garbage in, garbage out! Another disadvantage is that these systems require users to fully express their interests in terms of the features offered by the system. This can be tedious, as mentioned before, or may not always be possible, due to the complex nature of human interest.

In a first experiment, we tracked the browsing behavior of over 200,000 users of the employment services website for one year, and compared the vacancies they looked at themselves (using a keyword-based search engine) with the job types they explicitly listed as desired in their profiles. The results are striking: an average user has 2.8 desired job types in his/her profile, but looks at vacancies covering 10.3 job types not mentioned in this profile. These numbers clearly show that a pure content-based system misses out on big parts of a job seeker’s interests.
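To make the content-based matching described above more concrete, here is a minimal Python sketch. It is an illustration only, not the system used by the employment services: the `Profile` and `Vacancy` classes, the `match_score` function and the weights 0.5 / 0.2 / 0.3 are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vacancy:
    vacancy_id: str
    function: str
    location: str
    competences: frozenset  # competences required by the employer


@dataclass(frozen=True)
class Profile:
    desired_functions: frozenset
    desired_locations: frozenset
    competences: frozenset  # competences the job seeker possesses


def match_score(profile, vacancy):
    """Return a score in [0, 1]; higher means closer to the profile."""
    function_match = 1.0 if vacancy.function in profile.desired_functions else 0.0
    location_match = 1.0 if vacancy.location in profile.desired_locations else 0.0
    if vacancy.competences:
        covered = len(vacancy.competences & profile.competences)
        competence_match = covered / len(vacancy.competences)
    else:
        competence_match = 1.0  # nothing is required, so nothing is missing
    # Hypothetical weights, chosen only for illustration.
    return 0.5 * function_match + 0.2 * location_match + 0.3 * competence_match


def recommend_content_based(profile, vacancies, top_n=3):
    """Rank vacancies by how well they match the explicit profile."""
    ranked = sorted(vacancies, key=lambda v: match_score(profile, v), reverse=True)
    return [v.vacancy_id for v in ranked[:top_n]]


profile = Profile(
    desired_functions=frozenset({"nurse"}),
    desired_locations=frozenset({"Leuven"}),
    competences=frozenset({"first aid", "dutch"}),
)
vacancies = [
    Vacancy("v3", "welder", "Leuven", frozenset({"welding"})),
    Vacancy("v2", "nurse", "Gent", frozenset({"first aid", "english"})),
    Vacancy("v1", "nurse", "Leuven", frozenset({"first aid", "dutch"})),
]
print(recommend_content_based(profile, vacancies))  # ['v1', 'v2', 'v3']
```

Each part of the score maps directly to a profile field, which is what makes this kind of recommendation easy to explain. The sketch also shows the weakness discussed above: a vacancy for a job type the user never entered in the profile can only score on location and competences, however often the user browses such vacancies.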

In an attempt to better capture the full spectrum of user interest, we have developed and implemented a recommender system based purely on the user’s implicit feedback, instead of on the user’s profile. Implicit feedback comprises all the signals a user gives the system about his/her preferences without having to be aware that he/she is conveying a preference. An example of implicit feedback used in our system is the browsing behavior on the employment services’ website: a user who revisits a specific vacancy several times and looks at it for a longer time shows more interest in that vacancy than a user who visits it only once, or looks at it only briefly.

Because we no longer use any structured data describing job seekers or vacancies, recommending vacancies that a user will probably like becomes slightly harder. We use a popular recommendation technique called collaborative filtering. Collaborative filtering is based on the principle that if two users have had a similar interest in items up until now, they will probably also like the same items in the future. The first step in our collaborative filtering strategy is finding users who have given feedback similar to that of the job seeker we are generating recommendations for. Once we have found these similar people, we can recommend vacancies they have looked at, excluding those the job seeker has already seen (because the job seeker already knows about these).
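The two steps described above (find similar users, then recommend what they looked at and the job seeker has not yet seen) can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the implementation evaluated in the project: the `interest_score` formula, the use of cosine similarity and the neighbourhood size `k` are all hypothetical choices.

```python
import math
from collections import defaultdict


def interest_score(visits, seconds_viewed):
    """Turn revisits and viewing time into a single implicit-feedback score.

    The formula is an assumption: more visits and longer viewing both
    raise the score, and log1p dampens very long sessions.
    """
    return visits * math.log1p(seconds_viewed)


def cosine_similarity(a, b):
    """Cosine similarity between two {vacancy_id: score} dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[v] * b[v] for v in shared)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_collaborative(target, feedback, k=2, top_n=3):
    """Recommend vacancies to `target` from the k most similar users.

    feedback maps each user to a {vacancy_id: score} dictionary.
    """
    target_scores = feedback[target]
    # Step 1: find the users whose feedback is most similar to the target's.
    neighbours = sorted(
        ((cosine_similarity(target_scores, scores), user)
         for user, scores in feedback.items() if user != target),
        reverse=True,
    )[:k]
    # Step 2: collect vacancies those users viewed, skipping already-seen ones.
    candidate_scores = defaultdict(float)
    for similarity, user in neighbours:
        if similarity <= 0.0:
            continue  # users with no overlap tell us nothing
        for vacancy, score in feedback[user].items():
            if vacancy not in target_scores:
                candidate_scores[vacancy] += similarity * score
    ranked = sorted(candidate_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [vacancy for vacancy, _ in ranked[:top_n]]


feedback = {
    "ann": {"v1": interest_score(3, 120), "v2": interest_score(1, 30)},
    "bob": {"v1": interest_score(2, 90), "v2": interest_score(1, 20),
            "v3": interest_score(2, 60)},
    "cem": {"v4": interest_score(1, 45)},
}
print(recommend_collaborative("ann", feedback))  # ['v3']
```

In this toy data, "bob" browsed the same vacancies as "ann", so the vacancy only he viewed ("v3") is recommended to her. "cem" shares no vacancies with "ann" and therefore contributes nothing. Note that no profile or vacancy description is used anywhere.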

Evaluating this collaborative filtering system, and comparing it to the content-based system currently used by the employment services, is ongoing research. Our first experiments suggest interesting differences between them. Offline evaluation has shown that the collaborative filtering recommender gives more diverse recommendations, and has a recall that is five times higher than that of the content-based recommender system. Recall is the percentage of the vacancies a user looked at on his/her own that also appear in the set of recommended vacancies. A system with a higher recall is better at mimicking a user’s search behavior. Ongoing experiments include testing by employment service experts, followed by live user testing by means of A/B tests. It will be immensely interesting to see whether these online evaluations support our preliminary findings.
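As an illustration of the offline metric, the sketch below computes recall in Python. It assumes the standard definition, the share of the vacancies a user viewed on his/her own that also appear among the recommendations, and averages it over users. The function names and the toy data are our own and do not come from the actual evaluation.

```python
def recall(recommended, viewed):
    """Share of the vacancies a user viewed that were also recommended."""
    viewed = set(viewed)
    if not viewed:
        return 0.0
    return len(viewed & set(recommended)) / len(viewed)


def mean_recall(recommendations, held_out_views):
    """Average recall over all users with at least one held-out view.

    Both arguments map a user to a collection of vacancy ids.
    """
    users = [user for user, views in held_out_views.items() if views]
    if not users:
        return 0.0
    total = sum(
        recall(recommendations.get(user, []), held_out_views[user])
        for user in users
    )
    return total / len(users)


# Two of the four vacancies the user viewed were recommended.
print(recall(["v1", "v2", "v3"], ["v2", "v3", "v4", "v5"]))  # 0.5
```

In an offline setting, the viewed vacancies would typically be browsing data held back from the recommender, so that the score reflects how well the system anticipates behavior it has not seen.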

The key takeaways of this short article can be summarized as follows:

  • Recommending jobs to people is a high-risk recommendation domain. Careful consideration is required when deciding what techniques and data to use;
  • Only focusing on a job seeker’s explicit profile will cause us to miss out on a big part of his/her job interests;
  • So far, our experiments suggest that using a job seeker’s implicit feedback in combination with collaborative filtering will lead to more diverse recommendations that better mimic the job seeker’s natural search behavior.


  1. Unemployment numbers in Flanders.
    Retrieved from
  2. Kantor, P. B., Rokach, L., Ricci, F., & Shapira, B.
    (2011). Recommender systems handbook. Springer.