Resume parsing
Want to talk about it?
It’s a boring time to be a recruiter. SAP knows perfectly well how to do the recruiter’s job, so there is nothing left to do. Position requests from managers are processed automatically and posted to internal and external web portals or sent to agencies. Feedback goes back to managers automatically. Everything is integrated. A CV is read from the mailbox [email protected], parsed down to the bones, and stored in the candidate database. Interviews are initiated from a mobile phone, and rooms are reserved. Boring, no fun at all.
Everything is clear except the CV. We know that every resume is built on a typical skeleton: personal info, contacts, work experience. Each part can be formalized, split into its components, and analyzed by a number of factors and variants of appearance.
We understand that the first and last names may match the file name, are never written with punctuation characters, always start with a capital letter or are written in all capitals, and sit in the top part of the document.
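The name rules above can be turned into a rough heuristic. What follows is only a sketch under stated assumptions: the CV is already available as a list of text lines, names are ASCII-only, and the function names `looks_like_name` and `guess_name`, as well as the limit of ten top lines, are made up for illustration.

```python
import re

# One name token: "Anna" or "ANNA". Letters only, no digits or punctuation.
# Assumption: ASCII names only; real CVs would need Unicode-aware classes.
NAME_TOKEN = re.compile(r"[A-Z][a-z]+|[A-Z]{2,}")

def looks_like_name(line):
    """True if the line is two or three capitalized or all-caps words."""
    tokens = line.split()
    return 2 <= len(tokens) <= 3 and all(NAME_TOKEN.fullmatch(t) for t in tokens)

def guess_name(lines, filename="", top_n=10):
    """Pick a name from the top lines, preferring one that matches the file name."""
    stem = re.sub(r"[^a-z]", "", filename.lower())
    candidates = [ln.strip() for ln in lines[:top_n] if looks_like_name(ln.strip())]
    # Prefer a candidate whose every token also appears in the file name.
    for cand in candidates:
        if all(t.lower() in stem for t in cand.split()):
            return cand
    # Otherwise fall back to the first name-shaped line, if there is one.
    return candidates[0] if candidates else None

cv_lines = ["Curriculum Vitae", "John Smith", "john.smith@example.com"]
print(guess_name(cv_lines, "john_smith_cv.pdf"))  # John Smith
print(guess_name(cv_lines))                       # Curriculum Vitae
```

The second call shows why the file name matters: without it, a heading such as "Curriculum Vitae" has exactly the same shape as a name.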
We also understand that a contact phone number has a fixed number of digits, that its patterns are well known, and that it is placed somewhere near the name or e-mail address.
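The phone number rules can be sketched in the same spirit. The digit range of 10 to 12, the regular expressions, and the function name `find_phone` are assumptions for illustration; real numbering plans vary by country. The sketch collects phone-shaped strings, keeps those with an acceptable digit count, and picks the one closest to a line containing an e-mail address.

```python
import re

# A phone-shaped run: optional "+", then digits mixed with common separators.
PHONE_LIKE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
# A deliberately loose e-mail pattern, good enough to locate the contact block.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def find_phone(lines, min_digits=10, max_digits=12):
    """Return the digits of the phone-like string nearest to an e-mail line."""
    email_rows = [i for i, ln in enumerate(lines) if EMAIL.search(ln)]
    best = None  # (distance to nearest e-mail line, digits)
    for i, ln in enumerate(lines):
        for m in PHONE_LIKE.finditer(ln):
            digits = re.sub(r"\D", "", m.group())
            if not min_digits <= len(digits) <= max_digits:
                continue  # wrong length: a date range, an ID, and so on
            # With no e-mail in the text, fall back to distance from the top.
            distance = min((abs(i - r) for r in email_rows), default=i)
            if best is None or distance < best[0]:
                best = (distance, digits)
    return best[1] if best else None

cv_lines = [
    "John Smith",
    "john.smith@example.com",
    "+1 (555) 123-4567",
    "2010-2015 Acme Corp",
]
print(find_phone(cv_lines))  # 15551234567
```

The date range "2010-2015" is phone-shaped but has only eight digits, so the length check filters it out.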
We understand that work experience is a sequence of blocks of the same type, each with a company, period, position, and description of job functions. It is just a table that can be retrieved from the CV somehow. Say the CV is exported to XML format, where we can easily find the repeating elements, those that appear more than once.
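Finding the repeating elements in such an export could look roughly like this. It is a minimal sketch using Python's standard `xml.etree.ElementTree`; the XML layout and the tag names (`cv`, `experience`, `job`, and so on) are invented for the example, since the real export format is not specified here. The idea is simply that any tag occurring more than once under the same parent is a candidate table row.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def repeated_blocks(xml_text):
    """Map each tag that repeats under one parent to its rows of child fields."""
    root = ET.fromstring(xml_text.strip())
    found = {}
    for parent in root.iter():
        counts = Counter(child.tag for child in parent)
        for tag, n in counts.items():
            if n > 1:  # the same element appears more than once: a table row
                rows = [
                    {field.tag: (field.text or "").strip() for field in child}
                    for child in parent
                    if child.tag == tag
                ]
                found.setdefault(tag, []).extend(rows)
    return found

# Hypothetical export; tag names are assumptions, not a real SAP format.
SAMPLE = """
<cv>
  <name>John Smith</name>
  <experience>
    <job><company>Acme</company><period>2010-2015</period><position>Developer</position></job>
    <job><company>Globex</company><period>2015-2020</period><position>Team Lead</position></job>
  </experience>
</cv>
"""

for row in repeated_blocks(SAMPLE)["job"]:
    print(row["company"], row["period"], row["position"])
# Acme 2010-2015 Developer
# Globex 2015-2020 Team Lead
```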