Remote Data Mining And Management Job In Data Science And Analytics

Python developer needed for a resume parser project


We are looking for a skilled Python developer to assist in the development of an intelligent resume parser for our ATS.

Key requirements for the resume parser include:

- Accurate and efficient parsing of a predetermined set of structured fields based on a resume's main sections and sub-sections

- Extraction of clean, disaggregated, and normalized data for each field, e.g. Bachelor of Arts in Economics --> { degree: Bachelor of Arts, major: Economics }

- Ability to automatically handle documents of multiple formats, including doc, docx and PDF. This includes both text- and image-based documents (using OCR), as well as multi-columned documents

- Output of parsed resume data in a standardized JSON format

- Ability to programmatically test the accuracy of the parser against an existing sample resume dataset for continuous improvement

- Ability to programmatically train the parser with a growing sample resume dataset to continually increase its accuracy
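
As a rough illustration of the normalization, JSON output, and accuracy-testing requirements above, here is a minimal sketch. The names `split_degree` and `field_accuracy`, the "<degree> in <major>" splitting rule, and the exact-match scoring are assumptions made for this example; they are not taken from the existing parser.

```python
import json
import re

# Assumed pattern: "<degree> in <major>", e.g. "Bachelor of Arts in Economics".
DEGREE_PATTERN = re.compile(
    r"^\s*(?P<degree>.+?)\s+in\s+(?P<major>.+?)\s*$", re.IGNORECASE
)


def split_degree(text):
    """Split a raw degree string into separate degree and major fields."""
    match = DEGREE_PATTERN.match(text)
    if match is None:
        # No " in " separator found: keep the whole string as the degree.
        return {"degree": text.strip(), "major": None}
    return {"degree": match.group("degree"), "major": match.group("major")}


def field_accuracy(predicted, expected):
    """Fraction of expected fields whose predicted value matches exactly."""
    if not expected:
        return 0.0
    correct = sum(
        1 for key, value in expected.items() if predicted.get(key) == value
    )
    return correct / len(expected)


parsed = split_degree("Bachelor of Arts in Economics")
print(json.dumps(parsed, indent=2))
print(field_accuracy(parsed, {"degree": "Bachelor of Arts", "major": "Economics"}))
```

Averaging a score like `field_accuracy` over a labeled sample dataset would give one simple way to track parser accuracy between versions; a production parser would likely need fuzzier matching than exact string equality.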

Desirable skills and qualifications for the task include:

- Excellent command of the Python programming language

- Solid understanding of and practical experience with document parsing / data extraction

- Experience with natural language processing (NLP) and relevant Python libraries such as NLTK and/or spaCy

Work on the parser has already begun. The current version is able to:

- Load and read a variety of document types with Python's textract library

- Break the resume into sections based on a data dictionary of common section headings

- Extract basic fields such as name, email, phone number, and skills
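
The section-splitting and basic field extraction steps listed above could look roughly like the following. This is only a sketch: the `SECTION_HEADINGS` dictionary, the function names, and the email pattern are hypothetical stand-ins, since the posting does not show the actual data dictionary or extraction rules.

```python
import re

# Hypothetical data dictionary: canonical section name -> common headings.
SECTION_HEADINGS = {
    "experience": {"experience", "work experience", "employment history"},
    "education": {"education", "academic background"},
    "skills": {"skills", "technical skills"},
}

# Simplified email pattern; real-world addresses can be more varied.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def canonical_section(line):
    """Return the canonical section name if the line is a known heading."""
    key = line.strip().rstrip(":").lower()
    for name, headings in SECTION_HEADINGS.items():
        if key in headings:
            return name
    return None


def split_sections(text):
    """Group non-empty lines under the most recent recognized heading."""
    sections = {"header": []}  # lines that appear before the first heading
    current = "header"
    for line in text.splitlines():
        name = canonical_section(line)
        if name is not None:
            current = name
            sections.setdefault(current, [])
        elif line.strip():
            sections[current].append(line.strip())
    return sections


def extract_email(text):
    """Return the first email-like string in the text, or None."""
    match = EMAIL_PATTERN.search(text)
    return match.group(0) if match else None


sample = """Jane Doe
jane.doe@example.com

Work Experience
Analyst, Acme Corp

Education:
Bachelor of Arts in Economics
"""
sections = split_sections(sample)
print(sections["education"])
print(extract_email(sample))
```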

Extraction of entities such as skills is currently done with a keyword search against a local database. However, this method is both time- and resource-intensive, which is why a greater emphasis on machine learning and NLP will be necessary going forward.
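
For context, keyword-based skill extraction of the kind described above is often sketched as an n-gram lookup against a known vocabulary. The example below is an assumption-laden illustration: `KNOWN_SKILLS` is a hypothetical in-memory stand-in for the local database, and the posting does not say how the real lookup is implemented.

```python
import re

# Hypothetical in-memory stand-in for the local skills database.
KNOWN_SKILLS = {"python", "machine learning", "natural language processing", "sql"}
MAX_WORDS = max(len(skill.split()) for skill in KNOWN_SKILLS)


def find_skills(text):
    """Return known skills found by checking every word n-gram against a set."""
    words = re.findall(r"[a-z0-9+#]+", text.lower())
    found = set()
    for size in range(1, MAX_WORDS + 1):
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            if candidate in KNOWN_SKILLS:
                found.add(candidate)
    return sorted(found)


print(find_skills("Experienced in Python, SQL and machine learning."))
```

A lookup like this only finds skills that are already in the vocabulary and spelled the same way, which is one reason a learned approach such as named entity recognition may generalize better.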

Keywords: Python, resume parser, Textract, PDFMiner, OCR, natural language processing, NLP, spaCy, NLTK, named entity recognition, NER, data extraction
About the recruiter
Member since Sep 5, 2017
Cooper
from California, United States

Candidate shortlisted and hired

Hiring open till - Feb 11, 2021

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$19.42

Cost

Similar Projects

Web scraper expert needed

1. Evaluate whether it is feasible to scrape product data from 2 specific websites
2. If so, develop an automated scraper to extract product data on a daily basis

Scrape a list of sourcesecurity

We want to hire a web scraper who is able to scrape different websites.

Here is the description of the current task:

We want to scrape the list of companies form. In doing so we need to keep various information attached to each co...

Use Public Data Sources to Create a Prospect List from AirBnB data.

I need help putting together a sales prospecting list using AirBnB data. This is a prospecting list that will be used to reach out to current AirBnB hosts and offer property management services through Direct Mail, email, and social media.

I...

LinkedIn Bot

Given a list of companies and possibly their websites, I want a script/bot that will scrape LinkedIn for people whose title is CEO or Founder.

The bot will give a first name, last name, and website from LinkedIn if we don't already...