We have a Python code base that calculates an NLP match score by comparing project data against every candidate profile on our platform. This currently takes 50-60 seconds and consumes a lot of CPU and memory (we use PostgreSQL on an AWS EC2 instance). The NLP results themselves are perfect and we do not want to change them; only the processing speed needs to improve. Our master code base is in Ruby, which exchanges data with Python through RabbitMQ.
We are trying to:
1. Reduce the NLP processing time to LESS THAN 1 second, EVEN as the database of candidates scales.
2. Drastically reduce CPU/memory usage while running the NLP analyses in Python.
We are looking for someone to do the following to optimize NLP processing speed:
1. Refactor the code to remove the parts of it that are not needed
2. Optimize Gunicorn workers to process matching jobs in parallel (leading to faster processing and lower memory usage)
3. Cache and index candidate and project strings for faster processing
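As an illustration of step 3, one common pattern is to precompute and cache a feature vector per candidate once, so each incoming project is scored with a single vectorized matrix product instead of a Python loop over every profile. This is only a sketch under assumptions: it pretends the match score is cosine similarity over hashed bag-of-words vectors, and `vectorize` is a hypothetical placeholder for the real (unchanged) NLP scoring logic.

```python
# Sketch of cached, vectorized candidate matching (step 3).
# Hypothetical: assumes the match score can be expressed as cosine similarity
# over per-candidate feature vectors; vectorize() stands in for the real logic.
import numpy as np

def vectorize(text, dim=64):
    # Placeholder featurizer: hashed bag-of-words into a fixed-size vector.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Build the candidate matrix ONCE (e.g. at worker start, or refreshed when a
# profile changes) and keep it cached in memory; per-request work then becomes
# one matrix-vector product over all candidates at once.
candidates = ["senior python developer", "ruby on rails engineer", "data scientist nlp"]
candidate_matrix = np.vstack([vectorize(c) for c in candidates])  # cached

def match_scores(project_text):
    # Vectorized cosine similarities against every cached candidate.
    return candidate_matrix @ vectorize(project_text)

scores = match_scores("python nlp developer")
best = candidates[int(np.argmax(scores))]
```

The key design point is that the per-candidate work moves out of the request path entirely; scoring cost then grows with one NumPy matrix product rather than with repeated per-profile NLP passes.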
We need this job done in a week and do not think Steps 1 and 2 will take much time.
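For step 2, a minimal Gunicorn configuration along these lines is a common starting point. This is a sketch, not our current setup: the worker count heuristic and the specific values are assumptions to be tuned against the instance's actual CPU and memory; the setting names themselves are standard Gunicorn options.

```python
# Hypothetical gunicorn.conf.py for parallel matching workers (step 2).
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1  # common starting heuristic, tune per instance
worker_class = "sync"      # CPU-bound NLP scoring gains little from async workers
max_requests = 500         # recycle workers periodically to cap memory growth
max_requests_jitter = 50   # stagger recycling so workers don't all restart at once
preload_app = True         # load models/caches once, share pages via copy-on-write
timeout = 120              # allow long matching jobs before a worker is killed
```

`preload_app` matters most here: loading the cached candidate data before forking lets all workers share it read-only instead of duplicating it per process.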
About the recruiter: Marshal David L, from Cesar, Colombia. Member since Mar 14, 2020.