Remote Web Development Job In IT And Programming

Need help designing a web scraping solution to surface external calendar information

Hi, I need help solving an architectural/algorithmic problem around scraping calendars on the web so that the times we show in our app update in near real time.

The current design has a recurring job scrape a site for all of the free times available in a month for a service that is 30 minutes long (our service duration interval) and store those free times in our database. Then, when a user comes to our site and chooses their services, we pull the free times from our cache, work out which times can accommodate the aggregate service duration, and show those to the user.

The issue is that one of the external scheduling providers has a bug where they don't show all of the times they have available to book. So the optimization of scraping only the times for a 30-minute duration, and using those free times at runtime to calculate which ones will work for a longer duration, goes out the window. The only other option we can think of is scraping for each individual time interval, but that makes the scrape/caching job take far too long to be feasible.

The scraping script takes about 2 min per pass and we need to run it for at least 2 months (the current and the next). For 100 stylists using our current implementation, the job takes 2 min * 100 stylists * 2 months * 1 (30-min service duration) = 400 min; parallelized across 8 machines, it would run in less than an hour. However, running the job for every possible aggregate service duration would take 2 min * 100 stylists * 2 months * 16 (8 hours in 30-min intervals) = 6400 min = 106+ hours, and even parallelized across 8 machines it would still take 13+ hours, which is too long.

We're looking for a fresh pair of eyes that can see a solution we aren't seeing, one that lets our times sync with the external scheduling provider on a regular, relatively short interval.
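For readers picturing the runtime step described above, here is a minimal sketch of resolving which cached 30-minute free times can accommodate a longer aggregate duration. The function name `starts_that_fit` and the data shapes are hypothetical, not taken from the poster's codebase, and the sketch assumes the cached 30-minute times are complete, which is exactly the assumption the provider's bug breaks.

```python
from datetime import datetime, timedelta

SLOT = timedelta(minutes=30)  # base interval scraped by the recurring job


def starts_that_fit(free_starts, total_duration, slot=SLOT):
    """Return the start times from which enough back-to-back free slots exist.

    free_starts: iterable of datetimes, each marking a free `slot`-long window.
    total_duration: timedelta for the aggregate duration of the chosen services.
    """
    free = set(free_starts)
    # Ceiling division: a 45-minute booking still needs two 30-minute slots.
    needed = -(-total_duration // slot)
    return sorted(
        start
        for start in free
        if all(start + i * slot in free for i in range(needed))
    )


if __name__ == "__main__":
    day = datetime(2024, 7, 1)
    cached = [
        day.replace(hour=9),
        day.replace(hour=9, minute=30),
        day.replace(hour=10),
        day.replace(hour=11),
    ]
    # A 60-minute booking needs two consecutive free slots.
    for start in starts_that_fit(cached, timedelta(minutes=60)):
        print(start.strftime("%H:%M"))
```

With the sample data, only 09:00 and 09:30 qualify for a 60-minute booking, because 10:00 and 11:00 have no free slot immediately after them.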

***Some people have asked why the script takes so long. It has to automate choosing a service and walking through the booking flow of the other website to see the times available for each day.

The site is (removed by Toogit admin). If you click on one service, add some more, and click save, it will show you the calendar, where you have to click on each individual day to see its hours. ***
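The job-duration estimates quoted in the description can be reproduced with a short calculation. This is only a sketch of the arithmetic: the function names are hypothetical, and it assumes the work splits evenly across machines with no overhead, which a real scraping job is unlikely to achieve.

```python
def job_minutes(minutes_per_pass, stylists, months, durations):
    """Total sequential scraping time, in minutes."""
    return minutes_per_pass * stylists * months * durations


def wall_clock_hours(total_minutes, machines):
    """Elapsed hours, assuming perfectly even parallelization (an idealization)."""
    return total_minutes / machines / 60


# Current design: one pass per stylist per month, 30-minute duration only.
current = job_minutes(2, 100, 2, 1)
# Alternative: one pass per aggregate duration (8 hours in 30-minute steps = 16).
every_duration = job_minutes(2, 100, 2, 16)

if __name__ == "__main__":
    print(current, round(wall_clock_hours(current, 8), 2))
    print(every_duration, round(wall_clock_hours(every_duration, 8), 2))
```

The totals come out to 400 and 6400 minutes, matching the figures in the post; on 8 machines that is a little under an hour versus a little over 13 hours.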
About the recruiter
Member since Nov 11, 2022
Kemal Morris
from Zamboanga Peninsula, Philippines

Skills & Expertise Required

Automation Data Extraction Scripting Selenium Web Scraping 

Open for hiring

Apply before - Aug 27, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$17.25

Cost

Offer to work on this project closes in 54 days!