Remote Web Development Job In IT And Programming

Backend developer needed to build a custom data scraper & parser service & API


The goal of this project is to scrape, sanitize, and organize the California State Bar Association data set. Specifically, the work involves crawling the website for all attorney records. The data is classified by a variety of status parameters, and we are only interested in a specific subset.

From the individual attorney results, each unique data point will need to be stored against a key (potentially the state bar number). We do not have a specified schema, so we expect the schema to be developed based on the data. However, we will provide a list of specific fields that we'd want to reconcile.
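As a rough illustration of storing data points against a key and letting the schema emerge from the data, a minimal sketch follows. The field names (`bar_number`, `name`, `email`) and the "first non-empty value wins" merge rule are assumptions made for the example, not requirements from this posting.

```python
def merge_records(rows, key_field="bar_number"):
    """Group scraped rows by key, keeping the first non-empty value per field.

    `key_field` is a hypothetical name for the state bar number column.
    """
    store = {}
    for row in rows:
        key = str(row.get(key_field) or "").strip()
        if not key:
            continue  # a row without a usable key cannot be stored
        record = store.setdefault(key, {})
        for field, value in row.items():
            if isinstance(value, str):
                value = value.strip()
            # Only fill fields that are still empty for this key.
            if value and not record.get(field):
                record[field] = value
    return store


def discover_schema(store):
    """Return the sorted union of field names seen across all records."""
    fields = set()
    for record in store.values():
        fields.update(record)
    return sorted(fields)
```

Running `discover_schema` after a full crawl would give a first draft of the column list, which could then be compared with the fields to be reconciled.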

For example:
Matching and filling out a company field based on physical address and/or e-mail address when the company name is not present in the results.
We will also provide certain keywords that must be mapped to specific fixed values.
We will provide specific rules describing how the presence of certain keywords, drawn from data we possess, should add values to a given record. There are also rules about which data we'd want to exclude.
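The examples above could be sketched roughly as follows. Everything here is an assumption for illustration: the field names (`company`, `email`, `address`, `notes`), the free-mail domain list, and the shape of the keyword rules would all be replaced by the rules supplied for the project.

```python
import re

# Hypothetical list of shared e-mail providers that say nothing about an employer.
FREE_MAIL = {"gmail.com", "yahoo.com", "outlook.com"}


def email_domain(email):
    """Return the lower-cased domain of an e-mail address, or '' if absent."""
    if not email or "@" not in email:
        return ""
    return email.rsplit("@", 1)[1].strip().lower()


def normalize_address(address):
    """Lower-case and collapse punctuation so near-identical addresses match."""
    return re.sub(r"[^a-z0-9]+", " ", (address or "").lower()).strip()


def fill_company(records):
    """Fill a missing company from records sharing an e-mail domain or address."""
    by_domain, by_address = {}, {}
    for rec in records:
        company = rec.get("company")
        if not company:
            continue
        domain = email_domain(rec.get("email"))
        if domain and domain not in FREE_MAIL:
            by_domain.setdefault(domain, company)
        addr = normalize_address(rec.get("address"))
        if addr:
            by_address.setdefault(addr, company)
    for rec in records:
        if rec.get("company"):
            continue
        domain = email_domain(rec.get("email"))
        addr = normalize_address(rec.get("address"))
        rec["company"] = by_domain.get(domain) or by_address.get(addr) or ""
    return records


def apply_keyword_rules(record, rules, text_field="notes"):
    """Set fixed values when a keyword occurs in the record's text.

    `rules` maps keyword -> (field, fixed_value); this shape is an assumption.
    """
    text = (record.get(text_field) or "").lower()
    for keyword, (field, value) in rules.items():
        if keyword.lower() in text:
            record[field] = value
    return record
```

Plain substring matching is the simplest possible choice here; the actual rules may call for word-boundary or exact matching instead.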

As an output, we will need all of the data in a single file in a machine-readable format (CSV, XML, etc.). Additionally, providing the data via a documented API that delivers the output in XML/JSON could be part of the initial scope or another phase.
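One possible shape for the single-file output, using only the Python standard library, is sketched below. It assumes the records are held in a dict keyed by a hypothetical `bar_number` field; the JSON helper shows only the serialization an API endpoint might return, not the API itself.

```python
import csv
import io
import json


def to_csv(store, key_field="bar_number"):
    """Serialize {key: record} to CSV text with the key as the first column."""
    fields = sorted({f for rec in store.values() for f in rec} - {key_field})
    fields.insert(0, key_field)
    buf = io.StringIO()
    # restval="" leaves a blank cell where a record lacks a field.
    writer = csv.DictWriter(buf, fieldnames=fields, restval="", lineterminator="\n")
    writer.writeheader()
    for key in sorted(store):
        writer.writerow({**store[key], key_field: key})
    return buf.getvalue()


def to_json(store):
    """Serialize the same data as JSON, e.g. for an API response body."""
    return json.dumps(store, indent=2, sort_keys=True)
```

Writing the result of `to_csv` to disk gives the single machine-readable file; the same in-memory structure feeds both formats, so the file and a later API would stay consistent.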

While this custom scraper and parser project is initially focused on the California State Bar Association's members, there may be opportunities to work on subsequent projects involving other state bar data, depending on the success of this initial engagement.

About the recruiter
Member since Mar 14, 2020
Ravi Singh
from Antioquia, Colombia

Skills & Expertise Required

Data Scraping, Web Scraping

Open for hiring. Apply before Sep 13, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$17.16

Cost

Offer to work on this project closes in 21 days!
