Remote Data Mining And Management Job In Data Science And Analytics

Write Program To Merge List Of Leads By Finding Similar Company And Contact Names


We have scraped data on buildings in New York City.

Each building can have up to 3 owners and 1 management company (or 4 owners and no management company).

(NYC buildings are expensive, and owners often partner to buy them.)

Each owner can own an indefinite number of buildings (depending on how wealthy they are).

Given that each building is a partnership of numerous owners, new business entities (companies) are created when each building is purchased.

That means that each building has owners (people) as well as a company that owns the building.

That also means that each owner can be associated with numerous different companies.

There are a few many-to-many relationships here (however, there is always only one building).
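
The relationships described above could be modeled roughly as follows. This is only a sketch: the class name, field names, and integer ids are assumptions for illustration, as is the idea that each building is held by exactly one owning company and that the management company is stored as an id.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Building:
    building_id: int
    address: str
    company_id: int  # owning entity (assumed: exactly one per building)
    owner_ids: List[int] = field(default_factory=list)  # ids of owner contacts
    management_company_id: Optional[int] = None

    def validate(self) -> None:
        # Up to 3 owners with a management company, up to 4 without one.
        limit = 3 if self.management_company_id is not None else 4
        if len(self.owner_ids) > limit:
            raise ValueError(
                f"building {self.building_id}: more than {limit} owners"
            )


def companies_of_owner(buildings: List[Building], contact_id: int) -> Set[int]:
    """Companies a contact is tied to through the buildings they co-own."""
    return {b.company_id for b in buildings if contact_id in b.owner_ids}
```

Keeping the owner list on the building means the owner-to-company link is derived rather than stored twice, so a merged contact only has to be updated in one place.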

In addition, an owner sometimes uses the same company to buy 2 buildings. But since we're dealing with scraped data, which is only as good as the person who entered it on the city's platform, there are very often slight differences in spelling between the two company names, or even between the owner names, making a straight comparison impossible.

(For example, there could be one building owned by The Carlton Group (company name), which is owned by John Marks and Greg Smith, and another building owned by Carlton Group, which is owned by Jonathan Marks and Gregory Smith.)
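
One possible way to compare such names is to normalize them first and then score them with Python's standard-library `difflib`. The sketch below is an illustration, not a specification from this posting: the noise-word list, the 0.8 threshold, and the function names are all assumptions that would need tuning against the real data.

```python
import re
from difflib import SequenceMatcher

# Assumed list of words that carry little identity; adjust for the real data.
_NOISE_WORDS = {"the", "llc", "inc", "corp", "ltd", "lp"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and remove noise words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in _NOISE_WORDS)


def similarity(a: str, b: str) -> float:
    """Character-level similarity between 0.0 and 1.0 on normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    # The threshold is a guess; too low merges strangers, too high misses typos.
    return similarity(a, b) >= threshold


print(is_similar("The Carlton Group", "Carlton Group"))
print(is_similar("John Marks", "Jonathan Marks"))
print(is_similar("John Marks", "Greg Smith"))
```

A plain character ratio will not catch every nickname (for example "Bill" and "William"), so a lookup table of known name variants may be needed on top of it.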

So far, we've been manually comparing the data to look for duplicates.

The goal is to write a program that will merge and then divide all the data into 3 master lists of:

companies
contacts
buildings

so that all similar companies are merged into one company, and all similar contacts are merged into one contact.
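
Merging "all similar" names into one record can be sketched as grouping names with a union-find structure, where any two names judged similar end up in the same group. Everything here is an assumption for illustration: the function names, the simple `difflib` comparison with a 0.8 threshold, and the rule that the first name seen in a group becomes the master name.

```python
from difflib import SequenceMatcher


def default_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Assumed comparison rule; swap in a stricter one as needed."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_names(names, similar=default_similar):
    """Map every name to the master name of its group (first one seen)."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the group's root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair; fine for modest lists, slow for very large ones.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the earliest index as root so the first name wins.
                    parent[max(ri, rj)] = min(ri, rj)

    return {name: names[find(i)] for i, name in enumerate(names)}


mapping = cluster_names(["The Carlton Group", "Carlton Group", "Acme Realty LLC"])
print(mapping)
```

Note that this grouping is transitive: if A resembles B and B resembles C, all three are merged even when A and C look different, which is one reason a manual review of the changes remains useful.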

We want the program to include an audit log that shows what the old data was and what the new data is. That will make the manual part easier, since we're just manually looking over what the program changed.
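
An audit log of this kind could be as simple as one row per changed value, written to CSV for review in a spreadsheet. The sketch below assumes records are dictionaries with an `id` key and that a mapping from old names to master names already exists; the column names and function names are hypothetical.

```python
import csv
import io

AUDIT_FIELDS = ["record_id", "field", "old_value", "new_value"]


def apply_with_audit(records, field, mapping):
    """Return (updated_records, audit_rows) without mutating the input."""
    updated, audit = [], []
    for rec in records:
        old = rec[field]
        new = mapping.get(old, old)  # unchanged when no merge applies
        if new != old:
            audit.append(
                {"record_id": rec["id"], "field": field,
                 "old_value": old, "new_value": new}
            )
        updated.append({**rec, field: new})
    return updated, audit


def audit_to_csv(audit_rows) -> str:
    """Serialize audit rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=AUDIT_FIELDS)
    writer.writeheader()
    writer.writerows(audit_rows)
    return buf.getvalue()


records = [
    {"id": 1, "company": "The Carlton Group"},
    {"id": 2, "company": "Carlton Group"},
]
merged, log = apply_with_audit(
    records, "company", {"Carlton Group": "The Carlton Group"}
)
print(audit_to_csv(log))
```

Because only changed values are logged, the reviewer sees exactly what the program altered and nothing else.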

The program will allow us to enter new leads at a later time and run them through the same process.
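
Handling leads that arrive later could work by checking each new name against the existing master list and only adding it when nothing close enough is found. This is a minimal sketch using the standard-library `difflib.get_close_matches`; the function name `resolve_lead`, the 0.8 cutoff, and the plain list used as the master list are assumptions.

```python
from difflib import get_close_matches


def resolve_lead(name, master, cutoff=0.8):
    """Return (master_name, was_added) for a newly entered name."""
    # Compare case-insensitively but keep the original spelling for output.
    by_key = {m.lower(): m for m in master}
    hits = get_close_matches(name.lower(), list(by_key), n=1, cutoff=cutoff)
    if hits:
        return by_key[hits[0]], False
    master.append(name)  # no close match: treat as a new master record
    return name, True


master_companies = ["The Carlton Group"]
print(resolve_lead("Carlton Group", master_companies))
print(resolve_lead("Acme Realty LLC", master_companies))
```
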
About the recruiter
Member since May 20, 2018
Eee Sympo
from Krasnodar, Russia

Skills & Expertise Required

Data Analytics 

Open for hiring

Apply before - Jul 18, 2024

Work from Anywhere

40 hrs / week

Fixed Type

Remote Job

$478.93

Cost

Offer to work on this project closes in 14 days!

Similar Projects

Data Scientist/Analyst for clinical data project (ongoing)

Looking for someone to analyze our large clinical sample database, to clean, analyze and visualize our data. Experience in creating dashboards

- Lead the strategic, statistical thinking and provide valuable insights to the Development team...read more

Advanced D3, JS & Angular project. (Reporting tool)

We are developing a BI reporting platform from scratch. We have several D3 visualisations that need enhancements & also requires us to create Analytic visualisations.

This project requires advanced D3 skills (not static visualisations) Ang...read more

Data analysis and forecasting using Rapidminer

I am looking for a statistician or data analyst who is proficient in Rapidminer. Please get in touch if you have a solid background in data analysing using Rapidminer studio. Following is the task:

Task 2.1) Conduct an exploratory data analy...read more

Matlab machine learning algorithm EASY WORK

Looking to classify images using a simple dataset derived from CIFAR-10 dataset and classify the images using 1 or maybe 2 machine learning models.

You are provided with an image dataset, where there are 10 different categories of objects,...read more

CTO - Back end Architect Development with daily Code review & Team Management.

The Consultant CTO will have the following responsibilities:-
1- Make the whole solution Architecture.
2- Explain and clear doubts of the in-house developers regarding the architecture and development process
3- Every service/API code re...read more