Remote Data Mining And Management Job In Data Science And Analytics

Converting JSON or Avro files to Parquet

I need to convert files in S3 from JSON, Avro or other row-based formats into the Parquet columnar format using an AWS service like EMR or Glue.

I already have Python code that converts JSON to Parquet, but the process is very manual: it accounts for NULL values in the JSON elements by looking at each and every field/column and putting in a default value if there's a NULL.

I am looking for an easier, less manual way of doing this using something like Spark or other similar methods.

Since I am working exclusively on AWS, I am only looking for solutions using AWS services such as EMR, Glue or a similar AWS service.
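
A minimal PySpark sketch of the kind of approach described above, under stated assumptions: it would run on a cluster where Spark is available (for example EMR), the function names `convert_to_parquet` and `run` are hypothetical, and reading Avro assumes the spark-avro module is available on the cluster. Spark's reader represents missing or null fields as nulls in nullable columns, and Parquet can store those nulls, so the sketch does not fill in per-field defaults. Column types come from schema inference unless an explicit schema is passed.

```python
"""Sketch: convert row-based files (JSON or Avro) in S3 to Parquet with PySpark."""

# Formats this sketch accepts. "avro" assumes the spark-avro module is on the
# cluster's classpath, which is an assumption about the environment.
SUPPORTED_FORMATS = {"json", "avro"}


def convert_to_parquet(spark, source, target, source_format="json", schema=None):
    """Read `source` in `source_format` and write it to `target` as Parquet.

    Fields that are missing or null in the input become nulls in nullable
    columns, so no per-field default values are filled in here.
    """
    if source_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported source format: {source_format!r}")

    reader = spark.read.format(source_format)
    if schema is not None:
        # An explicit schema pins column types instead of relying on inference.
        reader = reader.schema(schema)

    df = reader.load(source)
    # Overwrite mode replaces any existing output under `target`.
    df.write.mode("overwrite").parquet(target)
    return df


def run(source, target, source_format="json"):
    """Hypothetical entry point for an EMR step or spark-submit job."""
    # Imported here so the module can be loaded on a machine without PySpark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rows-to-parquet").getOrCreate()
    try:
        convert_to_parquet(spark, source, target, source_format)
    finally:
        spark.stop()
```

A job script could call, for example, `run("s3://example-bucket/raw/", "s3://example-bucket/parquet/")`, where both paths are placeholders. By default Spark's JSON reader expects one JSON object per line, so multi-line JSON documents would need the reader's `multiLine` option.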

I am thus looking for someone with experience using AWS EMR, Glue, Python, PySpark, etc.

Please note: since this is going to be a learning experience for me, the work will be done in a live session on Skype, Zoom, Google Hangouts, etc., where you code while I watch and you answer any questions I have along the way.

Thus, I will pay in one-hour increments. The initial contract will be for one hour; if we need more time, we can set up another one-hour contract, and so on.

Please only apply if you're OK with all these conditions and have the required experience.
About the recruiter
Member since Nov 11, 2022
Pankaj Doot
from Gandaria, Indonesia

Skills & Expertise Required

Amazon S3, Amazon Web Services, Apache Spark, PySpark, Python

Open for hiring

Apply before: Aug 8, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$34.41

Cost

Offer to work on this project closes in 17 days!

Similar Projects

Video stream setup / OpenH264

C++ developer needed to set up a stream from UE4 to our accounts on social networks.

At first glance everything is fine and all parameters are the same as the OBS settings, but in fact no platform except Twitch accepts our stream. It doesn't mat...

Developer needed for a legal tech startup, with expertise in artificial intelligence & DS

The job includes development of a legal tech website that will work as a virtual assistant to lawyers in legal drafting, review of legal documents, legal research, etc. The skills required to complete the job are knowledge of Python/Django/AI/d...

Azure, AWS, and GCP Tutorials Wanted

We're looking for someone to create a few cloud tutorial videos for Azure, AWS, and Google Cloud that we can use on our YouTube channel.

The tutorials should be engaging and visually appealing, and must include as many practical demonstrati...

ANGULAR JS - Fix issue with recently developed Web site

Need an Angular JS expert to investigate why the back-end database for my recently developed site has disappeared, and to future-proof it against this happening again. A bit of background: I am a one-person business at this stage, so there is no tech team and I w...

Setup AWS

- [x] Set up Amazon EC2, S3, RDS, CloudFront on AWS
- [ ] 1. LAMP installation on CentOS
- [ ] 2. WHM installation
- [ ] 3. CDN setup for one domain with S3

Work should be done remotely.