Remote Network And System Administration Job In IT And Networking

Solutions Architect to help me architect my API/Infrastructure

I am running a large-scale computer vision/machine learning platform that works by processing the following:

1) Live streams (RTSP) [https://en.wikipedia.org/wiki/Real_Time_Streaming_Protocol]
2) Videos stored in Amazon S3
3) Images extracted from videos stored in Amazon S3


I want to re-architect my solution for scalability and flexibility as well as fault tolerance.

Currently,

For Videos/Images

1) A video is uploaded to Amazon S3, which triggers an AWS Lambda function that publishes a task to Amazon SQS.

2) My machine learning code (Python 2.7/TensorFlow/OpenCV) pulls tasks from Amazon SQS, downloads the videos from Amazon S3, processes them, and publishes the results as JSON to another Amazon SQS queue.

3) For RTSP streams (live), I manually deploy to servers to continuously process each RTSP stream using similar code (Python 2.7/TensorFlow/OpenCV).
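
To illustrate step 1 above, here is a minimal sketch of a Lambda-style handler that turns an S3 upload notification into one queue message per object. The function names (build_tasks, make_handler, sqs_sender) and the message fields are assumptions for illustration, not the code currently deployed. The sender is passed in so the sketch runs without AWS access.

```python
import json
from urllib.parse import unquote_plus


def build_tasks(event):
    """Turn an S3 event notification into one task dict per uploaded object."""
    tasks = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        tasks.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(s3["object"]["key"]),
        })
    return tasks


def make_handler(send):
    """Build a Lambda-style handler; `send` is any callable taking a JSON string."""
    def handler(event, context=None):
        tasks = build_tasks(event)
        for task in tasks:
            send(json.dumps(task, sort_keys=True))
        return {"published": len(tasks)}
    return handler


def sqs_sender(queue_url):
    """Hypothetical production sender; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the sketch runs without boto3 installed
    sqs = boto3.client("sqs")
    return lambda body: sqs.send_message(QueueUrl=queue_url, MessageBody=body)
```

In a deployment, the handler would be created with something like make_handler(sqs_sender(queue_url)), where queue_url is a placeholder for the real queue.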


Challenges:

-- I have no way to autoscale my worker nodes (processing videos) based on incoming load (the number of messages in Amazon SQS).
-- Each new use-case requires more Amazon SQS queues (for development, staging and production). This becomes very difficult to maintain.
-- Each task requires a different neural network depending on the type of task, the image quality, and so on. Sometimes I have to use deep neural networks that require GPU support, and sometimes I can run the task on a lightweight CPU instance. So each task needs a weight associated with it and must be distributed to a matching instance (for example, Task A requires a GPU, so it should run on a GPU-enabled instance, but Task B can run on CPU or GPU because it is not as compute intensive).
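
One possible direction for the first and third challenges, sketched under assumptions: tasks carry a requires_gpu flag (a hypothetical field, not in the current messages), CPU and GPU workers read from separate queues (the queue names are placeholders), and the worker count is derived from the queue backlog. The backlog figure could come from the SQS ApproximateNumberOfMessages attribute. The per_worker throughput is something that would have to be measured.

```python
import math

# Hypothetical queue names: one CPU queue and one GPU queue per environment.
QUEUES = {"cpu": "tasks-cpu", "gpu": "tasks-gpu"}


def route_task(task):
    """Pick a queue from the task's declared needs.

    `requires_gpu` is an assumed message field; tasks that do not set it
    are treated as light enough for a CPU worker.
    """
    if task.get("requires_gpu", False):
        return QUEUES["gpu"]
    return QUEUES["cpu"]


def desired_workers(backlog, per_worker, min_workers=0, max_workers=10):
    """Number of workers needed to clear `backlog` messages when each worker
    handles `per_worker` messages per scaling interval, clamped to the limits."""
    if per_worker <= 0:
        raise ValueError("per_worker must be positive")
    needed = math.ceil(backlog / per_worker)
    return max(min_workers, min(max_workers, needed))
```

For example, desired_workers(23, 5) asks for 5 workers, and the result never exceeds max_workers no matter how large the backlog grows.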

For live feeds:
-- Live RTSP is very difficult because frames are held in memory: if I don't process each frame immediately, the program either crashes or skips to the next live frame, so I lose frames.
-- Each RTSP task also has a weight, so I need to be able to run some RTSP tasks on GPU and some on CPU.
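
A common way to keep a slow model from crashing the reader is to separate reading from processing with a small bounded buffer that deliberately drops the oldest frames and counts the drops. The sketch below shows that idea only. LatestFrameBuffer and read_stream are hypothetical names, and the frame source is any iterable standing in for a real RTSP capture loop (for instance one built on OpenCV's cv2.VideoCapture).

```python
import threading
from collections import deque


class LatestFrameBuffer:
    """Keeps only the newest frames so a slow consumer never blocks the reader."""

    def __init__(self, maxlen=1):
        self._frames = deque(maxlen=maxlen)  # old frames fall off automatically
        self._lock = threading.Lock()
        self.dropped = 0

    def put(self, frame):
        with self._lock:
            if len(self._frames) == self._frames.maxlen:
                self.dropped += 1  # the oldest frame is about to be overwritten
            self._frames.append(frame)

    def get(self):
        """Return the oldest buffered frame, or None if the buffer is empty."""
        with self._lock:
            return self._frames.popleft() if self._frames else None


def read_stream(source, buffer):
    """Reader loop: `source` is any iterable of frames (a stand-in for RTSP)."""
    for frame in source:
        buffer.put(frame)
```

The reader would run in its own thread while the model calls get() at its own pace. The dropped counter makes frame loss visible instead of silent.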

General issues:
-- I need to maintain a status page so that I can see if all processes are running correctly for better DevOps/Debugging.
-- I have no centralised logging for debugging.
-- If the source RTSP stream is not working (network issues or the live stream is not functional), the task should free up the instance to perform another task until that RTSP stream is back.
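
As a rough sketch of what could sit behind a status page, the registry below records heartbeats from workers and streams, marks anything silent for too long as stale, and lists entries whose instance could be reassigned. StatusRegistry, the state names "running" and "source_down", and the 30-second timeout are all assumptions for illustration. A real setup would persist this somewhere shared rather than in memory.

```python
import time


class StatusRegistry:
    """In-memory record of the last heartbeat from each worker or stream."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._entries = {}

    def heartbeat(self, name, state="running"):
        self._entries[name] = {"state": state, "seen": self._clock()}

    def snapshot(self):
        """Return {name: state}, marking entries with no recent heartbeat 'stale'."""
        now = self._clock()
        report = {}
        for name, entry in self._entries.items():
            late = now - entry["seen"] > self.timeout
            report[name] = "stale" if late else entry["state"]
        return report

    def free_slots(self):
        """Names that are stale or down, so their instance can take other work."""
        return sorted(n for n, s in self.snapshot().items() if s != "running")
```

An RTSP worker that cannot reach its source would report heartbeat(name, state="source_down"), and a scheduler could use free_slots() to decide which instances to hand other tasks.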


All my use-cases are structured as modules. Each module is a separate package in my Python project, with its own instructions for running. Each module runs on either RTSP or video, never both, but I can run Module A on a live stream (RTSP) and Module B on videos independently.
About the recruiter
Member since May 20, 2018
Aditia Resmana
from Maryland, United States

Skills & Expertise Required

Amazon S3, OpenCV, TensorFlow

Open for hiring

Apply before - Jul 16, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$26.85

Cost

Offer to work on this project closes in 14 days!

Similar Projects

Image feature extraction using FasterRCNN with resnet version.

1. Get the open-source Faster R-CNN from git (https://github.com/tensorflow/models/tree/master/research/object_detection)

2. Download model name faster_rcnn_resnet101_coco(faster_rcnn_resnet101_coco_2018_01_28.tar) from git(https://github.com/...read more

Logging Infrastructure ELK Linux DevOps Cloud

We need an expert for this job who has deployed logging infrastructure for a web application using Redis, Filebeat, Elasticsearch, S3, Kibana, and Logstash

1. What are the best practices to design real-time and non-real-time logging Infrastru...read more

Need help in containerizing a windows desktop application

Need to containerize a windows application, run on AWS ECS and store the results on S3.

Windows application dependencies :
1. Microsoft .NET 4.5.2 or higher
2. Visual C++ Redistributable for visual studio 2015 both X86 version and X...read more

Senior Computer vision algorithm developer - for segmentation and matting on mobile

We're looking for an expert computer vision developer for the long term who is comfortable working on both server (testing models) and mobile (implementing via TFLite and CoreML).

** Only apply if you're experienced in TensorFlow Lite
**...read more

Need DevOps / Site Reliability / Server Administration Engineer

Need an AWS administrator who can properly provision and set up tooling (e.g. AWS IAM and security policies, RDS, EC2, ELB, CloudFront). Must also be able to handle patching and set up alarms for billing/performance overruns.

Need at least 1-2...read more