- The raw data is level 2 order book snapshots and incremental updates at tick frequency (each update = new observation) for a single asset.
- To simplify the task and limit the scope, you will work with a summarized dataset in the form of CSV files with the following columns: timestamp, bidPrice_x1, bidPrice_x2, ..., askPrice_x1, askPrice_x2, ..., where bidPrice_x1 = the average price at which a market sell order of size x1 would be executed if it arrived at this instant.
- The scope of this task is limited to the summarized dataset. If you believe that you could do much better if you could only calculate different features from the raw orderbook data, we could discuss it as a separate job.
- Expect to work with 10M-100M rows, 10-20 columns with possible subsampling.
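To make the column definition above concrete, here is a minimal sketch of how a value such as bidPrice_x1 could be derived from one side of a raw L2 snapshot. The function name `avg_execution_price` and the (price, quantity) level layout are assumptions for illustration only; the summarized CSV files already contain these values, so this is not part of the deliverable.

```python
from typing import List, Tuple


def avg_execution_price(levels: List[Tuple[float, float]], size: float) -> float:
    """Average fill price of a market order of `size` walking `levels`.

    `levels` is one side of the book as (price, quantity) pairs, best price
    first: bids in descending price order for a market sell, asks in
    ascending price order for a market buy. (Hypothetical layout.)
    """
    if size <= 0:
        raise ValueError("size must be positive")
    remaining = size
    cost = 0.0
    for price, qty in levels:
        fill = min(remaining, qty)  # take what this level can provide
        cost += fill * price
        remaining -= fill
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("not enough depth to fill the order")
    return cost / size


# Toy bid side: 2 units at 100.0, 3 units at 99.5, 10 units at 99.0.
bids = [(100.0, 2.0), (99.5, 3.0), (99.0, 10.0)]
print(avg_execution_price(bids, 1.0))  # 100.0
print(avg_execution_price(bids, 4.0))  # (2 * 100.0 + 2 * 99.5) / 4 = 99.75
```

The larger the order size, the further the average price moves away from the best quote, which is why columns for several sizes carry information about the shape of the book.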
Goal:
For each row, output a summary price P_t such that P_t = E[ (best bid price + best ask price) / 2 at t + dt | data available at t ], where dt is on the scale of 1-10 minutes, tbd.
For example, the simplest summary price of the order book would just be the mid price between best bid and ask, but it misses the information content of the order book imbalance (if there is more volume on the bid than on the ask, the price will on average go up) and of momentum/mean-reversion time series dynamics. You need to take the shape of the order book and the time series into account in some basic fashion. The goal is not to outperform the market with such a prediction, but just to reasonably summarize 80% of the information content in the order book L2 dynamics that is essentially common knowledge to market participants. Obviously, you can only use past data for prediction.
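As a rough sketch of the two ingredients described above, the following shows one way to build the regression target (the mid price observed dt later) and one possible imbalance-style feature from the summarized columns. The names `future_mid` and `depth_tilt` are hypothetical, and treating the smallest-size columns as a stand-in for the best bid and ask is an assumption, not something the dataset guarantees.

```python
from bisect import bisect_right
from typing import List, Optional


def future_mid(ts: List[float], mid: List[float], dt: float) -> List[Optional[float]]:
    """For each row i, the last observed mid at or before ts[i] + dt.

    Returns None where the data ends before ts[i] + dt, so those rows can be
    dropped from training. `ts` must be sorted in ascending order.
    """
    if not ts:
        return []
    out: List[Optional[float]] = []
    last_ts = ts[-1]
    for t in ts:
        if t + dt > last_ts:
            out.append(None)  # the future value is not observable yet
            continue
        j = bisect_right(ts, t + dt) - 1  # last index with ts[j] <= t + dt
        out.append(mid[j])
    return out


def depth_tilt(bid_small: float, ask_small: float,
               bid_large: float, ask_large: float) -> float:
    """Deep mid minus shallow mid, an illustrative imbalance proxy.

    A large sell order slips less when the bid side is thick, so a positive
    value means the bid side is deeper than the ask side.
    """
    return 0.5 * ((bid_large + ask_large) - (bid_small + ask_small))


ts = [0.0, 30.0, 60.0, 90.0, 120.0]        # seconds
mid = [100.0, 100.1, 100.2, 100.1, 100.3]  # (bidPrice_x1 + askPrice_x1) / 2
print(future_mid(ts, mid, dt=60.0))
print(depth_tilt(99.9, 100.1, 99.8, 100.4))
```

The target uses future data only for training; at prediction time the features must come from rows at or before t. Lagged mid changes could be added in the same way to capture momentum or mean reversion.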
Deliverable:
You should deliver a script that reads the data and outputs the summarized price for each input row, and explain to me how it works. You can use R (preferred) or Python on a single server, no cluster solutions. Please stick to the simplest and fastest algorithms, essentially linear models only, and discuss with me if you go for anything more complicated than OLS/Kalman filter.
I'll provide access to an RStudio Server for R, tbd for Python.
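Since the data may run to 100M rows on a single server, one option that stays within plain OLS is to accumulate the normal equations chunk by chunk instead of loading everything at once. The sketch below is illustrative only: the class name `StreamingOLS` is hypothetical, and pure-Python loops are used for self-containment; in practice the per-chunk products would be computed with a vectorized array or matrix library.

```python
from typing import List, Sequence


class StreamingOLS:
    """OLS with intercept, fitted by accumulating X'X and X'y over chunks."""

    def __init__(self, n_features: int) -> None:
        k = n_features + 1  # +1 for the intercept column
        self.xtx = [[0.0] * k for _ in range(k)]
        self.xty = [0.0] * k

    def update(self, rows: Sequence[Sequence[float]], targets: Sequence[float]) -> None:
        """Add one chunk of feature rows and targets to the running sums."""
        for x, y in zip(rows, targets):
            z = [1.0] + list(x)
            for i, zi in enumerate(z):
                self.xty[i] += zi * y
                for j, zj in enumerate(z):
                    self.xtx[i][j] += zi * zj

    def solve(self) -> List[float]:
        """Solve (X'X) beta = X'y by Gaussian elimination with pivoting."""
        k = len(self.xty)
        a = [row[:] + [b] for row, b in zip(self.xtx, self.xty)]
        for c in range(k):
            p = max(range(c, k), key=lambda r: abs(a[r][c]))
            if abs(a[p][c]) < 1e-12:
                raise ValueError("X'X is singular; features are collinear")
            a[c], a[p] = a[p], a[c]
            for r in range(c + 1, k):
                f = a[r][c] / a[c][c]
                for m in range(c, k + 1):
                    a[r][m] -= f * a[c][m]
        beta = [0.0] * k
        for c in reversed(range(k)):
            s = a[c][k] - sum(a[c][m] * beta[m] for m in range(c + 1, k))
            beta[c] = s / a[c][c]
        return beta  # [intercept, coef_1, ..., coef_n]


# Toy data generated from y = 1 + 2 * x1 - 3 * x2, fed in two chunks.
model = StreamingOLS(n_features=2)
model.update([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [1.0, 3.0, -2.0])
model.update([(1.0, 1.0), (2.0, 1.0)], [0.0, 2.0])
print(model.solve())
```

A natural choice, though not one the task prescribes, is to regress the future mid change (future mid minus current mid) on the features and then output P_t as the current mid plus the fitted change, which keeps the regression well scaled.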
About you:
You have experience working with order book level 2 and time series data, or at least have a solid understanding of the relevant methods. You value simplicity and don't throw all the fancy machine learning stuff at the solution just because it is cool and makes you look more sophisticated.
I would like to hire several people for this job for different assets and exchanges. Feel free to ask questions and discuss the task and conditions.
Work from anywhere, remote job
40 hrs / week, fixed type
Cost: $477.61