Remote Data Mining And Management Job In Data Science And Analytics

R programmer needed to help create a function to process documents


I have assembled a large amount of text for a research project. The text is stored in .docx files (not ideal, I know, but the best option given the source), which are nested in a series of folders (~70). Each document starts with a section of metadata, followed by the main text, and concludes with another section of metadata.

I would like a function in R that takes a directory location as input and does the following: scans that directory for all .docx files (a package exists to do that part); separates the different documents; identifies the main text of each document and separates out both sections of metadata; parses the metadata for several important categories and populates columns with those values; pastes the metadata itself into separate columns; and reassembles a data frame in which the main text has been separated out from the metadata.
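As a rough illustration of the shape such a function could take, here is a minimal sketch in R. It is only a sketch under stated assumptions, not the required deliverable. The posting does not say how the metadata is delimited or which package reads the files, so the following are all hypothetical: the marker lines "BODY" and "END BODY", the "Key: value" metadata layout, the field names "Title" and "Date", the use of the officer package, and the function names themselves. The sketch also treats each .docx file as a single document.

```r
# Read the paragraphs of one .docx file as a character vector.
# Assumption: the 'officer' package is installed (the posting names no package).
read_docx_paragraphs <- function(path) {
  parts <- officer::docx_summary(officer::read_docx(path))
  parts$text[parts$content_type == "paragraph"]
}

# Split one document into leading metadata, main text, and trailing metadata.
# Assumption: the main text sits between two marker lines (hypothetical patterns).
split_document <- function(paragraphs,
                           body_start = "^BODY$",
                           body_end = "^END BODY$") {
  start <- grep(body_start, paragraphs)[1]
  end <- grep(body_end, paragraphs)[1]
  if (is.na(start) || is.na(end) || end <= start) {
    stop("could not locate both body markers")
  }
  n <- length(paragraphs)
  list(
    meta_head = paragraphs[seq_len(start - 1)],
    body      = paragraphs[seq_len(end - start - 1) + start],
    meta_tail = paragraphs[seq_len(n - end) + end]
  )
}

# Pull the value of one "Key: value" line out of the metadata (NA if absent).
get_field <- function(lines, key) {
  pattern <- paste0("^", key, ":\\s*")
  hit <- grep(pattern, lines, value = TRUE)
  if (length(hit) == 0) NA_character_ else sub(pattern, "", hit[1])
}

# Scan a directory tree for .docx files and build one data frame row per file.
process_directory <- function(dir,
                              fields = c("Title", "Date"),  # hypothetical categories
                              reader = read_docx_paragraphs,
                              ...) {
  files <- list.files(dir, pattern = "\\.docx$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  rows <- lapply(files, function(f) {
    parts <- split_document(reader(f), ...)
    meta <- c(parts$meta_head, parts$meta_tail)
    parsed <- vapply(fields, function(k) get_field(meta, k), character(1))
    data.frame(
      file = f,
      as.list(parsed),                                   # one column per parsed field
      meta_head = paste(parts$meta_head, collapse = "\n"),
      text      = paste(parts$body, collapse = "\n"),
      meta_tail = paste(parts$meta_tail, collapse = "\n"),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}
```

A call such as process_directory("path/to/folders") would then return one row per file, with the parsed fields, both raw metadata blocks, and the main text in separate columns. The reader argument is kept separate so that the file-reading package can be swapped without touching the splitting and parsing logic.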

The deliverable is therefore an R script for this function. A freelancer who is experienced and comfortable working in R, specifically with text-as-data and with writing simple loops and functions, is required.

I use R for basic dataset construction, variable manipulation, and statistical modeling, but am inexperienced at writing loops and functions. I have written code that does what I want for a single document, which can be used as a guide. I am also happy to provide more information about the data itself and to answer any questions.
About the recruiter
Member since Jul 8, 2017
Severine T.
from Scotland, United Kingdom

Skills & Expertise Required

R 

Candidate shortlisted and hired
Hiring open till - Dec 4, 2020

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$19.42

Cost

