Remote Data Mining And Management Job In Data Science And Analytics

R programmer needed to help create a function to process documents

I have assembled a large amount of text for a research project. The text is stored in .docx files (not ideal, I know, but the best option given the source), which are nested in a series of folders (~70). Each document starts with a section of metadata, continues with the main text, and concludes with another section of metadata.

I would like to create a function in R that takes a directory location as input and scans that directory for all .docx files (a package exists to do that part). It should then separate the different documents, identify the main text of each document and split off both sections of metadata, parse the metadata for several important categories and populate columns with those values, store the metadata itself in separate columns, and finally assemble a data frame in which the main text is separated from the metadata.
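The steps described above could be sketched roughly as follows. This is a minimal illustration in base R under stated assumptions, not the requester's existing code: the body markers (`BODY`, `END BODY`), the `Key: value` metadata layout, and all function names are hypothetical, and the sketch assumes one document per file. Reading the .docx content is left to a caller-supplied function, `read_paragraphs`, which is assumed to return a character vector with one element per paragraph.

```r
# NOTE: marker patterns, field layout, and function names are hypothetical.

# Split one document (character vector of paragraphs) into its three parts.
split_document <- function(paragraphs,
                           body_start = "^BODY$",
                           body_end = "^END BODY$") {
  start <- grep(body_start, paragraphs)[1]
  end <- grep(body_end, paragraphs)[1]
  if (is.na(start) || is.na(end) || end <= start) {
    stop("could not locate the main-text markers")
  }
  list(
    meta_top = paragraphs[seq_len(start - 1)],
    main_text = paragraphs[seq_len(end - start - 1) + start],
    meta_bottom = paragraphs[seq_len(length(paragraphs) - end) + end]
  )
}

# Pull the value of a "Field: value" line; NA if the field is absent.
extract_field <- function(meta_lines, field) {
  pattern <- paste0("^", field, ":\\s*")
  hit <- grep(pattern, meta_lines, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  sub(pattern, "", hit[1])
}

# Turn one document into a one-row data frame: parsed fields,
# raw metadata blocks, and the main text in separate columns.
document_to_row <- function(paragraphs, fields, source_file = NA_character_) {
  parts <- split_document(paragraphs)
  meta <- c(parts$meta_top, parts$meta_bottom)
  values <- lapply(fields, function(f) extract_field(meta, f))
  names(values) <- fields
  data.frame(
    source_file = source_file,
    values,
    meta_top = paste(parts$meta_top, collapse = "\n"),
    meta_bottom = paste(parts$meta_bottom, collapse = "\n"),
    main_text = paste(parts$main_text, collapse = "\n"),
    stringsAsFactors = FALSE
  )
}

# Scan a directory tree for .docx files and stack one row per file.
# Returns NULL when no .docx files are found.
process_directory <- function(dir, fields, read_paragraphs) {
  files <- list.files(dir, pattern = "\\.docx$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  rows <- lapply(files, function(f) {
    document_to_row(read_paragraphs(f), fields, source_file = f)
  })
  do.call(rbind, rows)
}
```

A call might look like `process_directory("path/to/folders", fields = c("Title", "Date"), read_paragraphs = my_docx_reader)`, where `my_docx_reader` is a placeholder for a wrapper around whichever .docx-reading package is used. Files that hold several documents would need an extra splitting step before `document_to_row`.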

The deliverable is therefore an R script for this function. A freelancer who is experienced and comfortable working in R, specifically with text-as-data and with writing simple loops and functions, is required.

I use R for basic dataset construction, variable manipulation, and statistical modeling, but am inexperienced at writing loops and functions. I have written code that does what I want for a single document, which can be used as a guide. I am also happy to provide more information about the data itself to answer any questions.
About the recruiter
Member since Mar 14, 2020
Piyush Mistry
from Georgia, United States

Skills & Expertise Required

R 

Open for hiring

Apply before - Oct 27, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$26.74

Cost

Offer to work on this project closes in 90 days!

Similar Projects

Data Analytics

Hello! I am looking for someone to consult with me on an epidemiological data paper. My dataset comes from repeated, longitudinal measurements of participants' cigarette use. Each day of the 21-day study, each participant provided information on how m...

Implementation of data mining technique using R or python

Comparison of three data mining techniques implemented using R or Python

Minitab expert

I need a Minitab expert for a pharma project

Background data management with the ability to find comparable options based on the data

I am in real estate and would like to provide fast comparables to potential sellers based on data I have.
I have all the property sales; we need to put them into a database with filters such as number of units, size of the units, rent roll, area of the buil...

Cancer survival analysis

Cancer survival analysis using a semi-supervised learning method based on Cox and AFT models with L1/2 regularization