Remote Web Development Job In IT And Programming

Web scraping English conversations


I need someone to web-scrape some English conversations

I'm building a chatbot for learning English and want someone to scrape a set of English conversations from various websites to use as training material.
I'm looking for short and simple conversations.

There are two parts to the task:
- some googling to find basic, relevant conversations
- writing scraping code for the different sites


Most of these sites are pretty simple plain text services. Here are some example sites, but there are hundreds of resources.
I would want the scraped results in a TSV or CSV format:

convoId | line | url | topic | who | text

convoId - an ID for each conversation so we can sort things later
line - simple increment count for each line in that conversation
url - place it was from for attribution later
topic - please try to get a topic from the page; if this is a LOT more work, it may not be needed
who - the speaker; the conversations usually use role-play labels such as A: xxx, B: replies
text - scraped line of text
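
The column layout above can be sketched in Python as follows. This is only a minimal illustration, assuming the page text has already been downloaded and that speakers are marked with a short label followed by a colon (as in "A: xxx"). The function names, the regular expression, and the example.com URL are hypothetical and not part of the original posting.

```python
import csv
import io
import re

# Column order taken from the posting: convoId | line | url | topic | who | text
FIELDS = ["convoId", "line", "url", "topic", "who", "text"]

# Assumption: a dialog line starts with a short speaker label and a colon,
# e.g. "A: Hello" or "Maria: Hello". Lines without such a label are skipped.
SPEAKER_RE = re.compile(r"^\s*([A-Za-z][\w ]{0,20}?)\s*:\s*(\S.*)$")


def conversation_to_rows(convo_id, url, topic, raw_text):
    """Turn the raw text of one conversation into a list of row dicts."""
    rows = []
    for raw_line in raw_text.splitlines():
        match = SPEAKER_RE.match(raw_line)
        if not match:
            continue  # blank line or text without a speaker label
        who, text = match.groups()
        rows.append({
            "convoId": convo_id,
            "line": len(rows) + 1,  # simple increment within the conversation
            "url": url,
            "topic": topic,
            "who": who,
            "text": text.strip(),
        })
    return rows


def write_tsv(rows, handle):
    """Write rows as tab-separated values with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    sample = "A: Hi, how are you?\n\nB: Fine, thanks. And you?\n"
    out = io.StringIO()
    write_tsv(
        conversation_to_rows(1, "https://example.com/greetings", "greetings", sample),
        out,
    )
    print(out.getvalue())
```

Any non-dialog line that happens to begin with a short phrase and a colon would be picked up as a speaker, so real pages would likely need per-site cleanup before this step.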

You can use NodeJS or Python.

Let me know what experience you have in scraping, although this should not be a challenging scraping task: most of these are amateur sites with no logins or other blockers.

If you're trying to improve your English, this also might be an interesting project!

If you're into machine learning, I've also looked at the various online corpora for dialog training, but haven't found anything great yet.
These datasets don't work for basic language learning conversations.

I'd like to start with a small sample task, but then manage this as an ongoing project with some regular work each month as we refine the idea. There will be ongoing cleanup of the dataset for training, etc.

Respond to me with some info on what kind of scraping tasks you've done before and how many sites you think you can cover for the initial budget I've proposed.

About the recruiter
Member since Nov 11, 2022
Vijay Lakshetty
from Uusimaa, Finland

Skills & Expertise Required

Web Scraping, Node.js, Scrapy, Beauty, Python

Open for hiring
Apply before Jul 13, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$13.42

Cost


Similar Projects

configuring vs19 with TensorFlow 2.1

Hello,
I'm trying to configure Python 3.7 with TensorFlow 2.1 on Visual Studio 2019.
As simple as it sounds, the import of tf doesn't work; I'm not getting the libraries inside tf.
Looking for someone who knows how to fix this issue.

Contact information collection

Looking for a freelancer to find YouTubers' contact information based on the list of links we provide. We need someone professional who can find correct information and finish the job in a short time.

Full stack web developer

What we're looking for:
An experienced full stack developer to help kick-start and build a website. We need someone to work on specific project requirements. The project is based on software platforms, frameworks, languages, databases used by company...

Python SQL debugging

I'm looking for someone to help me with Python and SQL. If you know PySpark, that is a plus. This would be in person somewhere in the Des Moines area, a couple of hours a week.

Python scraping script fix with selenium

You need to have access to FanDuel, which has a location restriction. I will provide the account.
The script is written in Python and uses Selenium.