Web scraper written with Scrapy to extract user reviews in German of organic and fair trade coffee brands

marielledado/german-coffee-reviews

Web Scraper (Scrapy) - German Online Reviews/Ratings of Organic Coffee

This repository contains the web scraper I used to crawl the Utopia.de website to collect German-language online user reviews of organic/fair trade coffee.

The dataset is available on Kaggle: https://www.kaggle.com/mldado/german-online-reviewsratings-of-organic-coffee

Content

The scraper will collect the following data:

  • brand name of the coffee being reviewed
  • user rating of the coffee (1-5 stars)
  • user review in German
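
The three fields above could be modelled as a small record type. This is only a minimal sketch using the Python standard library: the class name `CoffeeReview`, the field names, and the `to_row` helper are assumptions made for illustration and are not taken from the scraper's code or the Kaggle dataset's column names.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CoffeeReview:
    """One scraped review (hypothetical field names, for illustration only)."""
    brand: str    # brand name of the coffee being reviewed
    rating: int   # user rating, 1 to 5 stars
    review: str   # review text in German

    def __post_init__(self):
        # Reject ratings outside the 1-5 star range listed above.
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be between 1 and 5, got {self.rating}")


def to_row(item: CoffeeReview) -> dict:
    """Convert a review to a plain dict, e.g. for a CSV or JSON export."""
    return asdict(item)
```

Validating the rating at construction time means a malformed page shows up as an error during the crawl rather than as a bad row in the exported data.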

Inspiration

There aren't that many NLP datasets in German. This one is a little small, but it should be enough to try out sentiment analysis and more advanced techniques such as aspect-based sentiment analysis. It would be interesting to extract features that represent the preferences of German coffee drinkers, to learn why they chose organic/fair trade coffee brands over conventional ones, and maybe even to find out what differentiates a 5-star coffee from 'just' a 4-star coffee.
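
As a rough starting point for the 5-star versus 4-star question, one could compare word counts between the two groups. The sketch below assumes the reviews have already been loaded as `(rating, text)` pairs. The function names `tokenize` and `distinctive_words` are hypothetical, and the approach uses raw counts without stop-word removal or normalisation, so it is a first look rather than a proper analysis.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    # Lowercase and keep runs of letters, including German umlauts and ß.
    return re.findall(r"[a-zäöüß]+", text.lower())


def distinctive_words(reviews, high=5, low=4, top_n=10):
    """Return words that occur more often in `high`-star than in `low`-star reviews.

    `reviews` is an iterable of (rating, text) pairs.
    """
    high_counts, low_counts = Counter(), Counter()
    for rating, text in reviews:
        if rating == high:
            high_counts.update(tokenize(text))
        elif rating == low:
            low_counts.update(tokenize(text))
    # Counter subtraction keeps only words with a positive difference.
    return (high_counts - low_counts).most_common(top_n)
```

Because the two rating groups may differ in size, relative frequencies or a weighting such as TF-IDF would likely give a fairer comparison than raw counts.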
