This is a simple parser for the pastebin.com website.
It iterates over posts and parses their elements using lxml.
pypi link
pip install simple-pastebin-parser
import simple_pastebin_parser
# Iterate over the pastes and print the fields of each one.
for paste in simple_pastebin_parser.get_pastes():
    print("Title:", paste.Title)
    print("Author:", paste.Author)
    print("Date:", paste.Date)
    print("Content:")
    print(paste.Content)
    print("*" * 20)
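Because the result of `get_pastes()` is consumed in a `for` loop, it can be filtered like any other iterable. The following is a minimal sketch and not part of the package: `find_pastes` and the `FakePaste` stand-in are hypothetical names, and only the `Title`, `Author`, `Date`, and `Content` attributes are taken from the example above.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in exposing the attributes used in the example above.
FakePaste = namedtuple("FakePaste", ["Title", "Author", "Date", "Content"])


def find_pastes(pastes, keyword, limit=10):
    """Return an iterator over at most `limit` pastes whose Content contains `keyword`."""
    keyword = keyword.lower()
    # Lazy filter: nothing is read from `pastes` until the result is iterated.
    matches = (p for p in pastes if keyword in (p.Content or "").lower())
    # islice stops pulling from the source once `limit` matches were produced,
    # which matters when the source is a long-running stream.
    return islice(matches, limit)


if __name__ == "__main__":
    sample = [
        FakePaste("first", "alice", None, "my PASSWORD is hunter2"),
        FakePaste("second", "bob", None, "nothing to see here"),
    ]
    for paste in find_pastes(sample, "password"):
        print(paste.Title)
```

With the installed package, the same helper would presumably be called as `find_pastes(simple_pastebin_parser.get_pastes(), "password")`.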
Initial proof of concept. Nothing special, it just does the dirty work of parsing the posts.
How to execute:
- create a Python 3.6 virtual environment
- install the requirements
- run python poc.py
- integration with Travis CI
- changed the POC code to work with the installed PyPI package
- created the Paste() object for pastebin posts
- ability to stream data
- small fixes
- update README
- added documentation
- cleaned up most PEP 8 issues
- some tests
- parse date in UTC
- add some logs
- add id to Paste()
- cleanups
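
The changelog mentions a `Paste()` object with an id and dates parsed in UTC. This README does not show the package's actual class layout or the date format it reads, so the following is only a rough illustration of the idea. The `PasteRecord` name, the `Id` field, the `to_utc` helper, and the `%Y-%m-%d %H:%M:%S` input format are all assumptions.

```python
from collections import namedtuple
from datetime import datetime, timedelta, timezone

# Hypothetical record; only Title, Author, Date and Content appear in the
# usage example, the Id field name is assumed.
PasteRecord = namedtuple("PasteRecord", ["Id", "Title", "Author", "Date", "Content"])


def to_utc(text, utc_offset_hours=0, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a naive timestamp recorded at a fixed UTC offset and return it in UTC."""
    naive = datetime.strptime(text, fmt)
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    # Attach the source offset first, then convert to an aware UTC datetime.
    return naive.replace(tzinfo=source_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    date = to_utc("2018-02-03 17:12:34", utc_offset_hours=-6)
    paste = PasteRecord("abc123", "example", "someone", date, "hello")
    print(paste.Id, paste.Date.isoformat())
```

Storing timezone-aware UTC datetimes keeps comparisons between pastes unambiguous regardless of the offset the site displayed.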