Hackathon (29-30 Jan 2022) on developing a digital solution to prevent illegal wildlife trade (IWT) on online social platforms.
- Team - Sean P. Rogers, Gabriela Youngken 👩🎓 👨🎓
- Mentor - Alastair Jamieson 👨🏫 (also API-keys holder 👛)
- Build a benchmark dataset of possible IWT instances & related information from online social platforms, which can also be searched and analyzed 🔚
- According to the challenge guidelines: Challenge1_Guidelines
- A benchmark dataset is a public dataset which is designed and collected for studying real-world data science/research problems.
- The benchmark dataset should be social media platform agnostic, as IWT happens across multiple platforms such as Instagram and YouTube.
-
Collect Instagram posts with images related to Slow Loris hashtags (slowloris, slowlorisforsale) to build a benchmark dataset 🏛️
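
One way to keep the dataset platform agnostic is to store every post in the same record shape, whichever platform it came from. The sketch below is only an illustration: the `PostRecord` class, all of its field names, and the `records_to_csv` helper are assumptions, not part of the hackathon code.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class PostRecord:
    """One candidate IWT post. Every field name here is hypothetical."""
    platform: str                # e.g. "instagram" or "youtube"
    post_id: str                 # identifier assigned by the platform
    user_id: str                 # account that published the post
    hashtag: str                 # hashtag that surfaced the post
    image_url: str               # link to the collected image
    language: str = "unknown"    # filled in during human validation
    is_iwt: str = "unreviewed"   # label set by a human reviewer


def records_to_csv(records):
    """Serialize records to CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PostRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


example = PostRecord("instagram", "p1", "u1", "slowloris", "https://example.com/a.jpg")
print(records_to_csv([example]))
```

Because the platform is just one column, posts from another platform can be appended to the same file and searched or filtered with the same tools.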
-
Task Duration - 26 hours 🏃⏲️
- Manually identify Slow Loris hashtags 🐵 for example data
- Call the Instagram API (RapidAPI, instagram85) for the hashtag-related feed
- Collect the JSON (first page only), extract the images & label them by user id
- Save the images in a folder labelled by language (see Future Prospects)
- Iterate the API calls & collect images
- Import the JSON into a webpage, index.html, for human validation of images
- Manually validate the images and export a CSV file with information from comments
- Call the API recursively with 'next_page_id' to collect all pages
- Depending on image volume, the project can evolve into image recognition for automation
- Focus on the bigger picture 🌄
- Build one-block-at-a-time 🧱
- Have consistent breaks 😌
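
The collection steps listed above (follow 'next_page_id' until all pages are collected, then label images by user id) could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the real instagram85 response format is not shown here, so the `data`, `user_id` and `image_url` keys are hypothetical, `fetch_page` is a placeholder for the actual API call, and a loop is used in place of recursion.

```python
def collect_all_pages(fetch_page, hashtag, max_pages=50):
    """Follow next_page_id until the feed ends or max_pages is reached."""
    posts = []
    next_page_id = None  # None requests the first page
    for _ in range(max_pages):
        page = fetch_page(hashtag, next_page_id)
        posts.extend(page.get("data", []))       # "data" key is an assumption
        next_page_id = page.get("next_page_id")
        if not next_page_id:
            break
    return posts


def images_by_user(posts):
    """Group image URLs by the user id of the post that carried them."""
    grouped = {}
    for post in posts:
        user_id = post.get("user_id")        # hypothetical key
        image_url = post.get("image_url")    # hypothetical key
        if user_id and image_url:
            grouped.setdefault(user_id, []).append(image_url)
    return grouped


# Stand-in for the real API call, so the sketch runs offline.
FAKE_FEED = {
    None: {"data": [{"user_id": "u1", "image_url": "a.jpg"}], "next_page_id": "p2"},
    "p2": {"data": [{"user_id": "u1", "image_url": "b.jpg"},
                    {"user_id": "u2", "image_url": "c.jpg"}], "next_page_id": None},
}


def fake_fetch_page(hashtag, next_page_id):
    return FAKE_FEED[next_page_id]


posts = collect_all_pages(fake_fetch_page, "slowloris")
print(images_by_user(posts))
# prints {'u1': ['a.jpg', 'b.jpg'], 'u2': ['c.jpg']}
```

The `max_pages` cap is a deliberate guard: it keeps a long feed from exhausting the API quota during a time-boxed event.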