With two years of experience in the data field, I have worked on several fronts, and the one where I have stood out the most is, without a doubt, the automation of operational processes. Among the projects I worked on in this area, the most notable were:
- Creation of a script that checks the freshness of a given dataset and, when necessary, performs its extraction, compilation, and upload.
- Creation of an automation hub that reduced the team’s average response time from 4 days to just 1 day, bringing more agility and standardization to our processes.
- Creation of a robust structure to centralize GitHub Actions workflows. This removed other engineers’ direct contact with secrets containing sensitive information and ensured that all workflows stayed up to date, always using the latest version of each action or framework.
- Conceptualization, planning, and construction of data quality checks for the Pricing area’s databases. I analyzed each database to understand every column, its values, and the expected patterns. I then designed tests to assess the data along five quality dimensions (Accuracy, Temporality, Validity, Consistency, and Uniqueness) and scored each criterion. The project was built with PySpark and orchestrated with Airflow.
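To illustrate the idea behind that last project, here is a minimal sketch of scoring a table against several of those quality dimensions. It uses plain Python rather than PySpark for brevity, the column names (`sku`, `price`, `currency`, `updated_at`) are hypothetical, and Accuracy is omitted because it requires comparison against a trusted reference source:

```python
from datetime import datetime, timedelta

# Hypothetical pricing rows; column names and values are illustrative only.
rows = [
    {"sku": "A1", "price": 19.9, "currency": "BRL", "updated_at": datetime(2024, 1, 10)},
    {"sku": "A2", "price": -5.0, "currency": "BRL", "updated_at": datetime(2024, 1, 9)},
    {"sku": "A2", "price": 12.0, "currency": "XX",  "updated_at": datetime(2023, 6, 1)},
]

def quality_report(rows, now=datetime(2024, 1, 11), max_age=timedelta(days=7)):
    """Score each quality dimension as the fraction of rows passing its test."""
    total = len(rows)
    skus = [r["sku"] for r in rows]
    return {
        # Uniqueness: how many distinct keys exist relative to total rows.
        "uniqueness": len(set(skus)) / total,
        # Validity: values fall inside the expected domain (price must be positive).
        "validity": sum(r["price"] > 0 for r in rows) / total,
        # Consistency: values follow the expected pattern (a known currency code).
        "consistency": sum(r["currency"] in {"BRL", "USD"} for r in rows) / total,
        # Temporality: rows were updated within the freshness window.
        "temporality": sum(now - r["updated_at"] <= max_age for r in rows) / total,
    }

report = quality_report(rows)
```

In a PySpark version, each check would typically become a column expression aggregated over the DataFrame, with Airflow scheduling the run and alerting when a score drops below a threshold.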