Data Engineer Projects
Southwest data pipeline
Automated loading of data from Amazon S3 into a cloud data warehouse (Google BigQuery) using Fivetran. Transformed that data in the warehouse using dbt (Data Build Tool) and built dashboards in Tableau.
Twitter Trendiness Analysis
Built a system in Python that continuously reads tweets from the Twitter API into PostgreSQL and Kafka, then scores words and phrases every minute based on how strongly they are trending on Twitter.
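A minimal sketch of how per-minute trendiness scoring could work. The scoring formula is not given above, so this assumes a smoothed log ratio of a phrase's relative frequency in the current minute versus the previous minute; the whitespace tokenizer and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def phrases(tokens):
    """Return single words plus two-word phrases built from adjacent tokens."""
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def count_phrases(tweets):
    """Count every word and two-word phrase across one minute of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(phrases(tokenize(tweet)))
    return counts


def trendiness(phrase, current, previous):
    """Assumed score: log ratio of add-one-smoothed relative frequencies.

    Positive means the phrase is more frequent in the current minute
    than in the previous one; negative means it is fading.
    """
    vocab = len(set(current) | set(previous))
    p_now = (current[phrase] + 1) / (sum(current.values()) + vocab)
    p_prev = (previous[phrase] + 1) / (sum(previous.values()) + vocab)
    return math.log(p_now / p_prev)


previous_minute = count_phrases(["good morning world"])
current_minute = count_phrases(["world cup final", "world cup tonight"])
rising = trendiness("world cup", current_minute, previous_minute)
fading = trendiness("good morning", current_minute, previous_minute)
```

In a real deployment the two counters would presumably be filled from PostgreSQL or a Kafka consumer rather than from in-memory lists.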
Kafka stream Yelp analysis
Devised an end-to-end, real-time data pipeline on AWS that streams data into Amazon S3 via Kafka, orchestrates transformations with Lambda functions, and runs analysis in Athena.
AWS data quality project
Used AWS services (S3, Glue DataBrew, EventBridge, Lambda, SNS) to build a data quality process that identifies and addresses data quality issues for multiple data owners.
Fetch Rewards data modeling (Python)
Designed an entity-relationship diagram, loaded the data, transformed the JSON into a data frame, and captured data quality issues.
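The flatten-then-check step can be sketched as follows. This is an illustration only: it uses plain dictionaries from the standard library as a stand-in for a data frame, and the sample records, field names, and helper names are hypothetical rather than taken from the actual dataset.

```python
import json
from collections import Counter

# Hypothetical sample in JSON Lines form; not the project's real data.
SAMPLE = """
{"_id": {"$oid": "a1"}, "userId": "u1", "totalSpent": "12.50"}
{"_id": {"$oid": "a1"}, "userId": "u2", "totalSpent": null}
{"_id": {"$oid": "b2"}, "totalSpent": "3.00"}
"""


def flatten(record, prefix=""):
    """Turn nested objects into flat columns such as '_id.$oid'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def load_rows(json_lines):
    """Parse one JSON document per non-empty line and flatten each one."""
    return [flatten(json.loads(line)) for line in json_lines.splitlines() if line.strip()]


def quality_report(rows, key, required):
    """Report duplicated key values and missing or null required columns."""
    ids = Counter(row.get(key) for row in rows)
    return {
        "duplicate_keys": sorted(k for k, n in ids.items() if k is not None and n > 1),
        "missing": {col: sum(1 for row in rows if row.get(col) is None) for col in required},
    }


rows = load_rows(SAMPLE)
report = quality_report(rows, key="_id.$oid", required=["userId", "totalSpent"])
```

With pandas available, `pandas.json_normalize` would be a natural replacement for the hand-written `flatten` helper.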
YouTube End-to-End
Used AWS Lambda and AWS Glue to transform data across three S3 buckets: raw, cleaned, and conformed.
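A rough sketch of how a Lambda function might route an object from one layer to the next when triggered by an S3 event. The bucket names and helper functions are hypothetical, and the actual read, transform, and write steps (which would use boto3 or a Glue job) are deliberately left out; only the standard S3 event notification layout is relied on.

```python
from urllib.parse import unquote_plus

# Hypothetical bucket names; the project's real names are not given.
LAYER_BUCKETS = {
    "raw": "example-youtube-raw",
    "cleaned": "example-youtube-cleaned",
    "conformed": "example-youtube-conformed",
}
# Each layer feeds the next one; "conformed" is the final layer.
NEXT_LAYER = {"raw": "cleaned", "cleaned": "conformed"}


def layer_of(bucket):
    """Map a bucket name back to its layer, or raise if it is unknown."""
    for layer, name in LAYER_BUCKETS.items():
        if name == bucket:
            return layer
    raise ValueError(f"unknown bucket: {bucket}")


def plan_moves(event):
    """List the source and target location for every object in an S3 event."""
    moves = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        target_layer = NEXT_LAYER.get(layer_of(bucket))
        if target_layer is None:
            continue  # nothing follows the conformed layer
        moves.append({
            "source": (bucket, key),
            "target": (LAYER_BUCKETS[target_layer], key),
        })
    return moves


def lambda_handler(event, context):
    """Entry point; a real handler would transform and write each object here."""
    return {"moves": plan_moves(event)}
```

Keeping the routing logic in a pure function makes it testable without AWS credentials.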