Data Engineer Projects

Southwest data pipeline

Automated loading of data from Amazon S3 into a cloud data warehouse (Google BigQuery) using Fivetran. Transformed that data in the warehouse using dbt (Data Build Tool) and built dashboards in Tableau.

Twitter Trendiness Analysis

Built a system that continuously reads tweets from the Twitter API into a database and Kafka, then scores words and phrases each minute by how strongly they are trending, using Python and PostgreSQL.
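As a rough illustration of per-minute scoring, the sketch below compares how often each phrase appears in the current minute against the previous minute. The smoothed log-ratio formula and the function name `trendiness` are assumptions made for this example; the project description does not state the exact scoring rule, and the counts here live in memory rather than in PostgreSQL.

```python
import math
from collections import Counter


def trendiness(current_counts, previous_counts):
    """Score each phrase seen in the current minute.

    Assumed formula (not taken from the project): log10 of the ratio
    between the add-one smoothed probability of the phrase in the
    current minute and in the previous minute.
    """
    vocabulary = set(current_counts) | set(previous_counts)
    vocab_size = len(vocabulary)
    current_total = sum(current_counts.values())
    previous_total = sum(previous_counts.values())

    scores = {}
    for phrase in current_counts:
        p_current = (1 + current_counts[phrase]) / (current_total + vocab_size)
        p_previous = (1 + previous_counts.get(phrase, 0)) / (previous_total + vocab_size)
        scores[phrase] = math.log10(p_current / p_previous)
    return scores


# Made-up word counts for two consecutive minutes.
previous_minute = Counter("rain rain sun".split())
current_minute = Counter("sun sun sun rain".split())
scores = trendiness(current_minute, previous_minute)
```

A positive score means the phrase takes a larger share of the current minute than of the previous one; a negative score means it is fading.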

Kafka stream Yelp analysis

Devised an end-to-end real-time data pipeline on AWS that channels data into Amazon S3 via Kafka, orchestrates transformations with Lambda functions, and runs the analysis in Athena.

AWS Data quality projects

The project used AWS components (S3, Glue DataBrew, EventBridge, Lambda, SNS) to build a data quality process that identifies and addresses data quality issues for multiple data owners.

Fetch Rewards Data Modeling (Python)

Designed an entity-relationship diagram, loaded the data, transformed the JSON into a data frame, and captured data quality issues.
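A minimal sketch of the flatten-then-check step, using only the Python standard library so that it runs anywhere. The sample records, the field names, and the helpers `flatten` and `quality_report` are all hypothetical; the flat rows stand in for what would be loaded into a data frame, and the real dataset and checks are not described in this summary.

```python
import json

# Made-up sample records; not the project's actual data.
RAW = """
[
  {"id": "r1", "user": {"id": "u1", "state": "WI"}, "total": 12.5},
  {"id": "r2", "user": {"id": "u2"}, "total": null},
  {"id": "r1", "user": {"id": "u1", "state": "WI"}, "total": 12.5}
]
"""


def flatten(record, prefix=""):
    """Turn nested objects into flat columns such as 'user.state'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def quality_report(rows, key):
    """Count missing values per column and list repeated key values."""
    columns = sorted({column for row in rows for column in row})
    missing = {
        column: sum(1 for row in rows if row.get(column) is None)
        for column in columns
    }
    seen, duplicate_keys = set(), []
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicate_keys.append(value)
        seen.add(value)
    return {"missing": missing, "duplicate_keys": duplicate_keys}


rows = [flatten(record) for record in json.loads(RAW)]
report = quality_report(rows, key="id")
```

Each flat row maps directly onto one data frame row, and the report surfaces two common issues: columns with missing values and identifiers that appear more than once.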

YouTube End-to-End

This project uses AWS Lambda and AWS Glue to transform data across three S3 buckets: raw, cleaned, and conformed.


DATA SCIENCE PROJECTS