What excited me about this project:

In this project, I learned how to engineer and architect business data analytics pipelines with modern data management technologies. I developed skills in ingesting data from various sources, loading it into a data warehouse, designing warehouse schemas, and programmatically querying and analyzing warehouse data. The project helped me build the core technical skills that data engineers need.

Project Description:

This project was conducted by Calvin Chen, Daniel Hom, Duc Le, Lucius Liu, Vitchuda Poonyakanok, and Yuyang Xie at the Wisconsin School of Business, University of Wisconsin-Madison, in 2021.

This project builds a system that continuously reads tweets from the Twitter API into a database and Kafka, then scores words and phrases each minute based on Twitter trends, using Python and PostgreSQL.
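To give a feel for the per-minute scoring step, here is a minimal sketch in Python. The formula is an assumption for illustration, not necessarily the one the project uses: it compares a word's smoothed share of all words in the current minute against its share in the prior minute and takes the log ratio. The function names `tokenize` and `trendiness` and the whitespace tokenizer are hypothetical, and the tweets are passed in as plain lists rather than read from PostgreSQL or Kafka.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def trendiness(word, current_tweets, prior_tweets):
    """Log ratio of a word's smoothed frequency in the current vs. the prior minute.

    Assumed formula (add-one smoothing over the combined vocabulary):
        p = (1 + count(word)) / (vocab_size + total_words)
        score = log(p_current) - log(p_prior)
    A positive score means the word is more common now than a minute ago.
    """
    cur = Counter(w for t in current_tweets for w in tokenize(t))
    pri = Counter(w for t in prior_tweets for w in tokenize(t))
    vocab_size = len(set(cur) | set(pri))
    if vocab_size == 0:
        # No words in either window, so there is nothing to compare.
        return 0.0
    p_cur = (1 + cur[word]) / (vocab_size + sum(cur.values()))
    p_pri = (1 + pri[word]) / (vocab_size + sum(pri.values()))
    return math.log(p_cur) - math.log(p_pri)


if __name__ == "__main__":
    # Made-up sample tweets standing in for two consecutive one-minute windows.
    current_minute = ["kafka is trending", "kafka kafka"]
    prior_minute = ["postgres is fine"]
    for w in ("kafka", "postgres"):
        print(w, round(trendiness(w, current_minute, prior_minute), 3))
```

The smoothing keeps the score finite for words that appear in only one of the two windows, which matters when a word shows up for the first time.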

Twitter Trendiness Score Analysis

Please visit my GitHub to view the details.

This project helped me practice the following key skills.

  1. Data Technologies: PostgreSQL, Kafka, Python

  2. Data Management: Data Pipelines, Warehouse Schema Design, Querying Data

  3. Machine Learning: Analyzing unstructured data and calculating trendiness score using Python and PostgreSQL
