What excited me about this project:
In this project, I learned how to engineer and architect business data analytics pipelines with modern data management technologies. I developed skills in ingesting data from various sources, loading it into a data warehouse, designing warehouse schemas, and programmatically querying and analyzing warehouse data. This project helped me build the core technical skills that data engineers need.
Project Description:
This project was conducted by Calvin Chen, Daniel Hom, Duc Le, Lucius Liu, Vitchuda Poonyakanok, and Yuyang Xie at the University of Wisconsin-Madison, Wisconsin School of Business, in 2021.
This project builds a system that continuously reads tweets from the Twitter API into a database and Kafka, and scores phrases and words each minute based on Twitter trends, using Python and PostgreSQL.
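As a rough illustration of the per-minute scoring idea described above, here is a minimal Python sketch. The exact trendiness formula is not stated here, so the one below is an assumption: it compares a word's add-one smoothed share of the current minute with its share of the previous minute and takes the log of the ratio. The function names (count_by_minute, trendiness) and the sample tweets are hypothetical, and the tweets come from an in-memory list rather than from the Twitter API, Kafka, or PostgreSQL.

```python
import math
import re
from collections import Counter
from datetime import datetime


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def count_by_minute(tweets):
    """Group (timestamp, text) pairs into per-minute word counts."""
    counts = {}
    for ts, text in tweets:
        minute = ts.replace(second=0, microsecond=0)
        counts.setdefault(minute, Counter()).update(tokenize(text))
    return counts


def trendiness(word, current, previous, vocab_size):
    """Assumed score: log10 ratio of add-one smoothed word probabilities.

    A positive score means the word takes a larger share of the current
    minute than of the previous one; a negative score means a smaller share.
    """
    p_cur = (current[word] + 1) / (sum(current.values()) + vocab_size)
    p_prev = (previous[word] + 1) / (sum(previous.values()) + vocab_size)
    return math.log10(p_cur / p_prev)


# Hypothetical sample data standing in for the live tweet stream.
tweets = [
    (datetime(2021, 5, 1, 12, 0, 10), "kafka is fun"),
    (datetime(2021, 5, 1, 12, 0, 40), "postgres is fun"),
    (datetime(2021, 5, 1, 12, 1, 5), "kafka kafka kafka"),
]
counts = count_by_minute(tweets)
previous = counts[datetime(2021, 5, 1, 12, 0)]
current = counts[datetime(2021, 5, 1, 12, 1)]
vocab_size = len(set(previous) | set(current))

for word in ("kafka", "fun"):
    print(word, round(trendiness(word, current, previous, vocab_size), 3))
```

In a setup like the one described, the per-minute counts would more likely be computed with a SQL aggregation in PostgreSQL, with Python only combining the two counts into a score. The smoothing keeps the score defined for words that are missing from one of the two minutes.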
Twitter Trendiness Score Analysis
Please visit my GitHub to view the details.
This project helped me practice the following key skills.
Data Technologies: PostgreSQL, Kafka, Python
Data Management: Data pipelines, warehouse schema design, data querying
Machine Learning: Analyzing unstructured data and calculating trendiness score using Python and PostgreSQL