Hey folks! I'm Hussain, a Computer Science grad and current Associate Data Engineer Intern at Quantrail. Lately, I've been diving deep into the world of data infrastructure and OLAP systems. Today, I want to walk you through my experience building TrendLite, a real-time sales analytics dashboard powered by ClickHouse.
What Is TrendLite?
TrendLite is a retail analytics dashboard designed to visualize product demand, top-performing categories, and sales trends in real time. The goal? Take a raw CSV of sales data and turn it into a fast, interactive dashboard using scalable data infrastructure.
Tech Stack
- ClickHouse: Blazing-fast OLAP database for analytics
- Streamlit: Frontend for live data visualization
- Python & Pandas: Data preprocessing and transformation
Why ClickHouse?
While working with traditional databases, I noticed performance issues when running analytical queries on large datasets. That's when I discovered ClickHouse, an open-source OLAP DBMS built for speed.
After completing the "Learn ClickHouse - A Fast Open-Source OLAP DBMS" course on Udemy, I decided to put my knowledge into practice, and TrendLite was born.
How I Built It, Step by Step
1. Data Preprocessing
- Cleaned and formatted the raw sales data using Pandas
- Handled missing values, date formats, and basic transformations
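Here's a minimal sketch of what this cleaning step can look like in Pandas. The column names (`order_date`, `product`, `category`, `quantity`, `unit_price`) and the sample CSV are assumptions made up for illustration, not the actual TrendLite schema, and the rule for missing categories is just one reasonable choice.

```python
import io

import pandas as pd


def clean_sales(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw sales table (hypothetical columns)."""
    df = raw.copy()

    # Normalise text so ' laptop ' and 'Laptop' do not count as two products.
    for col in ("product", "category"):
        df[col] = df[col].astype("string").str.strip().str.title()

    # Assumption: a missing category is labelled rather than dropped.
    df["category"] = df["category"].fillna("Unknown")

    # Parse dates and numbers; values that cannot be parsed become NaT/NaN.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")
    df["unit_price"] = pd.to_numeric(df["unit_price"], errors="coerce")

    # Basic transformation: derive revenue per row.
    df["revenue"] = df["quantity"] * df["unit_price"]

    # Rows without a product, date, or quantity cannot be aggregated, so drop them.
    return (
        df.dropna(subset=["product", "order_date", "quantity"])
        .astype({"quantity": "int64"})
        .reset_index(drop=True)
    )


# Made-up sample data covering the cases handled above.
RAW_CSV = """order_date,product,category,quantity,unit_price
2024-01-05, laptop ,electronics,2,900
2024-01-06,Mouse,,5,20
not a date,Keyboard,electronics,1,45
2024-01-07,,electronics,3,10
2024-01-08,Monitor,electronics,,150
"""

cleaned = clean_sales(pd.read_csv(io.StringIO(RAW_CSV)))
print(cleaned)
```

Coercing bad values to NaT/NaN first and dropping them in one place keeps the cleaning rules easy to see and change.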
2. ClickHouse Setup
- Installed and configured ClickHouse locally
- Loaded the cleaned data into ClickHouse tables
- Wrote aggregation queries like this one, which returns the ten products with the highest total quantity sold:
SELECT product, SUM(quantity) AS total_sales
FROM retail_data
GROUP BY product
ORDER BY total_sales DESC
LIMIT 10;
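For anyone who wants to try this from Python, here's a rough sketch of loading rows and running that query with the `clickhouse-connect` driver. Only the `retail_data` table and the `product` and `quantity` columns come from the query above; the other columns, the `MergeTree` ordering key, and the helper names `connect`, `load_rows`, and `top_products` are assumptions for illustration. The driver choice is also an assumption, since any ClickHouse client would work.

```python
from typing import Any, Sequence

TABLE = "retail_data"
# Assumed column layout; only product and quantity appear in the original query.
COLUMNS = ["order_date", "product", "category", "quantity", "revenue"]

CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS {TABLE} (
    order_date Date,
    product    String,
    category   LowCardinality(String),
    quantity   UInt32,
    revenue    Float64
)
ENGINE = MergeTree
ORDER BY (order_date, product)
"""

TOP_PRODUCTS_SQL = f"""
SELECT product, SUM(quantity) AS total_sales
FROM {TABLE}
GROUP BY product
ORDER BY total_sales DESC
LIMIT 10
"""


def connect(host: str = "localhost") -> Any:
    """Open a client for a local server (requires `pip install clickhouse-connect`)."""
    import clickhouse_connect  # imported lazily so the helpers below work without it

    return clickhouse_connect.get_client(host=host)


def load_rows(client: Any, rows: Sequence[Sequence[Any]]) -> None:
    """Create the table if needed and insert rows (order_date as datetime.date)."""
    client.command(CREATE_TABLE_SQL)
    client.insert(TABLE, rows, column_names=COLUMNS)


def top_products(client: Any) -> list[tuple[str, int]]:
    """Run the top-10 query and return (product, total_sales) pairs."""
    result = client.query(TOP_PRODUCTS_SQL)
    return [(product, int(total)) for product, total in result.result_rows]
```

With a server running locally, calling `load_rows(connect(), rows)` and then `top_products(connect())` would be the whole round trip. Passing the client in as an argument keeps the helpers easy to test against a stub.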
3. Dashboard with Streamlit
- Built an interactive frontend using Streamlit
- Visualized key metrics:
  - Top products by total sales
  - Revenue trends over time
  - Category-wise breakdowns
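To give a feel for the frontend, here's a small sketch of how those three views could be laid out in Streamlit. It is not the actual TrendLite app: the function names `to_columns` and `render_dashboard`, the row shapes, and the column labels are assumptions. The `st` argument is the `streamlit` module itself, passed in so the layout can be exercised without a running app, and the sketch assumes a recent Streamlit release where `st.bar_chart` and `st.line_chart` accept `x` and `y` arguments.

```python
from typing import Any, Sequence


def to_columns(rows: Sequence[Sequence[Any]], names: Sequence[str]) -> dict[str, list[Any]]:
    """Turn query result rows into a column-oriented dict for charting."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def render_dashboard(
    st: Any,
    top_rows: Sequence[Sequence[Any]],       # (product, total_sales) rows
    trend_rows: Sequence[Sequence[Any]],     # (day, revenue) rows
    category_rows: Sequence[Sequence[Any]],  # (category, revenue) rows
) -> None:
    """Draw the three views; `st` is expected to be the streamlit module."""
    st.title("TrendLite")

    st.subheader("Top products by total sales")
    st.bar_chart(
        to_columns(top_rows, ["product", "total_sales"]),
        x="product",
        y="total_sales",
    )

    st.subheader("Revenue trends over time")
    st.line_chart(to_columns(trend_rows, ["day", "revenue"]), x="day", y="revenue")

    st.subheader("Category-wise breakdown")
    st.dataframe(to_columns(category_rows, ["category", "revenue"]))
```

In an `app.py` launched with `streamlit run app.py`, you would `import streamlit as st` and call `render_dashboard(st, ...)` with rows fetched from ClickHouse.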
Sneak Peek
Want to explore the dashboard in action?
Check out the GitHub Repository
The repo includes the Streamlit app code, ClickHouse integration, and sample data.
Key Takeaways
- Learned the difference between OLAP and OLTP systems and where each one fits
- Understood how ClickHouse executes large-scale queries efficiently
- Got hands-on with basic ETL concepts and data modeling
- Built a clean UI for data storytelling using Streamlit
- Faced (and fixed!) real-world bugs and performance issues
What's Next?
I plan to:
- Experiment with streaming data ingestion (Kafka might be next!)
- Open-source the project with deployment instructions
- Publish a Part 2 walkthrough on the code and architecture
Final Thoughts
Working on TrendLite made me appreciate the power of OLAP systems like ClickHouse, especially when paired with lightweight tools like Streamlit. If you're starting out in data engineering or analytics, building a project like this is a great way to learn fast, hands-on.
Feel free to connect. I'm always open to sharing ideas, learning together, or just nerding out about data.