Projects
A collection of my AI/Data Engineering projects and open-source contributions.
Side projects

VN Legal Intelligence Platform
A production-grade legal information system that uses Agentic RAG and a 7-node LangGraph pipeline to answer Vietnamese legal queries with verifiable citations. Achieved an accuracy score of 4.67/5 under a custom evaluation framework. Features a fully automated MLOps pipeline on Google Cloud Run.

Tech Shop With Multi-Agent System
A full-stack e-commerce platform powered by a multi-agent AI system. Integrates 4 specialized agents (Host, Search, Advisor, Order) for product search and purchase advice, backed by the Qdrant vector database. Reduces search friction and guides users through complex tech queries.

Local Text2SQL Platform
A highly accurate natural-language-to-SQL conversion system built on an optimized Gemma-2-2B SLM fine-tuned with QLoRA. Runs completely offline, so data never leaves the machine and database queries stay low-latency.

AI Interview Prep SaaS
An AI-driven SaaS platform that simulates technical interviews. Uses generative LLMs to dynamically create personalized interview questions based on resumes and evaluates candidate responses with actionable feedback metrics.

AI LinkedIn Post Generator
A production-ready AI text tool for tech professionals. Features real-time SSE streaming for smooth generation, tailored AI personas, and strict prompt guidelines to produce insightful, high-engagement LinkedIn posts.

Binance Data Lake & Analytics Pipeline
A comprehensive Medallion Architecture data pipeline. Ingests real-time crypto market data via Kafka, processes it on scalable Apache Spark clusters, and orchestrates dbt transformations via Airflow for business intelligence.

FPT Stock Price Forecasting System
An industrial-grade time-series forecasting system for predicting FPT stock prices. Validated on 100-day long-horizon forecasts using PatchTST (Patch-based Transformer), which substantially outperformed linear baselines in capturing long-term dependencies (MAE 4.70 vs 37.71).

Istanbul Retail Data Warehouse & OLAP
An end-to-end Business Intelligence system for analyzing customer shopping behavior. Integrates the full Microsoft BI stack (SSIS, SSAS) with Machine Learning (XGBoost) to deliver multi-dimensional insights through Power BI and Looker Studio.

MLOps Customer Churn Prediction
An end-to-end MLOps pipeline on Microsoft Azure for predicting customer churn. Automates the full ML lifecycle with DVC for data versioning, Feast with Redis for feature serving, and MLflow for experiment tracking. Features a CI/CD retraining loop triggered by user feedback via Azure Event Grid and GitHub Actions.

Gen-DBA: AI-Driven Oracle Partitioner
An intelligent database administration agent that uses LangGraph and GPT-4o-mini to analyze Oracle workload patterns and autonomously recommend optimal data partitioning strategies. Includes a validation loop that filters out LLM hallucinations. Benchmarked on TPC-H datasets, it achieved a 10.4% latency reduction and a >60% decrease in I/O.

Real-Time Chat Application
A full-stack, real-time chat application featuring secure user authentication, friend management (add/accept/decline), and instant messaging. Built with Node.js, Express, and MongoDB for the backend, React for the frontend, and Socket.IO for real-time bidirectional communication. The entire application is fully containerized with Docker Compose.

Bach Hoa E-Commerce Platform
A comprehensive e-commerce ecosystem including a React-based admin management interface, an Android mobile application, and a robust Node.js/MongoDB backend API. The platform features an integrated AI chatbot service built with FastAPI, utilizing Retrieval-Augmented Generation (RAG) to provide intelligent, context-aware customer support.