Data Engineering Projects

Cloud Data Warehouse with AWS

Tech Stack: AWS Glue, Redshift, S3, Delta Lake, Cube.js, Power BI
Summary: Built a scalable data warehouse pipeline with ingestion, transformation, and API exposure for dashboards.
Details:
Designed and built a full-scale cloud data warehouse using AWS services. AWS Glue ingested data from JSON, CSV, and SQL sources, landed it in S3, and transformed it into a snowflake schema in Redshift. Delta Lake and Lake Formation were used for data governance and schema evolution. Cube.js exposed APIs that Power BI or any other front end could consume for dashboards.
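The snowflake-schema step can be illustrated with a small sketch. It is not the project's Glue job: the table names (dim_category, dim_product, fact_sales) and record fields are hypothetical, and plain Python dictionaries stand in for Redshift tables. It only shows how a flat record might be split into a fact row and normalized dimensions.

```python
from dataclasses import dataclass, field


@dataclass
class SnowflakeTables:
    """In-memory stand-ins for warehouse tables (hypothetical names)."""
    dim_category: dict = field(default_factory=dict)  # category name -> category_id
    dim_product: dict = field(default_factory=dict)   # product name -> (product_id, category_id)
    fact_sales: list = field(default_factory=list)    # rows of (product_id, quantity)


def load_record(tables, record):
    """Split one flat record into a fact row plus normalized dimension rows."""
    # Outer dimension: the category gets a surrogate key the first time it is seen.
    category_id = tables.dim_category.setdefault(
        record["category"], len(tables.dim_category) + 1
    )
    # Inner dimension: the product row references the category dimension.
    # This dimension-to-dimension link is what makes the schema a snowflake
    # rather than a star.
    if record["product"] not in tables.dim_product:
        product_id = len(tables.dim_product) + 1
        tables.dim_product[record["product"]] = (product_id, category_id)
    product_id = tables.dim_product[record["product"]][0]
    # The fact row keeps only the key and the measure.
    tables.fact_sales.append((product_id, record["quantity"]))


tables = SnowflakeTables()
for row in [
    {"product": "Laptop", "category": "Electronics", "quantity": 2},
    {"product": "Phone", "category": "Electronics", "quantity": 1},
    {"product": "Laptop", "category": "Electronics", "quantity": 3},
]:
    load_record(tables, row)
```

Repeated products and categories reuse their existing keys, so the fact table stays narrow while descriptive attributes live once in the dimensions.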


Agile Metrics Dashboard

Tech Stack: FastAPI, Cube.js, Kubernetes, GitLab, Sonar, ZohoSprint
Summary: Unified DevOps performance tracking system integrating GitLab, Sonar, and project management tools.
Details:
Built a multi-tenant agile dashboard system to monitor and correlate activities from development to deployment. Integrated GitLab, Sonar, ZohoSprint, and Matomo for full SDLC visibility. FastAPI microservices handled data ingestion and user role management. Cube.js enabled KPI modeling and served the data layer. All components were containerized and deployed on Kubernetes.
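As an illustration of correlating development activity with deployments, the sketch below computes a per-tenant commit-to-deploy lead time. The source does not say which KPIs were modeled, so the metric, the function name lead_time_hours, and the record fields (tenant, sha, committed_at, deployed_at) are all assumptions chosen for the example.

```python
from datetime import datetime
from statistics import mean


def lead_time_hours(commits, deployments):
    """Average hours from commit to deployment, grouped by tenant.

    Both inputs are lists of dicts with hypothetical fields:
    commits need tenant, sha, committed_at; deployments need tenant, sha, deployed_at.
    """
    # Index deployments by (tenant, sha) so tenants never see each other's data.
    deployed_at = {(d["tenant"], d["sha"]): d["deployed_at"] for d in deployments}
    per_tenant = {}
    for commit in commits:
        key = (commit["tenant"], commit["sha"])
        if key not in deployed_at:
            continue  # commit has not been deployed yet, so it has no lead time
        delta = deployed_at[key] - commit["committed_at"]
        per_tenant.setdefault(commit["tenant"], []).append(delta.total_seconds() / 3600)
    return {tenant: mean(hours) for tenant, hours in per_tenant.items()}


commits = [
    {"tenant": "acme", "sha": "a1", "committed_at": datetime(2024, 1, 1, 9, 0)},
    {"tenant": "acme", "sha": "a2", "committed_at": datetime(2024, 1, 2, 10, 0)},
    {"tenant": "globex", "sha": "b1", "committed_at": datetime(2024, 1, 3, 8, 0)},
]
deployments = [
    {"tenant": "acme", "sha": "a1", "deployed_at": datetime(2024, 1, 1, 15, 0)},
    {"tenant": "acme", "sha": "a2", "deployed_at": datetime(2024, 1, 2, 12, 0)},
]
result = lead_time_hours(commits, deployments)
```

In a setup like the one described, an ingestion service would populate these records from the source tools, and the aggregation itself could equally be expressed as a measure in the data layer.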


Apache NiFi Data Ingestion Pipeline

Tech Stack: Apache NiFi, Hadoop, Apache Atlas, Kubernetes
Summary: Multitenant ingestion pipeline with Apache Atlas lineage tracking and governance, deployed securely on Kubernetes.
Details:
Developed a multitenant data ingestion system with Apache NiFi to process streaming data and store it in Hadoop. Apache Atlas managed metadata and data lineage. Role-based access control ensured secure multi-user access. The system was deployed on Kubernetes and used SSL/TLS encryption for secure communication. Documentation and training were provided for operational handoff.
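The idea behind tenant-scoped role-based access control can be sketched in a few lines. This is a conceptual illustration only, not NiFi's own policy configuration: the role names, the permission sets, and the is_allowed helper are hypothetical.

```python
# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "manage"},
    "operator": {"read", "write"},
    "viewer": {"read"},
}


def is_allowed(user, action, resource_tenant):
    """Allow an action only inside the user's own tenant and role permissions."""
    # Tenant isolation comes first: no role grants access across tenants.
    if user["tenant"] != resource_tenant:
        return False
    # Unknown roles fall back to an empty permission set (deny by default).
    return action in ROLE_PERMISSIONS.get(user["role"], set())


alice = {"name": "alice", "tenant": "acme", "role": "operator"}
can_write_own = is_allowed(alice, "write", "acme")      # True
can_manage_own = is_allowed(alice, "manage", "acme")    # False
can_read_other = is_allowed(alice, "read", "globex")    # False
```

Checking the tenant before the role keeps the two concerns separate, so adding a new role cannot accidentally open a path between tenants.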