Sadik Bakiu
Aug 19, 2023 · 8 min read
MultiGPU Kubernetes Cluster for Scalable and Cost-Effective Machine Learning with Ray and Kubeflow
Introduction Large Language Models (LLMs) are very much in demand right now, and they need a lot of compute power to train. Llama 1 used...


Bujar Bakiu
Oct 14, 2022 · 5 min read
Dockerizing dbt Transformations for Managed Airflow: Docker, dbt, and GCP Cloud Composer
Airflow is one of the most popular pipeline orchestration tools out there. It has been around for more than 8 years, and it is used...


Kejdi Tako
Sep 14, 2022 · 3 min read
Distributed Machine Learning Model Training with Spark (PySpark)
GitHub repo: https://github.com/data-max-hq/pyspark-3-ways What is Spark? Apache Spark was designed to function as a simple API for...

Bujar Bakiu
Aug 29, 2022 · 6 min read
Serving Dog Breed Classification model with Seldon-Core, TensorFlow Serving and Streamlit
GitHub Repo: https://github.com/data-max-hq/dog-breed-classification-ml In a modern Machine Learning workflow, after figuring out the...


Igli
Aug 24, 2022 · 4 min read
Deploy Airflow and Metabase in Kubernetes using Infrastructure-as-Code
A step-by-step guide to deploying Airflow and Metabase in GCP with Terraform and Helm providers. With the extensive usage of cloud...

Bujar Bakiu
Jul 12, 2022 · 7 min read
A hands-on project with dbt, Streamlit, and PostgreSQL
Data engineering with dbt and Streamlit: how to build a project with dbt, Streamlit, and PostgreSQL.

Sadik Bakiu
Apr 23, 2022 · 3 min read
Modern Data Team Hats
This blog post was written together with Martin Rusnak from Rusnak Consulting and Bujar Bakiu. Not that long ago (maybe somewhere this is still the...