Sadik Bakiu
Aug 19, 2023 · 8 min read
MultiGPU Kubernetes Cluster for Scalable and Cost-Effective Machine Learning with Ray and Kubeflow
Introduction: Large Language Models (LLMs) are very much in demand right now, and they need a lot of compute power to train. Llama 1 used...


Bujar Bakiu
Oct 14, 2022 · 5 min read
Dockerizing dbt Transformations for Managed Airflow: Docker, dbt, and GCP Cloud Composer
Airflow is one of the most popular pipeline orchestration tools out there. It has been around for more than 8 years, and it is used...

Bujar Bakiu
Sep 19, 2022 · 6 min read
Orchestrating Pipelines with Dagster
A complete guide on how to integrate dbt with Dagster and an automated CI/CD pipeline to deploy on an AWS Kubernetes cluster. This blog...


Kejdi Tako
Sep 14, 2022 · 3 min read
Distributed Machine Learning Model Training with Spark (PySpark)
GitHub repo: https://github.com/data-max-hq/pyspark-3-ways What is Spark? Apache Spark was designed to function as a simple API for...

Bujar Bakiu
Aug 29, 2022 · 6 min read
Serving Dog Breed Classification model with Seldon-Core, TensorFlow Serving and Streamlit
GitHub Repo: https://github.com/data-max-hq/dog-breed-classification-ml In a modern Machine Learning workflow, after figuring out the...