K. Kumar
Available · Junior data engineering · Remote & freelance

Kelash Kumar.

Data Engineer
Sukkur · Pakistan
BS Computer Science
Sukkur IBA · Class of 2026

I build production data systems — streaming pipelines that ship and stay alive under load.

Selected work

§ 01 / Pipelines
01
Live Streaming Production

Real-Time Crypto Streaming Pipeline

Python · Redpanda · PostgreSQL · dbt · Airflow · Metabase · Docker · OCI

A nine-service streaming pipeline that moves live crypto prices from API to dashboard in under five minutes. It runs on a free-tier ARM VM, and CI/CD ships changes in about ninety seconds.

Flow: CoinGecko (ingest) → Redpanda (queue) → PostgreSQL (store) → dbt, 3 layers (transform) → Metabase (visualize). Orchestration: Airflow.
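A minimal sketch of what the ingest step might do before handing records to the queue: flatten a price payload into one event per coin and currency. The payload shape (coin → currency → price), the function name `flatten_prices`, and the field names are assumptions for illustration, not the pipeline's actual code, and the sample values are made up.

```python
from datetime import datetime, timezone


def flatten_prices(payload, fetched_at):
    """Turn {"bitcoin": {"usd": 1.0}} into one event dict per coin/currency pair."""
    events = []
    for coin, quotes in sorted(payload.items()):
        for currency, price in sorted(quotes.items()):
            events.append(
                {
                    "coin": coin,
                    "currency": currency,
                    "price": float(price),
                    "fetched_at": fetched_at.isoformat(),
                }
            )
    return events


# Hypothetical sample payload, not real market data.
sample = {"bitcoin": {"usd": 100.0}, "ethereum": {"usd": 10.0}}
events = flatten_prices(sample, datetime(2026, 1, 1, tzinfo=timezone.utc))
for event in events:
    print(event)
```

Stamping each event with its fetch time is what makes a freshness figure measurable downstream: the dashboard can compare `fetched_at` with the current time.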
Throughput: 14,400/day
Freshness: < 5 min
Deploy: ~ 90 s
Containers: 9 services
02
CDC Open Source

Change-Data-Capture Pipeline

PostgreSQL · Debezium · Redpanda · Python · dbt · Airflow · Docker

Log-based replication that replaces hourly batch loads with streams that lag by under ten seconds. An idempotent consumer survives restarts, and SCD Type 2 dimensions preserve history.

Flow: PostgreSQL WAL events (source) → Debezium (capture) → Redpanda (stream) → idempotent PK upserts (consume) → dbt SCD T2 (history). Bad records go to a DLQ.
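The idea behind idempotent primary-key upserts can be sketched as follows. This is an illustration only: SQLite stands in for PostgreSQL (both accept this `ON CONFLICT ... DO UPDATE` form), and the `customers` table and the `lsn` ordering column are hypothetical names, not taken from the project.

```python
import sqlite3

# Upsert keyed on the primary key; the WHERE guard skips stale events.
UPSERT = """
INSERT INTO customers (id, email, lsn)
VALUES (:id, :email, :lsn)
ON CONFLICT (id) DO UPDATE
SET email = excluded.email, lsn = excluded.lsn
WHERE excluded.lsn >= customers.lsn
"""


def apply_events(conn, events):
    """Apply change events; replaying the same events leaves the table unchanged."""
    with conn:  # one transaction per batch
        conn.executemany(UPSERT, events)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, lsn INTEGER)")

batch = [
    {"id": 1, "email": "a@example.com", "lsn": 10},
    {"id": 1, "email": "a2@example.com", "lsn": 11},
    {"id": 2, "email": "b@example.com", "lsn": 12},
]
apply_events(conn, batch)
apply_events(conn, batch)  # simulated redelivery after a restart

rows = conn.execute("SELECT id, email, lsn FROM customers ORDER BY id").fetchall()
print(rows)
```

Because redelivered events converge to the same rows, at-least-once delivery is enough: duplicates cost a little work but do not corrupt the table.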
Lag: < 10 s
Real-time vs. 60-min batch
Dimensions: SCD T2
Semantics: At-least-once
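For readers unfamiliar with SCD Type 2, the sketch below shows the core rule in plain Python: when an attribute changes, close the current row and open a new one instead of overwriting. In the project this is handled in dbt; the function `apply_scd2` and the row layout here are hypothetical and only illustrate the concept.

```python
from datetime import date


def apply_scd2(history, key, attrs, as_of):
    """Close the current row for `key` if attrs changed, then open a new one."""
    current = next(
        (r for r in history if r["key"] == key and r["valid_to"] is None), None
    )
    if current is not None:
        if current["attrs"] == attrs:
            return history  # no change, nothing to version
        current["valid_to"] = as_of  # close the old version
    history.append({"key": key, "attrs": attrs, "valid_from": as_of, "valid_to": None})
    return history


history = []
apply_scd2(history, 1, {"city": "Sukkur"}, date(2025, 1, 1))
apply_scd2(history, 1, {"city": "Karachi"}, date(2025, 6, 1))
apply_scd2(history, 1, {"city": "Karachi"}, date(2025, 7, 1))  # unchanged, ignored

for row in history:
    print(row)
```

The row with `valid_to` set to `None` is the current version; earlier rows answer "what did this record look like on a given date".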
03
Lakehouse Quality Gates

Medallion Data Lakehouse

MinIO · DuckDB · Pandas · Great Expectations · Airflow · Parquet

Cloud-native lakehouse semantics on object storage: Bronze, Silver, and Gold layers, with Great Expectations gating every promotion to keep the lakehouse clean.

Flow: CSV · JSON · API (source) → Bronze, raw Parquet (land) → Silver, cleaned and validated (refine) → Gold, analytics (serve) → DuckDB SQL engine (query). Gate: Great Expectations.
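The gating idea can be sketched without any library: a batch moves to the next layer only if every check passes. This is a hand-rolled stand-in, not the Great Expectations API, and the function names, column names, and sample rows are hypothetical.

```python
def check_batch(rows, required, non_negative):
    """Return a list of human-readable failures; an empty list means the gate passes."""
    failures = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                failures.append(f"row {i}: {col} is missing")
        for col in non_negative:
            value = row.get(col)
            if value is not None and value < 0:
                failures.append(f"row {i}: {col} is negative")
    return failures


def promote(rows, required, non_negative):
    """Promote a batch to the next layer only if every check passes."""
    failures = check_batch(rows, required, non_negative)
    if failures:
        raise ValueError("gate failed: " + "; ".join(failures))
    return list(rows)


good = [{"order_id": 1, "amount": 9.5}]
bad = [{"order_id": None, "amount": -1.0}]

silver = promote(good, required=["order_id"], non_negative=["amount"])
bad_failures = check_batch(bad, required=["order_id"], non_negative=["amount"])
print(silver)
print(bad_failures)
```

Failing the whole batch, rather than silently dropping bad rows, keeps the problem visible at the layer where it entered.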
Layers: 3 (Bronze · Silver · Gold)
Quality: GE gates
Backfills: Idempotent
Columnar: Parquet
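One common way to make backfills idempotent is to overwrite a whole date partition on every run, so a re-run replaces data instead of appending to it. The sketch below assumes that approach, which the page does not confirm; it writes JSON to a temporary directory as a stand-in for Parquet on object storage, and `write_partition` and the `dt=` layout are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_partition(root, day, rows):
    """Overwrite the partition for `day`, so re-running a backfill never duplicates rows."""
    part_dir = Path(root) / f"dt={day}"
    part_dir.mkdir(parents=True, exist_ok=True)
    target = part_dir / "part-0.json"
    tmp = part_dir / "part-0.json.tmp"
    tmp.write_text(json.dumps(rows))
    tmp.replace(target)  # swap in the new file in one step
    return target


root = tempfile.mkdtemp()
rows = [{"id": 1}, {"id": 2}]
write_partition(root, "2026-01-01", rows)
target = write_partition(root, "2026-01-01", rows)  # backfill re-run

stored = json.loads(target.read_text())
files = sorted(p.name for p in target.parent.iterdir())
print(stored, files)
```

Running the same day twice leaves one file with the same rows, which is the property a scheduler needs in order to retry safely.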

Technical stack

§ 02 / Tooling
Languages
Python · SQL · Bash · Java
Data Engineering
Airflow · dbt · Kafka / Redpanda · Debezium · Great Expectations · Pandas · Medallion · CDC · SCD Type 2
Storage
PostgreSQL · DuckDB · MinIO · Parquet · MongoDB
DevOps & Cloud
Docker · Linux · GitHub Actions · CI/CD · Oracle Cloud · pytest · Caddy · iptables · SSH
Visualization
Metabase · Grafana · Plotly · Power BI · Tableau
Foundations
Database Systems · DSA · Operating Systems · Computer Networks · System Design

About & record

§ 03 / Background

Education & certifications

BS Computer Science
Sukkur IBA University · 2022 — 2026
Database Systems · Data Structures & Algorithms · Operating Systems · Computer Networks · Software Engineering · System Design.
Google Cloud Data Engineer
Coursera · 2026
IBM Data Engineering
Coursera · 2025
Google Data Analytics
Coursera · 2025

Honors

National Skill Competency Test: 92.2nd percentile
NSCT · 2026
Top marks in Programming, Database, and AI/ML & Data Analytics across a 10-subject national competency exam.
Sindh Talent Hunt Program scholarship
STHP · 2022
Fully-funded merit scholarship covering BS Computer Science at Sukkur IBA University.
Three production pipelines shipped
2025 — 2026
All open source. Streaming, CDC, and medallion lakehouse — each fully containerized and orchestrated.
§ 04 / Get in touch

Let's build something good.

Open to junior data engineering roles, freelance work, and collaborations — remote, hybrid, or on-site in Pakistan. The fastest way to reach me is email.