Tanel Treuberg

A short introduction.

I'm a software engineer with three years across backend, infrastructure, and applied ML. Most recently I refactored internal tooling at Estonia's national grid operator — cutting data-export time 10× and UI load from ~8s to under 2s — and shipped Python automation that erased ~15 hours of manual ops work per month. Before that, real-time computer vision on autonomous Navy patrol vessels: TensorRT FP16 inference on Jetson AGX Xavier, averaging a 3.5× speedup with a 27% drop in energy use. Based in Warsaw.

Projects worth showing.

Case study

Vision pipeline for edge inference

Context

A 2023 bachelor’s thesis at TalTech: a real-time computer-vision system that gives an autonomous Baltic Workboats Navy 18 WP patrol vessel the situational awareness it needs to run its own sea trials.

A joint project between TalTech’s Department of Electrical Power Engineering and Mechatronics, the TalTech Small Boat Competence Center, and Baltic Workboats. The goal was to automate käigukatsed (Estonian for the sea trials that measure a vessel’s speed and maneuverability), which had traditionally required a human captain to run the boat manually while a second person logged readings on paper. Different captains meant inconsistent inputs; manual logging meant inconsistent data. Removing the captain from the loop needed a vessel that could see what was around it. The thesis built the perception layer: an embedded computer-vision system mounted in the boat’s forward control mast, fused with the existing X-band radar, AIS, and wind sensor, feeding object detections to the autopilot. Solo implementation, mentored on hardware and hypothesis by Heigo Mõlder (PhD) and Karl Janson (PhD).

The problem

Three camera streams at 1080p+, real-time ship detection at ≥15 FPS and ≥75% accuracy, on a sealed embedded box mounted in a boat mast — without monopolising the same Jetson that ran the radar bridge and the autopilot’s control loop.

The brief was concrete: simultaneous input from three cameras (≥1000×1000 px each), ship-detection accuracy ≥75%, throughput ≥15 FPS, all running in a sealed enclosure exposed to weather and a wide thermal envelope. Software-wise, the harder constraint was sharing the embedded computer — radar, AIS, and the autopilot’s control loop all lived on the same Jetson, so leaving CPU and GPU headroom was non-negotiable. The early version of the pipeline failed both ways: pulling raw video off three camera streams saturated the CPU on its own (radar communication degraded, thermal headroom evaporated), and the unoptimised PyTorch models barely touched the GPU’s actual capacity. The system needed re-architecting so each piece of silicon did the work it was best at.
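To make the throughput requirement concrete, here is a back-of-the-envelope sketch of the time budget per frame. It assumes (this is not stated in the brief) that the 15 FPS target applies to each of the three cameras and that frames pass through a single inference queue one at a time. The function name is illustrative.

```python
def per_frame_budget_ms(cameras: int, fps_per_camera: float) -> float:
    """Milliseconds available per frame if all streams share one serial queue.

    Assumption: every camera must sustain fps_per_camera on its own, and
    frames are processed one after another rather than batched.
    """
    if cameras <= 0 or fps_per_camera <= 0:
        raise ValueError("cameras and fps_per_camera must be positive")
    frames_per_second = cameras * fps_per_camera
    return 1000.0 / frames_per_second


if __name__ == "__main__":
    # Three cameras at 15 FPS each means 45 frames per second in total.
    budget = per_frame_budget_ms(cameras=3, fps_per_camera=15)
    print(f"{budget:.1f} ms per frame")  # 22.2 ms per frame
```

Under those assumptions, decode, preprocessing, inference, and postprocessing together have about 22 ms per frame. Batching the three frames into one inference call would change the arithmetic.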

The approach

Containerised PyTorch + YOLOv5 (Ultralytics) pipeline on a Jetson AGX Xavier 32GB, with NVDEC-accelerated camera decoding through GStreamer, and TensorRT FP16 conversion via ONNX — built mirrored across x86_64 dev and aarch64 production containers.

Compute: Jetson AGX Xavier 32GB, chosen against the smaller Jetsons specifically for its 512-core Volta GPU, 64 Tensor Cores, dual NVDLA deep-learning accelerators, and 256-bit LPDDR4x memory at 136.5 GB/s. That was enough headroom to run a larger model and still leave CPU cycles for the radar and AIS bridging code.

Cameras: started with the SurveilsQUAD Sony IMX290 multi-camera system, hit a cable-length limitation in the mast enclosure, and switched to Arducam Fisheye 5MP modules. The layout flexibility was worth the slight resolution trade-off.

Model: YOLOv5 (Ultralytics), chosen over Faster R-CNN, SSD MobileNet V2, and EfficientDet because of its single-pass architecture (CSPNet backbone, PANet neck, YOLO head) and the cleanest path through TensorRT.

Pipeline: Docker containers mirrored across x86_64 (dev, NGC PyTorch base) and aarch64 (production, L4T-ML base) so the same code ran on both. Small architecture-detecting helpers swapped the GStreamer decoder element (nvv4l2decoder on Jetson, software path in dev).

Optimisation: PyTorch → ONNX → TensorRT FP16 conversion to push the model onto the Tensor Cores; OpenCV reserved for what the GPU couldn’t do.
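The architecture-detecting helper mentioned above could look roughly like this. It is a sketch, not the thesis code: the function names, the H.264-over-RTSP source, and the avdec_h264 software decoder are assumptions made for illustration. Only nvv4l2decoder on the Jetson comes from the description above.

```python
import platform
from typing import Optional


def pick_decoder(machine: Optional[str] = None) -> str:
    """Return the GStreamer H.264 decoder element for the current machine.

    aarch64 is treated as the Jetson and gets the hardware decoder.
    Anything else gets a software decoder (avdec_h264 is an assumption).
    """
    machine = machine or platform.machine()
    return "nvv4l2decoder" if machine == "aarch64" else "avdec_h264"


def build_pipeline(rtsp_url: str, machine: Optional[str] = None) -> str:
    """Build a pipeline string that ends in an appsink delivering BGR frames."""
    decoder = pick_decoder(machine)
    stages = [
        f"rtspsrc location={rtsp_url} latency=0",
        "rtph264depay",
        "h264parse",
        decoder,
    ]
    if decoder == "nvv4l2decoder":
        # Copy frames out of NVMM memory before the CPU-side conversion.
        stages += ["nvvidconv", "video/x-raw,format=BGRx"]
    stages += [
        "videoconvert",
        "video/x-raw,format=BGR",
        # Keep only the newest frame so a slow consumer never builds a backlog.
        "appsink drop=true max-buffers=1",
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    # Hypothetical camera address, shown for both target architectures.
    print(build_pipeline("rtsp://192.0.2.10/stream", machine="aarch64"))
    print(build_pipeline("rtsp://192.0.2.10/stream", machine="x86_64"))
```

A string like this can be handed to OpenCV's VideoCapture with the GStreamer backend, provided OpenCV was built with GStreamer support. Keeping the switch in one function means the rest of the code never needs to know which container it runs in.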

The results

Average 3.5× inference speedup (best case 4.3×) from TensorRT FP16 optimisation, with a negligible 1.7×10⁻³ mAP loss, while average power draw dropped 26.8% (20.1 W → 14.7 W) and GPU utilisation went from 43.1% to 99.8%.

Measured over 1000 inference iterations on a Jetson AGX Xavier 32GB, across YOLOv5n/n6/s/s6/m/m6/l/l6/x/x6 at 640×640 and 1280×1280 input resolutions, with one and three parallel camera inputs. The headline numbers: average 3.5× speedup after TensorRT FP16 conversion, with the best model (YOLOv5l6) running 4.3× faster than its unoptimised PyTorch counterpart. Accuracy was effectively preserved: average mAP loss across the model lineup was 1.7×10⁻³, well within margin of error for the application. Power and thermal envelope improved in lockstep: average power draw fell 26.8% (20.1 W → 14.7 W), operating temperature fell 1.5°C (46°C → 44.5°C), and GPU utilisation rose from 43.1% to 99.8%. The GPU was finally doing the work the CPU had been doing badly. Accuracy plateaued at the YOLOv5m / YOLOv5l models; bigger architectures didn’t earn their inference cost on this dataset, so the deployed model sat at the knee of the curve.
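For readers who want to check the arithmetic, the sketch below recomputes the relative changes from the rounded figures quoted above. The helper names are illustrative. The rounded wattages give about 26.9%, which is close to the quoted 26.8%; that figure presumably comes from unrounded measurements.

```python
def percent_drop(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before * 100.0


def relative_latency(speedup: float) -> float:
    """Fraction of the baseline latency that remains after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


if __name__ == "__main__":
    # Average power draw: 20.1 W before, 14.7 W after.
    print(f"power drop: {percent_drop(20.1, 14.7):.2f}%")
    # Operating temperature: 46 °C before, 44.5 °C after.
    print(f"temperature change: {46.0 - 44.5:.1f} °C")
    # A 3.5x speedup leaves about 29% of the original per-inference latency.
    print(f"latency remaining: {relative_latency(3.5) * 100:.1f}%")
    # GPU utilisation gain in percentage points.
    print(f"GPU utilisation gain: {99.8 - 43.1:.1f} points")
```

Because power fell while throughput rose, energy per inference falls by more than either number alone suggests: roughly 14.7 / 20.1 / 3.5 of the baseline on the average figures.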

Case study

Remote test bench for electric motors

Context

Strato_Pi was a five-person TalTech practical product engineering project: retrofit a benchtop electrical-motor test rig with remote control so coursework could continue during the COVID lockdown without anyone in the lab.

The course brief landed during the COVID period: a benchtop motor measurement rig at TalTech needed someone physically in the lab to operate it — adjust load, read torque / current / RPM off the panel, log values by hand. With campus access restricted, the rig had become a bottleneck for the courses that depended on it. Our team of five — two backend engineers, two frontend engineers, and me as full-stack integration lead — had one semester to make the same measurements happen from a browser, without sacrificing data trustworthiness.

The problem

The rig was built for in-person operation — local knobs, paper logs, eyeball-on-the-multimeter — and offered no API, no live telemetry, no way to drive it from outside the lab.

Pre-pandemic, a student walked into the lab, set the motor load with a dial, waited for readings to stabilise, wrote values into a notebook, repeated. There was no software-controllable load command, no streaming readings, no historical record beyond the notebook. To turn this into a remote rig, we had to build the entire interaction layer from scratch: a control surface the hardware would accept, a backend that exposed it safely, a frontend students could actually use, and a deployment story that would keep the rig accessible to the course every weekday for the rest of the semester — all on top of a Strato Pi (a Raspberry-Pi-based industrial controller from Sfera Labs).

The approach

Five-person team, two months end-to-end: two backend engineers built the control + telemetry layer on the Strato Pi, two frontend engineers built the student-facing dashboard, and I owned the integration — the front-to-back contract, deployment, and the public web surface.

The system landed in three layers. Backend (Python on the Strato Pi) drove the motor controller, sampled telemetry at the rate the course needed, and exposed it as an HTTP API plus a live telemetry stream. Frontend (vanilla HTML / CSS / JS) gave students a live dashboard with control inputs, the streaming readings, and an in-page documentation panel for the course material. My slice was the join: nailing the API contract so the two halves could work in parallel without thrashing each other, packaging the whole stack into a reinit.sh script that redeployed the system to the production server with a single command, and putting the rig behind a public web surface with the right access controls so students could reach it from home. The repo’s deliberate split — Frontend/ and Backend/ folders for in-progress work, a separate myproject/ folder for what actually deployed — fell out of that integration workflow.
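An API contract of the kind described above can be pinned down as a pair of typed messages that both halves code against. This is a minimal sketch using only the standard library. The field names, units, and the 0 to 100 load range are hypothetical, not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LoadCommand:
    """Request body the frontend sends to set the motor load (hypothetical)."""
    load_percent: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.load_percent <= 100.0:
            raise ValueError("load_percent must be between 0 and 100")


@dataclass(frozen=True)
class TelemetryReading:
    """One sample the backend streams to the dashboard (hypothetical fields)."""
    timestamp_s: float
    torque_nm: float
    current_a: float
    rpm: float


def parse_load_command(body: str) -> LoadCommand:
    """Validate a JSON request body and turn it into a LoadCommand."""
    data = json.loads(body)
    if not isinstance(data, dict) or set(data) != {"load_percent"}:
        raise ValueError("body must be an object with exactly 'load_percent'")
    value = data["load_percent"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("load_percent must be a number")
    return LoadCommand(float(value))


def encode_reading(reading: TelemetryReading) -> str:
    """Serialise a reading with stable key order for the telemetry stream."""
    return json.dumps(asdict(reading), sort_keys=True)


if __name__ == "__main__":
    print(parse_load_command('{"load_percent": 40}'))
    print(encode_reading(TelemetryReading(12.5, 1.8, 3.2, 1450.0)))
```

With the shapes fixed like this, the frontend can develop against canned JSON while the backend is still wiring up the hardware, which is the point of agreeing the contract first.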

The results

Shipped a working MVP in 2 months, graded 5/5 by the course, and the rig ran without interruption for 3 months of remote coursework.

Full MVP delivered and demoed end-to-end at the two-month mark, with a 5/5 course grade. After delivery, the rig stayed in service for three months without an interruption — students booked remote sessions, ran their measurement sequences, and pulled the recorded readings without needing campus access. The integration discipline held across the run: a clean API contract between the frontend and backend halves let both teams iterate independently, and the single-command redeploy meant operating the rig didn’t depend on whichever team member had touched the code last.

Case study

Edge-native commerce stack

Context

Organic Flow is a Polish dance school running Brazilian Zouk classes, weekend retreats around Poland, and occasional ski trips abroad — previously selling registrations through WordPress + WooCommerce.

Organic Flow organises Brazilian Zouk dance classes — regular weekly sessions, weekend retreats around Poland, and the occasional ski trip abroad. Booking and payment ran through a WordPress 6.7.1 install with WooCommerce 9.4.3 and a GTranslate plugin for Polish/English copy. The stack worked — registrations happened — but the operating model leaked into everything: each plugin needed patching, theme updates risked layout regressions, abuse defense leaned on whatever plugin du jour, and a single copy change went through a CMS the brand owner didn’t fully trust. The rebuild brief was simple: keep the booking flow, keep the brand voice, but trade the WordPress sandwich for something the brand owner could run alone.

The problem

The WordPress + WooCommerce sandwich shipped 5× more HTML per page, fanned out across 13 separate asset files, and locked the brand into a plugin maintenance loop just to keep the standard stack standing.

The old homepage rendered as 121 KB of HTML, most of it WooCommerce theme markup the visitor never saw, and pulled in eight JavaScript files and five stylesheets before the cart even rendered. The architectural cost was high too: i18n, payment, captcha, rate limiting, transactional email, and admin auth were each a separate plugin surface with its own update cycle and failure mode, and the abuse-defense story changed whenever a security plugin was deprecated. None of the plugins were wrong. They just compounded into a maintenance loop nobody wanted to be inside.

The approach

Traded the WP plugin sandwich for a tight custom stack: Astro SSR on Cloudflare Workers, Supabase for everything stateful, Przelewy24 + bank transfer for payment, Resend for email, layered Cloudflare primitives for abuse defense.

The new stack is deliberately small. Astro SSR on Cloudflare Workers renders pages at the edge — Workers runtime, not Pages — with partial hydration meaning the default response ships almost no JavaScript. Supabase holds everything stateful: product catalog (including admin-edited markdown bodies), cover images via Storage, auth via SSR session cookies. Payments run through Przelewy24 (Poland’s standard gateway) with a manual bank-transfer fallback for customers who prefer it — that fallback is a real reservation, expired by a second cron Worker after seven days. Abuse defense is layered: Cloudflare Turnstile on auth/cart, Rate Limit bindings on the same endpoints, CSRF double-submit tokens on every POST, signed Przelewy24 webhook verification, and admin pages return 404 (not 403) to non-admins so they leave no trace. Outbound email runs through Resend on a separate billing subdomain. Order state mirrors to a Google Sheet on every change, so the organiser can read the ledger without logging in. The brand owner edits products through a custom /admin web form; changes reflect on the public site within ~60 seconds without a redeploy.
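The CSRF double-submit check in that list works by sending the same token in a cookie and in the request body or a header, then rejecting any POST where the two differ. The sketch below illustrates the idea in Python. The production stack runs on Cloudflare Workers, so this is not the site's code, and signing the token with HMAC is an added assumption (a common hardening), not something the description confirms.

```python
import hashlib
import hmac
import secrets


def _sign(secret: bytes, nonce: str) -> str:
    """HMAC-SHA256 signature that binds a nonce to the server secret."""
    return hmac.new(secret, nonce.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_csrf_token(secret: bytes) -> str:
    """Create a token to set as a cookie and embed in each form."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_sign(secret, nonce)}"


def is_valid_csrf(secret: bytes, cookie_token: str, submitted_token: str) -> bool:
    """Accept a POST only if cookie and submitted token match and verify."""
    if not cookie_token or not submitted_token:
        return False
    # Double submit: the value from the cookie must equal the submitted value.
    if not hmac.compare_digest(
        cookie_token.encode("utf-8"), submitted_token.encode("utf-8")
    ):
        return False
    nonce, separator, signature = submitted_token.partition(".")
    if not separator:
        return False
    # Signature check: the token must have been issued with this secret.
    expected = _sign(secret, nonce)
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    secret = b"demo-only-secret"  # hypothetical; load from configuration
    token = issue_csrf_token(secret)
    print(is_valid_csrf(secret, token, token))            # True
    print(is_valid_csrf(secret, token, "forged.value"))   # False
```

The constant-time comparisons avoid leaking how many leading characters of a guess were correct. A cross-site attacker can trigger the POST but cannot read the cookie, so cannot supply a matching value.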

What got chosen against: keeping WooCommerce (too much surface area for too little signal), Next.js (heavier default JS bundle, more runtime to ship at the edge), Stripe (Polish customers expect Przelewy24 by default).

The results

5× smaller HTML on the wire, sub-500ms TTFB, content edits live in ~60 seconds, and abuse defense built from Cloudflare primitives that don’t need update cycles.

The homepage HTML dropped from 121 KB to 25.4 KB — a 5× reduction on the wire — with TTFB sub-500ms from a cold connection. JavaScript ships only where there’s an island that needs it (registration wizard, Turnstile widget, cart badge); everything else is server-rendered HTML the browser can use immediately. The admin workflow is the quieter win: product copy, prices, cover images, and active/inactive state are now self-serve, with changes visible to a logged-out visitor within ~60 seconds — no developer involved. Abuse defense moved from “whichever plugin we trust this month” to layered Cloudflare primitives (Turnstile + Rate Limits + CSRF + signed webhooks) that don’t require update cycles. Per-PR Cloudflare preview URLs catch layout regressions before merge; the cron Worker quietly cleans up stale orders every five minutes without supervision.
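The stale-order cleanup can be pictured as one pass over pending bank-transfer reservations, run on a schedule. The sketch below shows that logic in Python under stated assumptions: the status names, the Order shape, and the in-memory list are hypothetical, and the real job runs as a cron Worker against Supabase. The seven-day window is the one quoted for bank-transfer reservations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List

# Bank-transfer reservations are held for seven days before expiring.
RESERVATION_TTL = timedelta(days=7)


@dataclass
class Order:
    order_id: str
    status: str  # hypothetical values: "pending_transfer", "paid", "expired"
    created_at: datetime


def expire_stale(orders: List[Order], now: datetime) -> List[str]:
    """Mark unpaid reservations older than the TTL as expired.

    Returns the ids that changed. Running it again with the same inputs
    changes nothing, so a frequent schedule is safe.
    """
    expired_ids = []
    for order in orders:
        is_pending = order.status == "pending_transfer"
        if is_pending and now - order.created_at >= RESERVATION_TTL:
            order.status = "expired"
            expired_ids.append(order.order_id)
    return expired_ids


if __name__ == "__main__":
    now = datetime(2025, 1, 10, tzinfo=timezone.utc)
    orders = [
        Order("a", "pending_transfer", datetime(2025, 1, 1, tzinfo=timezone.utc)),
        Order("b", "pending_transfer", datetime(2025, 1, 5, tzinfo=timezone.utc)),
        Order("c", "paid", datetime(2025, 1, 1, tzinfo=timezone.utc)),
    ]
    print(expire_stale(orders, now))  # ['a']
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the rule easy to test with fixed dates.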

Tools I actually reach for.

Languages

Python
C++
JavaScript
SQL
Bash
Swift

AI / ML

PyTorch
TensorFlow
TensorRT
ONNX
OpenCV
GStreamer
Nvidia Jetson

Back end

Flask
REST APIs

Databases

PostgreSQL

DevOps

Docker
Jenkins
GitLab CI/CD
Kubernetes
Grafana
ELK
AWS S3

Tools

Git
Postman
Selenium
Scrapy

If you have a thoughtful thing to put into the world,
so do I.

tanel.treuberg@gmail.com LinkedIn GitHub
