

Writing, Doing, and Building an ML Productivity Pipeline

Machine Learning
Knowledge Work
Software Engineering
How I balance writing and rapid prototyping to learn ML faster: using Jupyter, Kaggle, Hugging Face, and Quarto to turn experiments into publishable apps.
Author

Dominik Lindner

Published

November 11, 2025

1 Two ways to learn new software technologies

I tend to learn in two modes. Sometimes I rush in: open the editor, copy a snippet, patch things together until it runs. Other times I slow down: write as I go, explain decisions, leave a trail I can follow later. The first approach builds momentum while the second builds understanding. The hard skill is knowing when to switch.

Working with modern machine learning tools such as Kaggle notebooks, Hugging Face Spaces, and Gradio, I wanted a workflow that stays flexible. Writing can feel slower in the moment, but it clarifies thinking and prevents rework. The speedy approach, in contrast, is energizing, yet without notes I quickly lose track.

The solution, for me, is a simple pipeline where exploration, implementation, and communication are part of the same loop.

This post outlines that loop. I drafted much of it while building a small computer-vision project: the cheese classifier.

2 The pipeline at a glance

  1. Problem definition and model training
  2. An app for model inference
  3. Writing and publishing

In practice, this linear process runs in cycles: training informs the demo, the demo shapes the story, and the story clarifies what to train next.

2.1 Stage 1: problem definition and model training

Most projects begin with a problem statement and end with a model. When I’m learning, I often do the reverse: I try a new tool or method that I find interesting, then look for a problem it can help solve. A word of warning: that approach is definitely not how you should build a successful commercial product.

Machine learning splits cleanly into training and inference. Inference behaves like ordinary software: functions with inputs and outputs. Training is exploratory: choices about data, features, and objectives evolve as you learn. Without a record, it’s easy to lose your way.
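To make that split concrete, here is a deliberately tiny sketch: training produces an artifact, and inference is a plain function that consumes it. The function bodies and class names are illustrative stand-ins, not code from the project.

```python
from pathlib import Path
import json

def train(model_path: Path) -> None:
    """Exploratory phase: data, features, and objectives keep changing."""
    model = {"classes": ["gouda", "brie", "cheddar"]}  # stand-in for real training
    model_path.write_text(json.dumps(model))

def predict(model_path: Path, image_path: Path) -> str:
    """Inference phase: a plain function with a fixed input/output contract."""
    model = json.loads(model_path.read_text())
    return model["classes"][0]  # stand-in for running the real model on the image

train(Path("model.json"))
print(predict(Path("model.json"), Path("samples/gouda.jpg")))
```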

This is where notebooks earn their place. Jupyter notebooks let code, results, and reasoning sit together. If you keep them in a lab-journal style, with short comments on each decision, you can see what changed and why when you refer back to them.

However, raw notebooks don’t play well with Git. Tools like nbdev help by turning notebooks into maintainable modules. When you work with Kaggle data, it can also be better to run your code locally or in another remote environment. I use a small environment-check snippet so that notebooks run both inside and outside Kaggle, including Kaggle auth and data import.
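A minimal sketch of such a check; the dataset slug and directory names below are placeholders, not the ones from the actual project:

```python
import os
from pathlib import Path

def on_kaggle() -> bool:
    """Kaggle kernels set this environment variable; local machines do not."""
    return "KAGGLE_KERNEL_RUN_TYPE" in os.environ

if on_kaggle():
    # Datasets attached to the kernel are already mounted under /kaggle/input.
    DATA_DIR = Path("/kaggle/input/cheese-images")  # placeholder dataset name
else:
    # Locally: authenticate via ~/.kaggle/kaggle.json and download once.
    import kaggle
    DATA_DIR = Path("data/cheese-images")
    if not DATA_DIR.exists():
        kaggle.api.dataset_download_files(
            "username/cheese-images",  # placeholder slug
            path=str(DATA_DIR),
            unzip=True,
        )
```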

2.2 Stage 2: building the app

What better badge for your portfolio than turning a small demo into a public one? I use Gradio, hosted on Hugging Face Spaces, to wrap inference in a minimal interface.
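A sketch of what such an app can look like, assuming the model was exported with fastai; the file name and title are placeholders, and any framework with a predict function would work just as well:

```python
# app.py on a Hugging Face Space
import gradio as gr
from fastai.vision.all import load_learner, PILImage

learn = load_learner("cheese_classifier.pkl")  # placeholder export name

def predict(img):
    """Run one image through the exported model and return class probabilities."""
    _, _, probs = learn.predict(PILImage.create(img))
    return {c: float(p) for c, p in zip(learn.dls.vocab, probs)}

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs=gr.Label(num_top_classes=3),
    title="Cheese classifier (demo)",
)

if __name__ == "__main__":
    demo.launch()
```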

The app shifts the mindset from internal exploration to external communication.

A public demo also sets a direction for the write-up. If a reader can click and see the behavior, the post can focus on choices and trade-offs rather than screenshots. The reverse also holds: without a demo, I fall back to screenshots in the post.

2.3 Stage 3: writing and publishing

Writing closes the loop. It converts a set of experiments into a sequence of decisions; as in all good research writing, the order of the narrative need not match the order of the work. I publish with Quarto because it works with Markdown, Jupyter, and Git out of the box, so the pipeline feels coherent rather than stitched together. See also my other article.
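Part of what makes this feel seamless is that Quarto reads rendering options straight from `#|` comments at the top of a notebook cell. The label, caption, and numbers below are purely illustrative:

```python
#| label: fig-loss
#| fig-cap: "Training loss per epoch (illustrative numbers)."
#| echo: false
import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4, 5]
loss = [0.90, 0.62, 0.48, 0.41, 0.38]  # placeholder values, not real results

plt.plot(epochs, loss, marker="o")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.show()
```

With `echo: false`, the rendered post shows only the figure and its caption, not the code.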

There are two workable ways to organize files:

  • Single repository: experiments, app, and post all live together. Simple to navigate, slightly messier over time.

  • Separate repositories with selective syncing: experiments and app live in their own project directories, and curated notebooks flow into the blog. Cleaner in the long term, but requires a few small helper scripts.

I use the second approach. In my blog, all imported files live under projects/, sometimes with subdirectories. When a notebook is worth sharing, I clean it up in the original repo and sync it into the blog repository. While this means I lose some linking ability in my notes app Obsidian, it keeps the writing close to the work.
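A hypothetical sketch of such a sync helper; the repository paths and directory names are placeholders for whatever your projects actually use:

```python
# sync_post.py: copy a curated notebook from a project repo into the blog
import shutil
from pathlib import Path

PROJECT_REPO = Path("~/code/cheese-classifier").expanduser()  # placeholder path
BLOG_REPO = Path("~/code/blog").expanduser()                  # placeholder path

def sync_notebook(name: str, subdir: str = "cheese-classifier") -> Path:
    """Copy one notebook into projects/<subdir>/ so Quarto picks it up as a post."""
    src = PROJECT_REPO / "notebooks" / name
    dst = BLOG_REPO / "projects" / subdir / name
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst

if __name__ == "__main__":
    print(sync_notebook("cheese_training.ipynb"))
```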

3 Summary snippets

  • Decide the mode before you start: sprint when you need momentum; write when choices are piling up and you can’t see the path.

  • Document decisions in Jupyter notebooks, not everything: note what changed, why it changed, and what you learned.

  • Treat the demo as a learning tool: if the interface feels confused, the model probably is; fixing the demo often clarifies the model.

  • Keep writing near the code: whether you use one repo or two, reduce the distance between experiments and narrative.

  • Publish smaller; publish sooner: a short post attached to a working demo beats everything.

4 Conclusion

To paraphrase everyone’s favourite chatbot: productivity isn’t doing more; it’s designing loops that help you learn while you work.


