The Rebirth of the recipescanner
1 Cooking without ingredients is difficult
Dear Reader, are you fond of cooking? If so, you know the challenges of meal planning and searching for recipes. On top of that, you still have to go out and buy groceries every week. I do not dislike those tasks. They can be fun, but they take time. Time better spent cooking. Personally, I find cooking relaxing, whereas the preparation and shopping often add more stress to my busy schedule.
1.1 Tired of skimming books
There are many apps that can plan meals, guide you through cooking, and organize shopping lists. I did not intend to reinvent the wheel. My specific issue was that my recipes are in books. I have to read through books and then assemble a shopping list. Much like my grandmother would have done.
Tired of skimming books, I was searching for a way to digitize my existing recipe book collection as easily as possible. In April 2019, I decided to build a small sidekick application whose differentiator would be to scan and digitize physical cookbooks. This endeavour was about to breathe new life into the dusty cookbook.
In the beginning, I started prototyping in Python using OpenCV and Tesseract OCR.
I had non-working drafts for the entire software. What I did not have was working software.
This came much to my surprise, as I had followed good engineering practices, including test-driven development. At least what I believed good practices to be. This is related to the Dunning-Kruger effect. A popular but wrong representation is shown below. Why it is misleading, and more on the actual outcome of the study, can be found here.
Test-Driven Development (TDD) is a software development practice where you write tests before writing the implementation. The ideal programming cycle becomes: write a failing test → write just enough code to pass the test → refactor.
To me, in contrast, it meant writing tests close to the implementation, often after the fact. Only later in my life did I come to grasp the idea and benefits of test-first.
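The test-first cycle can be illustrated with a minimal sketch. The function `parse_quantity` and its behaviour are hypothetical and chosen only for illustration; they are not taken from the actual recipe scanner code. The test is written first and would fail until the function below it exists.

```python
import re


# Step 1 (red): the test comes first and describes the desired behaviour.
def test_parse_quantity():
    assert parse_quantity("200 g flour") == (200.0, "g flour")
    assert parse_quantity("2 eggs") == (2.0, "eggs")
    assert parse_quantity("salt") == (None, "salt")


# Step 2 (green): just enough code to make the test pass.
def parse_quantity(line):
    """Split a leading number from the rest of an ingredient line."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s*(.*)", line)
    if match is None:
        return None, line.strip()
    return float(match.group(1)), match.group(2).strip()


# Step 3 (refactor): with the test in place, the code can be cleaned up safely.
test_parse_quantity()
```

Writing the test afterwards, as I used to do, tends to confirm whatever the implementation already does instead of specifying what it should do.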
After two months of experimentation with various new technologies, I had developed a rudimentary approach. It worked based on simple rules of text position and length, as well as some keywords. I relied on domain knowledge of recipe books: the title is at the top, ingredients appear as a block and often start with numbers. I realized the limitations of this approach as I developed a working prototype. Too much variation and the rules would fail. However, I thought of creating just enough data to bootstrap an ML process.
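A minimal sketch of such position and length rules, assuming the OCR text arrives as a list of lines from top to bottom. The names `Recipe` and `extract_recipe` and the length threshold are assumptions for illustration, not the original prototype.

```python
import re
from dataclasses import dataclass, field

STARTS_WITH_NUMBER = re.compile(r"^\s*\d")


@dataclass
class Recipe:
    title: str = ""
    ingredients: list = field(default_factory=list)
    directions: list = field(default_factory=list)


def extract_recipe(lines, max_ingredient_length=40):
    """Classify lines by position (title first) and by simple text rules."""
    lines = [line.strip() for line in lines if line.strip()]
    if not lines:
        return Recipe()
    recipe = Recipe(title=lines[0])  # rule: the title is at the top
    for line in lines[1:]:
        # rule: ingredients are short and often start with a number
        if STARTS_WITH_NUMBER.match(line) and len(line) <= max_ingredient_length:
            recipe.ingredients.append(line)
        else:
            recipe.directions.append(line)
    return recipe


page = [
    "Pancakes",
    "",
    "200 g flour",
    "2 eggs",
    "300 ml milk",
    "Whisk everything into a smooth batter and fry in a hot pan.",
]
recipe = extract_recipe(page)
```

The weakness shows quickly: an ingredient such as "a pinch of salt" lands in the directions, and a short numbered step such as "1. Whisk" lands in the ingredients.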
Then the project went dormant for ten months.
1.2 Machine learning on mobile devices
In the meantime, I had discovered Paprika as my recipe organization app.
The previous app suffered from poor OCR results using Tesseract. Tesseract works best on traditional scans. I was working on phone pictures shot with a shaky hand during low evening light.
I got interested in mobile machine learning applications around that time. Privacy issues become much easier to handle than with cloud-based solutions. In addition, compute costs in data centers could be reduced to almost zero.
I explored Google's ML Kit, which offers on-device OCR optimized for smartphones. I downloaded the corresponding demo app.
Around November 2020, I had figured out how to program an OCR app on Android using Kotlin. Compared to Tesseract, the results were not only better; Google’s structured OCR output also simplified the recipe extraction problem.
My complex problem of getting ingredients from books was reduced to three parts:
- Paprika handles recipe organization and shopping lists. Everything must be in English.
- Google ML Kit is the OCR scanner and translates to English.
- My app focuses on parsing structured OCR output into recipe data.
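Why structured OCR output helps can be sketched as follows. The JSON layout used here (a list of `blocks`, each with a `top` coordinate and its `lines`) is a hypothetical stand-in and not ML Kit's actual output format; the classification rule is likewise only an assumption for illustration.

```python
import json
import re


def numeric_share(block):
    """Fraction of lines in a block that start with a digit."""
    lines = block["lines"]
    if not lines:
        return 0.0
    return sum(1 for line in lines if re.match(r"\s*\d", line)) / len(lines)


def classify_blocks(ocr_json):
    """Assign whole text blocks to title, ingredients, or directions."""
    blocks = sorted(json.loads(ocr_json)["blocks"], key=lambda b: b["top"])
    result = {"title": "", "ingredients": [], "directions": []}
    if not blocks:
        return result
    result["title"] = " ".join(blocks[0]["lines"])  # topmost block is the title
    body = blocks[1:]
    if not body:
        return result
    # The block with the highest share of numeric line starts is the candidate.
    ingredient_block = max(body, key=numeric_share)
    for block in body:
        if block is ingredient_block and numeric_share(block) > 0.5:
            result["ingredients"].extend(block["lines"])
        else:
            result["directions"].extend(block["lines"])
    return result


sample = json.dumps({"blocks": [
    {"top": 300, "lines": ["Whisk everything into a batter.", "Fry in a hot pan."]},
    {"top": 10, "lines": ["Pancakes"]},
    {"top": 120, "lines": ["200 g flour", "2 eggs", "a pinch of salt"]},
]})
parsed = classify_blocks(sample)
```

Because the decision is made per block rather than per line, "a pinch of salt" stays with the ingredients even though it does not start with a number.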
By January 2021, I had the core logic working in Python. In another project, I was working on custom C++ applications for an Android camera. Therefore, I went the C++ route and used the Android NDK. In hindsight, that was a terrible decision as it bloated the tech stack.
From there, it took another 4 months, until April 2021, to reach a working release of the Recipe Scanner.
Even though the app worked and I had cleared many operational stages in the Google Play Store, it never made it to the ultimate public release.
[Screenshots: Recipe Scanner Android Version]
1.3 2023: The year ChatGPT killed the app
The Android app worked well by analysing the JSON output of the Google OCR. The rule system was more complex than before, but it still failed if the books were too complex. In 2022, I decided not to go down the machine learning alley, because I focused on storytelling (this blog) and software engineering management.
By 2023, managing machine learning projects had become part of my professional life. Simultaneously, the emergence of ChatGPT offered new possibilities. Instead of going the difficult route of doing embeddings, I tried simple prompt engineering.
Initially, single-step prompts struggled with accuracy, especially with non-English recipes. To overcome this, I designed the following structured prompt:
The user provides a recipe. Do not translate anything. Create a YAML file formatted as follows:
---
name: My Tasty Recipe
servings: 4-6 servings
prep_time: 10 min
cook_time: 30 min
nutritional_info: 500 calories
difficulty: Easy
notes: |
  add interesting notes here
ingredients: |
  ingredient 1
  ingredient 2
  (do not translate)
directions: |
  list necessary steps
  (do not translate)
Surprisingly, this simple prompt effectively solved the parsing issue.
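Reading the model's reply back into a data structure can be sketched as below. This is a deliberately small, standard-library-only parser that handles just the two shapes used in the template (`key: value` and `key: |` followed by indented lines); the function name `parse_recipe_yaml` and the sample reply are assumptions for illustration. For real use, a full YAML library would be the safer choice.

```python
def parse_recipe_yaml(text):
    """Parse the flat recipe template: plain values and '|' block scalars."""
    recipe = {}
    current_key = None  # key of the block scalar currently being collected
    for raw in text.splitlines():
        if raw.strip() in ("", "---"):
            continue
        if not raw.startswith((" ", "\t")) and ":" in raw:
            key, _, value = raw.partition(":")
            key, value = key.strip(), value.strip()
            if value == "|":
                recipe[key] = []  # following indented lines belong to this key
                current_key = key
            else:
                recipe[key] = value
                current_key = None
        elif current_key is not None:
            recipe[current_key].append(raw.strip())
    return recipe


# Hypothetical model reply for a German recipe (nothing is translated).
reply = """---
name: Pfannkuchen
servings: 4 servings
ingredients: |
  200 g Mehl
  2 Eier
directions: |
  Alles mischen.
  In der Pfanne backen.
"""
recipe = parse_recipe_yaml(reply)
```

The sketch relies on the block content being indented; an unindented ingredient line containing a colon would be mistaken for a new key.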
The remaining work was reduced to producing glue code between the Google API, the ChatGPT API, and the Paprika API to automate recipe uploads to Paprika.
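The shape of that glue code can be sketched without committing to any of the three APIs. All function and parameter names below are hypothetical; the three steps are passed in as plain callables, and the stand-ins in the usage part only imitate what the real services would return.

```python
from typing import Callable


def digitize_page(
    image_path: str,
    run_ocr: Callable[[str], str],
    text_to_recipe: Callable[[str], dict],
    upload: Callable[[dict], None],
) -> dict:
    """Chain the three steps: picture -> text -> structured recipe -> upload."""
    text = run_ocr(image_path)     # step 1: OCR service
    recipe = text_to_recipe(text)  # step 2: LLM with the structured prompt
    upload(recipe)                 # step 3: recipe manager
    return recipe


# Usage with stand-ins instead of real network calls.
uploaded = []
result = digitize_page(
    "page_01.jpg",
    run_ocr=lambda path: "Pancakes\n200 g flour",
    text_to_recipe=lambda text: {"name": text.splitlines()[0]},
    upload=uploaded.append,
)
```

Keeping the services behind callables means each one can be swapped or faked in a test without touching the pipeline itself.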
Taking the pictures had become the most difficult aspect of digitizing recipe books. In a way, this had become a low- to no-code solution.
Frustrated that all the hard work of learning had been for nothing, I stopped development again and did not even bother to write this post.
1.4 What I learned
Selecting the right programming language is critical but challenging. I explored Python, Java, Kotlin, and C++, each with strengths and drawbacks:
- Python: Excellent prototyping, weak Android support
- Kotlin: Intuitive syntax but Android-centric
- Java: Robust but hindered by asynchronous callback complexity
- C++: High performance but complexity and segmentation faults with Android NDK
Large Language Models (LLMs) changed all this.
The value of 90% of my skills just dropped to $0. The leverage for the remaining 10% went up 1000x.
Kent Beck
I agree. Knowing the ins and outs of each language has become low value. For me, what still bears high value is algorithm design and knowledge of the complete software development cycle. For the very technical aspects, a good book on software architecture is Software Architecture in Practice by Len Bass et al.
For the recipescanner, what matters most: data structures, architecture, deployment.
It feels good to have put a piece of software out into the world. Software that, at least for me, makes my life easier.
Hopefully, in the future, it will do the same for other people, as I have decided to continue this project.