Using AI to Predict User Health Prospects–A Hackathon Story
“Every noble work is, at first, impossible.” Thomas Carlyle.
This article is about the longest week in our lives and the masterpiece it birthed.
Ideation–The Process
Building a relevant product starts from somewhere. For us (Okechukwu Nwagba and me), it started with researching Firebase one week before the deadline and stumbling on one of the many programs hosted by Firebase Builders to promote Firebase products. This particular hackathon was held to relaunch their AI-powered Firebase extensions and to motivate users to build with them on Firebase.
The hours after learning of this opportunity and its restrictions were filled with intensive brainstorming. The question governing the process was, “What solution can we build within the confines of our limitations?” These limitations were:
Time constraint
Use of two or more Firebase extensions
Building an AI-powered solution
From these constraints, we modified our brainstorming question to become: “What AI-powered solution could be built in one week using at least two Firebase extensions?”
Several discarded ideas later, we came up with the idea of using ML modelling to check the physical condition of farm produce, particularly potatoes. However, this defied our time constraint, as it would take more than one week to model products at different points of their lifecycle, from good to bad.
Prototyping and Testing
Ultimately, we redirected the initial idea of ML modelling to pharmaceutical products, for medication prediction and prescription. The solution, an app, would allow users to scan a medication label, use an ML model to extract text from the captured image of the product label, and pass the extracted details, together with custom prompts based on a patient’s health data, to OpenAI’s ChatGPT for insight. With the process flow now defined, we built the app around the three core steps a user would take to utilise it.
1. Scan the Product
This is a simple step: the user places the area to be scanned within the frame. The app utilises the phone’s camera to detect text even as the user brings the camera into focus. High-resolution images with a clear view of the text area can also be uploaded from the phone’s gallery.
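As a rough illustration of this step, here is a minimal Flutter sketch assuming the widely used image_picker package; the package choice and the resolution settings are our illustration here, not necessarily the app’s exact code:

```dart
import 'package:image_picker/image_picker.dart';

/// Captures a label photo with the camera, or picks one from the gallery.
/// The source parameter is the only difference between the two paths.
Future<XFile?> pickLabelImage({bool fromGallery = false}) async {
  final picker = ImagePicker();
  return picker.pickImage(
    source: fromGallery ? ImageSource.gallery : ImageSource.camera,
    // Request a high resolution so small label text stays legible for OCR.
    maxWidth: 2048,
    imageQuality: 100,
  );
}
```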
2. Extract the Details
Our initial pick for this step was the Firebase Cloud Vision ML Kit. Ideally, this ML Kit would extract the text in the image by grouping it into the following (a code sketch of this hierarchy follows the list):
Lines: words adjacent to each other on the same axis.
Elements: alphanumeric characters grouped into individual words.
Blocks: contiguous sets of text lines, such as paragraphs or columns.
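To make the grouping concrete, here is a minimal Dart sketch of the same block/line/element hierarchy, using the on-device google_mlkit_text_recognition Flutter package as a stand-in (an assumption; the kit we started with was the cloud-based one):

```dart
import 'package:google_mlkit_text_recognition/google_mlkit_text_recognition.dart';

/// Extracts text from a label image and walks the block/line/element
/// hierarchy described above.
Future<String> extractLabelText(String imagePath) async {
  final recognizer = TextRecognizer(script: TextRecognitionScript.latin);
  try {
    final input = InputImage.fromFilePath(imagePath);
    final result = await recognizer.processImage(input);
    for (final block in result.blocks) {        // paragraphs or columns
      for (final line in block.lines) {         // words on the same axis
        for (final element in line.elements) {  // individual words
          print(element.text);
        }
      }
    }
    return result.text; // the full recognised text
  } finally {
    await recognizer.close();
  }
}
```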
Building the prototype fell to me, and I started immediately, making good progress in a couple of days. However, during experimentation I found the Firebase Cloud Vision ML Kit lacking: the cloud dependency made the process error-prone and resulted in a broken user experience.
As such, I had to pivot to localising the image capture process so that users could capture the labels properly, thus minimising human error. To this effect, I switched from the Firebase Cloud Vision ML Kit to the Google Commons ML Kit for Flutter. This tool guides users to take acceptable images through on-screen prompts as they capture them. The captured image is then written to a Firebase document, which triggers a cloud function to extract the text, clean it up, and return it to our database.
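A hedged sketch of this hand-off, assuming Firebase Storage holds the image and a Firestore collection named scans is what the extraction function listens on (the collection, field names, and path are illustrative, not our production schema):

```dart
import 'dart:io';

import 'package:cloud_firestore/cloud_firestore.dart';
import 'package:firebase_storage/firebase_storage.dart';

/// Uploads the captured label image, then writes the document that
/// triggers the text-extraction cloud function.
Future<DocumentReference> submitScan(File image) async {
  final storageRef = FirebaseStorage.instance
      .ref('labels/${DateTime.now().millisecondsSinceEpoch}.jpg');
  await storageRef.putFile(image);

  return FirebaseFirestore.instance.collection('scans').add({
    'imagePath': storageRef.fullPath,
    'status': 'pending', // the function flips this once text is ready
    'createdAt': FieldValue.serverTimestamp(),
  });
}
```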
3. Use Custom Prompts to Get Insight from ChatGPT
From this point, the app grabs the text, processes it, and allows users to fill in their health details; alternatively, it can draw on health data already stored in the app. Finally, the extracted text and user health data are passed to ChatGPT.
Now the user can ask questions, and ChatGPT gives personalised responses, as it has the necessary background knowledge to make reasonable predictions.
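A minimal sketch of such a call, assuming the OpenAI chat completions REST endpoint via Dart’s http package (the model name and prompt wording here are illustrative):

```dart
import 'dart:convert';
import 'package:http/http.dart' as http;

/// Sends the extracted label text and the user's health details as
/// background context, then asks the user's question.
Future<String> askChatGpt({
  required String apiKey,
  required String labelText,
  required String healthProfile,
  required String question,
}) async {
  final response = await http.post(
    Uri.parse('https://api.openai.com/v1/chat/completions'),
    headers: {
      'Authorization': 'Bearer $apiKey',
      'Content-Type': 'application/json',
    },
    body: jsonEncode({
      'model': 'gpt-3.5-turbo',
      'messages': [
        {
          'role': 'system',
          'content': 'Product label: $labelText\n'
              'Patient health data: $healthProfile\n'
              'Answer questions about this product for this patient.',
        },
        {'role': 'user', 'content': question},
      ],
    }),
  );
  final decoded = jsonDecode(response.body) as Map<String, dynamic>;
  return decoded['choices'][0]['message']['content'] as String;
}
```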
The Result
What started as an idea to apply AI modelling to medication prescription and prediction evolved into something bigger. The app now works not only for medication but also reads product labels across different categories, using that information to respond to prompts about the scanned product, its applications, outlook, and more.
Much work is still required to get the prototype ready for large-scale use; however, its relevance was proven by the hackathon result, where it finished among the top 8 solutions. You can find the demo video here,
and email us at hello@technomad.pro if you would like to partner with or fund this project. To join the beta tester program (Android only for now), kindly follow this link.