đŸ’» The UI Challenge: Serving a PyTorch Model with Flask and FastAPI (Part 2)

In the first part of this series, I forged a powerful machine learning pipeline. It was modular, validated, and lived happily in my terminal. The hard part was over, right? I couldn't have been more wrong. A model that only a developer can run is a solution in search of a problem. The real challenge, the "last mile" of any MLOps project, was to give it a body—an interactive web interface that made it useful to everyone.

This is the story of that last mile: a journey through the clash of two worlds—the predictable, stateful domain of machine learning and the fast, stateless chaos of the web.

The First Great Failure: The Frozen App

I started with Flask because it was familiar. I whipped up a simple UI with a "Start Training" button, hooked it up to my training script, and clicked it with confidence. The terminal sprang to life, loss values started ticking down... and my web browser froze solid. The entire application was locked, unable to serve even a simple request, until the model finished its 10-minute training run.
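
In hindsight the mistake is easy to see. My first attempt looked roughly like this (a reconstruction; run_training stands in for however I invoked the training script):

# A reconstruction of my first attempt: the training loop runs inside
# the request handler, so the worker is busy until it finishes.
from flask import Flask
from train import run_training   # hypothetical entry point into train.py

app = Flask(__name__)

@app.route("/train", methods=["POST"])
def train():
    run_training()                # blocks here for the entire ~10-minute run
    return "Training complete"    # nothing else is served until this returns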

I had fundamentally misunderstood the nature of a web server. It needs to be available, always ready to respond in milliseconds. My long-running ML task was holding it hostage. My first instinct was to use Python's threading, but that quickly led to a tangled mess of race conditions and a terrifying realization that I couldn't easily stop a runaway training thread.

The solution came from thinking about isolation. The training job didn't need to be in the web app; it just needed to be started by it. The subprocess module was the perfect tool.

# The solution: launch training as a completely separate process
import subprocess

command = ["python", "train.py", "--model", model]   # `model` comes from the form
state.training_process = subprocess.Popen(command)   # returns immediately, no blocking

This was the breakthrough. The web app could launch the process and immediately return control to the user, keeping the UI responsive. It was a powerful lesson in decoupling long-running tasks from the request-response cycle.
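
The stored Popen handle also solved the problem that scared me off threads: I could finally check on the run, and even kill it, from another request. A minimal sketch, assuming the state object from the snippet above (endpoint names are illustrative):

# Status and stop endpoints built on the stored Popen handle
from flask import Flask, jsonify
import state                      # shared module holding training_process

app = Flask(__name__)

@app.route("/training-status")
def training_status():
    proc = state.training_process
    if proc is None:
        return jsonify(status="idle")
    # poll() returns None while the child process is still running
    return jsonify(status="running" if proc.poll() is None else "finished")

@app.route("/training-stop", methods=["POST"])
def training_stop():
    proc = state.training_process
    if proc is not None and proc.poll() is None:
        proc.terminate()          # cleanly possible with processes, not threads
    return jsonify(status="stopped")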

The Architectural Epiphany: Drowning in Duplicated Code

With my model training happily in the background, I turned to building out the rest of the features. Then, for the sake of learning and expanding my skills, I decided to build an identical twin of my app using FastAPI.

Within hours, I was drowning. Both app_flask.py and app_fastapi.py had near-identical code for finding saved models, handling image uploads, and checking the status of the training process. I was violating the most sacred rule of software engineering: Don't Repeat Yourself (DRY). A bug fix in one file meant remembering to fix it in the other. It was a maintenance nightmare waiting to happen.

This pain forced me to stop and rethink my entire architecture. The solution was to ask: what is the "business logic" of my app, and can it exist without a web framework? The answer led to the project's most important design decision: the creation of a framework-agnostic webapp package.

I extracted every piece of shared logic into this new layer (a rough sketch follows the list):

  • webapp/logic.py: The "how-to" functions (perform_prediction, start_training_process). It contains zero Flask or FastAPI code.
  • webapp/state.py: A simple global dictionary to track the training status.
  • webapp/config.py: A single source of truth for all paths and settings.
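
Here is a stripped-down sketch of the two smaller modules, reusing the names from the snippet earlier (the real functions handle paths, errors, and more):

# webapp/state.py -- one shared place to track the training run
training_process = None   # the subprocess.Popen handle, or None when idle


# webapp/logic.py -- pure business logic, zero Flask or FastAPI imports
import subprocess

from webapp import state

def start_training_process(model: str) -> bool:
    """Launch train.py as a separate process; refuse if a run is already active."""
    if state.training_process is not None and state.training_process.poll() is None:
        return False
    state.training_process = subprocess.Popen(["python", "train.py", "--model", model])
    return True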

Suddenly, my app_flask.py and app_fastapi.py files became beautiful, thin "wrappers." Their only job was to handle the web stuff (routes, requests) and then call the shared logic layer to do the real work. This 3-tier architecture (Interface -> Logic -> ML Core) was the ultimate "aha!" moment. It made the system clean, testable, and incredibly easy to maintain.
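
To show what "thin" means in practice, here is roughly what the two training routes reduce to (a sketch; the real files also wire up templates and uploads):

# app_flask.py -- a thin Flask wrapper around the shared logic
from flask import Flask, jsonify, request

from webapp.logic import start_training_process

flask_app = Flask(__name__)

@flask_app.route("/train", methods=["POST"])
def train():
    return jsonify(started=start_training_process(request.form["model"]))


# app_fastapi.py -- the FastAPI twin, calling the exact same function
from fastapi import FastAPI, Form

from webapp.logic import start_training_process

fastapi_app = FastAPI()

@fastapi_app.post("/train")
def train(model: str = Form(...)):
    return {"started": start_training_process(model)}

Fix a bug in start_training_process once, and both apps pick it up for free.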

The Final Frontier: A Glimpse of the Cloud

So, here we are. The project has a validated ML brain (food_vision) and a responsive, well-designed body (webapp + FastAPI/Flask). It works beautifully on my local machine. We've journeyed from a chaotic notebook to a sophisticated, multi-layered application. This is a huge victory and a complete project in its own right.

But in the world of MLOps, a local machine is just a sandbox. The true final step is to give our application a home on the internet, where it can be accessed by anyone, anywhere, at any time.

This next phase of the journey would involve a whole new set of challenges:

  • Containerization with Docker: How can we package this entire complex application—Python, PyTorch, system libraries and all—into a single, portable container that runs identically everywhere?
  • CI/CD Pipelines: How do we automate testing and deployment, so that a simple git push can safely update the live application?
  • Cloud Deployment: Where will it live? On AWS, Google Cloud, or Hugging Face Spaces? How do we manage resources, handle scaling, and monitor its health?

The model now has a body and can interact with the world. The final step is to give it a home in the cloud, and that... will be a story for another time.