Espai / Regen
Technical Deep Dive
Project Technologies
Feature Overview
Room restyling combines Stable Diffusion for image generation with a ChatGPT-style UX that turns user input into specific interior setup instructions.
Image Generation Pipeline
The implementation centers around a custom SDXLImageGenerator class responsible for:
- Loading ControlNet, Autoencoder, and SDXL pipelines efficiently.
- Memory optimization utilizing xformers for stable generation.
- Managing dual modes: Text-to-Image and ControlNet Style Transfer (Canny edge detection).
- Streaming intermediate steps via a queue and clearing CUDA caches post-completion.
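The queue-based streaming in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actual SDXLImageGenerator code: `run_generation`, `stream_steps`, and the shape of each progress update are hypothetical, and the diffusion loop is replaced by a stand-in so the example runs without a GPU.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_generation(num_steps, progress_queue):
    """Stand-in for the diffusion loop: reports each step, then signals completion."""
    try:
        for step in range(1, num_steps + 1):
            # A real pipeline would push a decoded preview image here (assumption).
            progress_queue.put({"step": step, "total": num_steps})
    finally:
        # The real generator would clear CUDA caches at this point,
        # e.g. with torch.cuda.empty_cache(); omitted so the sketch has no GPU dependency.
        progress_queue.put(_DONE)


def stream_steps(num_steps):
    """Yield progress updates as the worker thread produces them."""
    progress_queue = queue.Queue()
    worker = threading.Thread(
        target=run_generation, args=(num_steps, progress_queue), daemon=True
    )
    worker.start()
    while True:
        item = progress_queue.get()
        if item is _DONE:
            break
        yield item
    worker.join()


updates = list(stream_steps(4))
```

Putting the sentinel in a `finally` block means the consumer stops waiting even if generation fails partway through.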
Instruction Gen w/ GPT-4 Vision
After a user accepts a restyled design, the final image is passed to GPT-4 Vision. Utilizing a specific prompt and schema blueprint, the system enforces a structured JSON response containing:
"material_types": [...],
"suggested_furniture": [...],
"style_notes": "...",
"search_terms": [...]
Redo Requests & Compute Cost
To manage the significant real-time compute costs incurred via Replicate API calls, a hard limit of three tries per job was implemented. Users could request up to three rapid redos before a full process restart was required, which balanced user creativity against operational cost control during the beta phase.
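A per-job limit like this can be kept with a simple counter. The sketch below assumes an in-memory store keyed by job ID; `RedoBudget` and its method names are hypothetical, and a production service would more likely persist the count alongside the job record.

```python
MAX_REDOS = 3  # the three-try limit described above


class RedoBudget:
    """Tracks how many redos each job has used (in-memory, illustrative only)."""

    def __init__(self, max_redos=MAX_REDOS):
        self.max_redos = max_redos
        self._used = {}

    def try_redo(self, job_id):
        """Consume one redo for the job; return False once the limit is reached."""
        used = self._used.get(job_id, 0)
        if used >= self.max_redos:
            return False
        self._used[job_id] = used + 1
        return True

    def remaining(self, job_id):
        """Number of redos the job can still request."""
        return self.max_redos - self._used.get(job_id, 0)


budget = RedoBudget()
results = [budget.try_redo("job-1") for _ in range(4)]
```

Checking the budget before calling the image API means a rejected redo costs no compute.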

Streaming & Partial JSON Parsing
To deliver a real-time, ChatGPT-like conversational streaming experience inside the mobile app, a specialized PartialJSONParser was engineered in Go.
1. It intercepts incomplete JSON tokens arriving from the OpenAI stream and synthesizes closing braces and brackets on the fly to prevent parser crashes.
2. The successfully parsed, but partial, JSON fragments are immediately serialized into Protocol Buffers.
3. These chunks are streamed via gRPC to the Flutter frontend client, enabling live rendering of design instructions as they are generated.
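The brace-closing idea in the first step can be illustrated with a short sketch. The production parser was written in Go; Python is used here only to show the technique, and `close_partial_json` and `parse_partial` are hypothetical names. The sketch tracks open strings, objects, and arrays, appends the missing closers, and returns `None` when the fragment still cannot be parsed (for example, a key with no value yet). The Protocol Buffers and gRPC steps are left out.

```python
import json


def close_partial_json(fragment):
    """Append the closing quote, brackets, and braces a truncated fragment is missing."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()

    completed = fragment
    if in_string:
        if escaped:
            completed = completed[:-1]  # drop a dangling backslash
        completed += '"'
    else:
        # A trailing comma would make the closed document invalid.
        completed = completed.rstrip().rstrip(",")
    return completed + "".join(reversed(stack))


def parse_partial(fragment):
    """Return the parsed value, or None if the fragment is not yet parseable."""
    try:
        return json.loads(close_partial_json(fragment))
    except json.JSONDecodeError:
        return None


partial = parse_partial('{"style_notes": "warm oak", "search_terms": ["oak sh')
```

Returning `None` instead of raising lets the caller skip an unparseable chunk and wait for more tokens to arrive.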
Real-Time UX in Flutter
The Flutter frontend was fully gRPC-based. Streaming updates were displayed in a live-typing manner to simulate AI 'thinking'. This dramatically improved perceived responsiveness and contributed to Espai's brand as an intelligent interior assistant. I also added an option to save the final setup and view associated design instructions at any time.
Learnings & Takeaways
This feature was both a technical and UX challenge. Key takeaways:
1. Real-time image generation requires thoughtful cost management
2. Structured inputs outperformed open-text for generating usable results
3. Streaming incomplete JSON from GPT-4 required creative backend handling
4. Mimicking ChatGPT's UI created user trust and strong engagement
5. Fine-tuning prompts for JSON output structure made GPT-4 outputs easier to parse and present
