Large Language Models (LLMs) like ChatGPT have rapidly become commonplace tools. They're the most transparent AI application I can think of: users know they're interacting with an AI. Despite that, I see chat as a complex and difficult user experience for most applications. It's similar to interacting with your computer via the command line: powerful, but with a high barrier to entry. The way user-friendly graphical operating systems like macOS won over everyday users hints that chatting with AI will become reserved for power users.
Why it matters: Chat offers breadth as an interface, but it has thus far proved somewhat unnatural. Guides on prompt engineering show that these models need help to perform at their peak. To bring the value of these models to everyone, we need to wrap them in accessible and intuitive interfaces.
- Chat is unlikely to disappear, but it will likely become a power-user tool, much like the command line is today.
- Graphical user interfaces have replaced the command line for day-to-day users. Teams of user experience designers work with end users to try to remove friction.
I had a moment recently that triggered this post. There's a new feature on YouTube that plays an animation on the like button when a creator says "hit the like button". A small moment, but one that likely involves several ML models: one transcribing speech to text, another scanning the transcript for phrases about the like button. As far as I can tell, this all happens behind the scenes with no user intervention.
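The pipeline implied by that feature can be sketched in a few lines. This is purely illustrative: `transcribe` is a stand-in for a real speech-to-text model, and the trigger phrases are my guess at the logic, not YouTube's actual implementation.

```python
# Two-stage sketch: a speech-to-text model produces a transcript,
# then a second stage scans it for trigger phrases.

TRIGGER_PHRASES = ["hit the like button", "smash the like button"]

def transcribe(audio_chunk: bytes) -> str:
    """Placeholder for a real ASR model; returns a canned transcript here."""
    return "Don't forget to hit the like button and subscribe"

def detect_trigger(transcript: str) -> bool:
    """Second stage: check the transcript for any trigger phrase."""
    text = transcript.lower()
    return any(phrase in text for phrase in TRIGGER_PHRASES)

if detect_trigger(transcribe(b"...")):
    print("play like-button animation")  # fires with no user intervention
```

The point is the shape of the interaction: the models do their work invisibly, and the user never types a prompt.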
What’s holding it back: Designing user experience is hard; it is a specialty in itself. Chat is wide open and flexible, which has made it an attractive default. Despite that, our job is to remove friction in giving users what they’re seeking. The current interface for customising many LLMs is setting system-level prompts, a pattern that doesn’t lend itself to moving away from chat as the primary interface.
- Providers like OpenAI will likely introduce other ways to steer models. A few levers beyond the prompt already exist, such as sampling temperature, and image-generation models expose similar options.
- The challenge of building on top of off-the-shelf models is that adding new levers requires further training. Without that level of access to a model, chat becomes the only option.
- It’s also worth noting that even the best of today’s generative AI models have serious shortcomings. Hallucination, regurgitation of copyrighted content, and safety are all still being worked on. Where they land, and what that means for the future of these models, is yet to be determined.
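To make the system-prompt pattern concrete, here is a minimal sketch of how a product might hide chat behind a purpose-built button. The payload shape mimics common chat-completion APIs, but the prompt text and function names are my own illustrative assumptions, not any vendor's actual interface.

```python
# The user clicks "Summarise"; they never see the prompt. The hidden
# system prompt and the temperature setting are the only levers the
# product builder has without retraining the model.

SYSTEM_PROMPT = (
    "You are a summariser. Reply with exactly three bullet points. "
    "Never ask follow-up questions."
)

def build_request(user_text: str, temperature: float = 0.2) -> dict:
    """Assemble a chat-completion style payload around the hidden prompt."""
    return {
        "temperature": temperature,  # one of the few non-chat levers exposed
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
    }

payload = build_request("Paste of a long article...")
```

Everything about the interaction is still chat underneath; the design work is in hiding that fact from the user.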
Ultimately, users buy products to solve problems for them. We need to make sure the products we build are solving problems with as little friction as possible. Explaining a problem, correcting mistakes, and chanting secret prompts won't work long term. We're still early in the world of UX for AI, so I won't call it a loss at this stage. Five years from now, we'll likely look back on chat as the primary UX the same way we look at the command line today.