UX - The User Experience Podcast
Help me improve the show: https://forms.fillout.com/t/txqbF3seyNus
Welcome to the User Experience Podcast, the podcast where we (ex)change experiences! I am a firm believer that sharing is caring. Since we UX professionals all aspire to change user experiences for the better, I have put together this podcast to accelerate learning and improvement! In this podcast, I will share learning experiences from myself and other UX professionals, answer the most common questions, and read from famous minds.
AI Agents Transparency and Vibe Reporting
I'd love to hear from you. Get in touch!
🤖 How To Identify Transparency Moments In Agentic AI — Smashing Magazine
- Victor Yocco's article is one of the best practical frameworks I've read for designing agentic AI experiences
- The core problem: agentic AI disappears while it works — it acts on your behalf in the background and surfaces information only when it's done — and that creates a trust gap
- Two failure modes to avoid: the black box (user has no idea what happened or why) and the data dump (so many status updates that users develop notification blindness and ignore everything)
- The fix is a decision node audit — map every step in your agent's logic, identify where it branches or makes a judgment call, and ask: does the user need to know about this?
- The impact risk matrix helps prioritise: low stakes and reversible = auto-execute and inform quietly; high stakes and irreversible = ask for explicit permission first (a small code sketch of this mapping follows after these notes)
- Status messages matter more than we think — "processing" tells the user nothing; "liability clause varies from standard template, analysing risk level" tells them exactly what they need to know
- My favourite method from the article: have a user watch the agent work and think aloud — timestamp every moment they say "wait, what?" or "what did it just do?" — those are your transparency gaps
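To make the impact risk matrix from these notes concrete, here is a minimal Python sketch of how that prioritisation could be encoded. This is my own illustration of the pattern, not code from the article: the names (`Impact`, `transparency_action`) are assumptions, and the actions for the two middle quadrants are my guess at sensible defaults.

```python
from enum import Enum

class Impact(Enum):
    LOW = "low"
    HIGH = "high"

def transparency_action(impact: Impact, reversible: bool) -> str:
    """Map an agent action onto the impact/reversibility quadrants."""
    if impact is Impact.LOW and reversible:
        # e.g. renaming a document: act, mention it quietly
        return "auto-execute and inform quietly"
    if impact is Impact.HIGH and not reversible:
        # e.g. executing a stock trade: never act silently
        return "ask for explicit permission first"
    if impact is Impact.HIGH and reversible:
        return "act, but surface a prominent update and an undo path"
    return "auto-execute, keep an easy-to-review log"  # low impact, irreversible

print(transparency_action(Impact.LOW, reversible=True))    # renaming a file
print(transparency_action(Impact.HIGH, reversible=False))  # a stock trade
```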
🚀 Rocket — A Startup That Tells You What To Build — TechCrunch
- Rocket connects research, competitive intelligence, and product strategy into one workflow — input a prompt, get a McKinsey-style PDF with pricing, go-to-market recommendations, and product requirements
- The pitch: generating code and designs is now a commodity — the real gap is knowing what to build in the first place
- I like the idea, and I think it will genuinely accelerate a lot of early-stage thinking
- But here's my challenge: it synthesises data that already exists on the internet — it cannot tell you what real users think, feel, or struggle with, because that data isn't publicly available
- My bigger concern: we are removing barriers to creation faster than we are strengthening the filters that determine if something is worth creating — the majority of products already fail because of insufficient user research, and commoditising product ideation will make that worse, not better
- My take: the more we accelerate creation, the more we need to invest in user research as a compensatory mechanism — not less
In today's episode: how can we identify necessary transparency moments in agentic AI, and what to make of an AI startup that tells you what to build? I'm happy to be back. I was out for a few days on a long, prolonged weekend, and it was really refreshing. I had told myself I would pause the podcast for a few days, but I'm happy to be back. For today, we have two articles that I want to discuss briefly. One is about how we can identify transparency moments in agentic AI: how can we make agentic AI outputs more transparent, and how can we foster trust for the end user? Then we'll cover a brief article about an AI startup called Rocket, which offers vibe McKinsey-style reports at a fraction of the cost, and some implications we can think of for user experience research, design, and the like.

So let's start with the first article, by Victor Yocco, which I found on smashingmagazine.com. It's about how to identify necessary transparency moments in agentic AI. I just want to preface today's episode by saying that I read this article very quickly, just to get a glimpse of what it was about, whether I resonated with it, and whether to have it on the podcast. It needs further reading; I want to read it again, because it has really, really great ideas for UX professionals, and I think everyone should read it. So congrats to you, Victor, for putting that together; it's really helpful. Thank you.

Victor's main idea is that designing for agentic AI requires attention to both the system's behaviour and the transparency of its actions. First and foremost, as we know, AI systems, and more specifically generative AI and LLMs, are something of a black box for the user. I'm not an expert in the technicalities of AI systems, but it feels like a black box to me sometimes: you input a request to the LLM, and depending on how you phrase it, and even if you phrase it exactly the same way, you will get a different answer every time. It's a black box in the end because it's probabilistic, not deterministic, meaning the answer it provides is based on probabilities.

That's for LLMs generally speaking, but agentic AI involves the same probabilistic nature and extends it: the AI also chooses which tool to use to answer your request. It's as if you added tools to your AI, so you not only ask it to retrieve information for you, you also give it the ability to act on your behalf. Let's say it can send an email for you, or change rows in a spreadsheet, and so on. For me that pushes the boundaries of the black box even further: it's not just retrieving information and communicating it to you, it's acting on your behalf, and that adds to the whole set of uncertainty you already have, with all the danger that lies ahead. And so Victor's point is that we need to map all the decision points to reveal the key moments to build trust through clarity, not noise. I think that's his point at the very beginning of the article.
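As an aside on the probabilistic point above, here is a toy Python sketch, entirely my own illustration and nothing from the article, of why identical prompts can produce different answers: at each step the model assigns probabilities to candidate next tokens and samples one, rather than always picking the same one.

```python
import random

# Toy illustration: identical input, potentially different output.
# A real LLM does this over a huge vocabulary at every generation step;
# with temperature > 0 it samples instead of always taking the top token.
next_token_probs = {"approve": 0.6, "reject": 0.3, "escalate": 0.1}

def sample_token(probs: dict) -> str:
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights)[0]

for run in range(3):
    print(f"run {run}: {sample_token(next_token_probs)}")  # may differ per run
```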
When we interact with an AI agent, we basically hand it a complex task, and it disappears while it does it. It goes into the back end, does its work, and then surfaces information to you. How it surfaces that information depends on how the design team, broadly speaking, not only designers, has configured the tool and the experience. And it can lie between two extremes: a complete black box, where it does the thing and barely surfaces anything to you, or a data dump, as Victor calls it, which consists in telling you anything and everything. Not all of it will be useful, because at some point it creates more noise than value.

Victor says the black box leaves users feeling powerless, because you no longer have control over anything; you don't even know what's happening. And the data dump is, how do we call it, TMI: too much information. You are given too much information, and I like what Victor calls the result: notification blindness. If we receive a constant stream of information, we tend to ignore it altogether, because not everything is useful. There is a kind of habituation process. Habituation, for those who don't know, is described in neuroscience as the process through which, when a stimulus comes our way and we learn that it is not harmful, dangerous, or novel, we tend to dismiss it more and more. So if we get constant notifications but we know they are not critical, we learn to ignore them. That's what happens to me. I don't know if you're also confronted with this problem, but when I set reminders on my phone, I've tried approaches like setting a reminder every day. If I have too many of them and I know they're not important, I simply dismiss them. It would be great if I acted on them, but because I see them every day, my brain no longer interprets them as an alert.

So Victor argues that we need an organised way to find a balance between the data dump and the black box, and there are many great ideas to unpack here. He walks us through a method called the decision node audit, which requires mapping the logic to the user interface. Then he also covers, interestingly, the impact risk matrix, to help prioritise which decision nodes to display. In a nutshell, what I understood from this article is that it's not just a question of what to communicate, which is what we user experience designers, and to some extent researchers, often ask when confronted with these experiences. That's not so easy, because "what to communicate" is a solution-oriented question. We first need to ask how the system works, what the steps are, what happens at each step, and how the system might deviate from the expected behaviour. Then, and only then, can we ask how to communicate things to the user.
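Since that balance between black box and data dump is ultimately a policy the design team chooses, here is a minimal sketch of one possible notification policy. It is entirely my own illustration, not from the article: critical events always surface, while routine ones are throttled so a constant stream doesn't train users into exactly the habituation I just described.

```python
import time

class NotificationPolicy:
    """Always surface critical events; rate-limit routine ones so users
    don't habituate and start ignoring every update."""

    def __init__(self, routine_interval_s: float = 300.0):
        self.routine_interval_s = routine_interval_s
        self._last_sent: dict = {}

    def should_notify(self, event_type: str, critical: bool) -> bool:
        if critical:
            return True  # never suppress high-stakes information
        now = time.monotonic()
        if now - self._last_sent.get(event_type, float("-inf")) >= self.routine_interval_s:
            self._last_sent[event_type] = now
            return True
        return False  # fold into a quiet activity log instead

policy = NotificationPolicy()
print(policy.should_notify("step_completed", critical=False))        # True
print(policy.should_notify("step_completed", critical=False))        # False (throttled)
print(policy.should_notify("risky_clause_detected", critical=True))  # True
```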
I will spare you the examples, although they are very interesting, but our author walks through several, and ultimately this is the idea of the decision node audit and the impact risk matrix. The audit is basically this: you map your system logic, step by step. What are all the steps your agent will go through? That is the system logic. For instance, in a contract review workflow you have steps like ingesting the contract and identifying the clauses, and then you have a decision node: approve or don't approve the contract clause. That decision node is a branching moment, because depending on the answer you will either approve or reject. The author calls it a decision diamond, and the branch depends on confidence: if confidence is low, you tell the user what's happening; if confidence is high, you auto-approve. So the first thing to consider is what to surface to the user, at what stage, and how.

Here our author says that when the agent deviates from the usual, automated path, we should surface a status update to the user, and not just "processing". You can probably remember, if you interact enough with LLMs, that we sometimes have agents embedded in their answers. For instance, if you ask Claude to perform some tasks, it will at some point tell you "thinking" or that it's performing some actions. More often than not we get this "processing" stuff, this kind of useless feedback or status communication that doesn't help us. What is it actually doing right now? The author says that just saying "processing" does not help the user; we should communicate the current status and the next step. For instance, "liability clause variance detected, investigating options" is far more informative. Victor's example: instead of "reviewing contracts", the interface updated to say "liability clause varies from standard template, analysing risk level". That's really helpful, so again, thank you for that.

The author then walks us through a step-by-step process to decide what to surface at each stage. First, get the team together. This is not specific to AI agents, although very useful there; it's something we should always do. But I would agree with the author that, even if it's not directly mentioned, it matters more and more now that we are dealing with AI agents and non-deterministic systems. So get the team together: bring in the product owners, business analysts, designers, and the engineers who build the AI, so that everyone can bring their perspective. Then draw the whole process and document every step in the workflow. Naturally, this is something we should do for all kinds of products, but even more so in this day and age of AI. And if you haven't listened to it, I highly encourage you to listen to my episode about evaluating AI from a conceiver or designer perspective and standpoint, not only from the user's at the very end of the process.
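Here is a minimal Python sketch of that contract-review decision diamond. The threshold value, the message strings, and the names are my own illustration of the pattern the author describes, not code from the article.

```python
from dataclasses import dataclass

@dataclass
class ClauseReview:
    clause: str
    confidence: float  # agent's confidence that the clause matches the standard template

AUTO_APPROVE_THRESHOLD = 0.9  # illustrative value, to be tuned per workflow

def review_status(review: ClauseReview) -> str:
    """One decision node: branch on confidence, and surface a specific
    status message instead of a generic 'processing'."""
    if review.confidence >= AUTO_APPROVE_THRESHOLD:
        # High confidence: auto-approve quietly and log it for later review.
        return f"'{review.clause}' clause matches standard template: auto-approved."
    # Low confidence: this is exactly the moment the user needs to know about.
    return (f"'{review.clause}' clause varies from standard template "
            f"(confidence {review.confidence:.0%}): analysing risk level.")

print(review_status(ClauseReview("indemnification", 0.97)))
print(review_status(ClauseReview("liability", 0.55)))
```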
So let's say I build a chatbot to onboard clients who want specific legal services. What do I want to communicate with that chatbot? Do I want a professional tone? How does the chatbot handle things when they go wrong? And so on. There are a lot of things to consider, a lot of ingredients, and a whole automation workflow behind it, and this is something you need to map out first. If my client uploads documentation, how do I communicate that everything is secure? How should the process of scheduling a call with my client work? You get the point: we should know our technology, our process, and our flow, the ins and outs, before evaluating it with our end users. That's what is meant by the third step of our author's framework: find where things are unclear. Look at the process map for any spot where the AI compares options or handles inputs that don't have one perfect match, and identify the best-guess steps. Examine each choice, write clear explanations, update the screen, and check for trust: make sure the new screen messages give users a simple reason for any wait time or result.

That's what I really like about this framework, because we tend to just build the thing, and then eventually, if we test it at all, we only test it at the end. It's like assuming we already know all of it. We take a thousand assumptions along the way; we don't even think about all the steps, what to communicate, and when to communicate it. But in the end, you need to make sure you're communicating the right amount of information to generate the right amount of trust. This is the whole idea of trust calibration: undertrust or overtrust, and we need to be in the middle. Users need to trust the agent, or the product, the right amount. If your product fails 90% of the time, your users should trust it at that level and not overtrust it, because when it fails, it will be terrible for them. And if your technology is really secure and reliable and performs without issues, same thing: you need to communicate that to your users so that they trust it enough to use it.

Anyway, that's the process the author walks us through, and then there is the impact risk matrix. This is the idea of dividing all your situations and scenarios into quadrants, based on how high the stakes or impact are and how reversible the action is, with each quadrant carrying a different transparency need. When it's low stakes and reversible, for instance organising a file structure or renaming a document, the transparency need is minimal; you can just communicate with a quiet notification.
And when it's high stakes or high impact, for instance rejecting a loan application, rejecting a candidate for a job, or, as the author mentions, executing a stock trade, the transparency need is high: you need to demonstrate the rationale, how the agent decided and how it came to that conclusion. Then there is the idea that, depending on the impact on the user and how reversible the decision taken by the AI is, if it's low impact and reversible, you can auto-execute and inform the user. At the complete opposite, high impact and irreversible, you need explicit permission first, and then you have all the other cases in between.

There is also the qualitative validation. I like this test, which I had never thought about: you have your user watch the agent perform a task and instruct them to speak out loud. This is a traditional user research methodology, the think-aloud protocol: you ask users to think out loud while they use your product, you observe them, and you get a verbalisation of their thoughts, at least to the extent that they can verbalise them. And you mark a timestamp each time the user says "wait, what?", or "did it hear me?", or "what did it just do?", and so on. Because, again, the agent is a black box: it performs tasks and informs your users at some point. This kind of test is really useful because it helps you determine when your user would have liked to be notified about a given decision. So that's more or less all the ideas the author goes through. I really like this article, and I think it provides a great framework. Again, kudos to you, Victor.

Then we have another article, from TechCrunch, about an AI startup called Rocket, which offers vibe McKinsey-style reports at a fraction of the cost. Basically, the article says that everyone can now generate code and designs, it has become a commodity, but knowing what to build is what everyone is missing. And this is true. Rocket, for those who don't know, is a startup that connects research, product building, and competitive intelligence in a single workflow. It generates product strategy documents, including pricing, unit economics, and go-to-market recommendations, and it generates product requirements in PDF format from simple prompts, in a consulting-style report. Of course, and this is the ever-present challenge with AI and LLMs, it leverages data that already exists on the internet: it combines pricing models, user behaviour patterns, competitive analysis, and insights, and synthesises everything into one report. But it doesn't do it with independently verifiable information; I need more info on that statement, because I haven't understood it quite clearly yet. It can also track competitors, including changes to their websites, and so on. It has subscription plans. Okay, great.
Okay, so as a reflection: if you could have all the data and all the insights needed to launch a product, with everything done in the background for you, and you just land on a report, would that be enough for the product to be successful? Because, and I might be wrong, I feel that's more or less the take here. I would challenge that with the need to actually interview users and hear from them. Ultimately, the process here is really asynchronous: it acts on data that is public, that sits on the internet. Yes, that can be good, and it will significantly accelerate a lot of things; I'm not saying the opposite. But at the same time I'm wondering several things. First, if everyone uses this, what is the value of competitive analysis? What is the value of coming up with a product idea? On what do we base our decision that we have a good new product idea? I have a lot of questions about that.

So this is great; I like this product, I like this idea. But the thinking is: we take all the data sitting on the internet, digest it through a giant machine, and come up with product recommendations. And I'm wondering, are product recommendations only dependent on the data we can acquire at the click of a button? Sorry for sounding like an old-schooler to some extent; I am all for reinvention, for adapting to new technologies, and so on. But ultimately, at some point, you will share your idea or your product with users, and your users are real humans. They will have perceptions, they will evaluate what you offer them, they will use something, and there will be a dynamic. At some point, I'm not sure you can do without interviewing users and testing in front of them. And I'm not saying this is the position taken by this startup; it's just a reflection on how this will shift things.

So, sorry, this is maybe a for-what-it's-worth: I think I need more time to process and digest this kind of news, and I'll probably have more thoughts in the future. But I do think we will need, more and more, to filter our ideas, because it looks like everyone will have so many ideas and will generate so many products. That's my take. We have removed the barriers to creation: first we removed the barrier to creating a working prototype, a developed product, let's say code. And now it's like, okay, now that we have removed the barrier for the last step of the process, we also need to remove the barrier for the rest. So now we also remove the barrier to, let's say, market analysis, because that's roughly what this service is, if I understand it correctly. But market analysis is only one of the things you can do to decide whether you should pursue a new product idea and whether that idea is good or not.
So ultimately, if we commoditise the generation of product ideas, I'm wondering to what extent the filters we put in place to determine whether an idea is actually good should be adapted as well. Because right now our filters are what they are: we already have a lot of products being built and startups being created every day, and the majority of them fail because we don't do enough market research and user research. So if we accelerate this whole process, creating more and more and removing the barriers to creation, which is great, I'm not saying it isn't, should we also put more emphasis on user research, more than ever, now that everyone will be creating new apps, products, and services every day? Not all of it will be useful; that's already the case today. If we accelerate creation, we will need a compensatory mechanism: the filters will need to be strengthened as well, in my opinion, because this can be risky.

I don't know, that's just a thought; maybe tomorrow I'll say the opposite. I haven't thought about this a lot, so take it with a grain of salt, of course. I hope this helps someone, at least. And if it does, I hate to be that guy, but if you can subscribe to the show, that would really help me. I will also put a system in place for you to give feedback, so that I can improve every day and deliver something that really helps you in the long term. Thank you for listening and for taking some time. See you in the next episode. Bye bye!