Beyond the Chatbot: Navigating the Next AI Frontiers

MARKETING AND SALES

11:15 AM – 12:00 PM

Room 260-262

SPEAKER

Rachel Holmes

AI Strategist, Zirous

## Session: Beyond the Chatbot: Navigating the Next AI Frontiers
**Track:** Marketing and Sales | **Time:** 11:15 AM–12:00 PM | **Room:** 260-262 | **Type:** Expert Talk
**Conference:** CIRAS AI Summit for Iowa — May 6, 2026, Scheman Building, Iowa State University, Ames IA

### Speaker(s)

**Rachel Holmes** — AI Strategist, Zirous (West Des Moines, IA)
Rachel Holmes is an AI Strategist at Zirous, guiding organizations in applying AI to deliver practical, business-aligned outcomes. She advises clients, champions enablement, and leads cross-functional workshops.

A frequent speaker on generative AI, Rachel recently earned a Master’s in Professional Communication from Iowa State University, where she built custom AI chatbots to streamline collaboration. She brings a decade of B2B communications and strategy experience to connect business goals with technical delivery.

### Session Description

The first wave of generative AI was about surfacing insights by asking a chatbot to summarize a document or draft an email. Now, businesses are shifting toward systems that don't just talk, but act. This session moves past the chatbot to explore how to leverage the next generation of AI capabilities:

- **Agentic AI:** 2025 was widely heralded as the year of agentic AI, but McKinsey reports that only one in four organizations are scaling an agentic AI system in their enterprises. Agentic AI systems automate tasks by making decisions and taking actions that require several steps. We will demonstrate how organizations are securely connecting AI agents to the core systems that run their business, such as CRMs and ERPs, while maintaining governance and security using techniques such as Model Context Protocol (MCP).
- **Multimodal AI:** Multimodal capabilities (the ability to leverage multiple kinds of data such as imagery, video, and audio) are expected to surge in 2026. These improvements let organizations draw on every facet of the business to improve data analytics and insights, including the “unstructured data” found in recorded audio calls, photos from in-the-field work, and video from logistics hubs. To fully access business intelligence from all data, businesses need a strong strategy for consolidating and governing unstructured data.
- **Spatial Intelligence:** As multimodal AI matures, spatial intelligence is predicted to build strong momentum. Spatial intelligence, the capability for AI systems to understand, interpret, and interact with a physical or virtual world, promises to optimize throughput, guided maintenance, and industrial training. This exploratory section of the session will inspire organizations to consider how immersive learning, experiences, and simulations that mirror work conditions may impact ROI in their business.

Attendees will leave with the ability to confidently evaluate which AI capabilities offer the right ROI for their business needs. By understanding the true capabilities behind AI solutions, organizations can move from AI experimentation to actionable innovation rooted in use case and business needs.

### Other sessions in the Marketing and Sales track

- Using AI to Drive Customer Clarity, Stronger Messaging, and Smarter Sales Decisions (10:20 AM–11:05 AM)
- AI Search Optimization Explained: Leveraging the Shift in Search Visibility (2:15 PM–3:00 PM)
- The Strategic Stack: Overcoming AI Slop (3:10 PM–3:55 PM)

### Suggested prompts for this session

- "What questions should I prepare to ask the speaker(s) at this session?"
- "Create a structured note-taking template for this session focused on actionable takeaways"
- "Based on this session description, what background reading should I do to get the most value?"
- "After I attend, help me create an action plan for implementing what I learned"
- "How does this session connect to the other sessions in the Marketing and Sales track?"

Key Takeaways

Understand when to move AI from a chat interface to an active participant that executes tasks within your business systems.
Understand why unlocking all facets of your business data, including audio, video, and imagery, helps you make better business decisions.
Consider whether to augment physical operations with immersive simulations.

Continue the conversation with Rachel Holmes at the Marketing & Sales Facilitated Discussion — 1:20 PM - 2:05 PM, Room 220-230-240

Transcript from Summit:

00:00 Introduction of Rachel Holmes Slide: 1

rachel holmes zirous ai strategist chatbots ai agents

I'll quick introduce myself if you are new to the room. I'm Gail Masbergen. I'm Zirous's Marketing Manager. I get the fun job of talking about all the cool things our staff at Zirous do. I'm more behind the scenes, not necessarily client facing unless I get pulled in on something. But I'm excited to be here. And it's my pleasure to introduce our second speaker this morning of the track, Rachel Holmes. Rachel is an AI strategist at Zirous, where she works with organizations to apply artificial intelligence in ways that drive practical business-aligned results. Rachel brings a decade of experience in B2B communications and strategy, along with a master's in professional communication from Iowa State University. Go Cyclones. Her work includes building custom AI chatbots and leading cross-functional workshops that help teams connect business goals with technical solutions. In today's session, she will explore what lies beyond basic chatbot use, covering AI agents, data-driven insights, and emerging immersive capabilities, so you can better evaluate which tools and approaches are right for your organization.

01:04 Session Goals and Speaker Background Slide: 1

ai frontiers zirous technology consulting west des moines business processes

Please join me in welcoming Rachel Holmes. Thank you. Good morning, everyone. Can you hear me okay? I turned the mic on just now, but I think you can hear me. Thank you so much for being here. I'm really excited to be talking today about how to get beyond the chatbot and how we start to think about navigating the next frontiers of AI. I know that this is the sales and marketing track. This presentation will hopefully be really valuable to everybody in the room as we think about what's coming in the next year, two, three years for AI, and start thinking now about what kinds of questions you should ask yourself, how to start preparing now, and maybe what kinds of partnerships you might start looking into so that you can take advantage of these things as we go. As we were introduced, my name is Rachel Holmes. I'm an AI strategist for Zirous, which is a technology consulting organization based out of West Des Moines. We support organizations with technology consulting, business processes, all that kind of good stuff. So to kick us off, I just kind of want to get a temperature check from the room.

02:05 Audience Poll on AI Adoption Slide: 2

chatbot adoption agentic ai claude copilot gemini

Can you raise your hands and tell me who here is regularly using chatbots every day for work? This could be Claude, Copilot, Gemini. This is about what I expected, right? I would say majority, right? 90, 95% of us. At this point, chatbots are really becoming kind of table stakes for the organization, right? As we think about, you know, the different kinds of tools and technologies at our fingertips. Now I'd love to know if you could raise your hand and let me know who here has agentic AI capabilities in their organization. This is where you have a tool that's actually going out and doing work for you on your behalf. Okay. I'm going to make a guess and say about half, maybe about 50% of us. This is great as we start to think about, how do we get beyond chatbots? How do we take advantage of agentic capabilities as well as other tools and technologies along the way? Now also before I totally dive into what's coming next for AI, I kind of want to give us a broad sweeping landscape of how we got here.

03:09 Evolution of Generative AI Timeline Slide: 2

chatgpt generative ai timeline reasoning agentic ai digital coworkers

Because it's been a crazy four years. I'm sure I'm not the only one that feels that. But just to kind of give us a sense of where we're at and how we started. And really the whole generative AI craze really started in that late 2022 with OpenAI releasing ChatGPT. And this initial burst of generative AI is all around how we prompt AI like an assistant. We start to see some early reasoning in early to mid 2023 where these technologies are really able to better understand what we want to get out of them so that they can deliver these really creative and really engaging kinds of outputs. As time goes on, we start to see that generative AI boom turn more from prompting as an assistant to orchestrating an agent. We see the birth of agentic AI in early to mid-2024. Last year was supposed to be the year of agents. We're going to talk about that in just a little bit. But this is where we're moving from I'm prompting a chatbot to get what I want to now I'm having an agent do work for me. We start to see that really develop last year. And now we're even starting to see managing teams of agents going out and doing this work for us.

04:14 Multimodal AI Capabilities Expansion Slide: 2

multimodal ai image generation audio video text interaction

We see machines being able to talk to machines in late 2025. In early 2026, we start to hear about things like digital coworkers. If you're familiar with like Claude Cowork, that released an early preview in just January and now is released in full. And just a couple of weeks ago, ChatGPT says, hey, now you can create agents directly within the OpenAI platform. So this is kind of the landscape as we've been seeing it. And in addition to that, along the way, we've started to see, we can see the transition and the evolution of interacting with these tools beyond just text. Beyond just, I'm chatting to a chatbot and it's delivering text back to me. We're also seeing how much things have changed with multimodality, with images, with audio, with video and all of these kinds of things. You might remember some of those really janky looking pictures in 2023. And now, just today, we heard this morning, it's really hard to distinguish between what's AI and what's human in our audio, in our video, in our pictures, in our images.

05:16 Three Focus Areas Overview Slide: 10

agentic ai multimodal ai spatial ai session agenda ai capabilities

So this is the landscape that we're working with. You can see how much change we've experienced in the last few years alone. And this change is only getting more and more rapid. You can see how much more condensed and consolidated it is later in the timeline. So you can imagine that the next few years, the next six months, 12 months, 18 months, are going to feel even more rapid. So this is a great time for us to really slow down and think about what makes sense for us in our organization and how we can start preparing and taking advantage of those things now so that we're ready when that time comes. So today there are three particular areas I want us to talk through. We're going to start by talking about agentic AI. Of course, many of us might already be familiar with at least the concepts of agents, knowing that agents kind of exploded in the last few months, but we're going to talk more about what that means for your business. We're going to talk more about multimodal AI, which is again all those different modes of information that AI can now leverage and work with, and what that means for us as businesses and as organizations.

06:18 Reality of Agent Adoption in 2025 Slide: 11

2025 year of agents nvidia mckinsey survey agent adoption mid-market companies

And then we're going to spend a little bit of time talking about spatial AI, which is a little bit about what we heard in our keynote this morning of this technology that can actually be embedded in our physical and our virtual environments. So let's kick it off. Let's talk about agents. 2025, the year of agents, right? We heard this in January of 2025. The NVIDIA CEO stands up on the CES stage and says, 2025 is the year of agents. Okay, well, how do we feel about that? Only about half of us raised our hands when we said we're using agents in the workplace. So what does that mean for the reality? The reality is a lot more complicated than that. This is a survey from McKinsey that went out in November of 2025 that says two in three companies are stuck in experimentation and pilots, and one in four mid-market companies are still in the process of scaling agents across the organization. I think most of us here are probably in that small to medium, mid-market range. So this is generally reflective of probably what the room is feeling right now. Now, coupled with this, we were able to attend TAI's conference.

07:19 Challenges in Enterprise Agent Deployment Slide: 11

technology summit enterprise deployment scaling challenges team adoption business processes

I see Tyler in the room with us today. If anybody was at the Technology Summit back in April, Zirous had our ears to the ground when we were listening. You know, what are folks talking about when it comes to chatbots versus agents? And some of this is reflected there: there's so much experimentation, so much piloting, and so much internal or individual enablement when it comes to agents, but it's a lot harder to scale that across teams and across end-to-end business processes. We also hear this from folks like Andrej Karpathy. If you don't know this name, this is one of the co-founders of OpenAI. He was the director of AI at Tesla for a little bit. He's currently a researcher and educator in the AI space. And he put out this podcast a few months ago. If you haven't heard it, I highly recommend giving it a listen. It's fascinating. One of the things that he says is that this is the decade of agents. It's not the year of agents. And he says one of the reasons for that is that, as of right now, agents are still lacking some of that intelligence. They're lacking some of that contextual learning.

08:20 Andrej Karpathy on Agent Development Slide: 11

andrej karpathy openai agent intelligence contextual learning enterprise value

They're lacking some of those multimodal capabilities. That's a term you've already heard today. You'll hear more throughout the session. And so one of the most interesting things that I think that he says is that it's really, really easy to demo an impressive-looking agent, but it's a lot, lot harder to turn that into something that's really meaningful and valuable across the entire enterprise. So all of this to say there's a lot of hype, there's a lot of potential with agents. Some folks are, of course, experiencing that value right now in their workflows and that individual workflow of embedding agents in their day-to-day. And it's time to start thinking about how we pull that out and how we scale that across teams and across businesses. Now, I do want to do just a little bit of a back to basics on what does agentic AI really mean, just to make sure we're all on the same page of what the heck I'm even talking about right now. Agents are all about solving problems and developing a way to solve a goal for you.

09:21 Agent Definition and Examples Slide: 11

agent definition hubspot workato enterprise tools autonomous execution

So versus a chatbot, where you might ask it a question and it generates a response, an agent is all about how can I tap into your enterprise tool sets? How can I take multiple steps? How can I come up with a plan and execute it for the human user? So there's a couple different ways that you can build or experience agents. A lot of agents and agentic capabilities are built directly into our enterprise tool sets. So for example, along the right here, you see HubSpot. Many of us here are probably familiar with that, where you can see that an agent is going out and researching some kind of an entity, an organization, an individual, something like that, on behalf of the user. You can also see along the left-hand side, this is a screen cap from a tool called Workato. And it's a little small, but I'll read it for you here: somebody says, I want to complete a new sales proposal. And you can see that the tool is going to analyze Gong calls. It's going to tap into your CRM. It's going to look at your Google Docs. So you can see that these agents are going into multiple different places. You've given it a goal of, I want to do this thing.

10:22 Revenue Delta Analysis Example Slide: 11

revenue analysis llm comparison slack integration jira email analysis

And the tool in the AI system is looking at the different tools it has available to it, coming up with an action plan and delivering that result back to you. This is kind of what sets agents apart. And that really matters because in our conversations, there are certain vendors and things like that that might use the word agent, but maybe it's not actually agentic. Maybe it's not actually taking action for you. So here's an example to kind of help distinguish for you the difference between an agent and being able to go to your LLM and type in a question and have it deliver answers back to you. Let's say you want to understand the difference between your estimated and your actual revenue for Q1. What is that difference and why did it happen to begin with? You might be able to have your AI system with defined actions that it can take. And it might be able to go through and say, hey, I can find out that delta is $500,000, but it doesn't have the reasoning, it doesn't have the ability to understand your goal and help you accomplish that goal that's rooted in your business knowledge and in your business contexts.

11:24 Agent Workflow and Decision-Making Slide: 11

agent workflow decision making crm erp database access

But an agent might be able to look at not only your CRM where this number lives, but it might have access into your Slack, into your Jira, into your email. And it can go through, it can find that information and say, wow, I'm seeing a lot of conversations about maybe shipment delays, or I'm seeing information about how ports have been impacted in the last six months. And it can do some reasoning and say, okay, well, the delta is not only $500,000, but it's because shipments were delayed in the Los Angeles port. This is the difference between a true agent that's taking lots of steps and tapping into your business systems and an LLM that's surfacing information for you. To be clear, this is also helpful. It's also super duper helpful to be able to go to your chatbot that's plugged into your Salesforce CRM and be able to ask questions and surface insights. That's great. And there's so much more that you can do when you think about agentic capabilities and agentic workflows. Now, if I pull this out in kind of a mind map, because I know that some folks understand things a little bit easier when you can see it in a nice pretty flow chart, this gray box is all about what an agent's doing.

12:30 Model Context Protocol (MCP) Introduction Slide: 18

model context protocol mcp trusted connectivity access control read access

It's observing what's going on in your business systems and your business tools. It's making decisions about what kinds of systems it's going to go into. And it's acting upon that decision by tapping into your business applications, like your CRM, your ERP, or even your business database. But as we think about this, questions come to mind, right? How do we know that the agents have access to the information that I want them to have access to and not other things? How do I know what it's going to do, and that it's not going to delete my code base? We hear things like that happening sometimes in some of these tools, or at least I do. So you have to think about, well, how do we create trust and how do we create governance in these agentic systems? And that's where you're going to start to hear and think about things like model context protocol or MCP. I'm not going to get super duper technical here. I know this is the marketing and sales track, but some things to think about here is that this is a method in which you can create trusted connectivity between your AI systems and the tools that you have.
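Editor's sketch: the observe, decide, act flow described in this segment can be illustrated with a few lines of Python applied to the Q1 revenue example from the talk. This is only a minimal sketch under stated assumptions. The `CRM` and `MESSAGES` data, the revenue figures (chosen so the delta matches the $500,000 quoted in the talk), and every function name are hypothetical, and the `decide` step uses a fixed rule where a real agent would ask a model to plan.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for business systems. A real agent would call APIs.
CRM = {"q1_estimated_revenue": 2_000_000, "q1_actual_revenue": 1_500_000}
MESSAGES = [
    "Shipment delayed at the Los Angeles port again",
    "Lunch order for Friday",
    "Customer asking about shipment delays",
]


@dataclass
class Observation:
    delta: int
    related_messages: list = field(default_factory=list)


def observe() -> Observation:
    """Observe: read the number from the CRM and scan messages for context."""
    delta = CRM["q1_estimated_revenue"] - CRM["q1_actual_revenue"]
    related = [m for m in MESSAGES if "shipment" in m.lower()]
    return Observation(delta=delta, related_messages=related)


def decide(obs: Observation) -> str:
    """Decide: choose the next action. A real agent would ask a model to
    plan; this fixed rule only stands in for that step."""
    return "explain_with_context" if obs.related_messages else "report_number_only"


def act(action: str, obs: Observation) -> str:
    """Act: produce the result for the user."""
    answer = f"Q1 revenue delta is ${obs.delta:,}."
    if action == "explain_with_context":
        answer += f" {len(obs.related_messages)} messages mention shipment delays."
    return answer


def run_agent() -> str:
    obs = observe()
    return act(decide(obs), obs)


print(run_agent())
```

The chatbot-style answer corresponds to stopping after the first sentence of the output. The agent-style answer adds the context it gathered from a second system.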

13:33 Security and Compliance Considerations Slide: 19

security compliance identity management permissions auditability

So for example, this MCP is being programmed to say, I can go into the CRM and the ERP, but I absolutely cannot go into the business database. And then within those systems and tools, I have approved activities that I can do. I can fetch data, I have read access, or I have write access in maybe some systems, but not all. So then when I, the user, say, hey, tell me why my delta between my estimated and my actual revenue is so different, it knows the actions that it can take, and it knows the tools that it can tap into, and the system for itself chooses which of these it's going to do in order to deliver that result back to you. MCP is a great way to make sure that you have that secure, trusted connectivity between your system and between your tools. Now, there are some other considerations that you would also need to consider when you're building or using agents. The top one here is security. And I know what you're thinking, Rachel, you just said that MCP is a secure, governed way to do this. And it is.
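Editor's sketch: the idea of approved systems and approved activities in this segment can be pictured as a small allowlist check. The code below is plain Python for illustration only. It does not use the MCP SDK or follow the MCP specification, and the `POLICY` table, `AccessDenied`, and `check_access` are hypothetical names.

```python
# Hypothetical policy: which systems the agent may touch, and how.
POLICY = {
    "crm": {"read", "write"},
    "erp": {"read"},
    # "business_database" is deliberately absent, so every request is denied.
}


class AccessDenied(Exception):
    """Raised when the agent requests an action that is not approved."""


def check_access(system: str, action: str) -> bool:
    """Return True if the action is approved, otherwise raise AccessDenied."""
    allowed = POLICY.get(system, set())
    if action not in allowed:
        raise AccessDenied(f"'{action}' on '{system}' is not approved")
    return True


# Usage: approved requests pass, everything else is refused.
for system, action in [("crm", "read"), ("erp", "write"), ("business_database", "read")]:
    try:
        check_access(system, action)
        print(f"allowed: {action} on {system}")
    except AccessDenied as err:
        print(f"denied: {err}")
```

Denying by default (anything not listed is refused) is one common design choice for this kind of gate. Whether a given deployment works that way depends on how it is built.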

14:33 Agent Activity Configuration and Cost Trade-offs Slide: 19

approved activities usage cost cost optimization user experience activity configuration

But MCP doesn't have compliance automatically built into it. It is all fully dependent on how you implement and how you build it out. So if you need to consider things like identity and permissions and auditability and tracing and all of that kind of good stuff, you have to build that into the system. It doesn't just automatically come with that. So those are some things that you need to think about. You also need to think about what allowed actions you want your agent to take. Because the way that these agents are built, sometimes you can tap directly into an agentic capability through Salesforce, but maybe you don't want to be able to edit data in Salesforce from your agent. Maybe you only want it to be able to surface insights, or you only want it to be able to close opportunities but not edit them. These are the types of things that you have to think about. And the more approved activities you allow your LLM, or your agents, to take, the more that usage cost is going to go up: when your agent is going through the process of planning out how it's going to accomplish the goal you want it to do, it has to look at all of those activities and decide for itself which ones it's going to use.
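Editor's sketch: the cost point above can be made concrete with rough arithmetic. The assumption here is that the description of every approved activity is sent to the model at each planning step and billed as input tokens. The four-characters-per-token rule of thumb, the price, and all names below are illustrative assumptions, not figures from the talk or from any vendor.

```python
def estimate_tokens(text: str) -> int:
    """Rough rule of thumb (an assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def planning_overhead(tool_descriptions, steps, price_per_million_tokens):
    """Estimate the input tokens and cost spent only on re-reading tool
    descriptions, assuming they are included at every planning step."""
    tokens_per_step = sum(estimate_tokens(d) for d in tool_descriptions)
    total_tokens = tokens_per_step * steps
    cost = total_tokens * price_per_million_tokens / 1_000_000
    return total_tokens, cost


# Hypothetical comparison: 3 approved activities versus 30, over 5 steps.
description = "x" * 40  # placeholder for a 40-character tool description
few_tokens, few_cost = planning_overhead([description] * 3, steps=5, price_per_million_tokens=3.0)
many_tokens, many_cost = planning_overhead([description] * 30, steps=5, price_per_million_tokens=3.0)
print(few_tokens, many_tokens)
```

Under these assumptions the overhead grows in proportion to the number of approved activities, which is the trade-off between flexibility and usage cost that the speaker describes.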

15:44 Agent Governance Requirements Slide: 19

governance monitoring approval process agent oversight accountability

So you have to start to think about the trade-offs of user experience and cost. And then finally, of course, always, you want to think about governance. Who's running this agent? Who's approving what it can and can't do? Who's monitoring that it's doing what it's supposed to do? These are all the important questions that you need to be thinking about. If you have other questions about agents, I'm running one of the lunch roundtables all about agents. You're welcome to come talk to me then, or I'm hoping to leave about 10 minutes at the end of this session for Q&A. So hold those questions, and we can absolutely talk through anything. Okay, final takeaways for agents. Number one thing, agents take action. They solve problems. They understand the goal that you're trying to accomplish without you telling them specifically how you want them to go about solving that problem. An agent can make that decision and make that plan for itself based on the tools and the systems that it has access to. You can absolutely custom build your own agents, or you can leverage the agentic capabilities that are already built into your enterprise tool sets.

16:48 Agent Implementation Approaches Slide: 19

custom agents enterprise tools salesforce marketo hubspot

We saw things like HubSpot. We saw that screencap of Workato. There are other tools like Salesforce and Marketo and all of these other tools that have agentic capabilities built into them. And you can always do a little bit of something in between. You can do a little bit of custom and a little bit of what's built in. And finally, capabilities like MCP, like model context protocol, give you that governed access, secure access into your systems, but you have to think broad scale about how you want this system to run and operate in your organization. Okay, that's agents. Now I want to move on to multimodal, which is a very fancy way of saying multiple different kinds of information, multiple different kinds of data. You saw in one of those earlier slides, we talked about how traditionally, the first few rounds of generative AI were all about typing text in and receiving text back. And multimodal is all about all of those other different kinds of data, not just text, but images, audio, video, sensor data, charts, diagrams, all of these different types of data.

17:56 Multimodal AI Data Types Slide: 40

multimodal definition data types images audio video

A multimodal AI system can not only understand all of it and take that all in and ingest it, but it can also deliver it back out in a multitude of different kinds of outputs. And this gets really, really important when you start to think about things like generating insights or reports or KPIs or understanding what's happening with your customer base. These are the types of things that multimodal AI can really, really flourish in. And an example that I like to use when I think about generating richer insights across the organization, especially in manufacturing and construction, which I know that many of us here today are in that kind of field, is all around predictive maintenance, right? We want our systems and we want our technologies to have as much uptime and runtime as possible. When things go down, you lose money, it gets really bad, really fast. So you want to stay on top of that, right? And historically, maintenance teams might be looking at things like work order history. They might be looking at sensor data. You might have field techs going out and checking on things periodically. Multimodal AI can build upon that by saying maybe it's important to you to look at thermal imagery changes across a wide period of time.

19:01 Predictive Maintenance Use Case Slide: 8

predictive maintenance thermal imaging acoustics manufacturing construction

Maybe it's listening to acoustics on the factory floor to hear how the technology sounds or the parts sound, and maybe something is starting to sound a little bit funky, and now we can get a better understanding and make better predictions of what's happening with that piece of technology. Maybe there are other sensors and other types of information that are now being layered in on top of and alongside that work order history and some of those sensor triggers to actually give you better predictive maintenance, better predictive analytics. Now, if we flash back to this image, obviously we have the start of image processing in 2023, but we also have what I've just kind of called multimodality boosts starting around the middle of 2025. I want to talk about this a little bit because there have been some major advancements in multimodal in the last 6 to 12 months. And this is signaling to us that not only are multimodal capabilities growing, but they're going to continue to grow. And what does that mean for us as organizations?
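As a rough illustration of layering new signal types on top of existing maintenance data, here is a minimal sketch. Everything in it is an assumption for illustration only: the `AssetSignals` fields, the scaling constants, and the weights are hypothetical and would need to come from your own equipment history, not from this talk.

```python
from dataclasses import dataclass


@dataclass
class AssetSignals:
    """Hypothetical per-asset inputs; field names and units are assumptions."""
    vibration_z: float       # traditional sensor: z-score vs. baseline
    recent_work_orders: int  # traditional record: work orders in the last 90 days
    thermal_drift_c: float   # multimodal: temperature rise (deg C) seen across thermal images
    acoustic_anomaly: float  # multimodal: 0..1 anomaly score from an audio model


# Illustrative weights only; a real system would fit these to failure data.
WEIGHTS = {"vibration": 0.30, "work_orders": 0.15, "thermal": 0.30, "acoustic": 0.25}


def _clip01(x: float) -> float:
    """Clamp a value into the range 0..1."""
    return max(0.0, min(1.0, x))


def maintenance_risk(s: AssetSignals) -> float:
    """Combine traditional and multimodal signals into one 0..1 risk score."""
    parts = {
        "vibration": _clip01(abs(s.vibration_z) / 3.0),
        "work_orders": _clip01(s.recent_work_orders / 5.0),
        "thermal": _clip01(s.thermal_drift_c / 20.0),
        "acoustic": _clip01(s.acoustic_anomaly),
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


# Same asset, scored first with only the traditional signals, then with
# thermal and acoustic evidence layered in.
traditional_only = AssetSignals(1.5, 1, 0.0, 0.0)
with_multimodal = AssetSignals(1.5, 1, 10.0, 0.8)
print(maintenance_risk(traditional_only), maintenance_risk(with_multimodal))
```

The point of the sketch is only that thermal and acoustic evidence become additional inputs alongside work orders and sensor triggers, so an asset that looks fine on traditional data alone can still be flagged.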

20:02 Recent Multimodal Capability Advances Slide: 8

gemini video understanding chatgpt ui comprehension claude

A couple of quick things: we saw some major advancements in our foundational models last year. Gemini, for example, went from being able to just look at a video, take screenshots, and put them alongside a transcript to now actually understanding video content. We saw big jumps. You can see this chart here. This is how ChatGPT understands screenshots of user interfaces. It jumped from 64% to 86%, I think. Let me double check. Something like that, of understanding. And we have visual reasoning, and Claude jumped to 80.7%. These are huge increases in capability. But Rachel, that was 2025. It's now five months into 2026. What else is going on? Even in the last few months, we've seen major changes. The one that I really want to point out: GPT-5.2 is what was last year. GPT-5.5 is what's this year. And in this top row, you can see it went from 47% to 78% in this particular benchmark. And this benchmark is talking about how agents act with multimodal content and multimodal information.

21:07 GPT-5 Multimodal Performance Leap Slide: 21

gpt-5 performance benchmark multimodal agents 2026 improvements agent capabilities

This is a massive jump. This has huge implications for us as business users within our business processes: now we can look at other kinds of data beyond just text. And you can see that there were some changes in Claude in the last couple of weeks as well. And we're seeing that with these changes and these improvements in the technology, the market at large is taking notice. If I just pull up a few screenshots here, we have Fast Company saying that 2026 belongs to multimodal AI, that we're evolving beyond static text into dynamic immersive interactions. We have Gartner saying that multimodal generative AI will transform enterprise applications, saying that by 2030, 80% of enterprise software is going to be multimodal. And we have folks at IBM saying that multimodal AI is going to interpret the world like humans, through visual language and visual processing, not just text processing. So here's my personal take.

22:09 Market Signals for Multimodal AI Slide: 28

fast company gartner ibm 2030 forecast enterprise software

Do you remember how we said 2025 was the year of agents and then like half of us raised our hands and said we were using agents? I think something very similar is going to happen here. We're seeing major changes. We're seeing big market signals saying that multimodal is the thing. And so I think it's going to take us as business users a little bit of time to fully understand what does that implication look like for us? How can I take advantage of multimodal data or multimodal capabilities in my business processes? And I think that leaves us with a really unique opportunity to start thinking now, start planning now of what might change in your business, what might change in your workflows if you had access to these kinds of multimodal capabilities. And there's one thing that I'm going to kind of walk through here, and we have to talk just a little tiny bit about structured versus unstructured data. And I promise this is not going to turn into a data engineering talk, but we do have to talk about this just a little bit. So structured data is historically how we've been able to make sense of our business data and our business intelligence.

23:12 Structured Versus Unstructured Data Slide: 28

structured data unstructured data databases kpis call transcripts

This is text data that lives in databases, in rows and columns, and this is how we pull and generate KPIs and reports of how our business is operating, based on any number of things that might be important to you. You also have a whole swath of what we call unstructured data. This is data that doesn't have any kind of special format. This could be text. It could be like social media posts or call transcripts or e-mail chains and things like that. But it can also be some of that non-text data. It could be those audio acoustics, those thermal imaging clips. It could be recordings. It could be customer call recordings and hotkey tracing and all of these kinds of things. And historically, if you wanted to access this unstructured data and use it in your reporting or in your KPIs, it was really, really hard to do that. You might need a significant amount of machine learning, as an example, to be able to make sense of all this data alongside of your structured data. Multimodal changes that.

24:13 Multimodal Data Integration Opportunities Slide: 28

data integration business context data analytics personalized interactions workflow optimization

Because multimodal AI capabilities can understand unstructured data just as well as structured data, now you suddenly have a whole host of opportunities of how you can think about integrating that into your business. And what does that mean for organizations? Well, it means that you might be able to develop more precise, tailored, and custom workflows that are deeply, deeply grounded in your business context and your business intelligence. We talk about this a lot at Zirous, and we have already mentioned it a few times today, all around enhancing and improving your data analytics, being able to combine that unstructured data alongside of that structured data to get better insights and better understanding of what's happening in your business. And of course, we have personalized interactions. I think all of us can agree, and we see this out in market, that consumers kind of are demanding personalized interactions now because of the capabilities that AI gives us. Multimodal helps you get there. One example is like natural spoken dialogue through these AI systems.

25:13 Strategic Data Selection for Multimodal Slide: 28

data strategy customer sentiment acoustic monitoring thermal imagery business priorities

That's just one simple example, right, of how multimodal AI can change the game for businesses beyond just creative outputs of images and videos and audio clips like what we heard earlier today. Now, there are some things that you have to be thoughtful of and aware of. And the first thing is that you have to think about what kind of unstructured data, if you had access to all the data in your organization, makes sense for you to actually implement, right? I think it was IBM that said something like 80% of the world's data is unstructured. It would be a really interesting endeavor to try and incorporate all of that into your organization at once, but it might not be the most successful. So you have to think about, well, what kind of unstructured data, if I had access to it, would make sense for me? What would actually help me move the needle in my organization? For example, if you are the type of organization that really thrives on like a customer call center, does it make sense for you to be able to understand customer sentiment and voice and tone and all of that type of information through audio recordings that we as humans can kind of naturally comprehend?

26:25 Multimodal Access Governance Slide: 28

access governance data permissions team access role-based access organizational policy

Or is it fine that some of that data just kind of lives separately and is turned into something structured and lives in the database, right? Or if we go back to that maintenance example that I mentioned earlier, my watch just buzzed telling me that I've got my steps in for the day. If we think about that predictive maintenance example, you know, does it make sense for you to have a lot of information about acoustic clips and thermal imagery and things like that? Or do you only need a couple of those things to make really big, impactful changes in your organization? So now is the time to start thinking about what kind of data makes the most sense for your organization that will help you drive forward change. And of course, I've said this before, so I won't elaborate too much on it, but you also have to think about access governance. If you have all of this new data, who in your organization needs access to it? Who should be able to take that data and make decisions with it? Is that a certain team? Is that certain members within the team? Is that everybody in your organization?

27:26 Multimodal AI Key Takeaways Slide: 28

bidirectional processing model improvements governance policies data engineering partnerships

Especially as you bring these capabilities into your systems, into your AI agents, or into your AI chatbots, or into any of these kinds of experiences, who needs to have access to what? And that's something you need to start thinking about now as well. Okay. Some key takeaways from multimodal. The number one thing is that multimodal goes both ways. It can take in all different kinds of data, and it can put out all kinds of data. We mostly talked today about how it's ingesting information to give you more business intelligence, but you can also think about, how does that change how I visualize my reporting or how I visualize my KPIs and things like that. We've seen so much advancement and so much change in the last 12 months when it comes to our foundational technology and our foundational models. We saw a lot of that massive, massive improvement, and we can only expect that those things are going to continue to improve. So as our models and our technology get better and better and better, how do we make sure that we're staying aware of all those changes and taking advantage of all of those changes as well, beyond just improving how we output images or output creative concepts in our chatbots.
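The access-governance question raised above (who needs access to which kinds of data) can be written down as an explicit policy before any agent or chatbot is connected. The sketch below is one hypothetical way to express that in Python; the role names, the `Modality` values, and the `ROLE_POLICY` table are all invented for illustration and do not describe any particular product.

```python
from enum import Enum


class Modality(Enum):
    """Kinds of data discussed in the talk; the list is illustrative."""
    STRUCTURED = "structured"
    CALL_AUDIO = "call_audio"
    THERMAL_IMAGE = "thermal_image"
    ACOUSTIC_CLIP = "acoustic_clip"


# Hypothetical policy: which roles may query which kinds of data.
ROLE_POLICY: dict[str, frozenset[Modality]] = {
    "maintenance_tech": frozenset(
        {Modality.STRUCTURED, Modality.THERMAL_IMAGE, Modality.ACOUSTIC_CLIP}
    ),
    "call_center_lead": frozenset({Modality.STRUCTURED, Modality.CALL_AUDIO}),
    "analyst": frozenset({Modality.STRUCTURED}),
}


def can_access(role: str, modality: Modality) -> bool:
    """Deny by default: unknown roles and unlisted data types get no access."""
    return modality in ROLE_POLICY.get(role, frozenset())


print(can_access("maintenance_tech", Modality.THERMAL_IMAGE))
print(can_access("analyst", Modality.CALL_AUDIO))
```

The design choice worth noting is the default: a role that is not listed gets nothing, so adding a new data type never silently widens access.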

28:40 Spatial AI Definition and Scope Slide: 28

spatial ai spatial intelligence computer vision physical ai embodied ai

And finally, we talk about this. You have to consider your data access, your governance policies. If you have a data engineering team on staff or if you have a data engineering partner, you might want to be starting to ask questions and think about, hey, how can we start thinking about bringing in other kinds of data into our reporting structures and things like that? Now, on the heels of multimodal, right, where we think about all different kinds of content and all different kinds of data and information, that takes us into spatial AI. And we talked a little bit about it this morning. JC referred to it as physical AI, but this is all about an AI capability that's really, really good at understanding the physical world and virtual environments. This is still a relatively new space in our kind of market. You might have heard terms like spatial AI, spatial intelligence, computer vision, physical AI, embodied AI. These are just words that you might have heard, and it's all kind of under this general umbrella.

29:41 Spatial AI Real-World Applications Slide: 31

real-world applications autonomous vehicles construction agriculture manufacturing

They all mean slightly different things, but it's all this general concept of bringing AI into our physical and 3D virtual worlds. Now, this matters because this allows our systems to understand and interpret and interact with either the real world or that immersive visual world. It can perceive physical spaces and it can choose which actions to take based on the feedback that it receives from like real physical things around it. Obviously this morning we saw some examples of like the autonomous car driving itself. We heard about the robots that are running marathons. And those things are definitely cool and interesting. And we also want to think about how does that apply to us in our businesses and in our space. Now, if you're the kind of business that relies on the physical world, if you're in construction, if you're in agriculture, if you have a factory floor, if you have disaster recovery or any kinds of that type of real-world aspect to your organization, spatial AI matters to you a lot.

30:49 Spatial AI Training Simulation Example Slide: 23

training simulation ar vr real-time coaching personalized training factory floor

And there's so many cool and interesting and valuable use cases you can get out of this kind of technology. And one of the examples that we like to use is if you're training somebody on the factory floor or if you're trying to get somebody up to speed for like a real-world product out in the field. Historically, if you're training somebody, you might have like job shadowing, you might have manuals and videos, you might have like a deprecated piece of equipment that somebody's going to practice on, and all of that is super helpful, but it's hard to mimic and imitate real life with that kind of experience. Spatial AI gives you the ability to plug AI into like a headset or a virtual environment or AR or VR or all these different kinds of things, and be able to give you real-time in-person coaching based on exactly what the person is doing. It might adjust the scenario on the fly. It might repeat steps that maybe somebody didn't quite get the first time around. So we talked in multimodal about personalized experiences and personalized interactions.

31:51 World Models for Physical Understanding Slide: 21

world models physics simulation cause and effect 3d understanding real-world interaction

Spatial AI is all about that, especially in this particular kind of use case and this kind of workflow. Now, besides seeing things like autonomous driving cars and robots running marathons, what else is telling us that spatial AI is coming? There are two areas that I want to talk about. The first one is world models. Now, our generative AI tools right now, think Claude, think OpenAI, think Gemini, these tools are great at what they do, but what they're not great at is understanding physics and understanding simulation and understanding cause and effect and understanding 3D. You need a different kind of underlying technology to be able to do those types of things and do those types of experiences. And that's where world models come in. World models can understand physics and simulation and all of those different kinds of things that are needed for the real world interactivity that you might be looking for. Last year, we saw both Google DeepMind and Meta release world models.

32:56 World Model Development Status Slide: 21

google deepmind meta stanford world labs 3d reasoning

We also saw out of Stanford Labs, that's the screenshot on the left, Stanford Labs released a lab called World Lab. Let me double check that that's what it's actually called. Yes, World Labs, where they're building world models that are perceiving and generating and reasoning with the 3D world. So at Zirous, we have a division that is very focused on immersive visualization in AR and VR. My colleague Luke runs that division. He's here today. If you have questions about any of this kind of stuff, I am happy to make an introduction. My knowledge on this particular area is a little bit foundational. His take on these is that these have so much potential and so much possibility for real-world applications, but they're still a little bit rudimentary. Thank you, 10 minutes. They're a little bit rudimentary for our real-world business cases because they're not quite niche enough, they're not quite specific enough to be able to get you exactly to where you want to go.

33:56 Enterprise Spatial AI Solutions Slide: 36

nvidia omniverse pepsico ai factories digital twins enterprise solutions

So you've got kind of world models sitting at, right now, what I would loosely call the consumer level. But what's happening at the enterprise level is that you're seeing organizations like NVIDIA, with Omniverse, creating these enterprise-grade solutions. You can see here that there's this real-life example of PepsiCo recreating factory operations for AI agents. And there's this partnership that's promising to deliver blueprints for AI factories and AI agents within that physical 3D space. So if you've got enterprise, with NVIDIA Omniverse up here, and you've got kind of world models down here at the consumer level, well, where does that leave us, right? This mid-market, small to medium space that can get a lot of value out of this kind of experience. Although these advancements are exciting, they are still kind of introductory, but again, that leaves us with this great opportunity to start thinking about, okay, if I had access to this kind of capability, what kind of experience would I want to build for workers, for my maintenance workers, for my staff, for training purposes, for any of these kinds of things.

35:04 Spatial AI Implementation Considerations Slide: 31

photorealism digital twins 3d coding mid-market opportunity training scenarios

And there are, of course, organizations that could help you get there along the way. Spatial AI is one of those areas where you're probably going to need to partner with somebody at this stage to be able to take advantage of the capabilities that it offers. So final takeaways for spatial AI, again, it's kind of growing at this enterprise level for photorealism, for digital twins. We've got more consumer-based directions coming from things like world models. And that leaves mid-market with this opportunity to start thinking now about what kinds of experiences you want to build in your own organization, especially if you're the kind of organization that is thinking about training scenarios, tailored personalized experiences, brand recognition, all of these kinds of things. There's so much potential there. And then that one kind of caveat is that if you are thinking about use cases right now, these tools at this time do require some fairly intensive 3D coding knowledge. So if you don't have that within your organization, you might want to start thinking about what kind of organization can help partner with you and get you to the end there.

36:11 Matching AI Capabilities to Business Needs Slide: 31

capability matching business needs chatbots autonomous systems multimodal strategy

All righty. Final takeaways here as we wrap up time today. I know that not everybody in this room can take advantage of all things that we're looking at here. And what I want to encourage everyone to think about is that as hype continues and as things grow, it's more and more important for us to understand what's real and thinking about what are the capabilities that I can think about for my organization, not just getting swept away in what's kind of dominating the space as far as what's happening in the news or what's happening on social media, et cetera, et cetera, that you might be thinking about. Because if you're the kind of organization that just really focuses on chat and creativity, then a chatbot is perfect for you. There's nothing wrong with chat experiences. But if you're the kind of organization that says, hey, I really want my AI systems to start taking action and doing things on my behalf, then autonomous agentic systems are the pathway for you. And if you're the kind of organization that has a multitude of business data that you really want to tap into to really understand your business intelligence, then it's time to start coming up with a strategy for multimodal interactions.

37:15 Q&A on Multimodal Web Scraping Slide: 40

web scraping vision capabilities security blocks screenshots metadata

And then finally, if you rely really heavily on the 3D world, the physical space, things like that, spatial context, then that spatial AI, spatial intelligence, computer vision kind of market is really something that you need to be starting to think about how you can implement that in your organization today. That's all I have for today. Thank you very much. If you are interested in talking about agents at the roundtable, you're more than welcome to come say hi. We also have an on-site workshop that we do for organizations if you're interested in learning anything more about that. So that's what I got. Thank you very much. Questions? I have a question. Yeah. Multimodal. That's pretty interesting. I know that we've done, let me phrase this correctly, I guess, but web scraping, and a lot of times web scraping will get closed down by the security of the website or editor's website or whatnot.

38:27 Differentiating LLM Capabilities Slide: 40

llm comparison artificial intelligence index model selection openclaw cost management

Would multimodal web scraping be a possibility so that it's actually working off more vision than actually trying to download information? It's a great question. So for those in the room that didn't hear the question, it was all around, can multimodal help with web scraping? Because a lot of the time our websites can block AI agents from coming in and being able to see some of that code on the back end. I would say yes. These vision capabilities are getting much, much better, like taking screenshots of your desktop and translating that and understanding exactly what it's seeing. It's not going to get the metadata, if that's something that's important to you to be able to scrape. But what's actually on the website itself that a human can see, absolutely multimodal can be a great path towards that. OK. How do you differentiate the different LLMs and what's better? Like, for example, if you open an OpenClaw agent that you can tie different LLMs, API keys.

39:34 LLM Selection and Cost Management Slide: 40

gpt-5 opus gemini openclaw limitations cost control

The problem with that is the cost gets really high. So how do you, I'm just having a hard time deciding what's good for what. Or are they all the same? I mean, what's your take on that? It's a good question. So the question was like, how do you kind of differentiate between what type of LLM is good at what kinds of activities, especially if you're looking at building agents through something like OpenClaw. It's a really good question because a lot of the AI tools are, they're constantly improving and they're constantly neck and neck. There's a website that I like to go to and I think it's, I can't remember the exact URL, but it's the Artificial Intelligence Index. And essentially what it does is it looks at all of the tools. It's a third party that looks at all the tools that are available and kind of rates them on overall intelligence, agentic capabilities, things like that. So you can always go there and kind of get a sense for what tool is better at different things. And honestly, a lot of them, because they change back and forth, many of them feel interchangeable when it comes to how intelligent they are.

40:34 Managing Technical LLM Questions Slide: 40

technical expertise colleague referral implementation details technical support

I'm talking about like the LLMs themselves, like GPT-5.5 versus Opus versus Gemini 3.1. Now, if you're thinking specifically about like OpenClaw, that's one of the trickier ones because OpenClaw doesn't have as much of the like governance layer built into it. So like when you're running into like cost issues and things like that, it's harder to like set limits or set caps on that. You might want to look at other options for building agents that allow you to kind of meld that cap into it, or be able to say like, this is how many tokens you can use per day or per week or per month or whatever. Claude would be able to enable some of those different types of things, and we can also talk about that. One of my more technical folks is here today as well, and he would probably be able to answer that a little bit better than I can. Yeah, of course. Another question.
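To make the idea of a token cap concrete, here is a minimal sketch of a per-day budget that an agent wrapper could check before each model call. It is an assumption-laden illustration: `DailyTokenBudget` and its methods are hypothetical names, not part of OpenClaw, Claude, or any other tool mentioned here, and real platforms may expose spending limits in a completely different way.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailyTokenBudget:
    """Hypothetical per-day token cap enforced before each model call."""
    daily_limit: int
    _used: dict = field(default_factory=dict, init=False)  # maps date -> tokens spent

    def remaining(self, day: date) -> int:
        """Tokens still available on the given day."""
        return self.daily_limit - self._used.get(day, 0)

    def try_spend(self, tokens: int, day: date) -> bool:
        """Record the spend and return True only if it fits under the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining(day):
            return False  # caller should skip or defer the model call
        self._used[day] = self._used.get(day, 0) + tokens
        return True


budget = DailyTokenBudget(daily_limit=10_000)
today = date(2026, 5, 6)
print(budget.try_spend(6_000, today))  # fits under the cap
print(budget.try_spend(5_000, today))  # would exceed the cap, so it is refused
print(budget.remaining(today))
```

The same pattern extends to weekly or monthly caps by keying the counter on a week or month instead of a day.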

41:37 HubSpot AI Security Concerns Slide: 40

hubspot integration security concerns read-only access lead vetting automation

Yeah. Going back, going to agents. I brought this question up in the last session that we've kind of roadblocked allowing AI into HubSpot. So we use HubSpot, my company that I work for. What kind of reassurance could I give Steve, our security guy, so I can put AI into HubSpot for lead vetting or any type of automation? It's a great question. And I think that a lot of that goes into, we talked a lot about what kinds of approved activities can you allow the AI to take and what do you say, nope, you can't do that at all. So one of the things that you might think about is saying to, you said Steve, what's his name? You might go to Steve and say, hey, why don't we start with read-only access? We won't allow AI to change anything in the database. We won't allow it to edit anything. No edit, no delete, no none of that. Let's just focus on read. And that way, your users could then query your LLM of choice and say, like, what opportunities do I have in the pipeline that are supposed to close in the next two weeks?

42:42 Session Conclusion Slide: 40

read-only strategy opportunity pipeline query access low-risk adoption session closing

The agent can go and look at everything, but it can't change anything. And then it would just deliver that answer back to you. That might be a place for you to start. All right. Thank you, everyone.
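The read-only starting point suggested in that last answer can be sketched as an allow-list that sits between the agent and the CRM. This is a hypothetical illustration only: the action names, the in-memory `DEALS` list, and `run_agent_action` are invented for the example and are not HubSpot's API. In practice the same restriction would more likely be enforced through the CRM's own permission scopes or an MCP server's tool list.

```python
from datetime import date, timedelta

# Hypothetical allow-list: the only kinds of actions the agent may perform.
READ_ONLY_ACTIONS = frozenset({"get", "list", "search"})


class ReadOnlyViolation(PermissionError):
    """Raised when the agent attempts anything outside the allow-list."""


def run_agent_action(action, handler, *args, **kwargs):
    """Run the handler only if the action is read-only; refuse everything else."""
    if action not in READ_ONLY_ACTIONS:
        raise ReadOnlyViolation(f"action '{action}' is not permitted for the agent")
    return handler(*args, **kwargs)


# Stand-in for CRM data; a real integration would call the CRM instead.
DEALS = [
    {"name": "Acme renewal", "close_date": date(2026, 5, 12)},
    {"name": "Globex pilot", "close_date": date(2026, 7, 1)},
]


def deals_closing_within(days, today):
    """Names of deals whose close date falls within the next `days` days."""
    cutoff = today + timedelta(days=days)
    return [d["name"] for d in DEALS if today <= d["close_date"] <= cutoff]


# "What opportunities are supposed to close in the next two weeks?"
print(run_agent_action("list", deals_closing_within, 14, date(2026, 5, 6)))
```

An attempted `run_agent_action("delete", ...)` raises `ReadOnlyViolation` before any handler runs, which is the kind of guarantee a security reviewer can check directly.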

I'll quick introduce myself if you are new to the room. I'm Gail Masbergen. I'm the Zirous Marketing Manager. I get the fun job of talking about all the cool things our staff at Zirous do.

I'm more behind the scenes, not necessarily client facing unless I get pulled in on something. But I'm excited to be here. And it's my pleasure to introduce our second speaker this morning of the track, Rachel Holmes. Rachel is an AI strategist at Zirous, where she works with organizations to apply artificial intelligence in ways that drive practical business-aligned results.

Rachel brings a decade of experience in B2B communications and strategy, along with a master's in professional communication from Iowa State University. Go Cyclones. Her work includes building custom AI chatbots and leading cross-functional workshops that help teams connect business goals with technical solutions. In today's session, she will explore what lies beyond the basic chatbot use, covering AI agents, data-driven insights, and emerging immersive capabilities, so you can better evaluate which tools and approaches are right for your organization.

Please join me in welcoming Rachel Holmes. Thank you. Good morning, everyone. Can you hear me okay? I turned the mic on just now, but I think you can hear me.

Thank you so much for being here. I'm really excited to be talking today about how to get beyond the chatbot and how we start to think about navigating the next frontiers of AI. I know that this is the sales and marketing track. This presentation will hopefully be really valuable to everybody in the room as we think about what's coming in the next year, two, three years for AI, and start thinking now about what kinds of questions you should ask yourself, how to start preparing now, and maybe what kinds of partnerships you might start looking into so that you can take advantage of these things as we go.

As we were introduced, my name is Rachel Holmes. I'm an AI strategist for Zirous, which is a technology consulting organization based out of West Des Moines. We support organizations with technology consulting, business processes, all that kind of good stuff. So to kick us off, I just kind of want to get a temperature check from the room.

At this point, chatbots are really becoming kind of table stakes for the organization, right? As we think about, you know, the different kinds of tools and technologies at our fingertips. Now I'd love to know if you could raise your hand and let me know who here has agentic AI capabilities in their organization. This is where you have a tool that's actually going out and doing work for you on your behalf.

Okay. I'm going to make a guess and say about half, maybe about 50% of us. This is great as we start to think about, how do we get beyond chatbots? How do we take advantage of agentic capabilities as well as other tools and technologies along the way?

Now also before I totally dive into what's coming next for AI, I kind of want to give us a broad sweeping landscape of how we got here. Because it's been a crazy four years. I'm sure I'm not the only one that feels that. But just to kind of give us a sense of where we're at and how we started.

And really the whole generative AI craze really started in that late 2022 with OpenAI releasing ChatGPT. And this initial burst of generative AI is all around how we prompt AI like an assistant. We start to see some early reasoning in early to mid 2023 where these technologies are really able to better understand what we want to get out of them so that they can deliver these really creative and really engaging kinds of outputs. As time goes on, we start to see that generative AI boom turn more from prompting as an assistant to orchestrating an agent.

We see the birth of agentic AI in early to mid-2024. Last year was supposed to be the year of agents. We're going to talk about that in just a little bit. But this is where we're moving from I'm prompting a chatbot to get what I want to now I'm having an agent do work for me.

We start to see that really develop last year. And now we're even starting to see managing teams of agents going out and doing this work for us. We see machines being able to talk to machines in late 2025. In early 2026, we start to hear about things like digital coworkers.

If you're familiar with like Claude Cowork, that released an early preview in just January and now is released in full. And just a couple of weeks ago, ChatGPT says, hey, now you can create agents directly within the OpenAI platform. So this is kind of the landscape as we've been seeing it. And in addition to that, along the way, we've started to see, we can see the transition and the evolution of interacting with these tools beyond just text.

Beyond just, I'm chatting to a chatbot and it's delivering text back to me. We're also seeing how much things have changed with multimodality, with images, with audio, with video and all of these kinds of things. You might remember some of those really janky looking pictures in 2023. And now, just today, we heard this morning, it's really hard to distinguish between what's AI and what's human in our audio, in our video, in our pictures, in our images.

So you can imagine that the next few years, the next six months, 12 months, 18 months, is going to feel even more rapid. So this is a great time for us to really slow down and think about what makes sense for us in our organization and how we can start preparing and taking advantage of those things now so that we're ready when that time comes. So today there's three particular areas I want us to talk through. We're going to start by talking about agentic AI.

Of course, many of us might already be familiar with at least the concepts of agents, knowing that agents kind of exploded in the last few months, but we're going to talk more about what that means for your business. We're going to talk more about multimodal AI, which is again all those different modes of information that AI can now leverage and work with, and what that means for us as businesses and as organizations. And then we're going to spend a little bit of time talking about spatial AI, which is a little bit about what we heard in our keynote this morning of this technology that can actually be embedded in our physical and our virtual environments. So let's kick it off.

Let's talk about agents. 2025, the year of agents, right? We heard this in January of 2025. The NVIDIA CEO stands up on the CES stage and says, 2025 is the year of agents. Okay, well, how do we feel about that?

Only about half of us raised our hands when we said we're using agents in the workplace. So what does that mean for the reality? The reality is a lot more complicated than that. This is a survey from McKinsey that went out in November of 2025 that says two in three companies are stuck in experimentation and pilots, and one in four mid-market companies are still in the process of scaling agents across the organization.

I think most of us here are probably in that small to medium, mid-market range. So this is generally reflective of probably what the room is feeling right now. Now, coupled with this, we were able to attend TAI's conference. I see Tyler in the room with us today.

If anybody was at the Technology Summit back in April, Zirous had our ears to the ground when we were listening. You know, what are folks talking about when it comes to chatbots versus agents? And some of this is reflected, there's so much experimentation, so much piloting, and so much internal or individual enablement when it comes to agents, but it's a lot harder to scale that across teams and across end-to-end business processes. We also hear this from folks like Andrej Karpathy.

If you don't know this name, this is one of the co-founders of OpenAI. He was the director of AI at Tesla for a little bit. He's currently a researcher and educator in the AI space. And he put out this podcast a few months ago.

If you haven't heard it, I highly recommend giving it a listen. It's fascinating. One of the things that he says is that this is the decade of agents. It's not the year of agents.

And he says one of the reasons for that is that as of right now, agents are still lacking some of that intelligence. They're lacking some of that contextual learning. They're lacking some of those multimodal capabilities. That's a term you've already heard today.

You'll hear more throughout the session. And so one of the most interesting things that I think that he says is that it's really, really easy to demo an impressive-looking agent, but it's a lot, lot harder to turn that into something that's really meaningful and valuable across the entire enterprise. So all of this to say there's a lot of hype, there's a lot of potential with agents. Some folks are, of course, experiencing that value right now in their workflows and that individual workflow of embedding agents in their day-to-day.

And it's time to start thinking about how we pull that out and how we scale that across teams and across businesses. Now, I do want to do just a little bit of a back to basics on what does agentic AI really mean, just to make sure we're all on the same page of what the heck I'm even talking about right now. Agents are all about solving problems and developing a way to solve a goal for you. So versus a chatbot where you might ask it a question and it generates a response.

An agent is all about how can I tap into your enterprise tool sets? How can I take multiple steps? How can I come up with a plan and execute it for the human user? So there's a couple different ways that you can build or experience agents.

A lot of agents and agentic capabilities are built directly into our enterprise tool sets. So for example, along the right here, you see HubSpot. Many of us here are probably familiar with that, where you can see that an agent is going out and researching some kind of an entity, an organization, an individual, something like that, on behalf of the user. You can also see along the left-hand side, this is a screen cap from a tool called Workato.

And it's a little small, but I'll read it for you here: somebody says, I want to complete a new sales proposal. And you can see that the tool is going to analyze Gong calls. It's going to tap into your CRM. It's going to look at your Google Docs.

So you can see that these agents are going into multiple different places. You've given it a goal of, I want to do this thing. And the tool in the AI system is looking at the different tools it has available to it, coming up with an action plan and delivering that result back to you. This is kind of what sets agents apart.
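The goal-to-plan-to-result flow described here can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Workato or HubSpot actually work: the tool functions, the `TOOLS` registry, and the keyword-based `plan` function are all hypothetical stand-ins, and a real agent would ask a model to do the planning.

```python
from typing import Callable

# Hypothetical stand-ins for real connectors (call recordings, CRM, documents).
def analyze_calls(goal: str) -> str:
    return "calls: customer asked about pricing tiers"

def query_crm(goal: str) -> str:
    return "crm: account is in the proposal stage"

def read_docs(goal: str) -> str:
    return "docs: last proposal template found"

# The tools the agent has available to it.
TOOLS: dict[str, Callable[[str], str]] = {
    "calls": analyze_calls,
    "crm": query_crm,
    "docs": read_docs,
}

def plan(goal: str) -> list[str]:
    """Toy planner: choose tools by keyword (a real agent would use a model here)."""
    if "proposal" in goal:
        return ["calls", "crm", "docs"]
    if "pipeline" in goal:
        return ["crm"]
    return []

def run_agent(goal: str) -> list[str]:
    """Run each planned step in order and collect what each tool found."""
    return [TOOLS[name](goal) for name in plan(goal)]

findings = run_agent("complete a new sales proposal")
```

The point of the sketch is the shape: the user states a goal, and the system, not the user, picks which tools to use and in what order.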

And that really matters because in our conversations, certain vendors and things like that might use the word agent, but maybe it's not actually agentic. Maybe it's not actually taking action for you. So here's an example to kind of help distinguish for you the difference between an agent and being able to go to your LLM and type in a question and have it deliver answers back to you. Let's say you want to understand the difference between your estimated and your actual revenue for Q1.

What is that difference and why did it happen to begin with? You might be able to have your AI system with defined actions that it can take. And it might be able to go through and say, hey, I can find out that delta is $500,000, but it doesn't have the reasoning, it doesn't have the ability to understand your goal and help you accomplish that goal that's rooted in your business knowledge and in your business contexts. But an agent might be able to look at not only your CRM where this number lives, but it might have access into your Slack, into your Jira, into your e-mail. And it can go through, it can find that information and say, wow, I'm seeing a lot of conversations about maybe shipment delays, or I'm seeing information about how ports have been impacted in the last six months.

And it can do some reasoning and say, okay, well, the delta is not only $500,000, but it's because shipments were delayed in the Los Angeles port. This is the difference between a true agent that's taking lots of steps and tapping into your business systems and an LLM that's surfacing information for you. To be clear, this is also helpful. It's also super duper helpful to be able to go to your chat bot that's plugged into your Salesforce CRM and be able to ask questions and surface insights.

That's great. And there's so much more that you can do when you think about agentic capabilities and agentic workflows. Now, if I pull this out in kind of a mind map, because I know that some folks understand things a little bit easier when you can see it in a nice pretty flow chart, is that this gray box is all about what an agent's doing. It's observing what's going on in your business systems and your business tools.

It's making decisions about what kinds of systems it's going to go into. And it's acting upon that decision by tapping into your business applications, like your CRM, your ERP, or even in your business database. But as we think about this, questions come to mind, right? How do we know that the agents have access to the information that I want them to have access to and not other things?

How do I know what it's going to do? How do I know it's not going to delete my code base? We hear things like that happening sometimes in some of these tools, or at least I do. So you have to think about, well, how do we create trust and how do we create governance in these agentic systems?

And that's where you're going to start to hear and think about things like model context protocol or MCP. I'm not going to get super duper technical here. I know this is the marketing and sales track, but some things to think about here is that this is a method in which you can create trusted connectivity between your AI systems and the tools that you have. So for example, this MCP is being programmed to say, I can go into the CRM and the ERP, but I absolutely cannot go into the business database.

And then within those systems and tools, I have approved activities that I can do. I can fetch data, I have read access, or I have write access in maybe some systems, but not all. So then when I, the user, say, hey, tell me why my delta between my estimated and my actual revenue is so different, it knows the actions that it can take, and it knows the tools that it can tap into, and the system for itself chooses which of these it's going to do in order to deliver that result back to you. MCP is a great way to make sure that you have that secure, trusted connectivity between your system and between your tools.
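The allow-list idea described here, meaning which systems the agent may touch and which activities are approved in each, can be sketched in plain Python. To be clear about the assumptions: this is not the MCP SDK or the protocol itself, and the `POLICY` table, the system names, and the `call_tool` helper are hypothetical names chosen to mirror the example on the slide.

```python
# Hypothetical policy: which systems the agent may touch, and how.
POLICY: dict[str, set[str]] = {
    "crm": {"read", "write"},
    "erp": {"read"},
    # "business_database" is deliberately absent: no access at all.
}

def is_allowed(system: str, action: str) -> bool:
    """Return True only if the system is listed and the action is approved for it."""
    return action in POLICY.get(system, set())

def call_tool(system: str, action: str) -> str:
    """Refuse anything outside the policy before it reaches the real system."""
    if not is_allowed(system, action):
        raise PermissionError(f"{action} on {system} is not approved")
    return f"{action} on {system} permitted"
```

With a table like this, `call_tool("crm", "write")` goes through, while `call_tool("erp", "write")` or any call against `business_database` raises `PermissionError`. The agent still chooses for itself which approved action to take, but it cannot step outside the list.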

Now, there are some other considerations that you would also need to consider when you're building or using agents. The top one here is security. And I know what you're thinking, Rachel, you just said that MCP is a secure, governed way to do this. And it is.

You also need to think about what allowed actions do you want your agent to take. Because the way that these agents are built, sometimes you can tap directly into an agentic capability through Salesforce, but maybe you don't want to be able to edit data in Salesforce from your agent. Maybe you only want it to be able to surface insights, or you only want it to be able to close opportunities but not edit them. These are the types of things that you have to think about.

And the more approved activities you allow your LLM or your agents to take, the more that usage cost is going to go up, because when your agent is planning out how it's going to accomplish the goal you want it to do, it has to look at all of those activities and decide for itself which ones it's going to use. So you have to start to think about the trade-offs of user experience and cost. And then finally, of course, always, you want to think about governance. Who's running this agent?

Who's approving what it can and can't do? Who's monitoring that it's doing what it's supposed to do? These are all the important questions that you need to be thinking about. If you have other questions about agents, I'm running one of the lunch roundtables all about agents.

You're welcome to come talk to me then, or I'm hoping to leave about 10 minutes at the end of this session for Q&A. So hold those questions, and we can absolutely talk through anything. Okay, final takeaways for agents. Number one thing, agents take action.

They solve problems. They understand the goal that you're trying to accomplish without you telling it specifically how you want it to go about solving that problem. It can make that decision and make that plan for itself based on the tools and the systems that it has access to. You can absolutely custom build your own agents, or you can leverage the agentic capabilities that are already built into your enterprise tool sets.

You can do a little bit of custom and a little bit of what's built in. And finally, capabilities like MCP, like model context protocol, give you that governed access, secure access into your systems, but you have to think broad scale about how you want this system to run and operate in your organization. Okay, that's agents. Now I want to move on to multimodal, which is a very fancy way of saying multiple different kinds of information, multiple different kinds of data.

You saw in one of those earlier slides, we talked about how traditionally, the first few rounds of generative AI was all about typing text in and receiving text back. And multimodal is all about all of those other different kinds of data, not just text, but images, audio, video, sensor data, charts, diagrams, all of these different types of data. A multimodal AI system can not only understand all of it and take that all in and ingest it, but it can also deliver it back out in a multitude of different kinds of outputs. And this gets really, really important when you start to think about things like generating insights or reports or KPIs or understanding what's happening with your customer base.

These are the types of things that multimodal AI can really, really flourish in. And an example that I like to use when I think about generating richer insights across the organization, especially in manufacturing and construction, which I know that many of us here today are in that kind of field, is all around predictive maintenance, right? We want our systems and we want our technologies to have as much uptime and runtime as possible. When things go down, you lose money, it gets really bad, really fast.

So you want to stay on top of that, right? And historically, maintenance teams might be looking at things like work order history. They might be looking at sensor data. You might have field techs going out and checking on things periodically.

Multimodal AI can build upon that by saying maybe it's important to you to look at thermal imagery changes across a wide period of time. Maybe it's listening to acoustics on the factory floor to hear how the technology sounds or the parts sound, and maybe something is starting to sound a little bit funky, and now we can make better understanding and better predictions of what's happening with that piece of technology. Maybe there's other sensors and other types of information that are now being layered in on top of and alongside that work order history and some of those sensor triggers to actually give you better predictive maintenance, better predictive analytics. Now, if we flash back to this image, obviously we have the start of image processing in 2023, but we also have what I've just kind of called multimodality boosts starting around the middle of 2025.

I want to talk about this a little bit because there have been some major advancements in multimodal in the last 6 to 12 months. And this is signaling to us that not only are multimodal capabilities growing, but they're going to continue to grow. And what does that mean for us as organizations? A couple quick things is that we saw some major advancements in our foundational models last year.

Gemini, for example, went from being able to just look at a video, take screenshots, and put it alongside a transcript to now actually understanding video content. We saw big jumps. You can see this chart here. This is how ChatGPT understands screenshots of user interfaces.

It jumped from 64% to 86%, I think. Let me double check. Something like that, of understanding. And we have visual reasoning and Claude jumped to 80.7%.

These are huge increases in capability. But Rachel, that was 2025. It's now five months into 2026. What else is going on?

Even in the last few months, we've seen major changes. The one that I really want to point out, GPT-5.2 is what was last year. GPT-5.5 is what's this year. And in this top row, you can see it went from 47% to 78% in this particular benchmark.

And this benchmark is talking about how agents act with multimodal content and multimodal information. This is a massive jump. This has huge implications for us as business users within our business processes: now we can look at other kinds of data beyond just text. And you can see that there were some changes in Claude in the last couple of weeks as well.

And we're seeing that with these changes and these improvements in the technology, the market at large is taking notice. If I just pull up a few screenshots here, we have Fast Company saying that 2026 belongs to multimodal AI, that we're evolving beyond static text into dynamic immersive interactions. We have Gartner saying that multimodal generative AI will transform enterprise applications, saying that by 2030, 80% of enterprise software is going to be multimodal.

And we have folks at IBM saying that multimodal AI is going to interpret the world like humans through visual language and visual processing, not just text processing. So here's my personal take. Do you remember how we said 2025 was the year of agents and then like half of us raised our hands and said we were using agents? I think something very similar is going to happen here.

We're seeing major changes. We're seeing big market signals saying that multimodal is the thing. And so I think it's going to take us as business users a little bit of time to fully understand what does that implication look like for us? How can I take advantage of multimodal data or multimodal capabilities in my business processes?

And I think that leaves us with a really unique opportunity to start thinking now, start planning now of what might change in your business, what might change in your workflows if you had access to these kinds of multimodal capabilities. And there's one thing that I'm going to kind of walk through here, and we have to talk just a little tiny bit about structured versus unstructured data. And I promise this is not going to turn into a data engineering talk, but we do have to talk about this just a little bit. So structured data is historically how we've been able to make sense of our business data and our business intelligence.

Unstructured data, on the other hand, could be like social media posts or call transcripts or e-mail chains and things like that. But it can also be some of that non-text data. It could be those audio acoustics, those thermal imaging clips. It could be recordings.

It could be customer call recordings and hotkey tracing and all of these kinds of things. And historically, if you wanted to access this unstructured data and use it in your reporting or in your KPIs, it was really, really hard to do that. You might need a significant amount of machine learning, as an example, to be able to make sense of all this data alongside of your structured data. Multimodal changes that.

And of course, we have personalized interactions. I think all of us can agree, and we see this out in market, that consumers kind of are demanding personalized interactions now because of the capabilities that AI gives us. Multimodal helps you get there. One example is like natural spoken dialogue through these AI systems.

It would be a really interesting endeavor to try and incorporate all of that into your organization at once, but it might not be the most successful. So you have to think about, well, what kind of unstructured data, if I had access to it, would make sense for me? What would actually help me move the needle in my organization? For example, if you are the type of organization that really thrives on like a customer call center, does it make sense for you to be able to understand customer sentiment and voice and tone and all of that type of information through audio recordings that we as humans can kind of naturally comprehend?

So now is the time to start thinking about what kind of data makes the most sense for your organization that will help you drive forward change. And of course, I've said this before, so I won't elaborate too much on it, but you also have to think about access governance. If you have all of this new data, who in your organization needs access to it? Who should be able to take that data and make decisions with it?

Is that a certain team? Is that certain members within the team? Is that everybody in your organization? Especially as you bring these capabilities into your systems, into your AI agents, or into your AI chatbots, or into any of these kinds of experiences, who needs to have access to what?

And that's something you need to start thinking about now as well. Okay. Some key takeaways from multimodal. The number one thing is that multimodal goes both ways.

It can take in all different kinds of data, and it can put out all kinds of data. We mostly talked today about how it's ingesting information to give you more business intelligence, but you can also think about, how does that change how I visualize my reporting or how I visualize my KPIs and things like that. We've seen so much advancement and so much change in the last 12 months when it comes to our foundational technology and our foundational models. We saw a lot of that massive, massive improvement, and we can only expect that those things are going to continue to improve.

So as our models and as our technology get better and better and better, how do we make sure that we're staying aware of all those changes and making sure we're taking advantage of all of those changes as well, beyond just improving how we output images or output creative concepts in our chat bots? And finally, we talked about this. You have to consider your data access, your governance policies. If you have a data engineering team on staff or if you have a data engineering partner, you might want to be starting to ask questions and think about, hey, how can we start thinking about bringing in other kinds of data into our reporting structures and things like that?

Now, on the heels of multimodal, right, where we think about all different kinds of content and all different kinds of data and information, that takes us into spatial AI. And we talked a little bit about this this morning. JC referred to it as physical AI, but this is all about an AI capability that's really, really good at understanding the physical world and virtual environments. This is still a relatively new space in our kind of market.

You might have heard terms like spatial AI, spatial intelligence, computer vision, physical AI, embodied AI. These are just words that you might have heard, and it's all kind of under this general umbrella. They all mean slightly different things, but it's all this general concept of bringing AI into our physical and 3D virtual worlds. Now, this matters because this allows our systems to understand and interpret and interact with either the real world or that immersive visual world.

It can perceive physical spaces and it can choose which actions to take based on the feedback that it receives from like real physical things around it. Obviously this morning we saw some examples of like the autonomous car driving itself. We heard about the robots that are running marathons. And those things are definitely cool and interesting.

And we also want to think about how does that apply to us in our businesses and in our space. Now, if you're the kind of business that relies on the physical world, if you're in construction, if you're in agriculture, if you have a factory floor, if you have disaster recovery or any kinds of that type of real-world aspect to your organization, spatial AI matters to you a lot. And there's so many cool and interesting and valuable use cases you can get out of this kind of technology. And one of the examples that we like to use is if you're training somebody on the factory floor or if you're trying to get somebody up to speed for like a real-world product out in the field.

Historically, if you're training somebody, you might have like job shadowing, you might have manuals and videos, you might have like a deprecated piece of equipment that somebody's going to practice on, and all of that is super helpful, but it's hard to mimic and imitate real life with that kind of experience. Spatial AI gives you the ability to plug AI into like a headset or a virtual environment or AR or VR or all these different kinds of things, and be able to give you real-time in-person coaching based on exactly what the person is doing. It might adjust the scenario on the fly. It might repeat steps that maybe somebody didn't quite get the first time around.

So we talked in multimodal about personalized experiences and personalized interactions. Spatial AI is all about that, especially in this particular kind of use case and this kind of workflow. Now, besides seeing things like autonomous driving cars and robots running marathons, what else is telling us that spatial AI is coming? And there's two areas that I want to talk about.

The first one is world models. Now, our generative AI tools right now, think Claude, think OpenAI, think Gemini, these tools are great at what they do, but what they're not great at is understanding physics and understanding simulation and understanding cause and effect and understanding 3D. You need a different kind of underlying technology to be able to do those types of things and do those types of experiences. And that's where world models come in.

World models can understand physics and simulation and all of those different kinds of things that are needed for the real world interactivity that you might be looking for. Last year, we saw both Google DeepMind and Meta release world models. We also saw out of Stanford Labs, that's the screenshot on the left, Stanford Labs released a lab called World Lab. Let me double check that that's what it's actually called.

Yes, World Labs, where they're building world models that are perceiving and generating and reasoning with the 3D world. So at Zirous, we have a division that is very focused on immersive visualization in AR and VR. My colleague Luke runs that division. He's here today.

If you have questions about any of this kind of stuff, I am happy to make an introduction. My knowledge on this particular area is a little bit foundational. His take on these is that these have so much potential and so much possibility for real-world applications, but they're still a little bit rudimentary. Thank you, 10 minutes.

They're a little bit rudimentary for our real-world business cases because they're not quite niche enough, they're not quite specific enough to be able to get you exactly to where you want to go. So you've got kind of world models sitting at right now, I would loosely call it the consumer level. But what's happening at the enterprise level is that you're seeing organizations like NVIDIA, with Omniverse, creating these enterprise-grade solutions. You can see here that there's this real-life example of PepsiCo recreating factory operations for AI agents.

And there's this partnership that's promising to deliver blueprints for AI factories and AI agents within that physical 3D space. So if you've got enterprise, with NVIDIA Omniverse up here, and you've got kind of world models down here at the consumer level, well, where does that leave us, right? This mid-market, small to medium space that can get a lot of value out of this kind of experience. Although these advancements are exciting, they are still kind of introductory, but again, that leaves us with this great opportunity to start thinking about, okay, if I had access to this kind of capability, what kind of experience would I want to build for workers, for my maintenance workers, for my staff, for training purposes, for any of these kinds of things.

And that leaves mid-market with this opportunity to start thinking now about what kinds of experiences you want to build in your own organization, especially if you're the kind of organization that is thinking about training scenarios, tailored personalized experiences, brand recognition, all of these kinds of things. There's so much potential there. And then that one kind of caveat is that if you are thinking about use cases right now, these tools at this time do require some fairly intensive 3D coding knowledge. So if you don't have that within your organization, you might want to start thinking about what kind of organization can help partner with you and get you to the end there.

Because if you're the kind of organization that just really focuses on chat and creativity, then a chatbot is perfect for you. There's nothing wrong with chat experiences. But if you're the kind of organization that says, hey, I really want my AI systems to start taking action and doing things on my behalf, then autonomous agentic systems are the pathway for you. And if you're the kind of organization that has a multitude of business data that you really want to tap into to really understand your business intelligence, then it's time to start coming up with a strategy for multimodal interactions.

We also have an on-site workshop that we do for organizations if you're interested in learning anything more about that. So that's what I got. Thank you very much. Questions?

I have a question. Yeah. Multimodal. That's pretty interesting.

I know that we've done, let me phrase this correctly, I guess, but web scraping, and a lot of times web scraping will get closed down by the security of the website or editor's website or whatnot. Would multimodal web scraping be a possibility so that it's actually working off more vision than actually trying to download information?

It's a great question. So for those in the room that didn't hear the question, it was all around, can multimodal help with web scraping?

Because a lot of the times our websites can block AI agents from coming in and being able to see some of that code on the back end. I would say yes. These vision capabilities are getting much, much better, like taking screenshots of your desktop and translating that and understanding exactly what it's seeing. It's not going to get the metadata, if that's something that's important to you to be able to scrape.

But what's actually on the website itself that a human can see, absolutely multimodal can be a great path towards that.

OK. How do you differentiate the different LLMs and what's better? Like, for example, if you open an OpenClaw agent that you can tie to different LLMs and API keys.

The problem with that is the cost gets really high. So how do you, I'm just having a hard time deciding what's good for what. Or are they all the same? I mean, what's your take on that?

It's a good question. So the question was like, how do you kind of differentiate between what type of LLM is good at what kinds of activities, especially if you're looking at building agents through something like OpenClaw. It's a really good question because a lot of the AI tools are constantly improving and they're constantly neck and neck. There's a website that I like to go to and I think it's, I can't remember the exact URL, but it's the Artificial Intelligence Index.

And essentially what it does is, it's a third party that looks at all the tools that are available and kind of rates them on overall intelligence, agentic capabilities, things like that. So you can always go there and kind of get a sense for what tool is better at different things. And honestly, a lot of them, because they change back and forth, many of them feel interchangeable when it comes to how intelligent they are.

Quad would be able to enable some of those different types of things, and we can also talk about that. One of my more technical folks is here today as well, and he would probably be able to answer that a little bit better than I can. Yeah, of course. Another question.

Yeah. Going back, going to agents. I brought this question up in the last session that we've kind of roadblocked allowing AI into HubSpot. So we use HubSpot, my company that I work for.

What kind of reassurance could I give Steve, our security guy, so I can put AI into HubSpot for lead vetting or any type of automation?

It's a great question. And I think that a lot of that goes into, we talked a lot about what kinds of approved activities can you allow the AI to take and what do you say, nope, you can't do that at all.

So one of the things that you might think about is saying to, you said Steve, what's his name? You might go to Steve and say, hey, why don't we start with read-only access? We won't allow AI to change anything in the database. We won't allow it to edit anything. No edit, no delete, none of that.

Let's just focus on read. And that way, your users could then query your LLM of choice and say, like, what opportunities do I have in the pipeline that are supposed to close in the next two weeks? The agent can go and look at everything, but it can't change anything. And then it would just deliver that answer back to you.

That might be a place for you to start. All right. Thank you, everyone.