Success Story #2 - Natural Language Search for Member Benefits - CIRAS AI Summit

PRODUCTION AND OPERATIONS

3:10 PM – 3:55 PM

Room 275

SPEAKER

Nick Nystrom

Experience Lead, Wellmark Blue Cross & Blue Shield

## Session: Success Story #2 – Natural Language Search for Member Benefits
**Track:** Production and Operations | **Time:** 3:10 PM–3:55 PM | **Room:** 275 | **Type:** Success Story
**Conference:** CIRAS AI Summit for Iowa — May 6, 2026, Scheman Building, Iowa State University, Ames IA

### Speaker(s)

**Nick Nystrom** — Experience Lead, Wellmark Blue Cross & Blue Shield (Des Moines, IA)
Nick Nystrom is an Experience Lead focused on designing and delivering enterprise-scale digital experiences that help members navigate complex healthcare journeys. His work sits at the intersection of human-centered design, product strategy, and emerging technology, with a particular focus on how AI can reduce friction and improve clarity for customers. Nick has led cross-functional teams across design, research, analytics, and engineering to translate complex systems into intuitive, accessible experiences.

### Session Description

This session showcases a hands‑on, end‑to‑end exploration of how natural language search can transform the member benefits experience within the myWellmark ecosystem. The presentation begins by introducing the core problem statement: members struggle to locate and understand their PQF benefits due to unintuitive, jargon‑heavy search tools. Using real usability testing findings, the session grounds the problem in real‑world user experience challenges.

From there, the session shifts into a practical, interactive walkthrough of the proposed AI‑powered solution. Attendees are guided through the architecture in an accessible format, visualizing how AWS S3, Bedrock, Lambda, and API Gateway work together to deliver deterministic responses to human‑language queries. The session includes three live, scenario‑based benefit searches—diagnostic colonoscopy, maternity benefits, and shoe inserts—demonstrating how natural language inputs return precise and category‑aware benefit results.

Throughout, the format blends technical explanation, real interface screenshots, and storytelling to make complex AI and NLP concepts relatable. The session concludes with measurable value insights, team learnings, production considerations, and projected cost savings, creating a clear connection between innovation, user experience, and operational impact.

### Other sessions in the Production and Operations track

- Success Story #1 - Vision AI Efforts in Attribute Detections and Measurements (3:10 PM–3:55 PM)
- Industrial AI Success Stories: Because Even My Title Needed Machine Learning (10:20 AM–11:05 AM)
- Tabular Foundation Models Meet Manufacturing: A Practical Exploration (11:15 AM–12:00 PM)
- AI Attribute Intelligence: Automating Detection, Extraction, and Standardization at Scale (1:20 PM–2:05 PM)

### Suggested prompts for this session

- "What questions should I prepare to ask the speaker(s) at this session?"
- "Create a structured note-taking template for this session focused on actionable takeaways"
- "Based on this session description, what background reading should I do to get the most value?"
- "After I attend, help me create an action plan for implementing what I learned"
- "How does this session connect to the other sessions in the Production and Operations track?"

TRACK Production and Operations

FORMAT Success Story

ROOM 275

Key Takeaways

Natural language transforms benefit search.
A practical, scalable architecture attendees can model.
Clear business impact and measurable value.

Continue the conversation with Nick Nystrom at the Production & Operations Facilitated Discussion — 2:15 PM - 3:00 PM, Room 220-230-240

Transcript from Summit:

00:00 Introduction to AI-Driven Natural Language Search in Healthcare Slide: 1

natural language search healthcare benefits AI member experience Wellmark Blue Cross Blue Shield

Good afternoon. It's my pleasure to introduce Nick Nystrom, an Experience Lead specializing in design and delivery of enterprise-scale digital experiences. His work brings together human-centered design, product strategy, and emerging technologies with a focus on using AI to simplify complex customer journeys, particularly in healthcare. Nick has led cross-functional teams across design, research, analytics, and engineering to translate high-complexity systems into intuitive and accessible user experiences. In today's session, he will explore how natural language search can improve the way members find and understand their benefits, demonstrating how AI can reduce friction, increase clarity, and deliver meaningful value in real-world applications. So please join me in welcoming Nick Nystrom.

Thank you. Appreciate it, everybody, for being here. Good afternoon. I'm Nick Nystrom from Wellmark Blue Cross Blue Shield of Iowa and South Dakota. Today I'm going to walk you guys through a story about how to apply AI in a place where information is very complex, regulated, and really incredibly human.

01:06 Wellmark Organization and Member Experience Team Slide: 2

wellmark iowa south dakota member experience team human-centered design

Member benefits and customer service is really where we're going to focus today. This isn't a product launch or a live demo. It's a real exploration: what worked, what didn't work, and really what we learned as a team through that lens of an AI-driven experiment. So, what I want to start with first today is just a quick overview about myself and a little bit about Wellmark. Wellmark is in Iowa and South Dakota. We have about 2,000 employees, and the headquarters is in Des Moines, where I work. I'm on a member experience team that's embedded within technology at our company, and there's about 60 of us on our team. We have disciplines ranging from design to analytics to user testing to research to experience delivery. We have content writers, we have designers, obviously. So a lot of disciplines that really help me as an experience lead deliver epic experiences to our members. So just by a raise of hands, how many have Wellmark insurance? Anybody here? Okay, a lot of people. We have a lot of members, about 2 million members in Iowa and South Dakota. My team's focused on really the first part of the member's journey.

02:08 Human-Centered Design and Measurement Approach Slide: 2

human-centered design luma measurement google analytics voice of customer

So that's the invite, the shop, the enroll, and the welcome moments. So how do we show up for those members in those service moments? That's what my team's focused on, right? So a lot of it's driving adoption to our self-serve platform, which is our mobile app. But there's also initiatives I'm leading across our enterprise, and I'll talk a little bit about that as well. We are all trained in human-centered design. We use Luma, which is a methodology for practitioners of human-centered design. I'm a Luma-certified instructor as well, so we're also rolling out Luma practitioner trainings to the rest of the organization on different product teams. So human-centered design, if you're not familiar with it, there's a lot of different methodologies. Luma is the one that we use at Wellmark. It's really just about putting the user at the center of how to solve your problem, which yields really good results for us. We make a lot of data-driven decisions, so we measure everything we build. There's been a lot of themes throughout today's talks around measurement. We measure in the form of interaction metrics using Google Analytics on our digital properties.

03:09 Speaker Background and AI Music Production Slide: 2

career background kingland product manager TAI ITLI wedding DJ

We also have Voice of Customer. We have a really extensive Voice of Customer program. So if you interact with something, whether it's physical, a customer service phone call, or our mobile property, you're going to get an e-mail, right? And we're going to ask questions about that. And we track that on a scorecard so we can start to see sentiment on an experience that we deliver. And that really helps us iterate and refine that once we launch. And then the final thing is outcomes, right? So when I drive an epic experience, I'm looking for outcomes. Some of them are business, some of them are member. But ultimately, outcomes, interactions, and VOC is how we really make decisions at Wellmark on the experience team. I've been at Wellmark for four years. Before that, I was at a company called Kingland, right down the road here in Ames, Iowa. I spent five and a half years there as a product manager and product analyst. Got a lot of good experience there. I was also part of the Technology Association of Iowa's ITLI, which is their leadership program, last year. I was a graduate of that. And I'm a DJ, 20 years in the wedding and corporate event space. And we've had a lot of talks about AI today.

04:11 AI Adoption Challenges and Persona-Based Research Slide: 3

AI adoption organizational culture personas journey maps foundational research

I dove in last year, actually, into creating my own music, and I leveraged Suno AI to do the vocals. I produce a lot of my own beats and all that. It's EDM house music, but I didn't have a singer. And during JC's keynote, I was loving that he was bringing a little bit of AI music flavor into that because that hit home with me. So I have 16 songs out on all streaming platforms, so if you feel like a little workout music later on, you can search me up and find my music out there. That's potentially, potentially. So let's talk about the problem statement, right? This is not a product demo, as I mentioned. This is really a story about how we got here. As AI adoption accelerates, a lot of the conversation focuses on complexity and capabilities. What technology can do, what we're going to focus on, where it fits in within our organization, how it can help our people. All of those things, if applied carelessly, can be not good for your organization and culture. So we're going to talk about that experience, not just the technology. We base everything in personas, as I mentioned. We have our great research team that will do foundational research that yields journey maps, personas, all of the things that I need as an experience lead to make a decision on strategy of how we approach something, which is so great to have that ability on our team.

05:22 Member Benefit Search Problem and Natural Language Need Slide: 4

benefit search natural language chatgpt medical jargon coverage questions

We did some testing around just searching benefits, which is a large problem. You guys may have run into this before. Finding out what's covered. How do I get service? Can I get this surgery? Is this preventative covered? All those things, right? It's complex, and it depends on your plan and your familiarity with healthcare. So what we looked at here was trying to figure out how participants in that space want to search for benefits. And surprise, they want to use regular language. Like, they just want to ask it questions. And I think we have probably ChatGPT on the commercial side to blame for that, because everybody's using it off their mobile phone on the side of their desk. So querying and asking standard questions in human language is kind of the norm now. And so we thought, well, how are we going to solve that for our complex benefits when it comes to medical jargon and all of that stuff? How are we going to do that? So that's what I want to talk about today. And I want you guys to meet Sarah, right? So I want to start with a little bit of a story. Imagine you guys are a member, which I think a lot of you in this room are.

06:24 Member Search Frustration and Customer Service Costs Slide: 4

member frustration benefit PDFs legal language customer service call center

You just got a bill in the mail, and it's higher than expected, but you don't get why. So what do you do, right? You do what all of us do. You go online, you search, you type something like, why wasn't my visit covered? Why was this denied? And what you get back isn't an answer. You usually get PDFs. Sometimes you'll get legal language. Sometimes you get benefit summaries written for compliance, not comprehension from a member point of view. So now you're frustrated, not because the information doesn't exist, but because it feels impossible to find your answer. So eventually what do you do? You call customer service, right, which is our highest-cost channel to serve our members. So you get the information from the customer service rep. A human has to translate that complex information in real time under pressure and deliver that for you, right? That moment where that search failed and support takes over is really where our story begins. So I want to show a quick video that I think will really hit home with this audience. And I'm going to switch over. Sorry, I'm not sure what that is. Okay.

07:34 Video Demo of Natural Language Search Solution Slide: 7

video demonstration natural language search pregnancy benefits self-service AI search

Jamie is about to have her first baby, so she goes to mywellmark.com to understand her medical benefits. She scrolls and scrolls and scrolls. Hundreds of options. The answers are there, but she can't find them. Frustrated, Jamie gives up and calls for help. That happened far too often, so Wellmark fixed it. Using AI, we created a new way to search, one that understands the way people actually talk. Now Jamie types one word, and natural language search instantly finds the right coverage. Suddenly, it's all there. Prenatal care, postnatal support, even breast pumps she didn't know were covered. One search, clear answers, and Jamie can get back to what matters most. Fewer calls, faster help, a more efficient system for everyone. So as you guys can imagine, pretty awesome, right?

08:32 Core Problem: Members Ask Questions, Not Keywords Slides: 5, 6

member questions search behavior keyword search coverage questions system mismatch

For our members that are calling and trying to find this stuff, they could self-serve. Our customer service agents can leverage this as well to help serve members. And that's super important, right? Because that's going to give them value as a Wellmark member that maybe another health insurance carrier might not. So we talk about the core problem. Members don't search for benefits, right? They ask questions. That's what they do. Is it covered? What do I do next? Why do I owe this? That's the fundamental issue here. Members don't search, right? They ask those questions. So search is assuming people know what to ask for. Benefits assume people know how the coverage works. Neither of those assumptions is true, however. So there's a constant mismatch between how the systems are built and how people behave. And that's really what we're going to focus on today with that core problem I mentioned. Why does the search fail? Traditional search assumes two things, right? Actually, three things when I think about it. The first is that users know the right words, which they don't always. Second, that the content's readable. And third, that answers all live in one place.

09:34 Why Traditional Search Fails in Healthcare Slide: 7

search failure healthcare terminology content fragmentation plan variations context-dependent answers

Sadly, in healthcare, they don't live all in one place. This really breaks down. Terminology varies between plans, content's fragmented, and answers depend on context. The plan, the claim, the timing, and all the expectations that the member has, all of those vary depending on the situation for the member. So that can be a big part of why they can't get answers and why they can't find what they're searching for. So when search fails, customer service absorbs the cost for us. And I mentioned that that's the highest-cost channel that we have. So calls increase, handle times go up, and our customer service agents are forced to act as, like, translators between what the member's asking in that complex benefit question and really giving them that answer that'll help them in the real human situation. So this is not really a digital experience problem. It's kind of an operational one if you look at it. So this just shows kind of generically how, you know, the confusion starts, a member calls, a customer service representative has to translate that, which leads to longer handle time and obviously less customer satisfaction.

10:38 Zero Tolerance for AI Hallucinations in Healthcare Slide: 8

AI hallucinations healthcare regulations accuracy requirements coverage errors medical procedures

We all want to make sure that we can handle those member requests as soon as they come into customer service and get the member what they need when they need it. I'm also going to talk here about why this is a hard AI problem. If it were easy to apply here, I mean, it would be everywhere already. It's slowly starting to get to a place where AI is embedding everywhere. But in terms of healthcare and searching, it's not there yet, right? These healthcare benefits involve regulated content, the fragmented systems I mentioned, and really zero tolerance for hallucinations. You can imagine someone wanting to do a preventative service, a heart surgery, a transplant, or something very serious and getting information that it's covered, and then they go and have the surgery, and then they're stuck with a $100,000 bill. I mean, that happens. We hear stories of that happening. I'm sure you guys maybe know people that have had issues with getting the wrong information. So hallucinations, obviously, as you know, can happen in AI, and there's zero tolerance for that in the healthcare space. So getting something almost right can actually be worse than getting it wrong in our profession in healthcare.

11:41 Hackathon Origin and First Place Win Slide: 9

hackathon wellmark first place project funding legal compliance

So that reality heavily shaped how we approach this work. But you can see here there's a lot to be considered in this domain specifically. So let's begin with internal testing. We used a hackathon, actually, to do this work. Wellmark decided last year to do our first official hackathon ever. It was three days. You got to partner with anyone you wanted. You could submit ideas for a period of two weeks, and then you could actually request to be on a team, and then the teams were assembled for those three days on site. It was actually an incredible experience. This was the idea that my team had: to solve search for members using AI. And out of 19 teams, we actually won. We took first place last year. And because of that, Wellmark funded the work, which I just thought was super cool for a company to not only sponsor a hackathon for three days, but then fund the winning project. So we funded that last year, and we're getting ready to implement it next month for our members, which is just incredible. A year, yeah, it took a long time, but you can imagine the legal conversations and compliance conversations we've had to have and go back and forth on what we're actually saying on the screen.

12:48 Hackathon Structure and Cross-Functional Collaboration Slide: 10

time-boxed cross-functional teams rapid prototyping controlled failure hackathon structure

I think we've landed on "AI assist" with a bunch of legal language, really small. So there's that too. But we leveraged, like I said, time-boxed, low-risk, cross-functional. So we had developers, we had analysts from different parts, we had operations folks. We had about 12 people on my team for those three days. And then we presented to leadership and everybody else, which was really fun, right? So that rapid failure and that controlled structure, I took that from JC this morning from his keynote, that was key for the hackathon, so they're getting ready to do that again this year. I'm looking at some potential teams to join, but this is a very cool way to not only get everybody together from a culture perspective, but actually deliver working stuff that we've now implemented, which I think is super awesome. So let's keep talking. The hypothesis, right? Our hypothesis was simple. What if people could ask questions in their own words and the system met them halfway? Not a chatbot replacing humans, not a magic answer engine, but a bridge between human language and that complex benefit logic, right?

13:49 Hypothesis and Amazon Alexa Inspiration Slide: 1

hypothesis amazon alexa natural language bridge human language benefit logic

We thought about Amazon Alexa as, like, vibes, and we were thinking, like, how do we want this to feel? We said, why couldn't you just ask Alexa, like, if it's covered, right? So that jokingly became how we thought about this. How easy would it be just to be like, are the things I walk around on covered, right? Which would be orthotics, right? But how does search know that you're talking about feet, right? And that's where AI comes in, right? That's where that language model comes in. So it was pretty cool to see. We had a working demo for our hackathon debut, and we had a bunch of executives coming up trying to, like, stump it, trying to, like, get it to not bring back benefits. But surprisingly, it worked very well. And they were like, okay, we can see the benefit in this. So conceptually, the architecture. Legal sadly wouldn't let me put anything in here that we used. But you guys can use your imagination. We have a repository here, so that's all of our documents that we have, benefit documents, think all of those historic documents. We're using AI retrieval, so semantic vector matching.

14:50 AI Architecture Using Retrieval and Chunking Slide: 12

AI architecture document repository semantic search vector matching chunking methods

We use chunking methods, which some of the big service providers offer in their language models. The chunking is how it takes that segment of information and displays it to the member. And there's various different methods of chunking. So we tried and tested a bunch of them until we kind of got the result that we felt was going to give the member the best result, which was cool. And then ultimately, we had some lambdas and some service layers we built to connect it to our member portal and our customer service CRM, things like that, which is great. So this is a little bit of an overview of the conceptual architecture. If you want to get into that after, I'm more than happy to dive in. Let's talk experience, right? So what we deliberately did not do was surface raw policy text. We didn't pretend that AI was certain. We didn't optimize for cleverness. We optimized for clarity, restraint, and trust, right? We just heard all about trust. How are our members going to trust us? If we get a wrong answer, and they go to the doctor and it's not covered and they used AI.
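The retrieval idea described above can be sketched in a few lines. This is an illustration only, not Wellmark's implementation, which the speaker says could not be shared. It assumes a fixed-size overlapping word window as the chunking method and uses a toy bag-of-words vector in place of a real embedding model; the function names `chunk_text`, `embed`, `cosine`, and `search` are hypothetical. A production system would use a learned embedding so that a query like "the things I walk around on" can land on orthotics, which simple word overlap cannot do.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 12, overlap: int = 4) -> list[str]:
    """Split text into overlapping word windows (one simple chunking method)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], top_k: int = 1) -> list[tuple[float, str]]:
    """Rank chunks by similarity to the query and return the best top_k."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "maternity coverage includes prenatal care and breast pumps",
        "orthotics coverage includes custom shoe inserts",
    ]
    for score, chunk in search("are shoe inserts covered", docs):
        print(f"{score:.2f}  {chunk}")
```

Chunk size and overlap are the knobs a team would tune when, as described in the talk, trying several chunking methods until the returned passages read well to a member.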

15:52 Design Principles of Clarity, Restraint, and Trust Slide: 13

clarity restraint trust design principles policy text

They're not going to trust Wellmark. They're not going to trust anything that we tell them. So it's super important for us to be clear and really restrain ourselves, because in healthcare, that confidence without that accuracy is super dangerous for our members. We talked about the current climate. We all know kind of the story around UnitedHealthcare and that whole sad thing that happened. We are not using AI at Wellmark for any healthcare outcomes, any determinations of claims. We are not doing any of that. We're using it internally as a workforce. We have Copilot and all of that stuff, but we're not leveraging it for member-facing things by any means. This would be the first thing that we're leveraging AI for. But I feel like it's a very controlled application of AI. It's not generative. It's very specific. So that's kind of where we're at with the experience lens on this. Two audiences, two jobs, right? So I mentioned we have our members and we have our customer service agents. They don't need the same answers, right? Members need the clarity, they need the confidence, and they need that empathy.

16:53 Dual Audience Design for Members and Customer Service Agents Slide: 13

dual audience members customer service agents CXA self-service

And what are their next steps, right? That's what they're looking for. The CXAs, however, they need the speed and the traceability to get that information quickly to deliver to the members. So it's two audiences with two completely separate jobs, but we designed for both of them at the same time. And that forced us to think more carefully about the experience outcomes that we're trying to drive, right? So that's the self-service side of things versus the customer service side of things. But either way, they both can use the same solution, which I thought was really awesome and it was really impactful, I think, to the leadership group to hear we can actually not only solve members' issues, but we can solve speed and clarity around what our CXAs are doing. So what broke first, right? AI struggled where humans also struggle, which is not a surprise. That's ambiguity. So conflicting sources, vague benefit terms, those edge cases. We saw early tendencies toward overconfidence, which reinforced the need for those guardrails. We talk about enterprise-wide initiatives a lot in this space as well.

17:56 AI Challenges with Ambiguity and Overconfidence Slide: 15

ambiguity overconfidence guardrails conflicting sources edge cases

This is an enterprise-wide initiative. We had to get the buy-in from the stakeholders. A lot of different departments are siloed, but we're working to come together and say, like, how can we all leverage the same tool so we can get confidence in deploying this to our members and they have a better experience when it comes to finding their benefits? So what surprised us? It was how much less AI actually needed to do. Smaller, well-scoped answers actually built more trust with our testers. Retrieval beat generation, and transparency built confidence, even when the answer was "it depends," right? So it was really about making sure that we keep that level of trust when we're having interactions with our members and really restraining the scope around it versus just letting it go. That was really important as well. So I have a couple of minutes left here, and I want to make sure I allow time for questions. From an experience lesson: AI isn't an experience on its own, it's part of one. It can reduce cognitive load, but not remove complexity.
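One way to picture "retrieval beat generation" with guardrails is a response step that only ever returns retrieved passages with their source, and says so when it is not confident. The sketch below is an assumption-laden illustration, not the system described in the talk: the `Passage` type, the `respond` function, and the `min_score` and `min_margin` thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str   # document the passage came from (hypothetical identifier)
    text: str     # passage text, returned verbatim rather than generated
    score: float  # similarity score from the retrieval step, 0.0 to 1.0


def respond(passages: list[Passage], min_score: float = 0.6,
            min_margin: float = 0.1) -> dict:
    """Return a scoped answer, an 'it depends' result, or no match.

    The thresholds are illustrative assumptions, not tuned values.
    """
    ranked = sorted(passages, key=lambda p: p.score, reverse=True)

    # Guardrail 1: nothing retrieved with enough confidence, so do not answer.
    if not ranked or ranked[0].score < min_score:
        return {"status": "no_match", "passages": []}

    # Guardrail 2: two different sources score almost the same, so surface
    # both and be transparent that the answer depends on context.
    if (len(ranked) > 1
            and ranked[0].source != ranked[1].source
            and ranked[0].score - ranked[1].score < min_margin):
        return {"status": "it_depends", "passages": ranked[:2]}

    # Otherwise return the single best passage along with its source.
    return {"status": "answer", "passages": ranked[:1]}


if __name__ == "__main__":
    result = respond([
        Passage("plan-a.pdf", "Orthotics are covered once per year.", 0.82),
        Passage("plan-b.pdf", "Orthotics require prior approval.", 0.80),
    ])
    print(result["status"])
```

Returning the source alongside each passage also serves the traceability that the talk says customer service agents need.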

18:59 Success Through Smaller Scope and Transparency Slide: 19

scoped answers retrieval vs generation transparency trust building corporate insights awards

It can assist, but it still needs humans in the loop, right? So that's what we're looking at here. We just were awarded some Corporate Insights Awards, which is an industry kind of recognition for best mobile app and desktop app. We finished second on that. 24 health insurance carriers were rated, and we were #2 on that, right behind Anthem, and Anthem has a massive budget compared to what Wellmark has in terms of experience. So I think we're doing some things right in this space, and this just kind of shows that some of the recognition is coming our way. From a technical perspective: bounded domains, clear governance, strong content. Responsible AI is an experience decision, not just a platform decision. A few organizational lessons we learned. This changed kind of how we collaborate as teams. Experience, content, operations, technology can't operate independently anymore, right? AI forces alignment or exposes the lack thereof. So our hackathon part 2, like I said, is coming up next.

19:59 Lessons Learned and Scalability Potential Slides: 20, 21, 11

cognitive load humans in the loop cross-team collaboration scalability hackathon model

But this could scale beyond healthcare, right? We're talking other domains: insurance, government, higher ed. These challenges show up everywhere. It's just about allowing people to ask the right questions in the way that they want to and providing the answer to them. What this is not, though, as an AI reality check: it's not a replacement for humans, not a knowledge oracle, and of course, not without guardrails. I'll leave this last quote up here before I end. This was from one of our team leaders that was part of our testing group. I'm not going to read the quote, but basically she handles our customer service team, and she just really saw the value of having something like this for her CXAs to use to help members. She thought this could be magnified by 200x agents if we implemented this at Wellmark. So it's a very promising piece of technology, not only for internal operations, but for our customers that, you know, use Wellmark insurance. And with that, I know I ran through that a little fast, and we are a little short on time, but I wanted to open up for a question or two, if you guys have any.

21:01 Implementation Metrics and Data Quality Work Slide: 11

UAT metrics legal compliance implementation team data quality

Yes, do you have metrics that measure sales queries? Yes, we do, so we're compiling that right now. I mentioned we're in UAT with all of our stuff right now, and we're finalizing all the testing. And we still have to get through the rest of the legal compliance stuff, but they are documenting that. I haven't been as close to the implementation team, sadly. I've been, I was part of the hackathon team, but we have another AI-focused team that's doing the implementation at Wellmark, but more than happy to find out. Is there a bar, like a percentage of success or failure? No, I'm not even sure, to be honest with you, but I'd be more than happy to find out for you. Yeah. One of the things I mentioned is like you've got to go in and treat your data. So you find out that on your documentation there were a lot of contradictions. Yep, we had to, we spent a lot of time and effort over the last year cleaning up our benefits document catalog and really optimizing that. So it, like everyone says, you got to have good data to get good output. So we did some of that legwork. It was in a pretty good spot before. It just was a lot, very complex and hard to understand for most people.

22:05 Fallback Mechanisms for Unanswered Queries Slide: 10

fallback mechanisms unanswered queries chatbot secure messaging customer service

So if it has a problem that it can't answer, does it have kind of an escape mechanism to talk to a human? I'm using an example from last year, it took me 6 hours with AT&T to fix the problem once. Wow. Because every time I went on, I had to... menu, AI, get to a person, get to the next person. It was highly frustrating. Yeah, so that's a great call out. Our design team has put in some things. So if you're searching and you're just not getting, I think it's like 2 searches. If it doesn't come back, there's of course some chatbot help or send a secure message or call customer service. So we're not going to eliminate that, but we hope that self-serve. You know, that's always the way, I think, is members want to self-serve if they can. It's just making it easy for them to do that. Just learning yours get you to a person, but that works in other ones. What was that? Yeah, that's true. That works for CVS. Oh, good to know. They're our pharmacy partners. Cool. All right, folks, I think that puts us pretty much at time. So everybody, please join me in thanking Nick for his wonderful presentation. That brings us pretty close to the end of the day, folks.

23:10 Session Closing and Event Wrap-Up Slide: 10

session closing thank you NIST closing comments event wrap-up

There should be some closing comments and some stuff from Robert Ivester of NIST back in the main room, and then followed by our final closing comments from Paul. Thank you for being here.

So in today's session, he will explore how natural language search can improve the way members find and understand their benefits, demonstrating how AI can reduce friction, increase clarity, and deliver meaningful value in real-world applications. So please join me in welcoming Nick Nystrom. Thank you. Appreciate it, everybody, for being here.

Good afternoon. I'm Nick Nystrom from Wellmark Blue Cross Blue Shield of Iowa and South Dakota. I'm going to walk you guys through today a story about how to apply AI in a place maybe where information is very complex, regulated, and really incredibly human. Member benefits and customer service is really where we're going to focus today.

This isn't a product launch or a live demo. It's a real exploration. What worked, what didn't work, and really what we learned as a team through that lens of AI-driven experiment. So, what I want to start with first today is just a quick overview about myself and a little bit about Wellmark.

Wellmark is in Iowa and South Dakota. And we have about 2,000 employees, and the headquarters is in Des Moines, where I work. I'm on a member experience team that's embedded within technology at our company, and there's about 60 of us on our team. We have everybody from disciplines ranging from design to analytics to user testing to research to experience delivery. We have content writers, we have designers, obviously.

So a lot of disciplines that really help me as an experience lead deliver epic experiences to our members. So just by a raise of hands, how many have Wellmark insurance? Anybody here? Okay, a lot of people.

We have a lot of members, about 2 million members in Iowa and South Dakota. My team's focused on really the first part of the member's journey. So that's the invite, the shop, the enroll, and the welcome moments. So how do we show up for those members in those service moments?

That's what my team's focused on, right? So a lot of it's driving adoption to our self-serve platform, which is our mobile app. But there's also initiatives I'm leading across our enterprise as well, and I'll talk a little bit about that as well. We are all trained in human-centered design.

We use Luma, which is a methodology for practitioners of human-centered design. I'm a Luma certified instructor as well, so we're also rolling out Luma practitioner trainings to the rest of the organization on different product teams. So human-centered design, if you're not familiar with it, there's a lot of different methodologies. Luma is the one that we use at Wellmark.

It's really just about putting the user at the center of how to solve your problem, which yields really good results for us. We do a lot of data-driven decisions, so we measure everything we build. There's been a lot of themes throughout today's talks around measurement. We measure in the form of interaction metrics using Google Analytics and our digital properties.

And we track that on a scorecard so we can start to see sentiment on an experience that we deliver. And that really helps us iterate and refine that once we launch. And then the final thing is outcomes, right?

So when I drive an epic experience, I'm looking for outcomes. Some of them are business, some of them are member. But ultimately, outcomes, interactions, and VOC is how we really make decisions at Wellmark on the experience team. I've been at Wellmark for four years.

Before that, I was at a company called Kingland, right down the road here in Ames, Iowa. I spent 5 1/2 years there as a product manager and product analyst. Got a lot of good experience there. I was also part of the Technology Association of Iowa's ITLI, which is their leadership program last year.

I was a graduate of that. And I'm a DJ, 20 years in the wedding and corporate event space. And we've had a lot of talks about AI today. I dove in last year, actually, into creating my own music, and I leveraged Suno AI to do the vocals.

I produce a lot of my own beats and all that. It's EDM house music, but I didn't have a singer. And during JC's keynote, I was loving that he was bringing a little bit of AI music flavor into that because that hit home with me. So I have 16 songs out on all streaming platforms, so if you feel like a little workout music later on, you can search me up and find my music out there.

That's potentially, potentially. So let's talk about the problem statement, right? This is not a product demo, as I mentioned. This is really a story about how we got here.

As AI adoption accelerates, a lot of the conversation focuses on complexity and capabilities. What technology can do, what we're going to focus on, where it fits in within our organization, how it can help our people. All of those things, if applied carelessly, can be not good for your organization and culture. So we're going to talk about that experience, not just the technology.

We base everything in personas, as I mentioned. We have our great research team that will do foundational research that yields journey maps, personas, all of the things that I need as an experience lead to make a decision on strategy of how we approach something, which is so great to have that ability on our team. We did some testing around just searching benefits, which is a large problem. You guys may have ran into this before.

Finding out what's covered. How to get service? Can I get this surgery? Is this preventative covered? All those things, right?

It's complex, depends on your plan and your familiarity with healthcare. So what we looked at here was looking at trying to figure out how participants in that space want to search for benefits. And surprise, they want to use regular language. Like they just want to ask it questions.

And I think we have probably ChatGPT on the commercial side to blame for that because everybody's using it off their mobile phone on the side of their desk. So querying and asking standard questions in human language is kind of the norm now. And so we thought, well, how are we going to solve that for our complex benefits when it comes to medical jargon and all of that stuff? How are we going to do that?

So that's what I want to talk about today. And I want you guys to meet Sarah, right? So I want to start with a little bit of a story. Imagine you guys are a member, which I think a lot of you in this room are.

Why was this denied? And what you get back isn't an answer. You usually get PDFs. Sometimes you'll get legal language.

Sometimes you get benefit summaries written for compliance, not comprehension from a member point of view. So now you're frustrated, not because the information doesn't exist, but because it feels impossible to find your answer. So eventually what you do, call customer service, right, which is our highest cost channel to serve our members. So you get the information from the customer service rep.

A human has to translate that complex information in real time under pressure and deliver that for you, right? That moment where that search failed and support takes over is really where our story begins. So I want to show a quick video that I think will really hit home with this audience. And I'm going to switch over.

Sorry, I'm not sure what that is. Okay. Jamie is about to have her first baby, so she goes to mywellmark.com to understand her medical benefits. She scrolls and scrolls and scrolls.

Hundreds of options. The answers are there, but she can't find them. Frustrated, Jamie gives up and calls for help. That happened far too often, so Wellmark fixed it.

Using AI, we created a new way to search, one that understands the way people actually talk. Now Jamie types one word, and natural language search instantly finds the right coverage. Suddenly, it's all there. Prenatal care, postnatal support, even breast pumps she didn't know were covered.

One search, clear answers, and Jamie can get back to what matters most. Fewer calls, faster help, a more efficient system for everyone. So as you guys can imagine, pretty awesome, right? For our members that are calling and trying to find this stuff, they could self-serve.

Our customer service agents can leverage this as well to help serve members. And that's super important, right? Because that's going to give them the value as being a Wellmark member, maybe that another health insurance carrier might not do. So we talk about the core problem.

Members don't search for benefits, right? They ask questions. That's what they do. Is it covered?

What do I do next? Why do I owe this? That's the fundamental issue here. Members don't search, right?

They ask those questions. So search is assuming people know what to ask for. Benefits assume people know how the coverage works. Neither of those assumptions are true, however.

So there's a constant mismatch between how the systems are built and how those systems behave. And that's really what we're going to focus on today with that core problem I mentioned. Why does the search fail? Traditional search assumes 2 things, right?

Actually, three things when I think about it. The first is that users know the right words, which they don't always know the right words to search. Second, that the content's readable. And third, that answers all live in one place.

So that can be a big struggle on why they can't get answers and why they can't search on what they're looking to find. So when search fails, customer service absorbs the cost for us. And I mentioned that that's the highest cost channel that we have. So calls increase, handle times go up, our customer service agents are forced to act as like translators between what the member's asking in that complex benefit question and really giving them that answer that'll help them in the real human situation.

So this is not really a digital experience problem. It's kind of an operational one if you look at it. So this just shows kind of generically how, you know, someone would call, the confusion, a member calls, a customer service representative has to translate that, which leads to longer handle time and obviously less customer satisfaction. We all want to make sure that we can handle those member requests as soon as they come into customer service and get the member what they need when they need it.

I'm also going to talk here about why this is a hard AI problem. If it were easy to apply here, I mean, it would be everywhere already. It's slowly starting to get to a place where AI is embedding everywhere. But in terms of health care and searching, it's not there yet, right?

These health care benefits involve regulated content. I mentioned fragmented systems and really zero tolerance for hallucinations. You couldn't imagine someone wanting to do a preventative service, a heart surgery, transplant, or something very serious and getting information that it's covered, and then they go and have the surgery, and then they're stuck with a $100,000 bill. I mean, that happens.

We hear stories of that happening. I'm sure you guys maybe know people that have had issues with getting the wrong information. So hallucinations, obviously, as you know, in AI can happen, and that's, there's zero tolerance in the healthcare space. So getting something almost right can actually be worse than getting it wrong in our profession in healthcare.

Wellmark decided last year to do our first official hackathon ever. It was three days. You got to partner with anyone you wanted. You can submit ideas for a period of two weeks, and then you can actually request to be on a team, and then the teams were assembled for those three days on site.

It was actually an incredible experience. This was the idea that my team had to solve search for members using AI. And we actually, out of 19 teams, we actually won. We placed first place last year.

And because of that, Wellmark funded the work, which I just thought was super cool for a company to not only sponsor a hackathon for three days, but then fund the winning project. So we funded that last year, and we're getting ready to implement it next month for our members, which is just incredible. A year, yeah, it took a long time, but you can imagine the legal conversations and compliance conversations we've had to have and go back and forth with what we're actually saying on the screen. I think we've landed on AI assist with a bunch of legal language, really small.

So there's that too. But we leveraged, like I said, time-boxed, low-risk, cross-functional. So we had developers, we had analysts from different parts, we had operations folks. We had about 12 people on my team for those three days.

And then we presented to leadership and everybody else, which was really fun, right? So that rapid failure and that controlled structure, took that from JC this morning from his keynote, that was key for the hackathon, so they're getting ready to do that again this year. I'm looking at some potential teams to join, but this is a very cool way to not only get everybody together from a culture perspective, but actually deliver working stuff now that we've implemented, which I think is super awesome. So let's keep talking.

The hypothesis, right? Our hypothesis was simple. What if people could ask questions in their own words and the system met them halfway? Not a chatbot replacing humans, not a magic answer engine, but a bridge between human language and that complex benefit logic, right?

Which would be orthotics, right? But how does search know that you're talking about feet, right? And that's where AI comes in, right? That's where that language model comes in.

So it was pretty cool to see. We had a working demo for our hackathon debut, and we had a bunch of executives coming up trying to like stump it, trying to like get it to not bring back benefits. But surprisingly, it worked very well. And they were like, okay, we can see the benefit in this.

So conceptually, the architecture, legal sadly wouldn't let me put anything in here that we used. But you guys can use your imagination. We have a repository here, so that's all of our documents that we have, benefit documents, think all of that historic document. We're using AI retrieval, so semantics, vector matching.

And then ultimately, we had some Lambdas and some service layers we built to connect it to our member portal and our customer service CRM, things like that, which is great. So this is a little bit of an overview of the conceptual architecture. If you want to get involved with that after, I'm more than happy to dive into that. Let's talk experience, right?
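
To make the "semantics, vector matching" step above concrete, here is a minimal sketch of semantic retrieval. It is not Wellmark's implementation, which the talk does not show. Everything named here is an assumption for illustration: the `CONCEPTS` lexicon is a toy stand-in for a real embedding model, the `DOCUMENTS` snippets are invented rather than real plan content, and `embed`, `cosine`, and `search` are hypothetical helpers.

```python
import math
import re
from collections import Counter

# Toy stand-in for an embedding model (assumption): each known word maps to a
# concept, so different wordings of the same idea land on the same dimension.
CONCEPTS = {
    "shoe": "foot_support", "inserts": "foot_support", "insoles": "foot_support",
    "orthotics": "foot_support", "foot": "foot_support",
    "baby": "maternity", "pregnant": "maternity", "maternity": "maternity",
    "prenatal": "maternity", "postnatal": "maternity",
    "colonoscopy": "colonoscopy", "diagnostic": "diagnostic",
}

# Invented benefit snippets (assumption), keyed by a hypothetical benefit id.
DOCUMENTS = {
    "orthotics": "Orthotics: custom foot devices prescribed by a provider",
    "maternity": "Maternity: prenatal care, postnatal support, breast pumps",
    "colonoscopy": "Diagnostic colonoscopy",
}


def embed(text):
    """Turn text into a sparse concept-count vector."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(CONCEPTS[w] for w in words if w in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, top_k=1):
    """Return up to top_k (benefit_id, score) pairs with a non-zero match."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), doc_id) for doc_id, text in DOCUMENTS.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k] if score > 0]
```

In this sketch, `search("shoe inserts")` ranks the orthotics entry first even though the query never uses the word "orthotics", which is the kind of bridge between member language and benefit jargon the talk describes. A query with no known concept returns an empty list instead of a guess.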

So what we did, what we deliberately did not do was surface raw policy text. We didn't pretend that AI was certain. We didn't optimize for cleverness. We optimized for clarity, restraint, and trust, right?

We just heard all about trust. How are our members going to trust? If we get a wrong answer, and they go to the doctor and it's not covered and they used AI, they're not going to trust Wellmark.

They're not going to trust anything that we tell them. So it's super important for us to be clear and really restrain ourselves because in healthcare, that confidence without that accuracy is super dangerous for our members. We talked about the current climate. We all know kind of the story around UnitedHealthcare and that whole sad thing that happened. We are not using AI at Wellmark for any healthcare outcomes, any determinations of claims.

We are not doing any of that. This is the first, we're using it internally as a workforce. We have Copilot and all of that stuff, but we're not leveraging it for member-facing things by any means. This would be the first thing that we're leveraging AI for.

But I feel like it's a very controlled application of AI. It's not generative. It's very specific. So that's kind of where we're at with the experience lens on this.

Two audiences, two jobs, right? So I mentioned we have our members and we have our customer service agents. They don't need the same answers, right? Members need the clarity, they need the confidence, and they need that empathy.

And that forced us to think more carefully about the experience outcomes that we're trying to drive, right? So that's the self-service side of things versus the customer service side of things. But either way, they both can use the same solution, which I thought was really awesome and it was really impactful, I think, to the leadership group to hear we can actually not only solve members' issues, but we can solve speed and clarity around what our CXAs are doing. So what broke first, right?

AI struggled where humans also struggle, which is not a surprise. That's ambiguity. So conflicting sources, vague benefit terms, those edge cases. We saw early tendencies toward overconfidence, which reinforced the need for those guardrails.
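
One way to picture a guardrail against the overconfidence described above is a rule that refuses to give a firm answer when retrieved sources are weak or disagree. This is a sketch under stated assumptions, not the team's actual logic: the `Passage` fields, the `MIN_SCORE` threshold, and the `answer` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str    # hypothetical document identifier
    covered: bool  # what this passage says about coverage
    score: float   # retrieval similarity in [0, 1]


MIN_SCORE = 0.75  # hypothetical confidence floor; a real value would be tuned


def answer(passages):
    """Give a firm status only when all strong passages agree."""
    strong = [p for p in passages if p.score >= MIN_SCORE]
    if not strong:
        # Nothing matched well enough, so say nothing rather than guess.
        return {"status": "no_answer", "sources": []}
    sources = [p.source for p in strong]
    if len({p.covered for p in strong}) > 1:
        # Strong sources conflict, so surface the ambiguity with its sources.
        return {"status": "it_depends", "sources": sources}
    status = "covered" if strong[0].covered else "not_covered"
    return {"status": status, "sources": sources}
```

The design choice here is that ambiguity becomes an explicit "it depends" result that carries its sources, instead of the system picking one side with false confidence.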

We talk about enterprise-wide initiatives a lot in this space as well. This is an enterprise-wide initiative. We had to get the buy-in from the stakeholders. A lot of different departments are siloed, but we're working to come together and say, like, how can we all leverage the same tool so we can get confidence in deploying this to our members and have a better experience when it comes to finding their benefits?

So what surprised us? It was how much less AI actually needed to do. Smaller, well-scoped answers actually built more trust with our testers. Retrieval beat generation, and transparency built confidence, even when the answer was "it depends," right?

So it was really about making sure that we keep that level of trust when we're having interactions with our members and really restraining the scope around it versus just letting it go. It was really important as well. So I have a couple of minutes left here, and I want to make sure I allow time for questions. From an experience lesson: this isn't an experience on its own, it's part of one.

It can reduce cognitive load, but not remove complexity. It can assist, but it still needs humans in the loop, right? So that's what we're looking at here. We just were awarded some Corporate Insights Awards, which is kind of an industry recognition for best mobile app and desktop app.

We finished second on that. There were 24 health insurance carriers rated, and we were #2, right behind Anthem, which has a massive budget compared to what Wellmark has in terms of experience. So I think we're doing some things right in this space, and this just kind of shows that some of the recognition is coming our way. From a technical perspective: bounded domains, clear governance, strong content. Responsible AI is an experience decision, not just a platform decision.

A few organizational lessons we learned. This changed kind of how we collaborate as teams. Experience, content, operations, technology can't operate independently anymore, right? AI forces alignment or exposes the lack thereof.

So our hackathon part 2, like I said, is coming up next. But this could scale beyond healthcare, right? We're talking other domains, insurance, government, higher ed. These challenges show up everywhere.

It's just about allowing people to ask the right questions in the way that they want to and providing the answer to them. Now for the AI reality check, though: what this is not. It's not a replacement for humans, not a knowledge oracle, and of course, not without guardrails. I'll leave this last quote up here before I end.

This was from one of our team leaders that was part of our testing group, and she, I'm not going to read the quote, but basically she handles our customer service team, and she just really saw the value of having something like this for her CXAs to use to help members. She thought this could be magnified by 200x agents if we implemented this at Wellmark. So it's a very promising piece of technology, not only for internal operations, but for our customers that, you know, use Wellmark Insurance. And with that, I know I ran through that a little fast, but we are a little short on time, but I wanted to open up for a question or two, if you guys have any.

I haven't been as close to the implementation team, sadly. I've been, I was part of the hackathon team, but we have another AI-focused team that's doing the implementation at Wellmark, but more than happy to find out. Is there a bar, like a percentage of success or failure? No, I'm not even sure, to be honest with you, but I'd be more than happy to find out for you.

Yeah. One of the things I mentioned is like you've got to go in and treat your data. So you find out that on your documentation there were a lot of contradictions. Yep, we had to, we spent a lot of time and effort over the last year cleaning up our benefits document catalog and really optimizing that.

So it, like everyone says, you got to have good data to get good output. So we did some of that legwork. It was in a pretty good spot before. It just was a lot, very complex and hard to understand for most people.

It was highly frustrating. Yeah, so that's a great call out. Our design team has put in some things. So if you're searching and you're just not getting, I think it's like 2 searches.

If it doesn't come back, there's of course some chatbot help or send a secure message or call customer service. So we're not going to eliminate that, but we hope that self-serve. You know, that's always the way, I think, is members want to self-serve if they can. It's just making it easy for them to do that.
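
The escape hatch described in this answer can be sketched as a small counter on the search session. This is an illustration only: the limit of 2 comes from the speaker's own "I think it's like 2 searches," while the `SearchSession` class, the option names, and the choice to count consecutive failures and reset on success are assumptions.

```python
from dataclasses import dataclass

# Assumption: the speaker recalled the limit as "like 2 searches".
MAX_FAILED_SEARCHES = 2

# Hypothetical labels for the three human-help paths named in the answer.
ESCALATION_OPTIONS = ["chatbot", "secure_message", "call_customer_service"]


@dataclass
class SearchSession:
    failed_searches: int = 0

    def record(self, results):
        """Track one search; return help options once the failure limit is hit."""
        if results:
            # A useful result resets the counter (assumed behavior).
            self.failed_searches = 0
            return []
        self.failed_searches += 1
        if self.failed_searches >= MAX_FAILED_SEARCHES:
            return list(ESCALATION_OPTIONS)
        return []
```

The first empty search returns nothing extra, and the second returns the three help options, so self-service stays the default path while a route to a person is always within reach.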

Just learning yours get you to a person, but that works in other ones. What was that? Yeah, that's true. That works for CVS.

Oh, good to know. They're our pharmacy partners. Cool. All right, folks, I think that puts us pretty much at time.

So everybody, please join me in thanking Nick for his wonderful presentation. That brings us pretty close to the end of the day, folks. There should be some closing comments and some stuff from Robert Ivester of NIST back in the main room, and then followed by our final closing comments from Paul. Thank you for being here.