Close the GenAI “Learning Gap”: Self‑Improving AI Without Fine‑Tuning
AI Demo
The MIT State of AI report surfaced a brutal truth: most GenAI systems do not retain feedback, adapt to context, or improve over time. While frontier models get better with every release, enterprises rarely gain a durable advantage, because their systems don’t actually learn.
The default answer is fine‑tuning. In practice, it’s often expensive, brittle, slow to iterate, and tightly coupled to a specific model version. Worse, it can lock teams out of rapidly improving frontier models.
This session presents an alternative: learning‑loop architectures that allow enterprise GenAI systems to improve continuously, without fine‑tuning, while remaining flexible enough to adopt new models as they emerge.
You’ll see how feedback from real usage can be captured, measured, and reintegrated safely into production systems. We’ll demonstrate how observability, evaluation, and automated optimization work together to turn GenAI from a static capability into a learning system.
We’ll explore:
- Automated Prompt Optimization: enabling systems to evolve their own instructions using Genetic‑Pareto (GEPA) techniques based on measurable feedback
- Observability‑Driven Learning: detecting failure patterns and routing targeted corrections back into the system
- Trust & Auditability: fitting learning loops into existing governance, compliance, and risk frameworks rather than fighting them
If your GenAI initiative is stuck in pilot, or producing inconsistent or stagnant results, this session shows the missing half: the learning loop that makes improvement routine instead of exceptional.
Key Takeaways
- Understand the Learning Gap: Why MIT identified learning as the core barrier to scaling GenAI, and what enterprises can do about it
- The Learning‑Loop Pattern: Hands‑on exposure to GEPA techniques that work across LLM providers
- Self‑Improving Demo: See a small GenAI system measurably improve from user feedback during use, with no fine‑tuning required
Transcript from Summit:
Session Transcript
Thank you. Appreciate it. So what we're talking about today is something that Ben and I were really excited about when we first saw this paper come out from Stanford. And then, while we were developing the talk, a super simple library came out that allows you to do what we're talking about. It's really exciting stuff. We're happy that you're here. We're happy that you weren't scared away by fine-tuning, because it is kind of a technical topic, and what we're talking about is more approachable than that. Source Allies, where Ben and I are from, is an IT consultancy based in Des Moines, and we specialize in data and AI. We've been around for almost 25 years now, and we are proud to be builders who teach. We do a lot of building of GenAI systems, and we've run into this roadblock that we're going to talk about today, and how to get past it. And as we're working, we really love teaching and learning, leveling up together with whoever we're working with. Back in 2002, I founded Source Allies, and I was one of several teammates who started our data and AI practice almost eight years ago now. And way back when I was in college, my dad led a neural networks research group, and I was what the keynote speaker today called one of the doubters. I did not believe that anything he was doing was actually possible. But here I am today. Ben?
Yeah, my name is Ben McCone. I am an open source contributor and a staff consultant with Source Allies. I've actually contributed to a lot of the tools that we'll show you today. I have taken multiple clients from idea to production with scalable single-agent and multi-agent systems over the years, all of the buzzwords. I've seen that progression over the last three or four years that we've been doing AI. Really excited to be giving this talk to everyone today. Back to you, Matt.
Thanks, Ben.
I was actually in San Francisco with Ben at a conference where he was speaking, along with the leaders of these open source AI framework groups. If you do AI development, you're familiar with these: LangChain, DSPy. Everybody was super excited to see Ben. It was really weird. We show up from Iowa, and it's like, Ben, thank you for helping us. So there's some credibility up here. So what we're talking about today is this, a report that came out from MIT late last year, talking about the state of AI in business. It talked about this big problem that everybody's running into, this big roadblock: the AI applications, the GenAI applications that we're building, aren't learning. You interact with it, you tell it, no, that's a little bit wrong, this is how we work, or you want to guide it a little bit more before you expose it to a larger audience. And it's just not learning from how you actually work within your organization. Maybe it's because the things that you want to capture really aren't written down anywhere, but they come out through the interaction of a large group working together with the AI. This is what captures that. And it's actually what I'm hearing a lot of people being concerned about at the conference. You have people who have been at the business for a long time and have a lot of knowledge, and then people who are new to the business and don't have that knowledge. And what's going to happen when all of that talent eventually does retire? There's a lot of promise in this approach as a way to capture the knowledge and strategies and principles that are really unspoken. So we're going to talk about that. We're going to talk about the normal go-to. The normal go-to, when you're using large language models and they don't fit for you, is to do some fine-tuning. There are some pitfalls to that, and we're going to talk about what to do instead. And we welcome questions throughout.
So just feel free to raise your hand, and we'll also be around for questions afterwards. So most GenAI projects fail due to lack of trust, not capability. JC was talking about how there's average intelligence, there's great human intelligence, and then there's the best AI intelligence. So we don't need more intelligent models to do a lot of the use cases that we're trying to do right now. But there are some impediments. One of them is trust. They say AI moves at the speed of trust. Does anybody want to offer what is being done within your organization to get past that, to start to build trust with AI and actually have it be something that is more usable? Yes.
[Audience response inaudible.]
Great, yep, sharing what's worked, having that be kind of a teaching tool so people know that there are some successes out there and where it doesn't fit. Anybody else? Yes.
Read-only access for AI, so that if they go to query a database, the user can see what to do with that data, interpret it, create visuals for that data, without risk of any harm to the data.
Yeah, I like that. People love that. Read-only access that prevents the AI from going off the rails and deleting databases, but still seeing what the capabilities are while you're building trust. Awesome. How about one last one? Yes.
Software development, and using it as a tool. The developers know that you can't just take it verbatim, so I guess we're more at a cautionary stage, where a lot of the things it recommends are not correct. So you can't just accept the answer; you have to know it yourself, and then trust it more as it gets better.
Great. Awesome, so not necessarily trusting the output, and building up to using it as a tool, but you can't just trust it. No blind trust, yes, great. There's one other thing that is coming out of industry, and it's this idea that we need some way to leverage the experts in our company and figure out how the AI measures up in comparison to how an expert would do a particular task.
And that measurement capability is actually the foundation upon which what we're talking about today is built. So, measuring how the AI is performing: in industry terms, it's called LLM evaluations. So here are some industry quotes that say, really, if you're not investing in evals, you're not really shipping, you're guessing. A lot of pilots now, when you're just trying to prove something out, are coming out of the gate with some sort of measurement ability, and we'll talk more about that. So that's sort of the foundation. Then the MIT State of AI in Business report says, here's the big problem, the big barrier that everybody is running into. And again, it's not model quality. It's that these AI systems that we're interacting with aren't learning. So if you have a team or group or division or department using AI, and we're working with it and teaching it, like, here's how we do some things, oh, this isn't really documented, but this is how we like to work, none of that is captured. So it literally is Groundhog Day all over again, every day. The AI is just going to keep giving you the same answer. And that's the big problem. So what big tech does to solve for that is feedback loops. You have the thumbs up, thumbs down. We've all seen these. Thumbs down, then you can give feedback. But who is that helping? It's not helping your team or your company. It's helping big tech make a better model. It's improving things for everybody, but not necessarily for your particular group. MIT talks about how AI is really great for individual brainstorming but stalls out in enterprise settings, because these systems lack iterative learning. And I actually heard a great quote at lunch. Someone was saying, I'm not interested in AI that is just for an individual. I want to be thinking about AI that is for multiple people, that's for teams. And that's where a lot of companies are right now.
They want to break out from individual use. So MIT gave us this big problem, but they actually didn't give us an answer. They just said, it's a problem. And we think the solution is that you need some sort of loop, a learning loop, but they didn't tell us how. And that's what we're sharing today. Fine-tuning, that's the typical go-to. Fine-tuning requires thousands, sometimes hundreds of thousands, of examples. When you do fine-tuning, you are changing the weights in the model. These big giant smart models are actually nothing more than a CSV file. So you're changing the numbers in the CSV when you do that. You have higher standards when it comes to governance and how the model is governed within your organization. It's really difficult to fine-tune the models that we like using. We like using Anthropic and OpenAI. But fine-tuning is something that typically happens with open-weight models. And then finally, when you go to fine-tuning, which really changes the behavior of the model so it can adapt to your domain, it's a whole other tier of expenses. You're basically getting into MLOps, AIOps. It's a lot of expense to do that instead of paying pennies per interaction. Okay, so those are some of the downfalls of fine-tuning. Another interesting thing as we get into the solution is that LLMs are very good at doing what you tell them, and a lot of the failures that we encounter come from not being specific enough, not removing ambiguity. So most prompt failures are actually knowledge gaps, where there are principles or strategies that we just didn't know how to express. And then the AI gives us a result, and we're like, well, that's kind of stupid. But the key point here is this. We all know what a prompt is, of course. Probably a lot of people know that there's also something called a system prompt or a system message.
It's like a prompt behind the scenes that we never see, that says, you know, be kind to the person you're working with, and here are some ethical guidelines, and here are some rules to follow. So the key point, though, is that prompt changes, changing that wording just a little bit, can make massive differences in how the model performs. And this used to be prompt engineering. People made fun of it. It was going to be a $200,000 career, prompt engineering. But what we're showing today is a mathematical, a scientific way to change the words that are in every single application that we're using. Even if you're just using a team copilot, there's a way to give it a prompt that helps it work with your team better and recognize how your business is working. There's a way to mathematically, scientifically, methodically change the prompt to get better outcomes for you. This is a guy who talks a lot about AI on the internet, and he says LLMs just don't come with instructions in the box. So that's kind of the thing. We're all figuring out what the right instructions are. There was a talk early this morning that was sharing what a model is and what agents are and what skills are and what tools are. So there are all of these places where you give instructions to the model. And again, we're figuring that out together. So, some of the go-tos that we have. Fine-tuning: okay, it's expensive. Here's the downfall. RAG: that's where you pull in all of your documents. Maybe you have existing SOPs. That works great. We love RAG. But it doesn't help you bring out the reasoning and the strategy that your team is using when you're doing work within your organization. Memory is really good, but we don't want memorization of just things that have been done in the past. We want to extract lessons from that. So all of these things that we want just didn't exist until this method that we're going to show you.
If anybody's working on skills, a skill is just a text file that helps you guide your AI to do a particular thing. And that actually isn't scaling very well. A lot of people are trying to figure out how you scale that beyond just one person. How do you scale that to multiple people, a team? So those are the problems that we're trying to answer. That's what we are going to answer today. And this is the "what if we could?" What if we had a way to do all of this continuous improvement, and to have it be fully auditable? Because instead of changing model weights, these numbers in a giant CSV file, we're changing the instructions. And you can see, oh, the model's performing better because of these words. An auditor can look at that and say, I understand what that's doing. There are all of these great benefits. And they just so happen to get you similar, sometimes even better, performance gains when compared to fine-tuning, the really expensive thing. I'm not going to hold you in suspense further. This is the library that came out that really does this magic thing, that codifies what the author at Stanford released. It allows you to take text, take some feedback on why that text was good or bad, and make better text. It works on text. So it's super simple stuff. It can actually be done with a spreadsheet. Take some really bad interactions with your AI, talk to some subject matter experts, and say, "Why was it bad? What would an expert have said?" and then run it through this. And what's interesting is that you're not getting hard-coded answers embedded into this text. You're getting higher-level principles and strategies extracted for you that can apply to your entire domain, to examples not yet seen. OK, so we're getting to an example, and we're going to do a chat thing. Yeah, AI is not all about chat, but it's an example that we all recognize. So we ask a question, we get an answer, we give a thumbs up or thumbs down.
Someone's asking about photosynthesis, and we say it was a thumbs down because you didn't talk about light absorption details. So, to be more precise, photosynthesis is a two-step blah, blah, blah. So we have a feedback loop where the humans, or maybe even smarter, super expensive models, are saying, here's a better answer. We're applying that to food. We have lots of insurance and retail and energy and defense and med tech examples, but we're applying it to food. There's a rating system called Michelin stars. If you're traveling, these are like, hey, these are some useful places to go for an interesting experience. So it's fancy food, and that's what we want to do. We have an example; it's running in code. You can stop by our booth and we'll show you the code if you want to see the details of it. But we're saying, OK, how would you make an omelet, and then how would a Michelin fancy chef make an omelet? So it's that difference between how anybody would do it and how an expert would do it. And what are we doing behind the scenes? We're asking Michelin-trained chefs, what is the really great answer for how you make an omelet? What are all the things that you need to consider? Temperature, seasoning, all the things. And we use AI to say, okay, and this is a simplified example, how many of the bullet points or statements made by the expert chef were represented in the answer that the AI gave you? Maybe one out of five. So then we tell the AI why it was wrong. And we do that for 30, 40, 50 cycles, run it through the system, and give it a new system prompt. And you get the same kind of performance gains that you get from the really super expensive reinforcement learning. Okay, that was a lot of detail. Food. We're talking about food. I'm going to hand it over to Ben.
Thank you. So as Matt had alluded to, we are talking food.
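The scoring step described above, counting how many of the expert's points show up in the AI's answer and then explaining what was missed, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk uses an AI judge for the comparison, while this sketch substitutes simple keyword matching so it runs on its own, and the names `Evaluation`, `evaluate_answer`, and the omelet rubric are all hypothetical, not the experts' actual rubric or any library's API.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    score: float   # fraction of expert points covered, 0.0 to 1.0
    feedback: str  # natural-language explanation handed to the optimizer


def evaluate_answer(answer: str, expert_points: dict[str, list[str]]) -> Evaluation:
    """Score an answer by how many expert points it mentions.

    expert_points maps a point's description to keywords that signal it.
    Keyword matching is a stand-in (assumption) for the LLM judge in the talk.
    """
    text = answer.lower()
    covered, missed = [], []
    for point, keywords in expert_points.items():
        hit = any(keyword.lower() in text for keyword in keywords)
        (covered if hit else missed).append(point)
    score = len(covered) / len(expert_points) if expert_points else 0.0
    feedback = (
        f"Covered: {', '.join(covered) or 'nothing'}. "
        f"Missed: {', '.join(missed) or 'nothing'}."
    )
    return Evaluation(score, feedback)


# Hypothetical rubric, invented for illustration only.
omelet_points = {
    "low, controlled heat": ["low heat", "gentle heat"],
    "butter rather than oil": ["butter"],
    "season the eggs": ["salt", "season"],
    "constant stirring for small curds": ["stir"],
    "take off the heat while still soft": ["still soft", "slightly runny"],
}

result = evaluate_answer(
    "Beat the eggs and fry them in butter until firm.", omelet_points
)
print(result.score)     # one of five points covered
print(result.feedback)  # names what was covered and what was missed
```

The design point is that the function returns both a number and a sentence. The number lets you compare prompts; the sentence is what tells the optimizer why an answer fell short.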
And one of the challenges that we face in organizations is that sometimes we don't know the question that we should be asking. So in this example, the question isn't really great. It's "How do I make a roast?" That's leaving out a lot of variables. Are there any home chefs in the room? So you may say this is a bad question because I don't know how big the roast is. I don't know what type of meat it is. I don't know what technique you're using. All of these lead to a very generic answer: hey, let's flip every 30 to 45 minutes. Not super helpful. And so we give a thumbs down, and we explain that flipping every 30 to 45 minutes is just actively bad advice. We need to provide better context. And 20 to 30 minutes per pound, well, that's a blunt instrument. We don't actually know without more of those details. So we give that feedback, and we let our subject matter experts actually provide that ideal answer. We start with something very simple. That bad answer had something like this in the prompt: hey, answer questions accurately, use any context that you have. But it doesn't even know that it's supposed to be acting like a Michelin-star chef. When we are done with the process, the prompt goes on and on. If you want to read the full ending prompt, feel free to scan that. But the interesting thing, as Matt had alluded to, is that nowhere in the final result does the word "roast" even appear. Instead, we're talking about the scientific qualities of the answer. We're talking about the Maillard reaction, the browning that you get on your meat when you're cooking it. It's talking about collagen structures, and what oils to use in what cases. Said simply, it learned the principles, not the answers to the questions that we are asking. And we can see a very strong result. On the left here: how do you make a roast? This is the original. Same bad answer.
But with that updated prompt, it starts out by telling you that you need to choose the right cut, asking you immediately: beef, pork, or lamb? It tells you how to prepare the meat and how to season it.
I was just going to add one thing. By the way, it's been proven that if you say "act like a super experienced data scientist" or "act like a super experienced Michelin-star chef," that doesn't work. That kind of prompting does not work. It did work a few years ago, but the models have grown since then.
And why is this a problem? Why can't we just have somebody on our team sit down and write it? Well, JC had mentioned in the keynote that it is somebody's job at Anthropic to write the system prompt, or the soul document. These documents have been leaked. Anthropic does also release these themselves after some delay. They're about 24,000 to 25,000 words, and many, many lines long. I have never been part of a team that can actually justify spending that much time on one document every single cycle. So that is where GEPA, the parent library of the Optimize Anything library that Matt had introduced, comes in. This is the result of a research paper out of Stanford saying that reflective prompt evolution can outperform reinforcement learning. Essentially, give the AI a signal, and it's a better prompt engineer than you or I. And what it does is help us extract the principles, practices, strategies, and techniques, specifically not rote memorization of what the answer should have been. This is important because, yeah, if we just say, when asked how to cook a roast, respond with this, it will always be correct. But it does not generalize to every other recipe that I may want to attempt. So how might we leverage this under the hood? How does Optimize Anything really work? Well, it's doing something very similar to this. I went into ChatGPT and typed this question. It was very helpful and gave me the right formatting on the output. But essentially, we need to collect those weird examples.
Where does it fail? This is that thumbs down, by the way, from our subject matter experts. And then we request improvement. We go in and we tell our AI: I used this system prompt and this input, the user's question, and it gave me this weird answer. And it was weird for these reasons. And then it gives us a new prompt. And we can try again with our subject matter experts. Now, this doesn't scale very well, but it's a really good starting point. So what do we do if that doesn't scale? We can use programming, development, to automate this process. We can use that feedback and let the system actually write its own feedback and say, it's correct for these reasons. It included the right oil, it included the information about the Maillard reaction, but it missed information about the internal temperature. The difference between traditional optimization in machine learning, also known as gradient descent, and GEPA is that the only signal a traditional system will get is that number, 0 to 1. That's not a whole lot of understanding of where we went right and where we went wrong. So that natural language feedback is invaluable. Let's take a look at a little bit of a visual here. In machine learning, in large language models, this is what the inside of the brain kind of looks like conceptually. We have all of these hills. These are all of the expert topics. And that red ball there, that is what a traditional process will do. It starts at some point on the map, and it starts trying to climb the hills around it, getting to the highest point on that plane. The problem is that it climbs not quite to the highest hill, but to something that is, oh, middle of the road. But because of the way that technology works, it gets stuck.
Meanwhile, GEPA, the blue ball, or the green ball, sorry, is actually jumping around because it's given human feedback, and it's able to see, oh, I need to jump over here, and here, and eventually it finds that peak much faster, and with hundreds, not thousands, of examples. You can actually run this with as few as 10 examples, but in our example, I think we had 200 question-and-answer pairs from experts. So the point that I have for everyone is, we're not optimizing prompts. We're optimizing text. Where else might we see text in our AI applications? Any examples? Earlier, we had talked about skills. There are also MCP servers and tool routing logic. All of these are possible. The two that this demo focuses on are the system prompt and what is called an LLM-as-a-judge. This is what powers the evaluations that Matt had talked about. This is a stand-in for our users, so that we can test tens to hundreds of different prompts along the way without driving our subject matter experts up a wall. So we are almost through all of the math-heavy material, I promise. But this was too cool not to show, so I wanted to bring this to light. This is why GEPA is so effective compared to traditional methods. That 0 circle at the top, that is where we start. That's the initial "How do I make a roast?" prompt. It didn't do very well. And it tried five different prompts to improve. And number 5 was the winner of that generation. In a traditional world, we would have thrown out one through four. But GEPA allows us to continuously explore that space, eventually landing all the way over here on the left on child 12, a descendant of number 1, even though number 1 performed quite a bit worse than number 5. It allows us to more effectively explore the expertise of our language models.
The cool thing about Optimize Anything is that the library now spits out a graph like this, and you can hover over each one of those nodes and see how the prompt has evolved.
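The idea of keeping candidates that a single "winner takes all" score would throw away can be illustrated with Pareto selection: a prompt survives if no other prompt is at least as good on every evaluation example and strictly better on one. The sketch below is a simplified illustration of that idea, not GEPA's actual implementation; the candidate names and per-example scores are made up for the example.

```python
def dominates(a: list[float], b: list[float]) -> bool:
    """True if score vector a is at least as good as b everywhere
    and strictly better on at least one example."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: dict[str, list[float]]) -> list[str]:
    """Return the candidates that no other candidate dominates."""
    return [
        name
        for name, vector in scores.items()
        if not any(
            dominates(other_vector, vector)
            for other_name, other_vector in scores.items()
            if other_name != name
        )
    ]


# Hypothetical per-example scores (one number per evaluation question).
scores = {
    "candidate_0": [0.2, 0.2, 0.2],  # the starting prompt
    "candidate_1": [0.3, 0.9, 0.2],  # low average, but best on question 2
    "candidate_5": [0.8, 0.6, 0.7],  # best average in this generation
    "candidate_3": [0.5, 0.5, 0.5],
}

front = pareto_front(scores)
print(front)  # candidates worth evolving further
```

Here `candidate_1` stays alive even though its average is lower than `candidate_5`'s, because it has learned something about the second question that no other candidate has. Keeping it around is what lets a later descendant combine that strength with others.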
So hover over node zero, and it's "you're a helpful agent, try not to be rude." And then hover over node 12, and it has all the stuff in there about how you need to pay attention to flavor and how you lock it in, and the chemistry of cooking.
Yeah, that's a great callout, Matt. So really, the difference, said simply, is that the old way is "this version feels better, we got a higher number." The new way says, we know that it performs very well in these categories for our evaluations. We are no longer asking, is it good enough? We can now confidently say, this prompt sits on, how do we say it, the efficient frontier of our evaluations. Said simply, it aligns with our subject matter experts. So what actually changes when we do this? We're changing the instructions and logic. Matt had mentioned this is an auditor's dream. No longer are we looking at why this is a 0.5 instead of a 0.4. We're looking at "make sure to remember the Maillard reaction" and "use avocado oil in these scenarios, olive oil in these scenarios, and sunflower oil here." We aren't locked into specific models. We can continue to use the Claudes and the GPTs of the world, but we still have the flexibility to use those open-weight models if we choose. And because it's just instructions, undoing these changes takes minutes, not days or weeks. Before, our prompt looked something like this, and on average it was getting about 67% of the answer correct. And if we looked at strict accuracy, meaning it hit every bullet point that an expert cared about, we only got 35% of the questions correct. Afterwards, we saw task-specific information. It very clearly described the inputs. It described what output it's expecting: lean into food science. It added the domain-specific knowledge, the food chemistry. And most importantly, it defined strategies: understanding the difference between different cooking methods and the trade-offs. And the results speak for themselves.
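The two metrics mentioned here, average score and strict accuracy, are easy to confuse, so a small sketch may help. It assumes each question already has a score between 0 and 1 (the fraction of expert bullet points the answer covered); the function name `summarize` and the sample scores are hypothetical and are not the numbers from the demo.

```python
def summarize(scores: list[float]) -> tuple[float, float]:
    """Return (average score, strict accuracy) for per-question scores.

    Average score: mean fraction of expert points covered.
    Strict accuracy: share of questions where every expert point was hit.
    """
    if not scores:
        raise ValueError("need at least one score")
    average = sum(scores) / len(scores)
    strict = sum(1 for s in scores if s >= 1.0) / len(scores)
    return average, strict


# Illustrative scores for five questions (made up for this example).
per_question = [1.0, 0.5, 0.25, 1.0, 0.75]
average, strict = summarize(per_question)
print(f"average score: {average:.0%}, strict accuracy: {strict:.0%}")
```

An answer that covers four of five expert points lifts the average but counts as a miss for strict accuracy, which is why the strict number in the talk sits well below the average.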
We didn't change the data available to the system. We only changed the instructions. And we ended up with a 9.7% improvement in the average score and an 8.8% improvement in strict accuracy. Again, this is just from listening to our subject matter experts. So, a couple of other examples from industry at scale. This is an example from Shopify. Shopify runs one of the largest e-commerce platforms in the world, probably second only to Amazon. And they were running a very expensive system, analyzing every storefront. They were spending millions of dollars a year, and they covered 13% of stores. Not very great. They used GEPA to actually train a smaller model to be more effective, and now they can cover 100% of shops 75 times cheaper. So they got over five times the coverage, and they spent seventy-five times less. Another example is from Dropbox. If you don't know, when you search for a file in Dropbox, your search and the files are actually going to AI, which is asked, hey, does this file and description match this search term? And Dropbox used GEPA to, again, use a smaller model and to lower their adaptation time to changing needs from their users from weeks to days. All just from listening to feedback. This one is a bit more local. I'm currently working with a Fortune 500 client, and they're having AI write queries against their data lake, think Databricks or Power BI. When we started out, the AI knew nothing about that environment, and it was only scoring 58%. In under an hour and with less than $5 worth of AI usage, we got all the way up to 89%. Again, all just leveraging existing knowledge from the team, saying yes or no. So again, let's go through what changed. We went domain-specific rather than general. We kept everything task-specific. We learned strategies and prescribed what output we actually cared about. All auditable; our data governance folks love it. Yes. So let's flip the script.
Instead of giving the thumbs up and thumbs down information to ChatGPT, to Claude, to Copilot, let's bring that back internally and improve our own products, creating a competitive advantage instead of just rising with the tide. I'll leave you all with an architecture overview. This outlines what we've been talking about today. That chat has a thumbs up and thumbs down. Our users can provide feedback. And then all of that goes into this optimization pipeline. We store that in an open source tool called Phoenix, so that the data never leaves our customers' environment. And then that is used to train up a user judge, and the user judge, along with the user feedback, allows us to optimize and say, yes or no, we are actually improving. This allows us to run on a weekly or monthly basis, depending on the amount of feedback we've gotten, and continuously improve with how the team uses the tool. I'll pass it back over to Matt to close this out, and then we'll be ready for some questions.
So yeah, this is the second to last slide. This is if you want to use this tomorrow. Here's what you can do. Yes, you need access to a developer who will pull down that Optimize Anything library, but all you need to give them is a spreadsheet. So, a spreadsheet is 10 to 30 interactions with an AI system. And then you sit that down in front of the expert and you say, where did this go wrong? And they write down, okay, here's where it went wrong. This was completely off. This was actually a good part of the answer. Then you get the expert feedback in there. You feed it to Optimize Anything. You spit out the text. That's all it is, text. It doesn't matter how you're building your GenAI app. No matter what you're using, from Office Copilot to custom GenAI, there's some place where you can drop in this text and get big improvements, again, without the expense of fine-tuning. So we're obviously super excited about it.
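The "spreadsheet in, better text out" workflow starts with turning rows of expert feedback into a request for an improved prompt. The sketch below shows only that first step, under stated assumptions: the column names (`question`, `answer`, `expert_feedback`), the function names, and the wording of the request are all hypothetical, and it deliberately does not call the Optimize Anything library or any model API. It only assembles the text you would hand to one.

```python
import csv
import io

# A spreadsheet exported as CSV. Rows paraphrase the roast example from the talk.
SPREADSHEET = """question,answer,expert_feedback
How do I make a roast?,Flip it every 30 to 45 minutes.,"Flipping that often is actively bad advice. Ask about cut, size, and technique first."
How do I make a roast?,Cook it 20 to 30 minutes per pound.,"Minutes per pound is a blunt instrument without knowing the meat and the method."
"""


def load_interactions(csv_text: str) -> list[dict[str, str]]:
    """Parse spreadsheet rows into dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_reflection_request(system_prompt: str, rows: list[dict[str, str]]) -> str:
    """Assemble the text that asks a model to propose a better system prompt."""
    parts = ["I used this system prompt:", system_prompt, ""]
    parts.append("Experts rejected these answers:")
    for number, row in enumerate(rows, start=1):
        parts += [
            f"Example {number}",
            f"Question: {row['question']}",
            f"Answer: {row['answer']}",
            f"Expert feedback: {row['expert_feedback']}",
            "",
        ]
    parts.append(
        "Rewrite the system prompt so it captures the general principles "
        "behind this feedback. Do not hard-code answers to these questions."
    )
    return "\n".join(parts)


rows = load_interactions(SPREADSHEET)
request = build_reflection_request(
    "Answer questions accurately. Use any context that you have.", rows
)
print(request)
```

The last instruction in the request mirrors the point made in the talk: the goal is to extract principles and strategies, not to memorize the right answer to each rejected question.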
Hopefully some of that enthusiasm rubs off. And let's see, we have one other thing: there are resources that we have in QR codes at our booth in the sponsor area, and Ben and I will be over there for deep dives with anybody who wants to dig into code or more specifics. But we'd love to hear what questions you have or where we can provide some clarity. Yes, Adam.
Just curious to understand how this compares to the Andrej Karpathy approach to self-improvement, and if that's played into this model or this way of approaching improvement.
Yeah, the question, for the recording. Sorry everyone, that mic does not go through to the recording, so I'll be repeating. The question was, how does this compare to the Andrej Karpathy auto research that was unveiled, what was it, maybe a month ago? They're very similar in concept. GEPA is a year, a year and a half old at this point. Where auto research is really focused on the architecture of training models, this is very much optimizing text. They can be one and the same, because again, the code to train a model is also text. So I would say that both are feasible and show the same promise.
Yeah, and one of the QR codes is an Andrej Karpathy post, because all things lead back to Andrej Karpathy, yes. Who has the next question? Adam, again.
Why did you decide on Phoenix, and what's the significance behind the Phoenix portion of your database?
Observability in AI is really important. So you need some ability to log the traces, or the interactions, the turns between the user and the AI. And that's basically what we're doing with that product, and it's one of the open source products that Ben supports and helps with. But basically what we do from there is click through a bunch of things and say, well, these are really great examples, or these are really bad examples.
And we do the classic data science thing. We turn it into a data set. We tag 80% of it as training data and hold out 20% as validation data. Then we point Optimize Anything at that data set and have it give us a new prompt. We set the new prompt in Phoenix, since Phoenix also stores prompts, and then our app just automatically pulls in the new prompt text. That was one way of explaining it.
Yeah, I'll go ahead and echo what you said, Matt. Absolutely, it all holds true. A slightly different reason why we chose Phoenix is that it is open source. We don't have to worry about the data residency problem. If we have clients that are all on-prem or all in their own cloud, it makes it really easy for us to adhere to those desires of keeping everything private and not sending our very valuable LLM interactions, which are really company data, to a third party. We're able to stay in control of that. On top of that, the feature set of Phoenix meshes very, very well with a system like Optimize Anything. Anyone other than Adam?
I guess I could ask what I've been asking pretty much everybody. I teach engineering transfer courses for students that are going on to a four-year program from DMACC, and he also teaches at DMACC. So most of what I'm interested in here today is this: what are the main tools that you would say are important for our students to understand before they go out into the workforce two, three, four years from now?
Yeah, that's an interesting question, and I think it is unfortunately a little bit dependent on what path they choose to take. I would give different advice to somebody looking to become a data scientist than to somebody becoming a data engineer, and different advice again to a software engineer.
So I think the general advice that I would give to anyone going from a two-year to a four-year degree, like DMACC to Iowa State as an example, would be: remain curious, learn to learn, and don't get too hung up on one specific tool set. We've seen in the age of AI that things change so frequently that if we spend too much time making sure that this one tool set is perfect, we run the risk of it being out of date by the time they're out of school. We saw this back in the early 2010s with Hadoop clusters. They were all the rage. Everyone had one. You needed to get into Hadoop. And now I haven't worked with anyone that has a Hadoop cluster in a few years. So that is where I would be leaning: learn how to learn. Don't be loyal to any one tool. Understand that judgment point. Matt, do you have anything to add there?
No, great answer.
Hey, if I'm building a domain-specific AI tool for AEC workflows using something like the Revit API, where would you start with the self-improvement loop: with feedback on incorrect element detection, missed clashes, or something else?
I'm sorry, I'm not familiar. What is an AEC environment there?
Sorry. Like the construction industry.
Matt, do you have any thoughts there?
The incredible thing about this is that it really is the expert feedback, saying, okay, here's why a clash was missed. Here's what I know from my 30 years of experience. And doing that 30, 40, 100, 200 times, or just setting up this loop so that every two weeks it pulls in anybody who gave any feedback and updates the prompt to make it better. So it doesn't matter what the strategy is; you're extracting those out. So it really becomes a domain-agnostic way to make domain-specific AI.
Folks, I do think that puts us right on time, so go ahead and join me in thanking Matt. Thank you.