This article is contributed. See the original author and article here.
The Sales Qualification Agent (SQA) in Dynamics 365 Sales introduces a new class of autonomous sales AI, one that does far more than assist with drafting or summarization. SQA performs multi-step reasoning, conducts live web research, generates personalized outreach, and engages prospects in multi-turn qualification conversations. These capabilities directly shape pipeline quality, seller productivity, and customer relationships.
As agentic AI becomes deeply embedded in revenue-critical workflows, trust must be earned through transparent, repeatable, and rigorous evaluation—not anecdotal wins or point demos.
Today, we’re announcing the Microsoft Sales Bench, a collection of evaluation benchmarks designed to assess the performance of AI-powered sales agents across real-world scenarios. The collection already includes the Sales Research Bench, which evaluates the Sales Research Agent. We are now adding the Sales Qualification Bench, which evaluates the Sales Qualification Agent in Dynamics 365 Sales.
This post presents the detailed evaluation methodology and results for the agent, including a head-to-head comparison against ChatGPT using identical data, tasks, and scoring rubrics. These efforts establish the first benchmark purpose-built to measure end-to-end sales agent workflows, from research to outreach to live qualification.
SQA Architecture
The Dynamics 365 Sales Qualification Agent (SQA) architecture is designed as an end-to-end, enterprise-grade AI system that autonomously researches leads, synthesizes insights, and generates seller-ready outreach. It combines an intelligence engine powered by large language models with iterative web and enterprise data research, tightly integrated with Dynamics 365 Sales and Microsoft Copilot Studio for orchestration. Built on secure enterprise foundations, the architecture enforces governance, compliance, and data protection while enabling scalable, trustworthy AI-driven sales workflows.
Evaluation Metrics and Methodology
To understand how well the Sales Qualification Agent (SQA) performs in real-world sales qualification workflows, we designed the Sales Qualification Bench, a comprehensive evaluation that mirrors how sellers actually research leads, personalize outreach, and engage with prospects. Our goal was straightforward: measure whether SQA can help reps qualify faster, personalize more effectively, and carry higher-quality customer conversations—using the same signals and information they rely on every day.
To ensure that the evaluations accurately represent real-world conditions, we developed a testbed that closely mirrors the complexity and ambiguity found in contemporary sales environments. This allowed us to evaluate SQA end to end, from autonomous research and reasoning to grounded, actionable research briefs, outreach messages, and multi-turn qualification conversations.
Evaluation Setup
To ensure real-world fidelity, we constructed a production-like lead evaluation environment that mirrors how SQA operates in Dynamics 365 Sales.
Lead and Data Corpus
Three synthetic but realistic seller companies (C1) across distinct industries, with unique:
Product offerings
Knowledge sources
Ideal customer profiles
300+ lead dataset (C2) expanded into a scenario-rich corpus:
Companies across 6 global regions (North America, Europe, Asia, South America, Australia, Africa)
33 industries
Mixed clarity (well-known brands and long-tail companies)
Structured attributes (name, role, email)
CRM roles represented:
Sales representatives
Digital specialists
Customer success managers
Each linked to relevant accounts, opportunities, and cases
Company segment coverage:
Enterprise
Mid-Market
Small Business
Government
Education
500+ email exchanges simulating real sales interactions:
Technical product questions
Meeting requests
Ambiguous or low-intent inquiries
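To make the shape of the corpus concrete, one lead record might be modeled roughly as follows. This is only a sketch: the `Lead` class, its field names, and the `validate` helper are illustrative assumptions, not the benchmark's actual schema. Only the region and segment values are taken from the lists above, and the example record uses placeholder values.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values taken from the corpus description above.
REGIONS = {"North America", "Europe", "Asia", "South America", "Australia", "Africa"}
SEGMENTS = {"Enterprise", "Mid-Market", "Small Business", "Government", "Education"}


@dataclass
class Lead:
    """One synthetic lead record; the field names are illustrative assumptions."""
    name: str
    role: str
    email: str
    company: str
    region: str
    industry: str
    segment: str
    email_thread: List[str] = field(default_factory=list)  # simulated exchanges


def validate(lead: Lead) -> List[str]:
    """Return a list of problems; an empty list means the record looks sane."""
    problems = []
    if lead.region not in REGIONS:
        problems.append(f"unknown region: {lead.region}")
    if lead.segment not in SEGMENTS:
        problems.append(f"unknown segment: {lead.segment}")
    if "@" not in lead.email:
        problems.append(f"malformed email: {lead.email}")
    return problems


# Placeholder values, not taken from the benchmark dataset.
example = Lead(
    name="Akira", role="IT Manager", email="akira@example.org",
    company="Example Agency", region="Asia", industry="Government services",
    segment="Government",
)
print(validate(example))
```

Keeping the allowed regions and segments as explicit sets makes it easy to check that a generated corpus actually covers every category it claims to cover.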
Simulated Agent Workflows
All evaluations reflected real SQA behavior:
Autonomous web-based research
Role-aware outreach generation
Multi-turn qualification conversation handling
Tasks Evaluated and Evaluation Metrics
1. Company Research
For each lead, the agent generates a structured research brief including:
Business overview, strategy and priorities
Financial signals
Recent news relevant to the seller
Metrics and their definitions:
Recency: Measure of how recent time-sensitive insights are relative to the current date (older insights are less useful for sellers).
Relevance & Solution Fit: Measure of how well the insights are tied back to sellers’ offerings (relevant insights are more actionable than a regurgitation of facts) and articulate the lead company’s need or interest in them.
Completeness: Measure of how well the insights capture all the facts that are useful to a seller.
Reliability: Measure of how consistently the agent finds useful insights for the seller (e.g., strategic priorities return current strategic priorities and not generic mission statements, news returns news articles and not generic evergreen statements about a company).
Credibility: Measure of how reputable the sources referenced by the agent are.
2. Lead Outreach
Based on its research, the agent generated a personalized email aligned to:
The lead’s role
The seller’s value proposition
The company’s business context
Value-based positioning
Metrics and their definitions:
Clarity: Assesses how clear, precise, and jargon-free the message is, ensuring every sentence adds value.
Personalization: Measures how well the email is tailored to the specific target company, using concrete company-level details rather than generic industry language.
News-anchored opening: Checks whether the email references recent company events or updates, ensuring the outreach feels timely and current.
Relevance and Solution Fit: Measure of how well the insights are tied back to sellers’ offerings/solutions (relevant insights are more actionable than a regurgitation of facts), and articulate the lead company’s need or interest in them.
Structure: Evaluates whether the email has a clear logical flow from opening hook to problem, solution, and call to action.
3. Qualification Conversations (Engage)
The agent then autonomously engages in a back-and-forth conversation with the lead, progressively asking questions that address customer-configured qualification criteria (such as budget, need, and timeline) and answering the lead’s own questions, such as:
“What does your solution do?”
“How are you priced?”
“How do you compare to competitors?”
“Who else uses this?”
Metrics and their definitions:
Answer Quality: Assesses whether the agent provides clear, relevant, and complete answers that directly address the customer’s intent.
Agent Comprehension: Evaluates how well the agent understands customer intent, prioritizes requests, and adapts tone and strategy based on the user’s response.
Answer Readability: Checks that responses are natural, professional, easy to read, and fully compliant with formatting and content rules.
Human handoff accuracy: Ensures the agent correctly flags when human intervention is required, such as for unanswered technical questions, legal/billing requests, meeting requests, or explicit requests for a human.
Discovery question coverage: Measures how effectively the agent qualifies leads using indirect, strategic discovery questions across Need, Budget, Authority, and Timeline.
Each metric is scored independently on a 0–10 scale, where higher scores indicate stronger performance. We used an LLM-as-a-judge approach to score outputs against the ground truth and rubric and manually reviewed a sampled subset of evaluations to calibrate the judges and validate scoring consistency. To reduce judge variance and mitigate hallucination risk, each sample was evaluated five times, and the mean across runs was recorded as the final score.
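The repeated-judging step described above can be sketched in a few lines. This is a minimal illustration, not the benchmark's code: the `judge` callable and the `make_stub_judge` helper are hypothetical stand-ins for a real LLM judge, and the canned scores are invented for the example. Only the five-run count, the 0–10 range, and the use of the mean come from the text.

```python
from statistics import mean
from typing import Callable, Iterable

RUNS_PER_SAMPLE = 5  # from the text: each sample was evaluated five times


def score_sample(
    judge: Callable[[str, str], float],
    output: str,
    metric: str,
    runs: int = RUNS_PER_SAMPLE,
) -> float:
    """Call the judge `runs` times for one output and metric, then average."""
    scores = [judge(output, metric) for _ in range(runs)]
    for s in scores:
        if not 0 <= s <= 10:
            raise ValueError(f"judge returned out-of-range score: {s}")
    return mean(scores)


def make_stub_judge(canned: Iterable[float]) -> Callable[[str, str], float]:
    """Hypothetical stand-in for an LLM judge: replays canned scores in order."""
    it = iter(canned)

    def judge(output: str, metric: str) -> float:
        return next(it)

    return judge


# Invented scores that mimic run-to-run judge variance.
stub = make_stub_judge([7.0, 8.0, 8.0, 7.0, 9.0])
print(score_sample(stub, "Hi Akira, ...", "Clarity"))
```

Averaging several runs smooths out the judge's own randomness, so a single unusually harsh or generous run has less influence on the recorded score.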
Benchmarking Strategy with ChatGPT
To ensure an objective and fair comparison, we replicated a standard seller workflow in the ChatGPT UI using GPT-4.1 with a Pro license. GPT-4.1 is a more advanced model than the GPT-4.1 mini variant currently used by SQA.
Standard Prompting
This setup simulates how a seller naturally interacts with a general-purpose LLM:
Comparisons reflect real-world usability, not prompt-engineering skill
Identical Knowledge Sources and Context
ChatGPT was given the exact same knowledge sources as SQA, including:
Full lead information and seller value proposition
Seller Q&A documentation via the SharePoint connector
Historical conversation context for reply generation
This isolates differences in agent reasoning and orchestration, not data access.
Evaluation Results
Microsoft evaluated the Sales Qualification Agent (SQA) and ChatGPT on more than 300 leads, covering research, outreach, and qualification tasks with identical knowledge sources. Evaluations completed on December 4, 2025, showed that SQA consistently outperformed ChatGPT (GPT-4.1).
Research: SQA was 6% more effective at relevant, thorough company research.
Outreach: SQA was 20% better at personalized communication and timely event references.
Engagement: SQA scored 16% higher for precise responses and targeted qualifying questions.
SQA also operates autonomously, reducing overhead and boosting pipeline quality for sales teams.
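The post does not state whether the percentages above are relative differences between composite scores or gaps measured against the full scoring scale. The sketch below shows both readings so the distinction is explicit. The function names and the two input scores are hypothetical and are not benchmark results.

```python
def relative_lift_pct(candidate: float, baseline: float) -> float:
    """Percentage by which `candidate` exceeds `baseline`, relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) * 100.0 / baseline


def absolute_gap_points(candidate: float, baseline: float, scale_max: float = 10.0) -> float:
    """Gap expressed as a percentage of the full scale (percentage points)."""
    return (candidate - baseline) * 100.0 / scale_max


# Hypothetical composite scores, NOT results from the benchmark.
sqa_score, chatgpt_score = 9.0, 7.5
print(relative_lift_pct(sqa_score, chatgpt_score))
print(absolute_gap_points(sqa_score, chatgpt_score))
```

The two readings diverge more as the baseline score gets lower, so a reproduction of the benchmark should state which one it reports.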
Results by Task Category
1. Company Research
SQA scored 6% better than ChatGPT, producing more relevant and complete research that highlighted the lead company’s interest in the seller’s offerings:
SQA provided more relevant results: To ensure sellers spend their time on the most important leads, they need to determine whether a lead is a good fit for their offerings. While both SQA and ChatGPT were given the same context (seller company and value proposition of the offerings), SQA consistently did better at tying its research back to this context, helping sellers determine fit. Appendix A shows an example where SQA was able to tie the company’s strategic priorities to its need for a collaboration platform and infer strong purchase ability from its robust operational health and minimal leverage burden.
SQA synthesized results with a higher level of fidelity and completeness: The agent’s value is directly correlated to its ability to eliminate tedious work for the seller. SQA produced more detailed research synthesis (as demonstrated in Appendix A), giving a single, trusted source for the seller to get equipped with any insights they may need.
These results stem from numerous experiments aimed at optimizing web research for the best outcomes at minimal cost, rather than relying on costly advanced models. Sellers get deeper insights from SQA’s agentic RAG, which reasons in real time over iterative web search results, combined with unique capabilities that increase data coverage, such as auto-linking CRM records and extracting company names from lead emails.
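As an illustration of the last capability mentioned above, a company hint can sometimes be derived from the domain of a lead's email address. The sketch below is a deliberately naive heuristic and an assumption on our part, not SQA's actual method. The function name, the free-mail list, and the sample addresses are all hypothetical.

```python
from typing import Optional

# Small illustrative list; a real system would need a much fuller one.
FREE_MAIL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com", "hotmail.com"}


def company_hint_from_email(email: str) -> Optional[str]:
    """Return a lowercase company hint from the email domain, or None.

    Naive on purpose: it takes the label just before the last dot, so
    multi-part suffixes such as "co.uk" would be handled incorrectly.
    """
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or "." not in domain:
        return None  # not a usable address
    if domain in FREE_MAIL_DOMAINS:
        return None  # personal mailbox, says nothing about the employer
    return domain.split(".")[-2]


print(company_hint_from_email("liam@contoso.com"))
print(company_hint_from_email("someone@gmail.com"))
```

A hint like this is only a starting point for a search query. It still has to be confirmed against web results or existing CRM records before it is treated as the lead's company.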
2. Personalized Outreach
SQA was 20% better than ChatGPT, notably ahead in the level of personalization and mentions of relevant recent events that will resonate with the lead.
More personalization and customer-centricity: A lead is more likely to respond to a cold outreach email that directly explains how the seller’s offering can address their needs. SQA did so effectively by starting with the lead’s situation and recent events, while ChatGPT often focused on the seller and used heavier technical jargon. A clear, actionable call to action closes the email and guides the conversation forward. Appendix B shows an example of how SQA was able to tie a recent acquisition the lead’s company made to the value proposition of the seller’s offering.
These results build on direct engagement with sellers: every sales team that deploys SQA gives us valuable feedback that benefits all other customers.
3. Qualification Conversations (Engage)
SQA was 16% better than ChatGPT. It responded with greater precision to the lead’s questions to develop purchase interest and asked pointed discovery questions to better qualify the lead before handing off to a seller.
Answers accurately by correctly understanding the lead’s intent and maintaining conversation context effectively. To drive deeper buyer consideration, SQA independently answered even the most technical questions that leads had about the seller’s offerings while maintaining the context from earlier messages in the simulated conversation, delivering clear, direct, and well-structured responses. Appendix C demonstrates SQA’s ability to pull the most relevant information from provided knowledge sources (in this case, files with technical specifications) during an ongoing conversation with a lead.
Handles uncertainty responsibly, handing off to a supervisor/seller when appropriate. Both SQA and ChatGPT were instructed to hand off a lead to a supervising seller when a suitable response could not be generated or when the lead was considered qualified per predefined criteria. SQA handed off accurately and at the right moment in more tests than ChatGPT.
Demonstrates strong discovery coverage. To maximize the value exchange from each follow-up conversation with the lead, SQA and ChatGPT were instructed to include discovery questions in their response to assess pre-configured qualification criteria (covering lead’s need, budget, buying authority and purchase timeline). SQA was able to ask pointed discovery questions to cover more of these criteria than ChatGPT in our simulated conversations. This resulted in SQA identifying and handing off better qualified leads through its engagement.
These gains are attributable to the hard lessons we have learnt through close collaboration with customers to understand the diversity of needs on intent detection and knowledge retrieval across AI agents for Dynamics 365 Sales, Service and Contact Center applications.
Conclusion
Sales Qualification Agent (SQA) is not a simple productivity tool—it is a complex multi-step agent directly influencing revenue outcomes. The Sales Qualification Bench represents a foundational step toward enterprise-grade trust, transparency, and continuous quality improvement for agentic AI in sales.
Through rigorous, scenario-driven benchmarking, SQA demonstrated clear leadership across research, outreach, and qualification—while also surfacing concrete architectural and rule-quality investments for the next phase of growth.
Upcoming plans for the Sales Qualification Bench include using the benchmark for continuous improvement of the Sales Qualification Agent, running further comparisons against a wider range of competitive offerings, and publishing the eval package so customers can reproduce it on their own data. Evaluation is not a one-time event. It is now embedded as a first-class engineering discipline in the SQA development lifecycle—ensuring the agent evolves safely, measurably, and in lockstep with seller needs.
Appendix
Sample inputs and responses for each category
Appendix A. Sample of research synthesis generated by SQA
Task
Find and summarize business overview, strategic priorities, financial summary, and recent news.
C1: Microsoft Teams
C2: Agência Nacional de Águas e Saneamento Básico
Research Response
Overview Agência Nacional de Águas e Saneamento Básico (ANA) is a Brazilian federal regulatory agency in the public sector, focused on water resources management and basic sanitation. It operates nationwide with headquarters linked to the Ministry of Integration and Development. ANA is responsible for regulating the use of water resources, managing the security of dams, overseeing public irrigation, and establishing national standards for basic sanitation services. The agency serves primarily government entities and industry—a B2B context, as it sets legislation and frameworks for critical infrastructure projects. Its mission centers on sustainable water resource management for development and environmental security. This background is highly relevant to Microsoft Teams’ value proposition, as ANA’s nationwide role and collaboration among stakeholders (internal teams, other agencies, private partners) demand robust, secure digital communication and collaboration platforms like Teams, which would support ANA’s operations, regulatory projects, and document governance requirements.
Strategic priorities ANA’s current strategic priorities include strengthening climate resilience, advancing national water governance, and digitalizing monitoring infrastructure to improve the management of Brazil’s water resources. The agency has prioritized participation in international programs, digital transformation in water resource management, and enhancing collaboration tools for more efficient regulatory processes. It seeks integrative technology for better data sharing, persistent project documentation, and cross-functional coordination among regulatory, legal, and operational teams. These needs align closely with Microsoft Teams’ capabilities: integrated communication, persistent chat channels, seamless meetings, and data governance tools are well-suited to ANA’s initiatives and pain points around digital collaboration, transparency, and security in government operations.
Financial summary Agência Nacional de Águas e Saneamento Básico (ANA) demonstrates stable financial health as a federally funded autarchy, with annual revenues sourced exclusively from government allocations. Its updated budget stands at R$458.63 million, with R$416.56 million in current revenues and R$277.8 million in expenses paid. The agency has executed approximately 89% of its forecasted budget with minimal reliance on outside debts or capital markets, indicating low financial risk and cost-effective operations. For a seller like Microsoft Teams, ANA shows consistent funding and robust operational execution, supporting strong purchasing capacity for communication and collaboration products and services.
News Agência Nacional de Águas e Saneamento Básico (ANA) launches new Digital Transformation Plan for 2025–2027: Published on 16/09/2025, ANA’s new Digital Transformation Plan prioritizes digital advancement for the next three years. This aligns with the value proposition of Microsoft Teams, as modernization initiatives may involve upgrading communication and collaboration infrastructure, making ANA potentially receptive to integrated solutions like Teams.
ANA publishes revised Information and Communication Technology Master Plan (PDTIC) for 2024–2026: On 24 February 2025, ANA approved updates to its Master Plan for IT and Communication, emphasizing ongoing activities by its IT department. For Microsoft Teams, this is relevant as the procurement and deployment of communication platforms may form part of the plan’s priorities and budget.
ANA releases 2025–2026 regulatory agenda with focus on digital processes and operational standards in water and sanitation: The new regulatory agenda, published 12/12/2024, includes digital governance, public-private partnerships, and operational standards for water and sanitization. Growing focus on digital solutions to support these agendas could signal interest in platforms that enable cross-team alignment, document management, and virtual collaboration, matching Microsoft Teams capabilities.
Appendix B. Sample of an initial outreach email generated by SQA
Task
Write a cold sales outreach email to a lead. Don’t include a sign-off or signature in the email body.
Lead ID: `d3f74f84-7e2b-4c16-9c37-919c2b8c2c6e`
Responses
Hi Akira,
I noticed the Japan Meteorological Agency’s recent procurement notice for a cloud-based meteorological communication system, highlighting your focus on secure and efficient collaboration.
Microsoft Teams can support JMA’s mission by consolidating chat, video calls, and file sharing into a single platform, enabling seamless cross-team collaboration on forecasts and critical data. This integrated approach aligns with your strategic priorities of disaster prevention and timely information sharing across multiple locations.
Would you be available for a quick call in the next few days to discuss further?
Appendix C. Sample of an engagement email generated by SQA
Task
Last Customer Reply “We’re constantly battling fragmented conversations—Slack channels get out of control, and switching to Teams for meetings is just more friction. Documents get lost between email threads and random cloud folders. Sales and marketing are rarely on the same page, especially when we’re chasing deals. Honestly, half the team just ends up texting when things get urgent. If you can solve that, I’m listening”
Response
Hi Liam,
Thanks for sharing the specific challenges your team faces with fragmented conversations, switching between Slack and Teams, and document management.
Microsoft Teams offers several features that directly address these issues:
+ Unified chat and channels keep conversations organized by project or department, reducing fragmentation.
+ Integrated meetings and calling allow seamless transitions from chat to video, minimizing friction between platforms.
+ Secure file sharing and real-time co-authoring help prevent documents from getting lost across email threads and cloud folders.
+ Deep integration with Microsoft 365 apps ensures sales and marketing teams stay aligned, with shared access to files and collaborative tools.
+ External collaboration features allow you to work securely with guests and partners without switching accounts.
To help tailor recommendations, could you share what budget range you have considered for improving your communication tools? Let me know if you’d like more details or have other questions about streamlining collaboration at CSU.
In October 2025, we announced the general availability of the Sales Qualification Agent (SQA) in Dynamics 365 Sales, a breakthrough in autonomous lead qualification. Sales Qualification Agent empowers sellers by helping them build higher-quality opportunities while eliminating tedious, repetitive work. Sales Qualification Agent autonomously researches every lead, initiates personalized outreach, and engages prospects to understand purchase intent, ensuring that sellers spend their time meeting prospects who are ready to take the next step. With modes enabling both seller-driven and fully autonomous qualification, the agent supports a key goal for sales organizations: increasing revenue per seller.
Customers are using Sales Qualification Agent in two ways:
Helping boost revenue beyond current sales capacity
Responding to inbound leads within minutes instead of days, increasing response rates and in turn, qualified opportunities.
Engaging leads that sellers are unable to follow up on due to capacity constraints, or those deemed economically unviable to pursue.
Increasing pipeline quality by focusing the seller’s time on a handful of high intent, engaged leads recommended by the agent.
Helping reduce sales costs
Reducing back-office costs related to lead research and validation, using Sales Qualification Agent in “Research only” mode to hand-off only the leads that meet the ideal customer profile criteria.
Automatically disqualifying low-quality leads, saving hours of seller time during the week.
Continuing to benchmark the quality of sales AI agents
Microsoft is building the future of agentic Sales technology with prebuilt AI agents, such as Sales Qualification Agent, the Sales Research Agent, and the Sales Close Agent available in Dynamics 365.
At Microsoft, we’re committed to delivering quality, trust, and transparency with our agents, and that requires rigorous evaluation. As we continue to build new agents and improve existing ones for critical sales workflows, evaluation benchmarks provide a structured and transparent way for our customers to measure quality for the jobs the agent does.
Today, we’re announcing the Microsoft Sales Bench—a new collection of evaluation benchmarks designed to assess the performance of AI-powered sales agents across real-world scenarios. This framework brings together purpose-built metrics, hundreds of sales-specific scenarios, and composite scoring validated by both human and AI judges.
The Sales Bench isn’t starting from scratch. It now formalizes and expands what began with the Sales Research Bench, published on October 21, 2025, which evaluates how AI solutions answer business research questions for sales leaders.
Today, we’re extending the Microsoft Sales Bench with a second benchmark: the Microsoft Sales Qualification Bench, focused on measuring how effectively AI agents qualify leads and generate high-quality pipeline.
Introducing the Sales Qualification Bench for lead qualification
This Microsoft Sales Qualification Bench evolved from rigorous evaluations we have conducted since the Sales Qualification Agent’s public preview in April, with the goal of objectively measuring quality as we further developed the agent in partnership with customers from a diverse set of industries. Since the preview, we have measured every update against these standards, ensuring improvements are real and repeatable.
We generated a synthetic dataset modeled after companies from three different industries, with 300 leads, with attributes such as name, company, and email ID—representative of what sales teams typically work with before any enrichment or hygiene is performed. In addition to these typical attributes, we also added key knowledge inputs such as value proposition of the products being sold, customer case studies, and documentation for answering customer questions.
In addition to Sales Qualification Agent, we used the evaluation framework to measure ChatGPT by OpenAI on the same dataset. Since we didn’t have access to an autonomous agent from OpenAI, we mimicked how a human seller would use ChatGPT to recreate the three key jobs SQA performs. We provided each system (Sales Qualification Agent and ChatGPT) with the exact same lead inputs, knowledge sources, and contextual signals under controlled evaluation configurations. We used a ChatGPT Pro license with GPT-4.1. This model is the closest match to (and slightly better than) Sales Qualification Agent’s GPT-4.1 mini, which we intentionally chose to deliver optimal quality at lower cost per lead than newer models. Additionally, the Pro license was chosen to optimize for quality: ChatGPT’s pricing page describes Pro as “full access to the best of ChatGPT.”
The framework evaluates outputs from the three jobs across Sales Qualification Agent and ChatGPT:
Research: Company research for the lead—background, strategic priorities, financial health, and latest news.
Outreach: A personalized email generated based on research, to make initial contact with the lead.
Engagement: The agent’s conversation with a lead until it’s qualified or dispositioned.
Our scoring metrics span core quality (accuracy, relevance, completeness), trustworthiness (grounding and citations), and business-specific success criteria (e.g., relevancy of company research to highlight interest in the seller’s offerings, personalization of the initial outreach emails sent to catch the lead’s attention, accuracy of responses to the lead’s questions to drive purchase intent, and the timing of handoff to a seller when the lead is ready to engage).
Outputs were scored independently by both human reviewers and an LLM judge built with GPT-5.1, using a 1–10 scale for each metric. These metric-specific scores were then rolled up using a simple average to produce a composite quality score. The result is a rigorous benchmark presenting a composite score and dimension-specific scores to reveal where agents excel or need improvement. Our methodology, metrics, and their definitions are described in this technical blog.
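The roll-up from metric-specific scores to a composite can be sketched as below. This passage describes a simple average, while the Results section mentions a weighted average, so the hypothetical `composite_score` helper accepts optional weights and falls back to equal weighting. The metric names are the outreach metrics defined in the technical post, and the scores are invented for illustration.

```python
from typing import Dict, Optional


def composite_score(
    metric_scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Average metric scores; with no weights this is the simple mean."""
    if not metric_scores:
        raise ValueError("at least one metric score is required")
    if weights is None:
        weights = {metric: 1.0 for metric in metric_scores}
    if set(weights) != set(metric_scores):
        raise ValueError("weights must cover exactly the scored metrics")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(metric_scores[m] * weights[m] for m in metric_scores)
    return weighted_sum / total_weight


# Invented scores for one outreach email, NOT benchmark results.
outreach = {
    "Clarity": 8.0,
    "Personalization": 7.0,
    "News-anchored opening": 6.0,
    "Relevance and Solution Fit": 9.0,
    "Structure": 10.0,
}
print(composite_score(outreach))
```

Reporting the per-metric scores alongside the composite, as the benchmark does, keeps a strong score on one dimension from hiding a weak score on another.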
Results
In evaluations completed on December 4, 2025, using the Sales Qualification Bench, Sales Qualification Agent outperformed ChatGPT on each of the three jobs required for sales qualification:
Research: The Sales Qualification Agent outperformed ChatGPT with 6% higher aggregate scores, leading on relevancy and completeness in research results that highlighted the lead company’s interest in the seller’s offerings.
Outreach: Sales Qualification Agent demonstrated 20% better results compared to ChatGPT, generating email drafts with accurate personalization and mentions of relevant recent events that will resonate with the lead.
Engagement: Sales Qualification Agent’s email responses to engage a lead over a multi-turn conversation scored 16% higher than ChatGPT’s. SQA generated emails that responded to the lead’s questions with accurate answers that develop their purchase interest and with precise discovery questions that qualify the lead before handing off to a seller.
In addition to performing better on these metrics, Sales Qualification Agent has the ability to run autonomously, which can help significantly reduce the time spent generating pipeline while helping sales teams build better quality pipeline.
We evaluated quality based on scoring metrics specific to research, outreach and engagement scenarios. An LLM judge, using OpenAI’s GPT-5.1 model, scored the outputs generated by Sales Qualification Agent and ChatGPT’s 4.1 model on a scale of 1 to 10 for each rubric, with higher scores being better. A total composite score for each of these scenarios was calculated as a weighted average from their metric-specific scores. More details on the methodology can be found in this blog.
Sales Qualification Agent scores well on these three jobs as it’s optimized for sales-specific scenarios and uses the following techniques to get great results:
It uses agentic Retrieval Augmented Generation (RAG) to relentlessly research each lead, ensuring greater completeness. More on this in the following section.
With knowledge of what the company sells, it can contextualize every workflow to increase relevancy for both the seller and the lead.
It can retrieve organizational knowledge from attached documents and internal repositories like SharePoint with greater precision, boosting accuracy of its responses when engaging with the lead.
The technical blog details which metrics SQA excels at relative to ChatGPT, where it falls short, and why.
Translating evals to real-world impact
Running evals led to major Sales Qualification Agent improvements during its six-month preview. Early results prompted us to try agentic AI design patterns, especially agentic RAG, which improved our company research by allowing iterative web searches and real-time reasoning. They also led us to enhance data coverage by auto-linking existing CRM records to each lead and inferring company names from lead emails. These updates provided sellers with deeper insights, revealing strategic opportunities and risks beyond basic facts.
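The iterative-search idea described above can be pictured as a loop that keeps querying until every section of the research brief has at least one result. The sketch below shows only that control flow and is not SQA's implementation. It contains no language model, `stub_search` is a hypothetical stand-in for a web search API, and the query-broadening rule and the fictional company "Contoso" are assumptions made for the example.

```python
from typing import Callable, Dict, List

SECTIONS = ("overview", "strategic priorities", "financial summary", "recent news")


def iterative_research(
    company: str,
    search: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Dict[str, List[str]]:
    """Re-query only the sections that are still empty, up to `max_rounds`."""
    findings: Dict[str, List[str]] = {}
    for round_no in range(max_rounds):
        missing = [s for s in SECTIONS if s not in findings]
        if not missing:
            break  # every section has at least one result
        for section in missing:
            # First pass uses a plain query; later passes broaden it.
            query = f"{company} {section}"
            if round_no > 0:
                query += " press release OR annual report"
            results = search(query)
            if results:
                findings[section] = results
    return findings


def stub_search(query: str) -> List[str]:
    """Stand-in for a web search API: canned hits keyed by exact query."""
    canned = {
        "Contoso overview": ["Contoso is a (fictional) manufacturer."],
        "Contoso strategic priorities": ["Expand digital channels."],
        "Contoso financial summary": ["Stable revenue (placeholder)."],
        # The news hit only appears for the broadened second-round query.
        "Contoso recent news press release OR annual report": ["Contoso opens new plant."],
    }
    return canned.get(query, [])


report = iterative_research("Contoso", stub_search)
print(sorted(report))
```

In a real agent, a language model would decide how to rewrite each follow-up query and when the evidence is sufficient. The fixed suffix here only marks where that reasoning step would sit.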
For instance, when researching leads for a security company, Sales Qualification Agent can link news on recent cyberattacks to increased demand for its software. As highlighted in the technical blog, research synthesized by the agent makes such inferences more consistently than ChatGPT. Enhancing the agent’s research also improved the relevance and personalization of outreach emails, helping agents better engage leads and clarify their ability and intent to purchase before handing them off to sellers.
Sandvik Coromant, a leader in precision cutting tools, partnered with us to pilot Sales Qualification Agent for their Digital Commerce program. After the updates, Pia Cedendahl, Global Sales Manager for Strategic Channels/Partners and Online Sales, noted, “Sales Qualification Agent’s answers became far more on-point to our business—it’s like having a research assistant that already understands what we care about.” Sandvik Coromant saw improved lead conversion and higher engagement from their Digital Account Managers, validating the impact of our evaluation-driven approach. Pia joined Microsoft leaders at the Microsoft Ignite 2025 session, “Accelerate revenue and seller productivity with agentic CRM,” where she shared how the team saved more than 120 hours and $19,000 in just the first three weeks since launching a pilot, and forecasted a 5% increase in revenue with full rollout.
Better insights, more personalization, proven value
Equipped with agentic AI design and backed by data-driven evaluation, customers can confidently use Sales Qualification Agent so that:
Sellers receive comprehensive company overviews, timely news highlights, and actionable recommendations that are consistently delivered with high quality—drawing a clear line from insight to action.
Sales leaders can expand their qualified pipeline cost efficiently, with the agent ensuring high lead quality.
Prospects benefit from more personalized outreach, enhancing their experience and supporting increased conversion rates.
What’s next
We’ll continue to refine Sales Qualification Agent using agentic design patterns, aiming to make every improvement measurable and meaningful. Stay tuned for the full evaluation results and methodology for the Sales Qualification Bench, which will be published for transparency and reproducibility. We also intend to add more evaluation frameworks and benchmarks to the Microsoft Sales Bench collection including benchmarks that cover future sales agent capabilities.
Markets are shifting faster than traditional operating models can adapt. Customer expectations are rising. Capital constraints require every decision to deliver measurable value. In this environment, organizations that thrive are those treating business model innovation as a core capability–not an occasional strategy exercise. They are rethinking how value is created, how operations scale, and how technology supports the enterprise.
ERP is evolving to meet this moment. As Satish Thomas described in his recent blog, we’re entering the era of agentic business applications: systems moving beyond recording transactions to actively orchestrating processes, anticipating needs, and adapting to change. For leaders, this evolution means ERP is becoming a system of action. It aligns people, data, and workflows around the outcomes the business is driving toward.
Why rental business models are accelerating
In asset- and product-driven industries, the ability to generate value from equipment, tools, and machinery has always been central to business performance. Increasingly, customers want access to what they need, when they need it, without long-term ownership. This asset-as-a-service shift is expanding across industries–from heavy equipment, consumer goods, and automotive to medical devices, technology assets, and renewable energy.
Global forecasts underscore this momentum. Industry forecasts from the American Rental Association and independent market analysts indicate the North American equipment and tool rental market will exceed $80 billion. Global rental and leasing revenues are already well above $500 billion annually. Similar trends are emerging in adjacent verticals, all signaling that the opportunity extends far beyond traditional equipment categories.
Today, rental processes often run across fragmented systems for quoting, dispatching, billing, and financials. The result: avoidable idle time, slow handoffs, margin erosion, and most critically, subpar customer experiences.
This is where Dynamics 365 ERP can help you transform your rental operations.
A strong connected foundation
Operational excellence in rental management demands more than isolated workflows. It requires seamless integration and orchestration across the entire lifecycle. From rental rate management, quoting, and asset reservation to contract management, inspections, maintenance, and billing, every step must work in harmony to keep revenue moving and customers delighted.
Building these capabilities into Dynamics 365 connects the full lifecycle of work within a single agentic ERP.
Today, we are announcing investments to accelerate new capabilities for rental operations. These capabilities are now in development and planned for release in Q4 of 2026. They include new ERP capabilities designed for:
Quoting and reservations to confirm availability and seamlessly convert opportunities into contracts.
Contract and pricing management for short- and long-term rentals, rent-to-own programs, or seasonal pricing with flexible terms and rate structures.
Inspections orchestration to coordinate inspections upon deliveries, transfers, and returns.
Billing and invoicing tied directly to rental activity to improve accuracy and reduce reconciliation effort.
Rental operations succeed when every handoff–from quoting to return–is coordinated and timely. The forthcoming capabilities bring structure and clarity to these moments. They are designed to help organizations accelerate deal cycles, improve asset utilization, enhance customer satisfaction, and reduce reliance on custom or manual processes.
By unifying these processes inside Dynamics 365, leveraging the composability of Dynamics 365 Finance & Supply Chain Management, Project Operations, and Field Service, organizations will be able to run rental as a natural extension of their operations rather than as a separate system or afterthought.
Driving utilization, uptime, and margin
The levers that shape rental performance–utilization, uptime, margin, and cash flow–are all influenced by how well operational data connect across the lifecycle. When organizations have a single view of reservation status, asset availability, and maintenance needs, they can plan more effectively and limit avoidable idle time. Consistent pricing and billing structures then help ensure every transaction reflects the same rules and logic, reducing confusion and rework. When maintenance activities are linked to actual rental usage, teams can schedule work proactively, support asset longevity, and reduce the risk of unplanned downtime.
With these elements working together, rental operations can run with greater predictability – improving financial clarity while delivering more reliable, trusted customer experiences.
Turning operational telemetry into financial clarity
Operational data is only as valuable as the financial clarity it enables. Information such as rental item status, reservations, and maintenance history can become a strategic asset when used to drive accurate forecasting, informed capital allocation, depreciation planning, and profitability analysis. By connecting operational metrics with financial outcomes, organizations can optimize resource utilization, reduce risk, and uncover opportunities for growth.
Enabling a strong ecosystem
The rental management capabilities that we are developing in Dynamics 365 will form a robust foundation for rental businesses. Rental operations vary significantly within and across industry verticals. To address this, we continue to build on our proven model of success. We are empowering the extensive ecosystem of Microsoft Dynamics 365 partners and ISVs to deliver specialized, deeply vertical solutions that meet unique business needs.
Because these foundational capabilities will run natively in Microsoft Dynamics 365 on the Microsoft Cloud, customers, ISVs, and partners can extend them with AI agents using MCP and Microsoft Copilot Studio to support vertical-specific requirements from front-office process optimization and automation to compliance, pricing strategies, and equipment lifecycle planning. The flexibility of the Microsoft Cloud, combined with advanced AI, is designed to help organizations accelerate innovation, optimize operations, and deliver differentiated customer experiences that drive growth and profitability. Microsoft remains committed to enabling innovation across the ecosystem.
Looking forward: the future of rental, built into agentic ERP
Flexible, service-based operations are transforming how organizations create value from their assets. Our investment in rental management capabilities is designed to help customers meet this moment. It will simplify processes, improve visibility, and deliver measurable business outcomes.
If your organization operates or supports rental models today, now is an ideal time to explore what’s possible with Dynamics 365. If you’re attending Convergence 2025, you’ll see firsthand how these investments align with our broader vision for adaptive, agentic ERP systems–solutions that work alongside your teams to drive operational excellence and unlock new opportunities for growth.
The way we work is transforming. Generative AI and agentic business applications are catalysts for a profound shift in how we create, collaborate, and make decisions. What once took hours now happens in moments. Ideas move faster, insights surface sooner, and the boundaries between human creativity and machine capability are blurring. This isn’t the future—it’s unfolding now, and it’s changing the very fabric of how businesses operate.
It’s a move from systems of record to systems of action. In this new era, AI agents go beyond support: they help interpret signals, uncover patterns, initiate actions, and continuously optimize processes on your behalf. At Convergence 2025, December 9–12, we’ll explore how organizations can build an AI-first autonomous enterprise powered by data, copilots, and agents working together across Microsoft Dynamics 365 and the Microsoft Cloud.
From automation to autonomy: The AI-first organization
Businesses have long pursued automation to streamline operations. The next leap is autonomy—systems that interpret signals across the business, adapt workflows in real time, and anticipate decisions. Autonomous enterprises go beyond efficiency—they are driven, adaptive, and human-centered, with intelligent systems that understand context, collaborate across processes, and deliver measurable outcomes.
We believe the recognition of Microsoft Dynamics 365 as a Leader in Gartner® Magic Quadrant™ and IDC MarketScape reports reinforces our ability to connect front-end and back-end operations into one intelligent ecosystem. Our leadership is grounded in the breadth of the Microsoft Cloud, the depth of our data estate, and our ability to bring AI, analytics, and business applications together as a unified platform to accelerate end-to-end AI transformation.
Agentic business applications are built on three connected pillars: data, Copilot, and agents—working together to change how work gets done.
1. Data: The foundation of agent autonomy
Most business processes begin with data. Finance close, order-to-cash, customer insights, case resolution, supply chain planning, field service, HR workflows—each depends on complete, connected, trusted information. When this data is fragmented across systems, agents can’t act with autonomy, and AI can only answer questions instead of accelerating outcomes.
Dynamics 365 spans both front-office and back-office operations—customer relationship management (CRM) and enterprise resource planning (ERP)—providing a continuous, end-to-end view of customers, employees, finances, supply chain, and operations. This structured, governed, and auditable data is the backbone of business processes—how work flows, how controls are enforced, and how organizations maintain compliance at scale.
As business model innovation becomes even more important, we continue to invest deeply in these core applications. We’re expanding capabilities across ERP and CRM and bringing Microsoft Fabric, Microsoft Copilot Studio, and Microsoft Power Platform closer to Dynamics 365 than ever before. From connected frontlines to adaptive supply chains, business applications are becoming composable, intelligent, and outcome-driven. This helps customers innovate more quickly, adapt faster, redesign workflows, and build the operational integrity required for agent-powered processes, all while maintaining trust, security, and governance. At Convergence 2025, you’ll see how Dynamics 365 continues to evolve as the operational engine of AI-first organizations.
At Microsoft Ignite 2025, we announced major advances in how agents connect with ERP systems. The Dynamics 365 ERP Model Context Protocol (MCP) server is evolving from static actions to a dynamic, configurable framework that adapts as business needs evolve. A new analytics MCP server extends this capability to structured metrics and insights, supporting agents to reason over governed operational and financial data—not snapshots or exports, but live business signals. Together, these updates unify data, logic, and actions into a common protocol that can simplify integration and accelerate automation.
And with MCP designed for massive scale—including support of millions of ERP actions—organizations gain the performance and extensibility required for agent-powered business processes at enterprise scale.
The entire Microsoft data and application stack—Dynamics 365 ERP and CRM, Dataverse, Fabric, and more—forms the operational engine of agentic transformation. It provides the structure, governance, and adaptability required to support AI-powered business models and next-generation workflows.
2. Microsoft 365 Copilot: Your strategic productivity partner
Copilot continues to help transform productivity across many roles. It helps sales teams move deals forward, finance teams accelerate reconciliation, and service teams resolve issues before they escalate. But Copilot is more than an assistant—it’s the connective layer that links people, data, and systems, understanding intent, orchestrating workflows, and guiding decisions across Dynamics 365, Power Platform, and Microsoft 365.
We expect AI agents will be core to how businesses operate—planning, deciding, and acting across systems to accelerate outcomes. These agents will interpret signals, identify patterns, and initiate actions to keep the business in motion.
Microsoft’s strategy spans the full agent ecosystem: first-party embedded agents inside Dynamics 365, agents for specialized industries that can be customized and extended by partners, specific partner-built agents, and custom agents created by organizations through Microsoft Copilot Studio. All share the same security, governance, and identity foundation.
For small to mid-sized businesses, Dynamics 365 Business Central brings agents directly into everyday finance and operations. The Sales Order Agent creates, validates, and updates sales orders which can help improve accuracy and speed by reducing manual entry and exception management. The Payables Agent automates vendor invoices and reconciliations, improving control while freeing finance teams from repetitive tasks. Together, these agents help Business Central customers modernize core processes with governed, AI-powered actions that keep work flowing across purchasing, sales, and accounting.
Across Dynamics 365 finance and operations, embedded agents are transforming core processes—from time and expense entry in Project Operations, to supplier outreach in Supply Chain Management, to reconciliations in Finance and technician scheduling in Field Service. These agents help reduce manual effort and bring greater precision and autonomy to everyday operations.
At Convergence 2025, we’re also thrilled to announce the public preview of the Product Change Management Agent Template, an AI-powered solution that transforms how manufacturers manage change across equipment, products, and processes. Built on Copilot Studio, the agent automates workflows and connects critical systems, helping teams cut approval times from weeks to days, reduce errors, and bring innovations to market faster. Learn how customers like Coca-Cola Beverages Africa (CCBA) are using this agent template to optimize their manufacturing operations.
Partner-built agents that extend industry workflows
Partners are building agents that extend domain workflows and connect directly to Dynamics 365 through the MCP. These solutions show how agent-to-agent coordination and cross-system reasoning will define the next era of enterprise automation.
Shop Floor by RSM helps manufacturers gain real-time visibility across production by bringing job details, quality checks, and operational signals into one experience. It can surface issues as they occur and support teams in resolving them quickly, helping to reduce disruptions and maintain consistent output. By connecting workers, data, and actions on the floor, the agent can support more resilient, adaptive manufacturing operations.
The PayFlow Agent by HSO helps finance teams manage vendor payment inquiries with greater speed and accuracy. It analyzes incoming emails, retrieves live payment data from Dynamics 365, and responds with up-to-date status information—which can reduce manual effort and help vendors receive clear, timely communication. The agent streamlines payment cycles and can improve transparency across accounts payable.
The Quality Impact Recall Agent by Cegeka helps organizations identify product quality issues and trace their impact across inventory and customer shipments. It coordinates notifications, guides corrective steps, and streamlines recall preparedness, which can help teams act quickly when risks emerge and maintain trust in the products they deliver.
Factorial connects to the Dynamics 365 Business Central MCP Server to deliver a new agent-to-agent experience. You can use a single Copilot interface to pull, exchange, and act on data across the systems. For example, within expense workflows, Factorial’s agent can request, validate, and reconcile financial data directly with Business Central.
Zensai’s agent integrates with Dynamics 365 Business Central to transform an organization’s operational signals into aligned goals and check-ins inside Microsoft 365. Built in Copilot Studio and connected through MCP for Dynamics 365, the agent converts finance, compliance, HR, and sales insights into structured, cascaded goals within Perform 365. At Convergence 2025, you’ll see how these first-party, partner, and custom-built agents form a connected ecosystem—and how organizations are already using them to move from automation to autonomy.
Convergence 2025
Convergence has always been where our community comes together to learn, connect, and imagine what’s next—and this year we’re reigniting that tradition at a moment of extraordinary change.
Throughout the event you’ll see how organizations are moving from systems of record to systems of agency, and how you can apply these ideas to accelerate your own AI transformation. I’m excited for you to experience everything we’ve been building.
Imagine being able to ask your CRM system a question like, “Which opportunities are likely to close this week?” or “Who has met with Ernie Kerrigan at Contoso recently?” and getting an instant, accurate answer without writing a single query or navigating through multiple Views in Dynamics 365.
Whether you’re using Copilot in Dynamics 365 Sales, Power Apps customized through Microsoft Copilot Studio or Microsoft 365 Copilot for Sales, under the hood, these experiences leverage one common engine: AI-powered Dataverse (DV) Search, which seamlessly connects business users to the underlying database schema, translating intent into action without requiring technical expertise. Thousands of enterprise customers already rely on this capability to power their business workflows.
Figure 1: How AI-powered Dataverse Search Connects Copilot Experiences Across Dynamics 365
We’ve reimagined the technology behind Dataverse Search from the ground up. Leveraging recent breakthroughs in agentic AI, the new system delivers answers that are more relevant, contextual, and accurate to your specific business data. Think of it as an intelligent assistant that not only understands your question but figures out the best way to answer it using an adaptive reasoning process.
In this blog, we’ll explore why this agentic approach was necessary, how it works under the hood, and how it scales to enterprise needs, supporting complex schemas, massive datasets, and domain-specific terminology while adhering to Microsoft Responsible AI principles. The agentic approach is model-agnostic: different models or fine-tuned models can influence the quality of results, but the choice of model is orthogonal to the architecture. For this post, our emphasis remains on the agentic loop and its role in delivering dynamic, context-aware answers. We will also share evaluation results and show you ways to customize the system for your business.
Queries to Conversations: Unlocking Your Live Business Data
Every organization’s Dynamics 365 environment is unique, and most customers customize it extensively. Over time, these customizations lead to complex schemas, ambiguous relationships, and massive datasets spanning millions of records and terabytes of data. Our original Dataverse Search system was pioneering, but it relied on a fixed-plan natural-language-to-SQL pipeline. A user’s question was converted to SQL through sequential stages: parsing, schema mapping, data linking, and SQL generation. This design was prone to cascading failures. Each stage operated in isolation without shared context, so a single error could invalidate the entire query. Every question followed the same fixed flow, even when certain steps were unnecessary. This resulted in brittle behavior and suboptimal answers for complex or ambiguous queries that spanned multiple tables.
We recognized the need for a more adaptable, resilient approach to tackle the complexities of enterprise data. This upgrade shifts DV Search beyond simple search into intelligent, interactive conversations with your business data. For you, this translates into immediate, actionable value by providing:
Real-Time, Actionable Answers: Ask, “Which of my open opportunities in New York are scheduled to close this month?” and get an instant answer from the live Dataverse data. This isn’t a report from last night’s data refresh; it’s the current state of your business.
Democratized Data Access: A service manager can ask, “Show me active, high-priority cases that haven’t been updated in 3 days” without needing to understand the underlying table structure of incidents and case/activities.
Deeper Contextual Conversations: The agent supports multi-turn conversations. After asking about opportunities in New York, you can follow up with, “Of those, which ones are for our ‘Pro’ license?” The agent remembers the context, providing a progressively refined answer.
Under the Hood: Agentic Architecture
To overcome some of the limitations of the earlier system and to meet complex customer scenarios, the new DV Search architecture introduces an Agentic Orchestrator powered by GPT-4.1. It transforms query handling from a static pipeline into a dynamic reasoning loop: plan → execute → refine. Instead of blindly converting text to SQL, the orchestrator treats each question as a goal, intelligently deciding the best steps to reach it.
Figure 2: Agentic Architecture for AI-powered Dataverse Search
Context Awareness and Conversations: When a user submits a new or follow-up question, a dedicated preprocessing component reviews prior conversation history and rewrites the query as a single, self-contained question, enabling coherent multi-turn conversations. For example, if you ask, “Show my top opportunities in Q4” and then follow up with “How about in Europe only?”, the component understands that the second question is a refinement of the first rather than starting from scratch or losing track of prior context. This conversational capability makes interactions feel natural and efficient. The refined question is then enriched with the business’s domain knowledge (glossary) to fully reflect the user’s intent within the specific business context.
Dynamic Planning and Execution: When the self-contained question comes in, the orchestrator doesn’t simply translate it into SQL. Instead, it breaks the query into logical steps and decides which tools to use and in what order, while also utilizing the domain knowledge encapsulated with the supplied glossaries. These tools include:
schema_linking_tool: Schema lookup for understanding tables and relationships
data_linking_tool: Semantic Search for finding relevant data values and resolving data ambiguities
sql_execution_tool: SQL execution tool for retrieving results
submit_plan_update_tool: Captures both the original plan and any course corrections made during execution
The orchestrator adapts on the fly if the first attempt fails or returns incomplete results. It analyzes the issue, revises the plan, and retries. This self-correcting loop is a major improvement over older systems that suffered from cascading failures.
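The plan → execute → refine loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the product’s implementation: an in-memory SQLite database stands in for Dataverse, propose_sql is a hypothetical deterministic stub standing in for the model call, and the helper functions merely borrow the tool names listed above. The stub guesses a wrong column name on the first attempt, so the loop has something to recover from.

```python
import sqlite3

def schema_linking_tool(conn):
    """Return {table: [column names]} for the stand-in database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: [c[1] for c in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}

def sql_execution_tool(conn, sql):
    """Run SQL; return (rows, None) on success or (None, error text)."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)

def propose_sql(schema, last_error):
    """Hypothetical stand-in for the model: guess first, then use the schema."""
    if last_error is None:
        # First plan: assume a column called "city" exists.
        return "SELECT name FROM opportunity WHERE city = 'New York'"
    # Refined plan: pick the real column whose name contains "city".
    column = next(c for c in schema["opportunity"] if "city" in c)
    return f"SELECT name FROM opportunity WHERE {column} = 'New York'"

def answer(conn, max_attempts=3):
    plan_log = []            # plays the role of submit_plan_update_tool
    schema, error = {}, None
    for attempt in range(1, max_attempts + 1):
        if error is not None:
            schema = schema_linking_tool(conn)   # refine: fetch missing context
        sql = propose_sql(schema, error)         # plan
        rows, error = sql_execution_tool(conn, sql)  # execute
        plan_log.append((attempt, sql, error))
        if error is None:
            return rows, plan_log
    raise RuntimeError(f"no valid query after {max_attempts} attempts")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opportunity (name TEXT, address_city TEXT)")
conn.executemany("INSERT INTO opportunity VALUES (?, ?)",
                 [("Espresso fleet", "New York"), ("Grinder refresh", "Boston")])

rows, plan_log = answer(conn)
print(rows)
for attempt, sql, error in plan_log:
    print(attempt, "ok" if error is None else f"failed: {error}")
```

The first attempt fails on the unknown column, the loop looks up the schema, and the second attempt succeeds. The plan log keeps both the original plan and the correction, which is the behavior a fixed sequential pipeline lacks.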
Handling Relational Complexity: One of the most powerful aspects of this approach is its ability to handle relational complexity. Operational business application schemas often require multi-hop joins across multiple tables, including custom entities. The orchestrator understands these relationships and can navigate them intelligently, ensuring accurate joins and filters even in highly customized environments. For example, if a question involves linking Accounts to Opportunities and then to a custom Product table, the agent plans the steps and executes them seamlessly.
Personalization and Learning: Personalization further enhances the experience. Over time, the system learns from usage patterns within your organization. If you frequently work with the Accounts table or use certain custom fields, the agent prioritizes those interpretations in future queries. This learning is based on interaction signals, not external data, and is carefully scoped to respect privacy and organizational boundaries. The result is a system that becomes more aligned with your business logic the more you use it.
Real-World Example
Imagine you run Fourth Coffee Machines, a business selling premium espresso and grinder units to commercial and residential customers. It’s managed through a Power App built on Dataverse. A seller begins with a simple keyword search in top-bar search in Power Apps for “Fourth Coffee” to confirm the account record. Thanks to fuzzy matching and relevance re-ranking, even typos like forth coffee or 4thcoffee surface the right entity instantly.
From there, the seller asks Copilot: “Show me my open opportunity at risk with Fourth Coffee.” The agent rewrites the query, scopes it to the current user, interprets at risk as a cold rating, and joins Account → Opportunity. It executes SQL, returns the results, and summarizes them with citations—no manual filtering, no report building.
Finally, the seller pivots to a KPI question: “What is the HRR for Coffee Grinder 02?” Here, the agent consults the business glossary, which defines HRR as Happy Response Rate (positive sentiment ÷ total reviews in the Product Review table). It computes the metric, explains the formula, and cites the source records. The user now understands exactly how the number was derived.
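As a worked illustration of the glossary formula above (positive sentiment ÷ total reviews), the following sketch computes HRR over a handful of made-up review records. The Review class, the sentiment labels, and the sample data are assumptions for illustration only; they do not reflect the actual Product Review table schema.

```python
from dataclasses import dataclass

@dataclass
class Review:
    product: str
    sentiment: str  # assumed labels: "positive", "neutral", "negative"

def happy_response_rate(reviews, product):
    """HRR = positive reviews / total reviews for one product."""
    relevant = [r for r in reviews if r.product == product]
    if not relevant:
        return None  # no reviews, so the rate is undefined
    positive = sum(1 for r in relevant if r.sentiment == "positive")
    return positive / len(relevant)

# Made-up sample records for illustration.
reviews = [
    Review("Coffee Grinder 02", "positive"),
    Review("Coffee Grinder 02", "positive"),
    Review("Coffee Grinder 02", "negative"),
    Review("Coffee Grinder 02", "positive"),
    Review("Espresso Pro", "negative"),
]

hrr = happy_response_rate(reviews, "Coffee Grinder 02")
print(f"HRR = {hrr:.0%}")  # 3 positive out of 4 reviews
```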
Under the hood, this seamless experience is powered by an Agentic Orchestrator that plans, executes, and refines dynamically. It chooses the right tools, adapts when errors occur, and injects domain knowledge from glossaries. By combining dynamic planning, iterative refinement, relational understanding, and personalization, it represents a significant leap forward from static query pipelines. It’s not just about generating SQL; it’s about orchestrating an intelligent, context-aware process that feels conversational and delivers real business value.
Evaluation Results
To measure how well our agentic system performs in practical enterprise scenarios, we evaluated it against curated datasets of user prompts, each representing or assisting with a real job to be done. These prompts reflect the everyday questions and tasks that drive productivity for CRM users, from quick record lookups and aggregation analytics using keyword search or simple filters and joins, to complex multi-join queries requiring domain expertise. By categorizing prompts into levels of complexity, we ensure the evaluations capture the full spectrum of enterprise challenges.
For each complexity level, we report two practical metrics: Relaxed Execution Accuracy (EX Accuracy) and P80 Latency. Relaxed Execution Accuracy measures how often the generated SQL returns the same rows as the reference SQL when both are executed on the same data. Extra columns in the predicted query are allowed, but extra or missing rows are not; order is ignored unless ORDER BY is specified. P80 Latency is the 80th percentile end-to-end response time, from request receipt through retrieval, model inference, and verification to the final SQL result. Together, these metrics give a transparent, action-oriented view of correctness and responsiveness as task complexity increases. They highlight where the agentic framework delivers reliable, efficient answers that empower users to get more done with natural language.
| Complexity Level | Description | Prompt Distribution | EX Accuracy (Relaxed)† | P80 Latency |
| --- | --- | --- | --- | --- |
| Level 1 | Keyword Search | 21% | 96.2% | 7.7s |
| Level 2 | NL Queries involving retrievals with filters and joins | 28% | 96.4% | 7.5s |
| Level 3 | NL Queries requiring understanding of Domain knowledge, Customizations | 51% | 81.2% | 10.6s |
† Metrics averaged over multiple runs
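To make the two metric definitions concrete, here is a minimal sketch of how they could be computed. It assumes result rows are available as dictionaries keyed by column name and that predicted and reference queries use the same names for shared columns. The nearest-rank percentile method is also an assumption, since the post does not say which percentile definition was used. The function names are illustrative, not part of any Microsoft tooling.

```python
from collections import Counter

def relaxed_match(pred_rows, ref_rows, ordered=False):
    """True if predicted rows equal reference rows on the reference columns.

    Extra predicted columns are ignored; extra or missing rows fail.
    Row order only matters when ordered=True (reference used ORDER BY).
    """
    if not ref_rows:
        return not pred_rows
    cols = list(ref_rows[0].keys())
    try:
        projected = [tuple(r[c] for c in cols) for r in pred_rows]
    except KeyError:
        return False  # a reference column is missing from the prediction
    expected = [tuple(r[c] for c in cols) for r in ref_rows]
    if ordered:
        return projected == expected
    return Counter(projected) == Counter(expected)

def p80(latencies):
    """Nearest-rank 80th percentile of a non-empty list of latencies."""
    if not latencies:
        raise ValueError("latencies must not be empty")
    ranked = sorted(latencies)
    rank = (80 * len(ranked) + 99) // 100  # ceil(0.8 * n) in integer math
    return ranked[rank - 1]

ref = [{"name": "A"}, {"name": "B"}]
pred_ok = [{"name": "B", "city": "NY"}, {"name": "A", "city": "LA"}]
pred_extra_row = pred_ok + [{"name": "C", "city": "SF"}]

results = [relaxed_match(pred_ok, ref), relaxed_match(pred_extra_row, ref)]
ex_accuracy = sum(results) / len(results)
latency_p80 = p80([5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0])
print(results, ex_accuracy, latency_p80)
```

Comparing rows as multisets keeps duplicate rows significant while ignoring order, which matches the stated rule that extra or missing rows count as failures.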
In practice, higher accuracy often comes at the cost of increased latency. Conversely, pushing for low latency can reduce end-to-end quality. This agentic system is designed to navigate that tradeoff, delivering strong accuracy while keeping latency within practical bounds for production use.
Tuning for Your Business: Glossaries and Enriched Schema
No AI system knows your business out-of-the-box. We’ve added tuning mechanisms that let makers refine how the Q&A agent understands your data:
Glossaries: You can define a glossary to teach the agent your company’s unique vocabulary and acronyms. For example, if “QoQ” is common slang on your team for “quarter-over-quarter” or “CTX” refers to a particular set of products, you can add those to the glossary. The next time someone asks “What’s the QoQ growth for CTX?”, the agent will know exactly what that means. This helps align the AI with the lingo of your organization so it interprets queries the same way a knowledgeable employee would.
Schema Descriptions: Dataverse allows adding custom descriptions to tables and columns. By populating these descriptions with meaningful info, you give the agent extra context. For instance, two fields might both be called “Status” – one on a custom entity and one on a standard entity. If you add descriptions like “(Order Status – custom)” vs “(System Status code)”, the agent can use that to pick the right field during SQL generation. Essentially, you’re able to clarify the semantics of your data model for the AI.
Using the inherent metadata in Dataverse (like relationships and display names) plus these maker-driven additions, the agentic system can be tailored to use the correct terms and relationships in your domain, boosting accuracy even further. And because you control these glossaries and descriptions, you can continuously refine the AI’s understanding as your business evolves.
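The glossary mechanism can be pictured as a lookup step that annotates a question before it reaches the orchestrator. The sketch below is an assumption about how such enrichment might work, not the actual Dataverse implementation. The GLOSSARY dictionary and the enrich function are hypothetical, and the sample entries reuse the QoQ and CTX examples from above.

```python
import re

# Hypothetical maker-defined glossary: term -> meaning.
GLOSSARY = {
    "QoQ": "quarter-over-quarter",
    "CTX": "a defined set of products",
}

def enrich(question, glossary):
    """Append definitions for every glossary term found in the question."""
    notes = []
    for term, meaning in glossary.items():
        # Whole-word, case-sensitive match so "CTX" does not match "ctxt".
        if re.search(rf"\b{re.escape(term)}\b", question):
            notes.append(f"{term} = {meaning}")
    if not notes:
        return question
    return f"{question}\n[Glossary: " + "; ".join(notes) + "]"

enriched = enrich("What's the QoQ growth for CTX?", GLOSSARY)
print(enriched)
```

Questions that contain no glossary terms pass through unchanged, so the enrichment step adds context only where the maker has defined it.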
Conclusion
By reinventing Dataverse Search with an agentic architecture, we’ve moved from a rigid query engine to an adaptive, intelligent assistant for your business. The system understands nuance, handles ambiguity through reasoning, and even lets you inject your domain knowledge. Early adopters are seeing more questions answered correctly and faster than before, turning previously buried data into actionable insights. One leading global financial services company saw Execution Accuracy surge from 22% to 97% on their marquee set of scenarios. This marks a significant step toward making enterprise data truly conversational. It empowers everyone from business users to power makers to tap into complex data and get the answers they need instantly and accurately, simply by asking.