This article is contributed. See the original author and article here.
In today’s fast-paced digital world, customers expect more than just plain text when interacting with businesses. Traditional text-based conversations can be inefficient. This is especially true when customers need to exchange detailed information, explore options, or make quick decisions. That’s where rich messaging comes in.
Rich messaging introduces interactive elements, such as forms, carousels, and suggested replies, directly within the conversation. This enables businesses to create conversations that are not only more engaging but also visually intuitive. As a result, customers understand their choices faster and act with confidence.
Now, you can preview rich media messaging across both live chat and WhatsApp. With rich media messaging, businesses can deliver enhanced experiences on the channels customers use most. This reduces typing, speeds up resolution, and improves overall satisfaction for both customers and agents.
Rich media message types
Rich messaging is already available for Apple Messages for Business. With this preview, forms, suggested replies, and cards are available in live chat, and suggested replies are available in WhatsApp.
Forms are supported in live chat
Suggested replies are supported in live chat and WhatsApp
Cards and carousels are supported in live chat
For scenarios where these options don’t fully meet a business’s live chat requirements, organizations can use Microsoft’s Adaptive Card technology to create fully customized JSON-based messages.
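As a rough illustration of what a customized JSON-based message can look like, the sketch below builds a small Adaptive Card payload in Python and serializes it. The element types (`TextBlock`, `Input.Text`, `Action.Submit`) come from the public Adaptive Cards schema; the field id, labels, and schema version are assumptions made up for this example, and which elements a given channel actually renders should be checked against the product documentation.

```python
import json

# Hypothetical card: a short form asking the customer for an order number.
# The id, labels, and texts below are illustrative, not taken from the product.
card = {
    "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
    "type": "AdaptiveCard",
    "version": "1.5",  # assumed; use the version your channel supports
    "body": [
        {
            "type": "TextBlock",
            "text": "Track your order",
            "weight": "Bolder",
            "wrap": True,
        },
        {
            "type": "Input.Text",
            "id": "orderNumber",  # key under which the submitted value is returned
            "label": "Order number",
            "isRequired": True,
            "errorMessage": "Please enter your order number.",
        },
    ],
    "actions": [{"type": "Action.Submit", "title": "Submit"}],
}

# Serialize to the JSON text that would be pasted into a custom message.
card_json = json.dumps(card, indent=2)
print(card_json)
```

Building the card as a dictionary first keeps the structure easy to validate in code before the JSON is handed to a template.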
Key capabilities
One template, multiple channels
Create rich message templates once and use them across both live chat and WhatsApp. There’s no need to redesign for each channel.
Preview pane
Instantly preview how your rich media message will appear to customers while designing, ensuring accuracy and a great user experience.
Create messages for both WhatsApp and live chat in one template (left) and preview the rich media message design (right) in template designer
Seamless bot integration
Reuse certain rich media templates, such as live chat forms and WhatsApp suggested replies, directly in Copilot Studio—eliminating the need to recreate templates for bots.
Service reps can customize templates
Customer service representatives can easily customize admin-designed templates by editing fields before sending them to customers, enabling personalized interactions.
Customer service representative editing rich media message form on the right before sending to customer
Enhanced customer experience
Rich messages are more visually engaging and make it easier for customers to share relevant information. This boosts customer satisfaction and reduces resolution times.
The customer’s view of a live chat form
Get started today
To get started, navigate to the Copilot Service admin center, select Productivity in Support experience, and then select Manage for Rich messages. Here you can start designing rich message templates for customer service representatives and bots.
Business leaders are facing a new reality. AI and agents are transforming traditional systems of record into systems of action, becoming applications that not only store data but use it to drive decisions and outcomes.
In this new model, the user experience becomes almost invisible. What matters most is the foundation: structured data, clear governance, and business logic that allows agents to operate effectively.
These are agentic business applications. They can help teams scale up capacity, lower operational costs, grow topline revenue, and surface key insights on an ongoing basis for smarter, faster decisions.
But technology alone isn’t enough. Business transformation requires functional leaders to align processes with these new capabilities. That means rethinking how work gets done. Agents can operate in the background, continuously monitoring, analyzing, and acting. They surface insights and take action, helping leaders stay focused on outcomes.
Early adopters, which we call Frontier Firms, are building the right foundations now. They are investing in agentic customer relationship management (CRM), enterprise resource planning (ERP), and contact center solutions (CCaaS), as well as rethinking how to align business processes with agents. They realize there must be a fundamental shift in how work gets done.
Microsoft agentic business applications: Toolkit for the frontier
To help organizations move to the Frontier, Microsoft offers a suite of agentic business applications with Dynamics 365—bringing enterprise-grade AI and Microsoft Copilot experiences across CRM, ERP, and CCaaS. Organizations can extend Dynamics 365 with Microsoft Power Platform and Microsoft Copilot Studio to build custom AI-powered applications and agents tailored to unique business needs.
At the core of every agentic business application there are three components:
Agents that transform business processes.
Copilot that empowers every employee to maximize productivity.
A unified, secure data platform that connects insights across the enterprise.
Let’s take a look at each of the components of the stack.
Expanding Dynamics 365 agents in key business functions
Over the last year, we have launched more than a dozen business process agents in Dynamics 365, giving organizations a starting point to transform sales, service, finance, and supply chain. We’re continuing to expand our agent portfolio to deliver proactive and growth-oriented outcomes.
In Dynamics 365 Sales, the new Sales Close Agent (in public preview beginning October 25, 2025) helps sellers prioritize high-value opportunities, proactively identify and mitigate risks for deals in the pipeline, and close simple transactions, accelerating deal velocity and improving win rates.
Also in Dynamics 365 Sales, agents are moving to public preview and general availability, including Sales Research Agent (public preview began on October 1, 2025) and Sales Qualification Agent (with general availability beginning October 25, 2025).
In Dynamics 365 Customer Service and Dynamics 365 Contact Center, the new Quality Evaluation Agent (general availability beginning October 24, 2025) gives supervisors and service teams a real-time pulse on service quality across both human and AI-led interactions. Unlike traditional, manual approaches that review a small fraction of engagements, this agent uses the speed and scale of AI to evaluate the majority of cases and conversations, uncover actionable insights, and assess AI-handled interactions. It monitors quality metrics, detects anomalies, and initiates corrective actions, enabling broader, faster, and more consistent quality management.
In addition, service agents moving to general availability beginning October 24, 2025, include Case Management Agent in Dynamics 365 Customer Service, as well as Customer Knowledge Management Agent and Customer Intent Agent in Dynamics 365 Customer Service and Contact Center. In Dynamics 365 Field Service, Scheduling Operations Agent, in public preview, keeps schedules agile and service running smoothly.
“By adopting agents in Dynamics 365 service solutions, we’re making every interaction faster and more empathetic. In a service where demand exceeds capacity, this can be a game changer.
Agents help gather information, route contacts based on need, and streamline resolution—enabling counselors to focus on direct support to young people.
In our fundraising unit, we’re also exploring how agents can manage inbound calls to reduce abandonment rates from 20 to 30% to under 5%—directly lifting revenue streams that fund vital services.”
—Helen Vahdat, Chief Information Officer, yourtown (Kids Helpline)
In our ERP portfolio, customers can use Account Reconciliation Agent in Dynamics 365 Finance and the Supplier Communications Agent in Dynamics 365 Supply Chain Management to complete reconciliation faster and process inbound supplier emails autonomously.
“The Account Reconciliation Agent pilot sharpened our team’s understanding of AI in practice and paved the way for a confident move toward the Supplier Communication Agent where we see clear potential to drive efficiency and enhance collaboration.”
—Wolfgang Bauer, ERP Team Lead, Haas Baumanagement GmbH
Additionally, customers can access Sales Order Agent and Payables Agent in Dynamics 365 Business Central, and Time and Expense Agent and Activity Approvals Agent in Dynamics 365 Project Operations.
To further support organizations on their journey to the frontier, we’re making it easier to get started with agents. Beginning in late November 2025, Dynamics 365 Premium SKUs—including Dynamics 365 Sales Premium, Customer Service Premium, Supply Chain Management Premium, and Finance Premium—will include 1,000 Copilot Credits per user, per month, pooled at the tenant level. New and existing customers can use these credits to run agents in the scenarios most meaningful to their business. When the included capacity is exhausted, customers can add more capacity with additional Copilot Credits as needed.
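The pooling arithmetic can be sketched as follows. Only the 1,000 credits per user per month figure comes from the announcement; the function names, the 50-user tenant, and the 65,000-credit usage estimate are hypothetical, and actual consumption depends on per-agent rates that are not described here.

```python
CREDITS_PER_USER_PER_MONTH = 1_000  # figure stated in the announcement


def pooled_credits(premium_users: int) -> int:
    """Monthly Copilot Credits pooled at the tenant level."""
    if premium_users < 0:
        raise ValueError("premium_users must be non-negative")
    return premium_users * CREDITS_PER_USER_PER_MONTH


def additional_credits_needed(premium_users: int, expected_usage: int) -> int:
    """Credits to add once the included pool is exhausted (0 if it suffices)."""
    return max(0, expected_usage - pooled_credits(premium_users))


# Hypothetical tenant: 50 Premium users, estimated usage of 65,000 credits.
pool = pooled_credits(50)
shortfall = additional_credits_needed(50, 65_000)
print(pool, shortfall)
```

Because the credits are pooled rather than assigned per seat, heavy use by a few teams can be offset by lighter use elsewhere in the same tenant.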
Benchmarks—The Sales Research Bench
As organizations begin using agents to transform core processes, the next priority is ensuring these solutions deliver measurable value so that leaders can make confident, high-impact decisions. Microsoft is meeting this need through benchmarks that provide a standardized evaluation framework to continuously measure the quality of output from AI solutions. The most recent example is the Sales Research Bench, which uses a 100-point scale to measure what sales leaders have told us matters most to them: accuracy, relevance, clarity, and transparency. More specifically, the Sales Research Bench evaluates how AI solutions generate text and data visualizations in response to the strategic, multi-faceted questions that sales leaders have about their business data.
The Sales Research Bench runs 200 business research questions typical of enterprise sales leaders on a sample customized data schema that reflects the complexities of enterprise environments. It assesses performance across 8 quality dimensions with scoring by large language models (Azure Foundry out-of-box evaluators for two dimensions and OpenAI’s GPT 4.1 model with specific instructions for the other six dimensions). Dimension-specific scores are weighted to create a composite quality score.
In evaluations executed by Microsoft using the Sales Research Bench framework, the Sales Research Agent in Dynamics 365 outperforms both ChatGPT-5 and Claude Sonnet 4.5. More details on the benchmark methodology and results are available here. We intend to publish the full evaluation package including the 200 benchmark questions and sample dataset in the coming months, so others can run these evaluations themselves.
With this approach, we’re creating purpose-built agent benchmarks aligned to the priorities of business leaders. Our intent is to demonstrate a new standard for trust and transparency, providing clear insight into the quality and performance of agents in a specific business function. We also plan to publish agent performance regularly to reduce friction and help leaders make confident, data-driven decisions.
Results: Results reflect testing completed on October 19, 2025, applying the Sales Research Bench methodology to evaluate Microsoft’s Sales Research Agent (part of Dynamics 365 Sales), ChatGPT by OpenAI using a ChatGPT Pro license with GPT-5 in Auto mode, and Claude Sonnet 4.5 by Anthropic using a Claude Max license.[1]
Empowering everyone with Microsoft Copilot
The next critical layer in agentic transformation is Microsoft Copilot, which is embedded across Dynamics 365, enhancing sales, customer service, and finance. By automating routine tasks, such as summarizing key opportunities, drafting email responses to customer queries, and predicting and acting on supply chain disruptions, Microsoft Copilot frees employees to focus on strategic work that drives more impact.
With Copilot in Dynamics 365 Sales, sellers can spend less time in their CRM, and more time nurturing customer relationships. For example, Copilot can provide quick summaries of sales opportunities and leads, meeting preparations, and account-related news.
Grand & Toy uses Copilot’s real-time insights, dashboards, and time-saving features like chat summarization, email creation, and sentiment analysis to deliver exceptional customer service.
Connecting businesses on a unified, trusted platform
Lastly, there is the data layer—the foundation of agentic transformation. When unified, it can connect every interaction, insight, and action. With integration between Dynamics 365 and Microsoft 365, organizations can unify data and workflows, so teams can stay focused and make faster decisions.
Built on Microsoft Dataverse, Dynamics 365 agents deliver real-time insights across departments like sales, service, and finance without silos, enabling faster and more collaborative decision-making.
Banco PAN is a strong example of this transformation, using Dataverse as a core part of their Dynamics 365 solution to enable real-time integration across systems.
“Our operators now have immediate access to the customer’s history and can resolve issues more quickly.”
—Tulio Prado, Service Superintendent at Banco PAN
Dynamics 365 seamlessly connects with Power Platform and Copilot Studio, creating a unified foundation for apps, agents, and AI. This deep integration empowers everyone—not just professional developers—to build, customize, and deploy intelligent solutions that adapt to business needs. By bringing low-code innovation and enterprise-grade security together, organizations can streamline processes and workflows, reduce costs, and unlock new ways to work smarter.
Explore more
With today’s business applications varying widely in capability and impact, organizations face critical choices. Agentic business applications are the path forward. Discover how leading companies are moving on that path with Dynamics 365, beyond static systems of record to intelligent systems of action to drive real-time insights, automation, and growth.
Tune in to the Business Applications Launch Event, streaming October 23, 2025 on YouTube, to see real-world solutions built on Microsoft agentic business applications.
Join us at Microsoft Ignite 2025 in San Francisco, California from November 18 to 21, 2025. Connect with industry leaders, explore hands-on demos, and be there to get the latest product announcements. Attend Innovation Sessions that delve deeper into how agentic business applications are reshaping the future of work and actionable strategies for leadership.
[1] Methodology and evaluation dimensions: Sales Research Bench includes 200 business research questions relevant to sales leaders that were run on a sample customized data schema. Each AI solution was given access to the sample dataset using an access mechanism aligned with its architecture. The responses each solution generated to each business question, including text and data visualizations, were scored by large language model judges. We evaluated quality on 8 dimensions, weighting each according to qualitative input from customers about what they value most in AI tools for sales research: Text Groundedness (25%), Chart Groundedness (25%), Text Relevance (13%), Explainability (12%), Schema Accuracy (10%), Chart Relevance (5%), Chart Fit (5%), and Chart Clarity (5%). Each dimension received a score from a large language model judge, from 20 as the worst rating to 100 as the best. For example, the judge would give a score of 100 for chart clarity if the chart is crisp and well labeled, and a score of 20 if the chart is unreadable or misleading. Text Groundedness and Text Relevance used Azure Foundry’s out-of-box large language model evaluators, while judging for the other six dimensions used OpenAI’s GPT 4.1 model with specific guidance. A total composite score was calculated as a weighted average of the 8 dimension-specific scores. More details on the methodology can be found in this blog: The Sales Research Agent and Sales Research Bench.
In today’s hyper-competitive business landscape, sales leaders face a relentless challenge: how to drive growth, outpace competitors, and make smarter decisions faster in a resource-constrained environment. Thankfully, the promise of AI in sales is no longer theoretical. With the advent of agentic solutions embedded in Microsoft Dynamics 365 Sales, including the Sales Research Agent, organizations are witnessing a transformation in how business decisions are made and how teams are empowered. But how do you know if these breakthrough technologies have reached a level of quality where you can trust them to support business-critical decisions?
Today, I’m excited to share an update on the Sales Research Agent, in public preview as of October 1, as well as a new evaluation benchmark, the Microsoft Sales Research Bench, created to assess how AI solutions respond to the strategic, multi-faceted questions that sales leaders have about their business and operational performance. We intend to publish the full evaluation package behind the Sales Research Bench in the coming months so that others can run these evals on different AI solutions themselves.
The New Frontier: AI Research Agents in Sales
Sales Research Agent in Dynamics 365 Sales empowers business leaders to explore complex business questions through natural language conversations with their data. It leverages a multi-modal, multi-model, and multi-agent architecture to reason over intricate, customized schemas with deep sales domain expertise. The agent delivers novel, decision-ready insights through narrative explanations and rich visualizations tailored to the specific business context.
For sales leaders, this means the ability to self-serve on real-time trustworthy analysis, spanning CRM and other domains, which previously took many people days or weeks to compile, with access to deeper insights enabled by the power of AI on pipeline, revenue attainment, and other critical topics.
Image: Screenshot of the Sales Research Agent in Dynamics 365 Sales
“As a product manager in the sales domain, balancing deep data analysis with timely insights is a constant challenge. The pace of changing market dynamics demands a new way to think about go-to-market tactics. With the Sales Research Agent, we’re excited to bridge the gap between traditional and time-intensive reporting and real-time, AI-assisted analysis, complementing our existing tools and setting a new standard for understanding sales data.”
Kris Kuty, EY LLP Clients & Industries — Digital Engagement, Account, and Sales Excellence Lead
What makes the Sales Research Agent so unique?
Its turnkey experience goes beyond the standard AI chat interface to provide a complete user experience with text narratives and data visualizations tailored for business research and compatible with a sales leader’s natural business language.
As part of Dynamics 365 Sales, it automatically connects to your CRM data and applies schema intelligence to your customizations, with the deep understanding of your business logic and the sales domain that you’d expect a business application to have.
Its multi-agent, multi-model architecture enables the Sales Research Agent to build out a dedicated research plan and to delegate each task to specialized agents, using the model best suited for the task at hand.
Before the agent shares its business assessment and analysis, it critiques its work for quality. If the output does not meet the agent’s own quality bar, it will revise its work.
The agent explains how it arrived at its answers using simple language for business users and showing SQL queries for technical users, enabling customers to quickly verify its accuracy.
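The critique-and-revise behavior described above can be pictured as a bounded loop. This is a generic sketch, not the agent’s implementation: the function names, the 0.8 quality bar, the revision limit, and the stub callables are all hypothetical stand-ins for model calls.

```python
from typing import Callable


def draft_with_self_review(
    generate: Callable[[str], str],
    critique: Callable[[str, str], float],
    revise: Callable[[str, str], str],
    question: str,
    quality_bar: float = 0.8,  # assumed threshold, for illustration only
    max_revisions: int = 2,    # bound the loop so it always terminates
) -> tuple[str, float, int]:
    """Generate an answer, score it, and revise until it clears the bar."""
    answer = generate(question)
    score = critique(question, answer)
    revisions = 0
    while score < quality_bar and revisions < max_revisions:
        answer = revise(question, answer)
        score = critique(question, answer)
        revisions += 1
    return answer, score, revisions


# Stub callables standing in for model calls (purely illustrative).
def generate(question: str) -> str:
    return "Pipeline grew quarter over quarter."


def critique(question: str, answer: str) -> float:
    # Toy rule: an answer only passes if it cites where the data came from.
    return 1.0 if "[source:" in answer else 0.5


def revise(question: str, answer: str) -> str:
    return answer + " [source: opportunity table]"


answer, score, revisions = draft_with_self_review(
    generate, critique, revise, "How is pipeline trending?"
)
print(answer, score, revisions)
```

Capping the number of revisions is a design choice in this sketch: it trades a guaranteed response time for the possibility of returning an answer that is still below the bar, which a caller could flag to the user.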
Why Verifiable Quality Matters
Seemingly every day a new AI tool shows up. The market is crowded with offers that may or may not deliver acceptable levels of quality to support business decisions. How do you know what’s truly enterprise ready? To help make sure business leaders do not have to rely on anecdotal evidence or “gut feel”, any vendor providing AI solutions needs to earn trust through clear, repeatable metrics that demonstrate quality, showing where the AI excels, where it needs improvement, and how it stacks up against alternatives.
While there is a wide range of pioneering work on AI evaluation, enterprises deserve benchmarks that are purpose-built for their needs. Existing benchmarks don’t reflect 1) the strategic, multi-faceted questions of sales leaders using their natural business language; 2) the importance of schema accuracy; or 3) the value of quality across text and visualizations. That is why we are introducing the Sales Research Bench.
Introducing Sales Research Bench: The Benchmark for AI-powered Sales Research
Inspired by groundbreaking work in AI benchmarks such as TBFact and RadFact, Microsoft developed the Sales Research Bench to assess how AI solutions respond to the business research questions that sales leaders have about their business data.[1]
Read this blog post for a detailed explanation of the Sales Research Bench methodology as well as the Sales Research Agent’s architecture.
This benchmark is based on our customers’ real-life experiences and priorities. From engagements with customer sales teams across industries and around the world, Microsoft created 200 real-world business questions in the language sales leaders use and identified 8 critical dimensions of quality spanning accuracy, relevance, clarity, and explainability. The data schema on which the evaluations take place is customized to reflect the complexities of our customers’ enterprise environments, with their layered business logic and nuanced operational realities.
To illustrate, here are 3 of our 200 evaluation questions informed by real sales leader questions:
Looking at closed opportunities, which sellers have the largest gap between Total Actual Sales and Est Value First Year in the ‘Corporate Offices’ Business Segment?
Are our sales efforts concentrated on specific industries or spread evenly across industries?
Compared to my headcount on paper (30), how many people are actually in seat and generating pipeline?
Judging is handled by LLM evaluators that rate an AI solution’s responses (text and data visualizations) against each quality dimension on a 100-point scale based on specific guidelines (for example, a score of 100 for chart clarity if the chart is crisp and well labeled, and a score of 20 if the chart is unreadable or misleading). These dimension-specific scores are then weighted to produce a composite quality score, with the weights based on qualitative input from customers about what they value most. The result is a rigorous benchmark presenting a composite score and dimension-specific scores to reveal where agents excel or need improvement.[2]
[2] Sales Research Bench uses Azure Foundry’s out-of-box LLM evaluators for the dimensions of Text Groundedness and Text Relevance. The other 6 dimensions each have a custom LLM evaluator that uses OpenAI’s GPT 4.1 model. The 100-point scale runs from 20 (lowest score) to 100 (highest score). More details on the benchmark methodology are provided here.
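The weighting step can be sketched in a few lines. The weights below are the ones published in the benchmark’s methodology note, and the 20 to 100 range is the stated scoring scale; the function name, the validation, and the example scores are illustrative assumptions, not Microsoft’s evaluation code or actual benchmark results.

```python
# Published dimension weights, in percent (they sum to 100).
WEIGHTS_PCT = {
    "Text Groundedness": 25,
    "Chart Groundedness": 25,
    "Text Relevance": 13,
    "Explainability": 12,
    "Schema Accuracy": 10,
    "Chart Relevance": 5,
    "Chart Fit": 5,
    "Chart Clarity": 5,
}
MIN_SCORE, MAX_SCORE = 20, 100  # stated range of each judge score


def composite_score(scores: dict[str, float]) -> float:
    """Weighted average of the eight dimension-specific scores."""
    if set(scores) != set(WEIGHTS_PCT):
        raise ValueError("scores must cover exactly the eight dimensions")
    for name, value in scores.items():
        if not MIN_SCORE <= value <= MAX_SCORE:
            raise ValueError(f"{name} score {value} is outside 20..100")
    return sum(WEIGHTS_PCT[name] * scores[name] for name in WEIGHTS_PCT) / 100


# Made-up inputs for illustration: 80 everywhere, 100 for chart clarity.
example = {name: 80 for name in WEIGHTS_PCT}
example["Chart Clarity"] = 100
print(composite_score(example))  # 81.0
```

Because the two groundedness dimensions carry half the total weight, a change there moves the composite five times as much as the same change in any of the 5% chart dimensions.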
Running Sales Research Bench on AI solutions
Here’s how we applied the Sales Research Bench to run evaluations on the Sales Research Agent, ChatGPT by OpenAI, and Claude by Anthropic.
License: Microsoft evaluated ChatGPT by OpenAI using a Pro license with GPT-5 in Auto mode and Claude Sonnet 4.5 by Anthropic using a Max license. The licenses were chosen to optimize for quality: ChatGPT’s pricing page describes Pro as “full access to the best of ChatGPT,” while Claude’s pricing page recommends Max to “get the most out of Claude.”3 Similarly, ChatGPT’s evaluation was run using Auto mode, a setting that allows ChatGPT’s system to determine the best-suited model variant for each prompt.
Questions: All agents were given the same 200 business questions.
Instructions: ChatGPT and Claude were given explicit instructions to create charts and to explain how they got to their answers. (Equivalent instructions are included in the Sales Research Agent out of box.)
Data: ChatGPT and Claude accessed the sample dataset in an Azure SQL instance exposed through the MCP SQL connector. The Sales Research Agent connects to the sample dataset in Dynamics 365 Sales out of box.
Results are in: Sales Research Agent vs. alternative offerings
In head-to-head evaluations completed on October 19, 2025 using the Sales Research Bench framework, the Sales Research Agent outperformed Claude Sonnet 4.5 by 13 points and ChatGPT-5 by 24.1 points on a 100-point scale.
Image: Sales Research Agent – Evaluation Results
[1] Results: Results reflect testing completed on October 19, 2025, applying the Sales Research Bench methodology to evaluate Microsoft’s Sales Research Agent (part of Dynamics 365 Sales), ChatGPT by OpenAI using a ChatGPT Pro license with GPT-5 in Auto mode, and Claude Sonnet 4.5 by Anthropic using a Claude Max license.
Methodology and Evaluation dimensions: Sales Research Bench includes 200 business research questions relevant to sales leaders that were run on a sample customized data schema. Each AI solution was given access to the sample dataset using different access mechanisms that aligned with their architecture. Each AI solution was judged by LLM judges for the responses the solution generated to each business question, including text and data visualizations.
We evaluated quality on 8 dimensions, weighting each according to qualitative input from customers about what they value most in AI tools for sales research: Text Groundedness (25%), Chart Groundedness (25%), Text Relevance (13%), Explainability (12%), Schema Accuracy (10%), Chart Relevance (5%), Chart Fit (5%), and Chart Clarity (5%). Each dimension received a score from an LLM judge, from 20 as the worst rating to 100 as the best. For example, the LLM judge would give a score of 100 for chart clarity if the chart is crisp and well labeled, and a score of 20 if the chart is unreadable or misleading. Text Groundedness and Text Relevance used Azure Foundry’s out-of-box LLM evaluators, while judging for the other six dimensions used OpenAI’s GPT 4.1 model with specific guidance. A total composite score was calculated as a weighted average of the 8 dimension-specific scores. More details on the methodology can be found in this blog.
The Sales Research Agent outperformed these solutions on each of the 8 quality dimensions.
Image: Evaluation Scores for Each of the Eight Dimensions
The Road Ahead: Investing in Benchmarks
Upcoming plans for the Sales Research Bench include using the benchmark for continuous improvement of the Sales Research Agent, running comparisons against a wider range of competitive offerings, and publishing the full evaluation package including all 200 questions and the sample dataset in the coming months, so that others can run it themselves to verify the published results and benchmark the agents they use. Evaluation is not a one-time event. Scores can be tracked across releases, domains, and datasets, driving targeted quality improvements and ensuring the AI evolves with your business.
Sales Research Bench is just the beginning. Microsoft plans to develop eval frameworks and benchmarks for more business functions and agentic solutions—in customer service, finance, and beyond. The goal is to set a new standard for trust and transparency in enterprise AI.
Why This Matters for Sales Leaders
For business decision makers, the implications are profound:
Accelerated Decision-Making: AI-driven insights you can trust, when delivered in real time, enable faster, more confident decisions.
Continuous Improvement: Thanks to evals, developers can quickly identify the areas with the highest measurable impact and focus improvement efforts there.
Trust and Transparency: Rigorous evaluation means you can rely on the outputs, knowing they’ve been tested against the scenarios that matter most to your business.
The future of sales is agentic, data-driven, and relentlessly focused on quality. With Microsoft’s Sales Research Agent and the Sales Research Bench evaluation framework, sales leaders can move beyond hype and make decisions grounded in demonstrated quality. It’s not just about having the smartest AI; it’s about having a trustworthy partner for your business transformation.
Raising the bar for Enterprise AI
The Sales Research Agent in Dynamics 365 Sales automatically connects to live CRM data and can connect to additional data stored elsewhere, such as budgets and targets. It reasons over complex, customized schemas with deep domain expertise, and presents novel, decision-ready insights through text-based narratives and rich data visualizations tailored to the business question at hand.
For sales leaders, this means the ability to build rich research journeys on a self-serve basis, spanning CRM and other domains, that previously took many people days or weeks to compile, with access to deeper insights enabled by the power of AI on pipeline, revenue attainment, and other critical topics.
But the market is crowded with offers that may or may not deliver acceptable levels of quality to support business decisions. How can business leaders know what’s truly enterprise ready? To help make sure customers do not have to rely on anecdotal evidence or “gut feel”, any vendor providing AI solutions must earn trust through clear, repeatable metrics that demonstrate quality, showing where the AI excels, where it needs improvement, and how it stacks up against alternatives.
Figure 1. The Sales Research Agent in the Dynamics 365 Sales Hub.
This post introduces the architecture, evaluation methodology, and results behind Microsoft’s Sales Research Agent. Its technical innovations distinguish the Sales Research Agent from other available offerings, from multi-agent orchestration and multi-model support to advanced techniques for schema intelligence, self-correction, and validation. In determining how best to evaluate the Sales Research Agent, Microsoft reviewed existing AI benchmarks and ultimately decided to create the Sales Research Bench, a new benchmark purpose-built to measure the quality of AI-powered sales research on business data, in alignment with the business questions, needs, and priorities of sales leaders. In head-to-head evaluations completed on October 19, 2025, the Sales Research Agent outperformed Claude Sonnet 4.5 by 13 points and ChatGPT-5 by 24.1 points on a 100-point scale.
Figure 2. Sales Research Bench Composite Score Results.
[1] Results: Results reflect testing completed on October 19, 2025, applying the Sales Research Bench methodology to evaluate Microsoft’s Sales Research Agent (part of Dynamics 365 Sales), ChatGPT by OpenAI using a ChatGPT Pro license with GPT-5 in Auto mode, and Claude Sonnet 4.5 by Anthropic using a Claude Max license.
Methodology and evaluation dimensions: Sales Research Bench includes 200 business research questions relevant to sales leaders that were run on a sample customized data schema. Each AI solution was given access to the sample dataset using an access mechanism aligned with its architecture. The responses each solution generated to each business question, including text and data visualizations, were scored by LLM judges. We evaluated quality on 8 dimensions, weighting each according to qualitative input from customers about what they value most in AI tools for sales research: Text Groundedness (25%), Chart Groundedness (25%), Text Relevance (13%), Explainability (12%), Schema Accuracy (10%), Chart Relevance (5%), Chart Fit (5%), and Chart Clarity (5%). Each dimension received a score from an LLM judge, from 20 as the worst rating to 100 as the best. For example, the LLM judge would give a score of 100 for chart clarity if the chart is crisp and well labeled, and a score of 20 if the chart is unreadable or misleading. Text Groundedness and Text Relevance used Azure Foundry’s out-of-box LLM evaluators, while judging for the other six dimensions used OpenAI’s GPT 4.1 model with specific guidance. A total composite score was calculated as a weighted average of the 8 dimension-specific scores. More details on the methodology can be found in the rest of this blog.
Microsoft will continue to use the evals in Sales Research Bench to drive continuous improvement of the Sales Research Agent, and Microsoft intends to publish the full evaluation package in the coming months, so others can run it to verify published results or benchmark the agents they use (example evals from the benchmark are included in this paper).
Sales Research Agent architecture
The architecture of the Sales Research Agent sets it apart from other offerings, delivering both technical innovation and business value.
Multi-Agent Orchestration: The Sales Research Agent uses a dynamic multi-agent infrastructure that orchestrates the development of research blueprints: text-based narratives and data visualizations accompanied by an explanation of the agent’s work. Specialized agents are invoked at each step in the journey to deliver domain-optimized insights for user questions, taking organizational data as well as business and user context into account.
Multi-Model Support: This multi-agent infrastructure enables each specialized agent to use the model that is best suited to the task at hand. Microsoft tests how each specialized agent performs with different models. Models are easily swapped out to continue optimizing the Sales Research Agent’s quality as the models available evolve over time.
Support for Business Language: There is a difference between business language (how business users naturally communicate) and natural language (any language that is not code). The Sales Research Agent can give quality answers to prompts in business language because it breaks down the prompt into multiple sub-questions, building a research plan and using multi-step reasoning over connected data sources. Additionally, the Sales Research Agent is infused with knowledge of the Sales domain, so it can correctly interpret terminology and context that is only implicit in the user’s prompt.
Schema Intelligence: The Sales Research Agent can handle both out-of-the-box and customized enterprise schemas, adapting to complex, real-world environments. It has sophisticated techniques and heuristics built in to recognize the tables and columns that are relevant to the user query.
Self-Correction and Validation: The Sales Research Agent incorporates advanced auto-correction mechanisms for its generated responses. Whether producing SQL or Python code, the agent leverages sophisticated code correctors capable of iterative refinement—reviewing, validating, and amending outputs as needed. The correction loop begins with a fast, non-reasoning model to identify and fix straightforward issues. If errors persist, the system escalates to a reasoning model and, if required, a more powerful model to ensure deeper contextual understanding and precise correction. This dynamic, multi-model process helps to ensure that the final code is both accurate and reliable, enhancing the overall quality and trustworthiness of the agent’s insights and recommendations.
Explainability: The system tracks every agent interaction and decision, as well as the SQL query and Python code generated to produce the research blueprint. The Sales Research Agent uses this information to help users quickly verify its accuracy and trace its reasoning. Each blueprint includes Show Work, an explanation in simple language for business users, with an advanced view of SQL queries and more details for technical users.
Figure 3. A high-level diagram of Sales Research Agent’s architecture and how it connects to business workflows
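The escalation described under Self-Correction and Validation can be pictured as a tiered retry loop. The sketch below is illustrative only: the post does not publish the agent's implementation, so the Tier type, the correct_with_escalation function, the attempt limit, and the toy validator and correctors are all hypothetical stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures: a corrector takes (code, error) and returns revised code;
# a validator returns None when the code passes, otherwise an error message.
Corrector = Callable[[str, str], str]
Validator = Callable[[str], Optional[str]]


@dataclass
class Tier:
    name: str
    corrector: Corrector
    max_attempts: int = 2  # assumed limit; the post does not state one


def correct_with_escalation(code: str, validate: Validator, tiers: List[Tier]) -> Tuple[str, str]:
    """Try each tier in order and return (code, name of the tier that fixed it)."""
    error = validate(code)
    if error is None:
        return code, "no correction needed"
    for tier in tiers:
        for _ in range(tier.max_attempts):
            code = tier.corrector(code, error)
            error = validate(code)
            if error is None:
                return code, tier.name
    return code, "unresolved"


# Toy stand-ins so the loop can run without any model.
def toy_validate(sql: str) -> Optional[str]:
    if " FROM " not in f" {sql} ":
        return "missing FROM clause"
    if not sql.rstrip().endswith(";"):
        return "missing semicolon"
    return None


def fast_fix(sql: str, error: str) -> str:
    # Stands in for the fast, non-reasoning model: handles only trivial issues.
    return sql.rstrip() + ";" if "semicolon" in error else sql


def reasoning_fix(sql: str, error: str) -> str:
    # Stands in for the reasoning model: can also repair the structural issue.
    if "FROM" in error:
        return sql.rstrip().rstrip(";") + " FROM opportunity;"
    return fast_fix(sql, error)


TIERS = [
    Tier("fast", fast_fix),
    Tier("reasoning", reasoning_fix),
    Tier("more powerful", reasoning_fix),  # placeholder for the final escalation step
]

print(correct_with_escalation("SELECT COUNT(*) FROM opportunity", toy_validate, TIERS))
print(correct_with_escalation("SELECT COUNT(*)", toy_validate, TIERS))
```

In this toy run the first query is repaired by the fast tier, while the second only passes validation after escalating to the reasoning tier. Keeping the cheap tier first means the more expensive tiers are called only when the simple fixes fail.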
Why Enterprise Sales Requires a New Evaluation Framework
In traditional software, unit tests give repeatable proof that core behaviors work and keep working. For AI solutions, evaluations (evals) are needed to demonstrate quality and track continuous improvement over time.
Enterprises deserve evaluations that are purpose-built for their needs. While there is a wide range of pioneering work on AI evaluation, existing benchmarks miss key attributes that are needed for an AI solution to guide critical business decisions:
The benchmark must reflect the strategic, multi-faceted business questions of sales leaders using their business language.
The benchmark must measure schema accuracy: whether the system correctly handles tables, columns, and joins on system of record schemas that can be highly customized.
The benchmark should assess insights across both text-based narratives and data visualizations, reflecting the outputs with which leaders make decisions.
Introducing Sales Research Bench for AI-powered Sales Research
To meet these demands, Microsoft developed the Sales Research Bench, a composite quality score built to evaluate AI-powered Sales Research solutions in close alignment with customers’ actual questions, environments, and priorities. From engagements with customer sales teams across industries and geographies, Microsoft identified the critical dimensions of quality and created real-world business questions in the language sales leaders use. The data schema on which the evaluations take place is customized to reflect the complexities of customers’ enterprise environments, with their layered business logic and nuanced operational realities. The result is a rigorous benchmark presenting a composite score based on 8 weighted dimensions, as well as dimension-specific scores to reveal where agents excel or need improvement.
Benchmark Methodology
The evaluation infrastructure for Sales Research Bench includes:
Eval Datasets: 200 business questions in the language of sales leaders, each associated with its own set of ground-truth answers for validation.
Sample enterprise dataset: Eval questions run on a customized schema, reflecting the complexities of enterprise environments.
Evaluators: LLM-judge-based evaluation, tailored for each of the 8 quality dimensions described below. Azure Foundry out-of-box evaluators are used for Text Groundedness and Text Relevance. For the other 6 dimensions, OpenAI’s GPT 4.1 model is used with specific guidelines on how to score answers, which are provided in the appendix.
Here are 3 of the 200 evaluation questions informed by real sales leader questions:
Looking at closed opportunities, which sellers have the largest gap between Total Actual Sales and Est Value First Year in the ‘Corporate Offices’ Business Segment?
Are our sales efforts concentrated on specific industries or spread evenly across industries?
Compared to my headcount on paper (30), how many people are actually in seat and generating pipeline?
Dimensions of Quality
The Sales Research Bench aggregates eight dimensions of quality, weighting them as shown in the parentheses below to reflect what we have heard customers say they value most in AI tools for sales research during their engagements with Microsoft.
Text Groundedness (25%): Ensures narratives are accurate, are faithful to the sample enterprise data, and apply correct business definitions.
Chart Groundedness (25%): Validates that charts accurately represent the underlying data from the same enterprise dataset.
Text Relevance (13%): Measures how relevant the insights in the text-based narrative are to the business question.
Explainability (12%): Ensures the AI solution accurately and clearly explains how it arrived at its responses.
Schema Accuracy (10%): Verifies the correct selection of tables and columns by evaluating whether the generated SQL query is consistent with the tables, joins, and columns in the ground-truth answers. (Business applications typically consist of approximately 1,000 tables, many featuring around 200 columns, all of which can be highly customized by customers.)
Chart Relevance (5%): Validates whether the data and analysis shown in the chart are relevant to the business question.
Chart Fit (5%): Evaluates if the chosen visualization matches the analytical need (e.g., line for trends, bar for comparisons).
Chart Clarity (5%): Assesses readability, labeling, accessibility, and chart hygiene.
Each of these dimensions received a score from an LLM judge, from 20 as the worst rating to 100 as the best. For example, the LLM judge would give a score of 100 for chart clarity if the chart is crisp and well labeled, and a score of 20 if the chart is unreadable or misleading.
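To make the weighting concrete, here is a minimal sketch of the weighted average for one set of dimension scores. The weights are the ones listed above; the function name, the dictionary keys, and the example scores are hypothetical, and the post does not specify how per-question scores are aggregated across the 200 questions.

```python
# Weights in percent, as listed in the post (they sum to 100).
WEIGHTS = {
    "text_groundedness": 25,
    "chart_groundedness": 25,
    "text_relevance": 13,
    "explainability": 12,
    "schema_accuracy": 10,
    "chart_relevance": 5,
    "chart_fit": 5,
    "chart_clarity": 5,
}


def composite_score(dimension_scores):
    """Weighted average of the 8 dimension scores, each expected in [20, 100]."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 20 <= dimension_scores[name] <= 100:
            raise ValueError(f"{name} score out of range: {dimension_scores[name]}")
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for a single response, not results from the benchmark.
example = {
    "text_groundedness": 80,
    "chart_groundedness": 60,
    "text_relevance": 100,
    "explainability": 80,
    "schema_accuracy": 100,
    "chart_relevance": 60,
    "chart_fit": 100,
    "chart_clarity": 60,
}
print(composite_score(example))  # 78.6
```

Because the two groundedness dimensions carry half the total weight, a response with weak grounding cannot reach a high composite score even if its charts are clear and well chosen.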
Sample Enterprise Dataset
Evaluation needs representative conditions to be useful. Through customer engagements, Microsoft identified numerous edge cases from highly customized schemas, complex joins and filters, and nuanced business logic (like pipeline coverage and attainment calculations).
For instance, most customers customize their schemas with custom tables and columns, such as replacing an industry column with an industry table, and linking it to the customer object, or adding market and business segment instead of using an existing segment field. As a result, their environments often contain both the out-of-box tables and columns as well as customized tables and fields, all with similar names. By systematically incorporating these edge cases into the sample custom schema, Sales Research Bench evaluates how agents perform outside of the “happy path” to assess enterprise readiness.
Figure 4. Example evaluation case (see the Appendix for more examples)
Evaluating Sales Research Agent and Other Solutions
In addition to the Sales Research Agent, Microsoft evaluated ChatGPT by OpenAI using a Pro license with GPT-5 in Auto mode and Claude Sonnet 4.5 by Anthropic using a Max license. The licenses were chosen to optimize for quality: ChatGPT’s pricing page describes Pro as “full access to the best of ChatGPT,” while Claude’s pricing page recommends Max to “get the most out of Claude.”[1] Similarly, ChatGPT’s evaluation was run using Auto mode, a setting that allows ChatGPT’s system to determine the best-suited model variant for each prompt.
Microsoft implemented a controlled evaluation environment where all systems (Sales Research Agent, ChatGPT-5, and Claude Sonnet 4.5) worked with identical questions and data, but through different access mechanisms aligned with their respective architectures.
The Sales Research Agent has a native multi-agent orchestration layer that connects directly to Dynamics 365 Sales data. This allows it to autonomously discover schema relationships and entity dependencies, and to perform natural-language-to-query reasoning natively within its own orchestration stack.
Since ChatGPT and Claude do not support relational line-of-business source systems out of box, Microsoft enabled access to the same dataset by mirroring it into an Azure SQL instance. Mirroring was done to preserve all the data types, primary keys, foreign keys, and relationships between tables from Dataverse to Azure SQL. This Azure SQL copy was exposed through the MCP SQL connector, ensuring that ChatGPT and Claude retrieved the exact same data but through a standardized external interface. Once responses were captured, they were evaluated using the same evaluators against the exact same evaluation rubrics.
Finally, prompts to ChatGPT and Claude included instructions to create charts and to explain how they got to their answers. (Sales Research Agent has this functionality out of box.)
In a test of 200 evals on the customized schema, Sales Research Agent earned a composite score of 78.2 on a 100-point scale, while Claude Sonnet 4.5 earned 65.2 and ChatGPT-5 earned 54.1.
The chart below presents the Sales Research Bench composite scores, with scores for each dimension overlaid on the bars within the stacked bar chart.
Figure 5. Sales Research Bench Composite Scores with Dimension-specific Scores.
Breaking this down, the Sales Research Agent outperformed other solutions on all 8 dimensions, with the biggest deltas in chart-related dimensions (groundedness, fit, clarity, and relevance), and the smallest deltas in schema accuracy and text groundedness. Claude Sonnet 4.5 outperformed ChatGPT-5 on all 8 dimensions, with the biggest delta in chart clarity and the smallest delta in chart relevance.
Figure 6. Sales Research Bench Scores by Dimension.
Looking Ahead
Sales Research Agent introduces a new-generation, AI-first business application that transforms how sales leaders can approach and solve complex business questions. The Sales Research Bench was created in parallel to represent a new standard for enterprise AI evaluation: rigorous, comprehensive, and aligned with the needs and priorities of sales leaders.
Upcoming plans for the Sales Research Bench include using the benchmark for continuous improvement of the Sales Research Agent, running further comparisons against a wider range of competitive offerings, and publishing the eval package so customers can run it themselves to verify the published results and benchmark the agents they use. Evaluation is not a one-time event. Scores can be tracked across releases, ensuring that AI solutions evolve to meet customer needs.
Looking beyond Sales Research Bench, Microsoft plans to develop eval frameworks and benchmarks for more business functions and agentic solutions, in customer service, finance, and beyond. The goal is to set a new standard for trust and transparency in enterprise AI.
Appendix:
Scoring Guidelines provided to LLM Judges
Text Groundedness and Text Relevance used Azure Foundry’s out-of-box LLM evaluators. Below are the guidelines provided to the LLM judges for the other six quality dimensions. These judges use OpenAI’s GPT 4.1 model.
Schema accuracy:
100: Perfect match – all golden tables and columns are present (extra columns OK, Dynamics equivalents OK)
80: Very good – minor missing columns or one missing table
60: Good – some important columns or tables missing but core schema is there
40: Fair – significant schema differences but some overlap
20: Poor – major schema mismatch or completely different tables
Explainability:
100 (Excellent): Explanation is highly detailed, perfectly describes what the generated SQL does, technically accurate, and provides clear business context
80 (Good): Explanation is sufficiently detailed and mostly accurate with minor gaps in describing the SQL operations
60 (Fair): Explanation provides adequate detail but misses some important SQL operations or has minor inaccuracies
40 (Poor): Explanation lacks sufficient detail to understand the SQL operations or has significant inaccuracies
20 (Very Poor): Explanation is too vague, mostly incorrect, or provides insufficient detail about the generated SQL
Chart Groundedness:
100: Data accurately matches ground truth OR both ground truth & chart empty
80: Minor data inaccuracies
60: Some data inaccuracies
40: Major data inaccuracies
20: data completely mismatches ground truth
Chart Relevance:
100: Question and chart strongly reinforce each other OR both ground truth & chart empty
60: Question and chart loosely align but with some disconnect
20: Question and chart do not align at all
Chart Fit:
100: Optimal chart choice for the task OR both ground truth & chart empty (appropriate emptiness)
60: Acceptable chart choice but not optimal for the task
20: inappropriate/confusing chart type
Chart Clarity:
100: Chart is crisp and well-labeled OR both ground truth & chart empty
60: Chart readable but missing labels/clarity elements
20: Chart unreadable, misleading
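The guidelines above use two scales: five levels (100, 80, 60, 40, 20) for Schema Accuracy, Explainability, and Chart Groundedness, and three levels (100, 60, 20) for the remaining chart dimensions. A small sketch, assuming a judge returns a plain integer per dimension, shows how a score could be checked against the allowed levels before it enters the composite. The names RUBRIC_LEVELS and check_judge_score are hypothetical and are not part of the published evaluation package.

```python
FIVE_LEVEL = (100, 80, 60, 40, 20)
THREE_LEVEL = (100, 60, 20)

# Allowed score levels per dimension, transcribed from the scoring guidelines.
# Text Groundedness and Text Relevance are omitted because they use Azure
# Foundry's out-of-box evaluators rather than these rubrics.
RUBRIC_LEVELS = {
    "schema_accuracy": FIVE_LEVEL,
    "explainability": FIVE_LEVEL,
    "chart_groundedness": FIVE_LEVEL,
    "chart_relevance": THREE_LEVEL,
    "chart_fit": THREE_LEVEL,
    "chart_clarity": THREE_LEVEL,
}


def check_judge_score(dimension, score):
    """Return the score if it is an allowed level for the dimension, else raise."""
    if dimension not in RUBRIC_LEVELS:
        raise ValueError(f"no rubric defined for dimension: {dimension}")
    if score not in RUBRIC_LEVELS[dimension]:
        allowed = RUBRIC_LEVELS[dimension]
        raise ValueError(f"{dimension} score {score} is not one of {allowed}")
    return score


print(check_judge_score("explainability", 80))  # 80
print(check_judge_score("chart_fit", 60))       # 60
```

A check like this catches a judge that returns an in-between value, such as 80 for Chart Fit, which the three-level rubric does not define.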
Examples of Evaluation dataset:
Below are some of the evaluation datasets that we have used to benchmark the performance of Sales Research Agent against all the evaluation rubrics mentioned above. These same questions were also evaluated against the competitive offerings.
Evaluation Dataset One
{
  "question": "Looking at closed opportunities, which sellers have the largest gap between Total Actual Sales and Est Value First Year in the 'Corporate Offices' Business Segment?",
  "difficulty": "hard",
  "sql": [
    "SELECT su.[fullname] AS [seller_name],",
    " COUNT(*) AS [closed_deals],",
    " SUM(CAST(COALESCE(o.[sop_totalactualsales], o.[actualvalue_base]) AS DECIMAL(38,2))) AS [total_actual_sales],",
    " SUM(CAST(o.[sop_estvaluefirstyear_base] AS DECIMAL(38,2))) AS [total_est_value_first_year],",
    " SUM(CAST(COALESCE(o.[sop_totalactualsales], o.[actualvalue_base]) AS DECIMAL(38,2)))",
    " - SUM(CAST(o.[sop_estvaluefirstyear_base] AS DECIMAL(38,2))) AS [sales_gap]",
    "FROM [dbo].[opportunity] AS o",
    "JOIN [dbo].[systemuser] AS su ON CAST(o.[ownerid] AS NVARCHAR(36)) = CAST(su.[systemuserid] AS NVARCHAR(36))",
    "JOIN [dbo].[sop_businesssegment] AS bs ON CAST(o.[sop_businesssegment] AS NVARCHAR(36)) = CAST(bs.[sop_businesssegmentid] AS NVARCHAR(36))",
    "WHERE o.[statecodename] = 'Won' AND bs.[sop_name] = 'Corporate Offices' AND su.[fullname] <> '' AND o.[sop_estvaluefirstyear_base] IS NOT NULL",
    "GROUP BY su.[fullname]",
    "HAVING SUM(CAST(COALESCE(o.[sop_totalactualsales], o.[actualvalue_base]) AS DECIMAL(38,2))) IS NOT NULL",
    "ORDER BY [sales_gap] DESC;"
  ],
  "tags": [ "seller-performance", "variance", "actuals-vs-estimate" ],
  "ground_truth": {
    "structured": [
      {
        "columns": [ "seller_name", "closed_deals", "total_actual_sales", "total_est_value_first_year", "sales_gap" ],
        "rows": [
          [ "Jenny Chambers", 3, 44501.69, 16010.15, 28491.54 ],
          [ "Heather Rogers", 1, 21501.05, 4190.57, 17310.48 ],
          [ "Grace Rice", 1, 21223.33, 6789.20, 14434.13 ],
          [ "Ann Rice", 1, 3243.23, 7267.77, -4024.54 ]
        ]
      }
    ],
    "unstructuredtext": "Largest positive gaps: Jenny Chambers (+$28.49K), Heather Rogers (+$17.31K), and Grace Rice (+$14.43K). Ann Rice under-shot estimate (−$4.02K).",
    "evaluationNotes": "Gap = Total Actual Sales − Est First Year; Corporate Offices segment only; closed (Won) opps."
  }
}
Evaluation Dataset Two
{
  "question": "Are our sales efforts concentrated on specific industries or spread evenly across industries?",
  "difficulty": "medium",
  "sql": [
    "SELECT ",
    " [sop_industry].[sop_name] AS [industry_name],",
    " COUNT([opportunity].[opportunityid]) AS [total_opportunity_count],",
    " COUNT(CASE ",
    " WHEN [opportunity].[statecodename] NOT IN ('Won','Lost','Canceled') ",
    " THEN 1 ",
    " END) AS [open_opportunity_count]",
    "FROM ",
    " [opportunity]",
    "INNER JOIN ",
    " [account] ON CAST([opportunity].[parentaccountid] AS NVARCHAR(36)) = CAST([account].[accountid] AS NVARCHAR(36))",
    "INNER JOIN ",
    " [sop_industry] ON CAST([account].[sop_industry] AS NVARCHAR(36)) = CAST([sop_industry].[sop_industryid] AS NVARCHAR(36))",
    "GROUP BY ",
    " [sop_industry].[sop_name]",
    "HAVING ",
    " COUNT([opportunity].[opportunityid]) > 0",
    "ORDER BY ",
    " [open_opportunity_count] DESC;"
  ],
  "tags": [ "industry", "concentration", "open-vs-total" ],
  "ground_truth": {
    "structured": [
      {
        "columns": [ "industry_name", "total_opportunity_count", "open_opportunity_count" ],
        "rows": [
          [ "Legal Services", 1352, 240 ],
          [ "Insurance", 1210, 212 ],
          [ "Non-Durable Merchandise Retail", 946, 177 ],
          [ "Inbound Repair and Services", 695, 126 ],
          [ "Outbound Consumer Service", 740, 124 ],
          [ "Design, Direction and Creative Management", 719, 119 ],
          [ "Building Supply Retail", 633, 118 ],
          [ "Durable Manufacturing", 569, 111 ],
          [ "Business Services", 597, 108 ],
          [ "Broadcasting Printing and Publishing", 597, 104 ],
          [ "Accounting", 551, 104 ],
          [ "Distributors, Dispatchers and Processors", 562, 104 ],
          [ "Financial", 606, 102 ],
          [ "Consulting", 532, 100 ],
          [ "Agriculture and Non-petrol Natural Resource Extraction", 586, 95 ],
          [ "Doctor's Offices and Clinics", 497, 90 ],
          [ "Brokers", 579, 90 ],
          [ "Food and Tobacco Processing", 489, 86 ],
          [ "Consumer Services", 451, 81 ],
          [ "Eating and Drinking Places", 448, 76 ],
          [ "Equipment Rental and Leasing", 425, 74 ],
          [ "Entertainment Retail", 429, 73 ],
          [ "Inbound Capital Intensive Processing", 419, 71 ]
        ]
      }
    ],
    "unstructuredtext": "Effort is broad but skewed: Legal Services and Insurance have the most total opps, while several industries maintain 70–120 open opps.",
    "evaluationNotes": "Counts total vs open opps per industry; ordered by open count."
  }
}
Evaluation Dataset Three
{
  "question": "Compared to my headcount on paper (30), how many people are actually in seat and generating pipeline?",
  "difficulty": "medium",
  "sql": [
    "WITH open_opps AS (",
    " SELECT o.*",
    " FROM opportunity o",
    " WHERE o.statecodename NOT IN ('Won','Lost','Canceled')",
    ")",
    "SELECT",
    " CAST(30 AS INT) AS headcount_on_paper,",
    " COUNT(DISTINCT open_opps.ownerid) AS active_pipeline_users,",
    " (30 - COUNT(DISTINCT open_opps.ownerid)) AS delta_needed,",
    " (SELECT COUNT(*) FROM opportunity) AS total_opportunities,",
    " (SELECT COUNT(*) FROM open_opps) AS open_opportunities,",
    " (SELECT SUM(CAST(o2.estimatedvalue_base AS DECIMAL(38,2))) FROM open_opps o2) AS open_pipeline_value;"
  ],
  "tags": [ "capacity", "headcount", "pipeline" ],
  "ground_truth": {
    "structured": [
      {
        "columns": [ "headcount_on_paper", "active_pipeline_users", "delta_needed", "total_opportunities", "open_opportunities", "open_pipeline_value" ],
        "rows": [
          [ 30, 7, 23, 14860, 2662, 16047760.29 ]
        ]
      }
    ],
    "unstructuredtext": "Only 7 sellers have active pipeline against a plan of 30 (shortfall of 23). Open pipeline totals $16.05M across 2,662 opps.",
    "evaluationNotes": "Active sellers counted as distinct owners on current pipeline."
  }
}
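The ground-truth rows in these datasets can be sanity-checked against their own evaluation notes. The sketch below is not part of the benchmark tooling; it copies the rows from Evaluation Dataset One and the single row from Evaluation Dataset Three and confirms that the derived columns follow from the stated definitions. The helper names are hypothetical, and amounts are kept as strings so that Decimal arithmetic stays exact.

```python
from decimal import Decimal

# (seller_name, closed_deals, total_actual_sales, total_est_value_first_year, sales_gap)
# copied from the ground truth of Evaluation Dataset One.
DATASET_ONE_ROWS = [
    ("Jenny Chambers", 3, "44501.69", "16010.15", "28491.54"),
    ("Heather Rogers", 1, "21501.05", "4190.57", "17310.48"),
    ("Grace Rice", 1, "21223.33", "6789.20", "14434.13"),
    ("Ann Rice", 1, "3243.23", "7267.77", "-4024.54"),
]


def gap_mismatches(rows):
    """Return sellers whose sales_gap differs from actual minus estimate."""
    return [
        seller
        for seller, _deals, actual, estimate, gap in rows
        if Decimal(actual) - Decimal(estimate) != Decimal(gap)
    ]


def is_sorted_by_gap_desc(rows):
    """Check the ORDER BY [sales_gap] DESC ordering of the ground-truth rows."""
    gaps = [Decimal(row[4]) for row in rows]
    return gaps == sorted(gaps, reverse=True)


# Evaluation Dataset Three: delta_needed should equal headcount minus active users.
headcount_on_paper, active_pipeline_users, delta_needed = 30, 7, 23

print(gap_mismatches(DATASET_ONE_ROWS))         # []
print(is_sorted_by_gap_desc(DATASET_ONE_ROWS))  # True
print(headcount_on_paper - active_pipeline_users == delta_needed)  # True
```

The same pattern extends to comparing an agent's returned rows with the ground truth, although the benchmark itself relies on LLM judges rather than exact matching.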
The Finance solution in Microsoft 365 Copilot is now generally available
Finance plays a critical role in helping organizations make confident, data-driven decisions. Yet despite decades of automation, much of finance work still happens in spreadsheets and emails. Teams spend hours reconciling data from multiple systems, investigating variances, or fielding ad-hoc questions about budgets, spend, or invoices. The result is slower insight, longer close cycles, and less time for strategic analysis.
The Finance solution in Microsoft 365 Copilot, formerly Microsoft Copilot for Finance, is now generally available, helping finance teams bring ERP-connected data and workflows directly into the flow of work. Built with Microsoft 365 Copilot, this role-based AI solution connects to your existing systems of record, such as Microsoft Dynamics 365 Finance or SAP. It also infuses AI assistance into the tools you already use every day, like Excel and Outlook.
The result: faster financial operations, fewer manual handoffs, and better collaboration between finance, business teams, and IT.
A New Way to Work With Finance Data
Microsoft 365 Copilot bridges productivity tools and enterprise systems, so financial information becomes conversational and accessible. Instead of switching between applications or waiting for manually generated reports, you can simply ask questions in natural language:
“Identify the key drivers for forecast variances for March.”
“Highlight period over period trends across regions.”
“Draft a response to the customer regarding the last payment.”
Copilot interprets the request, retrieves data from ERP systems when needed under your existing governance controls, and provides traceable, actionable answers. It does not just list figures: it highlights anomalies, explains the drivers of change, and creates draft narratives ready for review or sharing.
This connected experience reduces repetitive work. It also shortens the time between question and answer, and keeps financial insights grounded in governed, auditable data.
Core Capabilities Now Available
The Finance solution in Microsoft 365 Copilot delivers a suite of capabilities designed to simplify financial operations, improve accuracy, and enhance productivity across the finance organization.
Financial Reconciliation (Generally Available)
Reconciliation has always been one of finance’s most time-consuming tasks: matching transactions, detecting exceptions, and validating balances. Copilot transforms this process into an interactive experience.
It identifies unmatched transactions, detects potential differences, and suggests next steps. You can review and confirm matches directly in Excel, reducing manual work and improving audit confidence. Show the desired workflow to Copilot once, save it as a template, and set up an AI action so the same steps run on a regular basis. The results can be emailed directly to your inbox. Organizations piloting these capabilities have reduced reconciliation time from days to hours while improving overall data quality.
Customer Communications in Outlook (Public Preview)
Finance teams often handle hundreds of customer inquiries via email: checking payment status, confirming invoices, or clarifying balances. With Copilot in Outlook, these messages become opportunities for automation. When an inquiry arrives, Copilot drafts a context-aware reply that includes relevant invoice details or payment confirmations pulled directly from ERP data. Finance professionals can review and send with confidence, knowing each response is accurate, consistent, and aligned with company records.
Variance Analysis (Public Preview)
When actuals deviate from the forecast, finance teams must quickly understand why. Variance analysis in Copilot accelerates this process. It identifies anomalies or shifts in financial performance and uses natural language to explain key drivers, such as currency fluctuations, delayed revenue recognition, or cost overruns. It can even draft summary explanations for management reporting. Instead of spending hours building pivot tables, finance teams can spend minutes reviewing insights and refining recommendations.
Data Preparation in Excel (Public Preview)
Preparing data for analysis can consume more time than the analysis itself. Through the Finance solution, Copilot automates this step. When ERP data is exported into Excel, Copilot recognizes column types, fills missing values, and reshapes tables into analysis-ready formats. The result is cleaner, standardized data for forecasting, reporting, and machine-learning models—all produced in a fraction of the time.
Together, these capabilities give finance professionals a connected, AI-assisted workflow across Microsoft 365, where every task, from reconciliation to communication, happens faster, with fewer errors and greater insight.
Enterprise-Grade Security and Governance
The Finance solution is built on the same trusted security foundation as Microsoft 365. All interactions honor existing role-based access, compliance, and audit controls, ensuring users only see the data they’re authorized to view. Finance data never leaves your governed environment, and all prompts and responses remain subject to your organization’s security, data-loss-prevention, and privacy policies.
For IT leaders, this design delivers confidence that Copilot operates within the same enterprise boundaries as your other Microsoft 365 workloads—no additional infrastructure or integration complexity required. Identity management, permissions, and governance remain consistent across finance, sales, and service scenarios.
Deployment and Management Made Simple
IT administrators can deploy the Finance solution directly from Microsoft AppSource, making it easy to discover, install, and configure without custom integration work. Once installed, the solution can be connected to your organization’s ERP systems, such as Dynamics 365 Finance or SAP, through guided setup experiences.
Because it runs within Microsoft 365 Copilot, deployment aligns with your existing Microsoft 365 tenant configuration. There’s no new infrastructure to provision and no separate AI environment to secure. Administrators can manage permissions, configure data connections, and monitor adoption through familiar Microsoft 365 admin centers.
Finance leaders, meanwhile, can roll out Copilot incrementally, starting with high-impact tasks like reconciliation and variance analysis, before expanding to broader finance workflows across teams and regions.
How to Get Started
The Finance solution in Microsoft 365 Copilot is designed to integrate with your current environment quickly. Here’s how to begin:
Check prerequisites – Ensure your organization is licensed for Microsoft 365 Copilot and that the users who will access the Finance solution have permissions aligned with your ERP system (for example, Dynamics 365 Finance or SAP).
Visit Microsoft AppSource – Search for Finance in Microsoft 365 Copilot and initiate the installation.
Connect your ERP system – Use the guided configuration experience to establish a secure connection between Copilot and your ERP environment. All credentials and permissions remain governed by your existing identity and compliance policies.
Assign access and roles – Within the Microsoft 365 admin center, assign appropriate access to finance teams and business users based on their roles.
Start using Copilot – Launch Excel or Outlook and begin exploring finance-related tasks, such as reconciliation support, variance explanations, or drafting customer communications.
Monitor and optimize adoption – IT can track usage, gather feedback, and adjust configurations as needed through existing Microsoft 365 management tools.
By bringing ERP data, productivity tools, and AI assistance together, the Finance solution in Microsoft 365 Copilot helps organizations move from static systems of record to dynamic systems of action. Finance teams no longer have to wait for reports or toggle between applications. They can access insights instantly, collaborate seamlessly, and act with confidence.
For finance professionals, that means faster close cycles and clearer insights. For business leaders, it means timely answers grounded in governed data. For IT, it means secure scalability across the Microsoft cloud.
This is finance reimagined for the agentic AI era—more connected, conversational, and compliant by design.
Learn More
The Finance solution in Microsoft 365 Copilot is now generally available. Explore how you can bring AI assistance into your finance organization today.
Visit Microsoft AppSource to download and configure the solution.
Review Microsoft Learn documentation for setup and administration guidance.
Discover more about Microsoft 365 Copilot and role-based AI solutions across Sales, Service, and Finance.
With Microsoft 365 Copilot bringing finance together with your ERP data, you can accelerate decision-making, enhance data confidence, and empower every finance professional to do more, directly within Microsoft 365.
As customer service teams strive to deliver faster, more personalized support, the need for tailored productivity enhancements has never been greater. With the introduction of custom productivity tools in Dynamics 365 Copilot Service Workspace (CSw), organizations can now equip customer service representatives with purpose-built utilities that streamline workflows, reduce clicks, and eliminate context switching.
From generic to purpose-built
While CSw offers a robust set of out-of-the-box tools, many organizations have unique operational needs that require more than standard capabilities. Custom productivity tools bridge this gap by allowing developers and admins to embed lightweight, task-specific utilities directly into the service rep experience.
Whether it’s a quick calculator for warranty eligibility, a guided script for onboarding, a mini-dashboard for SLA tracking, or a custom appointment scheduler as shown below, these tools empower service reps to work smarter without leaving their workspace.
Custom productivity tools are built using familiar web technologies (HTML, JavaScript, CSS) and hosted as web resources in Dataverse. Admins can surface these tools in the productivity pane or as side panels within sessions. As a result, they are contextually available based on the service rep’s workflow.
Key features include:
Context-aware rendering: Tools can access session data, such as customer ID or case type, to personalize functionality.
Two-way data flow: Tools can read from and write to Dynamics 365 records using Web API calls.
Lightweight deployment: No need for full app development—just upload and configure.
Real-world impact
Organizations are already using custom productivity tools to:
Automate repetitive tasks like case classification or knowledge article suggestions.
Provide service reps with quick-reference guides and calculators.
The result? Faster resolution times, fewer errors, and happier service reps.
Custom productivity tools are part of our broader vision to be the most flexible and user-friendly workspace in the industry. As we continue to invest in extensibility, these tools will play a key role in helping organizations tailor CSw to their unique service models without compromise.
We are excited to introduce sensitive data redaction in Dynamics 365 Contact Center, a major advancement in privacy-first AI for customer service. This new capability empowers organizations to deliver intelligent voice experiences while ensuring that customers’ sensitive information remains protected throughout the interaction.
This feature is designed specifically for human interaction with voice AI agents, allowing voice AI agent makers to flag variables as sensitive in Microsoft Copilot Studio. Once flagged, these variables are automatically redacted from all system-level outputs—including call recordings, transcriptions, and diagnostic logs—ensuring that no sensitive data is stored or exposed.
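As a purely conceptual illustration of flag-then-redact, the sketch below masks the values of variables marked as sensitive before a transcript is stored. It is not how Copilot Studio or Dynamics 365 Contact Center implement redaction; the `AgentVariable` type, the `redactTranscript` function, and the `[REDACTED]` mask are all hypothetical names chosen for this example.

```typescript
// Hypothetical representation of an agent variable and its sensitivity flag.
interface AgentVariable {
  name: string;
  value: string;
  sensitive: boolean;
}

// Replaces every occurrence of each sensitive value with a mask.
// Non-sensitive variables and empty values are left untouched.
function redactTranscript(
  transcript: string,
  variables: AgentVariable[],
  mask: string = "[REDACTED]"
): string {
  const secrets = variables
    .filter((v) => v.sensitive && v.value.length > 0)
    .map((v) => v.value)
    // Longest values first, so a short value contained in a longer one leaves no fragments.
    .sort((a, b) => b.length - a.length);

  let result = transcript;
  for (const secret of secrets) {
    // split/join performs a literal (non-regex) replace of all occurrences.
    result = result.split(secret).join(mask);
  }
  return result;
}
```

The design point this illustrates is that the sensitivity decision is made once, on the variable, and every downstream output is then derived from the redacted form rather than being sanitized separately.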
Built for privacy, designed for trust
Sensitive data redaction reflects Microsoft’s commitment to responsible AI and secure customer engagement. By embedding privacy controls directly into the conversational design process, organizations can confidently deploy voice AI agents that meet both customer expectations and regulatory requirements.
This feature supports a wide range of use cases, including:
Financial services: Redacting account numbers, PINs, and transaction details
Healthcare: Protecting patient identifiers and medical information
Public sector: Ensuring compliance with data handling standards for citizen services
Empowering contact center teams
With sensitive data redaction, contact center teams can:
Design privacy-aware voice AI agents using intuitive tools in Copilot Studio
Ensure compliance with global data protection regulations like GDPR, HIPAA, and PCI-DSS
Streamline operations by removing the need for manual data sanitization or external redaction tools
Build customer trust by transparently protecting sensitive information during voice interactions
Protecting privacy
This release marks a significant milestone in Microsoft’s journey to deliver secure, scalable, and intelligent contact center solutions. By enabling privacy-first voice AI experiences, we’re helping organizations modernize customer engagement while upholding the highest standards of data protection.
Frontline technicians juggle more than just work orders—they balance customer visits, team meetings, and personal commitments. Until now, keeping these schedules aligned often meant switching back and forth between apps.
With Release Wave 2 2025, we’re excited to announce the General Availability (GA) of Exchange Integration for Dynamics 365 Field Service. This capability syncs work order bookings directly into Outlook and Teams calendars, giving technicians a unified view of their schedules in one place, where they already collaborate with their team.
Why this matters
Field service organizations rely on accurate scheduling to keep operations running smoothly. Yet, frontline workers have traditionally had to check multiple sources—Field Service for bookings, Outlook for meetings, Teams for collaboration—just to piece together their day.
With this integration:
Work order bookings appear automatically in Outlook and Teams calendars.
Technicians see everything in one place—work assignments, team meetings, personal appointments.
Dispatchers reduce scheduling confusion, since work orders sync within 15 minutes.
The result: fewer missed updates, less app switching, and more time spent serving customers.
How the integration works
One-way sync: Bookings tied to work orders and created or edited in Dynamics 365 Field Service → Exchange (Outlook and Teams).
Fast updates: Bookings sync within 15 minutes.
Seamless experience: Technicians stay focused in Outlook and Teams with no extra steps.
Synced fields are not configurable: If changing which fields sync is essential, please upvote this idea on the product team’s Ideas portal and describe your scenario: Microsoft Idea
Important note for existing customers:
If your organization already uses an Exchange integration with Field Service to sync appointments, contacts, or tasks, you’ll notice an important change after GA: Field Service work order bookings will begin syncing into Outlook and Teams calendars.
To ensure a smooth transition:
– Prepare your users for this update to prevent unexpected duplicate or overlapping events.
– If needed, disable the sync for all user mailboxes or turn off the Exchange Online email server profile to opt out of this feature.
Getting started: Best practices
Train users to expect work order bookings in their Outlook and Teams calendars and explain what information will appear in appointments versus Field Service.
Roll out the integration progressively to the field while collecting feedback to identify where the integration is working, where it is not, and why.
Give feedback to the Microsoft Product team on feature gaps and changes you’d like to see using the Ideas portal.
Frequently Asked Questions
Q: When will Exchange Integration be generally available? A: Exchange Integration is available to all customers with Wave 2 2025.
Q: Does the sync work both ways? A: No. The sync is one-way, from Field Service to Exchange. Updates made in Outlook or Teams do not flow back to Field Service.
Q: How often does the sync run? A: Bookings appear in Outlook and Teams within 15 minutes.
Q: What happens if we already have an Exchange integration set up? A: You’ll begin to see Field Service work order bookings added to calendars after GA. Prepare users for this change to avoid confusion. Turn off the feature via mailbox or server profile settings if needed.
Q: Can we control which bookings sync? A: No. Only bookings related to work orders sync, and only within a window from 1 week in the past to 2 weeks into the future. Admins can manage who is set up for integration by enabling or disabling the sync on their mailbox in Dynamics.
Q: Do we need new licenses? A: No additional licenses are required beyond standard Dynamics 365 Field Service and Microsoft 365 licensing.
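The sync rules described in the answers above (work order bookings only, from 1 week in the past to 2 weeks into the future) can be sketched as a simple eligibility check. This is an illustration rather than the product's implementation: the `Booking` shape and function name are hypothetical, and measuring the window against the booking's start time with inclusive bounds is an assumption the article does not confirm.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical booking shape; only the fields needed for the check.
interface Booking {
  id: string;
  start: Date;
  workOrderId?: string; // absent when the booking is not tied to a work order
}

// Returns true when a booking would be eligible under the stated rules.
// Assumption: the window is evaluated against the booking start time, bounds inclusive.
function isEligibleForSync(booking: Booking, now: Date): boolean {
  if (!booking.workOrderId) {
    return false; // only work order bookings sync
  }
  const earliest = now.getTime() - 7 * DAY_MS; // 1 week into the past
  const latest = now.getTime() + 14 * DAY_MS; // 2 weeks into the future
  const start = booking.start.getTime();
  return start >= earliest && start <= latest;
}
```

A check like this can be useful when preparing users for rollout, for example to estimate which existing bookings are likely to appear in calendars once the sync is enabled.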
Conclusion
With Exchange Integration now generally available, your technicians gain a clearer view of their day, your dispatchers simplify scheduling, and your organization eliminates unnecessary friction.
As innovation speeds up, staying agile is essential. To keep your business ahead of the curve with innovation across Microsoft Dynamics 365, Microsoft Power Platform, and Copilot Studio, join us for the Business Applications Launch Event, debuting live on the Dynamics 365 YouTube channel on Thursday, October 23, 2025 at 9 AM Pacific Daylight Time. Subscribe to our YouTube channel to get notified when the update is live.
The Business Applications Launch Event offers an exclusive first look at new capabilities launching over the next few months.
With a newly streamlined presentation format, you can quickly get up to speed on the most important and innovative capabilities—with expert insights and demonstrations from Microsoft product leaders and engineers. It’s our way of helping you stay current, make informed decisions, and move faster in the era of Copilot and AI agents.
Mark your calendar for the Business Applications Launch Event—Thursday, October 23, 2025.
Get insights about the latest low-code and AI innovation transforming business from Charles Lamanna, President, Business and Industry Copilot.
Get a sneak preview of upcoming capabilities across Dynamics 365, Microsoft Power Platform, and Copilot Studio with live demonstrations from Microsoft product leaders and engineers.
Discover where to access materials to learn about and plan for new and upcoming capabilities.
All in a new presentation format designed to quickly get you up to speed on the latest updates, so you can get the most from them.
During this update, you’ll hear from the product leaders and engineers behind the technology, including new Copilot and AI agent innovation for Dynamics 365 and Microsoft Power Platform. Demo highlights will include:
Microsoft Dynamics 365 Sales
Learn about updates to the Sales qualification agent. It autonomously researches and engages with leads, helping sales teams quickly identify those with real purchase intent. In this wave, the agent goes further—moving the lead closer to full qualification and boosting the team’s opportunity pipeline with greater precision and impact.
Get an overview of 2025 release wave 2 highlights for Dynamics 365 Sales:
Microsoft Dynamics 365 Customer Service and Dynamics 365 Contact Center
The latest release wave of Dynamics 365 Contact Center helps service reps better understand customer needs and deliver what they need—quickly, efficiently, and with a human touch. Dynamics 365 Customer Service will continue to enhance agentic and Copilot capabilities for case and knowledge management, as well as AI-based routing. Dynamics 365 Contact Center will also focus on expanding agentic and Copilot capabilities to automate service journeys across digital and voice channels, along with introducing new omnichannel and supervisor features in the 2025 release wave 2.
Get an overview of 2025 release wave 2 highlights for Dynamics 365 Contact Center:
Dynamics 365 ERP products and solution in Microsoft 365 Copilot
Dynamics 365 Finance expands the capabilities of the Account Reconciliation Agent. Today, it supports your team in effortlessly resolving voucher amount mismatches. In this wave, it extends support to include ledger not in subledger and subledger not in ledger exceptions. Instead of relying on manual exception handling and static reports, the solution reviews all transactions on an ongoing basis, surfaces exceptions, and presents them to you. The agent then suggests the most appropriate action for resolution, and you have the freedom to accept it or choose another path. Core updates to Dynamics 365 Finance also include the automation of remittance advice processing.
Dynamics 365 Supply Chain Management introduces capabilities that make AI-led demand planning more flexible. You can now bring in multiple external signals like inflation, weather, and industry indexes right into your forecast.
New autonomous and intelligent productivity capabilities for the finance solution in Microsoft 365 Copilot will reshape the finance process, from reconciliation to collections to advanced analytics, helping reduce repetitive work and surface actionable insights.
Get an overview of 2025 release wave 2 highlights for Dynamics 365 Finance:
Dynamics 365 Supply Chain Management:
Finance agents for Microsoft 365:
Microsoft Power Platform
Microsoft Power Platform is getting a major boost with AI and collaboration features. Power Apps now lets people and agents work together: agents can help with tasks like data entry, visualization, and app creation just by describing what you need or sharing an image. Power Automate is evolving with smarter automation tools, including generative AI actions, intelligent document processing, and new human-in-the-loop experiences like advanced approvals. It’s also adding stronger governance and security controls to help manage automation at scale. Power Pages is making it easier than ever to build secure, data-driven websites, with new tools for low-code makers and developers, and enhanced security insights to keep everything protected.
Get an overview of 2025 release wave 2 highlights for Power Apps:
See wave two highlights for Power Automate:
Copilot Studio
Copilot Studio continues to make agent creation and operation even easier and more powerful with autonomous agents in Microsoft 365 Copilot, the ability to build complete teams of agents that work seamlessly together, and improved governance for enterprise scalability. Copilot Studio will offer even deeper integration with Azure AI Foundry and Microsoft Graph, helping ensure your agents can use the latest AI technology alongside your data in the Microsoft Graph.
Get an overview of 2025 release wave 2 highlights for Copilot Studio:
Catch the wave—Mark your calendar for BALE
The Business Applications Launch Event will be live on the Dynamics 365 YouTube channel on Thursday, October 23, 2025, starting at 9 AM Pacific Daylight Time. We’ll see you there!