How Microsoft 365 Copilot Actually Works: Grounding, the Semantic Index and Microsoft Graph (2026)

A clear, exam-focused guide to how Microsoft 365 Copilot works: grounding, the Semantic Index, Microsoft Graph, permissions and data privacy for AB-900 and AB-730 candidates.

Examinotion Team

18 min read | 3 June 2026 | Updated: 3 June 2026
[Image: Abstract 3D rendering of Copilot's permission-scoped grounding pipeline.]

Last updated: May 2026. Written and fact-checked by the Examinotion team against Microsoft Learn documentation and the official AB-900 and AB-730 skills outlines.

TL;DR Microsoft 365 Copilot works by grounding your prompt in your own organisational data before any answer is generated. It retrieves relevant emails, chats and documents through Microsoft Graph and the Semantic Index, sends that enriched prompt to a large language model in the Azure OpenAI Service, then filters the response for safety and compliance before returning it to your app.

If you are preparing for a Microsoft AI certification, one question sits underneath almost every exam item you will face: how does Microsoft 365 Copilot actually work? Understanding the architecture is what lets you answer permissions, privacy and grounding questions with confidence rather than guesswork. This guide walks through the full request journey, in plain language, for candidates studying AB-900 (Microsoft 365 Copilot and Agent Administration Fundamentals) and AB-730 (AI Business Professional).

What Microsoft 365 Copilot actually is

Microsoft 365 Copilot is not a chatbot bolted onto Office, and it is not the same product as the public version of ChatGPT. Microsoft describes it as "a sophisticated processing and orchestration engine" that coordinates three things: large language models (LLMs), content in Microsoft Graph such as emails, chats and documents you have permission to access, and the Microsoft 365 apps you use every day, such as Word and PowerPoint [2].

The orchestration engine is the part most candidates underestimate. The large language model supplies general reasoning and language ability, but on its own it knows nothing about your last project update or the meeting you have on Tuesday. Copilot's value comes from combining that general model with your specific working context, and the mechanism that does the combining is called grounding.

Operating inside the Microsoft 365 service boundary does not give Copilot a free pass to your whole tenant. As Microsoft puts it, "Operating inside the Microsoft 365 service boundary doesn't grant Copilot tenant-wide visibility. Data access is always scoped to the signed-in user's permissions" [1]. That single sentence underpins a large share of the exam content for both AB-900 and AB-730.

The request journey: from your prompt to a grounded response

Microsoft documents a clear, ordered flow for how a prompt becomes a response. Knowing this sequence is the fastest way to reason about almost any architecture question on the exam. Microsoft's own four-step description is [1]:

  1. In a Microsoft 365 app, a user enters a prompt in Copilot.
  2. Copilot pre-processes the input prompt using grounding, accessing Microsoft Graph in the user's tenant.
  3. Copilot sends the grounded prompt to the LLM, which generates a response that is contextually relevant to the user's task.
  4. Copilot returns the response to the app and the user.

The Semantic Index documentation expands this into a fuller five-step picture that adds a post-processing stage. After the model returns its response, Copilot "accesses the Microsoft Graph and semantic index for post-processing" before sending the final response and any app command back to your Microsoft 365 app, and "all requests are encrypted by HTTPS and customer data remains encrypted at rest" [3].

Safety controls are not confined to that post-processing stage; they apply on both sides of the model call. Microsoft applies "a defense-in-depth approach" that includes "jailbreak and cross-prompt injection attack (XPIA) classifiers" which "analyze inputs to the Copilot service and help block high-risk prompts prior to model execution," alongside protections for "blocking harmful content, detecting protected material, and blocking prompt injections" [2]. The practical takeaway for the exam is that grounding and safety filtering are distinct steps, not the same thing.
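The ordering of these stages can be sketched as a toy pipeline. This is an illustration of the documented sequence only, not Microsoft's implementation: every function name, the `USER_CONTENT` store and the blocked-phrase check are hypothetical stand-ins for far more sophisticated services.

```python
# Toy content store; a stand-in for data the signed-in user can already open.
USER_CONTENT = {
    "ana": ["Project Falcon status: on track for the Tuesday review."],
}


def screen_input(prompt: str) -> bool:
    """Crude placeholder for input classifiers that run before the model."""
    return "ignore previous instructions" not in prompt.lower()


def retrieve_context(user: str) -> list[str]:
    """Grounding stand-in: returns only content scoped to this user."""
    return USER_CONTENT.get(user, [])


def call_llm(grounded_prompt: str) -> str:
    """Model stand-in: echoes its input instead of generating text."""
    return "DRAFT: " + grounded_prompt


def post_process(draft: str) -> str:
    """Post-processing stand-in for safety and compliance checks."""
    return draft.removeprefix("DRAFT: ")


def handle(user: str, prompt: str) -> tuple[str, list[str]]:
    """Run the stages in order and record which ones executed."""
    stages = ["screen_input"]
    if not screen_input(prompt):
        return "Request blocked before model execution.", stages
    stages.append("ground")
    grounded = prompt + "\n\nContext:\n" + "\n".join(retrieve_context(user))
    stages.append("generate")
    draft = call_llm(grounded)
    stages.append("post_process")
    return post_process(draft), stages
```

Calling `handle("ana", "Summarise the Falcon status")` runs all four stages, while a prompt containing the blocked phrase stops after `screen_input`, which is the distinction between grounding and safety filtering that the exam expects you to recognise.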

Grounding explained: how Copilot uses your own data

Grounding is the step that turns a generic model into a useful assistant. Microsoft defines it simply: "Grounding improves the specificity of your prompt, and helps you get answers that are relevant and actionable to your specific task. The prompt can include text from input files or other content Copilot discovers" [1].

In practice, grounding means Copilot searches your organisation's data for content relevant to your request and appends that context to your prompt before the model ever sees it. Microsoft explains that Copilot "can generate responses anchored in your organizational data, such as user documents, emails, calendar, chats, meetings, and contacts," and combines that content with your live working context, "such as the meeting a user is in now, the email exchanges the user had on a topic, or the chat conversations the user had last week" [2].

If you have read about retrieval-augmented generation, the pattern will feel familiar. Microsoft does not use the term "RAG" in its official Copilot documentation, but the mechanism it describes, retrieving relevant data and adding it to the prompt so the model has more to reason over, is functionally the same idea. For the exam, use Microsoft's vocabulary, grounding and the Semantic Index, and treat RAG as the wider industry concept it resembles.
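The retrieve-then-append pattern can be shown in a few lines. This is a minimal sketch that assumes simple keyword overlap as the relevance score; Copilot's real retrieval is far richer, and `tokens` and `ground_prompt` are hypothetical names used only for illustration.

```python
def tokens(text: str) -> set[str]:
    """Lower-cased words longer than three characters, punctuation stripped."""
    return {w.strip("?.,!:;").lower() for w in text.split() if len(w) > 3}


def ground_prompt(prompt: str, documents: list[str], top_k: int = 2) -> str:
    """Append the most relevant documents to the prompt before the model sees it."""
    query = tokens(prompt)
    scored = [(len(query & tokens(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    relevant = [doc for score, doc in ranked if score > 0][:top_k]
    if not relevant:
        return prompt  # nothing relevant found, so the prompt goes through unchanged
    context = "\n".join(f"- {doc}" for doc in relevant)
    return prompt + "\n\nRelevant context:\n" + context


DOCUMENTS = [
    "Launch budget approved at 40k for the autumn campaign.",
    "Canteen menu for Friday.",
    "Autumn campaign launch date moved to 12 October.",
]

grounded = ground_prompt("When is the autumn campaign launch?", DOCUMENTS)
```

Here `grounded` carries the two campaign documents as context and leaves out the canteen menu, so the model has something specific to reason over.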

Grounding can also be scoped deliberately. When a user attaches a SharePoint document library or folder, or provides its URL in a prompt, "the grounding step includes that scope and uses the library's column metadata alongside file content to constrain and rank results" [3]. This is how you point Copilot at a specific body of content rather than your entire data footprint.
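To make "constrain and rank" concrete, here is a rough sketch under stated assumptions: scope is modelled as a URL prefix, column metadata as a plain dictionary, and the scoring weights are invented for the example. The `LibraryFile` type, the `contoso.example` URLs and the scoring rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LibraryFile:
    url: str
    text: str
    columns: dict[str, str]  # hypothetical library column values


def scoped_candidates(files: list[LibraryFile], scope_url: str, prompt: str) -> list[LibraryFile]:
    """Constrain to the attached library, then rank by content and metadata hits."""
    prompt_lc = prompt.lower()

    def score(f: LibraryFile) -> int:
        content_hits = sum(word in f.text.lower() for word in prompt_lc.split())
        metadata_hits = sum(value.lower() in prompt_lc for value in f.columns.values())
        return content_hits + 2 * metadata_hits  # arbitrary weighting for the sketch

    in_scope = [f for f in files if f.url.startswith(scope_url)]
    ranked = sorted(in_scope, key=score, reverse=True)
    return [f for f in ranked if score(f) > 0]


SCOPE = "https://contoso.example/sites/sales/Contracts/"
FILES = [
    LibraryFile(SCOPE + "draft.docx", "Renewal terms draft", {"Status": "Draft"}),
    LibraryFile(SCOPE + "acme.docx", "Renewal terms for Acme", {"Status": "Approved"}),
    LibraryFile("https://contoso.example/sites/hr/leave.docx", "Renewal of leave approved", {"Status": "Approved"}),
]

results = scoped_candidates(FILES, SCOPE, "approved renewal terms")
```

The HR file never appears because it sits outside the attached library, and the approved contract outranks the draft because its column value matches the prompt.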

The Semantic Index: Copilot's map of your organisation

The Semantic Index is what makes grounding fast and relevant. Microsoft describes it as mapping "your organization's data into an advanced lexical and semantic index to power search relevance and accuracy," allowing Copilot to "access the context and relationships within your data by utilizing Microsoft Graph" [3].

The word "semantic" is the important part. Traditional search matches exact keywords, but the Semantic Index uses vectors, which are mathematical representations of meaning, so it can "find the most similar or relevant data based on the semantic or contextual meaning" rather than only exact matches [3]. That is why Copilot can connect a question about "our Q3 launch" to a document titled "third-quarter go-to-market plan" even when the words do not match.
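Vector matching is easier to picture with numbers. The sketch below uses cosine similarity, a common way to compare vectors, on hand-made three-dimensional vectors; the values are invented for illustration, and nothing here reflects the embedding model or similarity measure the Semantic Index actually uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented vectors; the dimensions loosely stand for "launch", "third quarter", "catering".
INDEX = {
    "third-quarter go-to-market plan": [0.9, 0.8, 0.0],
    "office catering menu": [0.0, 0.1, 0.9],
}
QUERY_VECTOR = [0.8, 0.9, 0.1]  # stands in for the prompt "our Q3 launch"


def best_match(query: list[float], index: dict[str, list[float]]) -> str:
    """Return the title whose vector is most similar to the query vector."""
    return max(index, key=lambda title: cosine_similarity(query, index[title]))
```

`best_match(QUERY_VECTOR, INDEX)` picks the go-to-market plan even though the prompt and the title share no keywords, because their vectors point in nearly the same direction.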

Two indices are built automatically for every subscription. Microsoft confirms that "a semantic index is created for every subscription at the tenant and user level," where the tenant-level index is generated from text-based SharePoint Online files and the user-level index covers the individual's mailbox [3]. Crucially, the index "only surfaces the results to a user if the user already has access to the content controlled by role-based access control" [3].

A point that frequently appears in exam questions is that the Semantic Index needs no setup and cannot be switched off. Microsoft states there is "no administrative involvement required to enable semantic indexing, as the service is automatically enabled by Microsoft," and that it "can't be disabled" [3]. It also does not change anything about who can see what, because semantic indexing "works only with content to which your users already have permission and doesn't affect storage quotas" [3].

Where the index lives matters for data residency. User-level index information is stored where the user's mailbox is located, while tenant-level index data sits "in an isolated and protected customer's tenant container" in the region of the SharePoint site, and "for customers within the European Union Data Boundary (EUDB), the index is stored in an EU/EFTA based datacenter" [3].

Microsoft Graph: the data backbone behind every response

Microsoft Graph is the source of the organisational signals Copilot grounds against. Microsoft describes it as the layer that "includes information on users, their activities, and the organization data they can access," and explains that "the Microsoft Graph API brings a personalized context into the prompt, like information from a user's emails, chats, documents, and meetings" [5].

Copilot only ever asks Graph for content in your own context. Microsoft is explicit that "Copilot uses Microsoft Graph to access user data that's in the user's unique context. This user data includes emails, chats, and documents that the user has permission to access" [1]. Graph is therefore both the data backbone and the first line of permission enforcement.
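The idea of "the user's unique context" is visible in how the public Microsoft Graph REST API is shaped. The sketch below builds, but does not send, a request to the `/me/messages` endpoint, which returns mail for whichever user the access token belongs to. It shows how a custom app with delegated permissions would ask Graph for user-scoped data; it is not a description of Copilot's internal calls, and the token value is a placeholder.

```python
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_recent_mail_request(access_token: str, top: int = 5) -> urllib.request.Request:
    """Build a GET request for the signed-in user's most recent messages.

    "/me" resolves to the user the token was issued for, so the same code
    returns different mail for different users and never another user's mailbox.
    """
    url = f"{GRAPH_BASE}/me/messages?$top={top}&$select=subject,receivedDateTime"
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers)


# Placeholder token: a real one comes from a Microsoft Entra sign-in flow.
request = build_recent_mail_request("PLACEHOLDER_TOKEN")
```

Passing `request` to `urllib.request.urlopen` with a valid delegated token would return only what that user is allowed to read, which is the same permission-first principle Copilot relies on.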

External data can be brought into this picture through Copilot connectors. Microsoft supports two connector models: synced connectors that "ingest and index external content into Microsoft Graph," and federated connectors that "retrieve content in real time by using Model Context Protocol (MCP) without indexing data into Microsoft Graph" [6]. There are more than 100 prebuilt connectors for services such as Salesforce, ServiceNow, Box and Confluence, and organisations can build their own [6].

Connector data is treated exactly like native Microsoft 365 content for both security and access. Microsoft confirms that "data brought in from third party connectors are provided the same storage and protections as other Microsoft 365 data" [3], and that connector content "can be returned in Microsoft 365 Copilot responses if the user has permission to access that information" [2].
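The difference between the two connector models comes down to when the external system is read. The sketch below is a simplified illustration with hypothetical class names; it does not model Model Context Protocol or the real connector APIs, only the index-ahead-of-time versus fetch-at-query-time distinction.

```python
from typing import Callable


class SyncedConnector:
    """Copies external items into a local index ahead of time."""

    def __init__(self, fetch_all: Callable[[], list[str]]):
        self._fetch_all = fetch_all
        self.index: list[str] = []

    def sync(self) -> None:
        """Refresh the local copy; results are only as fresh as the last sync."""
        self.index = list(self._fetch_all())

    def search(self, term: str) -> list[str]:
        return [item for item in self.index if term.lower() in item.lower()]


class FederatedConnector:
    """Queries the external system at request time; nothing is stored locally."""

    def __init__(self, live_search: Callable[[str], list[str]]):
        self._live_search = live_search

    def search(self, term: str) -> list[str]:
        return self._live_search(term)


# Hypothetical external ticketing system.
external_tickets = ["Ticket 1: VPN outage"]
synced = SyncedConnector(lambda: external_tickets)
synced.sync()
federated = FederatedConnector(
    lambda term: [t for t in external_tickets if term.lower() in t.lower()]
)
external_tickets.append("Ticket 2: VPN certificate expired")
```

After the new ticket is added, `federated.search("vpn")` sees both tickets immediately, while `synced.search("vpn")` sees only the first until `synced.sync()` runs again.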

Permissions and access control: the most-tested topic

If you remember one rule for AB-900 and AB-730, make it this one. Microsoft 365 Copilot "only surfaces organizational data to which individual users have at least view permissions" [2]. Copilot inherits your existing access; it does not expand it.

The architecture documentation restates the principle from the other direction: "Copilot only accesses data that an individual user is authorized to access, based on, for example, existing Microsoft 365 role-based access controls. Copilot doesn't access data that the user doesn't have permission to access" [1]. Role-based access control (RBAC) is the underlying mechanism, and the Semantic Index "honors the user identity-based access boundary so that the grounding process only accesses content that the current user is authorized to access" [2].

Tenant isolation keeps one organisation's data away from another's. Microsoft achieves "logical isolation of customer content within each tenant for Microsoft 365 services" through "Microsoft Entra authorization and role-based access control" [2]. Your prompts are grounded against your tenant alone.

Several Microsoft Purview and identity controls sit on top of permissions, and they all narrow what Copilot can use rather than widen it:

Control | Effect on Copilot
Sensitivity labels and encryption | Honoured; the user needs EXTRACT and VIEW usage rights for Copilot to interact with encrypted content [4]
Data loss prevention (DLP) policies | Can exclude labelled items from being processed by Copilot [10]
Conditional Access | Honoured before access is granted [1]
Multifactor authentication (MFA) | Honoured as part of sign-in [1]
Role-based access control | Determines which content the user, and therefore Copilot, can see [1]

The exam-critical conclusion is straightforward: if you cannot open a document in SharePoint, Copilot cannot surface that document's content to you. This single fact resolves a large share of permissions questions on both exams. DLP, by contrast, can further restrict access using "the Microsoft 365 Copilot and Copilot Chat policy location" with a sensitivity-label condition to "exclude items from being processed" [10].
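The way these controls stack can be expressed as a chain of checks in which each one can only say no. This is a simplified sketch, not how Microsoft 365 evaluates access: the `Item` type, the `DLP_EXCLUDED_LABELS` policy and the `copilot_can_use` function are hypothetical, and sign-in controls such as Conditional Access and MFA are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    name: str
    viewers: set[str]                      # users with at least view permission
    label: Optional[str] = None            # sensitivity label, if any
    encrypted: bool = False
    usage_rights: dict[str, set[str]] = field(default_factory=dict)


# Hypothetical DLP policy: items with these labels are excluded from Copilot processing.
DLP_EXCLUDED_LABELS = {"Highly Confidential"}


def copilot_can_use(user: str, item: Item) -> bool:
    """Every check narrows access; none of them can widen it."""
    if user not in item.viewers:
        return False  # no view permission, so Copilot cannot surface it
    if item.encrypted and not {"VIEW", "EXTRACT"} <= item.usage_rights.get(user, set()):
        return False  # encrypted content needs both usage rights
    if item.label in DLP_EXCLUDED_LABELS:
        return False  # DLP excludes the item even for permitted users
    return True
```

A user who can view an unlabelled file passes every check, whereas the same user is refused for an encrypted file when only the VIEW right is granted, or for any file carrying an excluded label.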

What Microsoft does and does not do with your data

Privacy questions are common, and the answers are unusually clear-cut because Microsoft states them in absolute terms. The central commitment, flagged as Important in the documentation, is that "prompts, responses, and data accessed through Microsoft Graph aren't used to train foundation LLMs, including those used by Microsoft 365 Copilot" [2]. The Semantic Index page repeats the same promise for data accessed through semantic indexing [3].

This no-training rule extends to everything Copilot stores about an interaction. Stored interaction data "is encrypted while it's stored and isn't used to train foundation LLMs," and user feedback is treated the same way: "We don't use this feedback to train the foundation LLMs used by Microsoft 365 Copilot" [2].

The model itself runs in a Microsoft-controlled environment, not the consumer internet. Microsoft states that "Microsoft 365 Copilot uses Azure OpenAI services for processing, not OpenAI's publicly available services," and that "Azure OpenAI doesn't cache customer content and Copilot modified prompts" [2]. Microsoft 365 Copilot has also opted out of the human-review abuse monitoring that is otherwise available in Azure OpenAI [2].

Compliance and residency commitments are concrete. Copilot is "compliant with our existing privacy, security, and compliance commitments to Microsoft 365 commercial customers, including the General Data Protection Regulation (GDPR) and European Union (EU) Data Boundary," and was "added as a covered workload in the data residency commitments in Microsoft Product Terms on March 1, 2024" [2]. For EU users, "EU traffic stays within the EU Data Boundary while worldwide traffic can be sent to the EU and other countries or regions for LLM processing" [2]. Copilot's broader certifications include "GDPR, ISO 27001, HIPAA, and the ISO 42001 standard for AI management systems" [2].

One 2026 nuance is worth knowing for EU-focused scenarios. From 7 January 2026, Anthropic became a subprocessor for Microsoft 365 Copilot, bringing additional models into some Copilot experiences under the Microsoft Product Terms, but Microsoft notes that "Anthropic models are out of scope for the EU Data Boundary" and in-country LLM processing commitments [2]. Model updates in general "don't change your security, privacy, or compliance settings" [2].

What is new in 2026

Copilot's architecture has grown beyond a single chat box, and recent skills outlines expect candidates to recognise the newer building blocks. The clearest additions are the specialist reasoning agents. The Researcher agent "is designed to handle complex, multi-step research tasks" and "draws insights from both the web and your work content, including files, emails, meetings, and chats you have access to," using OpenAI's o3 model to spend longer retrieving and analysing before writing an in-depth, cited report [7]. The Analyst agent is its counterpart for advanced data analysis [7].

Declarative agents are the no-code customisation layer. Microsoft describes them as "conversational AI experiences that result from declared configurations loaded into Microsoft 365 Copilot," which let organisations build tailored solutions "through configuration rather than custom code" using instructions, actions and knowledge [8]. They sit on top of the same grounding and permissions model described above.
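"Configuration rather than custom code" means an agent is described as data. The sketch below shows that idea with a deliberately simplified, hypothetical structure: the key names, the `contoso.example` URL and the validator are illustrative assumptions and do not follow Microsoft's actual declarative agent manifest schema.

```python
# Hypothetical, simplified shape; the real manifest schema differs.
REQUIRED_KEYS = {"name", "instructions"}
OPTIONAL_KEYS = {"knowledge", "actions"}


def validate_agent_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config is acceptable."""
    problems = []
    keys = set(config)
    for key in sorted(REQUIRED_KEYS - keys):
        problems.append(f"missing required key: {key}")
    for key in sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unknown key: {key}")
    return problems


HELPDESK_AGENT = {
    "name": "IT Helpdesk Assistant",
    "instructions": "Answer IT support questions using the linked policy library.",
    "knowledge": ["https://contoso.example/sites/it/Policies"],
    "actions": ["create_ticket"],
}
```

Nothing in `HELPDESK_AGENT` is executable logic: the behaviour comes from the instructions, the knowledge sources and the named actions, which is what distinguishes a declarative agent from a custom-coded one.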

Grounding sources are also expanding. The federated connector model uses Model Context Protocol (MCP) to retrieve external data in real time without indexing it into Microsoft Graph [6]. In Microsoft 365 Copilot work chat, a "Web content" toggle lets users ground responses in live web results, with admin controls to govern web access across the organisation [9].

Finally, it helps to keep two product names straight. Microsoft distinguishes "Microsoft 365 Copilot Chat (grounded in the web)" from "Microsoft 365 Copilot (grounded in the web and work data)," and only the latter, with its work-data grounding, requires the paid add-on licence [5]. Microsoft 365 Copilot now also includes Copilot Search, "a universal search experience" across Microsoft 365 and third-party data sources [5].

How this maps to your exam

This architecture is not background reading. It is directly examined, so it pays to know which exam tests what. On AB-900 (Microsoft 365 Copilot and Agent Administration Fundamentals), the heaviest-weighted domain is "Understand data protection and governance tasks for Microsoft 365 and Copilot" at 35-40% of the exam, which explicitly tests how Copilot accesses data, how Microsoft Graph influences responses, and how permissions and controls protect against risk [11]. It is worth noting that Microsoft's official title for AB-900 centres on Copilot and agent administration, so the exam leans towards an IT-administrator audience rather than complete beginners.

On AB-730 (AI Business Professional), the same concepts appear from a business-user angle. Domain 1, "Understand generative AI fundamentals," carries 25-30% of the exam and covers how Copilot keeps an organisation's information private and secure, and how context such as your work files or the app you are using affects Copilot's responses [12]. AB-730 assumes hands-on experience with Microsoft 365 Copilot, Researcher and Analyst, rather than administrative depth [12].

Be honest with yourself about the difficulty here. Neither exam is a memory test of trivia, and the data-protection and grounding questions reward genuine understanding of the flow rather than recall of a single fact. Because these topics are foundational and heavily weighted, time spent on this architecture is some of the highest-value revision you can do. If you are weighing up which certification to sit, our broader Microsoft AI certification roadmap and the AB-900 exam guide put these exams in context.

For structured revision, work through the AB-900 study guide and the AB-730 study guide, then test yourself with the AB-900 practice questions. If you want to see grounding and agents in action, our explainer on Microsoft Copilot agents for business and the guide to Copilot prompt engineering for AB-730 build directly on the concepts above.

Frequently Asked Questions

How does Microsoft 365 Copilot work?

Copilot is an orchestration engine. When you enter a prompt, it grounds the prompt by retrieving relevant content from Microsoft Graph and the Semantic Index, then sends that enriched prompt to a large language model in the Azure OpenAI Service. The model generates a response, which Copilot post-processes for safety and compliance before returning it to your Microsoft 365 app.

What is grounding in Microsoft 365 Copilot?

Grounding is the step where Copilot enriches your prompt with relevant organisational data before the language model sees it. Microsoft says grounding "improves the specificity of your prompt" so answers are relevant to your task. In practice, Copilot searches your permitted emails, chats, documents and calendar through Microsoft Graph and appends that context to the prompt.

What is the Semantic Index in Microsoft 365 Copilot?

The Semantic Index is an automatic lexical and semantic index of your organisation's data, built from Microsoft Graph content such as SharePoint files and mailboxes. It maps data as vectors so Copilot can find content by meaning, not just exact keywords. It is created for every subscription at tenant and user level, needs no admin setup, and cannot be disabled.

Can Copilot see files I do not have access to?

No. Microsoft is explicit that Copilot "only accesses data that an individual user is authorized to access" based on existing Microsoft 365 permissions. If you cannot open a document in SharePoint, Copilot cannot surface its content to you. Sensitivity labels, encryption and data loss prevention policies can restrict access further, even within your permitted scope.

Is my data used to train Microsoft 365 Copilot?

No. Microsoft states that "prompts, responses, and data accessed through Microsoft Graph aren't used to train foundation LLMs, including those used by Microsoft 365 Copilot." This covers stored interaction data and feedback as well. Copilot uses the Azure OpenAI Service, which does not cache your prompts, rather than OpenAI's public consumer service.

Does Microsoft 365 Copilot use OpenAI or Azure OpenAI?

Copilot uses the Azure OpenAI Service, Microsoft's enterprise deployment of large language models, not OpenAI's public consumer service. Microsoft documents this directly: processing happens through "Azure OpenAI services for processing, not OpenAI's publicly available services." Your data stays within the Microsoft 365 service boundary throughout.

What are Researcher and Analyst in Microsoft 365 Copilot?

Researcher and Analyst are specialist reasoning agents in Microsoft 365 Copilot, powered by OpenAI's o3 model. Researcher handles complex, multi-step research across your permitted work content and the web, returning a cited report. Analyst is optimised for advanced data analysis. Both appear in the AB-900 and AB-730 study guides as topics candidates should understand.

Conclusion

Microsoft 365 Copilot is best understood as an orchestration engine that grounds a general language model in your own, permission-scoped data. Once you can trace a prompt through grounding, Microsoft Graph, the Semantic Index, the Azure OpenAI model and post-processing, the permissions and privacy questions on AB-900 and AB-730 stop being memorisation and start being logic. The single most reliable principle to carry into the exam is that Copilot inherits your existing access and never exceeds it.

When you are ready to turn this understanding into exam readiness, browse Examinotion's Microsoft AI exam preparation or start directly with AB-900 practice or AB-730 practice. Structured practice against realistic questions is the fastest way to find the gaps in your understanding before exam day.

Sources

  1. Microsoft 365 Copilot architecture — Microsoft Learn, accessed 2026-05-29
  2. Data, privacy, and security for Microsoft 365 Copilot — Microsoft Learn, accessed 2026-05-29
  3. Semantic Index for Copilot — Microsoft Learn, accessed 2026-05-29
  4. Microsoft 365 Copilot data protection architecture and auditing — Microsoft Learn, accessed 2026-05-29
  5. Microsoft 365 Copilot overview — Microsoft Learn, accessed 2026-05-29
  6. Microsoft 365 Copilot connectors overview — Microsoft Learn, accessed 2026-05-29
  7. Researcher agent in Microsoft 365 Copilot — Microsoft Learn, accessed 2026-05-29
  8. Declarative agents for Microsoft 365 Copilot overview — Microsoft Learn, accessed 2026-05-29
  9. Manage public web access for Microsoft 365 Copilot — Microsoft Learn, accessed 2026-05-29
  10. Learn about Microsoft 365 Copilot data loss prevention — Microsoft Learn, accessed 2026-05-29
  11. Exam AB-900 study guide — Microsoft Learn, accessed 2026-05-29
  12. Exam AB-730 study guide — Microsoft Learn, accessed 2026-05-29
