Adds Valkey to chat message history#2
Conversation
4563242 to
995dbf0
Compare
Jonathan-Improving
left a comment
There was a problem hiding this comment.
Consolidated Code Review: PR #2 — Adds Valkey to Chat Message History
PR: issue-5445 → main | Reviewed: 2026-04-27 | Scope: 16 changed files, +1,427 lines
Reviewers: GLIDE/Domain, DRY/Abstraction, Security (×2 rounds, consolidated)
🔒 Security Review
[VULN-001] Cross-Scope Data Leak — ThreadId Stored but Not Filtered (High) — See inline comment on ValkeyContextProvider.cs:178-193.
[VULN-002] Incomplete Query/Tag Escaping (Medium) — See inline comment on ValkeyContextProvider.cs:330-340.
Needs Verification:
- [VERIFY-001] No TTL on stored data — messages persist indefinitely. PII accumulates without bound. Intentional?
- [VERIFY-002] No try/catch around
JsonSerializer.Deserialize<ChatMessage>inValkeyChatHistoryProvider.ProvideChatHistoryAsync— see inline comment. Malformed JSON crashes the session.
Cleared: No arbitrary Valkey command injection (RESP parameterization). Indirect prompt injection risk documented accurately. No hardcoded secrets in samples. Connection security is a deployment concern, correctly documented.
🔴 Critical — Must Fix
1. Unit tests cover ~20% of public API surface
All 23 tests are synchronous constructor/property checks. Zero tests for any async method (ProvideChatHistoryAsync, StoreChatHistoryAsync, ProvideMessagesAsync, StoreAIContextAsync, EnsureIndexAsync, ClearMessagesAsync, GetMessageCountAsync, DisposeAsync), zero tests for security-critical methods (EscapeTag, EscapeQuery, ParseSearchResults), and zero tests for business logic (MaxMessages trimming, MaxMessagesToRetrieve limiting). InternalsVisibleTo is configured but unused. The Mem0 unit tests in this repo set a clear standard these tests fall far short of. Additionally, StateInitializer_NoScopeFields_Throws never triggers the lazy validation — it just asserts Assert.NotNull(provider).
2. CancellationToken never forwarded
Both providers accept CancellationToken in every async method but never use it. At minimum add cancellationToken.ThrowIfCancellationRequested() before I/O calls and between loop iterations. The Mem0 provider forwards CT to its HTTP calls.
3. EnsureIndexAsync race condition — See inline comment on ValkeyContextProvider.cs:268-275.
🟡 Medium
4. Constructor/DisposeAsync duplication — extract ValkeyConnectionManager
Both providers have 2 constructors each with ~80% identical bodies and character-for-character identical DisposeAsync implementations. An internal ValkeyConnectionManager (composition, not inheritance) would eliminate 4 duplicated constructor bodies and 2 identical DisposeAsync methods.
5. Error handling inconsistency between providers
ValkeyContextProvider wraps operations in try/catch (matching Mem0). ValkeyChatHistoryProvider does not — exceptions propagate unhandled. Either add try/catch to match, or document the intentional difference.
6. Synchronous ConnectionMultiplexer.Connect() in constructors
Blocking network call in constructor. Can cause thread-pool starvation in ASP.NET Core / DI scenarios. Consider ConnectAsync() via lazy init, or document that callers should prefer the IConnectionMultiplexer overload.
7. No disposed-state guard — Neither provider has a _disposed flag. After DisposeAsync(), subsequent calls throw opaque errors.
8. StoreChatHistoryAsync pushes messages one-by-one — N round-trips. ListRightPushAsync accepts RedisValue[] for batch push.
9. Missing [JsonConstructor] on State classes — Unlike the Mem0 pattern. Could break session state serialization.
10. No PII redaction in log messages — ConversationId, UserId, AgentId logged directly. Mem0 uses a Redactor pattern.
🟢 Low
- Duplicate
<NoWarn>line in csproj (see inline) - NuGet
<Description>says "vector search" — should say "full-text search" (see inline) EscapeQueryallocateschar[]array per call — makespecialastatic readonlyfieldProvideChatHistoryAsyncfetches all messages then trims — useListRangeAsync(key, -N, -1)- Missing
Throw.IfNull(context)inconsistency across methods - Sample default model
"gpt-5.4-mini"— not a known Azure OpenAI model name MaxResults.ToString()uses current culture — useCultureInfo.InvariantCulture- PR claims bug fix not visible in diff — both sample files are new, not modified
✅ Positives
- Excellent pattern consistency with Mem0Provider — ValkeyProviderScope mirrors Mem0ProviderScope exactly
- Clean separation — ChatHistoryProvider (lists, no search module) vs ContextProvider (FT.SEARCH)
- Ownership semantics —
_ownsConnectioncorrectly tracks disposal responsibility - Thorough XML documentation with security considerations (PII, compromised store, indirect prompt injection)
- Scope validation —
ValidateStateInitializerprevents unbounded queries - Idiomatic Valkey usage — FT.CREATE with "Index already exists" catch, RPUSH+LTRIM, TAG filters
- Two high-quality samples (Azure OpenAI + Bedrock) with clear READMEs
ConfigureAwait(false)used consistently — correct for library code- Lean dependencies — only Abstractions, StackExchange.Redis, Logging.Abstractions
Checklist Assessment
| Area | Status | Notes |
|---|---|---|
| Functionality | ✅ | Providers implement claimed features correctly |
| Code Quality | 🟡 | Good structure; thread safety, disposal, DRY issues |
| Testing | 🔴 | ~20% API coverage, no async/behavioral tests |
| Security | 🟡 | ThreadId leak, incomplete escaping, no TTL |
| Performance | 🟡 | One-by-one push, fetch-all-then-trim, per-call allocations |
| Documentation | ✅ | Excellent XML docs, READMEs; minor NuGet description issue |
| <PropertyGroup> | ||
| <!-- NuGet Package Settings --> | ||
| <Title>Microsoft Agent Framework - Valkey integration</Title> | ||
| <Description>Provides Valkey integration for Microsoft Agent Framework, including chat history persistence and context provider with vector search.</Description> |
There was a problem hiding this comment.
🟢 NuGet description inaccuracy
The <Description> says "context provider with vector search" but ValkeyContextProvider uses full-text search (FT.SEARCH with TEXT fields). There are no VECTOR fields in the schema. Should say "full-text search" instead.
|
|
||
| var db = this._connection.GetDatabase(); | ||
|
|
||
| try | ||
| { | ||
| await db.ExecuteAsync( | ||
| "FT.CREATE", | ||
| this._indexName, |
There was a problem hiding this comment.
🔴 Race condition — _indexCreated not thread-safe
_indexCreated is a plain bool read/written without synchronization. Under concurrent requests, multiple threads can see false and all attempt FT.CREATE. The catch handles "Index already exists", but the flag itself isn't volatile — reads can be stale on other cores.
Fix: Use SemaphoreSlim for full correctness, or at minimum Volatile.Read/Volatile.Write:
private readonly SemaphoreSlim _indexLock = new(1, 1);
private async Task EnsureIndexAsync()
{
if (Volatile.Read(ref _indexCreated)) return;
await _indexLock.WaitAsync().ConfigureAwait(false);
try
{
if (_indexCreated) return;
// ... FT.CREATE ...
Volatile.Write(ref _indexCreated, true);
}
finally { _indexLock.Release(); }
}| <RootNamespace>Microsoft.Agents.AI.Valkey</RootNamespace> | ||
| <VersionSuffix>preview</VersionSuffix> | ||
| <NoWarn>$(NoWarn);CA1873</NoWarn> | ||
| <NoWarn>$(NoWarn);CA1873</NoWarn> |
There was a problem hiding this comment.
🟢 Duplicate <NoWarn> line
<NoWarn>$(NoWarn);CA1873</NoWarn> appears on both line 7 and line 8. Copy-paste error — remove the duplicate.
| messages.Add(message); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 Inconsistent error handling — no try/catch on deserialization
JsonSerializer.Deserialize<ChatMessage> is called here with no try/catch. If the Valkey store contains malformed JSON (corruption, tampering), this throws JsonException unhandled and crashes the session.
ValkeyContextProvider wraps both ProvideMessagesAsync and StoreAIContextAsync in try/catch (matching the Mem0 pattern). This provider does not — neither ProvideChatHistoryAsync nor StoreChatHistoryAsync have error handling.
Either add try/catch to match, or document the intentional difference in error handling strategy.
| var filterExpr = filterParts.Count > 0 ? string.Join(" ", filterParts) : "*"; | ||
| var escapedQuery = $"{filterExpr} {EscapeQuery(queryText)}"; | ||
|
|
||
| var result = await db.ExecuteAsync( | ||
| "FT.SEARCH", | ||
| this._indexName, | ||
| escapedQuery, | ||
| "LIMIT", "0", this.MaxResults.ToString()).ConfigureAwait(false); | ||
|
|
||
| var memories = ParseSearchResults(result); | ||
| var memoryTexts = memories | ||
| .Select(m => m.TryGetValue("content", out var c) ? c : null) | ||
| .Where(c => !string.IsNullOrEmpty(c)) | ||
| .ToList(); | ||
|
|
||
| this._logger?.LogInformation( |
There was a problem hiding this comment.
🔒 [VULN-001] Cross-Scope Data Leak — ThreadId stored but not filtered (High)
StoreAIContextAsync writes thread_id to every HASH document (line 243), but the search filter here only checks ApplicationId, AgentId, and UserId — ThreadId is never included. A developer who sets ThreadId expecting thread-level isolation gets none — all threads for the same user/agent/app share memories.
In multi-thread scenarios (e.g., a support agent handling multiple customer conversations), memories from one thread bleed into another.
Fix:
if (!string.IsNullOrEmpty(scope.ThreadId))
{
filterParts.Add($"@thread_id:{{{EscapeTag(scope.ThreadId)}}}");
}| { | ||
| return value | ||
| .Replace("\\", "\\\\") | ||
| .Replace("{", "\\{") | ||
| .Replace("}", "\\}") | ||
| .Replace("@", "\\@") | ||
| .Replace(" ", "\\ "); | ||
| } | ||
|
|
||
| private static string EscapeQuery(string text) | ||
| { |
There was a problem hiding this comment.
🔒 [VULN-002] Incomplete Query/Tag Escaping (Medium)
EscapeQuery is missing <, % (fuzzy match operator), and , from the special characters array. > is escaped but < is not.
EscapeTag (line 330) is also missing | (tag value separator) and ,. A crafted scope value containing | (e.g., UserId = "user1|user2") could match multiple tag values, potentially retrieving another user's data.
No arbitrary Valkey command injection risk (StackExchange.Redis uses parameterized RESP), but query manipulation within FT.SEARCH is possible.
Fix: Add missing characters to both methods. Consider an allowlist approach for tag values.
995dbf0 to
302d1ce
Compare
302d1ce to
1afc83f
Compare
1afc83f to
2598a07
Compare
2598a07 to
4958338
Compare
4958338 to
effe8f0
Compare
effe8f0 to
5ebf3f1
Compare
5ebf3f1 to
7149fd7
Compare
7149fd7 to
5434eb7
Compare
…text provider - Switch from StackExchange.Redis to Valkey.Glide 1.1.0 (official Valkey .NET client) - Extract optional params into ValkeyChatHistoryProviderOptions - Add JsonSerializerOptions support, remove [RequiresUnreferencedCode] - Make MaxMessages/MaxMessagesToRetrieve readonly via options - Remove ValkeyContextProvider (overlaps with ChatHistoryMemoryProvider + MEVD) - Remove ValkeyProviderScope (only used by context provider) - Remove connection string constructors (caller manages IConnectionMultiplexer) - Update samples to use new API and gpt-5.4-mini
Use JsonSerializerOptions.GetTypeInfo() for Serialize/Deserialize calls to enable NativeAOT/trimming compatibility without suppress attributes. Default to AgentAbstractionsJsonUtilities.DefaultOptions when no options provided. Signed-off-by: Matthias Howell <matthias.howell@improving.com>
Remove ValkeyContextProvider and long-term memory references from sample READMEs since the context provider was removed from this PR. Simplify Valkey server requirements (no search module needed for chat history). Signed-off-by: Matthias Howell <matthias.howell@improving.com>
Signed-off-by: Matthias Howell <matthias.howell@improving.com>
…ty (microsoft#6270) * fix: use getattr for non-OpenAI provider response compatibility Fixes microsoft#6234 Fixes microsoft#6235 Use getattr with None fallback for system_fingerprint and output attributes to prevent AttributeError when non-OpenAI providers return response objects without these fields. * fix: use typed variable for response output to satisfy pyright Fixes microsoft#6235 Use getattr with None fallback for the output attribute, and assign to a typed list variable before the match statement to help pyright narrow the response item types correctly. * fix: rename response_outputs to avoid name collision with case-block variable Fixes microsoft#6235 Rename outputs to response_outputs on line 1974 to avoid mypy error about conflicting variable names in the match statement's case blocks. Also use list[Any] for explicit generic type annotation. * fix: use cast(list[Any]) for response output to satisfy pyright Fixes microsoft#6235 The getattr() call returns Unknown type which pyright cannot narrow in the match statement. Use an explicit cast to list[Any]. * fix: use hasattr guard instead of getattr for response.output Fixes microsoft#6235 Using hasattr(response, 'output') and then accessing response.output directly gives pyright enough type information to verify the match statement exhaustiveness. This avoids the cast(list[Any]) approach which pyright still flagged as partially unknown. * fix: use ternary operator for response_outputs assignment Replace if-else block with ternary expression to satisfy ruff SIM108 lint rule. This fixes the Package Checks (3.11) CI failure. * fix: use ternary with cast for ruff SIM108 and pyright type safety Replace if-else block with ternary expression using cast(list[Any], ...) to satisfy: - ruff SIM108 (use ternary instead of if-else) - ruff E501 (line length < 120) - pyright type narrowing (cast preserves type info lost in ternary) All local checks pass: ruff check, ruff format, pyright, 298 tests. * fix: replace hasattr+cast with try/except to preserve pyright types --------- Co-authored-by: Tao Chen <taochen@microsoft.com>
…urity model and deserialization trust boundaries (microsoft#6295) * docs: clarify checkpoint storage security model and deserialization trust boundaries Add Security Model documentation sections to the checkpoint encoding and Azure Functions serialization modules explaining: - Checkpoint storage is a trusted data source requiring access controls - The RestrictedUnpickler allowlist is defense-in-depth, not a security boundary - Developer responsibilities for securing storage backends - Guidance on using allowed_types and strip_pickle_markers Co-authored-by: Azure SRE Agent <noreply@microsoft.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Azure SRE Agent <noreply@microsoft.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Fix Magentic to share agent replies across team The per-round instruction was sent untargeted (fan-out delivered it to every participant) and replies were never relayed, so a later speaker saw the prior speaker's instruction but not its response - inverted from GroupChatHost and the Python reference. - Target the instruction at the selected speaker only. - Broadcast each reply to the other participants (buffered, no TurnToken), excluding the responder via _currentSpeakerExecutorId, mirroring GroupChatHost. - Persist _currentSpeakerExecutorId across checkpoints. - Add a regression test. * Address review feedback: null-guard, explicit checkpoint key, drop vacuous assertion * Address review feedback: centralize checkpoint keys, clear current speaker - Move CurrentSpeakerStateKey into MagenticConstants as nameof(CurrentSpeakerStateKey) - Clear _currentSpeakerExecutorId in ResetAndReplanAsync and PrepareFinalAnswerAsync so a checkpoint taken in those windows does not persist a stale speaker - Add UTF-8 BOM to RecordingEchoAgent.cs to satisfy the format check.
* Parallelize Purview PSPC cold cache path * Cache Purview payment-required state for scope refresh * Cache Purview payment-required state for scope refresh * Align Purview policy action dedupe and 402 caching Deduplicate combined policy actions by action and restriction action so restriction-only actions are preserved without duplicating identical entries. Cache tenant-level payment-required state from background scope refresh so subsequent calls short-circuit consistently. * .NET: Implement best-effort caching for background job scope retrieval and add unit tests for cache write failures * Purview - feat: Enhance ScopedContentProcessor to queue ContentActivityJob when no applicable scopes are found and update related tests * docs: Update purview package README and AGENTS documentation to reflect caching optimizations and policy enforcement scenarios Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add 'Deploying to Foundry (azd spec)' sections to all Foundry hosted agent samples This commit adds comprehensive deployment documentation to all 13 .NET Foundry hosted agent samples that were missing it. Each sample now includes: - Instructions to initialize an azd project from the sample's agent.manifest.yaml - Steps to deploy using 'azd deploy' - Example environment variable overrides for customization - Link to the official Foundry deployment guide Samples updated: - Hosted-LocalTools - Hosted-Files - Hosted-FoundryAgent - Hosted-McpTools - Hosted-Observability - Hosted-MemoryAgent - Hosted-TextRag - Hosted-ToolboxMcpSkills - Hosted-AzureSearchRag - Hosted-AgentSkills - Hosted-Workflow-Handoff - Hosted-Workflow-Simple - Hosted-Invocations-EchoAgent Each section includes the correct agent name from the sample's manifest and points to the correct GitHub URL for initializing the azd project. Fixes: microsoft#6308 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs(samples): fix Foundry hosted README consistency Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(samples): address PR 6365 README review comments Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Ben Thomas <25218250+alliscode@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Python: bump package versions for 1.8.1 release * Python: bump agent-framework-foundry-hosting for 1.8.1 release * Python: bump ag-ui and azurefunctions for 1.8.1 release * Remove incorrect agent-framework-foundry changelog entry for microsoft#6259 * Add [1.8.1] changelog compare link and update [Unreleased] base --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ve dependency floor, and update Merge Gatekeeper ignores (microsoft#6148) * Bump Microsoft.Extensions.AI packages to 10.6.0 * Align transitive package versions for Microsoft.Extensions.AI 10.6.0 * Ignore external review check in Merge Gatekeeper --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>
* Add sampling guardrails to MCP tools Add approval, token, and request-count controls to the MCP sampling callback used when an MCPTool is configured with a chat client. - Add `sampling_approval_callback`, `sampling_max_tokens`, and `sampling_max_requests` parameters to `MCPTool` and its `MCPStdioTool`, `MCPStreamableHTTPTool`, and `MCPWebsocketTool` subclasses, positioned directly after `client`. - Gate each server-initiated `sampling/createMessage` request behind the approval callback, which denies by default when no callback is provided. - Clamp the requested `maxTokens` to `sampling_max_tokens` and enforce a per-session request count via `sampling_max_requests`. - Log incoming sampling requests at WARNING level (counts only). - Export `SamplingApprovalCallback` from the public API. - Add tests, a sample, and documentation updates. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make sampling denial message context-aware Distinguish the deny-by-default case (no approval callback configured) from an explicit denial by a configured `sampling_approval_callback`, so the returned ErrorData message is accurate for callback-driven denials and exceptions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ool results returning None (microsoft#6421) * Parse structuredContent from MCP CallToolResult (microsoft#3313) The _parse_tool_result_from_mcp method only iterated over the content field from CallToolResult, ignoring the structuredContent field entirely. MCP servers that return JSON data via structuredContent (e.g., Power BI MCP) appeared to return None. Add handling for structuredContent: when present, serialize it as JSON text and append it to the result list. This preserves the data for the LLM while maintaining backward compatibility with existing behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Python: Parse MCP CallToolResult.structuredContent field to prevent tool results returning None Fixes microsoft#3313 * Address review feedback: add default=str to json.dumps and remove .checkpoints/ - Add default=str to json.dumps for structuredContent serialization so non-JSON-serializable values (e.g. bytes) degrade gracefully instead of raising TypeError - Remove all .checkpoints/ runtime artifacts from the repository - Add **/.checkpoints/ to .gitignore to prevent future accidental commits - Add test for non-serializable structuredContent values Fixes microsoft#3313 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for microsoft#3313: Python: MCP CallToolResult.structuredContent field is not parsed, causing tool results to return None --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…microsoft#6410) * HarnessAgent: Disable compaction when max tokens not provided * Fix regression. * Address PR comments * Require max_output_tokens to be positive Reject max_output_tokens=0 (must be positive), mirroring max_context_window_tokens. Addresses PR review feedback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ey.csproj Co-authored-by: Roger Barreto <19890735+rogerbarreto@users.noreply.github.com>
…Agent to RC (microsoft#6454) * Update release version for 2026-06-10 release * Switch GitHub.Copilot to RC
* Fix .NET Copilot integration tests for SDK v1.0.0
- Remove hard-skip in favor of runtime Assert.Skip when COPILOT_GITHUB_TOKEN is not set
- Add [Trait("Category", "Integration")] for CI filtering
- Fix FunctionTool test: use explicit SessionConfig with Tools, OnPermissionRequest, and SystemMessage
- Mark RemoteMcp test as IntegrationDisabled (requires OAuth flow)
- Create explicit sessions in all tests and delete after each (cleanup)
- Remove unused System.Diagnostics import
- Simplify SkipIfCopilotNotConfigured to only check env var
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address review: use try/finally for session cleanup, IsNullOrWhiteSpace
- Wrap act/assert in try/finally so sessions are always deleted even on failure
- Use IsNullOrWhiteSpace instead of IsNullOrEmpty for token check
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add COPILOT_GITHUB_TOKEN to .NET integration test workflow
The Copilot SDK runtime reads this env var directly for authentication.
No Node.js/npm install needed - the SDK downloads the CLI binary at build time.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
microsoft#6018) * .NET: Add Hosted-Toolbox-AuthPaths sample and auto-map /readiness with toolbox health gating (microsoft#5777) Add a new hosted agent sample demonstrating five MCP tool authentication paths (API key, agent MI, project MI, custom OAuth, literal token) via a Foundry Toolbox. Package changes (Microsoft.Agents.AI.Foundry.Hosting): - MapFoundryResponses now auto-maps GET /readiness via MapHealthChecks, idempotent across Tier 1/2 (AgentHost, already mapped) and Tier 3 (WebApplication, gap filled). - AddFoundryResponses registers AddHealthChecks() so the pipeline is available. - AddFoundryToolboxes registers FoundryToolboxHealthCheck on the /readiness aggregate, gating readiness on pre-registered toolbox startup outcome (per spec section 3.1). - FoundryToolboxService now exposes StartupStatus and FailedToolboxNames properties. New types: - FoundryToolboxStartupStatus (public enum): Pending, Healthy, Failed, NoEndpoint. - FoundryToolboxHealthCheck (internal IHealthCheck): adapts startup status to the AspNetCore HealthChecks pipeline with failed toolbox names in result data. Tests: - 3 new tests for /readiness auto-mapping (Tier 3 default, pre-mapped skip, idempotent). - 4 new tests for FoundryToolboxHealthCheck (Pending, NoEndpoint, Failed, Healthy). - 3 enhanced FoundryToolboxServiceTests with StartupStatus assertions. * .NET: Align FoundryToolboxService with tools-integration-spec (microsoft#5777 Part A) Bring Microsoft.Agents.AI.Foundry.Hosting's toolbox path into compliance with tools-integration-spec.md sections 2-4, 6.3, and 9. Empirically validated against tao-foundry-prj: the previous code (reading FOUNDRY_AGENT_TOOLSET_ENDPOINT, which the platform never injects) silently registered zero tools in production. Package changes (Microsoft.Agents.AI.Foundry.Hosting): - FoundryToolboxService.StartAsync now derives the toolbox proxy base URL from the platform-injected FOUNDRY_PROJECT_ENDPOINT and constructs the per-toolbox URL as {FOUNDRY_PROJECT_ENDPOINT}/toolboxes/{name}/mcp?api-version={ApiVersion} per spec sections 2-3. The legacy FOUNDRY_AGENT_TOOLSET_ENDPOINT env var is removed outright (preview package, no production consumers). - FoundryToolboxOptions.ApiVersion default flipped to 'v1' to match spec example. - FoundryToolboxBearerTokenHandler always sends the mandatory Foundry-Features: Toolboxes=V1Preview header per spec section 2, merging any additional flags supplied via the FOUNDRY_AGENT_TOOLSET_FEATURES env var. - FoundryToolboxBearerTokenHandler token scope changed from https://cognitiveservices.azure.com/.default to https://ai.azure.com/.default per spec section 4. - FoundryToolboxBearerTokenHandler propagates W3C trace context (traceparent, tracestate, baggage) from Activity.Current per spec section 6.3. Sample changes: - Hosted-Toolbox-AuthPaths and Hosted-Toolbox Program.cs, README.md, and .env.example corrected to describe the actual env-var contract (FOUNDRY_PROJECT_ENDPOINT auto-injected; AZURE_AI_PROJECT_ENDPOINT as the local-dev fallback). Removes the misleading 'auto-injected by Foundry runtime' claims for FOUNDRY_AGENT_TOOLSET_ENDPOINT. - Hosted-Toolbox-AuthPaths/agent.manifest.yaml declares the toolbox and model dependencies under resources[] per the AgentManifest schema so azd ai agent init users get them provisioned automatically. Tests: - 4 new FoundryToolboxServiceTests covering env-var derivation, EndpointOverride precedence, trailing-slash normalization, and the existing NoEndpoint behavior under the new env var name. - 4 new FoundryToolboxBearerTokenHandlerTests covering token scope, mandatory feature header always present, header merging with override, no duplicate mandatory flag, trace context propagation from Activity.Current, and no override of caller-set traceparent. - New FoundryProjectEndpointEnvFixture xUnit collection definition serializes env-var-mutating tests across FoundryToolboxServiceTests and FoundryToolboxHealthCheckTests, preventing parallel-execution races. - FoundryToolboxHealthCheckTests adjusted for the new env var name. * .NET: Drop ACA prereq from Hosted-Toolbox-AuthPaths README (microsoft#5777 Part B) Empirically verified that any Azure Cognitive Services MCP endpoint already in the Foundry project (e.g., a Language service MCP) accepts Entra tokens and can serve Paths 2 and 3 without deploying a separate Azure MCP Server to ACA. README updates: - Step 0 rewritten: 'Identify an Entra-authenticated MCP target in your project' instead of 'Deploy Azure MCP Server to Azure Container Apps' (the original azmcp-foundry-aca-mi setup is now optional, not required). - Auth-paths matrix updated to describe AAD-based connections targeting a Cognitive Services MCP URL (e.g., Language service) instead of an ACA URL. - Step 2 connections table updated: the Entra ID category is now a single 'AAD' authType. The original 'Agent Identity' vs 'Project Managed Identity' as selectable connection sub-types is NOT exposed via the ARM control plane today; the platform selects the calling principal contextually. Both connections in the walkthrough share the same shape and target. - Added an explicit RBAC note: the agent identity AND project MI must hold the required role (typically Cognitive Services User) on the target resource; without it the MCP server returns HTTP 401 even though the connection wiring is correct. - Toolbox tool entries renamed lang_entra_agent / lang_entra_project to match the new connection names. Empirical validation supporting these changes is captured in the session plan.md (Part B addendum). * .NET: Document correct connection shape for Hosted-Toolbox-AuthPaths Paths 2/3 (microsoft#5777) Updates the sample README with the verified connection shape and RBAC procedure for Microsoft Entra agent-identity and project-managed-identity MCP authentication: - Connection authType values: AgenticIdentityToken (agent identity) and ProjectManagedIdentity (project MI), both with category=RemoteTool. - Top-level audience property required; for Cognitive Services targets the value is https://cognitiveservices.azure.com. - Connections created via ARM REST (the Foundry portal wizard does not yet expose these authTypes). - RBAC grants target the project's shared agent identity blueprint principal (project.properties.agentIdentity.agentIdentityId) for Path 2 and the project's system-assigned MI (project.identity.principalId) for Path 3. - Troubleshooting table updated with the audience-mismatch symptom and the startup-cache behavior of FoundryToolboxService. * .NET: Drop Path 3 (project MI) and align with new agent model in Hosted-Toolbox-AuthPaths (microsoft#5777) Updates the sample to use only the new Foundry agent object model and removes the project managed identity path: - Auth-path matrix reduced to four paths: key, Entra agent identity, custom OAuth, inline authorization. Project managed identity is moved into a note describing when it applies (multiple agents sharing access) rather than as a documented sample path. - RBAC instructions reference the agent's own instance_identity.principal_id from the agent ARM resource (new agent object model) instead of the project's shared agent identity blueprint (legacy model). - Step 2 (connections) creates only the AgenticIdentityToken connection. - Step 3 (toolbox tools) lists four tool entries instead of five. - Sample prompts and troubleshooting table updated to match. * .NET: Restore Path 3 (project MI) to Hosted-Toolbox-AuthPaths matrix (microsoft#5777) The sample's purpose is to enumerate every authentication path a Foundry toolbox can drive, not to pick one. Path 3 belongs alongside the other four with explicit guidance for when each path is the right choice. - Path 3 (project managed identity, authType=ProjectManagedIdentity) restored to the matrix with a 'When to pick this' column. - Step 2 (connections) provisions both lang-mcp-agent-id and lang-mcp-project-mi via ARM REST. - Step 3 (toolbox) lists five tool entries (one per path). - RBAC instructions cover both the agent's instance identity (Path 2) and the project's system-assigned MI (Path 3). - Sample prompts include all five paths. - Troubleshooting table updated accordingly. * .NET: Fix duplicate line in Hosted-Toolbox-AuthPaths README (microsoft#5777) * .NET: Fix broken markdown link to ToolCallingApprovalHostedAgentFixture (microsoft#5777) * .NET: Fix relative path depth in markdown link (microsoft#5777) * .NET: Address Copilot review feedback for microsoft#5777 - FoundryToolboxHealthCheck description: rename FOUNDRY_AGENT_TOOLSET_ENDPOINT → FOUNDRY_PROJECT_ENDPOINT (stale reference; operator-facing in /readiness body). - FoundryToolboxStartupStatus.NoEndpoint XML doc: same rename. - ServiceCollectionExtensions XML docs: same rename + URL shape update. - Foundry.Hosting.IntegrationTests.TestContainer: remove explicit app.MapGet('/readiness') — now redundant + would conflict with the auto-mapped readiness route from MapFoundryResponses. - Hosted-Toolbox-AuthPaths agent.manifest.yaml: parameterize TOOLBOX_NAME via {{TOOLBOX_NAME}} template substitution and declare it under parameters with a default of 'auth-paths-toolbox' so the README's 'use any name' guidance actually works for hosted deployments. * .NET: Address Copilot review round 2 — fallback env + dedup + naming (microsoft#5777) - FoundryToolboxService.StartAsync: fall back to AZURE_AI_PROJECT_ENDPOINT when FOUNDRY_PROJECT_ENDPOINT is absent. Matches the local-dev convention used by the samples and resolves the doc/code mismatch flagged in review. - FoundryToolboxHealthCheck description updated for the fallback. - AddFoundryToolboxes: guard against duplicate health-check registration via an explicit name-uniqueness check on HealthCheckServiceOptions.Registrations. AddCheck<T>(name, ...) does not dedupe by name, so repeated AddFoundryToolboxes calls would have registered multiple instances. - FoundryToolboxOptions.EndpointOverride doc: clarify URL becomes {EndpointOverride}/toolboxes/{name}/mcp (was missing /toolboxes/ segment). - Hosted-Toolbox sample (Program.cs + README): switch FOUNDRY_TOOLBOX_NAME to TOOLBOX_NAME (the FOUNDRY_* prefix is reserved by the platform), default changed from 'my-toolset' to 'my-toolbox', terminology updated from 'Toolset' to 'Toolbox'. - FoundryToolboxServiceTests: 2 test renames to reflect what they actually assert (StartupStatus + FailedToolboxNames, not URL shape directly). - Tests adjusted to clear both env vars in NoEndpoint scenarios. * .NET: Fix stale NoEndpoint XML doc and misleading test comment (microsoft#5777) Update FoundryToolboxStartupStatus.NoEndpoint XML doc to mention both FOUNDRY_PROJECT_ENDPOINT and AZURE_AI_PROJECT_ENDPOINT (the service checks both since the fallback was added). Fix test comment that claimed URL derivation validation when the test only asserts on StartupStatus and FailedToolboxNames. * Remove OAuth consent path from AuthPaths sample, keep four working auth paths The interactive OAuth identity passthrough path needs a protocol gap closed in the hosting package (the proprietary oauth_consent_request item is not representable through the OpenAI/MEAI abstractions), so it is deferred to a separate spike branch. This strips the OAuth path from the AuthPaths sample, the companion REPL client, the agent manifest, and the docs, then renumbers the inline Authorization path so the sample teaches four contiguous paths: API key via connection, Entra agent identity, Entra project managed identity, and inline Authorization (anti-pattern). Package code is unchanged; the consent infrastructure already present in main stays as baseline. Both samples build with --warnaserror and all 246 hosting unit tests pass. * .NET: Drop project MI auth path and dedicated client from Hosted-Toolbox-AuthPaths (microsoft#5777) Live validation against tao-foundry-prj showed the ProjectManagedIdentity path failing with an unresolved token audience 401, so the sample now ships three working auth paths instead of four: connection key, agent managed identity, and inline Authorization. Changes: - Remove the project managed identity path from the AuthPaths sample matrix, prerequisites, connections, toolbox table, prompts, Program.cs instructions and agent.manifest.yaml. - Delete the near duplicate Hosted-Toolbox-AuthPaths-Client project and remove it from the solution. The README now drives the agent with the shared SimpleAgent REPL via AsAIAgent(agentEndpoint). - Correct the troubleshooting note: the Foundry toolbox tools/list is all or nothing, so one bad source returns -32007, fails startup, and returns 424 for every path. Add the allowed_tools caveat that names must match the upstream server. - Mark the toolbox startup status and health check experimental under AgentsAIExperiments (MAAI001) instead of AIOpenAIResponses, and update the package NoWarn set accordingly. * .NET: Address PR review nits for Hosted-Toolbox-AuthPaths (microsoft#5777) - Remove duplicated NU1903 comment in Foundry.Hosting csproj. - Fix stale 'four-tool' cross-links in Hosted-Toolbox and Hosted-McpTools READMEs to describe the three-path toolbox driven by the shared SimpleAgent REPL. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * .NET: Address toolbox startup-status review feedback (microsoft#5777) - Rename FoundryToolboxStartupStatus.Failed to Unhealthy so it is the proper opposite of Healthy, and clarify the doc comment covers the partial-failure case. - Raise the missing-endpoint toolbox log from Information to Warning, since enabling toolboxes is an explicit opt-in and a silently disabled toolbox warrants a higher-severity signal. - Update unit tests and the AuthPaths README troubleshooting row accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * .NET: Reword toolbox-wiring comment to avoid hosting-layer internals (microsoft#5777) Address PR review feedback: explain how a Foundry Toolbox is attached using the public API (AddFoundryToolboxes vs the CreateHostedMcpToolbox marker) and observable behavior, instead of naming the internal AgentFrameworkResponseHandler type and FoundryToolboxService.Tools property. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
microsoft#6455) (microsoft#6457) * .NET: Make GitHub.Copilot.SDK build targets reach transitive consumers (microsoft#6455) Microsoft.Agents.AI.GitHub.Copilot now ships a buildTransitive/ bridge so consumers who only reference this package (the normal use case) get the GitHub.Copilot.SDK's CLI binary-download MSBuild targets executed at build time. Without this, the SDK shipped its targets under build/ which NuGet only auto-imports for projects with a direct PackageReference to the SDK, so consumers of the adapter package got only the managed .dll, no copilot.exe in their output, and a runtime InvalidOperationException on the first RunAsync. The bridge consists of two files under buildTransitive/: * Microsoft.Agents.AI.GitHub.Copilot.props is generated at this package's pack time and pins the SDK version (from PackageVersion items in Directory.Packages.props) into _MicrosoftAgentsAICopilotSdkVersion. * Microsoft.Agents.AI.GitHub.Copilot.targets is static and imports the SDK's own build/GitHub.Copilot.SDK.targets from the NuGet cache using the pinned version. The version-pin condition no-ops gracefully if the resolved SDK differs from what was baked in (e.g. consumer overrides the SDK version directly), so this is purely additive. Verified by packing locally, restoring from a flat local feed, and building a transitive-only consumer (PackageReference to MAF only, no direct SDK ref). copilot.exe lands at bin/{cfg}/{tfm}/runtimes/{rid}/ native/copilot.exe as expected, matching the path the SDK's runtime CopilotClient looks at. Fixes microsoft#6455 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address Copilot review feedback (microsoft#6457) - buildTransitive/.targets: compute the full SDK targets path with a single Path.Combine call into one property (_MicrosoftAgentsAICopilotSdkTargetsPath), used in both Project= and Exists() — no more split between Path.Combine for the directory and inline / separator for the file name. - Split the version-defaulting Condition between the two files: the generated .props now just bakes the packaged SDK version into a dedicated property (_MicrosoftAgentsAICopilotSdkPackagedVersion), and the static .targets file is the single place that defaults _MicrosoftAgentsAICopilotSdkVersion to it. Removes the need for any MSBuild escape gymnastics in the pack-time string construction, and keeps the consumer override path the same. - _GenerateBuildTransitiveProps now hangs off public BeforeTargets (Build, Pack) in addition to _GetPackageFiles, so the file is generated even without a full pack, and we're not solely dependent on an underscore-prefixed internal target. The <None Pack=true /> items live in a top-level ItemGroup so they are collected at evaluation time instead of being added from inside the Target. End-to-end retested with a transitive-only consumer (PackageReference to MAF only, no direct GitHub.Copilot.SDK ref): copilot.exe lands at bin/Debug/net10.0/runtimes/win-x64/native/copilot.exe (141.8 MB) as before. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Tamir Dresher <tamirdresher@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* declarative workflow approval flow fix * Update mcp handler cache construction * fix method argument. * Update dotnet/src/Microsoft.Agents.AI.Workflows.Declarative/ObjectModel/InvokeFunctionToolExecutor.cs Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix identation --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Motivation and Context
The .NET Agent Framework currently has no dedicated Valkey integration package. Teams running Valkey in production — whether self-hosted or through managed cloud services like AWS ElastiCache for Valkey or GCP Memorystore for Valkey — have no first-party way to use Valkey for persistent chat history or long-term memory context. This has become increasingly common since the Redis license change.
The Microsoft.Agents.AI.Valkey package was scaffolded with working ValkeyChatHistoryProvider and ValkeyContextProvider implementations. This PR completes the package by adding missing pieces identified during review, fixing bugs in the existing sample, and adding a Bedrock-powered sample for AWS users.
Fixes issue microsoft#5445
Description
###Valkey package improvements:
###Bug fix in existing sample:
###New Bedrock sample (AgentWithMemory_Step03_MemoryUsingValkey_Bedrock):
###Infrastructure:
Added AWSSDK.Extensions.Bedrock.MEAI v4.0.6.10 to Directory.Packages.props
Registered the Bedrock sample in agent-framework-dotnet.slnx
Unit tests:
Contribution Checklist