Python: Persist hosted MCP call/results as canonical mcp_call output#6070

Open
Hameedkunkanoor wants to merge 8 commits into
microsoft:mainfrom
Hameedkunkanoor:hameed-kunkanoor/mcp-toolbox-fix-single

@Hameedkunkanoor Hameedkunkanoor commented May 25, 2026

Motivation and Context

This PR recreates and carries forward the original hosted MCP persistence fix from #5950, adapted on top of current main and finalized with follow-up CI/type-safety hardening.
Fix for #5546.

Problem addressed:

  • Follow-up turns could fail with a 400 when replayed hosted MCP history contained an unbalanced tool-output shape.
  • Hosted MCP call and result could be persisted in split form, which could replay as orphaned output.
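The failure mode above can be illustrated with a small check over a replayed history. This is a minimal sketch, not the service's actual validation: the item dictionaries, the `CALL_TYPES`/`OUTPUT_TYPES` sets, and the `find_orphaned_outputs` helper are all hypothetical, and it assumes an output item is only valid when a matching call item of the corresponding kind precedes it.

```python
from __future__ import annotations

from typing import Any

# Hypothetical type groupings, chosen for illustration only.
CALL_TYPES = {"function_call", "custom_tool_call"}
OUTPUT_TYPES = {"function_call_output", "custom_tool_call_output"}


def find_orphaned_outputs(history: list[dict[str, Any]]) -> list[str]:
    """Return the call_ids of output items that have no preceding call item."""
    seen_calls: set[str] = set()
    orphans: list[str] = []
    for item in history:
        item_type = item.get("type")
        call_id = item.get("call_id", "")
        if item_type in CALL_TYPES:
            seen_calls.add(call_id)
        elif item_type in OUTPUT_TYPES and call_id not in seen_calls:
            orphans.append(call_id)
    return orphans


# Split form: the result is persisted as a separate output item with no
# matching custom_tool_call, so it replays as an orphan.
split_history = [
    {"type": "mcp_call", "id": "mcp_1", "name": "search", "arguments": "{}"},
    {"type": "custom_tool_call_output", "call_id": "mcp_1", "output": "ok"},
]

# Canonical form: call and result live in one mcp_call item.
canonical_history = [
    {"type": "mcp_call", "id": "mcp_1", "name": "search", "arguments": "{}", "output": "ok"},
]
```

Under these assumptions the split history reports one orphan while the canonical history reports none, which is the property the PR aims to preserve.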

Scenario supported:

  • Cross-turn hosted MCP conversations where tool call identity and tool result remain paired in persisted/replayed history.

Dependency and rollout note:

Description

Overall approach:

  • Keep foundry_hosting write-side persistence on canonical single-item mcp_call representation and keep replay reconstruction aligned with that shape.
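To make the read side of this approach concrete, here is a rough sketch of how a persisted `mcp_call` item with `output` could be turned back into a paired call/result. The `McpToolCall` and `McpToolResult` dataclasses, the `reconstruct` function, and the item field names (`id`, `name`, `arguments`, `output`) are assumptions for illustration and are not taken from `_responses.py`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class McpToolCall:
    """Hypothetical stand-in for the framework's MCP call content."""

    call_id: str
    tool_name: str
    arguments: str


@dataclass
class McpToolResult:
    """Hypothetical stand-in for the framework's MCP result content."""

    call_id: str
    output: str


def reconstruct(item: dict[str, Any]) -> list[McpToolCall | McpToolResult]:
    """Rebuild call/result content from one persisted item (assumed shape)."""
    if item.get("type") != "mcp_call":
        return []
    call_id = item["id"]
    contents: list[McpToolCall | McpToolResult] = [
        McpToolCall(call_id, item.get("name", ""), item.get("arguments", ""))
    ]
    output = item.get("output")
    if output is not None:
        # The result reuses the call's id, so the pair stays linked on replay.
        contents.append(McpToolResult(call_id, output))
    return contents
```

Because the result is derived from the same item as the call, the two cannot drift apart in replayed history.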

Changes made:

  1. Preserve original MCP call id when opening MCP builders via item_id/call_id mapping.
  2. Streaming conversion path: mcp_server_tool_result completes the active mcp_call builder instead of falling back to custom_tool_call_output.
  3. Non-streaming conversion path: adjacent mcp_server_tool_call + mcp_server_tool_result are coalesced into one completed mcp_call output item.
  4. Replay reconstruction: persisted mcp_call items with output reconstruct back into MCP call/result content.
  5. Dependency update: bump foundry_hosting floor from azure-ai-agentserver-responses>=1.0.0b5,<2 to >=1.0.0b7,<2.
  6. Test coverage: scenario and regression tests for streaming/non-streaming persistence, replay reconstruction for both item shapes, and multi-turn round-trip behavior.
  7. Follow-up hardening from recreation cycle: fixes for missing Mapping import, pyright/mypy typing issues in MCP output stringification, and safer mapping serialization fallback.
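Changes 1, 3, and 7 can be sketched together as follows. This is an illustrative approximation rather than the code in `_responses.py`: the `coalesce` and `stringify_output` helpers, the dictionary-based content shape, and the field names (`call_id`, `tool_name`, `arguments`, `output`, `status`) are hypothetical.

```python
from __future__ import annotations

import json
from collections.abc import Mapping
from typing import Any


def stringify_output(value: Any) -> str:
    """Turn a tool result into a string, with a fallback for unserializable mappings."""
    if isinstance(value, str):
        return value
    if isinstance(value, Mapping):
        try:
            return json.dumps(dict(value))
        except (TypeError, ValueError):
            # Fallback: values that JSON cannot encode still get a string form.
            return str(dict(value))
    return str(value)


def coalesce(contents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge each adjacent call + result pair into one completed mcp_call item."""
    items: list[dict[str, Any]] = []
    i = 0
    while i < len(contents):
        current = contents[i]
        nxt = contents[i + 1] if i + 1 < len(contents) else None
        if (
            current.get("type") == "mcp_server_tool_call"
            and nxt is not None
            and nxt.get("type") == "mcp_server_tool_result"
            and nxt.get("call_id") == current.get("call_id")
        ):
            items.append(
                {
                    "type": "mcp_call",
                    "id": current["call_id"],  # keep the original MCP call id
                    "name": current.get("tool_name", ""),
                    "arguments": current.get("arguments", ""),
                    "output": stringify_output(nxt.get("output")),
                    "status": "completed",
                }
            )
            i += 2  # consume both the call and its result
            continue
        items.append(current)
        i += 1
    return items
```

Only adjacent pairs with matching ids are merged in this sketch; anything else passes through unchanged, which keeps the transformation easy to reason about.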

Files changed:

  • python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py
  • python/packages/foundry_hosting/tests/test_responses.py
  • python/packages/foundry_hosting/pyproject.toml

Validation

  • Updated unit tests in tests/test_responses.py for MCP call/result coalescing, call-id mapping, output persistence, and replay reconstruction.
  • CI follow-up fixes were applied to satisfy package checks (typing/lint) on the recreated PR branch.
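A regression test for the streaming path might look roughly like the sketch below. The `McpStreamTracker` class is a hypothetical stand-in for the real streaming handler, and the fallback to `custom_tool_call_output` when no builder is active is an assumption about the surrounding behavior, not something this description confirms.

```python
from __future__ import annotations

from typing import Any


class McpStreamTracker:
    """Hypothetical tracker for in-flight mcp_call builders, keyed by call id."""

    def __init__(self) -> None:
        self._active: dict[str, dict[str, Any]] = {}
        self.completed: list[dict[str, Any]] = []

    def on_content(self, content: dict[str, Any]) -> None:
        kind = content.get("type")
        call_id = content.get("call_id", "")
        if kind == "mcp_server_tool_call":
            # Open a builder under the original call id.
            self._active[call_id] = {
                "type": "mcp_call",
                "id": call_id,
                "name": content.get("tool_name", ""),
                "status": "in_progress",
            }
        elif kind == "mcp_server_tool_result":
            builder = self._active.pop(call_id, None)
            if builder is None:
                # Assumed fallback when no matching builder is open.
                self.completed.append(
                    {"type": "custom_tool_call_output", "call_id": call_id,
                     "output": str(content.get("output", ""))}
                )
                return
            # The result completes the active builder instead of a new item.
            builder["output"] = str(content.get("output", ""))
            builder["status"] = "completed"
            self.completed.append(builder)


def test_result_completes_active_builder() -> None:
    tracker = McpStreamTracker()
    tracker.on_content({"type": "mcp_server_tool_call", "call_id": "mcp_1", "tool_name": "search"})
    tracker.on_content({"type": "mcp_server_tool_result", "call_id": "mcp_1", "output": "ok"})
    assert len(tracker.completed) == 1
    item = tracker.completed[0]
    assert item["type"] == "mcp_call"
    assert item["id"] == "mcp_1"
    assert item["output"] == "ok"
    assert item["status"] == "completed"
```

The key assertion is that exactly one item is emitted per call/result pair and that it carries the original call id.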

Risk / Impact

  • Low-to-moderate behavioral risk in response-event shaping for hosted MCP events.
  • Expected impact: deterministic canonical MCP output structure for clients and replay paths, avoiding orphaned function/tool output in follow-up turns.

Relationship to Prior PR

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • Unit tests were updated for the new behavior
  • No breaking API surface change intended

- Preserve hosted MCP call/result pairs as canonical mcp_call output items

- Coalesce MCP call + result in non-streaming conversion path

- Keep call-id alignment for MCP tool call tracking and output mapping

- Update tests and package metadata
Copilot AI review requested due to automatic review settings May 25, 2026 13:58
@github-actions github-actions Bot changed the title Persist hosted MCP call/results as canonical mcp_call output Python: Persist hosted MCP call/results as canonical mcp_call output May 25, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates the Foundry hosting responses adapter to coalesce hosted MCP tool call + tool result into a single mcp_call output item (including output), adds round-trip coverage for the new behavior, and bumps the azure-ai-agentserver-responses dependency to pick up the needed model/events support.

Changes:

  • Coalesce hosted MCP mcp_server_tool_call + mcp_server_tool_result into a single mcp_call output item (non-streaming and streaming).
  • Reconstruct MCP result content when reading mcp_call items that include output.
  • Add tests for persistence, streaming emission, reconstruction, and multi-turn history replay; bump azure-ai-agentserver-responses to b6.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py | Adds hosted-MCP coalescing and output stringification; updates streaming handler and item-to-message reconstruction to carry MCP output. |
| python/packages/foundry_hosting/tests/test_responses.py | Adds regression tests ensuring MCP calls/results persist/stream as a single mcp_call and replay correctly across turns. |
| python/packages/foundry_hosting/pyproject.toml | Bumps azure-ai-agentserver-responses to a newer beta required for MCP output support. |

Comment thread python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py Outdated
Comment thread python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py Outdated
@moonbox3
Contributor

moonbox3 commented May 26, 2026

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/foundry_hosting/agent_framework_foundry_hosting | | | | |
| _responses.py | 747 | 120 | 83% | 183–186, 251, 328–329, 339, 376, 431, 445, 495, 498–502, 521, 524, 530, 532, 553–555, 584–586, 591, 593, 600, 602–603, 605, 607, 613, 617, 619–621, 625, 628, 633–639, 642–643, 645–646, 654–659, 960, 973, 1442–1444, 1446, 1493–1494, 1496–1497, 1499–1500, 1502–1503, 1508, 1517, 1520–1522, 1524, 1538, 1551, 1596–1597, 1599, 1604–1608, 1610, 1617–1618, 1620–1621, 1627, 1629–1633, 1640, 1646, 1668, 1674, 1680, 1682, 1684–1691, 1699, 1701, 1745–1747, 1757–1758 |
| TOTAL | 36359 | 4324 | 88% | |

Python Unit Test Overview

Tests Skipped Failures Errors Time
7245 34 💤 0 ❌ 0 🔥 1m 53s ⏱️

@moonbox3 moonbox3 requested a review from eavanvalkenburg May 26, 2026 06:36