MCPToolset "Attempted to exit cancel scope in a different task" under to_a2a() multi-agent (A2A) — scales with concurrent McpToolset count; SSE transport is race-free #5729

## 🔴 Required Information

**Describe the Bug:**
When an `LlmAgent` is served via `to_a2a()` and holds one or more `McpToolset`s using `StreamableHTTPServerParams`, MCP session creation during tool discovery intermittently fails with:

`Failed to create MCP session: ... unhandled errors in a TaskGroup (1 sub-exception)` whose inner exception is `RuntimeError: Attempted to exit cancel scope in a different task than it was entered in`.

ADK retries `get_tools`, but under load the agent proceeds with an **incomplete tool list**: the agent's own MCP tools silently disappear from the tool spec, the LLM then calls a now-missing tool, and the A2A request hard-fails (`ValueError: Tool '<name>' not found`).

This is the same root cause as #4454, but on a different and very common surface: **not `adk web` + Cloud Run, but `to_a2a()` ASGI agents in a hub-and-spoke multi-agent system** (a manager dispatches to A2A sub-agents; each sub-agent owns its MCP server). We also add quantified new findings: the failure rate is ~linear in the number of concurrent `McpToolset` sessions per process, and **the SSE transport does not exhibit the race at all when there is exactly one session**.

**Steps to Reproduce:**
1. Stand up an MCP server with Streamable HTTP transport.
2. Build an `LlmAgent` whose `tools` include N `McpToolset(connection_params=StreamableHTTPServerParams(url=...))` (N ≥ 1; the rate rises sharply with N ≥ 2).
3. Serve it with `to_a2a(agent)` (uvicorn ASGI).
4. Drive it with repeated A2A requests from a separate manager agent (each request triggers tool discovery / `get_tools`).
5. Observe intermittent MCP-session-creation failures; with N ≥ 2 the agent regularly proceeds with a truncated tool list and the request fails with `Tool '<own-tool>' not found`.

**Expected Behavior:**
`McpToolset` consistently creates its MCP session and returns the full tool list per A2A request, regardless of how many `McpToolset`s the agent holds or how it is served.

**Observed Behavior:**
`RuntimeError: Attempted to exit cancel scope in a different task than it was entered in`, surfaced as `unhandled errors in a TaskGroup (1 sub-exception)`, in a tight `Retrying get_tools` loop. Occurrence count per single A2A task, swept by concurrent session count (real production measurements, identical workload):

| concurrent McpToolset sessions in the process | race occurrences / task | task outcome |
|---|---|---|
| 3 (streamable-http) | 84 | survived only via ADK retry |
| 2 (streamable-http) | 40 | HARD FAIL — own tools dropped, `Tool 'run_python_code' not found` |
| 1 (SSE transport) | 0 | clean, reproduced 0 across many runs |
| 1 (streamable-http) | still races | — |
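
Counts like those in the table can be tallied from a task's log. The sketch below is one plausible way to do it, not necessarily how the numbers above were produced: it assumes each failed session attempt emits one `Retrying get_tools` line and each dropped tool surfaces as a `ValueError: Tool '<name>' not found` line. `summarize_task_log` is a hypothetical helper, not part of ADK.

```python
import re

# Assumption: one "Retrying get_tools" log line per failed session attempt,
# and one "ValueError: Tool '<name>' not found" line per dropped tool call.
RETRY_RE = re.compile(r"Retrying get_tools")
MISSING_TOOL_RE = re.compile(r"ValueError: Tool '([^']+)' not found")


def summarize_task_log(log_text: str) -> dict:
    """Count retry lines and collect missing-tool names from one task's log."""
    return {
        "retries": len(RETRY_RE.findall(log_text)),
        "missing_tools": sorted(set(MISSING_TOOL_RE.findall(log_text))),
    }


# Sample lines copied from the log excerpt in this report.
sample = """\
INFO  mcp_session_manager: Retrying get_tools due to error: Failed to
      create MCP session: Failed to create MCP session: unhandled errors
      in a TaskGroup (1 sub-exception)
ERROR a2a_agent_executor_impl: Error handling A2A request:
      Tool 'run_python_code' not found.
ValueError: Tool 'run_python_code' not found.
"""

print(summarize_task_log(sample))
```

Running this over the log of a single A2A task gives a retry count and the list of tools that went missing, which makes runs with different session counts easy to compare.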

**Environment Details:**

 - ADK Library Version (pip show google-adk): 1.33.0
 - Desktop OS: Linux (Debian 13 / Docker container; host macOS)
 - Python Version (python -V): 3.12.13

**Model Information:**

 - Are you using LiteLLM: No
 - Which model is being used: gemini-2.5-flash (model-independent — this is a tool-discovery transport bug; it reproduces regardless of model)

---

## 🟡 Optional Information

**Regression:**
Per #4454, this worked on ADK 1.14.0 and has been broken from 1.24.x onward. We confirm it is still present on 1.33.0. Downgrading to ADK 1.31.1 did not help.

**Logs:**
```text
INFO  mcp_session_manager: Retrying get_tools due to error: Failed to
      create MCP session: Failed to create MCP session: unhandled errors
      in a TaskGroup (1 sub-exception)
WARN  session_context: Error on session runner task: unhandled errors in
      a TaskGroup (1 sub-exception)
WARN  llm_agent: Failed to get tools from toolset McpToolset: Failed to
      create MCP session: ... (1 sub-exception)
ERROR a2a_agent_executor_impl: Error handling A2A request:
      Tool 'run_python_code' not found.
ValueError: Tool 'run_python_code' not found.
```
(The inner `RuntimeError: Attempted to exit cancel scope in a different task` is collapsed by the task-group exception-squashing; the message string matches #4454 exactly.)

**Additional Context:**
- Reproduces with the agent's own MCP server only (single tool source) — it is per-connection, not specific to any one MCP server.
- Tried and did **not** fix it: anyio 3.x downgrade (hard dependency conflict — starlette/adk require anyio≥4.9), A2A Executor V2 / `force_new_version`, PR #5509 (`use_isolated_event_loop`, would not apply cleanly to 1.33), ADK 1.31.1.
- The **only** reliable mitigation we found: switch the `McpToolset` to the **SSE transport** (`SseConnectionParams`) **and** keep exactly **one** `McpToolset` per agent process (move any shared/cross-agent tools to a plain blocking HTTP client instead of a 2nd `McpToolset`). With that, race occurrences went 84 → 0 and stayed 0 across many production runs.
- Likely upstream root cause: modelcontextprotocol/python-sdk #577 and #1805 (`streamablehttp_client` task group entered/exited across asyncio tasks). Cross-reference: #4454, PR #5509.
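
For the second half of the mitigation (replacing a 2nd `McpToolset` with a plain blocking HTTP client), a sketch using only the standard library could look like the following. The `/tools/<name>` JSON endpoint, the host name, and both function names are hypothetical; the shared tool service would need to expose something equivalent.

```python
import json
import urllib.request


def build_tool_request(
    base_url: str, tool_name: str, arguments: dict
) -> urllib.request.Request:
    """Build a JSON POST to a hypothetical `<base_url>/tools/<tool_name>` endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tools/{tool_name}",
        data=json.dumps(arguments).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_shared_tool(
    base_url: str, tool_name: str, arguments: dict, timeout_s: float = 30.0
) -> dict:
    """Blocking call to the shared tool service; opens no MCP session."""
    request = build_tool_request(base_url, tool_name, arguments)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) a request to show the shape of the call.
example = build_tool_request("http://shared-tools:9000/", "lookup", {"q": "x"})
print(example.get_method(), example.full_url)
```

A thin wrapper around `call_shared_tool` can then be passed to the agent as an ordinary function tool, so the process keeps a single `McpToolset`.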

**Minimal Reproduction Code:**
```python
from google.adk.agents import LlmAgent
from google.adk.a2a.utils.agent_to_a2a import to_a2a
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import (
    StreamableHTTPServerParams,
)

agent = LlmAgent(
    name="leaf",
    model="gemini-2.5-flash",
    tools=[
        McpToolset(connection_params=StreamableHTTPServerParams(
            url="http://mcp-a:8082/mcp")),
        McpToolset(connection_params=StreamableHTTPServerParams(
            url="http://mcp-b:8081/mcp")),  # 2+ sessions -> high rate
    ],
    disallow_transfer_to_parent=True,
    disallow_transfer_to_peers=True,
)

a2a_app = to_a2a(agent)  # serve with uvicorn; drive with repeated A2A requests
```

**How often has this issue occurred?:**

 - Intermittently per attempt, but effectively Always under realistic multi-agent concurrency (84 occurrences in a single task at 3 concurrent sessions).

