Commit c7f646e

feat(mcp): centralize MCP parsing middleware and add comprehensive documentation (#605)
1 parent f0a7e00 commit c7f646e

10 files changed

+1151
-108
lines changed

README.md

Lines changed: 9 additions & 0 deletions
@@ -40,6 +40,7 @@ consistency, and security.
4040
- [Run MCP servers using protocol schemes](#run-mcp-servers-using-protocol-schemes)
4141
- [Advanced usage](#advanced-usage)
4242
- [Customize permissions](#customize-permissions)
43+
- [Security and middleware architecture](#security-and-middleware-architecture)
4344
- [Run ToolHive in Kubernetes](#run-toolhive-in-kubernetes)
4445
- [Add an MCP server to the registry](#add-an-mcp-server-to-the-registry)
4546
- [API Documentation](#api-documentation)
@@ -525,6 +526,14 @@ Two built-in profiles are included for convenience:
525526
- `network`: Permits outbound network connections to any host on any port (not
526527
recommended for production use).
527528

529+
### Security and middleware architecture
530+
531+
ToolHive uses a layered middleware architecture to provide authentication, authorization, and auditing capabilities for MCP servers. The middleware chain ensures secure request processing and comprehensive observability.
532+
533+
For detailed information about the middleware architecture, including request flow diagrams and configuration options, see the [Middleware Architecture](./docs/middleware.md) documentation.
534+
535+
For authorization-specific configuration and Cedar policy examples, see the [Authorization Framework](./docs/authz.md) documentation.
536+
528537
### Run ToolHive in Kubernetes
529538

530539
ToolHive can also be used to deploy MCP servers in a Kubernetes cluster via our

docs/middleware.md

Lines changed: 354 additions & 0 deletions
@@ -0,0 +1,354 @@
1+
# Middleware Architecture
2+
3+
This document describes the middleware architecture used in ToolHive for processing MCP (Model Context Protocol) requests. The middleware chain provides authentication, parsing, authorization, and auditing capabilities in a modular and extensible way.
4+
5+
## Overview
6+
7+
ToolHive uses a layered middleware architecture to process incoming MCP requests. Each middleware component has a specific responsibility and operates in a well-defined order to ensure proper request handling, security, and observability.
8+
9+
The middleware chain consists of the following components:
10+
11+
1. **Authentication Middleware**: Validates JWT tokens and extracts client identity
12+
2. **MCP Parsing Middleware**: Parses JSON-RPC MCP requests and extracts structured data
13+
3. **Authorization Middleware**: Evaluates Cedar policies to authorize requests
14+
4. **Audit Middleware**: Logs request events for compliance and monitoring
15+
16+
## Architecture Diagram
17+
```mermaid
graph TD
    A[Incoming MCP Request] --> B[Authentication Middleware]
    B --> C[MCP Parsing Middleware]
    C --> D[Authorization Middleware]
    D --> E[Audit Middleware]
    E --> F[MCP Server Handler]

    B --> B1[JWT Validation]
    B1 --> B2[Extract Claims]
    B2 --> B3[Add to Context]

    C --> C1[JSON-RPC Parsing]
    C1 --> C2[Extract Method & Params]
    C2 --> C3[Extract Resource ID & Args]
    C3 --> C4[Store Parsed Data]

    D --> D1[Get Parsed MCP Data]
    D1 --> D2[Create Cedar Entities]
    D2 --> D3[Evaluate Policies]
    D3 --> D4{Authorized?}
    D4 -->|Yes| D5[Continue]
    D4 -->|No| D6[403 Forbidden]

    E --> E1[Determine Event Type]
    E1 --> E2[Extract Audit Data]
    E2 --> E3[Log Event]

    style A fill:#e1f5fe
    style F fill:#e8f5e8
    style D6 fill:#ffebee
```
50+
51+
## Middleware Flow
52+
```mermaid
sequenceDiagram
    participant Client
    participant Auth as Authentication
    participant Parser as MCP Parser
    participant Authz as Authorization
    participant Audit as Audit
    participant Server as MCP Server

    Client->>Auth: HTTP Request with JWT
    Auth->>Auth: Validate JWT Token
    Auth->>Auth: Extract Claims
    Note over Auth: Add claims to context

    Auth->>Parser: Request + JWT Claims
    Parser->>Parser: Parse JSON-RPC
    Parser->>Parser: Extract MCP Method
    Parser->>Parser: Extract Resource ID & Arguments
    Note over Parser: Add parsed data to context

    Parser->>Authz: Request + Parsed MCP Data
    Authz->>Authz: Get Parsed Data from Context
    Authz->>Authz: Create Cedar Entities
    Authz->>Authz: Evaluate Policies

    alt Authorized
        Authz->>Audit: Authorized Request
        Audit->>Audit: Extract Event Data
        Audit->>Audit: Log Audit Event
        Audit->>Server: Process Request
        Server->>Client: Response
    else Unauthorized
        Authz->>Client: 403 Forbidden
    end
```
88+
89+
## Middleware Components
90+
91+
### 1. Authentication Middleware
92+
93+
**Purpose**: Validates JWT tokens and extracts client identity information.
94+
95+
**Location**: `pkg/auth/middleware.go`
96+
97+
**Responsibilities**:
98+
- Validate JWT token signature and expiration
99+
- Extract JWT claims (sub, name, roles, etc.)
100+
- Add claims to request context for downstream middleware
101+
102+
**Context Data Added**:
103+
- JWT claims with `claim_` prefix (e.g., `claim_sub`, `claim_name`)
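
A minimal Go sketch of how prefixed claims might travel through `context.Context`. The `claimsKey` type and the `withClaims`, `claimString`, and `demoSubject` helpers are hypothetical names chosen for illustration, not identifiers from `pkg/auth/middleware.go`, and token validation is omitted.

```go
package main

import (
	"context"
	"fmt"
)

// claimsKey is an unexported context key type (hypothetical name).
type claimsKey struct{}

// withClaims stores already-validated JWT claims in the context, adding the
// "claim_" prefix described above to every claim name.
func withClaims(ctx context.Context, claims map[string]any) context.Context {
	prefixed := make(map[string]any, len(claims))
	for name, value := range claims {
		prefixed["claim_"+name] = value
	}
	return context.WithValue(ctx, claimsKey{}, prefixed)
}

// claimString reads one prefixed string claim back, as downstream middleware would.
func claimString(ctx context.Context, name string) (string, bool) {
	claims, ok := ctx.Value(claimsKey{}).(map[string]any)
	if !ok {
		return "", false
	}
	s, ok := claims[name].(string)
	return s, ok
}

// demoSubject runs the round trip with sample claims.
func demoSubject() string {
	ctx := withClaims(context.Background(), map[string]any{"sub": "user123", "name": "Alice"})
	sub, _ := claimString(ctx, "claim_sub")
	return sub
}

func main() {
	fmt.Println(demoSubject()) // prints "user123"
}
```

Using an unexported struct type as the context key avoids collisions with keys set by other packages.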
104+
105+
### 2. MCP Parsing Middleware
106+
107+
**Purpose**: Parses JSON-RPC MCP requests and extracts structured information.
108+
109+
**Location**: `pkg/mcp/parser.go`
110+
111+
**Responsibilities**:
112+
- Parse JSON-RPC 2.0 messages
113+
- Extract MCP method names (e.g., `tools/call`, `resources/read`)
114+
- Extract resource IDs and arguments based on method type
115+
- Store parsed data in request context
116+
117+
**Context Data Added**:
118+
- `ParsedMCPRequest` containing:
119+
- Method name
120+
- Request ID
121+
- Raw parameters
122+
- Extracted resource ID
123+
- Extracted arguments
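
The following Go sketch illustrates one way such a structure could be filled in and stored in the request context. The field names, the `parsedKey` type, and the `parseMCP` and `demoParsed` helpers are assumptions made for this example; only the `tools/call` parameter shape (`name` and `arguments`) comes from the MCP protocol itself.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// ParsedMCPRequest mirrors the fields listed above; the Go field names are assumed.
type ParsedMCPRequest struct {
	Method     string
	ID         json.RawMessage
	Params     json.RawMessage
	ResourceID string
	Arguments  map[string]any
}

// parsedKey is a hypothetical context key for the parsed request.
type parsedKey struct{}

// parseMCP decodes a JSON-RPC 2.0 request and, for tools/call only,
// extracts the tool name and arguments.
func parseMCP(body []byte) (*ParsedMCPRequest, error) {
	var env struct {
		JSONRPC string          `json:"jsonrpc"`
		ID      json.RawMessage `json:"id"`
		Method  string          `json:"method"`
		Params  json.RawMessage `json:"params"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, err
	}
	if env.JSONRPC != "2.0" || env.Method == "" {
		return nil, errors.New("not a JSON-RPC 2.0 request")
	}
	parsed := &ParsedMCPRequest{Method: env.Method, ID: env.ID, Params: env.Params}
	if env.Method == "tools/call" {
		var call struct {
			Name      string         `json:"name"`
			Arguments map[string]any `json:"arguments"`
		}
		if err := json.Unmarshal(env.Params, &call); err != nil {
			return nil, err
		}
		parsed.ResourceID, parsed.Arguments = call.Name, call.Arguments
	}
	return parsed, nil
}

// demoParsed parses a body, stores the result in a context, and reads it back.
func demoParsed(body string) string {
	parsed, err := parseMCP([]byte(body))
	if err != nil {
		return ""
	}
	ctx := context.WithValue(context.Background(), parsedKey{}, parsed)
	stored, _ := ctx.Value(parsedKey{}).(*ParsedMCPRequest)
	return stored.Method + " " + stored.ResourceID
}

func main() {
	fmt.Println(demoParsed(`{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"weather","arguments":{"location":"New York"}}}`))
	// prints "tools/call weather"
}
```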
124+
125+
**Supported MCP Methods**:
126+
- `initialize` - Client initialization
127+
- `tools/call`, `tools/list` - Tool operations
128+
- `prompts/get`, `prompts/list` - Prompt operations
129+
- `resources/read`, `resources/list` - Resource operations
130+
- `notifications/*` - Notification messages
131+
- `ping`, `logging/setLevel` - System operations
132+
133+
### 3. Authorization Middleware
134+
135+
**Purpose**: Evaluates Cedar policies to determine if requests are authorized.
136+
137+
**Location**: `pkg/authz/middleware.go`
138+
139+
**Responsibilities**:
140+
- Retrieve parsed MCP data from context
141+
- Create Cedar entities (Principal, Action, Resource)
142+
- Evaluate Cedar policies against the request
143+
- Allow or deny the request based on policy evaluation
144+
- Filter list responses based on user permissions
145+
146+
**Dependencies**:
147+
- Requires JWT claims from Authentication middleware
148+
- Requires parsed MCP data from MCP Parsing middleware
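
To illustrate these dependencies, here is a hedged Go sketch of an authorization middleware that reads both values from the context and answers `403 Forbidden` on denial. The `policy` function type stands in for Cedar evaluation, and the context keys, the `call_tool` action name, and the fail-closed behaviour when context data is missing are assumptions of this sketch rather than documented ToolHive behaviour.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ctxKey string

// Hypothetical context keys for data set by the upstream middleware.
const (
	subjectKey ctxKey = "claim_sub"
	toolKey    ctxKey = "tool"
)

// policy is a stand-in for Cedar policy evaluation.
type policy func(principal, action, resource string) bool

// authorize rejects the request with 403 unless the policy permits it.
// Missing context data is treated as a denial (an assumption of this sketch).
func authorize(permit policy) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			sub, _ := r.Context().Value(subjectKey).(string)
			tool, _ := r.Context().Value(toolKey).(string)
			if sub == "" || tool == "" || !permit(sub, "call_tool", tool) {
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// statusFor simulates the upstream middleware by pre-populating the context,
// then returns the HTTP status produced by the authorization step.
func statusFor(sub, tool string) int {
	onlyWeather := func(_, _, resource string) bool { return resource == "weather" }
	server := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/messages", nil)
	ctx := context.WithValue(req.Context(), subjectKey, sub)
	ctx = context.WithValue(ctx, toolKey, tool)
	rec := httptest.NewRecorder()
	authorize(onlyWeather)(server).ServeHTTP(rec, req.WithContext(ctx))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("user123", "weather")) // 200
	fmt.Println(statusFor("user123", "shell"))   // 403
}
```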
149+
150+
### 4. Audit Middleware
151+
152+
**Purpose**: Logs request events for compliance, monitoring, and debugging.
153+
154+
**Location**: `pkg/audit/auditor.go`
155+
156+
**Responsibilities**:
157+
- Determine event type based on request characteristics
158+
- Extract audit-relevant data from request and response
159+
- Log structured audit events
160+
- Track request duration and outcome
161+
162+
**Event Types**:
163+
- `mcp_tool_call` - Tool execution events
164+
- `mcp_resource_read` - Resource access events
165+
- `mcp_prompt_get` - Prompt retrieval events
166+
- `mcp_list_operation` - List operation events
167+
- `http_request` - General HTTP request events
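
One plausible way to derive the event type from the parsed MCP method is sketched below in Go. The event names come from the list above, but the mapping itself (in particular the `/list` suffix rule and the `http_request` fallback) is an assumption, and `eventType` is a hypothetical helper rather than a function from `pkg/audit/auditor.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// eventType maps an MCP method name to one of the audit event types.
// An empty or unrecognized method falls back to the generic HTTP event.
func eventType(mcpMethod string) string {
	switch {
	case mcpMethod == "tools/call":
		return "mcp_tool_call"
	case mcpMethod == "resources/read":
		return "mcp_resource_read"
	case mcpMethod == "prompts/get":
		return "mcp_prompt_get"
	case strings.HasSuffix(mcpMethod, "/list"):
		return "mcp_list_operation"
	default:
		return "http_request"
	}
}

func main() {
	for _, m := range []string{"tools/call", "tools/list", "ping"} {
		fmt.Printf("%s -> %s\n", m, eventType(m))
	}
}
```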
168+
169+
## Data Flow Through Context
170+
171+
The middleware chain uses Go's `context.Context` to pass data between components:
172+
```mermaid
graph LR
    A[Request Context] --> B[+ JWT Claims]
    B --> C[+ Parsed MCP Data]
    C --> D[+ Authorization Result]
    D --> E[+ Audit Metadata]

    subgraph "Authentication"
        B
    end

    subgraph "MCP Parser"
        C
    end

    subgraph "Authorization"
        D
    end

    subgraph "Audit"
        E
    end
```
196+
197+
## Configuration
198+
199+
### Enabling Middleware
200+
201+
ToolHive configures the middleware chain automatically when it starts an MCP server:
202+
```bash
# Basic MCP server (Authentication + Parsing + Audit)
thv run --transport sse --name my-server my-image:latest

# With authorization enabled
thv run --transport sse --name my-server --authz-config authz.yaml my-image:latest

# With custom audit configuration
thv run --transport sse --name my-server --audit-config audit.yaml my-image:latest
```
213+
214+
### Middleware Order
215+
216+
The middleware order is critical and enforced by the system:
217+
218+
1. **Authentication** - Must be first to establish client identity
219+
2. **MCP Parsing** - Must come after authentication to access JWT context
220+
3. **Authorization** - Must come after parsing to access structured MCP data
221+
4. **Audit** - Must be last to capture the complete request lifecycle
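
The ordering can be illustrated with a small Go sketch that composes placeholder middleware so that the first one listed runs first. The `Middleware` type, `chain`, `stage`, and `traceOrder` are hypothetical helpers written for this example; the real stages do far more than record their names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Middleware is the standard net/http wrapper signature.
type Middleware func(http.Handler) http.Handler

// chain wraps final so that mws[0] is the outermost handler and runs first.
func chain(final http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		final = mws[i](final)
	}
	return final
}

// stage returns a placeholder middleware that only records its name.
func stage(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// traceOrder sends one request through the chain and reports the execution order.
func traceOrder() string {
	var trace []string
	server := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		trace = append(trace, "server")
	})
	h := chain(server,
		stage("auth", &trace),
		stage("parse", &trace),
		stage("authz", &trace),
		stage("audit", &trace),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodPost, "/messages", nil))
	return strings.Join(trace, " > ")
}

func main() {
	fmt.Println(traceOrder()) // prints "auth > parse > authz > audit > server"
}
```

Because `chain` applies the wrappers in reverse, the call site reads in the same order as the list above.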
222+
223+
## Error Handling
224+
225+
Each middleware component either rejects a failing request with an HTTP error or lets it continue down the chain:
226+
```mermaid
graph TD
    A[Request] --> B{Auth Valid?}
    B -->|No| C[401 Unauthorized]
    B -->|Yes| D{MCP Parseable?}
    D -->|No| E[Continue without parsing]
    D -->|Yes| F{Authorized?}
    F -->|No| G[403 Forbidden]
    F -->|Yes| H[Process Request]

    style C fill:#ffebee
    style G fill:#ffebee
    style H fill:#e8f5e8
```
241+
242+
**Error Responses**:
243+
- `401 Unauthorized` - Invalid or missing JWT token
244+
- `403 Forbidden` - Valid token but insufficient permissions
245+
- `400 Bad Request` - Malformed MCP request (when parsing is required)
246+
247+
## Performance Considerations
248+
249+
### Parsing Optimization
250+
251+
The MCP parsing middleware keeps per-request overhead low in four ways:
252+
253+
- **Map-based method handlers** instead of large switch statements
254+
- **Single-pass parsing** of JSON-RPC messages
255+
- **Lazy evaluation** - only parses MCP-specific endpoints
256+
- **Context reuse** - parsed data shared across middleware
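
As an illustration of the map-based handler idea, the Go sketch below looks up a per-method extractor in a map instead of branching in a switch. The `extractor` type, the `field` helper, and the `extractors` table are hypothetical; the parameter names (`name` for `tools/call` and `prompts/get`, `uri` for `resources/read`) follow the MCP protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractor pulls the resource ID out of a method's raw params.
type extractor func(params json.RawMessage) (string, error)

// field builds an extractor that returns one string field from the params object.
func field(name string) extractor {
	return func(params json.RawMessage) (string, error) {
		var obj map[string]json.RawMessage
		if err := json.Unmarshal(params, &obj); err != nil {
			return "", err
		}
		raw, ok := obj[name]
		if !ok {
			return "", fmt.Errorf("missing %q", name)
		}
		var s string
		err := json.Unmarshal(raw, &s)
		return s, err
	}
}

// extractors is the method-to-handler table; methods without an entry
// carry no resource ID in this sketch.
var extractors = map[string]extractor{
	"tools/call":     field("name"),
	"prompts/get":    field("name"),
	"resources/read": field("uri"),
}

// resourceID dispatches on the method name with a single map lookup.
func resourceID(method, params string) (string, bool) {
	extract, ok := extractors[method]
	if !ok {
		return "", false
	}
	id, err := extract(json.RawMessage(params))
	return id, err == nil
}

func main() {
	id, ok := resourceID("resources/read", `{"uri":"file:///tmp/a.txt"}`)
	fmt.Println(id, ok) // prints "file:///tmp/a.txt true"
}
```

Adding support for a new method then means adding one map entry rather than another switch case.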
257+
258+
### Authorization Caching
259+
260+
The authorization middleware optimizes policy evaluation:
261+
262+
- **Policy compilation** happens once at startup
263+
- **Entity creation** is optimized for common patterns
264+
- **Result caching** for repeated identical requests (when enabled)
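
The source does not describe how result caching is implemented. As a rough sketch under that caveat, a cache keyed on the principal, action, and resource could look like the following Go code; `decisionCache` and its methods are hypothetical, and a real cache would also need invalidation when policies change.

```go
package main

import (
	"fmt"
	"sync"
)

// decisionCache memoizes authorization decisions for identical requests.
type decisionCache struct {
	mu        sync.Mutex
	decisions map[string]bool
	evaluate  func(principal, action, resource string) bool
	misses    int // number of times evaluate was actually called
}

func newDecisionCache(evaluate func(principal, action, resource string) bool) *decisionCache {
	return &decisionCache{decisions: make(map[string]bool), evaluate: evaluate}
}

// allowed returns the cached decision, evaluating the policy only on a miss.
func (c *decisionCache) allowed(principal, action, resource string) bool {
	// NUL separators keep distinct triples from producing the same key.
	key := principal + "\x00" + action + "\x00" + resource
	c.mu.Lock()
	defer c.mu.Unlock()
	if decision, ok := c.decisions[key]; ok {
		return decision
	}
	c.misses++
	decision := c.evaluate(principal, action, resource)
	c.decisions[key] = decision
	return decision
}

// demoMisses issues three checks, two of them identical.
func demoMisses() int {
	c := newDecisionCache(func(_, _, resource string) bool { return resource == "weather" })
	c.allowed("user123", "call_tool", "weather")
	c.allowed("user123", "call_tool", "weather") // served from the cache
	c.allowed("user123", "call_tool", "shell")
	return c.misses
}

func main() {
	fmt.Println(demoMisses()) // prints 2
}
```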
265+
266+
## Monitoring and Observability
267+
268+
### Audit Events
269+
270+
All middleware components contribute data to audit events. An example `mcp_tool_call` event:
271+
```json
{
  "type": "mcp_tool_call",
  "loggedAt": "2025-06-03T13:02:28Z",
  "source": {"type": "network", "value": "192.0.2.1"},
  "outcome": "success",
  "subjects": {"user": "user123"},
  "component": "toolhive-api",
  "target": {
    "endpoint": "/messages",
    "method": "POST",
    "type": "tool",
    "resource_id": "weather"
  },
  "data": {
    "request": {"location": "New York"},
    "response": {"temperature": "22°C"}
  },
  "metadata": {
    "auditId": "uuid",
    "duration_ms": 150,
    "transport": "http"
  }
}
```
297+
298+
### Metrics
299+
300+
Key metrics tracked by the middleware:
301+
302+
- **Request duration** - Time spent in each middleware component
303+
- **Authorization decisions** - Permit/deny rates and reasons
304+
- **Parsing success rates** - MCP message parsing statistics
305+
- **Error rates** - Authentication and authorization failures
306+
307+
## Extending the Middleware
308+
309+
### Adding New Middleware
310+
311+
To add new middleware to the chain:
312+
313+
1. Implement a function with the signature `func(http.Handler) http.Handler`
314+
2. Add configuration options to the runner
315+
3. Insert at the appropriate position in the chain
316+
4. Update tests to include the new middleware
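
As a sketch of step 1, the Go example below defines a small middleware with the `func(http.Handler) http.Handler` signature and exercises it with `net/http/httptest`. The `securityHeaders` middleware and the `headerAfter` helper are hypothetical examples; how a middleware is registered with the ToolHive runner is not shown because the source does not specify it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders is an example middleware: it sets a response header
// and then passes the request to the next handler in the chain.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Content-Type-Options", "nosniff")
		next.ServeHTTP(w, r)
	})
}

// headerAfter wraps a trivial handler and returns the header value
// observed on the recorded response.
func headerAfter() string {
	server := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	securityHeaders(server).ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/messages", nil))
	return rec.Header().Get("X-Content-Type-Options")
}

func main() {
	fmt.Println(headerAfter()) // prints "nosniff"
}
```

Testing the middleware in isolation with a recorder, as `headerAfter` does, also covers step 4 for the new component.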
317+
318+
### Custom Authorization Policies
319+
320+
See the [Authorization Framework](authz.md) documentation for details on writing Cedar policies.
321+
322+
### Custom Audit Events
323+
324+
The audit middleware can be extended to capture additional event types and data fields based on your requirements.
325+
326+
## Troubleshooting
327+
328+
### Common Issues
329+
330+
**Middleware Order Problems**:
331+
- Ensure authentication runs before authorization
332+
- Ensure MCP parsing runs before authorization
333+
- Check that all required middleware is included in tests
334+
335+
**Context Data Missing**:
336+
- Verify middleware order is correct
337+
- Check that upstream middleware completed successfully
338+
- Ensure context keys are correctly defined and used
339+
340+
**Performance Issues**:
341+
- Monitor middleware execution time
342+
- Check for inefficient policy evaluation
343+
- Consider enabling authorization result caching
344+
345+
### Debug Information
346+
347+
Enable debug logging to see middleware execution:
348+
```bash
export LOG_LEVEL=debug
thv run --transport sse --name my-server my-image:latest
```
353+
354+
Debug logging shows each middleware component's execution and the data passed between components.
