๐Ÿ‘ค
User
Browser
โ†’
API Gateway
Public
โ†’
ฮป
Lambda
10.200.1.91
โ†’
๐Ÿ”€
TGW
tgw-xxx
โ†’
SSL Decrypt โœ…
Prisma
Prisma AIRS
Firewall
โ†’
๐Ÿ”€
TGW
Return
โ†’
๐Ÿง 
Bedrock
10.200.2.226

๐Ÿ”ต AI Chatbot - Direct

Simple Q&A with Web Search

10.200.1.91 (Lambda ENI) โ†’ 10.200.2.226 (Bedrock)
๐Ÿ‘‹ Hi! I'm powered by Claude Sonnet 4.6 via AWS Bedrock.

๐Ÿ”“ SSL Decryption Enabled
Traffic inspected at Prisma AIRS โ€ข Full payload visibility
Ask me anything!

๐Ÿ”‘ Key Difference: Tool Calling vs Direct Response

๐Ÿ”ต Direct Chatbot
User: "What is Paris?"
Flow: 1 HTTP POST โ†’ Direct Answer
Sessions: 1 TCP, 1 HTTP POST
Duration: ~3-5 seconds
Tools: None (answers from knowledge)
๐ŸŸข Travel Agent (Tool Calling)
User: "Find flights to Paris"
Flow: 2 HTTP POSTs in loop
Sessions: 1 TCP, 2 HTTP POSTs
Duration: ~10-15 seconds
Tools: search_flights (real data)

๐Ÿ“ก Network Path (Same as Direct Chatbot)

๐Ÿ‘ค
User
Browser
โ†’
API Gateway
Public
โ†’
Tool Orchestration
ฮป
Agent Lambda
10.200.1.91
โ†’
๐Ÿ”€
TGW
tgw-xxx
โ†’
SSL Decrypt โœ…
Prisma
Prisma AIRS
Sees ALL HTTP Requests
โ†’
๐Ÿ”€
TGW
Return
โ†’
๐Ÿง 
Bedrock
10.200.2.226
๐Ÿ”„ What is Tool Calling? (Example: "Find flights to Paris")
โฑ๏ธ 00:00 - TCP Connection Established
Lambda (10.200.1.147) โ†’ Bedrock (10.200.2.226)
Connection stays OPEN (HTTP Keep-Alive)
๐Ÿ“ก Iteration 1 - HTTP POST #1 (00:01)
โ†’ Request: Query + 4 tool definitions (search_flights, search_hotels, etc.)
โ†’ Goes through: Lambda โ†’ TGW โ†’ Prisma AIRS (SSL decrypt) โ†’ Bedrock
โ† Response (00:05): stopReason="tool_use"
โ† Claude says: "I need search_flights tool with params: origin=NYC, dest=Paris"
โš ๏ธ Firewall sees: HTTP POST /converse (request + response)
๐Ÿ”ง Tool Execution (00:05 - LOCAL, NO NETWORK)
โ†’ Lambda executes search_flights() locally (mock data)
โ†’ Returns: [Flight 1: $715, Flight 2: $728, Flight 3: $747]
โฑ๏ธ Takes ~0.1 seconds (no network call)
๐Ÿ“ก Iteration 2 - HTTP POST #2 (00:06)
โ†’ Request: Tool results (flight data)
โ†’ SAME TCP connection (reused)
โ†’ Goes through: Lambda โ†’ TGW โ†’ Prisma AIRS (SSL decrypt) โ†’ Bedrock
โ† Response (00:15): stopReason="end_turn"
โ† Claude returns: Formatted answer with flight table
โš ๏ธ Firewall sees: HTTP POST /converse (request + response)
โฑ๏ธ 00:15 - TCP Connection Closed
Total Duration: 15 seconds
HTTP POSTs: 2 (both visible on Prisma AIRS)
๐Ÿ” What Prisma AIRS Firewall Sees
Traffic Log Entry:
Source: 10.200.1.147 (Lambda ENI)
Destination: 10.200.2.226:443 (Bedrock)
Protocol: SSL/HTTPS
Duration: 15 seconds
Sessions: 1 TCP session
Decrypted: Yes โœ…
Bytes Transferred: ~8 KB
HTTP Transactions (drill into session details):
POST #1: /model/us.anthropic.claude-sonnet-4-6/converse
Time: 00:01
Request Body: Query + tool definitions
Response: tool_use (search_flights)

POST #2: /model/us.anthropic.claude-sonnet-4-6/converse
Time: 00:06
Request Body: Tool results
Response: Final formatted answer
โœ… Summary: Tool Calling Traffic Pattern
Key Points:
โ€ข ONE TCP session from Lambda to Bedrock (HTTP Keep-Alive)
โ€ข MULTIPLE HTTP POST requests within that session (1 per iteration)
โ€ข All requests visible on Prisma AIRS with SSL decryption
โ€ข Tool execution happens locally in Lambda (no network call)
โ€ข Same network path as Direct Chatbot (10.200.1.91 โ†’ Firewall โ†’ 10.200.2.226)
โ€ข Difference: Direct = 1 HTTP POST, Tool Calling = 2+ HTTP POSTs

๐ŸŸข AI Travel Agent

Autonomous Tool Calling (Flights, Hotels, Budget, Itinerary)

๐Ÿค– Converse API + Tools
10.200.1.147 (Lambda ENI) โ†’ 10.200.2.226 (Bedrock)
๐Ÿ‘‹ Hi! I'm an AI Travel Agent powered by Claude Sonnet 4.6 with autonomous tool calling.

๐Ÿ› ๏ธ Available Tools:
โœˆ๏ธ search_flights โ€ข ๐Ÿจ search_hotels โ€ข ๐Ÿ’ฐ calculate_budget โ€ข ๐Ÿ—บ๏ธ create_itinerary
Try: "Find flights from NYC to Paris" or click "๐Ÿ“ Sample Prompts" for examples!
๐Ÿ“ก REQUEST 1: Knowledge Base Retrieval (bedrock-agent-runtime.retrieve)
๐Ÿ‘ค
User
Browser
โ†’
API Gateway
Public
โ†’
ฮป
RAG Lambda
10.200.3.15
โ†’
๐Ÿ”€
TGW
tgw-xxx
โ†’
SSL Decrypt โœ…
Prisma
Prisma AIRS
Firewall
โ†’
๐Ÿ”€
TGW
Return
โ†’
๐Ÿ”
Bedrock KB
10.200.2.19
โ†’
๐Ÿ”Ž
OpenSearch
Vector Search
โ†’
๐Ÿ“ฆ
S3
Retrieve Chunks
๐Ÿ”ฅ Firewall sees: Query in REQUEST + S3 chunks in RESPONSE
๐Ÿ“ก REQUEST 2: Answer Generation (bedrock-runtime.invoke_model)
ฮป
RAG Lambda
With S3 Chunks
โ†’
๐Ÿ”€
TGW
tgw-xxx
โ†’
SSL Decrypt โœ…
Prisma
Prisma AIRS
Firewall
โ†’
๐Ÿ”€
TGW
Return
โ†’
๐Ÿง 
Bedrock Runtime
10.200.2.226
โ†’
๐Ÿค–
Claude Model
Generate Answer
๐Ÿ”ฅ Firewall sees: S3 chunks in REQUEST prompt + Answer in RESPONSE
๐Ÿ”„ What is RAG Dual Inspection? (Example: "What's in the knowledge base?")
โฑ๏ธ 00:00 - REQUEST 1: Retrieve from Knowledge Base
Lambda โ†’ TGW โ†’ Prisma AIRS (SSL decrypt) โ†’ Bedrock KB (10.200.2.19)
โ†’ API: bedrock-agent-runtime.retrieve() with query
โ†’ OpenSearch vector search + S3 chunk retrieval
โ† Response (00:02): Returns top-K chunks from S3
โš ๏ธ Firewall sees: Query in REQUEST + S3 chunks in RESPONSE
๐Ÿ“ก REQUEST 2: Generate Answer with Chunks
Lambda โ†’ TGW โ†’ Prisma AIRS (SSL decrypt) โ†’ Bedrock Runtime (10.200.2.226)
โ†’ API: bedrock-runtime.invoke_model()
โ†’ Prompt includes: Query + S3 chunks from REQUEST 1
โ† Response (00:05): Claude generates answer based on chunks
โš ๏ธ Firewall sees: S3 chunks in REQUEST prompt + Answer in RESPONSE
โœ… Result: Dual Inspection
Total Duration: ~5 seconds
HTTP POSTs: 2 (both visible on Prisma AIRS)
๐Ÿ”ฅ Firewall sees S3 data TWICE:
1. In retrieve() response (chunks with scores)
2. In invoke_model() request (chunks in prompt)
๐Ÿ”ฅ What Prisma AIRS Firewall Sees (RAG)
๐Ÿ“Š Traffic Log Entry 1 (bedrock-agent-runtime.retrieve):
โ€ข Source: 10.200.3.15:xxxxx (RAG Lambda)
โ€ข Dest: 10.200.2.19:443 (Bedrock KB VPC Endpoint)
โ€ข URL: /knowledgebases/{kb-id}/retrieve
โ€ข Method: POST
โ€ข Decrypted: Yes โœ…
โ€ข Request Body: {"input": {"text": "user query"}}
โ€ข Response Body: S3 chunks with scores

๐Ÿ“Š Traffic Log Entry 2 (bedrock-runtime.invoke_model):
โ€ข Source: 10.200.3.15:xxxxx (RAG Lambda)
โ€ข Dest: 10.200.2.226:443 (Bedrock Runtime VPC Endpoint)
โ€ข URL: /model/us.anthropic.claude-sonnet-4-6/invoke
โ€ข Method: POST
โ€ข Decrypted: Yes โœ…
โ€ข Request Body: {"messages": [{"content": "user query + S3 chunks"}]}
โ€ข Response Body: Claude's answer based on chunks

โœ… DLP Use Case:
โ€ข Inspect S3 knowledge base content for PII/sensitive data
โ€ข Block SSN, credit cards, credentials in both requests
โ€ข Monitor data exfiltration from knowledge base
โ€ข Perfect visibility: S3 data appears in BOTH network packets
๐ŸŽฏ Key Insights
โ€ข RAG = 2 API calls: retrieve() + invoke_model()
โ€ข Dual Inspection: Firewall sees S3 chunks TWICE (retrieve response + invoke request)
โ€ข Network Path: Lambda (10.200.3.15) โ†’ Firewall โ†’ KB (10.200.2.19) + Runtime (10.200.2.226)
โ€ข Perfect for DLP: Full visibility into knowledge base content in transit

๐ŸŸ  RAG Knowledge Base

Retrieval-Augmented Generation with Dual Inspection

10.200.3.15 (Lambda) โ†’ 10.200.2.19 (KB) + 10.200.2.226 (Runtime)
๐Ÿ‘‹ Hi! I'm powered by Claude Sonnet 4.6 with RAG (Retrieval-Augmented Generation) via AWS Bedrock.

๐Ÿ”“ SSL Decryption Enabled
Traffic inspected at Prisma AIRS โ€ข Full payload visibility
Ask me about documents in the knowledge base!

๐Ÿš€ Deploy This Architecture

Complete infrastructure-as-code package to replicate this entire setup in your AWS account. Deploy all 3 AI chatbots with Prisma AIRS firewall providing full SSL decryption and traffic visibility.

๐Ÿ“ฆ
GitHub Repository
Complete Terraform modules and documentation
View on GitHub โ†’
โฑ๏ธ
~60 minutes
Total deployment time
๐Ÿ—๏ธ
4 VPCs
With Transit Gateway
๐Ÿค–
3 Chatbots
Direct, Agent, RAG

Required Before Deployment

Infrastructure (Must Exist)

โœ“ Transit Gateway - Deployed in us-east-2, TGW ID required
โœ“ Security VPC - 10.200.0.0/24 with Prisma AIRS firewall
โœ“ Prisma AIRS - SSL decryption enabled
โœ“ NAT Gateway - In Security VPC for internet access

AWS Account

โœ“ Admin or PowerUser access
โœ“ AWS Bedrock enabled in us-east-2
โœ“ Model access: Claude Sonnet 4.6, Titan Embeddings v2

Tools

โœ“ Terraform 1.5+
โœ“ AWS CLI 2.x
โœ“ jq (JSON processor)
โœ“ Python 3.9+

API Keys

โœ“ SerpAPI key from serpapi.com (free tier: 100 searches/month)

Creates VPCs, Transit Gateway attachments, IAM roles, and Bedrock VPC endpoints.

cd terraform/shared-infrastructure/

# Configure
cp terraform.tfvars.example terraform.tfvars
vi terraform.tfvars  # Update: aws_account_id, tgw_id, prisma_airs_eni_ip

# Deploy
terraform init
terraform plan
terraform apply
โœ“ What Gets Created:
โ€ข 4 VPCs (Chatbot, Endpoints, RAG, plus 3 future VPCs)
โ€ข Transit Gateway attachments and route tables
โ€ข IAM roles for Lambda functions
โ€ข 2 Bedrock VPC Endpoints (Runtime + Agent Runtime)

Deploys Lambda function, API Gateway, S3 bucket, and CloudFront distribution.

cd ../direct-architecture/

# Configure
cp terraform.tfvars.example terraform.tfvars
vi terraform.tfvars  # Update: aws_account_id, serpapi_key

# Deploy
terraform init
terraform plan
terraform apply

# Save outputs
terraform output api_gateway_url
terraform output cloudfront_url
โœ“ What Gets Created:
โ€ข Lambda function in Chatbot VPC
โ€ข API Gateway with CORS enabled
โ€ข S3 bucket and CloudFront distribution for web UI
โ€ข Security groups and IAM permissions

Deploys Travel Agent Lambda with tool calling capabilities (flights, weather, hotels).

cd ../travel-agent-architecture/

# Configure
cp terraform.tfvars.example terraform.tfvars
vi terraform.tfvars  # Update: aws_account_id, serpapi_key

# Deploy
terraform init
terraform plan
terraform apply

# Save outputs
terraform output api_gateway_url
โœ“ What Gets Created:
โ€ข Lambda function in Chatbot VPC (same as Direct)
โ€ข API Gateway with CORS enabled
โ€ข Tool definitions (search_flights, get_weather, book_hotel)
โ€ข Security groups

Deploys RAG Lambda, Bedrock Knowledge Base, and OpenSearch Serverless collection.

cd ../rag-architecture/

# Configure
cp terraform.tfvars.example terraform.tfvars
vi terraform.tfvars  # Update: aws_account_id

# Deploy
terraform init
terraform plan
terraform apply

# Upload sample documents
S3_BUCKET=$(terraform output -raw s3_bucket_name)
aws s3 cp sample-doc.pdf s3://${S3_BUCKET}/

# Trigger Knowledge Base sync
KB_ID=$(terraform output -raw knowledge_base_id)
DS_ID=$(terraform output -raw data_source_id)
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id $KB_ID \
  --data-source-id $DS_ID
โœ“ What Gets Created:
โ€ข Lambda function in RAG VPC
โ€ข Bedrock Knowledge Base
โ€ข OpenSearch Serverless collection
โ€ข S3 bucket for documents
โ€ข API Gateway with CORS enabled

Test Direct Chatbot

DIRECT_API=$(terraform output -raw api_gateway_url)

curl -X POST $DIRECT_API \
  -H "Content-Type: application/json" \
  -d '{"query":"What is AWS Bedrock?"}'

Test Travel Agent

AGENT_API=$(terraform output -raw api_gateway_url)

curl -X POST $AGENT_API \
  -H "Content-Type: application/json" \
  -d '{"query":"Plan a trip to Paris"}'

Test RAG Chatbot

RAG_API=$(terraform output -raw api_gateway_url)

curl -X POST $RAG_API \
  -H "Content-Type: application/json" \
  -d '{"query":"Summarize the documents"}'

Verify Prisma AIRS Traffic

1. On Prisma AIRS: Monitor โ†’ Logs โ†’ Traffic
2. Filter: addr.src in 10.200.1.0/24 or addr.src in 10.200.3.0/24
3. Verify: Decrypted: Yes โœ“
4. Check full URL path visible

Lambda Timeout

Symptom: Task timed out after 300 seconds
Fix: Verify VPC route table has default route to Transit Gateway
aws ec2 describe-route-tables \
  --filters "Name=tag:Name,Values=*chatbot*"

Prisma AIRS Not Seeing Traffic

Symptom: No traffic logs in firewall
Fix:
1. Verify TGW route tables route 10.200.0.0/16 through Security VPC
2. Check Prisma AIRS routes include 10.200.2.0/24
3. Verify TGW attachments are active

RAG Returns No Results

Symptom: Empty responses from RAG chatbot
Fix: Upload documents and wait 5-10 minutes for indexing
aws s3 cp documents/ s3://YOUR_BUCKET/ --recursive

aws bedrock-agent start-ingestion-job \
  --knowledge-base-id YOUR_KB_ID \
  --data-source-id YOUR_DS_ID