# Implementation Prompt: sh0 AI Phase 6 (Web Search + URL Browsing), Phase 6.5 (File & Image Uploads), & Phase 8 (Specialist Agents)

> **Context:** sh0-core is a self-hosted deployment platform (Rust + Svelte 5). It already has an MCP server with 25 tools (Phases 1-5), an AI gateway at sh0-website, a prepaid wallet billing system, and an AI Sandbox (Alpine container per app). This prompt covers 3 features to implement in a single session.
>
> **Reference:** Read `.../sh0.dev/AI-FEATURES.md` for the complete AI architecture. Read `.../sh0.dev/deblo-pro-features-reference.md` for Deblo Pro's approach (sections 9 and 13).

---

## Phase 6: Web Search & URL Browsing (2 new MCP tools)

### Why

Users ask the AI questions about error messages, framework documentation, and best practices. Currently, the AI can only answer from its training data. Adding web search and URL browsing lets the AI find up-to-date information, read documentation pages, and cite sources.

### What to Implement

#### Step 1: Web search service

**New file:** `crates/sh0-api/src/services/web.rs`

Two functions:

1. **`web_search(query: &str, max_results: usize) -> Result<WebSearchResult>`**
   - Provider: Tavily AI Search API (`https://api.tavily.com/search`)
   - API key from environment variable `TAVILY_API_KEY` or from sh0 settings table
   - Parameters: `query`, `max_results` (default 5), `search_depth: "basic"`
   - Return: `WebSearchResult { answer: Option<String>, results: Vec<SearchResult> }`
   - Each `SearchResult`: `{ title, url, content (max 1500 chars), score }`
   - Truncate each result content to 1,500 characters

2. **`browse_url(url: &str, max_chars: usize) -> Result<BrowsedPage>`**
   - Provider: Jina Reader API (`https://r.jina.ai/{url}`)
   - API key from environment variable `JINA_API_KEY` or from sh0 settings table (optional -- Jina has a free tier)
   - Returns markdown-formatted page content
   - Truncate to `max_chars` (default 8,000)
   - Return: `BrowsedPage { title: Option<String>, content: String, url: String }`
   - Handle errors gracefully (timeout, 404, blocked)

Use `reqwest` for HTTP calls with a 30-second timeout. Add `mod web;` to `crates/sh0-api/src/services/mod.rs`.

**Reference implementation:** `.../deblo.ai/backend/app/services/web_tools.py`

#### Step 2: Two new MCP tools

Add to `crates/sh0-api/src/mcp/tools.rs`:

1. **`web_search`** (risk: read)
   - Params: `query` (string, required), `max_results` (integer, optional, default 5)
   - Calls `services::web::web_search()`
   - Returns formatted results with URLs and snippets
   - Audit logged: `mcp:web_search`

2. **`browse_url`** (risk: read)
   - Params: `url` (string, required)
   - Calls `services::web::browse_url()`
   - Returns page content as markdown
   - Audit logged: `mcp:browse_url`

Add manual tool definitions in `crates/sh0-api/src/mcp/openapi.rs` (follow the existing pattern for `get_app_logs` and sandbox tools). Update the test assertion from 25 to 27 tools.

#### Step 3: Settings for API keys

Add to the settings table (or environment variables):

- `TAVILY_API_KEY` -- required for web_search
- `JINA_API_KEY` -- optional for browse_url

If the key is not configured, the tool should return a clear error message: `"Web search is not configured. Ask your administrator to set the TAVILY_API_KEY in Settings > AI."`

**Settings UI:** Add API key fields to the dashboard Settings page AI section (if one exists) or to AppSettings. Look at how the existing settings pattern works in `crates/sh0-db/src/models/setting.rs` and the dashboard settings page.
#### Step 4: Dashboard updates

- **ProcessingSteps.svelte** (`dashboard/src/lib/components/ai/ProcessingSteps.svelte`): add icons/labels for `web_search` (Globe icon, "Searching the web...") and `browse_url` (BookOpen icon, "Reading page...")
- **AiCapabilitiesModal.svelte** (`dashboard/src/lib/components/ai/AiCapabilitiesModal.svelte`): add a 6th "Web" category with the 2 tools
- **i18n**: add keys in all 5 language files (`dashboard/src/lib/i18n/{en,fr,es,pt,sw}.ts`)
- **TOOL_LABELS**: update `dashboard/src/lib/ai-tools.ts` with the 2 new tool names

#### Step 5: System prompt update

Update `sh0-website/src/lib/server/ai-system-prompt.ts` (the MCP system prompt variant) to mention web search:

```
When the user asks about error messages, documentation, or current best practices, use web_search to find up-to-date information.
Use browse_url to read specific documentation pages.
Always cite your sources with URLs.
```

#### Step 6: sh0-website /ai page

The `/ai` page on sh0-website likely has a "Coming Soon" section listing "Web Search". Remove it from "Coming Soon" and add it to the existing features list. Check `sh0-website/src/routes/ai/+page.svelte`.

---

## Phase 6.5: File & Image Uploads in Chat

### Why

The ChatInput component has two placeholder buttons (Paperclip for files, Image for images) that are currently disabled with "Coming Soon" tooltips. Anthropic's Claude API natively supports image (JPEG, PNG, GIF, WebP) and PDF content blocks in messages. Users need to upload screenshots of error pages, share PDF documentation, or attach configuration files for the AI to analyze.
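As a rough illustration of the three upload paths mentioned above (images, PDFs, and plain-text files), the sketch below maps a filename to the kind of content block it would become. `attachmentKind`, `extensionOf`, and `base64Length` are hypothetical helper names, not existing sh0 code, and treating every non-image, non-PDF file as text is an assumption made only for this example.

```typescript
// Hypothetical helpers: not part of the existing dashboard code.
type AttachmentKind = 'image' | 'document' | 'text';

// Image formats this section names as natively supported.
const IMAGE_MEDIA_TYPES = new Map<string, string>([
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp']
]);

// Lower-cased extension without the dot, or '' when there is none.
function extensionOf(filename: string): string {
  const dot = filename.lastIndexOf('.');
  return dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
}

function attachmentKind(filename: string): { kind: AttachmentKind; mediaType: string } {
  const ext = extensionOf(filename);
  const imageType = IMAGE_MEDIA_TYPES.get(ext);
  if (imageType) return { kind: 'image', mediaType: imageType };
  if (ext === 'pdf') return { kind: 'document', mediaType: 'application/pdf' };
  // Assumption: anything else is read as text and inlined into the message.
  return { kind: 'text', mediaType: 'text/plain' };
}

// Base64 turns every 3 input bytes into 4 output characters (padded).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}
```

The base64 helper is a reminder that an encoded upload is about a third larger than the file on disk, which matters wherever request size is capped.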
### Current State

**Placeholders to replace:**

- `dashboard/src/lib/components/ai/ChatInput.svelte` -- lines 84-102: two disabled buttons with `cursor-default` and `title={t('common.coming_soon')}`
- The `send()` function in `dashboard/src/lib/stores/ai.svelte.ts` only accepts a `string` for `userContent`
- The `streamChat()` function in `dashboard/src/lib/ai.ts` sends `messages` with `content: string | ContentBlock[]`
- The gateway at `sh0-website/src/routes/api/ai/chat/+server.ts` passes messages through to the Anthropic API

**Key insight:** The Anthropic API already supports multimodal content blocks. The `ContentBlock` type in `ai.ts` just needs `image` and `document` types added. The gateway passes messages through, so the main work is on the dashboard side (file selection, preview, base64 encoding, message format).

### What to Implement

#### Step 1: Extend ContentBlock types

Update `dashboard/src/lib/ai.ts`:

```typescript
export type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } }
  | { type: 'document'; source: { type: 'base64'; media_type: 'application/pdf'; data: string }; title?: string }
  | { type: 'tool_use'; id: string; name: string; input: Record<string, unknown> }
  | { type: 'tool_result'; tool_use_id: string; content: string; is_error?: boolean };
```

#### Step 2: File upload state and logic in ChatInput

Update `dashboard/src/lib/components/ai/ChatInput.svelte`:

1. **Add state for attached files:**

   ```typescript
   // Element type assumed to match the attachment shape accepted by send() in Step 3.
   let attachments = $state<Array<{ type: 'image' | 'document' | 'text'; data: string; mediaType: string; filename: string }>>([]);
   ```

2. **Replace the disabled Paperclip button** with a working file picker:
   - Accepts: `.pdf, .txt, .json, .yaml, .yml, .toml, .env, .md, .csv, .log, .conf, .xml, .html, .css, .js, .ts, .py, .rs, .go, .sh, .sql, .dockerfile`
   - For PDF files: create a `document` content block (base64-encoded)
   - For text files: read as text, create a `text` content block with the file content wrapped as: `[File: {filename}]\n\`\`\`\n{content}\n\`\`\``
   - Max file size: 10 MB for PDFs, 1 MB for text files
   - Multiple files allowed (up to 5)

3. **Replace the disabled Image button** with a working image picker:
   - Accepts: `.jpg, .jpeg, .png, .gif, .webp`
   - Create an `image` content block with base64-encoded data
   - Max file size: 5 MB per image (Anthropic limit: ~20 MB base64)
   - Show thumbnail preview in the input area
   - Multiple images allowed (up to 4 per message)

4. **Show attachment preview strip** between textarea and toolbar:
   - Small thumbnail cards for images (with X to remove)
   - File icon + filename for documents/text (with X to remove)
   - Total attachment count badge if > 0

5. **Add hidden file inputs:**

   ```svelte
   <!-- Reconstructed markup: fileInput, imageInput, handleFileSelect, and handleImageSelect are assumed names -->
   <input type="file" hidden multiple bind:this={fileInput} onchange={handleFileSelect}
     accept=".pdf,.txt,.json,.yaml,.yml,.toml,.env,.md,.csv,.log,.conf,.xml,.html,.css,.js,.ts,.py,.rs,.go,.sh,.sql,.dockerfile" />
   <input type="file" hidden multiple bind:this={imageInput} onchange={handleImageSelect}
     accept=".jpg,.jpeg,.png,.gif,.webp" />
   ```

#### Step 3: Update send function

Update `dashboard/src/lib/stores/ai.svelte.ts`:

1. **Change `send()` signature** to accept attachments:

   ```typescript
   export async function send(
     userContent: string,
     attachments?: Array<{ type: 'image' | 'document' | 'text'; data: string; mediaType: string; filename: string }>,
     serverContext?: { version?: string; appCount?: number; plan?: string }
   )
   ```

2. **Build content blocks** when attachments are present:

   ```typescript
   if (attachments && attachments.length > 0) {
     const blocks: ContentBlock[] = [];
     for (const att of attachments) {
       if (att.type === 'image') {
         blocks.push({ type: 'image', source: { type: 'base64', media_type: att.mediaType, data: att.data } });
       } else if (att.type === 'document') {
         blocks.push({ type: 'document', source: { type: 'base64', media_type: 'application/pdf', data: att.data }, title: att.filename });
       } else {
         // Text files are inlined as a fenced block prefixed with the filename
         blocks.push({ type: 'text', text: `[File: ${att.filename}]\n\`\`\`\n${att.data}\n\`\`\`` });
       }
     }
     // The user's own message goes last, after all attachments
     blocks.push({ type: 'text', text: userContent.trim() });
     userMsg = { role: 'user', content: blocks };
   }
   ```

3. **Update `handleSend()` in `+page.svelte`** to collect attachments from ChatInput and pass them to `send()`.

#### Step 4: Update the chat page

Update `dashboard/src/routes/(app)/ai/+page.svelte`:

1. **ChatInput props**: Add `onattachmentschange` callback or pass attachments up
2. **`handleSend()`**: Collect attachments from ChatInput, pass to `send()`, then clear attachments

#### Step 5: Update UserBubble for multimodal display

Update `dashboard/src/lib/components/ai/UserBubble.svelte`:

1. Accept `content: string | ContentBlock[]` instead of just `string`
2. When content is an array, render:
   - Image blocks as `<img>` tags (using `data:${media_type};base64,${data}`)
   - Document blocks as a file card (PDF icon + title)
   - Text blocks as regular text
3. Update the message rendering in `+page.svelte` to pass full content to UserBubble

#### Step 6: Gateway changes (sh0-website)

The gateway (`sh0-website/src/routes/api/ai/chat/+server.ts`) passes messages directly to the Anthropic SDK as `Anthropic.Messages.MessageParam[]` (line 82: `messages.slice(-MAX_MESSAGES)`). The Anthropic SDK already accepts `image` and `document` content blocks natively, so the gateway passes them through.

**Required changes:**

1. **Body size limit**: Base64-encoded images can be large (5 MB image = ~6.7 MB base64).
   SvelteKit's adapter-node enforces a request body size limit of 512 KB by default -- **this MUST be increased to at least 25 MB** to support multiple image attachments:
   - Check `svelte.config.js` to confirm which adapter is in use, and `hooks.server.ts` for any custom body handling
   - adapter-node reads the limit from the `BODY_SIZE_LIMIT` environment variable at runtime, not from `svelte.config.js`; set it to 25 MB (`26214400` bytes; recent adapter-node versions also accept a unit suffix such as `25M`)

2. **Message sanitization**: Verify that the gateway does NOT strip or transform content blocks. Check lines around `apiMessages` construction. Currently it just does `messages.slice(-MAX_MESSAGES)`, which is correct.

3. **System prompt update**: Update `ai-system-prompt.ts` to mention that users can upload images and PDFs:

   ```
   Users can upload images (screenshots, diagrams) and PDF documents for you to analyze.
   When an image is attached, describe what you see and relate it to the user's question.
   When a PDF is attached, summarize the relevant content and answer questions about it.
   ```

### Supported formats (Anthropic native)

| Type | Formats | Max size | Content block |
|------|---------|----------|---------------|
| Images | JPEG, PNG, GIF, WebP | 5 MB | `{ type: "image", source: { type: "base64", media_type, data } }` |
| PDF | PDF | 10 MB | `{ type: "document", source: { type: "base64", media_type: "application/pdf", data } }` |
| Text files | Any text | 1 MB | Inline as text block: `[File: name]\n\`\`\`content\`\`\`` |

### Verification

1. Upload a PNG screenshot -> AI describes what it sees
2. Upload a PDF document -> AI summarizes the content
3. Upload a `.yaml` config file -> AI reviews the configuration
4. Upload multiple images -> all appear in the user bubble
5. Remove an attachment before sending -> works correctly
6. File size validation -> rejects files over the limit with a toast/error
7. `npm run build` (dashboard) -- zero errors

### Files to modify

| File | Change |
|------|--------|
| `dashboard/src/lib/ai.ts` | Add `image` and `document` to `ContentBlock` type |
| `dashboard/src/lib/components/ai/ChatInput.svelte` | Replace placeholders with working file/image pickers, attachment preview strip |
| `dashboard/src/lib/stores/ai.svelte.ts` | Update `send()` to accept and encode attachments |
| `dashboard/src/routes/(app)/ai/+page.svelte` | Wire attachments from ChatInput to send(), update message rendering |
| `dashboard/src/lib/components/ai/UserBubble.svelte` | Render multimodal content (images, documents, text files) |
| `dashboard/src/lib/i18n/{en,fr,es,pt,sw}.ts` | Add keys: `ai.attach_file`, `ai.attach_image`, `ai.file_too_large`, `ai.max_attachments` |
| `sh0-website/src/routes/api/ai/chat/+server.ts` | Verify multimodal content blocks pass through unchanged |
| `sh0-website` runtime environment | Set adapter-node `BODY_SIZE_LIMIT` to 25 MB |
| `sh0-website/src/lib/server/ai-system-prompt.ts` | Add image/PDF upload instructions to system prompt |

---

## Phase 8: Specialist Agents (6 personas)

### Why

Different users have different needs. A DevOps engineer debugging networking wants different AI behavior than a DBA optimizing queries. Specialist agents are persona overlays on the base system prompt that adjust expertise, tool preferences, and response style. This is purely gateway + dashboard work -- no Rust changes needed.
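To make the overlay idea concrete, here is a minimal sketch of how a persona might be layered onto a base prompt. `AgentOverlay`, `composeSystemPrompt`, its parameters, and the sample strings are all hypothetical; the actual prompt builder in `ai-system-prompt.ts` may be structured quite differently.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface AgentOverlay {
  id: string;
  systemPromptOverlay: string;
}

// Join the base prompt, an optional persona overlay, and the capabilities text.
// Assumption: sections are separated by blank lines, with the overlay placed
// between the base prompt and the capabilities list.
function composeSystemPrompt(base: string, capabilities: string, agent?: AgentOverlay): string {
  const sections = [base];
  if (agent && agent.systemPromptOverlay.trim() !== '') {
    sections.push(agent.systemPromptOverlay.trim());
  }
  sections.push(capabilities);
  return sections.join('\n\n');
}

// Placeholder strings stand in for the real prompt text.
const dba: AgentOverlay = { id: 'dba', systemPromptOverlay: 'Think like a senior DBA.' };
const general = composeSystemPrompt('BASE', 'TOOLS');
const specialist = composeSystemPrompt('BASE', 'TOOLS', dba);
```

With no agent selected the prompt is unchanged, so the general assistant behaves exactly as it did before personas existed.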
### What to Implement

#### Step 1: Agent definitions

**New file:** `sh0-website/src/lib/server/ai-agents.ts`

```typescript
export interface Sh0Agent {
  id: string;
  name: string;
  role: string;
  icon: string; // Lucide icon name
  category: 'infrastructure' | 'database' | 'security' | 'development' | 'monitoring';
  description: string;
  systemPromptOverlay: string;
  suggestedPrompts: string[];
  preferredTools: string[];
}

export const SH0_AGENTS: Sh0Agent[] = [ ... ];

export function getAgent(id: string): Sh0Agent | undefined;
```

**6 agents to define:**

1. **DevOps Engineer** (`devops`)
   - Infrastructure, Docker, CI/CD, deployments, networking, reverse proxy
   - Preferred tools: `deploy_app`, `restart_app`, `scale_app`, `get_app_logs`, `sandbox_exec_command`
   - Prompt overlay: Think like a senior DevOps engineer. Focus on infrastructure reliability, container best practices, and deployment pipelines. When debugging, check logs first, then container state, then network connectivity. Suggest infrastructure improvements proactively.
   - Suggested prompts: "Why did my last deploy fail?", "How should I set up zero-downtime deployment?", "Check my app's resource usage"

2. **Database Administrator** (`dba`)
   - PostgreSQL, MySQL, Redis, MongoDB, backups, connection tuning, query optimization
   - Preferred tools: `list_databases`, `trigger_backup`, `sandbox_exec_command`, `get_app_logs`
   - Prompt overlay: Think like a senior DBA. Focus on data integrity, backup strategies, connection pooling, and query performance. Always consider backup/restore implications before suggesting destructive operations. Recommend proper indexing and connection limits.
   - Suggested prompts: "Optimize my PostgreSQL backup schedule", "Check database connection pool health", "Help me set up automated backups"

3. **Security Auditor** (`security`)
   - Authentication, 2FA, RBAC, SSL certificates, secrets management, vulnerability scanning
   - Preferred tools: `get_server_status`, `list_apps`, `list_domains`, `sandbox_exec_command`, `sandbox_check_connectivity`
   - Prompt overlay: Think like a security auditor. Check authentication configuration, SSL/TLS status, secret management, container permissions, and network exposure. Flag security issues with severity ratings. Always verify 2FA is enabled and API keys are properly scoped.
   - Suggested prompts: "Audit my server security configuration", "Check SSL certificates for all domains", "Review my RBAC setup"

4. **Full-Stack Developer** (`developer`)
   - Dockerfile review, build debugging, environment setup, framework-specific help
   - Preferred tools: `get_deployment_logs`, `get_app_logs`, `sandbox_exec_command`, `sandbox_read_file`, `web_search`
   - Prompt overlay: Think like a senior full-stack developer. Help with build issues, Dockerfile optimization, environment variable configuration, and framework-specific problems. Use web_search to find current documentation when debugging framework issues. Read build logs carefully for root cause analysis.
   - Suggested prompts: "Why is my build failing?", "Review my Dockerfile", "Help me set up environment variables"

5. **Site Reliability Engineer** (`sre`)
   - Monitoring, alerts, uptime, scaling, resource optimization, incident response
   - Preferred tools: `server_metrics`, `list_alerts`, `get_server_status`, `scale_app`, `sandbox_exec_command`
   - Prompt overlay: Think like an SRE. Focus on reliability, uptime, and incident response. Check resource utilization, alert configuration, and scaling policies. Recommend monitoring improvements and capacity planning. During incidents, prioritize mitigation over root cause analysis.
   - Suggested prompts: "Is my server healthy?", "Set up alerts for high CPU usage", "Help me right-size my containers"

6. **Network Engineer** (`network`)
   - DNS, domains, SSL/TLS, reverse proxy, firewall, connectivity debugging
   - Preferred tools: `list_domains`, `sandbox_check_connectivity`, `sandbox_exec_command`, `get_server_status`
   - Prompt overlay: Think like a network engineer. Focus on DNS configuration, domain routing, SSL/TLS certificates, reverse proxy rules, and connectivity issues. Use sandbox_check_connectivity and sandbox_exec_command (with dig, curl, nc) for diagnostics. Explain networking concepts clearly.
   - Suggested prompts: "Debug DNS for my domain", "Check SSL certificate status", "Test connectivity between my services"

#### Step 2: Agent selection in gateway

Update `sh0-website/src/routes/api/ai/chat/+server.ts`:

- Accept `agent_id` (optional string) in the POST request body
- Look up the agent via `getAgent(agent_id)`
- If found, inject `agent.systemPromptOverlay` into the system prompt (after the base prompt, before capabilities list)
- Include agent info in SSE metadata so dashboard knows which agent is active

#### Step 3: Dashboard -- Agent picker modal

**New component:** `dashboard/src/lib/components/ai/AgentPickerModal.svelte`

- Grid of 6 agent cards with icon, name, role, description
- Category filter tabs (or just show all 6 -- small enough)
- Click to select an agent
- "General Assistant" option to deselect (default, no agent overlay)
- Modal triggered from the chat page

**Reference:** `.../deblo.ai/frontend/src/lib/components/AgentPickerModal.svelte`

#### Step 4: Dashboard -- Agent integration in chat

Update `dashboard/src/routes/(app)/ai/+page.svelte`:

- Store selected agent ID in conversation state (`ai.svelte.ts` store)
- Send `agent_id` in chat API requests
- Show agent badge/name in chat header when active
- Show agent-specific `suggestedPrompts` in the empty chat state
- "Switch agent" or "Change assistant" button near the chat input or in the header

Update `dashboard/src/lib/stores/ai.svelte.ts`:

- Add `selectedAgentId: string | null` to the store
- Persist in localStorage per conversation (or globally)

#### Step 5: i18n

Add keys to all 5 language files (`dashboard/src/lib/i18n/{en,fr,es,pt,sw}.ts`):

- Agent names and descriptions (all 6)
- Category labels
- UI labels: "Choose an assistant", "Switch assistant", "General Assistant"
- AiCapabilitiesModal: add "Specialist Agents" category

#### Step 6: sh0-website /ai page

Remove "Specialist Agents" from the "Coming Soon" section on the `/ai` page and add it to the features list.

---

## Files to Read Before Starting

### Existing sh0 AI implementation (read ALL before coding)

**MCP server + tools (Rust):**

- `.../sh0.dev/sh0-core/crates/sh0-api/src/mcp/tools.rs` -- 25 tool executors, `execute_tool()` dispatch, `tool_risk()` classification
- `.../sh0.dev/sh0-core/crates/sh0-api/src/mcp/openapi.rs` -- Manual tool definitions, `manual_tool_definitions()`, tools_from_openapi test
- `.../sh0.dev/sh0-core/crates/sh0-api/src/mcp/transport.rs` -- MCP transport, scope enforcement
- `.../sh0.dev/sh0-core/crates/sh0-api/src/services/mod.rs` -- Services module (add `mod web;` here)

**AI gateway (sh0-website):**

- `.../sh0.dev/sh0-website/src/routes/api/ai/chat/+server.ts` -- 3-way router, SSE streaming
- `.../sh0.dev/sh0-website/src/lib/server/ai-system-prompt.ts` -- System prompt builder (3 variants)
- `.../sh0.dev/sh0-website/src/lib/server/ai-tools.ts` -- Tool splits

**Dashboard AI components:**

- `.../sh0.dev/sh0-core/dashboard/src/lib/components/ai/AiCapabilitiesModal.svelte`
- `.../sh0.dev/sh0-core/dashboard/src/lib/components/ai/ProcessingSteps.svelte`
- `.../sh0.dev/sh0-core/dashboard/src/lib/stores/ai.svelte.ts`
- `.../sh0.dev/sh0-core/dashboard/src/lib/ai-tools.ts` -- TOOL_LABELS
- `.../sh0.dev/sh0-core/dashboard/src/lib/i18n/en.ts`
- `.../sh0.dev/sh0-core/dashboard/src/routes/(app)/ai/+page.svelte`

**Deblo Pro reference (read for patterns, don't copy blindly):**

- `.../deblo.ai/backend/app/services/web_tools.py` -- Tavily + Jina implementation
- `.../deblo.ai/frontend/src/lib/utils/agents.ts` -- 101 agent definitions (pick patterns, not content)
- `.../deblo.ai/frontend/src/lib/components/AgentPickerModal.svelte` -- Agent picker UI
- `.../deblo.ai/backend/app/prompts/pro.py` -- Agent prompt overlays

**Full architecture reference:**

- `.../sh0.dev/AI-FEATURES.md` -- Complete AI implementation reference (Phases 1-5)

---

## Constraints

- **Axum 0.7.9** -- do NOT upgrade
- **SQLite** -- sh0 uses SQLite, not PostgreSQL
- **MCP protocol** -- new tools must be MCP tools. Add to `execute_tool()`, `tool_risk()`, and `manual_tool_definitions()` in the existing pattern
- **5 languages** -- i18n keys in en, fr, es, pt, sw for all new features
- **Backward compatible** -- existing 25 tools must continue working unchanged
- **Svelte 5 runes** -- use `$state`, `$derived`, `$effect`, `$props` (NOT Svelte 4 stores for new components)
- **No `unwrap()` or `expect()`** in Rust library crates -- use `?` and `thiserror`
- **Audit logging** -- all MCP tool calls must be audit logged with `mcp:` prefix

## Build Verification (after each phase)

```bash
# Rust
cd .../sh0.dev/sh0-core && cargo check && cargo test

# Dashboard
cd .../sh0.dev/sh0-core/dashboard && npm run build

# Website
cd .../sh0.dev/sh0-website && npm run build
```

## Post-Implementation

1. Update `.../sh0.dev/AI-FEATURES.md` with Phase 6, Phase 6.5, and Phase 8 sections
2. Update `.../sh0.dev/sh0-core/FEATURES-TODO.md`
3. Create session log at `.../sh0.dev/sh0-private-docs/session-logs/`
4. Draft audit prompt for a separate auditor session

## Engineering Workflow

Follow the mandatory build-audit-audit-approve workflow from `.../sh0.dev/CLAUDE.md`:

1. **Build** -- implement all three phases, verify builds pass
2. **Audit Round 1** -- separate session audits all new/modified files
3. **Audit Round 2** -- third session verifies fixes, catches anything missed
4. **Approval** -- primary session reviews audit results