Security & Privacy

ravenbot is designed with a "Security First" philosophy, especially since it often runs on home networks with access to sensitive data (like the Insight Vault).

🛡️ SSRF Protection (Server-Side Request Forgery)

All outbound network requests made by tools pass through a validation layer (internal/tools/validator.go → NewSafeClient).

How it works:

Blocked Ranges: By default, the bot blocks private and loopback IP addresses (e.g., 192.168.x.x, 10.x.x.x, 127.0.0.1).
Blocked Ports: Sensitive ports like 22 (SSH), 3306 (MySQL), and 5432 (PostgreSQL) are always blocked.
Local URLs: To reach a local service (such as a local Ollama instance), set ALLOW_LOCAL_URLS=true in your .env.
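The checks above can be sketched as a small validation function. This is an illustration only, not the code in internal/tools/validator.go: the names blockedPorts, isBlockedIP, and checkTarget are hypothetical, the link-local and unspecified checks are an added assumption, and the sketch only handles URLs whose host is a literal IP address.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// blockedPorts lists the ports named above; the real list may be longer.
var blockedPorts = map[string]bool{"22": true, "3306": true, "5432": true}

// isBlockedIP reports whether ip is a loopback, private, link-local,
// or unspecified address.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() || ip.IsUnspecified()
}

// checkTarget validates a URL whose host is a literal IP address.
// allowLocal stands in for ALLOW_LOCAL_URLS=true (an assumption about
// how the flag is wired in).
func checkTarget(rawURL string, allowLocal bool) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	// Sensitive ports are rejected regardless of allowLocal.
	if blockedPorts[u.Port()] {
		return fmt.Errorf("port %s is blocked", u.Port())
	}
	ip := net.ParseIP(u.Hostname())
	if ip == nil {
		// A real client would resolve the name and check every address.
		return errors.New("sketch only handles literal IP hosts")
	}
	if isBlockedIP(ip) && !allowLocal {
		return fmt.Errorf("address %s is in a blocked range", ip)
	}
	return nil
}

func main() {
	fmt.Println(checkTarget("http://192.168.1.10:11434/api", false))
	fmt.Println(checkTarget("http://192.168.1.10:11434/api", true))
}
```

Validating a URL before the request is made does not protect against a hostname that resolves to a different address at connect time, so a production client would typically repeat the IP check inside the dialer.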

💬 Channel & Chat Isolation

ravenbot strictly enforces access control to prevent unauthorized users from interacting with your bot.

Telegram: The bot ignores all messages that do not originate from the TELEGRAM_CHAT_ID specified in your .env.
Discord: The bot only listens and responds within the DISCORD_CHANNEL_ID specified in your .env.
Web UI: Sessions are cookie-based: each browser gets a unique session ID (the ravenbot_session cookie). Form-supplied session IDs are rejected to prevent session hijacking. The web server binds only to the configured port and has no authentication layer, so it is intended for local or private network use.

📂 Data Privacy

Local Persistence: All conversation summaries, research briefings, missions, and reminders are stored in a local SQLite database (data/ravenbot.db). This data stays on your machine; the only content sent to a cloud service is the prompt text passed to the LLM for processing.
Insight Vault: The VaultTools have built-in path traversal protection. Any path that resolves outside the directory specified by VAULT_PATH is rejected, even if a user prompts the bot to read it.
Logging: Logs are stored locally in the logs/ directory. Sensitive credentials (API keys, tokens) are automatically redacted. Message content and user input are never logged at INFO level. Web request logs strip port numbers from remote addresses, so only the IP address is recorded.

🔒 Input Validation

All web form inputs are validated server-side regardless of HTML5 constraints:

| Input | Max Length |
| --- | --- |
| Chat messages | 10,000 characters |
| Research topics | 1,000 characters |
| Reminder messages | 500 characters |
| Jules tasks | 5,000 characters |
| Watcher queries | 1,000 characters |

Numeric fields (watcher interval, export limit) are parsed with strconv.Atoi and rejected if they are not valid integers.

🔑 Credential Management

Environment Variables: Use .env for secrets. Never commit your .env file to source control.
MCP Env: Environment variables for MCP servers are expanded at runtime in cmd/bot/main.go. You can use $VARIABLE_NAME in config.json to safely inject secrets.

📡 Error Observability (Sentry)

ravenbot optionally integrates with Sentry for error monitoring via internal/sentrylog/. The integration is designed with privacy as the primary constraint.

What is sent

The error value attached to any slog.Error call (the Go error object itself — always developer-generated).
The log message string (always a developer-written static string like "Chat failed").
A log_message tag and a go_time tag on each event.
Stack traces on panics caught by the HTTP recovery middleware.

What is never sent

User message content or chat history.
Session IDs, Telegram/Discord user IDs, or any other attributes logged alongside errors.
Request bodies, cookies, or auth headers (scrubbed by BeforeSend hook).
User identity (event.User is always cleared before dispatch).
Hostnames or other PII that Sentry collects by default (SendDefaultPII: false).

How it works

internal/sentrylog/handler.go wraps the base slog.Handler. For any record at slog.LevelError or above, it walks the log attributes looking only for the error key and ignores all others, then calls sentry.CaptureException. If no error attribute exists, it calls sentry.CaptureMessage with the static log message.

The HTTP recoveryMiddleware in internal/web/server.go catches panics and reports them to Sentry as level=fatal before returning a 500.

Configuration

SENTRY_DSN=https://...@sentry.io/...
SENTRY_ENVIRONMENT=production   # or: development, staging

Leave SENTRY_DSN unset to disable Sentry entirely — all capture calls become no-ops.

🛡️ Best Practices

Keep it Updated: Regularly pull the latest version of ravenbot to ensure you have the latest security patches.
Use 0700/0600 Permissions: On Linux, ensure the data/ and logs/ directories are set to 0700 and sensitive files to 0600 (readable/writable only by the owner).
Limit Token Scope: When using GitHub integrations, use a Fine-grained Personal Access Token with the minimum permissions required for your tasks.
Web UI Network: The web interface has no authentication. Bind it to a private network or use a reverse proxy with authentication if exposing to the internet.
Compression Threshold: The compressionThreshold in config.json is validated on load and must be between 0 and 1.0. An out-of-range value falls back to 0.8 and logs a warning.
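The compressionThreshold check in the last item can be illustrated as follows. This is a sketch only: normalizeThreshold and defaultThreshold are hypothetical names, and treating NaN as invalid is an added assumption.

```go
package main

import (
	"fmt"
	"log/slog"
	"math"
)

// defaultThreshold is the fallback value named above.
const defaultThreshold = 0.8

// normalizeThreshold returns v if it lies in [0, 1], otherwise it logs
// a warning and returns the default.
func normalizeThreshold(v float64) float64 {
	if math.IsNaN(v) || v < 0 || v > 1 {
		slog.Warn("compressionThreshold out of range, using default",
			"value", v, "default", defaultThreshold)
		return defaultThreshold
	}
	return v
}

func main() {
	fmt.Println(normalizeThreshold(0.5), normalizeThreshold(1.7))
}
```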