Real incidents, real fixes.
This isn't a theoretical troubleshooting guide. Every entry below happened during a real build. Use the summary table to find your problem quickly, then expand the relevant incident for the full diagnosis, the raw fix, and a Claude Code prompt that applies the fix for you.
Quick reference: find your problem
Scan this table first. If your symptom matches, expand the relevant incident below for the full story.
| # | Symptom | Where | One-line fix |
|---|---|---|---|
| 1 | apt upgrade hangs silently on openssh-server | Phase 2 | Use DEBIAN_FRONTEND=noninteractive and --force-confold |
| 2 | sudo: a terminal is required to read the password | Any phase | Add temporary passwordless sudo |
| 3 | Unsupported Node.js version. Required >=24.0.0 | Phase 4 | Install Node.js 24 from NodeSource |
| 4 | Agent loads but responds generically | Phase 5 | Deploy to workspace/ with multi-file templates |
| 5 | Agent returns HTTP 404 or no response | Phase 4 | Clear the framework's default base_url |
| 6 | Two bots can't see each other in Telegram | Multi-agent | Use Discord for bot-to-bot communication; Telegram blocks it |
| 7 | Systemd service can't read shared env file | Phase 4 | Use system-level systemd with SupplementaryGroups |
| 8 | Bot replies create Discord threads | Phase 6 | Set DISCORD_AUTO_THREAD=false |
| 9 | Tool invocation logs appear in chat | Any phase | Set display.tool_progress: off |
| 10 | LLM request rejected: Could not serve request | Any phase | Test API key directly with curl |
| 11 | Telegram bot stops, logs show “pairing required” | Phase 4 | Approve device via CLI, update scopes |
| 12 | Docker install via convenience script fails on ARM64 | Phase 2 | Use the official APT repository method |
The 12 incidents
General troubleshooting principles
These are the meta-lessons from the build. They apply to almost every problem you'll hit.
1. Verify against the actual system, not the documentation
The single most repeated lesson. Documentation gets stale, frameworks evolve, and playbooks can be wrong. Before configuring anything, check what the system actually expects: ls the directory, cat the config, read the actual source if needed.
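As a minimal sketch of that habit, the helper below prints what is actually on disk before you configure anything. The function name `inspect_target` is hypothetical, and the paths you pass in are whatever your own framework uses.

```shell
# Hypothetical helper: show what actually exists before you trust the docs.
inspect_target() {
  target="$1"
  if [ -d "$target" ]; then
    # A directory: list everything, including hidden files.
    ls -la "$target"
  elif [ -f "$target" ]; then
    # A regular file: print it so you can read the real config.
    cat "$target"
  else
    # Neither: the path the documentation promised is not there.
    echo "missing: $target"
    return 1
  fi
}
```

Run it against the directory or config file a guide tells you to edit. A "missing" result is a strong hint that the guide and the installed version disagree.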
2. Test identity with a knowledge question, not a ping
Sending “hello” proves the bot is running. It does NOT prove the identity files loaded. Always test with: “What do you know about me and what are you here to do?”
3. One restart after all changes, not one restart per change
Multiple rapid restarts can break device pairings, session state, and cached configs. Make all your config changes, verify the files are correct, then restart once.
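One way to enforce this is to gate the restart behind a check of every file you touched. The sketch below assumes a systemd-managed service. The unit name `my-agent.service`, the `DRY_RUN` switch, and both function names are placeholders, and the check only confirms that each file exists and is non-empty, so substitute a real validator if your framework provides one.

```shell
# Placeholder unit name; replace with your own service.
SERVICE="${SERVICE:-my-agent.service}"

configs_ok() {
  # Succeeds only if every listed config file exists and is non-empty.
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "not ready: $f" >&2
      return 1
    fi
  done
}

restart_once() {
  # Verify all files first, then issue a single restart.
  if configs_ok "$@"; then
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} sudo systemctl restart "$SERVICE"
  else
    echo "skipping restart until configs are fixed" >&2
    return 1
  fi
}
```

Calling `restart_once` with every file you edited gives you one restart at the end, or none at all if a file is missing or empty.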
4. Always use DEBIAN_FRONTEND=noninteractive on servers
sudo DEBIAN_FRONTEND=noninteractive apt <command> -y -o Dpkg::Options::="--force-confold"
DEBIAN_FRONTEND=noninteractive stops dpkg from hanging on debconf prompts, and --force-confold keeps your existing config files when a package ships a new version.
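If you run this often, a small wrapper saves retyping the flags. This is only a sketch: `apt_safe` and the `DRY_RUN` switch are hypothetical names, and the wrapper adds nothing beyond the command shown above.

```shell
# Hypothetical wrapper around the non-interactive apt invocation.
apt_safe() {
  # With DRY_RUN set, the full command is printed instead of executed.
  ${DRY_RUN:+echo} sudo DEBIAN_FRONTEND=noninteractive apt "$@" -y \
    -o Dpkg::Options::="--force-confold"
}
```

For example, `DRY_RUN=1 apt_safe upgrade` prints the command so you can review it, and `apt_safe upgrade` runs it.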
5. Check the framework's template system before deploying identity files
Don't assume all frameworks use the same file structure. Some use a single file, some use a multi-file workspace directory. Read the docs, check the filesystem, understand the template loading order before writing identity content.
6. Platform limitations are real
Before designing a multi-agent collaboration pattern, verify that the underlying platform supports the communication pattern you need. Telegram bots can't see each other. Discord bots can. Verify first, build second.
7. The two-terminal safety net for anything SSH-related
For any change that could lock you out (SSH config, firewall rules, permission changes), keep a second terminal window open with an existing connection. If the change breaks the first connection, the second one is your escape route.
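For SSH config changes specifically, a syntax check before the reload adds a second layer of protection. The sketch below assumes OpenSSH on a systemd host, where `sshd -t` tests the configuration without applying it. The unit is typically named `ssh` on Debian and Ubuntu and `sshd` on some other distributions. The function name and the `DRY_RUN` switch are hypothetical.

```shell
# Hypothetical helper: validate sshd config, then reload only if it passes.
safe_ssh_reload() {
  unit="${1:-ssh}"   # assumed unit name; pass "sshd" where that applies
  if ${DRY_RUN:+echo} sudo sshd -t; then
    ${DRY_RUN:+echo} sudo systemctl reload "$unit"
    echo "reloaded; open a NEW connection before closing this one"
  else
    echo "sshd config invalid; nothing reloaded" >&2
    return 1
  fi
}
```

Keep the second terminal open even when the check passes. A syntactically valid config can still lock you out, for example by disabling the only authentication method you use.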
8. Read errors literally; don't assume
Error messages usually tell you exactly what's wrong. “Permission denied” means permissions. “Command not found” means the program isn't installed or isn't on your PATH. “Connection refused” means nothing is listening on that port. The literal reading is almost always correct.
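The literal reading maps directly to a first diagnostic command. The sketch below is illustrative only: `first_check` is a hypothetical helper, and the angle-bracket placeholders stand for the path or program name taken from your own error message.

```shell
# Hypothetical helper: suggest a first check based on the literal error text.
first_check() {
  case "$1" in
    *"Permission denied"*)
      # Who owns the file, and who are you running as?
      echo "ls -l <path>; id" ;;
    *"command not found"*|*"Command not found"*)
      # Is the program installed and on PATH?
      echo "command -v <name>; echo \$PATH" ;;
    *"Connection refused"*)
      # Is anything listening on that port?
      echo "ss -ltn" ;;
    *)
      echo "read the full message again" ;;
  esac
}
```

For instance, `first_check "bash: node: command not found"` suggests checking `command -v` and `$PATH` before reinstalling anything.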