The problem
Most developers use one AI tool — ChatGPT, Copilot, or Claude. That's like having a toolbox with only a hammer. Different tasks need different tools, and the real power comes from wiring them together.
Our actual stack
Here's what we run daily, each tool chosen for a specific reason:
| Category | Example tools | Why |
|---|---|---|
| AI model provider | Claude, GPT, Gemini | Reasoning, analysis, code, strategy |
| Voice transcription | Deepgram, Whisper, AssemblyAI | Convert voice notes to text instructions |
| Text-to-speech | Edge TTS, ElevenLabs | Narrate results as audio |
| Cloud storage sync | rclone, gsutil | CLI-native file sync, scriptable |
| Version control | gh CLI, GitLab CLI | Create branches, PRs, manage repos |
| Web search | Brave Search, SerpAPI | API-based research for agents |
Six categories, each serving a distinct function. Choose tools that work headless (no GUI required) and can be called from scripts.
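The headless requirement can be checked mechanically before a tool joins the stack. A minimal preflight sketch; the tools in the loop are placeholders for whatever your own stack uses:

```shell
#!/bin/sh
# Check that a tool is installed and callable from a script (no GUI needed).
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
        return 0
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Illustrative stack check; swap in the tools you actually rely on.
for tool in sh grep; do
    check_tool "$tool"
done
```

Running this from cron before the real pipeline catches a missing binary early, instead of mid-workflow.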
How we choose the right tool
The decision framework is simple:
For reasoning and generation: Claude. When we need analysis, writing, code, or strategic thinking — there's no substitute for a frontier model.
For audio processing: Test with your actual audio format. Different transcription services handle different encodings. We found that newer models don't always handle every format — sometimes an older, more battle-tested model works better. Always test with real data, not benchmarks.
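That lesson can be encoded directly: route each audio format to whichever model passed your own tests, instead of defaulting to the newest. A sketch; the model names and the mapping are illustrative assumptions, not any provider's recommendation:

```shell
#!/bin/sh
# Pick a transcription model based on the audio container we actually receive.
# The mapping is an example: fill it in from your own test results, not benchmarks.
pick_model() {
    case "$1" in
        *.ogg|*.opus) echo "battle-tested-model" ;;  # newest model failed here for us
        *.wav|*.mp3)  echo "newest-model" ;;
        *)            echo "fallback-model" ;;
    esac
}

pick_model voice-note.ogg
```

The point of making the routing explicit is that when a model fails on a new encoding, the fix is one line in the case statement.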
For file operations: CLI-based sync tools beat SDK libraries for simple tasks. One command to copy a file beats 20 lines of API boilerplate:
```shell
# Example: sync a file to cloud storage
your-sync-tool copy local-file.md remote:folder/
```
No SDK, no authentication code, no boilerplate.
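One refinement worth making is to stamp the destination with the date so repeated runs don't overwrite each other. A minimal sketch; the remote name and folder are placeholders:

```shell
#!/bin/sh
# Build a dated destination path, then sync with a CLI tool (rclone-style syntax).
build_dest() {
    # $1 = remote:folder prefix, $2 = date (YYYY-MM-DD)
    echo "$1/$2/"
}

DEST=$(build_dest "remote:backups" "$(date +%Y-%m-%d)")
echo "syncing to $DEST"
# rclone copy local-file.md "$DEST"   # uncomment once your remote is configured
```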
For version control: CLI tools let you create a branch, push, and open a PR in three commands:
```shell
git checkout -b feature/new-article
git push origin feature/new-article
gh pr create --title "feat: New article" --body "Description here"
```
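The three commands chain naturally into one helper. A sketch that derives the branch name from a title and keeps the actual git/gh calls commented out until confirmed; the `feature/` prefix and function names are illustrative:

```shell
#!/bin/sh
# Turn a human title into a branch slug, then branch, push, and open a PR.
slugify() {
    echo "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//'
}

open_pr() {
    title="$1"
    branch="feature/$(slugify "$title")"
    echo "would create branch: $branch"
    # git checkout -b "$branch"
    # git push origin "$branch"
    # gh pr create --title "$title" --body "Description here"
}

open_pr "New article"
```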
Experiment: building a voice-to-PR pipeline
Here's a real workflow that chains multiple tools:
- Voice input (messaging platform) → raw audio file saved to disk
- Transcription service converts audio → text instruction
- AI model interprets the instruction and writes code/content
- Version control CLI creates a branch, commits, pushes, opens a PR
- Cloud sync tool backs up artifacts to cloud storage
- TTS engine narrates a summary back as audio
- Messaging API delivers the voice note back to the user
Seven steps in one workflow, each tool handling what it's best at. No single tool could do all of this.
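Sketched as a script, the chain looks like this; every commented command is a placeholder for whichever tool fills that category in your stack, and only the derived file names are computed for real:

```shell
#!/bin/sh
# Voice-to-PR pipeline skeleton: each commented command stands in for a tool category.
run_pipeline() {
    audio="$1"                           # 1. voice note already saved to disk
    base="${audio%.*}"                   # note.ogg -> note

    echo "transcript: $base.txt"         # 2. transcribe-tool "$audio" > "$base.txt"
    echo "draft: article.md"             # 3. ai-tool --input "$base.txt" --out article.md
    echo "branch: feature/$base"         # 4. git checkout -b ... && gh pr create --fill
    echo "backup: remote:artifacts/"     # 5. sync-tool copy article.md remote:artifacts/
    echo "narration: $base-summary.mp3"  # 6. tts-tool --text "..." --out "$base-summary.mp3"
    echo "delivered: $base-summary.mp3"  # 7. messaging-tool send "$base-summary.mp3"
}

run_pipeline note.ogg
```

Keeping the skeleton as one script makes the orchestration layer visible: each stage can be swapped out without touching the others.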
What we learned: The orchestration layer matters more than any individual tool. Claude acts as the brain — it decides what to do and calls the other tools. But it's the pipeline that creates the value, not any single tool in isolation.
Tool selection mistakes we made
Mistake 1: Trying to use one tool for everything. Early on, we tried to have Claude handle file uploads directly. It can't — it generates content, but file operations need dedicated tools. Separating "thinking" from "doing" was a key insight.
Mistake 2: Choosing the wrong transcription model. We initially assumed the newest model from our transcription provider would be best. It wasn't — it failed on our specific audio encoding. The lesson: test with your actual data format, not benchmarks.
Mistake 3: Over-engineering cloud storage integration. We initially set up a Python virtual environment with a full SDK client library, then realized a CLI sync tool does the same thing in one command. Simpler tools win when the task is simple.
When to add a new tool
We use this checklist before adding another tool to the stack:
- Is there a real task it solves? Not theoretical — something we actually need to do repeatedly.
- Does it overlap with an existing tool? If yes, is it meaningfully better for the specific use case?
- Can it run headless? We need tools that work from scripts and cron jobs, not just GUIs.
- What's the failure mode? If this tool goes down, what breaks? Can we fall back gracefully?
We identified 23 tool gaps through a systematic audit (see the AI Tools Strategic Report). But gaps aren't urgent problems — they're opportunities to evaluate when the need arises.
What we don't use (yet)
Being honest about our known gaps:
- No LangChain/LangGraph — we orchestrate through our agent platform, not a dedicated framework
- No vector database — no semantic search over our own content
- No Ollama/local models — fully dependent on cloud APIs
- No observability tools — no LangSmith or Langfuse tracking our AI calls
These aren't oversights — they're conscious trade-offs. Our current stack solves our current problems. The gaps become relevant when we move from "AI power user" to "AI builder" (Modules 3-6).
The principle
Match tools to tasks, not tasks to tools. Start with what you need to accomplish, then find the simplest tool that does it reliably. Complexity should come from combining simple tools, not from using complex ones.
Sources
- Deepgram API docs — transcription service
- rclone documentation — cloud storage CLI
- GitHub CLI manual — programmatic GitHub operations
- Brave Search API — web search for agents