TechTrends Now - Tech News for Builders and Operators

A month ago I was building a side project and kept
hitting the same wall: I'd start a task with OpenAI,
hit the rate limit, manually switch to Anthropic,
hit a different limit, then open yet another tab to
configure Gemini. Three API dashboards open, three
different billing pages, and my actual project sitting
there waiting.

I didn't want to pay for multiple APIs just to keep
working. So I built something to fix it.

What I built

HelloChusquis is an open-source terminal AI agent that automatically switches between 35+ AI providers when one hits a rate limit or goes down.

One config file. Zero manual switching.

pip install hellochusquis
hellochusquis --quick

The agent tries your first provider and, if it fails or hits a limit, silently falls back to the next one. As long as at least one provider is still available, you never see an error; the task just completes.

The hardest part

The trickiest bug was getting the agent to execute commands correctly during multi-step plans. The agent would generate a plan, start executing, and then lose access to its tools halfway through. Step 1 worked, steps 2-6 failed with "Unknown tool" errors.

The problem: tools were available in the initial context but weren't being passed through each step of the execution loop. Once I fixed the context propagation, multi-step tasks like "search the web for AI news and summarize the top 3 stories" started working end to end.
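To make the fix concrete, here is a minimal sketch of an execution loop that hands the tool registry to every step. It is not the HelloChusquis code: `run_step`, `run_plan`, the `(tool_name, argument)` step format, and the two toy tools are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool signature: takes a string argument, returns a string.
Tool = Callable[[str], str]


def run_step(step: Tuple[str, str], tools: Dict[str, Tool]) -> str:
    """Execute one plan step, given as (tool_name, argument)."""
    tool_name, argument = step
    if tool_name not in tools:
        # This is the failure mode from the bug: the step cannot see the tool.
        raise KeyError(f"Unknown tool: {tool_name}")
    return tools[tool_name](argument)


def run_plan(plan: List[Tuple[str, str]], tools: Dict[str, Tool]) -> List[str]:
    """Run every step with the same tool registry.

    The registry is passed explicitly on each iteration, so step 2 and
    later see exactly the tools that step 1 saw.
    """
    results = []
    for step in plan:
        results.append(run_step(step, tools))
    return results


# Toy tools standing in for real integrations.
tools = {
    "search": lambda query: f"results for {query}",
    "summarize": lambda text: f"summary of {text}",
}
plan = [("search", "AI news"), ("summarize", "top 3 stories")]
outputs = run_plan(plan, tools)
```

Passing `tools` as an explicit argument, rather than reading it from state set up before the loop, is one way to keep a later step from silently running with an empty registry.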

How the fallback works

The core is a ProviderPool class that tracks each provider's state:

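from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Provider:
    name: str
    base_url: str
    api_key: str
    model: str
    exhausted: bool = False
    # Set when the provider is marked exhausted; None while it is healthy.
    exhausted_at: Optional[datetime] = None

class ProviderPool:
    # __init__, _available, _call and _handle_error are omitted here.
    def chat(self, messages, tools=None):
        # Try each non-exhausted provider in order; return the first success.
        available = self._available()
        for provider in available:
            try:
                return self._call(provider, messages, tools)
            except Exception as e:
                # Record the failure, then fall through to the next provider.
                self._handle_error(provider, e)
        raise RuntimeError("All providers failed")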

When a provider returns a 429, 402, or 503, it gets marked as exhausted with a timestamp. After a configurable window (default 1 hour), it resets automatically. It's essentially a circuit breaker pattern applied to LLM providers.
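The exhaust-and-reset behavior described above could look roughly like the sketch below. The status codes and the one-hour default come from the description; everything else (`ProviderError`, `ProviderState`, `CooldownTracker`, and the `now` parameter used to make the logic testable) is a hypothetical name chosen for this example, not the project's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Status codes that mark a provider as exhausted, per the description above.
EXHAUSTING_STATUSES = {402, 429, 503}


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


@dataclass
class ProviderState:
    name: str
    exhausted: bool = False
    exhausted_at: Optional[datetime] = None


class CooldownTracker:
    def __init__(self, providers, reset_after: timedelta = timedelta(hours=1)):
        self.providers = list(providers)
        self.reset_after = reset_after  # the configurable reset window

    def available(self, now: Optional[datetime] = None) -> List[ProviderState]:
        """Return usable providers, reviving any whose window has elapsed."""
        now = now or datetime.now()
        usable = []
        for p in self.providers:
            if p.exhausted and now - p.exhausted_at >= self.reset_after:
                p.exhausted, p.exhausted_at = False, None
            if not p.exhausted:
                usable.append(p)
        return usable

    def handle_error(self, provider, error, now: Optional[datetime] = None):
        """Mark the provider exhausted only for rate-limit style statuses."""
        status = getattr(error, "status", None)
        if status in EXHAUSTING_STATUSES:
            provider.exhausted = True
            provider.exhausted_at = now or datetime.now()
```

In this sketch the reset is lazy: nothing runs on a timer, and a provider is revived the next time `available()` is called after its window has passed. Errors with other status codes leave the provider in rotation, so the caller simply moves on to the next one for that request.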

What it can do

Beyond the fallback, HelloChusquis has grown into a full terminal agent:

128 integrations (Stripe, Supabase, AWS, Discord...)
Browser automation with human-like mouse movement
Web UI with voice I/O
Auto-Tool Builder: describe an integration, it generates the plugin
REST API mode
Persistent memory across sessions

Try it

pip install hellochusquis
hellochusquis --quick  # 60 second setup

GitHub: github.com/aminoy77/HelloChusquis

Open source, MIT license, free forever.

If you've hit the same rate limit frustration, I'd
love to hear how you're handling it — or what you'd
want HelloChusquis to do that it doesn't yet.

How I built a terminal AI agent that never hits rate limits (open source, Python)
