flash

May 15, 2026

A couple of weeks ago I was writing more bash than I care to admit, and the "language" (if you can call it that) is quite useful.

We should all just be writing bash

I decided to build an agent harness in it. flash is a lightweight AI agent framework in roughly 300 lines of bash. It uses Ollama to query an LLM and lets the model use shell tools to complete tasks.

How it works

Flash uses a two-model architecture:

Gemma4:E4B (execution model) has access to tools. It decides when to run shell commands, fetch URLs, or manage a todo list via function calling.
Gemma4:E2B (response model) takes the tool results and generates the final reply. No tools, just text. The idea is that when only a conversation is needed, this model is quicker on the hardware I currently run it on.

Both models hit a local Ollama API, and chat history persists as JSON files in a sessions/ directory. You can switch between sessions at runtime with /session.
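
A minimal sketch of how a two-model turn with file-backed sessions could be wired up using only curl and jq. This is not flash's actual code: the endpoint URL, the lowercase model tags, the one-file-per-session layout, and the function names append_message, build_request, and chat are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch of a two-model turn against a local Ollama server.
# Assumptions (not taken from flash): the endpoint URL, the model tags,
# the sessions/<name>.json layout, and every function name below.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434/api/chat}"
EXEC_MODEL="${EXEC_MODEL:-gemma4:e4b}"    # tool-calling model (assumed tag)
REPLY_MODEL="${REPLY_MODEL:-gemma4:e2b}"  # text-only model (assumed tag)
SESSION_DIR="${SESSION_DIR:-sessions}"

# Append one {role, content} message to a session's JSON history file.
append_message() {
  local session="$1" role="$2" content="$3"
  local file="$SESSION_DIR/$session.json"
  mkdir -p "$SESSION_DIR"
  [ -s "$file" ] || echo '[]' > "$file"
  jq --arg role "$role" --arg content "$content" \
     '. + [{role: $role, content: $content}]' "$file" > "$file.tmp"
  mv "$file.tmp" "$file"
}

# Build a non-streaming chat request from the session history.
# Third argument: a JSON array of tool definitions, or nothing for the
# text-only response model (the "tools" key is then left out entirely).
build_request() {
  local model="$1" session="$2" tools="${3:-[]}"
  jq -n --arg model "$model" --argjson tools "$tools" \
     --slurpfile history "$SESSION_DIR/$session.json" \
     '{model: $model, messages: $history[0], stream: false}
      + (if ($tools | length) > 0 then {tools: $tools} else {} end)'
}

# POST a request body to Ollama and print the raw JSON response.
chat() {
  curl -sS "$OLLAMA_URL" -H 'Content-Type: application/json' -d "$1"
}
```

Under these assumptions, a turn would call `append_message demo user "..."`, send `build_request "$EXEC_MODEL" demo "$tools"` through `chat`, run whatever the reply lists under `.message.tool_calls`, append the results to the session, and then send `build_request "$REPLY_MODEL" demo` to get the final text.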

Tools are just shell scripts

Every tool is a standalone .sh file in tools/:

sh.sh - runs any shell command
webfetch.sh - fetches and strips HTML from a URL
todo_add.sh, todo_done.sh, todo_list.sh - lightweight task management

The agent can call them in parallel, collect the results, and feed them back to the model.
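
The parallel step can be sketched in plain bash with background jobs and wait. This is one possible shape, not flash's implementation: the tab-separated "name, argument" input format, the run_tools function name, and the convention that each tool takes a single string argument are assumptions for the example.

```bash
#!/usr/bin/env bash
# Sketch: run several tool calls in parallel and collect output in call order.
# Assumptions (not taken from flash): the "name<TAB>argument" input format,
# the run_tools name, and one string argument per tool script.

TOOLS_DIR="${TOOLS_DIR:-tools}"

# Read "tool<TAB>argument" lines on stdin, start each tools/<tool>.sh in the
# background, then print every result in the order the calls were given.
run_tools() {
  local tmp i=0 n name arg
  local -a pids=() names=()
  tmp="$(mktemp -d)"

  while IFS=$'\t' read -r name arg || [ -n "$name" ]; do
    [ -n "$name" ] || continue
    names[i]="$name"
    # Only plain names are accepted, so a call cannot escape TOOLS_DIR.
    if [[ "$name" =~ ^[A-Za-z0-9_]+$ && -f "$TOOLS_DIR/$name.sh" ]]; then
      bash "$TOOLS_DIR/$name.sh" "$arg" < /dev/null > "$tmp/$i.out" 2>&1 &
    else
      echo "unknown tool: $name" > "$tmp/$i.out" &
    fi
    pids[i]=$!
    i=$((i + 1))
  done

  for ((n = 0; n < i; n++)); do
    wait "${pids[n]}" || true   # a failing tool should not abort the batch
    printf '== %s ==\n' "${names[n]}"
    cat "$tmp/$n.out"
  done
  rm -rf "$tmp"
}
```

With a layout like the one above, `printf 'sh\tuname -a\nwebfetch\thttps://example.com\n' | run_tools` would start both scripts at once and print their output one after the other. Writing each result to its own temp file keeps the output from interleaving, at the cost of buffering everything until the tool exits.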

Why?

Most agent harnesses are big and cumbersome. I wanted to run something that would be quick and easy to use locally. flash lives on my local file system and uses a local model. It's a self-modifying bash script that almost any consumer-grade laptop could run.

The whole thing fits in a Docker image based on Alpine with just bash, curl, and jq.
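
For a sense of how small that image can be, here is a sketch that prints a matching Dockerfile from bash. The flash.sh entry point name, the /app layout, the alpine:3 tag, and the write_dockerfile helper are assumptions, not details from the actual repository.

```bash
#!/usr/bin/env bash
# Sketch: print a minimal Alpine-based Dockerfile with bash, curl, and jq.
# Assumptions (not taken from flash): the flash.sh entry point, the /app
# layout, the alpine:3 base tag, and the write_dockerfile helper name.

write_dockerfile() {
  cat <<'EOF'
FROM alpine:3
RUN apk add --no-cache bash curl jq
WORKDIR /app
COPY . /app
ENTRYPOINT ["bash", "/app/flash.sh"]
EOF
}

# Usage (requires Docker):
#   write_dockerfile > Dockerfile
#   docker build -t flash .
#   docker run --rm -it flash
```

The container also needs a route to the Ollama server, for example by running with `--network host` on Linux when Ollama listens on the host.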

Brady Hawkins