Agentic on jeeVee Blog

Mobile Coder

Sun, 15 Mar 2026 13:12:45 +0000

Use Case

Adding a small feature or fixing a bug. No agentic planning needed - I already know exactly which files I want to change.

Steps

Open file
Add comments to the file to indicate what needs to be done
Launch the Coder flow
Select the option to Verify Objective. This option runs whatever the acceptance criteria are for that project (compile, test, etc.) after the code changes are applied.
Accept the file edits
Review the outcome (e.g. build or test results) in the Memento tab (tag #objective)
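
As a rough illustration of the Verify Objective step, here is a minimal Python sketch, assuming the objective boils down to a shell command whose exit code signals success. The names `ObjectiveResult` and `verify_objective` are hypothetical and not taken from the actual system; only the `#objective` tag comes from the steps above.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ObjectiveResult:
    tag: str            # mirrors the "#objective" tag used in the Memento tab
    command: list[str]  # the acceptance command that was run
    passed: bool        # True when the command exited with status 0
    output: str         # combined stdout and stderr, kept for later review


def verify_objective(command: list[str], timeout_s: float = 600.0) -> ObjectiveResult:
    """Run a project's acceptance command (assumed to be a CLI call) and capture the outcome."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    return ObjectiveResult(
        tag="#objective",
        command=command,
        passed=proc.returncode == 0,
        output=proc.stdout + proc.stderr,
    )


if __name__ == "__main__":
    # Stand-in for a real build or test command.
    result = verify_objective([sys.executable, "-c", "print('build ok')"])
    print(result.tag, result.passed)
```

Keeping the raw output next to the pass/fail flag is what makes the later review step possible without rerunning the build.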

Handling incorrect solutions

Sometimes the LLM produces a solution that does not yield the desired outcome. One possible cause is that I underspecified the request.

Reflecting on Agentic Workflows

Thu, 01 Jan 2026 13:12:45 +0000

A hybrid system

Reflecting on my work from the past year, I have achieved the primary goal: a modular system with support for human-in-the-loop steering. This allows for a hybrid workflow: autonomous for some tasks, but steerable when necessary.

My experience with fully autonomous modes highlights the following challenges:

Review Complexity: It is difficult to review a large diff for a finished product. I find incremental code reviews to be a much safer and more effective approach for important projects.
Specification Drift: The original task must be incredibly well-specified. Otherwise, the agent might spend a long time building an incorrect implementation based on a minor misunderstanding.
Cost: Unsupervised trial-and-error is expensive. Long-running tasks that rely on inference to self-correct often cost more than a steered session.

Robots that use odometry encoders for localization accrue small errors until those errors add up to significant drift. Similarly, early divergence in LLM planning leads to dead ends. Without the ability to intervene, the only recovery mechanism is often a brute-force restart. By sticking to my principles of Manual Steering and Observability, I can correct this drift in real time.

Memento

Mon, 29 Dec 2025 13:12:45 +0000

I wanted to share some notes on how I use the Memento component.

The Memento holds two types of artifacts:

TextLocations - this is mostly structured text, e.g. LSP symbol definitions, LSP symbol references, etc.
Fragments - this is unstructured text, e.g. the result of a web search, a note to remember, etc.
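
To make the two artifact types concrete, here is a small Python sketch of how such a store could be modeled. The field names (`path`, `line`, `kind`, `symbol`, `tags`) and the `by_tag` helper are assumptions for illustration only; the real Memento may be structured quite differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TextLocation:
    """Structured artifact, e.g. an LSP symbol definition or reference (fields are assumed)."""
    path: str
    line: int
    kind: str    # e.g. "definition" or "reference"
    symbol: str


@dataclass(frozen=True)
class Fragment:
    """Unstructured artifact, e.g. a web search result or a note to remember."""
    text: str
    tags: tuple[str, ...] = ()


@dataclass
class Memento:
    text_locations: list[TextLocation] = field(default_factory=list)
    fragments: list[Fragment] = field(default_factory=list)

    def add(self, item) -> None:
        # Route each artifact to the collection that matches its type.
        if isinstance(item, TextLocation):
            self.text_locations.append(item)
        elif isinstance(item, Fragment):
            self.fragments.append(item)
        else:
            raise TypeError(f"unsupported artifact type: {type(item).__name__}")

    def by_tag(self, tag: str) -> list[Fragment]:
        # Hypothetical helper: look up fragments carrying a given tag.
        return [f for f in self.fragments if tag in f.tags]
```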

Agentic use-case

This is one use-case for the Memento. Instead of maintaining a long context, LLM outputs and tool results (e.g. LSP) are added to the Memento.

Local Search

Fri, 26 Dec 2025 13:12:45 +0000

I’ve been working on implementing two types of searches: (a) web search and (b) local search. This post covers my approach to the latter.

Local search helps find relevant text or code in local files. The search scope is limited to a single project rather than being global. In effect, planning first identifies the project and then runs a local search in the context of that project.

Given an overarching goal, the planning agent can break it down into multiple local searches as needed.
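
A project-scoped search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain case-insensitive substring match over a fixed set of file suffixes, and the function name `local_search` and its parameters are hypothetical. The actual system may rely on an index, LSP queries, or something else entirely.

```python
from pathlib import Path


def local_search(project_root, query, suffixes=(".py", ".md", ".txt")):
    """Return (relative_path, line_number, line) hits found under a single project root."""
    root = Path(project_root)
    needle = query.lower()
    hits = []
    for path in sorted(root.rglob("*")):
        # Only regular files with an allowed suffix are searched.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or binary files
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits
```

Because the root is an explicit argument, a planner could call this once per sub-question while staying inside the project it identified first.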

Modularity

Sun, 12 Oct 2025 13:12:45 +0000

My setup is built around docker compose, with a web app using the Monaco editor as the main interface.

Because my app runs in the browser, I keep several tabs open and work on multiple tasks in parallel. When a task needs my attention, I review the progress, approve it or add additional guidance, and then let it carry on while I switch to another tab.

Isolation

Each application under development gets its own docker container. The same goes for supporting services like LSPs and the LLM gateway: each one lives in its own container too. This modular approach makes it easy to add new capabilities, even when they have different runtime requirements.

Weniger, Aber Besser (Meta-Programming with Agents)

Sat, 20 Sep 2025 13:12:45 +0000

This post is about coding workflow and tool use with agents. To set the context, it’s worth a brief reminder of my approach. As I mentioned in another post, the solution I evolved and refined over time is to “chip” at the problem. Solve intermediary goals, evaluate, see what’s left, and repeat until the entire problem is solved. I suppose in a way it’s similar to how diffusion works for image generation: incrementally add and refine until the task is solved.

Frugal

Thu, 26 Jun 2025 13:12:45 +0000

Being frugal has benefits in terms of both the effectiveness of the LLM’s responses and the LLM API usage cost.

1. Single turn context

I’ve been tinkering with LLMs since the early days of GPT-4. My experience is that single turn prompts are more effective than long contexts.

The way my system works is that the Memento holds LSP references (e.g. symbol definitions, symbol references, etc.) or more general text fragments, and then a Classifier selects from the Memento which items are relevant for a given LLM prompt.
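
The selection step can be illustrated with a small Python sketch. Note the swap: a naive keyword-overlap score stands in for the Classifier here, since the post does not say how the real one works. The function names and the prompt layout are assumptions as well.

```python
import re


def _words(text):
    # Lowercased identifier-like tokens; underscores are kept so symbol names survive.
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def select_relevant(task, items, top_k=3):
    """Pick the Memento items that share the most words with the task (stand-in classifier)."""
    task_words = _words(task)
    scored = [(len(task_words & _words(item)), index, item) for index, item in enumerate(items)]
    relevant = [entry for entry in scored if entry[0] > 0]
    # Highest overlap first; ties keep their original order.
    relevant.sort(key=lambda entry: (-entry[0], entry[1]))
    return [item for _, _, item in relevant[:top_k]]


def build_single_turn_prompt(task, items, top_k=3):
    """Assemble one self-contained prompt instead of replaying a long conversation."""
    context = "\n".join(f"- {item}" for item in select_relevant(task, items, top_k))
    return f"Context:\n{context}\n\nTask:\n{task}"
```

Whatever the real Classifier looks like, the design point is the same: each prompt carries only the items judged relevant, so the context stays short and cheap.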

Observability

Sat, 10 May 2025 13:12:45 +0000

I have not seen this feature highlighted in other agentic systems, but my experience building my own system is that it is paramount to be able to see exactly what the LLM conversation was and, even more, to be able to amend it.

Reviewing the detailed LLM log

This is incredibly valuable, as many problems can be traced to prompt and context engineering. The log shows what was actually sent to the LLM, so I can check whether I missed anything salient to the task I asked it to solve. The logs live locally in a WASM SQLite database in the browser.
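
For a feel of what a reviewable, amendable log might look like, here is a sketch using Python's built-in `sqlite3` module as a stand-in for the in-browser WASM SQLite database. The `llm_log` table, its columns, and the helper functions are all hypothetical; the real schema is not described in the post.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) a log database with an assumed single-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_log ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " role TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " amended INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record(conn, role, content):
    """Store one message exactly as it was sent or received; returns its id."""
    cur = conn.execute("INSERT INTO llm_log (role, content) VALUES (?, ?)", (role, content))
    conn.commit()
    return cur.lastrowid


def amend(conn, message_id, new_content):
    """Edit a logged message and flag it so the amendment stays visible."""
    conn.execute(
        "UPDATE llm_log SET content = ?, amended = 1 WHERE id = ?",
        (new_content, message_id),
    )
    conn.commit()


def transcript(conn):
    """Return the conversation in order as (role, content, amended) rows."""
    return conn.execute("SELECT role, content, amended FROM llm_log ORDER BY id").fetchall()
```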

Planning

Sat, 03 May 2025 13:12:45 +0000

My approach to planning takes a somewhat different direction compared to other agentic platforms.

Chipper

Instead of generating a large md file describing a plan, I use what I call the chipper approach.

It works like this:

1. Define a sub-goal
2. Plan for the sub-goal
3. Implement code for the sub-goal
4. Run the Objective (e.g. compile, tests, etc.)
5. Summarize (a) what we’ve done so far and (b) whether the Objective is satisfied
6. If DONE, exit; if not, return to step 1 (i.e. continue chipping at the problem)
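
The steps above can be sketched as a plain Python loop. This is a simplification under stated assumptions: the four callables are hypothetical placeholders for the LLM-backed stages, `max_rounds` is an added safety cap, and a flat loop is used here rather than a behavior tree.

```python
def chip(define_sub_goal, plan, implement, run_objective, max_rounds=10):
    """Chip at a problem one sub-goal at a time until the objective is satisfied."""
    history = []       # running summary of what has been done so far
    satisfied = False
    for _ in range(max_rounds):
        sub_goal = define_sub_goal(history)   # step 1: pick the next sub-goal
        steps = plan(sub_goal)                # step 2: plan for it
        implement(steps)                      # step 3: apply code changes
        satisfied = run_objective()           # step 4: compile, test, etc.
        status = "objective satisfied" if satisfied else "objective not yet satisfied"
        history.append(f"{sub_goal}: {status}")  # step 5: summarize
        if satisfied:                         # step 6: exit or keep chipping
            break
    return satisfied, history
```

Passing the history back into `define_sub_goal` is what lets each round build on the previous summary instead of replaying the whole conversation.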

In practice the structure of the Planner is a behavior tree that looks like this:

Motivation

Fri, 24 Jan 2025 13:12:45 +0000

The primary goal of this project is to build a semi-autonomous agentic system with rich support for human-in-the-loop steering. I believe that I must perfect this foundation before layering the “Critic” agents needed to drive the system autonomously on top of it.

For the foundation layer the following capabilities are important to me:

Full Configurability: The ability to configure every aspect of the end-to-end system, right down to the LLM system prompts.
Manual Steering: The ability to review code changes incrementally and steer all decisions in a granular “manual mode.”
Observability: Deep introspection into the system, including the ability to review, edit, and interact with the underlying LLM layer directly.

I believe that a tunable, ground-up system is a necessary stepping stone for building truly autonomous agents on top of that foundation.