Abstract
Large Language Models (LLMs) are increasingly used as autonomous agents for multi-step tasks. However, most existing frameworks fail to maintain a structured understanding of the task state, often relying on linear prompt concatenation or shallow memory buffers. This leads to brittle performance, frequent hallucinations, and poor long-range coherence. In this work, we propose the Task Memory Engine (TME), a lightweight and structured memory module that tracks task execution using a hierarchical Task Memory Tree (TMT). Each node in the tree corresponds to a task step, storing relevant input, output, status, and sub-task relationships. We introduce a prompt synthesis method that dynamically generates LLM prompts based on the active node path, significantly improving execution consistency and contextual grounding. Through case studies and comparative experiments on multi-step agent tasks, we demonstrate that TME leads to better task completion accuracy and more interpretable behavior with minimal implementation overhead. A reference implementation of the core TME components is available at https://github.com/biubiutomato/TME-Agent, including basic examples and structured memory integration. While the current implementation uses a tree-based structure, TME is designed to be graph-aware, supporting reusable substeps, converging task paths, and shared dependencies. This lays the groundwork for future DAG-based memory architectures.
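To make the idea of a Task Memory Tree concrete, the following is a minimal sketch of a node that stores input, output, status, and sub-task links, together with a prompt builder that walks the active node path. It is an illustration only and is not taken from the reference implementation: the names `TaskNode`, `step_input`, `step_output`, `active_path`, and `synthesize_prompt`, the status strings, and the prompt layout are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality avoids recursive parent/child comparison
class TaskNode:
    """One task step in a hypothetical Task Memory Tree (field names are assumptions)."""
    name: str
    step_input: str = ""
    step_output: str = ""
    status: str = "pending"  # assumed values: "pending", "active", "done", "failed"
    parent: Optional["TaskNode"] = field(default=None, repr=False)
    children: List["TaskNode"] = field(default_factory=list, repr=False)

    def add_child(self, name: str, step_input: str = "") -> "TaskNode":
        """Attach a sub-task to this node and return it."""
        child = TaskNode(name=name, step_input=step_input, parent=self)
        self.children.append(child)
        return child

    def active_path(self) -> List["TaskNode"]:
        """Return the nodes from the root down to this node."""
        path = []
        node: Optional[TaskNode] = self
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))


def synthesize_prompt(active: TaskNode) -> str:
    """Build a prompt from the active node path only (assumed layout)."""
    lines = ["Task context (root to current step):"]
    for depth, node in enumerate(active.active_path()):
        indent = "  " * depth
        lines.append(f"{indent}- {node.name} [{node.status}]")
        if node.step_input:
            lines.append(f"{indent}  input: {node.step_input}")
        if node.step_output:
            lines.append(f"{indent}  output: {node.step_output}")
    lines.append(f"Current step: {active.name}")
    return "\n".join(lines)


# Usage: a small tree with one finished sub-task and one active sub-task.
root = TaskNode("Plan trip", step_input="Book travel to Berlin", status="active")
flights = root.add_child("Search flights", step_input="BER, 12 May")
flights.status = "done"
flights.step_output = "3 options found"
hotel = root.add_child("Book hotel", step_input="2 nights")
hotel.status = "active"

prompt = synthesize_prompt(hotel)
print(prompt)
```

In this sketch the prompt contains only the nodes on the path from the root to the active step, so the completed sibling "Search flights" is left out. Whether sibling outputs should also be included is a design choice that this sketch does not settle. A graph-aware variant would replace the single `parent` link with a list of parents.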