Notes on Graph Retrieval-Augmented Generation

Standard RAG retrieves flat text chunks. But a lot of knowledge is relational: papers cite papers, entities connect to entities. Flattening that structure throws away signal an LLM could use.

[Figure: illustration of subgraph retrieval]

The core idea

Instead of returning the top-k passages, graph retrieval-augmented generation (GRAG) retrieves a relevant subgraph and linearizes it for the language model, so the connections between retrieved nodes are preserved. The pipeline has three steps:

1. Embed the query and locate seed nodes.
2. Expand into a confidence-filtered local neighborhood.
3. Linearize the subgraph into a prompt the LLM can read.
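
The three steps can be sketched in a few dozen lines of Python. This is a toy illustration under stated assumptions, not the method from the GRAG paper: the graph data is made up, `embed` is a bag-of-words stand-in for a real encoder, "confidence" is assumed to be a per-edge weight compared against a hypothetical `min_conf` threshold, and the linearized format is just one plausible layout.

```python
import math
from collections import Counter, deque

# Hypothetical toy graph: node id -> text, plus undirected edges with a confidence weight.
NODES = {
    "p1": "graph neural networks for citation analysis",
    "p2": "retrieval augmented generation with text chunks",
    "p3": "subgraph retrieval for question answering",
    "p4": "protein folding with deep learning",
}
EDGES = {("p1", "p3"): 0.9, ("p2", "p3"): 0.8, ("p3", "p4"): 0.2}

def embed(text):
    """Stand-in embedding: word counts. A real system would use a neural encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def seed_nodes(query, top_n=1):
    """Step 1: embed the query and pick the most similar nodes as seeds."""
    q = embed(query)
    ranked = sorted(NODES, key=lambda n: cosine(q, embed(NODES[n])), reverse=True)
    return ranked[:top_n]

def neighbors(node):
    for (u, v), w in EDGES.items():
        if u == node:
            yield v, w
        elif v == node:
            yield u, w

def expand(seeds, hops=1, min_conf=0.5):
    """Step 2: breadth-first expansion up to `hops`, skipping edges below `min_conf`."""
    kept_nodes, kept_edges = set(seeds), set()
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for nbr, w in neighbors(node):
            if w < min_conf:
                continue
            kept_edges.add(tuple(sorted((node, nbr))))
            if nbr not in kept_nodes:
                kept_nodes.add(nbr)
                frontier.append((nbr, depth + 1))
    return kept_nodes, kept_edges

def linearize(nodes, edges):
    """Step 3: write nodes and edges as plain text so the structure survives in the prompt."""
    lines = ["Nodes:"] + [f"- {n}: {NODES[n]}" for n in sorted(nodes)]
    lines += ["Edges:"] + [f"- {u} -- {v}" for u, v in sorted(edges)]
    return "\n".join(lines)

def retrieve(query, hops=1, min_conf=0.5):
    nodes, edges = expand(seed_nodes(query), hops, min_conf)
    return linearize(nodes, edges)

prompt = retrieve("subgraph retrieval")
print(prompt)
```

With this toy data the seed is `p3`. Its neighbors `p1` and `p2` are kept, and `p4` is dropped because its edge weight falls below the threshold. The edge list in the prompt is what separates this from flat top-k retrieval: the model sees not only which nodes were retrieved but how they connect.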

Why it helps

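Connected context lets the model follow multi-hop relationships in a single pass. Chunk-based RAG tends to struggle here: the evidence for each hop may sit in a different chunk, and a chunk that supplies a later hop often looks nothing like the query, so it is never retrieved.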
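
A contrived two-hop question makes the failure mode concrete. Everything here is hypothetical: the facts, the word-overlap scorer standing in for embedding similarity, and the hand-picked start node `"Paper X"`. It is a sketch of the argument, not a result from the GRAG paper.

```python
# Hypothetical facts stored two ways: as isolated chunks and as graph triples.
CHUNKS = [
    "Paper X was written by Alice",
    "Alice works at Acme Lab",
    "Paper Y was written by Bob",
]
TRIPLES = [
    ("Paper X", "written_by", "Alice"),
    ("Alice", "works_at", "Acme Lab"),
    ("Paper Y", "written_by", "Bob"),
]

def top_k_chunks(question, k=1):
    """Rank chunks by word overlap with the question (stand-in for embedding similarity)."""
    q = set(question.lower().split())
    return sorted(CHUNKS, key=lambda c: len(q & set(c.lower().split())), reverse=True)[:k]

def k_hop_triples(start, hops=2):
    """Collect triples reachable from `start` by following edges forward for `hops` steps."""
    frontier, found = {start}, []
    for _ in range(hops):
        step = [t for t in TRIPLES if t[0] in frontier and t not in found]
        found += step
        frontier = {t[2] for t in step}
    return found

question = "Where does the author of Paper X work"
flat = top_k_chunks(question, k=2)
graph = k_hop_triples("Paper X", hops=2)

print(flat)   # chunks ranked by overlap with the question
print(graph)  # triples reached by walking the graph from "Paper X"
```

The chunk that answers the question ("Alice works at Acme Lab") shares no words with the question, so flat retrieval with `k=2` returns the two "Paper ..." chunks and misses it. The graph walk reaches it on the second hop because it follows the `Alice` node rather than matching against the query.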

query --> seed nodes --> k-hop neighborhood --> linearized subgraph --> LLM

For the full method, see the GRAG paper.