Standard RAG retrieves flat text chunks. But a lot of knowledge is relational: papers cite papers, entities connect to entities. Flattening that structure throws away signal an LLM could use.

[Figure: illustration of subgraph retrieval]

The core idea

Instead of returning the top-k passages, GRAG (Graph Retrieval-Augmented Generation) retrieves a relevant subgraph and linearizes it for the language model, preserving the connections between retrieved nodes. It works in three steps:

  1. Embed the query and locate seed nodes.
  2. Expand from the seeds into a local k-hop neighborhood, keeping only the nodes that pass a confidence filter.
  3. Linearize the subgraph into a prompt the LLM can read.
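
The first two steps can be sketched on a toy graph. This is a minimal illustration, not the method from the GRAG paper: the node names, the 2-D embeddings, the helper names (`seed_nodes`, `expand`) and the parameters (`top_k`, `hops`, `min_score`) are all hypothetical, and "confidence" is assumed here to mean cosine similarity to the query.

```python
import math
from collections import deque

# Toy graph: node id -> (text, embedding). The 2-D embeddings are made up;
# a real system would obtain them from a text encoder.
NODES = {
    "A": ("Paper A: graph retrieval", (1.0, 0.0)),
    "B": ("Paper B: subgraph ranking", (0.9, 0.1)),
    "C": ("Paper C: image captioning", (0.0, 1.0)),
    "D": ("Paper D: multi-hop QA", (0.7, 0.3)),
}
# Undirected adjacency list, e.g. citation links.
EDGES = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def seed_nodes(query_vec, top_k=1):
    """Step 1: rank all nodes by similarity to the query, keep the top_k."""
    ranked = sorted(NODES, key=lambda n: cosine(query_vec, NODES[n][1]), reverse=True)
    return ranked[:top_k]


def expand(seeds, query_vec, hops=2, min_score=0.5):
    """Step 2: breadth-first expansion up to `hops` hops from the seeds,
    keeping only neighbors whose similarity to the query is >= min_score."""
    kept = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nbr in EDGES[node]:
            if nbr not in kept and cosine(query_vec, NODES[nbr][1]) >= min_score:
                kept.add(nbr)
                frontier.append((nbr, depth + 1))
    return kept


query = (1.0, 0.0)  # stand-in for an embedded query
seeds = seed_nodes(query)
subgraph = expand(seeds, query)
print(seeds, sorted(subgraph))  # ['A'] ['A', 'B', 'D']
```

Node C is adjacent to the seed but scores below the threshold, so it is dropped, while D is kept even though it is two hops away. Filtering during expansion rather than afterwards also means the search never walks through low-scoring nodes.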

Why it helps

Connected context lets the model follow multi-hop relationships in a single pass, which is exactly where chunk-based RAG tends to fail.
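
To make "connected context" concrete, here is one possible way to linearize a small retrieved subgraph into a prompt. The node texts, the `cites` relation, the output layout and the helper names (`linearize`, `build_prompt`) are assumptions for illustration; the GRAG paper may use a different format.

```python
# Hypothetical retrieved subgraph: node texts plus directed "cites" edges.
nodes = {
    "A": "Paper A introduces method X.",
    "B": "Paper B extends method X to graphs.",
    "D": "Paper D evaluates the graph extension on multi-hop QA.",
}
edges = [("B", "cites", "A"), ("D", "cites", "B")]


def linearize(nodes, edges):
    """Render nodes and edges as plain text, one item per line."""
    lines = ["Nodes:"]
    lines += [f"[{nid}] {text}" for nid, text in sorted(nodes.items())]
    lines.append("Edges:")
    lines += [f"[{src}] {rel} [{dst}]" for src, rel, dst in edges]
    return "\n".join(lines)


def build_prompt(question, nodes, edges):
    """Prepend the linearized subgraph to the question."""
    return f"{linearize(nodes, edges)}\n\nQuestion: {question}\nAnswer:"


prompt = build_prompt("Which paper builds indirectly on Paper A?", nodes, edges)
print(prompt)
```

Because the edge lines `[B] cites [A]` and `[D] cites [B]` appear in the prompt, the two-hop chain from D to A is stated explicitly. With three unconnected chunks, the model would have to infer that chain from the passage text alone.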

query --> seed nodes --> k-hop neighborhood --> linearized subgraph --> LLM

For the full method, see the GRAG paper.