Standard RAG retrieves flat text chunks. But a lot of knowledge is relational: papers cite papers, entities connect to entities. Flattening that structure throws away signal an LLM could use.
The core idea
Instead of returning the top-k passages, GRAG (graph retrieval-augmented generation) retrieves a relevant subgraph and linearizes it for the language model, preserving the connections between retrieved nodes. The pipeline has three steps:
- Embed the query and locate seed nodes.
- Expand from the seeds into a local k-hop neighborhood, keeping only nodes that pass a confidence filter.
- Linearize the subgraph into a prompt the LLM can read.
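The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the graph data is invented, `embed` is a bag-of-words stand-in for a real encoder, and the "confidence filter" is assumed to be a simple threshold (`min_score`) on query similarity. The names `retrieve_subgraph`, `linearize`, `num_seeds`, `hops`, and `min_score` are all hypothetical.

```python
import math
from collections import Counter, deque

# Toy graph (illustrative data only): node id -> text, plus undirected edges.
NODES = {
    "p1": "graph neural networks for citation analysis",
    "p2": "retrieval augmented generation with language models",
    "p3": "subgraph retrieval for question answering over graphs",
    "p4": "protein folding with deep learning",
}
EDGES = [("p1", "p3"), ("p2", "p3"), ("p3", "p4")]


def embed(text):
    """Stand-in embedding: bag-of-words counts. A real system would use a neural encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_subgraph(query, num_seeds=1, hops=1, min_score=0.1):
    """Return (seeds, kept nodes, kept edges) for a query."""
    q = embed(query)
    scores = {n: cosine(q, embed(text)) for n, text in NODES.items()}

    # Step 1: seed nodes are the nodes most similar to the query.
    seeds = sorted(scores, key=scores.get, reverse=True)[:num_seeds]

    adjacency = {n: set() for n in NODES}
    for u, v in EDGES:
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Step 2: breadth-first expansion up to `hops`, dropping low-scoring neighbors
    # (assumed form of the confidence filter).
    kept = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in kept and scores[neighbor] >= min_score:
                kept.add(neighbor)
                frontier.append((neighbor, depth + 1))

    kept_edges = [(u, v) for u, v in EDGES if u in kept and v in kept]
    return seeds, kept, kept_edges


def linearize(kept, kept_edges):
    """Step 3: write nodes and edges as plain text so the connections survive in the prompt."""
    lines = ["Nodes:"]
    lines += [f"- {n}: {NODES[n]}" for n in sorted(kept)]
    lines.append("Edges:")
    lines += [f"- {u} -- {v}" for u, v in kept_edges]
    return "\n".join(lines)


if __name__ == "__main__":
    _, kept, kept_edges = retrieve_subgraph("subgraph retrieval for graphs")
    print(linearize(kept, kept_edges))
```

In this toy run the unrelated node `p4` is adjacent to the seed but falls below the threshold, so it never reaches the prompt. The node-list-plus-edge-list layout is only one possible linearization; any format that keeps the edges explicit serves the same purpose.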
Why it helps
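Connected context lets the model follow multi-hop relationships in a single pass. Chunk-based RAG tends to fail on exactly these questions, because a chunk that bridges two facts is often not similar enough to the query to be retrieved on its own.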
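A contrived example may make the multi-hop point concrete. In the sketch below, each fact is both a text chunk and a graph edge, and token overlap stands in for embedding similarity; the data, the question, and the function names `flat_top_k` and `graph_expand` are all invented for illustration. A real encoder would score the chunks differently, so treat this as a picture of the failure mode rather than evidence for it.

```python
# Each fact doubles as a flat text chunk and as a directed edge (subject, relation, object).
FACTS = [
    ("paper_B", "cites", "paper_A"),
    ("paper_A", "written_by", "alice"),
    ("paper_C", "cites", "paper_D"),
    ("paper_D", "written_by", "bob"),
]

QUESTION = "Who wrote the paper that paper_B cites?"


def tokens(text):
    """Lowercased whitespace tokens with the question mark removed."""
    return set(text.lower().replace("?", "").split())


def flat_top_k(question, k):
    """Chunk-style retrieval: rank facts by token overlap with the question."""
    q = tokens(question)
    ranked = sorted(
        FACTS,
        key=lambda fact: len(q & {part.lower() for part in fact}),
        reverse=True,
    )
    return ranked[:k]


def graph_expand(question, hops):
    """Graph-style retrieval: seed on entities named in the question, then follow edges."""
    q = tokens(question)
    frontier = {e for s, _, o in FACTS for e in (s, o) if e.lower() in q}
    seen = set(frontier)
    kept = []
    for _ in range(hops):
        next_frontier = set()
        for fact in FACTS:
            subject, _, obj = fact
            if subject in frontier and fact not in kept:
                kept.append(fact)
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.add(obj)
        frontier = next_frontier
    return kept


if __name__ == "__main__":
    print("flat :", flat_top_k(QUESTION, 2))
    print("graph:", graph_expand(QUESTION, 2))
```

Here the flat ranking spends its second slot on an unrelated "cites" fact, because the authorship fact shares no tokens with the question. The two-hop expansion reaches that fact by following the edge out of `paper_A`.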
query --> seed nodes --> k-hop neighborhood --> linearized subgraph --> LLM
For the full method, see the GRAG paper.