Graph data structures can improve RAG systems in several ways.
## document graph
When ingesting a file, the text is chunked. Smaller chunks generally let the retrieval system match a query to the relevant text more precisely. However, a small chunk may lack the full context or omit critical information.
The chunks can include metadata that references the previous chunk, the next chunk, and the hierarchy of headers under which the chunk is located. After matching, the previous and subsequent chunks can be included in the prompt as additional context.
In this setup, each chunk is a node and relationships to other chunks are edges.
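
As a rough illustration of this setup, the sketch below stores each chunk as a node with `prev_id` and `next_id` edges plus its header hierarchy, then expands a matched chunk with its neighbors. The class, field, and function names are hypothetical, and the chunking and matching steps are assumed to happen elsewhere.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Chunk:
    """One node in the document graph (field names are illustrative)."""
    id: int
    text: str
    headers: list = field(default_factory=list)  # header hierarchy above the chunk
    prev_id: Optional[int] = None  # edge to the preceding chunk
    next_id: Optional[int] = None  # edge to the following chunk


def build_document_graph(sections):
    """Link chunks in reading order.

    `sections` is a list of (header_path, chunk_text) pairs, assumed to be
    the output of an upstream chunking step.
    """
    graph = {
        i: Chunk(i, text, list(headers))
        for i, (headers, text) in enumerate(sections)
    }
    for i, chunk in graph.items():
        chunk.prev_id = i - 1 if i > 0 else None
        chunk.next_id = i + 1 if i + 1 in graph else None
    return graph


def expand_with_neighbors(graph, matched_id):
    """Return the matched chunk plus its previous and next chunks, in document order."""
    chunk = graph[matched_id]
    ids = [chunk.prev_id, chunk.id, chunk.next_id]
    return "\n".join(graph[i].text for i in ids if i is not None)
```

Calling `expand_with_neighbors(graph, matched_id)` after retrieval yields the text to place in the prompt. The `headers` list could be prepended in the same way if the header path is useful context.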
## graphRAG
A team at Microsoft originally proposed the GraphRAG framework. In their implementation, the corpus is translated into a graph by extracting entities, their relationships (e.g., when one entity is the subject and another the object of a sentence), and claims about entities (e.g., entity A owns entity B). The entities become nodes, the relationships become edges, and the number of relationships or claims linking two entities is used as the edge weight.
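
The graph construction can be sketched as follows. This is a minimal illustration, not the GraphRAG library's API: it assumes the extraction step has already produced `(source, relation, target)` triples, and the function name `build_entity_graph` is hypothetical. The edge weight is simply the count of extracted relationships between two entities.

```python
from collections import Counter, defaultdict


def build_entity_graph(triples):
    """Build a weighted, undirected entity graph as a dict of dicts.

    `triples` is an iterable of (source, relation, target) tuples, assumed
    to come from an upstream extraction step. The weight of an edge is the
    number of triples that link the two entities, in either direction.
    """
    weights = Counter()
    for source, _relation, target in triples:
        if source != target:  # ignore self-references
            weights[frozenset((source, target))] += 1

    graph = defaultdict(dict)
    for pair, weight in weights.items():
        a, b = sorted(pair)
        graph[a][b] = weight
        graph[b][a] = weight
    return dict(graph)
```

For example, two triples that both link "Acme" and "Beta" produce one edge with weight 2, so frequently co-mentioned entities are tied together more strongly.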
Communities are detected using traditional graph community detection (e.g., Louvain or Leiden, the latter being the algorithm used in the GraphRAG paper). An LLM is used to summarize the communities. The process is repeated, generating summaries of summaries in a hierarchical fashion, until a single summary is produced for the entire corpus.
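
The sketch below illustrates the detect-then-summarize loop under simplifying assumptions. To stay dependency-free it uses a small deterministic label propagation routine as a stand-in for Louvain-style community detection, and it takes a `summarize` callable in place of the LLM call. It also stops at two levels (community summaries, then one corpus summary) rather than a deeper hierarchy. All function names are hypothetical.

```python
from collections import defaultdict


def label_propagation(graph, max_rounds=20):
    """Group nodes into communities with weighted label propagation.

    `graph` is a dict of dicts: graph[node][neighbor] = edge weight.
    This is a simple stand-in for Louvain-style detection, not the same algorithm.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):  # fixed order keeps the result deterministic
            scores = defaultdict(float)
            for neighbor, weight in graph[node].items():
                scores[labels[neighbor]] += weight
            if not scores:
                continue  # isolated node keeps its own label
            # Pick the heaviest label; break ties by the smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    communities = defaultdict(list)
    for node, label in labels.items():
        communities[label].append(node)
    return sorted(sorted(members) for members in communities.values())


def summarize_hierarchy(communities, summarize):
    """Summarize each community, then summarize those summaries.

    `summarize` takes a list of strings and returns one string; in a real
    pipeline it would wrap an LLM call.
    """
    community_summaries = [summarize(members) for members in communities]
    corpus_summary = summarize(community_summaries)
    return community_summaries, corpus_summary
```

Passing a trivial `summarize` function, such as one that joins its inputs, is enough to check the control flow before wiring in a model.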
The summaries can be used directly to understand themes in the corpus, or matched to queries to support question answering at higher levels of abstraction.
The team also released GraphRAG as an [open-source library](https://microsoft.github.io/graphrag/get_started/). Several derivative approaches have also been published, including:
- [ai-knowledge-graph](https://github.com/robert-mcdermott/ai-knowledge-graph) (Robert McDermott; see [writeup](https://robert-mcdermott.medium.com/from-unstructured-text-to-interactive-knowledge-graphs-using-llms-dd02a1f71cd6))
- [llm-graph-builder](https://github.com/neo4j-labs/llm-graph-builder) (Neo4j Labs; see [writeup](https://medium.com/neo4j/llm-knowledge-graph-builder-first-release-of-2025-532828c4ba76))
> [!Tip]- Additional Resources
> - [How GraphRAG works](https://medium.com/towards-artificial-intelligence/how-microsofts-graphrag-works-step-by-step-b15cada5c209) (Mariana Avelino)