
Can GraphRAG finally solve the Kendrick & Drake dispute?

Following on from Part 1, where I introduced GraphQL and its popularity, let’s take that a step further and look at how we can bring a graph database like Neo4j into a RAG implementation using OpenAI’s GPT-4o, and how it improves the contextual quality of RAG responses.

Introduction

In the vibrant landscape of hip-hop, feuds are not just battles of words but also of narratives and public perceptions. The feud between Kendrick Lamar and Drake is a prime example, filled with subtexts, direct and indirect messages, and a broad influence across fans and media. To dissect this complex interplay, we turn to GraphRAG (Graph Retrieval Augmented Generation), which combines the analytic precision of graph databases with the intuitive understanding of language models.

Setting Up the Analysis Environment

Firstly, setting up the right tools is essential for any data-driven analysis. For this exploration, we utilize Neo4j, a graph database that excels in handling connected data, alongside LangChain's capabilities to interface with OpenAI's models:

import os
from langchain_community.graphs import Neo4jGraph

os.environ["OPENAI_API_KEY"] = "your_openai_api_key"  # Replace with actual key
os.environ["NEO4J_URI"] = "your_neo4j_uri"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "your_neo4j_password"

graph = Neo4jGraph()

Theoretical Background

Graph theory is adept at representing complex systems of relationships and interactions through nodes and edges, providing a structured way to visualize and analyse relationships. Graph RAG leverages this structure, enhancing the retrieval capabilities typically found in RAG systems by integrating the contextual depth that graphs provide. This dual approach enables a more nuanced understanding and retrieval of information, perfect for dissecting the Kendrick-Drake narrative.
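
As a minimal sketch of the node-and-edge idea, assuming nothing beyond the Python standard library, the snippet below stores a few relationships as (source, relation, target) triples and follows them two hops out. The triples are taken from the retriever output shown later in this post; the helper names `build_adjacency` and `two_hop` are illustrative and are not part of Neo4j or LangChain.

```python
from collections import defaultdict

# (source, relation, target) triples, copied from the retriever output later in this post
TRIPLES = [
    ("Kendrick Lamar", "ALIAS", "K.Dot"),
    ("K.Dot", "RELEASED", "Training Day"),
    ("K.Dot", "RELEASED", "C4"),
]


def build_adjacency(triples):
    """Index outgoing edges by their source node."""
    adjacency = defaultdict(list)
    for source, relation, target in triples:
        adjacency[source].append((relation, target))
    return adjacency


def two_hop(adjacency, start):
    """Return every path of exactly two edges that begins at `start`."""
    paths = []
    for rel1, mid in adjacency.get(start, []):
        for rel2, end in adjacency.get(mid, []):
            paths.append((start, rel1, mid, rel2, end))
    return paths


adjacency = build_adjacency(TRIPLES)
for path in two_hop(adjacency, "Kendrick Lamar"):
    print(" -> ".join(path))
```

A plain keyword search for "Kendrick Lamar" would not surface releases credited to "K.Dot"; following the ALIAS edge first is what connects them, and that is the kind of context a graph adds to retrieval.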

 

Methodology

Data Collection

Our analysis begins with collecting data points (lyrics, tweets, interviews): anything where Kendrick and Drake reference each other, either directly or indirectly. For this walkthrough, the source is the Wikipedia results for "Kendrick Lamar", which are loaded and split into chunks using LangChain tools:

from langchain_community.document_loaders import WikipediaLoader
from langchain_text_splitters import TokenTextSplitter

raw_documents = WikipediaLoader(query="Kendrick Lamar").load()
text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=24)
documents = text_splitter.split_documents(raw_documents[:3])
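
To make `chunk_size` and `chunk_overlap` concrete, here is a rough sketch of the sliding-window idea in plain Python. It operates on a list that is already tokenised and is not LangChain's implementation; `split_with_overlap` is a hypothetical helper, and the tiny sizes are chosen only to keep the output readable.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of `chunk_size` that share `chunk_overlap` tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
        start += step
    return chunks


tokens = list(range(10))  # stand-in for real token ids
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Each chunk repeats the last token of the previous one, so a sentence that straddles a boundary still appears with some of its surrounding context in both chunks.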

Graph Construction

We translate this prepared textual data into a structured graph format, employing LLMs to automatically recognize and encode relationships between entities mentioned in the data:

from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm = ChatOpenAI(temperature=0, model_name="gpt-4")
llm_transformer = LLMGraphTransformer(llm=llm)
graph_documents = llm_transformer.convert_to_graph_documents(documents)
graph.add_graph_documents(graph_documents, baseEntityLabel=True, include_source=True)

Let’s have a look at a section of the database as a graph.

Hybrid Retrieval for RAG

With the graph constructed, our next step involves a hybrid retrieval approach. This method enhances the retrieval of contextually relevant information by combining vector search through unstructured text with structured graph data:

from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings

vector_index = Neo4jVector.from_existing_graph(
    OpenAIEmbeddings(),
    search_type="hybrid",
    node_label="Document",
    text_node_properties=["text"],
    embedding_node_property="embedding"
)
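
The chain in the Results section uses a `retriever` that this post does not define. One plausible shape for it, offered as an assumption rather than the exact code behind the results here, is a function that merges graph relationships and retrieved text chunks into a single context string. The name `combine_context` and the section labels are hypothetical.

```python
def combine_context(structured_lines, unstructured_chunks):
    """Merge graph relationships and text chunks into one prompt context."""
    structured_part = "\n".join(structured_lines) or "(none found)"
    unstructured_part = "\n---\n".join(unstructured_chunks) or "(none found)"
    return (
        "Structured data:\n"
        f"{structured_part}\n"
        "Unstructured data:\n"
        f"{unstructured_part}"
    )


# Hypothetical inputs: in practice the first list would come from a graph query
# and the second from something like vector_index.similarity_search(question).
context = combine_context(
    ["Kendrick Lamar - ALIAS -> K.Dot"],
    ["(text of a retrieved Wikipedia chunk)"],
)
print(context)
```

Keeping the two sources under separate labels lets the language model tell explicit relationships apart from free text when it composes an answer.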

Results

To analyse the feud, we pose queries directly related to the artists' interactions, using the graph to retrieve detailed contexts and understand the underlying dynamics:

from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

template = """Answer the question based only on the following context:
{context}

Question: {question}
Use natural language and be concise.
Answer:"""

prompt = ChatPromptTemplate.from_template(template)

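# Note: `_search_query` and `retriever` are not defined in this post;
# they are assumed to have been created already.
chain = (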
    RunnableParallel(
        {
            "context": _search_query | retriever,
            "question": RunnablePassthrough(),
        }
    )
    | prompt
    | llm
    | StrOutputParser()
)

result = chain.invoke({"question": "How do you win a beef in rap?"})
print(result)

Response:

'Winning a beef in rap often involves delivering more impactful diss tracks, gaining public and critical support, and effectively countering accusations. In the context of the Drake and Kendrick Lamar feud, critics and social media users have generally cited Lamar as leading or winning due to his strong responses and the support from major outlets like Pitchfork, The Ringer, and Rolling Stone.'

We can even delve into the subcontext of who Kendrick Lamar is. Take this, for example:

print(structured_retriever("Who is k.dot?"))

We can see the association through the model's response:

K.Dot - RELEASED -> Y.H.N.I.C. (Hub City Threat: Minor Of The Year)

K.Dot - RELEASED -> Training Day

K.Dot - RELEASED -> C4

Kendrick Lamar - ALIAS -> K.Dot

Now for the big question: “Can you empirically prove that Kendrick Lamar won the beef against Drake?”

Let’s have a look at GraphRAG’s input and response.

chain.invoke(
    {
        "question": "Can you empirically prove that Kendrick Lamar won the beef against Drake?",
        "chat_history": [
            (
                "How do you win a beef in rap?",
                "Winning a beef in rap often involves delivering more impactful diss tracks, gaining public and critical support, and effectively countering accusations. In the context of the Drake and Kendrick Lamar feud, critics and social media users have generally cited Lamar as leading or winning due to his strong responses and the support from major outlets like Pitchfork, The Ringer, and Rolling Stone."
            )
        ],
    }
)

Yes, critics from major outlets like Pitchfork, The Ringer, and Rolling Stone have generally cited Kendrick Lamar as the winner of the beef against Drake. Additionally, social media users and music critics have also considered Lamar the winner, indicating strong public and critical support for his position in the feud.
— GraphRAG
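
The call above passes a `chat_history` of (question, answer) tuples alongside the new question. How the chain consumes that history is not shown in this post; a common pattern, sketched here as an assumption, is to flatten the history into a transcript and ask the model to rewrite the follow-up as a standalone question before retrieval. Both helper names are hypothetical.

```python
def format_chat_history(chat_history):
    """Flatten (question, answer) tuples into a plain-text transcript."""
    lines = []
    for human, ai in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    return "\n".join(lines)


def build_condense_prompt(question, chat_history):
    """Build a prompt asking a model to rewrite a follow-up as a standalone question."""
    if not chat_history:
        return question  # nothing to condense, so use the question as-is
    return (
        "Given the conversation below, rephrase the follow-up question "
        "so that it can be understood on its own.\n\n"
        f"{format_chat_history(chat_history)}\n\n"
        f"Follow-up question: {question}\n"
        "Standalone question:"
    )


history = [("How do you win a beef in rap?", "(previous answer)")]
print(build_condense_prompt("Who won?", history))
```

Without this step, a short follow-up such as "Who won?" would be sent to the retriever with no mention of either artist.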

 

Conclusion

GraphRAG provides a deep and nuanced view into the layers of the Kendrick and Drake feud, uncovering not just what was said, but the broader context of why it was said, thus enabling a clearer understanding of their interactions. While it may not solve their feud, it does offer a powerful tool for analyzing similar conflicts in the music industry and beyond.

 

Future Applications: GraphRAG in Business and Legal Document Analysis

GraphRAG's capabilities are broadly applicable across various business domains where complex data relationships need to be deciphered quickly and accurately. For instance, in sectors like finance and healthcare, GraphRAG can enhance decision-making by providing comprehensive insights through the analysis of interconnected data points. This could include identifying trends from financial reports or understanding patient outcomes from medical records, all organized in an intuitive graph-based format. This approach not only streamlines data processing but also enriches the context and depth of the information retrieved, aiding strategic business decisions.

In the legal field, GraphRAG introduces a revolutionary way to handle and interpret dense legal documents. By building knowledge graphs that map out the relationships between cases, statutes, and legal principles, the tool dramatically reduces the time required for legal research and enhances the precision of legal advice. Lawyers can leverage this technology to quickly find relevant case law and statutory references, ensuring a comprehensive understanding of all pertinent legal texts.

The implementation of GraphRAG within legal practices particularly highlights its potential to manage the complexity of legal language and the intricate network of legal rulings. As such, this technology not only boosts the efficiency of legal research but also provides deeper analytical insights, potentially transforming the traditional methodologies of legal analysis and impacting future legal proceedings and outcomes.

 

Ready to Transform Your Business with GraphRAG?

At Advancing Analytics, we specialise in implementing advanced analytical solutions that drive decision-making and innovation. If you're interested in exploring how GraphRAG can enhance your data analytics capabilities or want to integrate the technology into your existing systems, we're here to help.

 

Contact Us Today to schedule a consultation and discover how our expertise can empower your team to uncover actionable insights from complex data. Let's unlock the potential of your data together.

Author

Christopher Durow