Graph RAG for Education: Addressing Knowledge Representation Gaps
Developing AI teaching assistants that understand conceptual relationships in educational content.
Problem Definition: Educational AI Integration Challenges
Core Challenge: Current educational AI systems struggle to represent complex conceptual relationships that are fundamental to effective teaching and learning.
The integration of AI in educational settings has largely followed two paths: complete prohibition or unrestricted access to general-purpose tools. This binary approach fails to address the fundamental challenge of creating AI systems that can guide learning without undermining the educational process.
Students increasingly rely on AI tools for coursework completion, creating tension between technological capabilities and pedagogical goals. Rather than implementing blanket restrictions, this project investigates how AI systems can be designed to promote understanding rather than provide direct solutions.
In the context of NLP education, this challenge becomes particularly acute. Students are learning about the very technologies they might misuse, creating both opportunities and risks for AI integration in the classroom.
Technical Limitations of Existing Approaches
Traditional retrieval-augmented generation (RAG) systems excel at finding semantically similar content but struggle with the structured knowledge representation required for educational contexts. These systems treat information as isolated chunks rather than understanding the interconnected nature of educational concepts.
Vector-Only RAG: semantic similarity → isolated chunks
Graph RAG: structured relationships → contextual understanding
Educational content contains implicit relationships between concepts that are critical for comprehension. Understanding attention mechanisms, for example, requires knowledge of their connections to alignment problems, earlier sequence models, and transformer architectures. Vector-based retrieval often misses these structural dependencies.
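These structural dependencies can be made explicit in a concept graph and recovered by traversal rather than similarity search. The sketch below, with an illustrative toy graph (the concept names and edges are assumptions, not the course's actual schema), shows how a bounded breadth-first walk surfaces the related concepts a vector match would miss:

```python
from collections import deque

# Toy concept graph: edges encode "depends on / relates to" links among
# NLP topics. Illustrative only -- not the system's actual schema.
CONCEPT_GRAPH = {
    "attention": ["alignment", "seq2seq", "transformers"],
    "seq2seq": ["rnn", "encoder-decoder"],
    "transformers": ["self-attention", "positional-encoding"],
    "alignment": ["machine-translation"],
}

def related_concepts(start: str, max_depth: int = 2) -> set:
    """Breadth-first traversal collecting concepts within max_depth hops."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth bound
        for neighbor in CONCEPT_GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen - {start}

print(sorted(related_concepts("attention", max_depth=1)))
# ['alignment', 'seq2seq', 'transformers']
```

Widening `max_depth` to 2 additionally pulls in second-hop concepts such as the earlier sequence models, which is exactly the structural context the paragraph above describes.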
System Design and Implementation
This project developed a graph-based knowledge representation system for CS 4740/5740 (Introduction to NLP) at Cornell University, working in collaboration with Claire Cardie. The system addresses three core technical requirements for educational AI.
🎯 Core Technical Requirements
Structured Knowledge Representation
Capture explicit relationships between NLP concepts, allowing for systematic traversal of conceptual dependencies
Adaptive Retrieval Strategies
Modify search depth and breadth based on question classification and educational objectives
Pedagogical Alignment
Generate responses that promote learning rather than providing direct solutions to assignments
Architecture: Knowledge Graph Construction and Retrieval
The system was implemented in Spring 2024, before Graph RAG techniques saw widespread adoption, and was developed concurrently with early research in the area. The architecture combines automated knowledge extraction with hybrid retrieval strategies.
Implementation Details
Knowledge Graph Construction: Automated entity and relationship extraction from course lecture documents using GPT-3.5/GPT-4, generating Cypher queries for Neo4j database population. The extraction process focuses on capturing both explicit concept definitions and implicit relationships between NLP techniques, theories, and applications.
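The extraction-to-population step can be sketched as follows. Here `extract_triples` is a stand-in for the LLM call (GPT-3.5/4 in the original system); the returned triples, node label `Concept`, and relation names are illustrative assumptions:

```python
# Sketch of the extraction -> Cypher pipeline. extract_triples stands in
# for the LLM prompt over lecture text; its output format is assumed.
def extract_triples(lecture_text: str) -> list:
    """Placeholder for the LLM call returning (head, relation, tail) triples."""
    return [
        ("Attention", "ADDRESSES", "Alignment Problem"),
        ("Transformer", "USES", "Attention"),
    ]

def triples_to_cypher(triples) -> list:
    """Render extracted triples as idempotent Cypher MERGE statements."""
    statements = []
    for head, rel, tail in triples:
        statements.append(
            f"MERGE (a:Concept {{name: '{head}'}}) "
            f"MERGE (b:Concept {{name: '{tail}'}}) "
            f"MERGE (a)-[:{rel}]->(b)"
        )
    return statements

for stmt in triples_to_cypher(extract_triples("lecture text ...")):
    print(stmt)
```

Using `MERGE` rather than `CREATE` keeps population idempotent, so re-running extraction over overlapping lecture documents does not duplicate nodes or edges.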
Hybrid Retrieval Implementation:
- Vector embeddings for semantic similarity matching
- Keyword-based search for exact terminology retrieval
- Graph traversal algorithms for relationship exploration
- Question classification to determine optimal retrieval strategy
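The first two retrieval components can be fused with a simple weighted score. This is a minimal sketch under assumed weights and toy embeddings, not the system's actual ranking function:

```python
import math

# Toy hybrid scorer: cosine similarity over embeddings blended with exact
# keyword overlap. The weight alpha and the embeddings are assumptions.
def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def keyword_overlap(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_score(q_emb, d_emb, query, doc, alpha=0.7) -> float:
    """alpha weights semantic similarity against exact-terminology matching."""
    return alpha * cosine(q_emb, d_emb) + (1 - alpha) * keyword_overlap(query, doc)
```

The keyword term keeps exact NLP terminology (e.g. "BLEU", "perplexity") retrievable even when embedding similarity alone would rank it poorly.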
Adaptive Response Generation: The system modifies graph traversal parameters based on question taxonomy. Theoretical questions trigger deeper conceptual exploration, while clarification requests focus on adjacent concept relationships.
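The adaptive policy above can be expressed as a small lookup keyed by question class. The class names, keyword heuristics, and depth/breadth numbers below are illustrative assumptions standing in for the system's actual classifier:

```python
# Sketch of the adaptive policy: question class determines graph-traversal
# depth and breadth. Labels and parameter values are assumptions.
TRAVERSAL_POLICY = {
    "theoretical":   {"max_depth": 3, "max_neighbors": 10},
    "clarification": {"max_depth": 1, "max_neighbors": 5},
    "factual":       {"max_depth": 1, "max_neighbors": 3},
}

def classify_question(question: str) -> str:
    """Naive keyword heuristic standing in for the real classifier."""
    q = question.lower()
    if any(w in q for w in ("why", "how does", "explain")):
        return "theoretical"
    if any(w in q for w in ("mean", "difference", "clarify")):
        return "clarification"
    return "factual"

def traversal_params(question: str) -> dict:
    return TRAVERSAL_POLICY[classify_question(question)]

print(traversal_params("Why does attention help with long sequences?"))
# {'max_depth': 3, 'max_neighbors': 10}
```

In the deployed system a learned classifier would replace the keyword rules, but the policy table captures the design: theoretical questions get deep exploration, clarifications stay local.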
Experimental Methodology and Results
The evaluation framework addresses limitations in traditional NLP metrics when applied to educational contexts. Rather than relying solely on ROUGE or BLEU scores, the assessment focuses on pedagogical effectiveness and conceptual accuracy.
Performance Analysis
The system demonstrated particular strength in scenarios requiring conceptual relationship understanding. For theory-based questions about transformer architectures and attention mechanisms, the graph structure enabled comprehensive exploration of related concepts and their dependencies.
Performance limitations emerged primarily in questions requiring rich contextual understanding beyond the structured knowledge representation. Circumstantial queries referencing specific lecture moments or discussion context proved challenging for the graph-based approach.
🆚 Comparative Analysis with Vector-Only RAG
Performance Improvements Over Vector-Only Baseline
The performance improvements align with theoretical expectations about graph-based knowledge representation. Questions benefiting from structured relationship understanding showed the most significant gains, while improvements were minimal for tasks where vector similarity sufficed.
Technical Insights and Limitations
This implementation revealed several key insights about applying graph-based knowledge representation to educational contexts. The most significant finding concerns the relationship between question type and optimal retrieval strategy.
Key Technical Findings:
- Evaluation methodology matters: Educational AI requires domain-specific metrics beyond traditional NLP benchmarks
- Question classification is critical: Different learning objectives require different retrieval strategies
- Structured representation has limits: Graph abstraction can lose important contextual nuances
- Hybrid approaches show promise: Combining graph and vector search addresses complementary weaknesses
The evaluation framework developed for this project required collaboration with domain experts to assess pedagogical value rather than information accuracy alone. This highlighted the importance of task-appropriate evaluation metrics in educational AI applications.
Implications for Educational AI Design
This work demonstrates that effective educational AI requires careful consideration of both technical architecture and pedagogical objectives. The challenges encountered in this project extend beyond NLP education to broader questions about AI system design for human learning.
The proposed EDGuard framework represents a systematic approach to moderating AI-student interactions. This system would analyze both questions and responses to ensure appropriate educational engagement, promoting learning through guided discovery rather than direct answer provision.
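Since EDGuard is proposed rather than built, the following is a hypothetical sketch of its moderation step: screen a drafted response and block it if it hands over a direct solution. The marker phrases and return format are invented for illustration, not the framework's actual rules:

```python
# Hypothetical EDGuard-style moderation gate. The solution markers and the
# decision format are illustrative assumptions, not the proposed spec.
SOLUTION_MARKERS = ("here is the full solution", "complete code:", "final answer:")

def moderate_response(question: str, draft: str) -> dict:
    """Allow guidance-style drafts; flag drafts that give direct answers."""
    lowered = draft.lower()
    direct = any(marker in lowered for marker in SOLUTION_MARKERS)
    return {
        "allow": not direct,
        "reason": "direct solution detected" if direct else "guidance-style response",
    }

print(moderate_response("Solve HW2 Q3", "Here is the full solution: ..."))
# {'allow': False, 'reason': 'direct solution detected'}
```

A real implementation would classify both question intent and response content with a model rather than string markers, but the gate's contract, analyze the pair and allow or rewrite, is the essential design.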
The technical insights from graph-based knowledge representation have broader applications in developing AI systems that can reason about relationship structures and their effects on human understanding and capability development.
Connections to Current Research
This project addresses fundamental questions in AI alignment and human-AI interaction that extend beyond educational applications. The challenge of building systems that enhance rather than replace human capability is central to current discussions about beneficial AI development.
The evaluation methodologies developed here, particularly the focus on measuring systems across different task types and contexts, inform current approaches to assessing reasoning capabilities in language models. The importance of task-appropriate metrics and domain-expert evaluation has become increasingly relevant as AI systems are deployed in specialized domains.
Most importantly, this work demonstrates that the most impactful AI research occurs at the intersection of technical innovation and human values. Understanding both the technology and the specific domain requirements is essential for developing AI systems that provide genuine benefit rather than superficial convenience.