Nipun D. Pathirage
We present The Idea Graph, a novel framework for constructing semantically grounded knowledge graphs from scientific literature by capturing both concrete document structure and abstract scholarly discourse. Extending prior work focused on metadata or citation networks, our approach models complex scientific intent (research problems, proposed solutions, evidence, and claims), each grounded in its originating textual context. Central to our system is a discourse-aware ontology, inspired by Toulmin’s argumentation model, which unifies document structure, rhetorical intent, and attribution semantics into a machine-readable schema. We implement an end-to-end pipeline that includes layout-preserving document parsing, ontology-guided abstract entity extraction using a multi-agent LLM architecture, and structured graph instantiation in RDF. The system supports hierarchical vector indexing for retrieval-augmented generation (RAG), provenance-aware attribution, and semantic querying. An interactive web UI allows users to visualize discourse-level graph elements and citation roles. This work lays the foundation for scalable, interpretable, and ontology-compliant scholarly knowledge graphs, advancing the automation of knowledge discovery and scientific reasoning.
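To make the idea of discourse-level graph instantiation in RDF more concrete, the following is a minimal sketch, not the project's actual schema or code. The namespace `http://example.org/idea-graph#`, the class `DiscourseEntity`, and the properties `sourceText`, `inSection`, and `supports` are all hypothetical names chosen for illustration. The sketch assumes that each extracted entity keeps the text span and section it came from, and serializes it as N-Triples using only the Python standard library.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical namespace and property names, for illustration only.
IG = "http://example.org/idea-graph#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class DiscourseEntity:
    local_id: str  # identifier within the graph, e.g. "claim1"
    kind: str      # discourse role, e.g. "Claim", "Evidence", "ResearchProblem"
    text: str      # source text span the entity is grounded in
    section: str   # document section the span was taken from


def literal(value: str) -> str:
    """Quote a string as an N-Triples literal (escapes backslash, quote, newline)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(entity: DiscourseEntity,
                supports: Optional[DiscourseEntity] = None) -> List[str]:
    """Emit type and provenance triples, plus an optional 'supports' link."""
    subject = f"<{IG}{entity.local_id}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{IG}{entity.kind}> .",
        f"{subject} <{IG}sourceText> {literal(entity.text)} .",
        f"{subject} <{IG}inSection> {literal(entity.section)} .",
    ]
    if supports is not None:
        lines.append(f"{subject} <{IG}supports> <{IG}{supports.local_id}> .")
    return lines


if __name__ == "__main__":
    # Made-up example entities, not taken from any real paper.
    claim = DiscourseEntity("claim1", "Claim", "The method improves recall.", "Results")
    evidence = DiscourseEntity("evidence1", "Evidence", "Recall rose on the test set.", "Experiments")
    for triple in to_ntriples(claim) + to_ntriples(evidence, supports=claim):
        print(triple)
```

Keeping the source text and section on every entity is one simple way to get the provenance-aware attribution described above: any claim or piece of evidence in the graph can be traced back to the passage that produced it.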
Links:
- Final paper: https://drive.google.com/file/d/1uLc-3v7vdKt4Xcwsrs3jM_xCYloWKAsY/
- Final presentation (slides): https://docs.google.com/presentation/d/1AgxQJGWIHb9XdBVNjeKlvMM0qp9OmDDI60tt33LiH3M/
- Final presentation (video): https://youtu.be/5nuMl7plQSc
- GitHub repository: https://github.com/nipdep/idea-graph