Ankit Upadhyay
We present a systematic empirical study of evidence grounding versus memorization on FACTKG. First, we construct feature-based symbolic baselines using 31 hand-crafted features over graph structure and evidence overlap, yielding a 63.96% “symbolic ceiling” that clarifies where non-neural methods break down. Second, we reproduce and extend BERT and QA-GNN baselines, confirming that BERT-base with linearized one-hop subgraphs achieves 92.68% accuracy while QA-GNN variants remain around 70%. Third, we use GPT-4.1-mini as a semantic filter that selects the most relevant triples per claim; BERT trained on 9,706 LLM-filtered examples reaches 78.85% accuracy, whereas training on an equally sized unfiltered subset collapses accuracy to 52.70%, demonstrating that semantic quality, not just data quantity, governs learnability. Finally, we design a 300-example comparison of GPT-4o-mini and GPT-4.1-mini under a memorization condition (claims only) versus KG-grounded chain-of-thought prompting with triple citations: KG grounding raises accuracy from 71.67% to 84.33% and from 74.67% to 84.00%, respectively. Together, these results show that neural semantic representations and explicit KG grounding are essential for robust and interpretable KG-based fact verification.
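As a concrete illustration of the BERT input format, here is a minimal sketch of how a claim and its one-hop evidence subgraph can be linearized into a single sentence pair. The `(head, relation, tail)` tuple format, the `" ; "` separator, and the example claim below are illustrative assumptions, not the exact preprocessing used in the paper.

```python
# A minimal sketch of the linearized one-hop subgraph input, assuming
# evidence triples arrive as (head, relation, tail) string tuples; the
# separator and the example claim below are illustrative assumptions.
from transformers import BertTokenizer

def linearize_triples(triples: list[tuple[str, str, str]]) -> str:
    # Flatten each one-hop triple into "head relation tail" text and join
    # them so BERT can read the evidence as an ordinary token sequence.
    return " ; ".join(f"{h} {r} {t}" for h, r, t in triples)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
claim = "AIDA Cruises is based in Rostock, Germany."  # illustrative claim
triples = [("AIDA_Cruises", "location", "Rostock"),
           ("Rostock", "country", "Germany")]

# Encode as a standard BERT sentence pair: [CLS] claim [SEP] evidence [SEP]
inputs = tokenizer(claim, linearize_triples(triples),
                   truncation=True, max_length=512, return_tensors="pt")
```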
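Likewise, the KG-grounded chain-of-thought condition can be sketched as below, assuming the OpenAI chat completions client; the prompt wording and the `verify_claim` helper are illustrative, not the exact prompt used in the study. The memorization condition corresponds to sending the claim alone, without the evidence block.

```python
# A minimal sketch of KG-grounded chain-of-thought verification; the prompt
# text is an assumption, not the study's exact prompt. Requires OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()

def verify_claim(claim: str, triples: list[tuple[str, str, str]]) -> str:
    evidence = "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)
    prompt = (
        "You are verifying a claim against a knowledge graph.\n"
        f"Claim: {claim}\n"
        f"Evidence triples:\n{evidence}\n"
        "Reason step by step, cite the triples you rely on by copying them, "
        "and end with one line: VERDICT: SUPPORTED or VERDICT: REFUTED."
    )
    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Requiring the model to copy the triples it cites keeps the chain of thought auditable against the retrieved subgraph, which is what distinguishes the grounded condition from memorization.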
Links:
- Final paper: https://drive.google.com/file/d/1LLnUeVLAaW1S3LsGQUovaH2iG1qQHGxq/
- Final presentation (slides): https://docs.google.com/presentation/d/1i1M-8NWKF8Es9un0J00VQeDD8F3Gi9IGRnyPyXed4-A/
- Final presentation (video): https://youtu.be/uOz1olEUOrc
- GitHub repository: https://github.com/AnkitKUpadhyay/kg-fact-verification-semantics