¹Microsoft Research Asia · ²UCSD · ³Tsinghua University
*Equal Contribution · †Corresponding Author · ‡Work done during internship at Microsoft
We define RPG as a hierarchical, dual-view graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. The node set $\mathcal{V} = \mathcal{V}_{H} \cup \mathcal{V}_{L}$ distinguishes High-level Nodes $\mathcal{V}_{H}$, which represent architectural directories, from Low-level Nodes $\mathcal{V}_{L}$, which comprise atomic implementations (files, classes, functions). Each node $v = (f, \mathbf{m}) \in \mathcal{V}$ pairs a semantic feature $f$ describing functionality with structural metadata $\mathbf{m}$ encoding code entity attributes. The edge set $\mathcal{E}$ integrates two perspectives over this shared node set: a Functional View and a Dependency View.
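As a rough illustration of this definition, the sketch below models nodes as (feature, metadata) pairs and tags each edge with the view it belongs to. The class names, field names, and the toy nodes are hypothetical choices made for this example; they are not taken from the RPG-Encoder codebase.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    HIGH = "high"  # V_H: architectural directories
    LOW = "low"    # V_L: files, classes, functions


class View(Enum):
    FUNCTIONAL = "functional"  # edges of the functional hierarchy
    DEPENDENCY = "dependency"  # code-level edges between entities


@dataclass(frozen=True)
class Node:
    node_id: str
    level: Level
    feature: str          # f: what the entity does, in natural language
    metadata: tuple = ()  # m: (key, value) pairs describing the code entity


@dataclass
class RPG:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: set = field(default_factory=set)    # (src_id, dst_id, View)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, src, dst, view):
        # Both views draw their endpoints from the same node set.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must already be in the node set")
        self.edges.add((src, dst, view))

    def neighbors(self, node_id, view):
        return sorted(d for s, d, v in self.edges if s == node_id and v == view)


# Toy graph (hypothetical repository layout).
g = RPG()
g.add_node(Node("io", Level.HIGH, "data loading", (("path", "pkg/io"),)))
g.add_node(Node("io.read_csv", Level.LOW, "parse CSV files",
                (("type", "function"), ("path", "pkg/io/csv.py"))))
g.add_node(Node("io.open_file", Level.LOW, "open a file handle",
                (("type", "function"), ("path", "pkg/io/files.py"))))
g.add_edge("io", "io.read_csv", View.FUNCTIONAL)
g.add_edge("io", "io.open_file", View.FUNCTIONAL)
g.add_edge("io.read_csv", "io.open_file", View.DEPENDENCY)
```

Storing the view on the edge rather than on the node keeps a single node set while letting each view be queried separately, for example `g.neighbors("io", View.FUNCTIONAL)`.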
Lifts the codebase into a discrete registry of Low-level Nodes. Extracts semantic features for functions and classes, mapping them to behavioral signatures while retaining code-level attributes as metadata.
Constructs High-level Nodes by recovering the latent functional topology. Performs Functional Abstraction via granularity-based compression and Hierarchical Aggregation to link nodes to semantic centroids.
Anchors the functional manifold to physical artifacts using LCA-based bottom-up propagation. Injects dependency edges via AST analysis to complete the implementation map.
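The grounding step can be pictured with a small sketch. It assumes that "LCA-based bottom-up propagation" means each functional node is anchored to the lowest common ancestor directory of its children's paths, and that dependency edges include direct calls between functions found by walking the AST. Both readings, and the helper names `ground_paths` and `call_edges`, are assumptions made for illustration rather than the authors' implementation.

```python
import ast
import posixpath


def ground_paths(children, leaf_paths):
    """Anchor every functional node to a path, bottom-up.

    children:   functional node -> list of child nodes
    leaf_paths: leaf node -> file path of its implementation
    """
    anchored = dict(leaf_paths)

    def visit(node):
        if node not in anchored:
            # A parent's anchor is the lowest common ancestor of its children.
            anchored[node] = posixpath.commonpath([visit(c) for c in children[node]])
        return anchored[node]

    for node in children:
        visit(node)
    return anchored


def call_edges(source):
    """Return (caller, callee) pairs for direct calls between top-level functions."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    edges = set()
    for fn in tree.body:
        if not isinstance(fn, ast.FunctionDef):
            continue
        for node in ast.walk(fn):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in defined):
                edges.add((fn.name, node.func.id))
    return edges


# Hypothetical feature tree and file layout.
children = {
    "repo": ["data loading", "modeling"],
    "data loading": ["read csv", "read json"],
    "modeling": ["fit linear model"],
}
leaf_paths = {
    "read csv": "pkg/io/csv.py",
    "read json": "pkg/io/json.py",
    "fit linear model": "pkg/models/linear.py",
}
anchored = ground_paths(children, leaf_paths)

source = (
    "def load(path):\n"
    "    return parse(open(path).read())\n"
    "\n"
    "def parse(text):\n"
    "    return text.split(',')\n"
)
deps = call_edges(source)
```

In this toy layout "data loading" resolves to `pkg/io` and the root to `pkg`, while the AST pass records that `load` calls `parse`. A real pipeline would also need to resolve imports, methods, and cross-file references, which this sketch does not attempt.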
To reduce the cost of full re-generation, we maintain the graph incrementally via commit-level feature extraction and three atomic update protocols:
RPG provides a queryable index where Functional and Dependency Views are partitioned by edge types but share a unified node set. Three core tools enable navigation:
Global retrieval by matching intent against semantic features or filtering metadata.
Node-level data retrieval: extracts attributes and raw source code.
Cross-view traversal along edges for navigating execution flows.
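The three tools described above can be sketched over a tiny in-memory index. The function names (`search_nodes`, `fetch_node`, `explore`), the keyword-overlap scoring, and the sample nodes are hypothetical stand-ins; the source does not specify the tool signatures or the matching method.

```python
# Hypothetical index: one shared node set, edges labeled by view.
NODES = {
    "data_loading": {"feature": "load tabular data", "type": "directory",
                     "path": "pkg/io", "source": ""},
    "io.read_csv": {"feature": "parse csv files into rows", "type": "function",
                    "path": "pkg/io/csv.py", "source": "def read_csv(path): ..."},
    "io.open_file": {"feature": "open a file handle", "type": "function",
                     "path": "pkg/io/files.py", "source": "def open_file(path): ..."},
}
EDGES = [
    ("data_loading", "io.read_csv", "functional"),
    ("io.read_csv", "io.open_file", "dependency"),
]


def search_nodes(query="", **metadata_filters):
    """Global retrieval: keyword overlap with features, plus exact metadata filters."""
    terms = set(query.lower().split())
    hits = []
    for node_id, attrs in NODES.items():
        if any(attrs.get(key) != value for key, value in metadata_filters.items()):
            continue
        overlap = len(terms & set(attrs["feature"].lower().split()))
        if terms and overlap == 0:
            continue
        hits.append((overlap, node_id))
    # Best overlap first, ties broken by node id for determinism.
    return [node_id for _, node_id in sorted(hits, key=lambda h: (-h[0], h[1]))]


def fetch_node(node_id):
    """Node-level retrieval: attributes and raw source code."""
    return dict(NODES[node_id])


def explore(start, edge_types=("functional", "dependency"), max_hops=2):
    """Breadth-first traversal along the selected edge types."""
    seen, frontier = [start], [start]
    for _ in range(max_hops):
        reached = [dst for src, dst, kind in EDGES
                   if kind in edge_types and src in frontier and dst not in seen]
        frontier = list(dict.fromkeys(reached))  # de-duplicate, keep order
        seen.extend(frontier)
    return seen[1:]
```

With this index, `search_nodes("parse csv")` finds the CSV reader, `fetch_node` returns its attributes and source, and `explore("data_loading")` follows a functional edge and then a dependency edge, which is the cross-view hop the third tool is meant to support. A production system would likely replace the keyword overlap with embedding or LLM-based matching.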
We evaluate RPG-Encoder on two challenging benchmarks: SWE-bench (Verified and Live) for fault localization and RepoCraft for repository reconstruction. Across five backbone models, RPG-guided agents outperform all four localization baselines on every reported metric, with margins of up to +17.2 points at the function level, alongside significant efficiency gains.
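| Method | SWE-bench Verified, File-level, Acc@1 | SWE-bench Verified, File-level, Acc@5 | SWE-bench Verified, File-level, Pre | SWE-bench Verified, File-level, Rec | SWE-bench Verified, Function-level, Acc@1 | SWE-bench Verified, Function-level, Acc@5 | SWE-bench Verified, Function-level, Pre | SWE-bench Verified, Function-level, Rec | SWE-bench Live, File-level, Acc@1 | SWE-bench Live, File-level, Acc@5 | SWE-bench Live, File-level, Pre | SWE-bench Live, File-level, Rec | SWE-bench Live, Function-level, Acc@1 | SWE-bench Live, Function-level, Acc@5 | SWE-bench Live, Function-level, Pre | SWE-bench Live, Function-level, Rec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|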
| o3-mini | ||||||||||||||||
| Agentless | 67.1 | 88.1 | 67.0 | 64.7 | 34.7 | 60.3 | 39.4 | 33.2 | 54.2 | 78.5 | 55.6 | 47.7 | 28.8 | 54.2 | 39.3 | 25.6 |
| OrcaLoca | 67.5 | 71.9 | 68.3 | 64.0 | 46.3 | 52.9 | 48.3 | 41.5 | 35.4 | 38.0 | 36.2 | 27.6 | 23.1 | 26.1 | 25.3 | 15.6 |
| LocAgent | 62.8 | 77.2 | 64.7 | 61.4 | 32.1 | 40.5 | 33.9 | 28.9 | 47.6 | 59.4 | 49.7 | 41.2 | 23.8 | 31.0 | 26.6 | 17.7 |
| CoSIL | 66.5 | 85.7 | 66.2 | 63.6 | 52.2 | 73.3 | 54.7 | 47.1 | 60.9 | 80.8 | 66.1 | 54.8 | 43.8 | 65.1 | 51.4 | 35.6 |
| RPG-Encoder (Ours) | 78.3 | 91.2 | 80.7 | 76.8 | 58.5 | 77.8 | 62.9 | 55.1 | 73.7 | 88.2 | 77.5 | 64.5 | 56.5 | 75.6 | 64.7 | 46.9 |
| Δ best | +10.8 | +3.1 | +12.4 | +12.1 | +6.3 | +4.5 | +8.2 | +8.0 | +12.8 | +7.4 | +11.4 | +9.7 | +12.7 | +10.5 | +13.3 | +11.3 |
| GPT-4o | ||||||||||||||||
| Agentless | 63.0 | 86.1 | 63.1 | 61.1 | 31.4 | 58.8 | 34.7 | 29.3 | 56.1 | 78.8 | 57.1 | 48.3 | 30.6 | 57.4 | 41.4 | 26.4 |
| OrcaLoca | 64.3 | 69.3 | 65.0 | 61.4 | 39.8 | 53.3 | 42.5 | 36.7 | 42.5 | 47.6 | 45.0 | 34.0 | 28.2 | 37.0 | 32.5 | 21.1 |
| LocAgent | 71.9 | 87.9 | 73.4 | 69.3 | 40.1 | 67.4 | 44.8 | 38.1 | 62.5 | 80.0 | 66.8 | 54.2 | 35.7 | 56.4 | 44.5 | 29.9 |
| CoSIL | 64.9 | 84.4 | 65.0 | 62.2 | 43.2 | 66.2 | 48.2 | 40.1 | 60.1 | 77.0 | 63.7 | 50.7 | 41.2 | 61.6 | 49.1 | 29.4 |
| RPG-Encoder (Ours) | 74.5 | 89.6 | 77.0 | 72.7 | 53.1 | 76.7 | 57.9 | 49.5 | 69.2 | 83.5 | 73.2 | 60.3 | 50.5 | 69.4 | 59.4 | 41.8 |
| Δ best | +2.6 | +1.7 | +3.6 | +3.4 | +9.9 | +9.3 | +9.7 | +9.4 | +6.7 | +3.5 | +6.4 | +6.1 | +9.3 | +7.8 | +10.3 | +11.9 |
| GPT-4.1 | ||||||||||||||||
| Agentless | 65.2 | 90.8 | 65.7 | 63.5 | 29.3 | 49.0 | 32.7 | 26.4 | 62.0 | 85.5 | 63.0 | 54.5 | 35.1 | 59.4 | 46.0 | 25.4 |
| OrcaLoca | 75.2 | 80.0 | 76.5 | 71.3 | 55.2 | 66.7 | 59.0 | 50.1 | 56.2 | 59.6 | 57.1 | 44.2 | 42.0 | 50.5 | 46.2 | 29.1 |
| LocAgent | 79.5 | 90.9 | 80.8 | 77.2 | 32.3 | 65.6 | 36.7 | 31.2 | 74.7 | 87.9 | 76.8 | 66.1 | 43.4 | 68.7 | 52.5 | 38.7 |
| CoSIL | 69.8 | 90.6 | 70.7 | 67.6 | 51.8 | 74.5 | 55.3 | 47.0 | 62.3 | 84.7 | 67.3 | 55.6 | 48.8 | 72.2 | 58.3 | 41.2 |
| RPG-Encoder (Ours) | 82.6 | 93.2 | 83.6 | 79.3 | 68.7 | 83.4 | 71.0 | 62.4 | 78.0 | 90.5 | 81.4 | 69.0 | 64.7 | 81.9 | 72.1 | 52.6 |
| Δ best | +3.1 | +2.3 | +2.8 | +2.1 | +13.5 | +8.9 | +12.0 | +12.3 | +3.3 | +2.6 | +4.6 | +2.9 | +15.9 | +9.7 | +13.8 | +11.4 |
| GPT-5 | ||||||||||||||||
| Agentless | 78.7 | 95.9 | 78.3 | 76.2 | 45.1 | 68.1 | 47.3 | 41.3 | 64.5 | 87.4 | 65.1 | 57.4 | 38.8 | 64.6 | 49.7 | 31.6 |
| OrcaLoca | 88.2 | 93.9 | 88.6 | 84.2 | 76.1 | 86.2 | 79.1 | 68.6 | 74.4 | 82.3 | 77.6 | 63.5 | 59.6 | 74.0 | 68.6 | 46.6 |
| LocAgent | 88.2 | 96.7 | 88.4 | 86.7 | 50.9 | 80.3 | 55.9 | 49.7 | 79.7 | 93.0 | 81.4 | 74.2 | 48.0 | 68.7 | 56.6 | 40.5 |
| CoSIL | 82.8 | 95.7 | 82.3 | 80.2 | 68.3 | 81.8 | 68.9 | 62.3 | 69.8 | 89.3 | 72.9 | 62.2 | 55.2 | 76.2 | 62.3 | 46.5 |
| RPG-Encoder (Ours) | 91.9 | 97.7 | 91.1 | 89.1 | 83.4 | 93.6 | 84.5 | 76.9 | 82.1 | 94.4 | 85.4 | 76.2 | 71.9 | 87.8 | 78.1 | 61.1 |
| Δ best | +3.7 | +1.0 | +2.5 | +2.4 | +7.3 | +7.4 | +5.4 | +8.3 | +2.4 | +1.4 | +4.0 | +2.0 | +12.3 | +11.6 | +9.5 | +14.5 |
| Claude-4.5-Sonnet | ||||||||||||||||
| Agentless | 76.6 | 96.5 | 76.9 | 74.4 | 31.7 | 34.6 | 32.0 | 27.1 | 63.8 | 89.7 | 66.1 | 58.0 | 41.4 | 72.4 | 55.3 | 35.9 |
| OrcaLoca | 87.2 | 89.6 | 87.5 | 82.2 | 74.5 | 79.3 | 76.5 | 65.1 | 74.7 | 78.3 | 76.2 | 61.5 | 65.1 | 69.4 | 67.8 | 46.1 |
| LocAgent | 71.4 | 76.6 | 72.7 | 70.2 | 49.3 | 57.8 | 51.5 | 44.9 | 58.7 | 69.0 | 61.6 | 54.7 | 47.3 | 60.5 | 52.6 | 39.3 |
| CoSIL | 75.5 | 96.1 | 75.9 | 73.7 | 57.5 | 78.7 | 60.7 | 52.9 | 64.5 | 88.3 | 69.4 | 57.5 | 51.1 | 74.9 | 60.1 | 39.6 |
| RPG-Encoder (Ours) | 90.5 | 97.6 | 91.8 | 88.6 | 79.8 | 93.7 | 83.4 | 75.8 | 82.0 | 93.9 | 85.6 | 75.8 | 74.8 | 90.4 | 80.7 | 63.3 |
| Δ best | +3.3 | +1.1 | +4.3 | +6.4 | +5.3 | +14.4 | +6.9 | +10.7 | +7.3 | +4.2 | +9.4 | +14.3 | +9.7 | +15.5 | +12.9 | +17.2 |
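The "Δ best" rows can be read as the margin of RPG-Encoder over the strongest baseline in each column, a reading that matches the printed numbers. The short sketch below reproduces one cell (o3-mini, SWE-bench Verified, file-level Acc@1) from the values in the table; the helper name `delta_best` is a hypothetical label for this example.

```python
# Values copied from the o3-mini rows, SWE-bench Verified, file-level Acc@1.
baselines = {"Agentless": 67.1, "OrcaLoca": 67.5, "LocAgent": 62.8, "CoSIL": 66.5}
ours = 78.3


def delta_best(ours, baselines):
    """Return the strongest baseline and our margin over it, rounded to 0.1."""
    best_name = max(baselines, key=baselines.get)
    return best_name, round(ours - baselines[best_name], 1)


best_name, margin = delta_best(ours, baselines)
print(f"{best_name}: +{margin}")  # OrcaLoca: +10.8
```

Note that the strongest baseline differs from column to column, so the margin is not measured against a single competing method.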
| Framework | Backbone | Coverage (%) | Accuracy (Pass/Vote) (%) | #Files | nLOC | Code Tokens |
|---|---|---|---|---|---|---|
| Gold Projects | Human | 100.0 | 94.8 / 98.8 | 345 | 97,725 | 718,946 |
| ZeroRepo-Doc | GPT-4.1 | 64.6 | 50.0 / 63.4 | 209 | 6,079 | 158,948 |
| ZeroRepo-Doc | GPT-5-mini | 74.2 | 52.6 / 71.4 | 143 | 13,414 | 125,625 |
| ZeroRepo-RPG (Ours) | GPT-4.1 | 93.5 | 85.8 / 93.4 | 206 | 35,190 | 346,865 |
| ZeroRepo-RPG (Ours) | GPT-5-mini | 98.5 | 86.0 / 97.7 | 226 | 60,871 | 550,432 |
@misc{luo2026rpgencoder,
title={Closing the Loop: Universal Repository Representation with RPG-Encoder},
author={Jane Luo and Chengyu Yin and Xin Zhang and Qingtao Li and Steven Liu and Yiming Huang and Jie Wu and Hao Liu and Yangyu Huang and Yu Kang and Fangkai Yang and Ying Xin and Scarlett Li},
year={2026},
eprint={2602.02084},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2602.02084},
}