Programming Language Lineage Dataset
An open, evidence-backed dataset of programming language implementation and influence relationships. Every relationship includes a confidence score and at least one evidence source URL.
| Metric | Count |
|---|---|
| Total nodes | 112 |
| Languages | 98 |
| Tools | 14 |
| Relationships | 347 |
Relationship Breakdown
| Type | Count |
|---|---|
| influenced | 189 |
| compiler written in | 78 |
| runtime written in | 56 |
| bootstrap written in | 14 |
| transpiled to | 8 |
| rewritten in | 2 |
Schema
Each language node contains: id, name, first_release_year, paradigm, typing, cluster_hint.
Each relationship contains: from_language, to_language, relationship, confidence (0–1), evidence_source (URL), notes.
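As a rough illustration of the schema above, the two record types could be modeled as follows. This is a minimal sketch, not the dataset's official tooling: the field names come from the schema, but the Python types, the optional `notes` default, and the `validate` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LanguageNode:
    # Field names follow the schema; the types are assumed.
    id: str
    name: str
    first_release_year: Optional[int]
    paradigm: str
    typing: str
    cluster_hint: str


@dataclass
class Relationship:
    # Field names follow the schema; the types are assumed.
    from_language: str
    to_language: str
    relationship: str
    confidence: float
    evidence_source: str
    notes: str = ""

    def validate(self) -> None:
        """Hypothetical check: confidence in [0, 1] and a URL-like evidence source."""
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
        if not self.evidence_source.startswith(("http://", "https://")):
            raise ValueError(f"evidence_source is not a URL: {self.evidence_source}")


# Placeholder values only; they are not taken from the dataset.
example = Relationship(
    from_language="lang-a",
    to_language="lang-b",
    relationship="influenced",
    confidence=0.9,
    evidence_source="https://example.org/evidence",
)
example.validate()
```

Keeping validation in a separate method means records can be loaded first and checked afterwards, which makes it easier to report every malformed entry instead of stopping at the first one.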
Download
The raw dataset JSON is available at:
https://languagelineage.org/dataset/v4/lineage_v4.json
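Once the file has been saved locally, a per-type tally like the one in the Relationship Breakdown table could be reproduced along these lines. This is a sketch under assumptions: the top-level `relationships` key and the exact spelling of the relationship type strings are guesses about the file layout, and `load_dataset` and `count_by_type` are hypothetical helper names.

```python
import json
from collections import Counter
from pathlib import Path


def load_dataset(path):
    """Read a locally saved copy of the dataset JSON."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def count_by_type(dataset):
    """Tally relationships by type.

    Assumes a top-level "relationships" list whose items carry a
    "relationship" field, as described in the schema.
    """
    return Counter(rel["relationship"] for rel in dataset["relationships"])


# Tiny in-memory stand-in so the sketch runs without the real file.
sample = {
    "relationships": [
        {"relationship": "influenced"},
        {"relationship": "influenced"},
        {"relationship": "transpiled to"},
    ]
}
counts = count_by_type(sample)
for rel_type, n in counts.most_common():
    print(f"{rel_type}: {n}")
```

If the layout assumption holds, running `count_by_type(load_dataset("lineage_v4.json"))` on the full file should give counts that sum to 347.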
Citation
Language Lineage dataset (languagelineage.org). Accessed 2026.