Comparing eleven RDF frameworks and triplestores across I/O and SPARQL query performance at 100K, 1M, and 10M triples
Eleven configurations spanning Python (with Rust cores), Java, C++, C, and C#, and multiple execution models.
| Framework | Language | Engine | Version | License |
|---|---|---|---|---|
| maplib | Python (Rust core) | Polars + Apache Arrow, in-memory | 0.20.15 | Apache 2.0 |
| maplib (disk) * | Python (Rust core) | Polars + Apache Arrow, disk-backed storage | 0.20.15 | Proprietary |
| oxigraph | Python (Rust core) | SPOG indexes, disk-backed store (RocksDB) | 0.5.7 | MIT / Apache 2.0 |
| rdflib | Python (pure) | In-memory dict-of-dicts | latest | BSD 3-Clause |
| Jena | Java | In-memory Model | 5.2.0 | Apache 2.0 |
| RDF4J | Java | MemoryStore SAIL | 5.0.3 | EDL 1.0 |
| QLever | C++ (Docker) | On-disk index + SPARQL endpoint | latest | Apache 2.0 |
| Virtuoso | C (Docker) | Hybrid relational/RDF, column store | 7.x (latest) | GPL v2 |
| GraphDB | Java (Docker) | RDF4J-based triplestore, on-disk persistence | 10.8.0 | Proprietary (free tier) |
| dotNetRDF | C# (Docker) | In-memory TripleStore, Leviathan SPARQL engine | 3.5.1 | MIT |
| Neo4j + n10s | Java (Docker) | Native property graph with neosemantics RDF import | 5.26 + n10s 5.26.0 | GPL v3 (Community) |
* maplib (disk) uses the storage_folder parameter for disk-backed storage. This feature is part of the proprietary maplib distribution and is not available in the open-source release. The in-memory maplib (without storage_folder) is fully open source under Apache 2.0.
Synthetic e-commerce graph (customers, orders, products) generated with a fixed seed for reproducibility.
| Scale | Triples | Turtle file size | N-Triples file size |
|---|---|---|---|
| Medium | ~100 K | 3.6 MB | 10.9 MB |
| Large | ~1 M | 36.9 MB | 111 MB |
| XLarge | ~10 M | 369 MB | 1.1 GB |
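A fixed seed makes the generated graph identical on every run. The following is a minimal sketch of such a generator, not the benchmark's actual one: the predicate names are taken from the queries in this report, while the value pools, IRI layout, entity counts, and the `generate` function itself are assumptions made for illustration.

```python
import random

BASE = "http://benchmark.example/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
XSD_DECIMAL = "<http://www.w3.org/2001/XMLSchema#decimal>"

# Hypothetical value pools; the benchmark's real vocabularies are not published here.
COUNTRIES = ["Norway", "Sweden", "Denmark", "Germany"]
SEGMENTS = ["Consumer", "Corporate"]
STATUSES = ["pending", "shipped", "delivered"]


def iri(local):
    """Wrap a local name in the benchmark namespace as an N-Triples IRI."""
    return "<" + BASE + local + ">"


def lit(text):
    """Plain string literal; the values used here need no escaping."""
    return '"' + text + '"'


def generate(n_customers, n_products, n_orders, seed=42):
    """Yield N-Triples lines for a customer/order/product graph."""
    rng = random.Random(seed)  # private RNG: output depends only on the seed
    for c in range(n_customers):
        s = iri(f"customer/{c}")
        yield f"{s} {RDF_TYPE} {iri('Customer')} ."
        yield f"{s} {RDFS_LABEL} {lit(f'Customer {c}')} ."
        yield f"{s} {iri('country')} {lit(rng.choice(COUNTRIES))} ."
        yield f"{s} {iri('segment')} {lit(rng.choice(SEGMENTS))} ."
    for p in range(n_products):
        yield f"{iri(f'product/{p}')} {RDFS_LABEL} {lit(f'Product {p}')} ."
    for o in range(n_orders):
        s = iri(f"order/{o}")
        customer = iri(f"customer/{rng.randrange(n_customers)}")
        product = iri(f"product/{rng.randrange(n_products)}")
        amount = lit(f"{rng.uniform(5, 500):.2f}") + "^^" + XSD_DECIMAL
        yield f"{s} {iri('placedBy')} {customer} ."
        yield f"{s} {iri('contains')} {product} ."
        yield f"{s} {iri('totalAmount')} {amount} ."
        yield f"{s} {iri('orderStatus')} {lit(rng.choice(STATUSES))} ."
```

Each customer contributes four triples, each product one, and each order four, so the entity counts can be chosen to hit a target triple count. Using a private `random.Random(seed)` instance rather than the module-level functions keeps the output independent of any other code that touches the global generator.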
Four queries of increasing complexity, representative of real analytical workloads. The SPARQL text of each query is given below.
| ID | Description | Complexity |
|---|---|---|
| Q1 | COUNT all triples | Trivial, full scan |
| Q2 | Top 20 customers by spend (GROUP BY + SUM + ORDER BY) | Aggregation over joins |
| Q3 | 3-entity join (customer + order + product) with country filter | Multi-pattern + filter |
| Q4 | Revenue by country/segment with OPTIONAL orders | OPTIONAL + aggregation |

Q1:

    SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o . }

Q2:

    PREFIX : <http://benchmark.example/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?customer_name (COUNT(?order) AS ?order_count) (SUM(?amount) AS ?total_spend)
    WHERE {
      ?order :placedBy ?customer ;
             :totalAmount ?amount .
      ?customer rdfs:label ?customer_name .
    }
    GROUP BY ?customer_name
    ORDER BY DESC(?total_spend)
    LIMIT 20

Q3:

    PREFIX : <http://benchmark.example/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?customer_name ?product_name ?amount ?status
    WHERE {
      ?order :placedBy ?customer ;
             :contains ?product ;
             :totalAmount ?amount ;
             :orderStatus ?status .
      ?customer rdfs:label ?customer_name ;
                :country "Norway" .
      ?product rdfs:label ?product_name .
    }
    ORDER BY DESC(?amount)
    LIMIT 50

Q4:

    PREFIX : <http://benchmark.example/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?country ?segment
           (COUNT(DISTINCT ?customer) AS ?customers)
           (COUNT(DISTINCT ?order) AS ?orders)
           (SUM(?amount) AS ?revenue)
    WHERE {
      ?customer rdf:type :Customer ;
                :country ?country ;
                :segment ?segment .
      OPTIONAL {
        ?order :placedBy ?customer ;
               :totalAmount ?amount .
      }
    }
    GROUP BY ?country ?segment
    ORDER BY DESC(?revenue)
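When comparing engines it helps to have an engine-independent answer to check results against. The sketch below computes what Q2 asks for (join orders to customers, group by customer label, count and sum, sort descending, keep the top 20) in plain Python over a list of triples. It is an illustration only, not part of the benchmark: the function name `q2_reference` and the tuple-based input format are assumptions, and it assumes each order has exactly one customer and one amount and each customer exactly one label.

```python
from collections import defaultdict
from decimal import Decimal

NS = "http://benchmark.example/"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def q2_reference(triples, limit=20):
    """Return (customer_name, order_count, total_spend) rows, highest spend first.

    `triples` is an iterable of (subject, predicate, object) strings, with
    amounts given as decimal strings such as "19.99".
    """
    placed_by, amount, label = {}, {}, {}
    for s, p, o in triples:
        if p == NS + "placedBy":
            placed_by[s] = o
        elif p == NS + "totalAmount":
            amount[s] = Decimal(o)
        elif p == LABEL:
            label[s] = o

    # Group by label, as the query does: customers sharing a name are merged.
    groups = defaultdict(lambda: [0, Decimal(0)])
    for order, customer in placed_by.items():
        if order in amount and customer in label:
            group = groups[label[customer]]
            group[0] += 1
            group[1] += amount[order]

    rows = [(name, count, total) for name, (count, total) in groups.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

Because Q2 groups by `?customer_name` rather than `?customer`, two customers with the same label fall into one group; the reference mirrors that behaviour. SPARQL leaves the order of rows with equal `?total_spend` unspecified, so a comparison against an engine's output should tolerate reordering among ties.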
Time to read and write RDF data in the Turtle and N-Triples formats.
Best of 3 runs after warmup. All frameworks execute the same SPARQL queries.
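The "best of 3 runs after warmup" rule can be sketched as a small timing helper. This is a minimal illustration rather than the benchmark's actual harness: the name `best_of` is hypothetical, and a single warmup run is an assumption, since the number of warmup iterations is not stated.

```python
import time


def best_of(fn, runs=3, warmup=1):
    """Call fn() `warmup` times untimed, then return the fastest of `runs` timed calls."""
    for _ in range(warmup):
        fn()  # untimed: lets caches, JITs and lazy imports settle
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)  # seconds
```

A call such as `best_of(lambda: run_query(q2))` would wrap one framework's query execution, where `run_query` stands in for whatever that framework's API provides. Taking the minimum rather than the mean reports the run least disturbed by unrelated system activity.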
How each framework handles growing data volumes, from 100K to 10M triples.
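One way to summarise scaling behaviour across the three sizes is the slope of a least-squares line through the timings on log-log axes: a slope near 1 indicates time growing linearly with triple count, and a larger slope indicates super-linear growth. The helper below is a sketch of that calculation; the function name is hypothetical, and the numbers in the usage note are placeholders, not measurements from this benchmark.

```python
import math


def scaling_exponent(sizes, seconds):
    """Least-squares slope of log(seconds) against log(sizes)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

For example, placeholder timings of 0.2 s, 2.0 s and 20.0 s at 100K, 1M and 10M triples give an exponent of approximately 1, meaning linear scaling.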