We provide a new benchmark, called Sparqloscope, for evaluating the query performance of SPARQL engines. The benchmark has three features that set it apart from other such benchmarks:
- Sparqloscope is comprehensive in that it covers most features of the SPARQL 1.1 query language that are relevant in practice. In particular: basic graph patterns, OPTIONAL, FILTER, ORDER BY, LIMIT, DISTINCT, GROUP BY and aggregates, UNION, EXISTS, MINUS, and SPARQL functions (for numerical values, strings, and dates).
- Sparqloscope is generic in the sense that it can be applied to any given RDF dataset and will then produce a comprehensive benchmark for that particular dataset. Existing benchmarks are either synthetic or manually constructed for a fixed dataset.
- Sparqloscope is specific in the sense that it aims to evaluate each feature in isolation (independently of other features) as much as possible. This allows pinpointing specific strengths and weaknesses of a particular engine; an illustrative example query is shown below this list.
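The following is a minimal sketch of what such a feature-isolating query might look like; the actual benchmark queries are generated per dataset from the templates in query-templates.yaml, and the predicate dblp:title is merely an assumed placeholder for a frequent predicate of the dataset.

```sparql
# Hypothetical sketch of a query that isolates a single feature (here: ORDER BY).
# dblp:title is an assumed placeholder for a frequent predicate of the dataset;
# the actual benchmark queries are generated from the query templates.
PREFIX dblp: <https://dblp.org/rdf/schema#>
SELECT ?title WHERE { ?paper dblp:title ?title }
ORDER BY ?title
```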
Sparqloscope is free and open-source software, and it is easy to use. As a showcase, we use it to evaluate the performance of three high-performing SPARQL engines (Virtuoso, MillenniumDB, QLever) on two widely used RDF datasets (DBLP and Wikidata).
Assuming a SPARQL endpoint for the DBLP dataset is running on port 7015 on your machine, you can generate a benchmark for this dataset with the following command line (for details, see --help):
```bash
python3 generate-benchmark.py \
  --sparql-endpoint http://localhost:7015 \
  --prefix-definitions "$(cat prefixes/dblp.ttl)" \
  --kg-name dblp
```
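Before generating the benchmark, it may be useful to verify that the endpoint on port 7015 is up and serves the DBLP dataset. The following query is just a generic sanity check (it is not part of Sparqloscope) and should be answered by any SPARQL endpoint:

```sparql
# Generic sanity check (not part of Sparqloscope): count all triples to confirm
# that the endpoint at http://localhost:7015 is reachable and serves data.
SELECT (COUNT(*) AS ?numTriples) WHERE { ?s ?p ?o }
```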
Ready-to-use benchmarks, which we have generated with Sparqloscope for popular datasets, can be found in the benchmarks/ folder.
An interactive web app for the evaluation results on various engines can be found at https://purl.org/ad-freiburg/sparqloscope-evaluation.
Detailed setup instructions for running Sparqloscope can be found in the setup documentation.
The precomputation queries, the placeholders, and the query templates are described in query-templates.yaml. The exact benchmark generation procedure is documented in generate-benchmark.py.
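To give a rough idea of how this works (the actual queries are defined in query-templates.yaml and may look different), a precomputation query could, for example, determine the most frequent predicates of the dataset, whose IRIs are then substituted for placeholders in the query templates:

```sparql
# Purely illustrative sketch, not the actual precomputation query from
# query-templates.yaml: find the ten most frequent predicates, which could then
# be used to fill placeholders in the query templates.
SELECT ?p (COUNT(*) AS ?count) WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 10
```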
This project is licensed under the Apache 2.0 License. For more information, see the LICENSE file.