Mirror of https://github.com/libexpat/libexpat.git (synced 2025-04-04 12:54:58 +00:00)
Some of these currently take a very long time to parse. I set those to
only run one loop in the run-benchmark make target.

4096 bytes may be a fairly small buffer, and it definitely makes the
problem worse than it otherwise would have been, but similar sizes exist
in real code:

* 2048 bytes in cpython Modules/pyexpat.c
* 4096 bytes in skia SkXMLParser.cpp
* BUFSIZ bytes (8192 on my machine) in expat/examples

The files, too, are inspired by real-life examples: Android stores depth
and gain maps as base64-encoded JPEGs inside the XMP data of other JPEGs,
sometimes as a text element and sometimes as an attribute value. I've
seen attribute values slightly over 5 MiB in size.

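To make the role of the buffer size concrete, here is a minimal sketch of
feeding a document to Expat in fixed-size chunks, using Python's
xml.parsers.expat bindings. It is not the benchmark program from the Expat
tree. The helper name parse_in_chunks, the <doc a="..."/> input, and the
1 MiB attribute size are assumptions made for illustration.

```python
import io
import time
import xml.parsers.expat


def parse_in_chunks(stream, chunk_size=4096):
    """Feed a binary stream to Expat chunk_size bytes at a time.

    Returns the number of non-empty chunks passed to the parser.
    """
    parser = xml.parsers.expat.ParserCreate()
    chunks = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        parser.Parse(data, False)  # more input follows
        chunks += 1
    parser.Parse(b"", True)  # signal end of document
    return chunks


if __name__ == "__main__":
    # Hypothetical input: one element with a 1 MiB attribute value,
    # much smaller than the files described in this directory.
    doc = b'<doc a="' + b"A" * (1 << 20) + b'"/>'
    for size in (2048, 4096, 8192):
        start = time.perf_counter()
        calls = parse_in_chunks(io.BytesIO(doc), size)
        elapsed = time.perf_counter() - start
        print(f"{size:5d}-byte chunks: {calls} calls, {elapsed:.3f} s")
```

The timings depend on the Expat version that the Python build links
against, so the numbers are only meaningful when compared on one machine.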
This directory contains some really large test files, mostly used to
benchmark various aspects of Expat's performance.

(As files are added, they should be described here, including what
benchmark program they're intended to be used with and what the
resulting measurements tell us.)

* nes96.xml (~2.8 MB):
  - properties: no namespaces, mixed content, average nesting depth
  - source: http://sda.berkeley.edu:7502/ddi/nes96/
    (no indication of license or copyright there)
  - purpose: mostly for performance testing with the benchmark utility

* wordnet_glossary-20010201.xml (~14.4 MB):
  - properties: namespaces, element content, flat
  - source: http://www.semanticweb.org/library/wordnet/
    (license looks Open Source, see license.html file on the same page)
  - purpose: mostly for performance testing with the benchmark utility

* recset.xml (~29.1 MB):
  - properties: small portion with namespaces, bulk without, element
    content, flat
  - source: test data donated by Karl Waclawek
  - purpose: mostly for performance testing with the benchmark utility

* ns_att_test.xml (~34.2 MB):
  - properties: lots of prefixed attributes (28 on average), element
    content, flat
  - source: test data donated by Karl Waclawek
  - purpose: mostly for performance testing with the benchmark
    utility, specifically for testing the duplicate attribute check in
    storeAttributes()

* aaaaaa_attr.xml (~10 MB):
  - properties: trivial file with a huge attribute value
  - source: generated by a simple shell script
  - purpose: performance/regression test

* aaaaaa_cdata.xml (~10 MB):
  - properties: trivial file with huge cdata content
  - source: generated by a simple shell script
  - purpose: performance/regression test

* aaaaaa_comment.xml (~10 MB):
  - properties: trivial file with a huge comment
  - source: generated by a simple shell script
  - purpose: performance/regression test

* aaaaaa_tag.xml (~10 MB):
  - properties: trivial file with a huge tag name
  - source: generated by a simple shell script
  - purpose: performance/regression test

* aaaaaa_text.xml (~10 MB):
  - properties: trivial file with a huge text segment (no newlines)
  - source: generated by a simple shell script
  - purpose: performance/regression test
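
The shell script that produced the aaaaaa_*.xml files is not included
here. As a rough illustration of what "trivial file with a huge ..."
means, the following Python sketch builds comparable documents. The
element name doc, the attribute name attr, the exact byte count, and the
sketch_ output prefix are all assumptions; the real files may differ in
every one of these details.

```python
def make_doc(kind, size):
    """Return a tiny XML document (bytes) whose `kind` part is `size` 'a' bytes."""
    filler = b"a" * size
    if kind == "attr":
        return b'<doc attr="' + filler + b'"/>'
    if kind == "cdata":
        return b"<doc><![CDATA[" + filler + b"]]></doc>"
    if kind == "comment":
        return b"<doc><!--" + filler + b"--></doc>"
    if kind == "tag":
        return b"<" + filler + b"/>"
    if kind == "text":
        return b"<doc>" + filler + b"</doc>"
    raise ValueError(f"unknown kind: {kind}")


if __name__ == "__main__":
    # Roughly 10 MB of filler per file; the prefix avoids overwriting
    # the real test files if this is run inside this directory.
    for kind in ("attr", "cdata", "comment", "tag", "text"):
        with open(f"sketch_aaaaaa_{kind}.xml", "wb") as out:
            out.write(make_doc(kind, 10 * 1000 * 1000))
```

Each document has a single construct that the parser cannot finish until
the closing delimiter arrives, which is what makes these files useful for
performance/regression testing with small input buffers.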