Add aaaaaa_*.xml with unreasonably large tokens

Some of these currently take a very long time to parse. I set those to
only run one loop in the run-benchmark make target.

4096 may be a fairly small buffer, and definitely makes the problem worse
than it otherwise would've been, but similar sizes exist in real code:

 * 2048 bytes in cpython Modules/pyexpat.c
 * 4096 bytes in skia SkXMLParser.cpp
 * BUFSIZ bytes (8192 on my machine) in expat/examples

The files, too, are inspired by real-life examples: Android stores
depth and gain maps as base64-encoded JPEGs inside the XMP data of
other JPEGs. Sometimes as a text element, sometimes as an attribute
value. I've seen attribute values slightly over 5 MiB in size.
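
To put the buffer sizes listed above in perspective, the following sketch
counts how many reads a single large token spans at each size. The 10 MiB
token size is an assumption based on the approximate size of the test files,
and 8192 stands in for BUFSIZ as reported above; neither is a measurement.

```shell
#!/bin/sh
# Count how many buffer fills one huge token spans at each buffer size.
# Assumption: a 10 MiB token; 8192 stands in for BUFSIZ on the author's machine.
token_bytes=$((10 * 1024 * 1024))

chunks_for() {
    # Ceiling division: number of reads of size $1 needed to cover the token.
    echo $(( (token_bytes + $1 - 1) / $1 ))
}

for bufsize in 2048 4096 8192; do
    echo "$bufsize bytes per read: $(chunks_for "$bufsize") reads"
done
```

With these assumptions the counts are 5120, 2560 and 1280 reads, so the
parser is handed an incomplete token thousands of times before it sees the
end of it.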
Snild Dolkow 2023-08-17 16:53:12 +02:00
parent 183270d565
commit 3484383fa7
7 changed files with 33 additions and 0 deletions


@@ -131,6 +131,11 @@ buildlib:
run-benchmark:
$(MAKE) -C tests/benchmark
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/recset.xml 65535 3
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/aaaaaa_attr.xml 4096 3
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/aaaaaa_cdata.xml 4096 3
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/aaaaaa_comment.xml 4096 3
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/aaaaaa_tag.xml 4096 3
./run.sh tests/benchmark/benchmark@EXEEXT@ -n $(top_srcdir)/../testdata/largefiles/aaaaaa_text.xml 4096 3
.PHONY: download-xmlts-zip
download-xmlts-zip:


@@ -31,4 +31,27 @@ resulting measurements tell us.)
utility, specifically for testing the duplicate attribute check in
storeAttributes()
* aaaaaa_attr.xml (~10 MB):
- properties: trivial file with a huge attribute value
- source: generated by a simple shell script
- purpose: performance/regression test
* aaaaaa_cdata.xml (~10 MB):
- properties: trivial file with huge cdata content
- source: generated by a simple shell script
- purpose: performance/regression test
* aaaaaa_comment.xml (~10 MB):
- properties: trivial file with a huge comment
- source: generated by a simple shell script
- purpose: performance/regression test
* aaaaaa_tag.xml (~10 MB):
- properties: trivial file with a huge tag name
- source: generated by a simple shell script
- purpose: performance/regression test
* aaaaaa_text.xml (~10 MB):
- properties: trivial file with a huge text segment (no newlines)
- source: generated by a simple shell script
- purpose: performance/regression test

1
testdata/largefiles/aaaaaa_attr.xml vendored Normal file

File diff suppressed because one or more lines are too long

1
testdata/largefiles/aaaaaa_cdata.xml vendored Normal file

File diff suppressed because one or more lines are too long

1
testdata/largefiles/aaaaaa_comment.xml vendored Normal file

File diff suppressed because one or more lines are too long

1
testdata/largefiles/aaaaaa_tag.xml vendored Normal file

File diff suppressed because one or more lines are too long

1
testdata/largefiles/aaaaaa_text.xml vendored Normal file

File diff suppressed because one or more lines are too long
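
The README entries describe each file as generated by a simple shell script,
which is not included in this commit. A minimal sketch of what such a script
could look like follows. The root element name `doc`, the attribute name
`attr`, the exact payload size and the output directory are all assumptions
made for illustration; the real files may be laid out differently.

```shell
#!/bin/sh
# Sketch of a generator for files shaped like the aaaaaa_*.xml test data.
# Assumptions: the element name "doc", the attribute name "attr" and the
# payload size are illustrative; the script actually used is not shown.
set -eu

# Emit $1 bytes of the letter "a" on stdout, with no newline.
# $1 is expected to be a multiple of 1000.
payload() {
    dd if=/dev/zero bs=1000 count=$(( $1 / 1000 )) 2>/dev/null | tr '\000' 'a'
}

# Write the five single-line files into directory $1, each with $2 payload bytes.
generate() {
    dir=$1
    size=$2
    mkdir -p "$dir"
    { printf '<doc attr="'; payload "$size"; printf '"/>\n'; } > "$dir/aaaaaa_attr.xml"
    { printf '<doc><![CDATA['; payload "$size"; printf ']]></doc>\n'; } > "$dir/aaaaaa_cdata.xml"
    { printf '<doc><!--'; payload "$size"; printf '%s\n' '--></doc>'; } > "$dir/aaaaaa_comment.xml"
    { printf '<'; payload "$size"; printf '/>\n'; } > "$dir/aaaaaa_tag.xml"
    { printf '<doc>'; payload "$size"; printf '</doc>\n'; } > "$dir/aaaaaa_text.xml"
}

# Defaults: write roughly 10 MB per file into ./aaaaaa_out.
generate "${1:-./aaaaaa_out}" "${2:-10000000}"
```

Saved under a hypothetical name such as `gen.sh`, it could be run as
`sh gen.sh some/output/dir 10000000`. Each file is a single line, which is
consistent with the web view refusing to render the diffs.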