Some tests use the xmlwf documentation as sample input. It is written in
DocBook, and the tests appear to be failing because they try to fetch it
at run time, which is not allowed. Work around this by installing it in
advance.
In order to cover the largest number of glibc and musl libc versions,
without warnings, the decision here is to use `_GNU_SOURCE`,
even if it enables a larger than necessary feature set.
A feature test macro is needed, because otherwise the `check_c_source_compiles`
for `HAVE_SYSCALL_GETRANDOM` fails when, for example,
the default compiler flags include `-std=c99`:
````
src.c:6:13: error: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]
    6 |             syscall(SYS_getrandom, NULL, 0, 0);
      |             ^~~~~~~
````
But this check should pass, as `SYS_getrandom` is available;
only the declaration of `syscall` in `unistd.h` is conditional on a feature test macro.
The exact minimal public macros for enabling it are in `features.h` and
are version dependent.
According to [5.04](
https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/Archive/man-pages-5.04.tar.gz)
and older versions of the `man 2 syscall` page,
the recommended feature test macro is `_GNU_SOURCE`.
Later on in [5.05](
https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/Archive/man-pages-5.05.tar.gz)
this statement changed to name a smaller, minimal feature set.
Namely, up to glibc 2.18 it is `_BSD_SOURCE || _SVID_SOURCE`,
but after that `_DEFAULT_SOURCE` is recommended,
and `_BSD_SOURCE || _SVID_SOURCE` is deprecated and emits a warning in later versions.
Regardless, `_GNU_SOURCE` is still fully supported
in every version and is suitable for our purposes.
The musl libc doesn't use `_SVID_SOURCE` at all, but `_BSD_SOURCE` always works.
In some newer versions `_DEFAULT_SOURCE` also sets `_BSD_SOURCE`,
but `_GNU_SOURCE` covers the largest set of versions and is unlikely
to be deprecated in the future.
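For illustration, a minimal sketch of the kind of source the `HAVE_SYSCALL_GETRANDOM` check compiles, with `_GNU_SOURCE` defined up front. The helper name `probe_getrandom_syscall` is hypothetical and not part of Expat; the sketch assumes a Linux system with glibc or musl.

```c
/* Must come before any system header so that the libc feature logic sees it. */
#define _GNU_SOURCE

#include <stddef.h>      /* NULL */
#include <sys/syscall.h> /* SYS_getrandom */
#include <unistd.h>      /* syscall(), declared only with a feature test macro */

/* Hypothetical helper, same call as in the failing check: asks the kernel
 * for zero random bytes, so nothing is written through the NULL buffer. */
long probe_getrandom_syscall(void) {
  return syscall(SYS_getrandom, NULL, 0, 0);
}
```

Without the `#define`, compiling this with `-std=c99` is what produces the implicit declaration error quoted above.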
Further info about feature test macros:
In glibc:
https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html
In musl libc under the `Feature Test Macros Supported by musl` section:
https://musl.libc.org/doc/1.1.24/manual.html
Signed-off-by: Ferenc Géczi <ferenc.gm@gmail.com>
The two issues with the previous approach were that:
1. `check_symbol_exists` would store "1" or "" into
variable `off_t` rather than string "off_t", and
2. (`check_symbol_exists` would not find `off_t` or
`size_t` on modern Linux).
Was reported with NetBSD 9.3.
`size_t` is part of C99 (which Expat requires), so
only the `off_t` half remains.
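As a rough illustration of why a type needs a different kind of check than a symbol: `off_t` is a type, not a macro, function or variable, so a probe has to use it as a type. The sketch below is only an assumption about what such a probe could look like; the function name `probe_off_t_size` is hypothetical and not taken from Expat's build system.

```c
#include <stddef.h>    /* size_t, guaranteed by C99 */
#include <sys/types.h> /* off_t, provided by POSIX rather than by C99 */

/* Hypothetical probe: this compiles only if off_t is usable as a type. */
size_t probe_off_t_size(void) {
  return sizeof(off_t);
}
```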
Running find without a path is a GNU extension. GNU find uses the current
directory as the starting point in this case. It is better to always use an
explicit . in build scripts to support find on other systems.
When parsing DTD content with code like ..
  XML_Parser parser = XML_ParserCreate(NULL);
  XML_Parser ext_parser = XML_ExternalEntityParserCreate(parser, NULL, NULL);
  enum XML_Status status = XML_Parse(ext_parser, doc, (int)strlen(doc), XML_TRUE);
.. there are 0 bytes accounted as direct input and all input from `doc` is accounted
as indirect input. In that case, function accountingGetCurrentAmplification cannot calculate
the current amplification ratio as "(direct + indirect) / direct". It did refuse
to divide by 0, as one would expect, but it returned 1.0 for this case to indicate
no amplification over direct input. As a result, billion laughs attacks from
DTD-only input were not detected with this isolated way of using an external parser.
The new approach is to assume direct input of length not 0 but 22 -- derived from
ghost input "<!ENTITY a SYSTEM 'b'>", the shortest possible way to include an external
DTD --, and do the usual "(direct + indirect) / direct" math with "direct := 22".
GitHub issue #839 has more details on this issue and its origin in ClusterFuzz
finding 66812.
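
The arithmetic described above can be sketched as follows. This is a simplified illustration, not the actual Expat code: the function name `sketch_amplification` and its parameters are hypothetical, and the real function reads its counters from the parser state.

```c
#include <stddef.h>

/* Hypothetical sketch of the ratio "(direct + indirect) / direct", where
 * a direct count of 0 is replaced by the length of the ghost input. */
double sketch_amplification(unsigned long long direct,
                            unsigned long long indirect) {
  /* Shortest possible way to include an external DTD: 22 bytes. */
  const size_t ghost_len = sizeof("<!ENTITY a SYSTEM 'b'>") - 1;

  if (direct == 0) {
    direct = ghost_len; /* assume ghost input instead of dividing by 0 */
  }
  return (double)(direct + indirect) / (double)direct;
}
```

With 0 direct bytes and 22 indirect bytes this yields 2.0 rather than the previous fixed 1.0, so growth of DTD-only input becomes visible in the ratio.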