[Dev-luatex] Lua states and \dump

Jonathan Sauer Jonathan.Sauer at silverstroke.com
Tue Sep 25 10:36:14 CEST 2007


> > Well, I experimented a bit and am now a lot wiser. Also, I have 
> > confirmation on what I thought all along: My computer's internal 
> > harddisk is much too slow:
> > 
> > How to measure the performance of virtual memory:
> > 
> > \directlua0{lua.bytecode[100000000] = function() end}
> works ok here (but i have a 4 gig machine);

Well, yes, since it allocates about 1.6 GB. With 4 GB (what luxury ;-),
virtual memory is not needed. I have "only" 1 GB, so ...

> technically it should only allocate one lua state; on the other hand,
> it may be that a non sparse array is used for the register

No, right now it allocates 100000000 bytecode registers. It does not use
a sparse array.

> > On a more serious matter, though, there is a buffer overflow in 
> > llualib.c, function set_bytecode: If `sizeof(bytecode)*(k+1)' is 
> > larger than UINT_MAX, a numeric overflow occurs and not 
> > enough memory is allocated in line 162. When initializing the newly
> > allocated bytecode registers afterwards, random memory is overwritten.
> > 
> > Example (without exploit, of course):
> > 
> > Register number = 1000000000:
> such overruns happen also when you do wild things with lua 
> code, tex does not manage that memory

It is not a question of managing the memory, but of not allocating
enough memory: LuaTeX assumes it has allocated A bytes, but in fact
only allocates B bytes (B < A).
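
The mismatch between A and B can be reproduced with plain 32-bit
arithmetic. What follows is only a minimal sketch, not the code from
llualib.c: ENTRY_SIZE = 16 is an assumption (it fits the roughly 1.6 GB
observed for 100000000 registers), and all function names are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of one bytecode register entry in bytes (hypothetical). */
#define ENTRY_SIZE 16u

/* A: bytes really needed for registers 0..k, computed in 64 bits. */
static uint64_t bytes_needed(uint32_t k)
{
    return (uint64_t)ENTRY_SIZE * ((uint64_t)k + 1u);
}

/* B: result of the same product in 32-bit unsigned arithmetic,
 * which silently wraps modulo 2^32. */
static uint32_t bytes_requested_32bit(uint32_t k)
{
    return (uint32_t)(ENTRY_SIZE * (k + 1u));
}

/* A - B: how many bytes the allocation falls short of what the
 * initialization loop will touch. Zero means no wraparound. */
static uint64_t bytes_short(uint32_t k)
{
    uint64_t a = bytes_needed(k);
    uint32_t b = bytes_requested_32bit(k);
    assert(b <= a);  /* wrapping can only make the value smaller */
    return a - b;
}
```

Under these assumptions, register 100000000 needs 1600000016 bytes and
nothing wraps, whereas register 1000000000 needs 16000000016 bytes but
the 32-bit product is only 3115098128. A register number of 2000000000
would yield 1935228944, which is the figure quoted further down.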

> another overflow can be in the piping data to tex (tex.print) 
> .. if you collect 2 gig data there you may also run into problems

Then a check should be added so the problem can be caught before an
overflow happens.
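
Such a check is cheap. As a minimal sketch, assuming the size is
computed as "number of elements times element size" (the function name
is hypothetical and this is not LuaTeX's actual code), the product can
be validated before it is passed to the allocator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores count * elem_size in *out and returns 1 if the product fits
 * into a size_t. Returns 0 and leaves *out untouched otherwise. */
static int checked_mul_size(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);
    /* count * elem_size overflows exactly when count exceeds
     * SIZE_MAX / elem_size (integer division, elem_size nonzero). */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return 0;
    *out = count * elem_size;
    return 1;
}
```

A caller would report an error instead of allocating whenever the
function returns 0, so the wrapped value never reaches malloc or
realloc.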

> > (I killed the process in the latter experiment, as it still consumed
> > too much memory [about 1.9GB]. But: Even though the memory
> > requirements have been higher than in the first example, in this
> > case the allocation did not fail, as the overflow created a smaller
> > allocation, `only' 1935228944 bytes)
> > 
> > Register number = 3000000000:
> test \directlua0{lua.bytecode[1024*1024*1024*1024] = function() end}

This results in a numeric overflow, and bytecode register 2^40 mod 2^32
= 0 is set (if my calculations are not mistaken). In this case,
the numeric overflow also influences which register is accessed and not
only the amount of memory allocated, so there is no buffer overflow.

> test \directlua0{for i=1,100000000 do lua.bytecode[i] = 
> function() end end }

This one only allocates 100,000,000 registers, so only 1.6 GB are
allocated.

> seems to work ok here but
> test \directlua0{lua.bytecode[3000000000] = function() end}
> test \directlua0{for i=1,3000000000 do lua.bytecode[i] = 
> function() end end }
> report a problem with a negative value, so it may be that 
> there is a problem there (not sure if taco tests the max value)
> ! LuaTeX error negative values not allowed.
> l.7 ...{lua.bytecode[3000000000] = function() end}

Interesting. I notice you use array access, whereas in my experiments
I used lua.setbytecode. Maybe there is a difference.

> > IMO it would be sensible to limit the number of bytecode registers to
> > UINT_MAX/sizeof(bytecode) or -- to be platform-independent in the 
> > light of 64 bit processors[2] -- (2^32-1)/sizeof(bytecode).
> such a limitation is not that meaningful; one can have 1 
> million bytecode functions safely but 10 using large 
> data structures and bombing;

The problem is not that LuaTeX runs out of memory, but that it
writes to memory it has not allocated.

And in the current implementation (since it does not use a sparse array),
there is a fixed upper limit on how many bytecode registers can be used
before such a buffer overflow occurs, no matter how much memory the
machine has (although on 64 bit machines the limit is really high).
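
That limit follows directly from the width of the integer used for the
size calculation. A small sketch of the arithmetic, again assuming 16
bytes per register entry (an assumption, as is the function name):

```c
#include <assert.h>
#include <stdint.h>

/* Highest register number k for which entry_size * (k + 1) still fits
 * into an unsigned integer whose maximum value is size_max. */
static uint64_t highest_safe_register(uint64_t size_max, uint64_t entry_size)
{
    assert(entry_size > 0 && entry_size <= size_max);
    return size_max / entry_size - 1u;
}
```

With 16-byte entries, highest_safe_register(UINT32_MAX, 16) gives
268435454, so register 100000000 is still safe while 1000000000 is not.
For a 64 bit size, highest_safe_register(UINT64_MAX, 16) gives
1152921504606846974, far beyond what any machine could allocate.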

> there is no control over the lua 
> end of the game; also, when using much data, in practice the 
> garbage collectors will bring down your system (so slow that 
> one will abort the job);

Interesting. In what use cases did you observe this behaviour?

> luatex kind of assumes modern memory management

This is surely a given, since LuaTeX runs on Unix and Windows.

> and machines with memory in the gig range

This I think is a bit optimistic and IMO limits the usefulness of
LuaTeX. Especially since TeX has much lower requirements.

> > [3] I think, this is the result of the sig-handler LuaTeX installs 
> > which displays an error message. But it seems that this message has 
> > been overwritten as well.
> error messages and catching errors with proper messages is on 
> the agenda for next year but segfaults and crashes indeed 
> need to be cached

The problem is not caching, but that LuaTeX, when accessing the bytecode
register, has overwritten almost all memory (the first 12 bytes of each
16 byte block) with zeros. This is the result of the buffer overflow.

But: If the error message was inside LuaTeX's machine code (as opposed
to LuaTeX's data), IIRC it could not be overwritten, since in RAM, an
application's machine code is kept separately from its data, and also is
write-protected.

> (i can at least imagine a practical limit of 64K bytecode registers)

Yes! This would solve the buffer overflow. And it would make the current
non-sparse-array-implementation viable.
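
To illustrate why a hard limit makes the dense array safe, here is a
minimal sketch of a growable register table with a 64K cap. The struct
layout and all names are hypothetical; LuaTeX's real data structures
are not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REGISTERS 65536u  /* the 64K limit suggested above */

/* Hypothetical register entry: a buffer and its length. */
typedef struct {
    unsigned char *buf;
    size_t size;
} reg_entry;

/* Dense (non-sparse) table holding entries 0..count-1. */
typedef struct {
    reg_entry *entries;
    size_t count;
} reg_table;

/* Makes index k valid, growing the table if necessary.
 * Returns 1 on success, 0 for a rejected index or a failed allocation. */
static int reg_table_reserve(reg_table *t, size_t k)
{
    assert(t != NULL);
    if (k >= MAX_REGISTERS)
        return 0;                 /* rejected before any size arithmetic */
    if (k < t->count)
        return 1;                 /* already large enough */

    size_t new_count = k + 1;     /* at most 65536, so the product is small */
    reg_entry *p = realloc(t->entries, new_count * sizeof *p);
    if (p == NULL)
        return 0;                 /* old table is still intact */

    /* Zero only the entries that were just added. */
    memset(p + t->count, 0, (new_count - t->count) * sizeof *p);
    t->entries = p;
    t->count = new_count;
    return 1;
}
```

Because the index is checked first, the size passed to realloc can
never exceed 65536 entries, which is about 1 MB if an entry takes 16
bytes, so neither the multiplication nor the initialization can run
past the allocated block.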

> Hans

