[Dev-luatex] Notes

Hans Hagen pragma at wxs.nl
Tue Dec 11 09:32:35 CET 2007


Jonathan Sauer wrote:
> Hello,
> 
> some miscellaneous notes while playing with LuaTeX 0.20.0:
> 
> -	Currently, the 'open_read_file' callback is only used if a
> 	'find_read_file' callback is registered as well. Since on Apr 24 2007,
> 	Taco wrote on the mailing list (subject '\input does not look at
> 	open_read_file callback?') that this would be fixed, I suspect that
> 	this is a bug.
> 
> -	I want to register an open_read_file callback that is called to open
> 	the job file. \everyjob is executed after the job file has been
> 	opened, so I cannot use it to register the callback. Also, callbacks
> 	are not dumped into the format file, so I cannot register the callback
> 	while creating the format.
> 
> 	What now?

put the initialization code in a bytecode register (these are stored in 
the format file) and run it from \everyjob
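
A minimal sketch of the bytecode-register idea, not code from this thread. The register number 1 and the placeholder initialization chunk are arbitrary illustrative choices, and the sketch assumes that lua.bytecode behaves as the LuaTeX manual describes. Outside LuaTeX a plain table stands in for the register array, so the sketch can also be tried in a stand-alone Lua interpreter.

```lua
-- Hypothetical register number; any free bytecode register would do.
local INIT_REGISTER = 1

-- Lua 5.1 (the version inside LuaTeX) has loadstring; newer Lua uses load.
local compile = loadstring or load

-- Under LuaTeX this is the engine's bytecode array. The empty table is only
-- a stand-in for running the sketch in a stand-alone Lua interpreter.
local registers = (lua and lua.bytecode) or {}

-- Step 1, while building the format: compile the initialization code and
-- store it. The chunk is given as a string so that it carries no upvalues.
registers[INIT_REGISTER] = assert(compile([[
  -- Real initialization code would call callback.register(...) here.
  return "callbacks registered"
]]))

-- Step 2, in every job: run the stored chunk. From TeX this could look like
--   \everyjob{\directlua0{lua.bytecode[1]()}}
local status = registers[INIT_REGISTER]()
print(status)
```

As noted in the quoted message, \everyjob runs only after the job file has been opened, so this covers every file read after that point but not the job file itself.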

> 
> 	-	I could use the command line switch "--lua" to execute a special
> 		script which registers the callbacks (and deal with the fact that
> 		I cannot use kpathsea to locate this script)?

write a small wrapper (startup script) or load the initialization code 
after loading the format
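
A rough sketch of what such a startup script could contain, assuming it is handed to luatex through the "--lua" switch mentioned above. The function body is illustrative only: it assumes the callback returns a table with a reader and a close function, as the LuaTeX manual describes for open_read_file, and returning nil for a file that cannot be opened is an assumption as well.

```lua
-- Hypothetical startup script, handed to luatex via the --lua switch.

-- Build the table an open_read_file callback is expected to return:
-- reader() yields one line per call (nil at end of file) and close()
-- releases the file handle.
local function open_read_file(file_name)
  local handle = io.open(file_name, "r")
  if not handle then
    return nil  -- assumption: nil signals that the file could not be opened
  end
  return {
    reader = function()
      return handle:read("*l")
    end,
    close = function()
      handle:close()
    end,
  }
end

-- The callback table only exists inside LuaTeX; skip registration elsewhere.
if callback then
  callback.register("open_read_file", open_read_file)
end
```

According to the report quoted above, LuaTeX 0.20.0 only consults open_read_file when a find_read_file callback is registered too, so a real script for that version would need to register one as well.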

> 	-	I could call luatex not with the file name but like this:
> 
> 			luatex -jobname <filename> '\input <filename>'
> 
> 	-	[insert simple and elegant solution]
> 
> 	(On Apr 02 2007, Taco wrote on the mailing list (subject 'LaTeX and
> 	LuaTeX'), that dumping of callbacks could be added. If there was a
> 	vote on this, I would vote with 'yes' :-)
> 
> -	The following command line crashes LuaTeX with a bus error (yes, yes
> 	the bus that people are riding on without a ticket):
> 
> 		luatex '\directlua0{texio.write("term and log","Hello")}'
> 
> 	No destination or destination "term" works.

hard to read this funny formatted remark -)

> 	The problem seems to be that LuaTeX wants to log to the log file, but
> 	the log file is not open yet.
> 
> 	Stacktrace (not very useful, I'm afraid, since the symbols are missing):
> 
> 	0   libSystem.B.dylib 	0x9003fd9c putc + 60
> 	1   luatex            	0x0000d3e4 0x1000 + 50148
> 	2   luatex            	0x0000d9e8 0x1000 + 51688
> 	3   luatex            	0x000a74bc 0x1000 + 681148
> 	4   luatex            	0x0015c5d0 0x1000 + 1422800
> 	5   luatex            	0x00164310 0x1000 + 1454864
> 	6   luatex            	0x0015c6a4 0x1000 + 1423012
> 	7   luatex            	0x0015bab4 0x1000 + 1419956
> 	8   luatex            	0x0015ca6c 0x1000 + 1423980
> 	9   luatex            	0x0015ab10 0x1000 + 1415952
> 	10  luatex            	0x000a5878 0x1000 + 673912
> 	11  luatex            	0x0003c850 0x1000 + 243792
> 	12  luatex            	0x000238b4 0x1000 + 141492
> 	13  luatex            	0x00023de4 0x1000 + 142820
> 	14  luatex            	0x00068004 0x1000 + 421892
> 	15  luatex            	0x0000cf04 0x1000 + 48900
> 	16  luatex            	0x00070694 0x1000 + 456340
> 	17  luatex            	0x00001d54 0x1000 + 3412
> 	18  luatex            	0x00001bfc 0x1000 + 3068
> 
> -	Is it sensible to point the string functions, which are not
> 	Unicode-aware, to unicode.utf8, or are there situations (due to
> 	differences in functionality) that still require the original string
> 	functions? (I would suspect that the original functions are a bit
> 	faster)

you can overload the *standardized* string functions

string.char = unicode.utf8.char

and it's up to the user to decide; we just provide the standardized lua 
engine + some extra libs
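
A small sketch of such an overload that keeps the byte-oriented originals reachable, in case some code still needs them. The table name "bytes" and the choice of functions are illustrative, not a recommendation from this thread. It assumes unicode.utf8 (the slnunicode library built into LuaTeX) offers len, char and upper under the same names as the standard string library.

```lua
-- "\195\169" is the UTF-8 encoding of U+00E9 (e with acute accent).
local e_acute = "\195\169"

-- Keep the byte-oriented originals reachable before overloading anything.
local bytes = { len = string.len, char = string.char, upper = string.upper }

-- unicode.utf8 is only present inside LuaTeX; in a stand-alone Lua
-- interpreter the condition is false and nothing is overloaded.
if unicode and unicode.utf8 then
  for name in pairs(bytes) do
    string[name] = unicode.utf8[name]
  end
end

print(bytes.len(e_acute))   -- always counts bytes
print(string.len(e_acute))  -- counts characters once overloaded
```

The # operator is unaffected by such an overload and keeps counting bytes.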

> -	lpeg does not seem to be UTF-8-aware. Is this impression correct? If
> 	so, how should one proceed to use it safely? I would estimate that this
> 	is mainly a problem when matching using a set ("lpeg.S") containing
> 	characters outside the first 256 code points, since when matching, a
> 	character is checked against each *byte* in the set string, not each
> 	character.

you can write utf parsing in lpeg (see roberto's web page on lpeg) ...
so a solution is to combine techniques; native utf8 support would slow 
down lpeg and also make it less generic; it's basically a byte parsing 
engine (no assumptions about characters, sequences and their meaning)
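
One possible way to combine the techniques, as a sketch rather than the approach from Roberto's page: describe a UTF-8 character as a byte-level lpeg pattern, then build a character-level stand-in for lpeg.S from it. The helper name utf8set is made up for this example, and the character pattern is deliberately loose (it does not reject every invalid byte sequence).

```lua
-- lpeg is built into LuaTeX; elsewhere it has to be installed and required.
local lpeg = lpeg or require("lpeg")
local P, R, C, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct

-- One UTF-8 encoded character: a lead byte plus 0 to 3 continuation bytes.
local cont = R("\128\191")
local utf8char = R("\0\127")
               + R("\194\223") * cont
               + R("\224\239") * cont * cont
               + R("\240\244") * cont * cont * cont

-- Split a UTF-8 string into a list of characters (each a short byte string).
local split = Ct(C(utf8char)^0)

-- Character-level replacement for lpeg.S: one alternative per character
-- instead of one per byte. Expects a non-empty string of characters.
local function utf8set(chars)
  local set
  for _, ch in ipairs(split:match(chars)) do
    set = set and (set + P(ch)) or P(ch)
  end
  return set
end

-- Usage: a set holding a-umlaut and o-umlaut, written as byte escapes.
local umlauts = utf8set("\195\164\195\182")
```

With this set, "\195\169" (e with acute accent) is not matched, whereas lpeg.S("\195\164\195\182") would accept its first byte because that byte also occurs in the set string.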

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

