表预分配 |
|
lua_createtable
[1] C API 函数。lua_createtable
创建一个 Lua 表,其数组和哈希区域预分配到给定的大小。
预分配比执行 local t = {}; for i=1,N do t[i] = 0 end
更有效率,在这种情况下,大约会执行 O(log(N)) 次分配——每次数组在溢出时重新分配到之前大小的倍数时都会执行一次。
以下是一些解决方案。
在这里,我们构建了 lua_createtable
C 函数的 Lua 绑定。一般来说,如果可以绑定到 C,这是最健壮、最高效和最直接的方法。
如果我们在 Lua 中使用表构造函数语法,例如 local t = {0,0,0,0,0
},Lua 解析器会生成预分配表中所需空间并填充表的操作码。这基本上是我们想要的,除了可能填充表的部分。
$ echo 'local t = {0,0,0,0,0}' | luac -p -l - main <stdin:0,0> (8 instructions, 32 bytes at 0x680e00) 0+ params, 6 slots, 0 upvalues, 1 local, 1 constant, 0 functions 1 [1] NEWTABLE 0 5 0 2 [1] LOADK 1 -1 ; 0 3 [1] LOADK 2 -1 ; 0 4 [1] LOADK 3 -1 ; 0 5 [1] LOADK 4 -1 ; 0 6 [1] LOADK 5 -1 ; 0 7 [1] SETLIST 0 5 1 ; 1 8 [1] RETURN 0 1
更好的是,我们可以用 nil
填充表,local t = {nil,nil,nil,nil,nil
},它使用更有效的 LOADNIL 和 SETLIST 指令。
$ echo 'local t = {nil,nil,nil,nil,nil}' | luac -p -l - main <stdin:0,0> (4 instructions, 16 bytes at 0x680df8) 0+ params, 6 slots, 0 upvalues, 1 local, 0 constants, 0 functions 1 [1] NEWTABLE 0 5 0 2 [1] LOADNIL 1 5 3 [1] SETLIST 0 5 1 ; 1 4 [1] RETURN 0 1
主要问题是,如果所需的分配大小很大,这写起来很麻烦,如果大小在运行时才知道,那么我们需要在运行时构建代码字符串并调用 loadstring
。
这是可行的。构建和编译构建表的函数效率不高,但另一方面,此步骤可能只需要执行一次(或者可以记忆)。一个可能不重要的点是,LOADNIL 和 SETLIST 指令在每次调用时都是不必要的。
前一个解决方案中的表构造函数可以通过修改字节码更有效地构建。我们甚至可以避免初始化表条目。在这里,我们修改了 return {}
字节码中的 NEWTABLE 操作码,
$ echo 'return {}' | luac -p -l - main <stdin:0,0> (3 instructions, 12 bytes at 0x680df8) 0+ params, 2 slots, 0 upvalues, 0 locals, 0 constants, 0 functions 1 [1] NEWTABLE 0 0 0 2 [1] RETURN 0 2 3 [1] RETURN 0 1
使用新的表大小,然后将由新字节码定义的函数加载到内存中。这不需要额外的 Lua 源代码解析,并且字符串操作更少。但是,这依赖于对 Lua 字节码格式的假设。
-- opcreatetable.lua -- -- Creates table preallocated in array and hash sizes. -- Implemented in pure Lua. -- -- Warning: This code may be somewhat fragile since it depends on -- the Lua 5.1 bytecode format and little-endian byte order. -- -- This code has not been well tested. Please review prior to using -- in production. -- -- (c) 2009 David Manura. Licensed under the same terms as Lua (MIT license). local M = {} local loadstring = loadstring local assert = assert local string = string local string_dump = string.dump local string_char = string.char -- Encodes integer for NEWTABLE opcode. Based on lobject.c:luaO_int2fb. local xmax = 15*2^30 local function int2fb(x) assert(x >= 0 and x <= xmax) local e = 0 while x >= 16 do x = (x+1) local b = x % 2 x = (x-b) / 2 e = e + 1 end if x < 8 then return x else return (e+1) * 8 + (x - 8) end end -- Packs and unpacks 4-byte little-endian unsigned int. local function pack_int4(x1,x2,x3,x4) return ((x4*256 + x3)*256 + x2)*256 + x1 end local function unpack_int4(x) local x1 = x % 256; x = (x - x1) / 256 local x2 = x % 256; x = (x - x2) / 256 local x3 = x % 256; x = (x - x3) / 256 local x4 = x return x1,x2,x3,x4 end -- Packs and unpacks iABC type instruction. local function unpack_iABC(x) local instopid = x % 64; x = (x - instopid) / 64 local insta = x % 256; x = (x - insta) / 256 local instc = x % 512; x = (x - instc) / 512 local instb = x return instopid, insta, instb, instc end local function pack_iABC(instopid, insta, instb, instc) return ((instb * 512 + instc) * 256 + insta) * 64 + instopid end -- Returns a function that when called creates and returns a new table. -- The table has array size asize and hash size hsize (both default to 0). -- Calling this function may be slow and you may want to cache the -- returned function. Calling the returned function should be fast. local code local pos local insta local function new_table_builder(asize, hsize) asize = asize or 0 hsize = hsize or 0 if not code then -- See "A No-Frills Introduction to Lua 5.1 VM Instructions" -- by Kein-Hong Man for details on the bytecode format. code = string_dump(function() return {} end) -- skip headers local int_size = code:byte(8) local size_t_size = code:byte(9) local instruction_size = code:byte(10) local endian = code:byte(7) assert(size_t_size == 4) assert(instruction_size == 4) assert(endian == 1) -- little endian local source_size = pack_int4(code:byte(13), code:byte(14), code:byte(15), code:byte(16)) pos = 1 + 12 -- chunk header + size_t_size -- string size + source_size -- string data + 2 * int_size + 4 -- rest of function block header + 4 -- number of instructions -- read first instruction (NEWTABLE) local a1 = code:byte(pos) local a2 = code:byte(pos+1) local a3 = code:byte(pos+2) local a4 = code:byte(pos+3) local inst = pack_int4(a1,a2,a3,a4) -- parse instruction local instopid, instc, instb instopid, insta, instb, instc = unpack_iABC(inst) assert(instopid == 10) assert(instb == 0) assert(instc == 0) end -- build new instruction local instopid = 10 local instb = int2fb(asize) local instc = int2fb(hsize) local inst = pack_iABC(instopid, insta, instb, instc) -- encode new instruction into code. local inst1,inst2,inst3,inst4 = unpack_int4(inst) local code2 = code:sub(1,pos-1)..string_char(inst1,inst2,inst3,inst4)..code:sub(pos+4) local f2 = assert(loadstring(code2)) return f2 end M.new_table_builder = new_table_builder return M
测试
local new_table_builder = require "opcreatetable" . new_table_builder local nt = new_table_builder(1e6,1e6) local t1 = nt() local t2 = nt() print(t1, t2) -- wait for user to observe process memory usage io.stdin:read'*l'
Lua 5.2 应该提供更好的方法来做到这一点吗?