Lua Type Checking
    > x = 5 + "ok"
    stdin:1: attempt to perform arithmetic on a string value
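However, unlike in languages such as C, Lua has no built-in mechanism for checking the types of function call parameters and return values. The types are left unspecified: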
    function abs(x)
      return x >= 0 and x or -x
    end
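This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You can call this function on a value that is not a number, though the operations inside the function will then trigger a somewhat cryptic error at run time (in C, this would be a compile-time error).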
    > print(abs("hello"))
    stdin:2: attempt to compare number with string
    stack traceback:
            stdin:2: in function 'abs'
            stdin:1: in main chunk
            [C]: ?
To improve error reporting, one is typically advised to do something like this:
    function abs(x)
      assert(type(x) == "number", "abs expects a number")
      return x >= 0 and x or -x
    end
    > print(abs("hello"))
    stdin:2: abs expects a number
    stack traceback:
            [C]: in function 'assert'
            stdin:2: in function 'abs'
            stdin:1: in main chunk
            [C]: ?
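This is good advice, but one might complain that it imposes additional run-time overhead, that it only detects program errors in code that gets executed rather than in all code that is compiled, that the types of the function's values live entirely inside the function's implementation (not available through introspection), and that it can involve a lot of repetitive code (especially when named parameters are passed via a table).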
Here is one way to proactively check named parameters:
    local Box = {}
    local is_key = {x=true, y=true, width=true, height=true, color=true}

    function create_box(t)
      -- Numeric fields default to 0 when omitted.
      local x = t.x or 0
      local y = t.y or 0
      local width = t.width or 0
      local height = t.height or 0
      local color = t.color
      assert(type(x) == "number", "x must be number or nil")
      assert(type(y) == "number", "y must be number or nil")
      assert(type(width) == "number", "width must be number or nil")
      assert(type(height) == "number", "height must be number or nil")
      assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")
      -- Reject any key that is not a known parameter name.
      for k,v in pairs(t) do
        assert(is_key[k], tostring(k) .. " not valid key")
      end
      return setmetatable({x1=x, y1=y, x2=x+width, y2=y+height, color=color}, Box)
    end
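That is a relatively large amount of code. In fact, we might want to use error rather than assert, so that an appropriate level parameter can be supplied for the stack traceback.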
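As a rough illustration of that last point, the sketch below routes the check through a small helper that calls error with an explicit level. The helper name check_arg and its message format are hypothetical and not part of the original page.

```lua
-- Hypothetical helper: raise an error that is attributed to the caller
-- of the function being checked, not to the check itself.
local function check_arg(value, expected, name)
  if type(value) ~= expected then
    -- Level 1 is check_arg, level 2 is the checked function,
    -- level 3 is whoever called the checked function.
    error(("%s must be %s, got %s"):format(name, expected, type(value)), 3)
  end
end

local function abs(x)
  check_arg(x, "number", "x")
  return x >= 0 and x or -x
end

local ok, err = pcall(function()
  local result = abs("hello")  -- not a tail call, so this frame stays on the stack
  return result
end)
print(ok, err)
```

With assert, the position in the message would be the line of the assert inside abs; with a level, it is intended to be the line of the offending call.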
Another approach is to place the type-checking code outside the original function, possibly with a "function decorator" (see DecoratorsAndDocstrings for background).
    is_typecheck = true

    function typecheck(...)
      return function(f)
        return function(...)
          assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")
          return f(...)
        end
      end
    end

    function notypecheck()
      return function(f) return f end
    end

    typecheck = is_typecheck and typecheck or notypecheck

    sum = typecheck("number", "number", "->", "number")(
      function(a,b)
        return a + b
      end
    )
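The advantage is that the type information lives outside the function implementation. We can disable all type checking by toggling a single variable, and no extra overhead is then incurred when the functions execute (though there is some slight extra overhead when the functions are built). The typecheck function could also store the type information for later introspection.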
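One possible way to fill in the FIX-TODO placeholder is sketched here. It assumes that declarations are plain Lua type names and that "->" separates parameter types from return types; the signatures table used for introspection is a hypothetical addition.

```lua
-- Hypothetical registry: wrapped function -> its declared types.
-- Weak keys let entries vanish once a wrapped function is collected.
local signatures = setmetatable({}, {__mode = "k"})

local function typecheck(...)
  local types = {...}
  -- Position of "->"; if absent, every entry is a parameter type.
  local arrow = #types + 1
  for i, t in ipairs(types) do
    if t == "->" then arrow = i; break end
  end
  return function(f)
    local function check_results(...)
      for i = arrow + 1, #types do
        local v = select(i - arrow, ...)
        if type(v) ~= types[i] then
          error(("return value %d: expected %s, got %s")
            :format(i - arrow, types[i], type(v)))
        end
      end
      return ...
    end
    local function wrapped(...)
      for i = 1, arrow - 1 do
        local v = select(i, ...)
        if type(v) ~= types[i] then
          -- Level 2 blames the caller of the wrapped function.
          error(("argument %d: expected %s, got %s")
            :format(i, types[i], type(v)), 2)
        end
      end
      return check_results(f(...))
    end
    signatures[wrapped] = types
    return wrapped
  end
end

local sum = typecheck("number", "number", "->", "number")(
  function(a, b) return a + b end
)
print(sum(1, 2))
print(table.concat(signatures[sum], " "))
```

Keeping the declared types in a side table rather than on the function itself is only one design choice; it leaves the wrapped function an ordinary Lua function.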
This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).
One way to implement such a type-checking decorator is:
    --- Odin Kroeger, 2022, released under the MIT license.

    -- Local aliases used below. table.pack does not exist in Lua 5.1,
    -- hence the fallback.
    local format = string.format
    local pack = table.pack or function (...)
      return {n = select('#', ...), ...}
    end

    do
      local abbrs = {
        ['%*'] = 'boolean|function|number|string|table|thread|userdata',
        ['%?(.*)'] = '%1|nil'
      }

      local msg = 'expected %s, got %s.'

      --- Check whether a value is of a type.
      --
      -- Type declaration grammar:
      --
      -- Declare one or more Lua type names separated by '|' to require that
      -- the given value is of one of the given types (e.g., 'string|table'
      -- requires the value to be a string or a table). '*' is short for the
      -- list of all types but `nil`. '?T' is short for 'T|nil' (e.g.,
      -- '?table' is short for 'table|nil').
      --
      -- Extended Backus-Naur Form:
      --
      -- > Type = 'boolean' | 'function' | 'nil'    | 'number'   |
      -- >        'string'  | 'table'    | 'thread' | 'userdata'
      -- >
      -- > Type list = [ '?' ], type, { '|', type }
      -- >
      -- > Wildcard = [ '?' ], '*'
      -- >
      -- > Type declaration = type list | wildcard
      --
      -- Complex types:
      --
      -- You can check types of table or userdata fields by
      -- declaring a table that maps indices to declarations.
      --
      -- > type_match({1, '2'}, {'number', 'number'})
      -- nil    index 2: expected number, got string.
      -- > type_match({foo = 'bar'}, {foo = '?table'})
      -- nil    index foo: expected table or nil, got string.
      -- > type_match('foo', {foo = '?table'})
      -- nil    expected table or userdata, got string.
      --
      -- Wrong type names (e.g., 'int') do *not* throw an error.
      --
      -- @param val A value.
      -- @tparam string|table decl A type declaration.
      -- @treturn[1] bool `true` if the value matches the declaration.
      -- @treturn[2] nil `nil` otherwise.
      -- @treturn[2] string An error message.
      function type_match (val, decl, _seen)
        local t = type(decl)
        if t == 'string' then
          local t = type(val)
          -- Expand the '*' and '?' abbreviations.
          for p, r in pairs(abbrs) do decl = decl:gsub(p, r) end
          for e in decl:gmatch '[^|]+' do
            if t == e then return true end
          end
          return nil, msg:format(decl:gsub('|', ' or '), t)
        elseif t == 'table' then
          local ok, err = type_match(val, 'table|userdata')
          if not ok then return nil, err end
          if not _seen then _seen = {} end
          assert(not _seen[val], 'cycle in data tree.')
          _seen[val] = true
          for k, t in pairs(decl) do
            ok, err = type_match(val[k], t, _seen)
            if not ok then return nil, format('index %s: %s', k, err) end
          end
          return true
        end
        error(msg:format('string or table', t))
      end
    end

    --- Type-check function arguments.
    --
    -- Type declaration grammar:
    --
    -- The type declaration syntax is that of @{type_match}, save for
    -- that you can use '...' to declare that the remaining arguments
    -- are of the same type as the previous one.
    --
    -- Obscure Lua errors may indicate that you forgot the quotes around '...'.
    --
    -- Caveats:
    --
    -- * Wrong type names (e.g., 'int') do *not* throw an error.
    -- * Sometimes the stack trace is wrong.
    --
    -- @tparam string|table ... Type declarations.
    -- @treturn func A function that adds type checks to a function.
    --
    -- @usage
    -- store = type_check('?*', 'table', '?number', '...')(
    --   function (val, tab, ...)
    --     local indices = table.pack(...)
    --     for i = 1, indices.n do tab[indices[i]] = val end
    --   end
    -- )
    --
    -- @function type_check
    function type_check (...)
      local decls = pack(...)
      return function (func)
        return function (...)
          local args = pack(...)
          local decl, prev
          local n = math.max(decls.n, args.n)
          for i = 1, n do
            if     decls[i] == '...' then prev = true
            elseif decls[i]          then prev = false
                                          decl = decls[i]
            elseif not prev          then break
            end
            if args[i] == nil and prev and i >= decls.n then break end
            local ok, err = type_match(args[i], decl)
            if not ok then error(format('argument %d: %s', i, err), 2) end
          end
          return func(...)
        end
      end
    end
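The solutions above have some limitations:
* They are rather verbose, and non-trivial validations are hard to read;
* The error messages are not as clear as those returned by Lua primitives. Moreover, they report the error in the called function, at the place where assert() fails, rather than in the calling function that passed the invalid parameters.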
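The checks library offers a concise, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. In addition, arbitrary type-checking functions can be registered in a dedicated checkers table. For example, to check for an IP port number, which must be between 0 and 0xffff, one can define a type named port as follows: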
    function checkers.port(x)
      -- math.floor(x)==x rejects non-integer numbers.
      return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==x
    end
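To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, so there is no need to repeat them; for example, if function f(num, str) needs a number and a string, it can be implemented as follows: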
    function f(num, str)
      checks('number', 'string')
      --actual function body
    end
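To make the "reads the parameters from the stack frame" idea concrete, here is a pure-Lua imitation built on debug.getlocal. It is only a sketch: the real library is written in C, and this version assumes plain Lua type names and that checks() is not invoked as a tail call.

```lua
-- Hypothetical pure-Lua imitation of checks(); not the library's code.
local function checks(...)
  for i = 1, select("#", ...) do
    local expected = select(i, ...)
    -- Level 1 is checks itself, level 2 is the function that called it.
    -- That function's i-th local variable is its i-th declared parameter.
    local name, value = debug.getlocal(2, i)
    if type(value) ~= expected then
      -- Level 3 attributes the error to the caller of the checked function.
      error(("bad argument #%d (%s): expected %s, got %s")
        :format(i, tostring(name), expected, type(value)), 3)
    end
  end
end

local function f(num, str)
  checks("number", "string")
  return str .. num
end

print(f(1, "a"))
print(pcall(f, "x", "a"))
```

Because the parameter names come from debug information, this imitation stops reporting names if the chunk is compiled with debug information stripped.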
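Types can be combined:
* The vertical bar allows several types to be accepted, e.g. checks('string|number') accepts both strings and numbers as the first parameter.
* The prefix "?" makes a type optional, i.e. nil is also accepted. Functionally, it is equivalent to a "nil|" prefix, although it is more readable and faster at run time.
* The question mark can be combined with union bars, e.g. checks('?string|number') accepts strings, numbers and nil.
* Finally, the special "!" type accepts anything but nil.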
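The combination rules above can be mimicked by a small matcher. The function name matches is hypothetical, and the sketch only looks at type(); it ignores __type metatable fields and the checkers table.

```lua
-- Hypothetical matcher for the type-string syntax described above.
local function matches(spec, value)
  -- A leading "?" makes the whole declaration optional.
  if spec:sub(1, 1) == "?" then
    if value == nil then return true end
    spec = spec:sub(2)
  end
  -- "!" accepts anything except nil.
  if spec == "!" then return value ~= nil end
  -- Otherwise the value's type must equal one of the '|' alternatives.
  local t = type(value)
  for alternative in spec:gmatch("[^|]+") do
    if alternative == t then return true end
  end
  return false
end

print(matches("string|number", 42))
print(matches("?string|number", nil))
print(matches("!", false))
```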
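For a more detailed description of how the library works, see the header of its source code (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, available here: https://github.com/SierraWireless/luasched. For convenience, it is also available as a standalone rock here: https://github.com/fab13n/checks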
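As mentioned, run-time type checking will not detect program errors in code that never gets executed. For programs in dynamically typed languages, an extensive test suite is especially important, so that every code branch is executed with all conceivable data sets (or at least a good representation of them) and the run-time assertions are thoroughly exercised. You cannot rely as much on the compiler to do such checking for you.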
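A very small example of such a test, reusing the asserting abs from earlier on this page, might look like the sketch below. The cases table is a hypothetical layout; it covers both branches of the and/or expression as well as the failure path.

```lua
local function abs(x)
  assert(type(x) == "number", "abs expects a number")
  return x >= 0 and x or -x
end

-- Each case is {input, expected result}. Together they exercise the
-- non-negative branch, the negative branch, and the boundary value.
local cases = { {5, 5}, {-5, 5}, {0, 0} }
for _, case in ipairs(cases) do
  assert(abs(case[1]) == case[2])
end

-- The run-time assertion is code too, so it needs a test of its own.
local ok, err = pcall(abs, "hello")
assert(not ok and err:find("abs expects a number", 1, true))
```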
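Perhaps we can mitigate this by carrying more complete type information with values. Below is one approach, though it is more a novel proof of concept than anything currently ready for production use.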
    -- ExTypeCheck.lua ("ExTypeCheck")
    -- Type checking for Lua.
    --
    -- In this type model, types are associated with values at run-time.
    -- A type consists of the set of values the value could have
    -- at run-time.  This set can be large or infinite, so we
    -- store only a small representative subset of those values.
    -- Typically one would want to include at least the boundary
    -- values (e.g. max and min) in this set.
    -- Type checking is performed by verifying that all values
    -- in that set are accepted by a predicate function for that type.
    -- This predicate function takes a value and returns true or false
    -- whether the value is a member of that type.
    --
    -- As an alternative to representing types as a set of representative
    -- values, we could represent types more formally, such as with
    -- first-order logic, but then we get into theorem proving,
    -- which is more involved.
    --
    -- DavidManura, 2007, licensed under the same terms as Lua itself.

    local M = {}

    -- Stored Expression design pattern
    -- ( https://lua-users.lua.ac.cn/wiki/StatementsInExpressions )
    local StoredExpression
    do
      local function call(self, ...)
        self.__index = {n = select('#', ...), ...}
        return ...
      end
      function StoredExpression()
        local self = {__call = call}
        return setmetatable(self, self)
      end
    end

    -- Whether to enable type checking (true/false).  Default true.
    local is_typecheck = true

    -- TypedValue is an abstract type for values that are typed.
    -- This holds both the actual value and a subset of possible
    -- values the value could assume at runtime.  That set should at least
    -- include the min and max values (for bounds checking).
    local TypedValue = {}

    -- Check that value x satisfies type predicate function f.
    function M.check_type(x, f)
      for _,v in ipairs(x) do
        assert(f(v))
      end
      return x.v
    end

    -- Type check function that decorates functions.
    -- Example:
    --   abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(
    --     function(x) return x >= 0 and x or -x end
    --   )
    function M.typecheck(...)
      local types = {...}
      return function(f)
        local function check(i, ...)
          -- Check types of return values.
          if types[i] == "->" then i = i + 1 end
          local j = i
          while types[i] ~= nil do
            M.check_type(select(i - j + 1, ...), types[i])
            i = i + 1
          end
          return ...
        end
        return function(...)
          -- Check types of input parameters.
          local i = 1
          while types[i] ~= nil and types[i] ~= "->" do
            M.check_type(select(i, ...), types[i])
            i = i + 1
          end
          return check(i, f(...))  -- call function
        end
      end
    end

    function M.notypecheck() return function(f) return f end end
    function M.nounbox(x) return x end

    M.typecheck = is_typecheck and M.typecheck or M.notypecheck
    M.unbox = is_typecheck and M.unbox or M.nounbox

    -- Return a boxed version of a binary operation function.
    -- For the returned function,
    --   Zero, one, or two of the arguments may be boxed.
    --   The result value is boxed.
    -- Example:
    --   __add = boxed_op(function(a,b) return a+b end)
    function M.boxed_op(op)
      return function(a, b)
        if getmetatable(a) ~= TypedValue then a = M.box(a) end
        if getmetatable(b) ~= TypedValue then b = M.box(b) end
        local t = M.box(op(M.unbox(a), M.unbox(b)))
        local seen = {[t[1]] = true}
        for _,a2 in ipairs(a) do
          for _,b2 in ipairs(b) do
            local c2 = op(a2, b2)
            if not seen[c2] then
              t[#t + 1] = op(a2, b2)
              seen[c2] = true
            end
          end
        end
        return t
      end
    end

    -- Return a boxed version of a unary operation function.
    -- For the returned function,
    --   The argument may optionally be boxed.
    --   The result value is boxed.
    -- Example:
    --   __unm = boxed_uop(function(a) return -a end)
    function M.boxed_uop(op)
      return function(a)
        if getmetatable(a) ~= TypedValue then a = M.box(a) end
        local t = M.box(op(M.unbox(a)))
        local seen = {[t[1]] = true}
        for _,a2 in ipairs(a) do
          local c2 = op(a2)
          if not seen[c2] then
            t[#t + 1] = op(a2)
            seen[c2] = true
          end
        end
        return t
      end
    end

    TypedValue.__index = TypedValue
    TypedValue.__add = M.boxed_op(function(a, b) return a + b end)
    TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)
    TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)
    TypedValue.__div = M.boxed_op(function(a, b) return a / b end)
    TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)
    TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)
    TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)
    -- TypedValue.__le -- not going to work? (metafunction returns Boolean)
    -- TypedValue.__lt -- not going to work? (metafunction returns Boolean)
    -- TypedValue.__eq -- not going to work? (metafunction returns Boolean)
    TypedValue.__tostring = function(self)
      local str = "[" .. tostring(self.v) .. " in "
      for i,v in ipairs(self) do
        if i ~= 1 then str = str .. ", " end
        str = str .. v
      end
      str = str .. "]"
      return str
    end

    -- Convert a regular value into a TypedValue.  We call this "boxing".
    function M.box(v, ...)
      local t = setmetatable({v = v, ...}, TypedValue)
      if #t == 0 then t[1] = v end
      return t
    end

    -- Convert a TypedValue into a regular value.  We call this "unboxing".
    function M.unbox(x)
      assert(getmetatable(x) == TypedValue)
      return x.v
    end

    -- Returns a type predicate function for a given interval over the reals.
    -- Example: ranged_real'[0,inf)'
    -- Note: this function could be memoized.
    function M.ranged_real(name, a, b)
      local ex = StoredExpression()

      if name == "(a,b)" then
        return function(x) return type(x) == "number" and x > a and x < b end
      elseif name == "(a,b]" then
        return function(x) return type(x) == "number" and x > a and x <= b end
      elseif name == "[a,b)" then
        return function(x) return type(x) == "number" and x >= a and x < b end
      elseif name == "[a,b]" then
        return function(x) return type(x) == "number" and x >= a and x <= b end
      elseif name == "(-inf,inf)" then
        return function(x) return type(x) == "number" end
      elseif name == "[a,inf)" then
        return function(x) return type(x) == "number" and x >= a end
      elseif name == "(a,inf)" then
        return function(x) return type(x) == "number" and x > a end
      elseif name == "(-inf,a]" then
        return function(x) return type(x) == "number" and x <= a end
      elseif name == "(-inf,a)" then
        return function(x) return type(x) == "number" and x < a end
      elseif name == "[0,inf)" then
        return function(x) return type(x) == "number" and x >= 0 end
      elseif name == "(0,inf)" then
        return function(x) return type(x) == "number" and x > 0 end
      elseif name == "(-inf,0]" then
        return function(x) return type(x) == "number" and x <= 0 end
      elseif name == "(-inf,0)" then
        return function(x) return type(x) == "number" and x < 0 end
      elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then
        -- Literal bounds such as '[0,1)'.
        local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]
        if left == "(" and right == ")" then
          return function(x) return type(x) == "number" and x > a and x < b end
        elseif left == "(" and right == "]" then
          return function(x) return type(x) == "number" and x > a and x <= b end
        elseif left == "[" and right == ")" then
          return function(x) return type(x) == "number" and x >= a and x < b end
        elseif left == "[" and right == "]" then
          return function(x) return type(x) == "number" and x >= a and x <= b end
        else
          assert(false)
        end
      else
        error("invalid arg " .. name, 2)
      end
    end

    return M
Example usage:
    -- type_example.lua
    -- Test of ExTypeCheck.lua.

    local TC = require "ExTypeCheck"
    local typecheck = TC.typecheck
    local ranged_real = TC.ranged_real
    local boxed_uop = TC.boxed_uop
    local box = TC.box

    -- Checked sqrt function.
    local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(
      function(x)
        return boxed_uop(math.sqrt)(x)
      end
    )

    -- Checked random function.
    local random = typecheck('->', ranged_real'[0,1)')(
      function()
        return box(math.random(), 0, 0.999999)
      end
    )

    print(box("a", "a", "b") .. "z")
    print(box(3, 3,4,5) % 4)

    math.randomseed(os.time())
    local x = 0 + random() * 10 - 1 + (random()+1) * 0

    print(x + 1); print(sqrt(x + 1)) -- ok
    print(x); print(sqrt(x)) -- always asserts! (since x might be negative)
Example output:
    [az in az, bz]
    [3 in 3, 0, 1]
    [8.7835848325787 in 8.7835848325787, 0, 9.99999]
    [2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]
    [7.7835848325787 in 7.7835848325787, -1, 8.99999]
    lua: ./ExTypeCheck.lua:50: assertion failed!
    stack traceback:
            [C]: in function 'assert'
            ./ExTypeCheck.lua:50: in function 'check_type'
            ./ExTypeCheck.lua:78: in function 'sqrt'
            testt.lua:30: in main chunk
            [C]: ?
Note: this approach of a value holding multiple values has some similarities to Perl6 junctions (originally termed "quantum superpositions").
There is a run-time type checking example in Metalua [2].
The Dao language, partly based on Lua, has built-in support for optional typing [3].
The [Teal language] is a typed dialect of Lua that compiles to Lua.
[TypeScriptToLua] is a TypeScript to Lua transpiler that lets us write Lua code using TypeScript syntax, with type checking at compile time.
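I would argue that, except for very simple programs or programs that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte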
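The comment above was anonymously removed with the remark "this is not a forum for people to post opinions at random". On the contrary, this is the established style of the wiki. People can comment on a page, and for contentious points this is considered polite, as opposed to simply changing the original text. The original author can then decide whether to integrate such comments into the original text. --JohnBelmonte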