Lua Type Checking


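Many programming languages provide some form of static (compile-time) or dynamic (run-time) type checking, with each form having its own merits [1]. Lua performs run-time type checking on its built-in operations. For example, this code triggers a run-time error: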

> x = 5 + "ok"
stdin:1: attempt to perform arithmetic on a string value

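However, unlike in languages such as C, Lua has no built-in mechanism for type checking the parameters and return values of function calls. The types are left unspecified: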

function abs(x)
  return x >= 0 and x or -x
end

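This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You are allowed to call this function on something other than a number, though the operations inside the function will then trigger a somewhat cryptic run-time error (in C this would be a compile-time error).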

> print(abs("hello"))
stdin:2: attempt to compare number with string
stack traceback:
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

Solution: Assertions at the Top of the Function

For improved error reporting, one is typically told to do something like this:

function abs(x)
  assert(type(x) == "number", "abs expects a number")
  return x >= 0 and x or -x
end

> print(abs("hello"))
stdin:2: abs expects a number
stack traceback:
        [C]: in function 'assert'
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

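This is a good suggestion, but one might complain that it imposes additional run-time overhead, that it only detects program errors in code that actually gets executed rather than in all code that is compiled, that the types of the function's values live entirely inside the function's implementation (not available by introspection), and that it can involve a lot of repetitive code (especially when named arguments are passed via a table).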

Here is one way to proactively check named parameters:

local Box = {}
local is_key = {x=true,y=true,width=true,height=true,color=true}
function create_box(t)
  local x = t.x or 0
  local y = t.y or 0
  local width = t.width or 0
  local height = t.height or 0
  local color = t.color
  assert(type(x) == "number", "x must be number or nil")
  assert(type(y) == "number", "y must be number or nil")
  assert(type(width) == "number", "width must be number or nil")
  assert(type(height) == "number", "height must be number or nil")
  assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")
  for k,v in pairs(t) do
    assert(is_key[k], tostring(k) .. " not valid key")
  end
  return setmetatable({x1=x,y1=y,x2=x+width,y2=y+height,color=color}, Box)
end

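That is a relatively large amount of code. In practice, we may want to use error rather than assert so that we can pass an appropriate level parameter for the stack traceback.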
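
To illustrate the point about error and its level parameter, here is a minimal sketch. The helper name check_arg is hypothetical and not part of this page's code. It assumes the helper is called directly from the function being checked, so that level 3 refers to the code that called that function.

```lua
-- Hypothetical helper (an assumption for illustration): raise an argument
-- error that is attributed to the caller of the checked function.
local function check_arg(value, expected, name)
  if type(value) ~= expected then
    -- Level 1 is check_arg itself, level 2 is the checked function,
    -- level 3 is the code that called the checked function.
    error(("%s must be a %s, got %s"):format(name, expected, type(value)), 3)
  end
end

local function abs(x)
  check_arg(x, "number", "x")
  return x >= 0 and x or -x
end

-- The wrapper avoids a tail call so that the caller's frame stays on the stack.
local ok, err = pcall(function()
  local result = abs("hello")
  return result
end)
print(ok, err)
```

With assert, the position prefix of the message would point inside abs; with an explicit level it points at the call site that passed the bad value.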

Solution: Function Decorators

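Another approach is to place the type checking code outside of the original function, potentially with a "function decorator" (see DecoratorsAndDocstrings for background).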

is_typecheck = true

function typecheck(...)
  return function(f)
    return function(...)
      assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")
      return f(...)
    end
  end
end

function notypecheck()
  return function(f) return f end
end

typecheck = is_typecheck and typecheck or notypecheck

sum = typecheck("number", "number", "->", "number")(
  function(a,b)
    return a + b
  end
)

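The advantage is that the type information sits outside the function implementation. We can disable all type checking by switching a single variable, and when it is disabled no added overhead is incurred as the functions execute (though there is some slight overhead when the functions are built). The typecheck function could also store the type info away for later introspection.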
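
As a rough sketch of what the placeholder above might be filled in with, the following checks argument types against plain Lua type names and records each signature for introspection. The signatures table, the decorate name, and the flat list format are assumptions made for illustration; return values after "->" are recorded but not checked here.

```lua
-- A minimal sketch under the assumptions stated above.
local is_typecheck = true
local signatures = setmetatable({}, {__mode = "k"})  -- function -> declared types

local function typecheck(...)
  local types = {...}
  return function(f)
    local function checked(...)
      local i = 1
      -- Check arguments up to the "->" separator.
      while types[i] ~= nil and types[i] ~= "->" do
        local got = type((select(i, ...)))
        if got ~= types[i] then
          error(("argument %d: expected %s, got %s"):format(i, types[i], got), 2)
        end
        i = i + 1
      end
      return f(...)
    end
    signatures[checked] = types
    return checked
  end
end

-- When checking is disabled, keep the signature but return f unchanged.
local function notypecheck(...)
  local types = {...}
  return function(f)
    signatures[f] = types
    return f
  end
end

local decorate = is_typecheck and typecheck or notypecheck

local sum = decorate("number", "number", "->", "number")(
  function(a, b) return a + b end
)

print(sum(1, 2))                            --> 3
print(table.concat(signatures[sum], ", "))  --> number, number, ->, number
print(pcall(sum, 1, "x"))
```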

This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).

One way to implement such type checking decorators is:

--- Odin Kroeger, 2022, released under the MIT license.
do
    local abbrs = {
        ['%*'] = 'boolean|function|number|string|table|thread|userdata',
        ['%?(.*)'] = '%1|nil'
    }

    local msg = 'expected %s, got %s.'

    --- Check whether a value is of a type.
    --
    -- Type declaration grammar:
    --
    -- Declare one or more Lua type names separated by '|' to require that
    -- the given value is of one of the given types (e.g., 'string|table'
    -- requires the value to be a string or a table). '*' is short for the
    -- list of all types but `nil`. '?T' is short for 'T|nil' (e.g.,
    -- '?table' is short for 'table|nil').
    --
    -- Extended Backus-Naur Form:
    --
    -- > Type = 'boolean' | 'function' | 'nil'    | 'number'   |
    -- >        'string'  | 'table'    | 'thread' | 'userdata'
    -- >
    -- > Type list = [ '?' ], type, { '|', type }
    -- >
    -- > Wildcard = [ '?' ], '*'
    -- >
    -- > Type declaration = type list | wildcard
    --
    -- Complex types:
    --
    -- You can check types of table or userdata fields by
    -- declaring a table that maps indices to declarations.
    --
    --    > type_match({1, '2'}, {'number', 'number'})
    --    nil    index 2: expected number, got string.
    --    > type_match({foo = 'bar'}, {foo = '?table'})
    --    nil    index foo: expected table or nil, got string.
    --    > type_match('foo', {foo = '?table'})
    --    nil    expected table or userdata, got string.
    --
    -- Wrong type names (e.g., 'int') do *not* throw an error.
    --
    -- @param val A value.
    -- @tparam string|table decl A type declaration.
    -- @treturn[1] bool `true` if the value matches the declaration.
    -- @treturn[2] nil `nil` otherwise.
    -- @treturn[2] string An error message.
    function type_match (val, decl, _seen)
        local t = type(decl)
        if t == 'string' then
            local t = type(val)
            for p, r in pairs(abbrs) do decl = decl:gsub(p, r) end
            for e in decl:gmatch '[^|]+' do if t == e then return true end end
            return nil, msg:format(decl:gsub('|', ' or '), t)
        elseif t == 'table' then
            local ok, err = type_match(val, 'table|userdata')
            if not ok then return nil, err end
            if not _seen then _seen = {} end
            assert(not _seen[val], 'cycle in data tree.')
            _seen[val] = true
            for k, t in pairs(decl) do
                ok, err = type_match(val[k], t, _seen)
                if not ok then return nil, ('index %s: %s'):format(tostring(k), err) end
            end
            return true
        end
        error(msg:format('string or table', t))
    end
end

--- Type-check function arguments.
--
-- Type declaration grammar:
--
-- The type declaration syntax is that of @{type_match}, save for
-- that you can use '...' to declare that the remaining arguments
-- are of the same type as the previous one.
--
-- Obscure Lua errors may indicate that you forgot the quotes around '...'.
--
-- Caveats:
--
-- * Wrong type names (e.g., 'int') do *not* throw an error.
-- * Sometimes the stack trace is wrong.
--
-- @tparam string|table ... Type declarations.
-- @treturn func A function that adds type checks to a function.
--
-- @usage
-- store = type_check('?*', 'table', '?number', '...')(
--     function (val, tab, ...)
--          local indices = table.pack(...)
--          for i = 1, indices.n do tab[indices[i]] = val end
--     end
-- )
--
-- @function type_check
function type_check (...)
    -- table.pack requires Lua 5.2 or later.
    local decls = table.pack(...)
    return function (func)
        return function (...)
            local args = table.pack(...)
            local decl, prev
            local n = math.max(decls.n, args.n)
            for i = 1, n do
                if     decls[i] == '...' then prev = true
                elseif decls[i]          then prev = false
                                              decl = decls[i]
                elseif not prev          then break
                end
                if args[i] == nil and prev and i >= decls.n then break end
                local ok, err = type_match(args[i], decl)
                if not ok then
                    error(string.format('argument %d: %s', i, err), 2)
                end
            end
            return func(...)
        end
    end
end

Solution: The checks Library

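The solutions above have some limitations:

* they are rather verbose, and non-trivial verifications can be hard to read;

* the error messages are not as clear as those returned by Lua's primitives. Moreover, they report the error in the called function, at the place where assert() fails, rather than in the calling function that passed the invalid arguments.

The checks library offers a terse, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. Moreover, arbitrary type-checking functions can be registered in a dedicated checkers table. For instance, to check for an IP port number, which must be between 0 and 0xffff, one can define a port type as follows: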

function checkers.port(x)
    return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==x
end

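To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, so there is no need to repeat them; for instance, if function f(num, str) needs a number and a string, it can be implemented as follows: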

function f(num, str)
    checks('number', 'string')
    --actual function body
end
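
The real checks library is written in C, but the idea of reading the caller's parameters from its stack frame can be sketched in pure Lua with debug.getlocal. The name mini_checks is hypothetical, and this sketch only understands plain Lua type names. It also assumes it is called directly (not as a tail call) from the function being checked.

```lua
-- Pure-Lua illustration only; not the library's actual implementation.
local function mini_checks(...)
  for i = 1, select("#", ...) do
    local expected = select(i, ...)
    -- Level 2 is the function that called mini_checks. Its parameters
    -- are its first local variables, in declaration order.
    local _, value = debug.getlocal(2, i)
    if type(value) ~= expected then
      -- Level 3 attributes the error to the caller of the checked function.
      error(("bad argument #%d (%s expected, got %s)")
              :format(i, expected, type(value)), 3)
    end
  end
end

local function f(num, str)
  mini_checks("number", "string")
  return str .. num
end

print(f(1, "a"))          --> a1
print(pcall(f, "x", "y"))
```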

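Types can be combined:

* the vertical bar allows several types to be accepted, e.g. checks('string|number') accepts both strings and numbers as the first parameter.

* a "?" prefix makes a type optional, i.e. also accepts nil. Functionally, it is equivalent to a "nil|" prefix, although it is more readable and faster at run time.

* the question mark can be combined with union bars, e.g. checks('?string|number') accepts strings, numbers and nil.

* finally, the special "!" type accepts anything but nil.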
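
The combination rules listed above can be sketched as a small matcher. The function name matches is hypothetical, and this sketch handles only built-in Lua type names; it ignores the __type metatable field and the checkers table that the real library supports.

```lua
-- Hypothetical matcher illustrating the type-string syntax only.
local function matches(value, spec)
  -- A leading "?" accepts nil, then the rest of the spec applies.
  if spec:sub(1, 1) == "?" then
    if value == nil then return true end
    spec = spec:sub(2)
  end
  -- "!" accepts anything except nil.
  if spec == "!" then return value ~= nil end
  -- Otherwise, accept the value if its type is one of the alternatives.
  local t = type(value)
  for alternative in spec:gmatch("[^|]+") do
    if t == alternative then return true end
  end
  return false
end

print(matches("a", "string|number"))   --> true
print(matches(nil, "?string|number"))  --> true
print(matches({}, "?string|number"))   --> false
print(matches(nil, "!"))               --> false
print(matches(false, "!"))             --> true
```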

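For a more detailed description of how the library works, please refer to the header of its source code (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of the Sierra Wireless application framework, accessible here: https://github.com/SierraWireless/luasched. For convenience, it is also available as a standalone rock here: https://github.com/fab13n/checks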

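Hack: Boxed Values + Possible Values

As mentioned, run-time type checking will not detect program errors in code that does not get executed. Extensive test suites are particularly essential for programs in dynamically typed languages, so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) and the run-time assertions are sufficiently hit. You can't rely as much on the compiler to do some of this checking for you.

Perhaps we can mitigate this by carrying around more complete type information with the values. Below is one approach, though it is more a novel proof of concept than anything for production use at this point.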

-- ExTypeCheck.lua ("ExTypeCheck")
-- Type checking for Lua.
--
-- In this type model, types are associated with values at run-time.
-- A type consists of the set of values the value could have
-- at run-time.  This set can be large or infinite, so we
-- store only a small representative subset of those values.
-- Typically one would want to include at least the boundary
-- values (e.g. max and min) in this set.
-- Type checking is performed by verifying that all values
-- in that set are accepted by a predicate function for that type.
-- This predicate function takes a value and returns true or false
-- depending on whether the value is a member of that type.
--
-- As an alternative to representing types as a set of representative
-- values, we could represent types more formally, such as with
-- first-order logic, but then we get into theorem proving,
-- which is more involved.
--
-- DavidManura, 2007, licensed under the same terms as Lua itself.

local M = {}

-- Stored Expression design pattern
-- ( https://lua-users.lua.ac.cn/wiki/StatementsInExpressions )
local StoredExpression
do
  local function call(self, ...)
    self.__index = {n = select('#', ...), ...}
    return ...
  end
  function StoredExpression()
    local self = {__call = call}
    return setmetatable(self, self)
  end
end
 
-- Whether to enable type checking (true/false).  Default true.
local is_typecheck = true

-- TypedValue is an abstract type for values that are typed.
-- This holds both the actual value and a subset of possible
-- values the value could assume at runtime.  That set should at least
-- include the min and max values (for bounds checking).
local TypedValue = {}

-- Check that value x satisfies type predicate function f.
function M.check_type(x, f)
  for _,v in ipairs(x) do
    assert(f(v))
  end
  return x.v
end


-- Type check function that decorates functions.
-- Example:
--   abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(
--     function(x) return x >= 0 and x or -x end
--   )
function M.typecheck(...)
  local types = {...}
  return function(f)
    local function check(i, ...)
      -- Check types of return values.
      if types[i] == "->" then i = i + 1 end
      local j = i
      while types[i] ~= nil do
        M.check_type(select(i - j + 1, ...), types[i])
        i = i + 1
      end
      return ...
    end
    return function(...)
      -- Check types of input parameters.
      local i = 1
      while types[i] ~= nil and types[i] ~= "->" do
        M.check_type(select(i, ...), types[i])
        i = i + 1
      end
      return check(i, f(...))  -- call function
    end
  end
end


function M.notypecheck() return function(f) return f end end
function M.nounbox(x) return x end

M.typecheck = is_typecheck and M.typecheck or M.notypecheck
M.unbox = is_typecheck and M.unbox or M.nounbox

-- Return a boxed version of a binary operation function.
-- For the returned function,
--   Zero, one, or two of the arguments may be boxed.
--   The result value is boxed.
-- Example:
--   __add = boxed_op(function(a,b) return a+b end)
function M.boxed_op(op)
  return function(a, b)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    if getmetatable(b) ~= TypedValue then b = M.box(b) end
    local t = M.box(op(M.unbox(a), M.unbox(b)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      for _,b2 in ipairs(b) do
        local c2 = op(a2, b2)
        if not seen[c2] then
          t[#t + 1] = op(a2, b2)
          seen[c2] = true
        end
      end
    end
    return t
  end
end

-- Return a boxed version of a unary operation function.
-- For the returned function,
--   The argument may optionally be boxed.
--   The result value is boxed.
-- Example:
--   __unm = boxed_uop(function(a) return -a end)
function M.boxed_uop(op)
  return function(a)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    local t = M.box(op(M.unbox(a)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      local c2 = op(a2)
      if not seen[c2] then
        t[#t + 1] = op(a2)
        seen[c2] = true
      end
    end
    return t
  end
end

TypedValue.__index = TypedValue
TypedValue.__add = M.boxed_op(function(a, b) return a + b end)
TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)
TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)
TypedValue.__div = M.boxed_op(function(a, b) return a / b end)
TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)
TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)
TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)
-- TypedValue.__le -- not going to work? (metafunction returns Boolean)
-- TypedValue.__lt -- not going to work? (metafunction returns Boolean)
-- TypedValue.__eq -- not going to work? (metafunction returns Boolean)
TypedValue.__tostring = function(self)
  local str = "[" .. tostring(self.v) .. " in "
  for i,v in ipairs(self) do
    if i ~= 1 then str = str .. ", " end
    str = str .. v
  end
  str = str .. "]"
  return str 
end
-- Convert a regular value into a TypedValue.  We call this "boxing".
function M.box(v, ...)
  local t = setmetatable({v = v, ...}, TypedValue)
  if #t == 0 then t[1] = v end
  return t
end
-- Convert a TypedValue into a regular value.  We call this "unboxing".
function M.unbox(x)
  assert(getmetatable(x) == TypedValue)
  return x.v
end


-- Returns a type predicate function for a given interval over the reals.
-- Example: ranged_real'[0,inf)'
-- Note: this function could be memoized.
function M.ranged_real(name, a, b)
  local ex = StoredExpression()

  if name == "(a,b)" then
    return function(x) return type(x) == "number" and x > a and x < b end
  elseif name == "(a,b]" then
    return function(x) return type(x) == "number" and x > a and x <= b end
  elseif name == "[a,b)" then
    return function(x) return type(x) == "number" and x >= a and x < b end
  elseif name == "[a,b]" then
    return function(x) return type(x) == "number" and x >= a and x <= b end
  elseif name == "(-inf,inf)" then
    return function(x) return type(x) == "number" end
  elseif name == "[a,inf)" then
    return function(x) return type(x) == "number" and x >= a end
  elseif name == "(a,inf)" then
    return function(x) return type(x) == "number" and x > a end
  elseif name == "(-inf,a]" then
    return function(x) return type(x) == "number" and x <= a end
  elseif name == "(-inf,a)" then
    return function(x) return type(x) == "number" and x < a end
  elseif name == "[0,inf)" then
    return function(x) return type(x) == "number" and x >= 0 end
  elseif name == "(0,inf)" then
    return function(x) return type(x) == "number" and x > 0 end
  elseif name == "(-inf,0]" then
    return function(x) return type(x) == "number" and x <= 0 end
  elseif name == "(-inf,0)" then
    return function(x) return type(x) == "number" and x < 0 end
  elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then
    local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]
    if left == "(" and right == ")" then
      return function(x) return type(x) == "number" and x > a and x < b end
    elseif left == "(" and right == "]" then
      return function(x) return type(x) == "number" and x > a and x <= b end
    elseif left == "[" and right == ")" then
      return function(x) return type(x) == "number" and x >= a and x < b end
    elseif left == "[" and right == "]" then
      return function(x) return type(x) == "number" and x >= a and x <= b end
    else assert(false)
    end
  else
    error("invalid arg " .. name, 2)
  end
end


return M

Example usage:

-- type_example.lua
-- Test of ExTypeCheck.lua.

local TC = require "ExTypeCheck"
local typecheck = TC.typecheck
local ranged_real = TC.ranged_real
local boxed_uop = TC.boxed_uop
local box = TC.box

-- Checked sqrt function.
local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(
  function(x)
    return boxed_uop(math.sqrt)(x)
  end
)

-- Checked random function.
local random = typecheck('->', ranged_real'[0,1)')(
  function()
    return box(math.random(), 0, 0.999999)
  end
)

print(box("a", "a", "b") .. "z")
print(box(3, 3,4,5) % 4)

math.randomseed(os.time())
local x = 0 + random() * 10 - 1 + (random()+1) * 0
print(x + 1); print(sqrt(x + 1)) -- ok
print(x); print(sqrt(x)) -- always asserts! (since x might be negative)

Example output:

[az in az, bz]
[3 in 3, 0, 1]
[8.7835848325787 in 8.7835848325787, 0, 9.99999]
[2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]
[7.7835848325787 in 7.7835848325787, -1, 8.99999]
lua: ./ExTypeCheck.lua:50: assertion failed!
stack traceback:
        [C]: in function 'assert'
        ./ExTypeCheck.lua:50: in function 'check_type'
        ./ExTypeCheck.lua:78: in function 'sqrt'
        testt.lua:30: in main chunk
        [C]: ?

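Note: this approach of values holding multiple values has some similarities to Perl 6 junctions (originally termed "quantum superpositions").

Solution: Metalua Runtime Type Checking

There is a run-time type checking example in Metalua [2].

Solution: Dao

The Dao language, partly based on Lua, has built-in support for optional typing [3].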


--DavidManura

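Solution: Teal

The [Teal language] is a typed dialect of Lua that compiles to Lua.

Solution: TypeScriptToLua

[TypeScriptToLua] is a TypeScript-to-Lua transpiler that lets us write Lua code using TypeScript syntax, with type checking at compile time.

I would argue that, except for very simple programs or those that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte

The comment above was anonymously deleted with the remark "This isn't a forum for drive-by opinions". On the contrary, this is established wiki style. People can comment on pages, and for points of contention that is considered more courteous than simply changing the original text. The original author can decide whether to integrate such comments into the original text. --JohnBelmonte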


Last edited July 12, 2022 8:38 am GMT