String Interpolation


When variables need to be interpolated into strings, the resulting quoting and unquoting can become slightly unwieldy:

print("Hello " .. name .. ", the value of key " .. k .. " is " .. v .. "!")

Compare this to Perl, where variables can be embedded in strings:

print "Hello $name, the value of key $k is $v!\n";

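The complaint about the Lua version is that the quoting is verbose and can make the code harder to read, for example when visually distinguishing text inside the quotes from text outside them. Besides using an editor with syntax highlighting, the latter issue can be improved with the bracketed quoting style: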

print([[Hello ]] .. name .. [[, the value of key ]] ..
      k .. [[ is ]] .. v .. [[!]])

This can also be made more concise using string.format:

print(string.format("Hello %s, the value of key %s is %s", name, k, v))

possibly using a helper function:

function printf(...) print(string.format(...)) end

printf("Hello %s, the value of key %s is %s", name, k, v)

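The new problem this presents is that variables are identified positionally, which causes readability and maintainability problems when the number of variables is large.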

The following solutions show how to implement support for interpolating variables into strings in Lua, to achieve a syntax somewhat like this:

printf("Hello %(name), the value of key %(k) is %(v)")

Solution: Named Parameters in Table

Here's one simple implementation (-- RiciLake):

function interp(s, tab)
  return (s:gsub('($%b{})', function(w) return tab[w:sub(3, -2)] or w end))
end
print( interp("${name} is ${value}", {name = "foo", value = "bar"}) )

getmetatable("").__mod = interp
print( "${name} is ${value}" % {name = "foo", value = "bar"} )
-- Outputs "foo is bar"
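
One caveat of the version above: string.gsub raises an error if the replacement function returns a boolean true or a table, and a value of false falls through to the original "${...}" text. The following is a minimal sketch of a variant that converts every non-nil value with tostring. The name interp_any is hypothetical and not part of the original page.

```lua
-- Hypothetical variant of interp: any non-nil value is converted with tostring.
local function interp_any(s, tab)
  return (s:gsub('%$(%b{})', function(w)
    local value = tab[w:sub(2, -2)]        -- strip the surrounding braces
    if value == nil then return nil end    -- nil keeps the original "${...}" text
    return tostring(value)
  end))
end

print(interp_any("${name} is ${enabled}, count ${n}, ${missing}",
                 {name = "foo", enabled = false, n = 3}))
-- prints: foo is false, count 3, ${missing}
```

Unknown keys are left in place, which makes typos in variable names easy to spot in the output.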

Solution: Named Parameters with Format Codes

Here's another implementation (-- RiciLake) supporting Python-style format specifications (requires Lua 5.1 or greater):

function interp(s, tab)
  return (s:gsub('%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])',
            function(k, fmt) return tab[k] and ("%"..fmt):format(tab[k]) or
                '%('..k..')'..fmt end))
end
getmetatable("").__mod = interp
print( "%(key)s is %(val)7.2f%" % {key = "concentration", val = 56.2795} )
-- outputs "concentration is   56.28%"
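
To illustrate how the pattern handles several format specifications and an unknown key, here is a small sketch. It copies the function above under the hypothetical local name interp_fmt so that the example does not touch the string metatable. The sample table is made up for illustration.

```lua
-- Copy of the implementation above under a hypothetical local name.
local function interp_fmt(s, tab)
  return (s:gsub('%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])',
    function(k, fmt)
      -- Known key: apply the format code; unknown key: keep the text as written.
      return tab[k] and ("%" .. fmt):format(tab[k]) or '%(' .. k .. ')' .. fmt
    end))
end

local vals = {id = 42, name = "widget", ratio = 0.5}
print(interp_fmt("%(id)05d|%(name)-8s|%(ratio).1f|%(unknown)s", vals))
-- prints: 00042|widget  |0.5|%(unknown)s
```

Note that a key whose value is false is treated like a missing key, because the lookup uses the and/or idiom.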

Solution: Named Parameters and Format String in Same Table

Here's another Lua-only solution (-- MarkEdgar):

function replace_vars(str, vars)
  -- Allow replace_vars{str, vars} syntax as well as replace_vars(str, {vars})
  if not vars then
    vars = str
    str = vars[1]
  end
  return (string.gsub(str, "({([^}]+)})",
    function(whole,i)
      return vars[i] or whole
    end))
end

-- Example:
output = replace_vars{
	[[Hello {name}, welcome to {company}. ]],
	name = name,
	company = get_company_name()
}
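
The example above depends on a name variable and a get_company_name() function defined elsewhere. A self-contained run might look like the following sketch, which uses the standard string.gsub and hypothetical stand-ins for both.

```lua
local function replace_vars(str, vars)
  if not vars then   -- called as replace_vars{str, name = value, ...}
    vars = str
    str = vars[1]
  end
  return (string.gsub(str, "({([^}]+)})", function(whole, key)
    return vars[key] or whole   -- unknown names are left untouched
  end))
end

-- Hypothetical stand-in for the page's get_company_name().
local function get_company_name() return "Acme" end

local output = replace_vars{
  [[Hello {name}, welcome to {company}. {unknown} stays.]],
  name = "Ann",
  company = get_company_name(),
}
print(output)  -- prints: Hello Ann, welcome to Acme. {unknown} stays.
```

Keeping the template at index 1 and the variables as named fields is what allows the single-table call syntax.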

Solution: Ruby- and Python-like String Formatting with the % Operator

Both Ruby and Python have a short form for string formatting, using the % operator.

The following snippet adds a similar use of the mod operator to Lua:

local unpack = table.unpack or unpack  -- table.unpack in Lua 5.2+, unpack in Lua 5.1

getmetatable("").__mod = function(a, b)
        if not b then
                return a
        elseif type(b) == "table" then
                return string.format(a, unpack(b))
        else
                return string.format(a, b)
        end
end

Example usage:

print( "%5.2f" % math.pi )

print( "%-10.10s %04d" % { "test", 123 } )

You may like or dislike this notation; choose for yourself.

Hack: Using debug to Access Lexicals

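Below is a more complex implementation (-- DavidManura). It makes use of the debug library (particularly debug.getlocal()) to query local variables, which might be undesirable for a number of reasons (-- RiciLake). First, it can be used to break into things you shouldn't break into, so it is a bad idea if untrusted code is being run. debug.getlocal() is also expensive, since it needs to scan the entire bytecode to figure out which variables are in scope. It also does not capture closed variables (upvalues).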
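
The last limitation can be seen in a small experiment. The sketch below uses two hypothetical helpers, local_names and upvalues_of (neither is part of the implementation on this page), to show that a variable captured from an enclosing scope is not reported by debug.getlocal() but can be found with debug.getupvalue().

```lua
-- Names of the locals active in the function at the given stack level
-- (1 = the caller of local_names).
local function local_names(level)
  local names, n = {}, 1
  while true do
    local name = debug.getlocal(level + 1, n)
    if name == nil then break end
    names[name] = true
    n = n + 1
  end
  return names
end

-- Upvalues (name -> value) of the function at the given stack level.
local function upvalues_of(level)
  local func = debug.getinfo(level + 1, "f").func
  local ups, n = {}, 1
  while true do
    local name, value = debug.getupvalue(func, n)
    if name == nil then break end
    ups[name] = value
    n = n + 1
  end
  return ups
end

local outer = "captured"

local function demo()
  local inner = outer .. "!"     -- outer is an upvalue of demo
  local names = local_names(1)   -- not a tail call, so demo stays on the stack
  local ups = upvalues_of(1)
  return names, ups
end

local names, ups = demo()
print(names.inner, names.outer)  -- prints: true    nil
print(ups.outer)                 -- prints: captured
```

An interpolation function that wants to see closed variables would therefore have to merge both sources, and only upvalues the calling function actually references are available.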

Code:

-- "nil" value that can be stored in tables.
local mynil_mt = {__tostring = function() return tostring(nil) end}
local mynil = setmetatable({}, mynil_mt)

-- Retrieves table of all local variables (name, value)
-- in given function <func>.  If a value is nil, it instead
-- stores the value <mynil> in the table to distinguish a
-- local variable that is nil from the local variable not
-- existing.
-- If a number is given in place of <func>, then it
-- uses that level in the call stack.  Level 1 is the
-- function that called get_locals.
-- Note: this correctly handles the case where two locals have the
-- same name: "local x = 1 ... get_locals(1) ... local x = 2".
-- This function is similar to, and is based on, debug.getlocal().
function get_locals(func)
  local n = 1
  local locals = {}
  func = (type(func) == "number") and func + 1 or func
  while true do
    local lname, lvalue = debug.getlocal(func, n)
    if lname == nil then break end  -- end of list
    if lvalue == nil then lvalue = mynil end  -- replace
    locals[lname] = lvalue
    n = n + 1
  end
  return locals
end


-- Interpolates variables into string <str>.
-- Variables are defined in table <table>.  If <table> is
-- omitted, then it uses local and global variables in the
-- calling function.
-- Optional <level> indicates the level in the call stack to
-- obtain local variables from (1 if omitted).
function interp(str, table, level)
  local use_locals = (table == nil)
  table = table or getfenv(2)
  if use_locals then
    level = level or 1
    local locals = get_locals(level + 1)
    table = setmetatable(locals, {__index = table})
  end
  local out = string.gsub(str, '$(%b{})',
    function(w)
      local variable_name = string.sub(w, 2, -2)
      local variable_value = table[variable_name]
      if variable_value == mynil then variable_value = nil end
      return tostring(variable_value)
    end
  )
  return out
end

-- Interpolating print.
-- This is just a wrapper around print and interp.
-- It only accepts a single string argument.
function printi(str)
  print(interp(str, nil, 2))
end

-- Pythonic "%" operator for string interpolation.
getmetatable("").__mod = interp

Test:

-- test globals
x=123
assert(interp "x = ${x}" == "x = 123")

-- test table
assert(interp("x = ${x}", {x = 234}) == "x = 234")

-- test locals (which override globals)
do
  local x = 3
  assert(interp "x = ${x}" == "x = 3")
end

-- test globals using setfenv
function test()
  assert(interp "y = ${y}" == "y = 123")
end
local env = {y = 123}
setmetatable(env, {__index = _G})
setfenv(test, env)
test()

-- test of multiple locals of same name
do
  local z = 1
  local z = 2
  assert(interp "z = ${z}" == "z = 2")
  local z = 3
end

-- test of locals with nil value
do
  z = 2
  local z = 1
  local z = nil
  assert(interp "z = ${z}" == "z = nil")
end

-- test of printi
x = 123
for k, v in ipairs {3,4} do
  printi("${x} - The value of key ${k} is ${v}")
end

-- test of "%" operator
assert("x = ${x}" % {x = 2} == "x = 2")

Various enhancements could be made. For example,

v = {x = 2}
print(interp "v.x = ${v.x}")  -- not implemented
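
One possible way to support dotted names, sketched here against an explicit table rather than the debug-based lookup above. The helpers lookup and interp_path are hypothetical names, and only string keys separated by dots are handled.

```lua
-- Follows a dotted path such as "v.x" through nested tables.
local function lookup(tab, path)
  local value = tab
  for key in path:gmatch("[^.]+") do
    if type(value) ~= "table" then return nil end
    value = value[key]
  end
  return value
end

local function interp_path(s, tab)
  return (s:gsub("%$(%b{})", function(w)
    local value = lookup(tab, w:sub(2, -2))
    if value == nil then return nil end   -- leave unresolved "${...}" as is
    return tostring(value)
  end))
end

print(interp_path("v.x = ${v.x}, ${v.y.z}", {v = {x = 2}}))
-- prints: v.x = 2, ${v.y.z}
```

Evaluating arbitrary expressions inside the braces would need a real parser or a call to load, which is a larger design step than path lookup.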

Patch to Lua

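A feature I like in Ruby and PHP is the ability to include variables inside strings, for example print "Hello ${Name}". The following patch does the same thing, but only for the doc string type: strings starting with [[ and ending with ]]. It uses the "|" character to represent the opening and closing brace.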

Example of adding variables inline:

output = [[Hello |name|, welcome to |get_company_name()|. ]]

What the patch does is effectively translate the above into:

output = [[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]
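
The same rewrite can be approximated in plain Lua as a source-to-source transformation, which may help when reasoning about what the lexer patch is meant to produce. This is only a sketch: bars_to_concat is a hypothetical helper, it works on the body of a long string rather than inside the lexer, and it gives no way to write a literal "|".

```lua
-- Hypothetical helper: rewrites the body of a long string so that each
-- |expr| becomes a concatenation, and returns the resulting Lua source text.
local function bars_to_concat(body)
  local converted = body:gsub("|([^|]*)|", "]]..%1..[[")
  return "[[" .. converted .. "]]"
end

print(bars_to_concat("Hello |name|, welcome to |get_company_name()|. "))
-- prints: [[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]
```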

The following functions are updated in the llex.c file.

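Important note: somehow I needed another character to represent the closing brace inside the code, and I arbitrarily chose '�'. This means that if your string contains that character (especially when using a foreign-language encoding), you will get a syntax error. I am not sure whether there is a solution to this problem at the moment.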

int luaX_lex (LexState *LS, SemInfo *seminfo) {
  for (;;) {
    switch (LS->current) {

      case '\n': {
        inclinenumber(LS);
        continue;
      }
      case '-': {
        next(LS);
        if (LS->current != '-') return '-';
        /* else is a comment */
        next(LS);
        if (LS->current == '[' && (next(LS), LS->current == '['))
          read_long_string(LS, NULL);  /* long comment */
        else  /* short comment */
          while (LS->current != '\n' && LS->current != EOZ)
            next(LS);
        continue;
      }
      case '[': {
        next(LS);
        if (LS->current != '[') return '[';
        else {
          read_long_string(LS, seminfo);
          return TK_STRING;
        }
      }
      case '=': {
        next(LS);
        if (LS->current != '=') return '=';
        else { next(LS); return TK_EQ; }
      }
      case '<': {
        next(LS);
        if (LS->current != '=') return '<';
        else { next(LS); return TK_LE; }
      }
      case '>': {
        next(LS);
        if (LS->current != '=') return '>';
        else { next(LS); return TK_GE; }
      }
      case '~': {
        next(LS);
        if (LS->current != '=') return '~';
        else { next(LS); return TK_NE; }
      }
      case '"':
      case '\'': {
        read_string(LS, LS->current, seminfo);
        return TK_STRING;
      }

      // added!!!
      //------------------------------
      case '|': {
        LS->current = '�';
        return TK_CONCAT;
      }

      case '�': {
        read_long_string(LS, seminfo);
        return TK_STRING;
      }
      //------------------------------

      case '.': {
        next(LS);
        if (LS->current == '.') {
          next(LS);
          if (LS->current == '.') {
            next(LS);
            return TK_DOTS;   /* ... */
          }
          else return TK_CONCAT;   /* .. */
        }

        else if (!isdigit(LS->current)) return '.';
        else {
          read_numeral(LS, 1, seminfo);
          return TK_NUMBER;
        }
      }
      case EOZ: {
        return TK_EOS;
      }
      default: {
        if (isspace(LS->current)) {
          next(LS);
          continue;
        }
        else if (isdigit(LS->current)) {
          read_numeral(LS, 0, seminfo);
          return TK_NUMBER;
        }
        else if (isalpha(LS->current) || LS->current == '_') {
          /* identifier or reserved word */
          size_t l = readname(LS);
          TString *ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff), l);
          if (ts->tsv.reserved > 0)  /* reserved word? */
            return ts->tsv.reserved - 1 + FIRST_RESERVED;
          seminfo->ts = ts;
          return TK_NAME;
        }
        else {
          int c = LS->current;
          if (iscntrl(c))
            luaX_error(LS, "invalid control char",
                           luaO_pushfstring(LS->L, "char(%d)", c));
          next(LS);
          return c;  /* single-char tokens (+ - / ...) */
        }
      }
    }
  }
}


static void read_long_string (LexState *LS, SemInfo *seminfo) {
  int cont = 0;
  size_t l = 0;
  checkbuffer(LS, l);
  save(LS, '[', l);  /* save first `[' */
  save_and_next(LS, l);  /* pass the second `[' */
  if (LS->current == '\n')  /* string starts with a newline? */
    inclinenumber(LS);  /* skip it */
  for (;;) {
    checkbuffer(LS, l);
    switch (LS->current) {
      case EOZ:
        save(LS, '\0', l);
        luaX_lexerror(LS, (seminfo) ? "unfinished long string" :
                                   "unfinished long comment", TK_EOS);
        break;  /* to avoid warnings */
      case '[':
        save_and_next(LS, l);
        if (LS->current == '[') {
          cont++;
          save_and_next(LS, l);
        }
        continue;
      case ']':
        save_and_next(LS, l);
        if (LS->current == ']') {
          if (cont == 0) goto endloop;
          cont--;
          save_and_next(LS, l);
        }
        continue;

      // added
      //------------------------------
      case '|':
        save(LS, ']', l);

        LS->lookahead.token = TK_CONCAT;
        goto endloop;
        continue;
      //------------------------------

      case '\n':
        save(LS, '\n', l);
        inclinenumber(LS);
        if (!seminfo) l = 0;  /* reset buffer to avoid wasting space */
        continue;
      default:
        save_and_next(LS, l);
    }
  } endloop:
  save_and_next(LS, l);  /* skip the second `]' */
  save(LS, '\0', l);
  if (seminfo)
    seminfo->ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff) + 2, l - 5);
}

--Sam Lie

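Note: the above patch is broken in 5.1. Be sure that f [[Hello |name|, welcome to |get_company_name()|. ]] translates to f([[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]). Alternately, make it translate to f([[Hello ]], name, [[, welcome to ]], get_company_name(), [[. ]]). Maybe use [[ ]] rather than | | to break out of the string, since nested [[ ]] is deprecated in Lua 5.1 and raises an error by default, so we are free to redefine its semantics. --DavidManura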

Metalua

For a MetaLua implementation, see "String Interpolation" in MetaLuaRecipes.

Var Expand

VarExpand - an advanced version of bash-like inline variable expansion.

Custom searcher that preprocesses

See [gist1338609] (--DavidManura), which installs a custom searcher function that preprocesses modules being loaded. The example preprocessor given does string interpolation:

--! code = require 'interpolate' (code)

local M = {}

local function printf(s, ...)
  local vals = {...}
  local i = 0
  s = s:gsub('\0[^\0]*\0', function()
    i = i + 1
    return tostring(vals[i])
  end)
  print(s)
end

function M.test()
  local x = 16
  printf("value is $(math.sqrt(x)) ")
end

return M
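
The gist itself is not reproduced here. As a rough illustration of the general idea only, the following sketch installs a searcher that preprocesses module source before compiling it. Everything in it is an assumption made for the example: the in-memory sources table stands in for files on disk, and the naive preprocess function assumes every $(...) appears inside a double-quoted string literal.

```lua
-- Hypothetical in-memory module table standing in for files on disk.
local sources = {}
sources.demo = [[
local M = {}
function M.describe(x)
  return "x is $(x), x + 1 is $(x + 1)"
end
return M
]]

-- Naive preprocessor: turns $(expr) into a concatenation with tostring(expr).
local function preprocess(src)
  return (src:gsub("%$(%b())", function(expr)
    return '" .. tostring' .. expr .. ' .. "'
  end))
end

-- package.searchers in Lua 5.2+, package.loaders in Lua 5.1.
local searchers = package.searchers or package.loaders
table.insert(searchers, 2, function(name)
  local src = sources[name]
  if not src then
    return "\n\tno in-memory source '" .. name .. "'"
  end
  local chunk, err = (loadstring or load)(preprocess(src), "=" .. name)
  if not chunk then error(err, 0) end
  return chunk   -- require calls this loader to obtain the module
end)

print(require("demo").describe(16))
-- prints: x is 16, x + 1 is 17
```

Because the expressions are spliced into the module source before compilation, they see the locals of the surrounding function without any use of the debug library.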

Other Ideas

Other Possible Applications

Embedding expressions inside strings can have the following applications.


Last edited January 21, 2023 12:57 am GMT