String Interpolation

When variables need to be interpolated into strings, the resulting quoting and unquoting can become slightly unwieldy:

print("Hello " .. name .. ", the value of key " .. k .. " is " .. v .. "!")

Compare this with Perl, where variables can be embedded directly in strings:

print "Hello $name, the value of key $k is $v!\n";

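The complaint about Lua is that the quoting is verbose and can be harder to read, for example when visually distinguishing which text is inside or outside the quotes. Besides using an editor with syntax highlighting, the latter issue can be improved with the bracketed quoting style: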

print([[Hello ]] .. name .. [[, the value of key ]] ..
      k .. [[ is ]] .. v .. [[!]])

Using string.format can also make this terser:

print(string.format("Hello %s, the value of key %s is %s", name, k, v))

possibly with a helper function:

function printf(...) print(string.format(...)) end

printf("Hello %s, the value of key %s is %s", name, k, v)

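The new problem this introduces is that the variables are identified by position, which hurts readability and maintainability when the number of variables is large.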

The following solutions show how to implement support for interpolating variables into strings in Lua, to achieve a syntax somewhat like this:

printf("Hello %(name), the value of key %(k) is %(v)")

Solution: Named Parameters in Table

Here is one simple implementation (-- RiciLake):

function interp(s, tab)
  return (s:gsub('($%b{})', function(w) return tab[w:sub(3, -2)] or w end))
end
print( interp("${name} is ${value}", {name = "foo", value = "bar"}) )

getmetatable("").__mod = interp
print( "${name} is ${value}" % {name = "foo", value = "bar"} )
-- Outputs "foo is bar"
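
One edge case is worth noting. Because of the `or w`, a key whose value is false is treated like a missing key, and a value of true makes gsub raise an error, since a replacement function may only return a string, a number, false or nil. Below is a minimal sketch of a variant that handles both cases. The name interp_strict is hypothetical and not part of the original implementation.

```lua
-- Hypothetical variant of interp: converts every value it finds with
-- tostring, and leaves the placeholder alone only when the key is absent.
function interp_strict(s, tab)
  return (s:gsub('($%b{})', function(w)
    local value = tab[w:sub(3, -2)]     -- strip the leading "${" and trailing "}"
    if value == nil then return w end   -- unknown key: keep "${key}" as written
    return tostring(value)              -- numbers, booleans, etc. become text
  end))
end

print(interp_strict("${name} is ${ok}, ${missing} stays",
                    {name = "foo", ok = false}))
-- Outputs "foo is false, ${missing} stays"
```

Whether an unknown key should be kept, replaced by an empty string, or reported as an error is a design choice. The sketch keeps the placeholder, as the original does.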

Solution: Named Parameters with Formatting Codes

Here is another implementation (-- RiciLake) supporting Pythonic formatting specifications (requires Lua 5.1 or greater):

function interp(s, tab)
  return (s:gsub('%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])',
            function(k, fmt) return tab[k] and ("%"..fmt):format(tab[k]) or
                '%('..k..')'..fmt end))
end
getmetatable("").__mod = interp
print( "%(key)s is %(val)7.2f%" % {key = "concentration", val = 56.2795} )
-- outputs "concentration is   56.28%"
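
As a further usage sketch, the same pattern handles flags, widths and precisions, and leaves placeholders for unknown keys untouched. The function is repeated under the hypothetical name interp_fmt so that the block is self-contained and does not modify the string metatable. The table contents are made up for illustration.

```lua
-- Same logic as interp above, copied under a hypothetical name.
function interp_fmt(s, tab)
  return (s:gsub('%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])',
    function(k, fmt)
      -- Format the value if the key exists, else rebuild the placeholder.
      return tab[k] and ("%" .. fmt):format(tab[k]) or '%(' .. k .. ')' .. fmt
    end))
end

local row = {item = "widget", qty = 7, price = 3.5}
print(interp_fmt("%(item)-10s|%(qty)03d|%(price)8.2f|%(unknown)s", row))
-- Outputs "widget    |007|    3.50|%(unknown)s"
```

Note that the key pattern %a%w* accepts only letters and digits, so a key containing an underscore would not be matched.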

Solution: Named Parameters and Format String in Same Table

Here is another Lua-only solution (-- MarkEdgar):

function replace_vars(str, vars)
  -- Allow replace_vars{str, vars} syntax as well as replace_vars(str, {vars})
  if not vars then
    vars = str
    str = vars[1]
  end
  return (string.gsub(str, "({([^}]+)})",
    function(whole,i)
      return vars[i] or whole
    end))
end

-- Example:
output = replace_vars{
	[[Hello {name}, welcome to {company}. ]],
	name = name,
	company = get_company_name()
}
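
The example above assumes that name and get_company_name() already exist. Below is a self-contained sketch with hypothetical stand-ins for both, which also shows that an unknown {key} is left in place.

```lua
function replace_vars(str, vars)
  -- Allow replace_vars{str, vars} syntax as well as replace_vars(str, {vars})
  if not vars then
    vars = str
    str = vars[1]
  end
  return (string.gsub(str, "({([^}]+)})", function(whole, key)
    return vars[key] or whole
  end))
end

-- Hypothetical stand-ins for the values the example above relies on.
local name = "Ada"
local function get_company_name() return "Example Corp" end

local output = replace_vars{
  [[Hello {name}, welcome to {company}. {unknown} is kept.]],
  name = name,
  company = get_company_name(),
}
print(output)
-- Outputs "Hello Ada, welcome to Example Corp. {unknown} is kept."
```

Keeping the format string at index 1 and the named values in the same table is what makes the single-argument call syntax possible.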

Solution: Ruby- and Python-like String Formatting with the % Operator

Both Ruby and Python have a short form for string formatting, using the % operator.

The following snippet adds a similar use of the mod operator to Lua:

getmetatable("").__mod = function(a, b)
        if not b then
                return a
        elseif type(b) == "table" then
                -- unpack is a global in Lua 5.1 and table.unpack in 5.2+
                return string.format(a, (table.unpack or unpack)(b))
        else
                return string.format(a, b)
        end
end

Example usage:

print( "%5.2f" % math.pi )

print( "%-10.10s %04d" % { "test", 123 } )

You might like or dislike this notation; choose for yourself.

Hack: Using debug to Access Lexicals

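Below is a more complex implementation (-- DavidManura). It uses the debug library (particularly debug.getlocal()) to query local variables, which may be undesirable for several reasons (-- RiciLake). First, the debug library can be used to break into things you shouldn't break into, so exposing it is a bad idea wherever untrusted code is run. debug.getlocal() is also expensive, since it needs to scan through the entire bytecode to work out which variables are in scope. It also does not capture closed-over variables (upvalues).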

Code:

-- "nil" value that can be stored in tables.
local mynil_mt = {__tostring = function() return tostring(nil) end}
local mynil = setmetatable({}, mynil_mt)

-- Retrieves table of all local variables (name, value)
-- in given function <func>.  If a value is nil, it instead
-- stores the value <mynil> in the table to distinguish a
-- local variable that is nil from the local variable not
-- existing.
-- If a number is given in place of <func>, then it
-- uses that level in the call stack.  Level 1 is the
-- function that called get_locals.
-- Note: this correctly handles the case where two locals have the
-- same name: "local x = 1 ... get_locals(1) ... local x = 2".
-- This function is similar and is based on debug.getlocal().
function get_locals(func)
  local n = 1
  local locals = {}
  func = (type(func) == "number") and func + 1 or func
  while true do
    local lname, lvalue = debug.getlocal(func, n)
    if lname == nil then break end  -- end of list
    if lvalue == nil then lvalue = mynil end  -- replace
    locals[lname] = lvalue
    n = n + 1
  end
  return locals
end


-- Interpolates variables into string <str>.
-- Variables are defined in table <table>.  If <table> is
-- omitted, then it uses local and global variables in the
-- calling function.
-- Option level indicates the level in the call stack to
-- obtain local variable from (1 if omitted).
function interp(str, table, level)
  local use_locals = (table == nil)
  -- Note: getfenv exists only in Lua 5.1; it was removed in Lua 5.2.
  table = table or getfenv(2)
  if use_locals then
    level = level or 1
    local locals = get_locals(level + 1)
    table = setmetatable(locals, {__index = table})
  end
  local out = string.gsub(str, '$(%b{})',
    function(w)
      local variable_name = string.sub(w, 2, -2)
      local variable_value = table[variable_name]
      if variable_value == mynil then variable_value = nil end
      return tostring(variable_value)
    end
  )
  return out
end

-- Interpolating print.
-- This is just a wrapper around print and interp.
-- It only accepts a single string argument.
function printi(str)
  print(interp(str, nil, 2))
end

-- Pythonic "%" operator for string interpolation.
getmetatable("").__mod = interp

Test:

-- test globals
x=123
assert(interp "x = ${x}" == "x = 123")

-- test table
assert(interp("x = ${x}", {x = 234}) == "x = 234")

-- test locals (which override globals)
do
  local x = 3
  assert(interp "x = ${x}" == "x = 3")
end

-- test globals using setfenv
function test()
  assert(interp "y = ${y}" == "y = 123")
end
local env = {y = 123}
setmetatable(env, {__index = _G})
setfenv(test, env)
test()

-- test of multiple locals of same name
do
  local z = 1
  local z = 2
  assert(interp "z = ${z}" == "z = 2")
  local z = 3
end

-- test of locals with nil value
do
  z = 2
  local z = 1
  local z = nil
  assert(interp "z = ${z}" == "z = nil")
end

-- test of printi
x = 123
for k, v in ipairs {3,4} do
  printi("${x} - The value of key ${k} is ${v}")
end

-- test of "%" operator
assert("x = ${x}" % {x = 2} == "x = 2")
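
As noted earlier, the implementation above does not see upvalues. Below is a minimal sketch of how they could be collected with debug.getupvalue. The names get_upvalues and make_greeter are hypothetical and not part of the original code. One caveat applies: a closure captures only the outer locals that its body actually mentions, so a variable that appears only inside an interpolated string is not an upvalue and cannot be found this way.

```lua
-- Hypothetical helper: collects the upvalues of the function running at
-- the given call-stack level (1 = the function that called get_upvalues).
-- Upvalues whose value is nil are not recorded in the result table.
function get_upvalues(level)
  local info = debug.getinfo(level + 1, "f")
  local upvalues = {}
  local i = 1
  while true do
    local uname, uvalue = debug.getupvalue(info.func, i)
    if uname == nil then break end  -- end of list
    upvalues[uname] = uvalue
    i = i + 1
  end
  return upvalues
end

function make_greeter()
  local greeting = "Hello"
  local unused = "never mentioned in the closure"
  return function(name)
    local _ = greeting  -- mentioning the variable is what makes it an upvalue
    local seen = get_upvalues(1)
    -- seen.unused is nil: the closure never refers to that local.
    return tostring(seen.greeting) .. ", " .. name, seen.unused
  end
end

print(make_greeter()("world"))
```

The resulting table could be chained in front of the globals with an __index metatable, in the same way interp chains the table returned by get_locals.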

Various enhancements could be made. For example,

v = {x = 2}
print(interp "v.x = ${v.x}")  -- not implemented
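
A minimal sketch of how such dotted names might be resolved is given here, assuming the variables are passed in an explicit table rather than looked up through the debug library. The names lookup_path and interp_path are hypothetical.

```lua
-- Hypothetical helper: looks up a dotted path such as "v.x" in table <tab>.
-- Every path component is used as a string key.
function lookup_path(tab, path)
  local value = tab
  for key in path:gmatch("[^%.]+") do
    if type(value) ~= "table" then return nil end  -- cannot index further
    value = value[key]
  end
  return value
end

-- Like interp with an explicit table, but accepts ${a.b.c} style names.
function interp_path(s, tab)
  return (s:gsub('$(%b{})', function(w)
    return tostring(lookup_path(tab, w:sub(2, -2)))
  end))
end

print(interp_path("v.x = ${v.x}", {v = {x = 2}}))
-- Outputs "v.x = 2"
```

As in the implementation above, a name that cannot be resolved is rendered as "nil".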

Patch to Lua

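One of the features I loved in Ruby and PHP is the ability to include variables inside strings, for example print "Hello ${Name}". The following patch does the same thing, but only for the doc string type, that is, strings starting with [[ and ending with ]]. It uses the "|" character to represent the opening and closing braces.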

To add variables inline, for example:

output = [[Hello |name|, welcome to |get_company_name()|. ]]

What the patch does is quite literally convert the above to:

output = [[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]
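
The same rewrite can be approximated in plain Lua without patching the lexer. This is only a sketch of the transformation, not the patch itself. The name bars_to_concat is hypothetical, and the sketch makes no attempt to handle text containing "]]" or expressions containing "|".

```lua
-- Hypothetical helper: rewrites "text |expr| text" into Lua source text that
-- concatenates long-string pieces with the embedded expressions.
function bars_to_concat(template)
  local pieces = {}
  local pos = 1
  while true do
    local open, close, expr = template:find("|([^|]*)|", pos)
    if not open then break end
    pieces[#pieces + 1] = "[[" .. template:sub(pos, open - 1) .. "]]"
    pieces[#pieces + 1] = expr
    pos = close + 1
  end
  pieces[#pieces + 1] = "[[" .. template:sub(pos) .. "]]"
  return table.concat(pieces, "..")
end

-- Hypothetical globals for the embedded expressions to refer to.
name = "Ada"
function get_company_name() return "Example Corp" end

local source = bars_to_concat("Hello |name|, welcome to |get_company_name()|. ")
print(source)

-- loadstring is the Lua 5.1 spelling; load accepts a string in 5.2+.
local chunk = assert((loadstring or load)("return " .. source))
print(chunk())
```

Because the generated code is compiled separately, it can see only global variables, whereas the lexer patch makes the expressions part of the enclosing source and so gives them access to locals.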

The following functions are updated in the llex.c file.

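Important note: somehow I needed another character to represent the closing brace inside the code, and I arbitrarily chose the character that appears as '�' in the code below. This means that if you have that character in a string (especially when using a foreign-language encoding), you will get a syntax error. I don't know whether there is a solution to this problem yet.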

int luaX_lex (LexState *LS, SemInfo *seminfo) {
  for (;;) {
    switch (LS->current) {

      case '\n': {
        inclinenumber(LS);
        continue;
      }
      case '-': {
        next(LS);
        if (LS->current != '-') return '-';
        /* else is a comment */
        next(LS);
        if (LS->current == '[' && (next(LS), LS->current == '['))
          read_long_string(LS, NULL);  /* long comment */
        else  /* short comment */
          while (LS->current != '\n' && LS->current != EOZ)
            next(LS);
        continue;
      }
      case '[': {
        next(LS);
        if (LS->current != '[') return '[';
        else {
          read_long_string(LS, seminfo);
          return TK_STRING;
        }
      }
      case '=': {
        next(LS);
        if (LS->current != '=') return '=';
        else { next(LS); return TK_EQ; }
      }
      case '<': {
        next(LS);
        if (LS->current != '=') return '<';
        else { next(LS); return TK_LE; }
      }
      case '>': {
        next(LS);
        if (LS->current != '=') return '>';
        else { next(LS); return TK_GE; }
      }
      case '~': {
        next(LS);
        if (LS->current != '=') return '~';
        else { next(LS); return TK_NE; }
      }
      case '"':
      case '\'': {
        read_string(LS, LS->current, seminfo);
        return TK_STRING;
      }

      /* added!!! */
      /*------------------------------*/
      case '|': {
        LS->current = '�';
        return TK_CONCAT;
      }

      case '�': {
        read_long_string(LS, seminfo);
        return TK_STRING;
      }
      /*------------------------------*/

      case '.': {
        next(LS);
        if (LS->current == '.') {
          next(LS);
          if (LS->current == '.') {
            next(LS);
            return TK_DOTS;   /* ... */
          }
          else return TK_CONCAT;   /* .. */
        }

        else if (!isdigit(LS->current)) return '.';
        else {
          read_numeral(LS, 1, seminfo);
          return TK_NUMBER;
        }
      }
      case EOZ: {
        return TK_EOS;
      }
      default: {
        if (isspace(LS->current)) {
          next(LS);
          continue;
        }
        else if (isdigit(LS->current)) {
          read_numeral(LS, 0, seminfo);
          return TK_NUMBER;
        }
        else if (isalpha(LS->current) || LS->current == '_') {
          /* identifier or reserved word */
          size_t l = readname(LS);
          TString *ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff), l);
          if (ts->tsv.reserved > 0)  /* reserved word? */
            return ts->tsv.reserved - 1 + FIRST_RESERVED;
          seminfo->ts = ts;
          return TK_NAME;
        }
        else {
          int c = LS->current;
          if (iscntrl(c))
            luaX_error(LS, "invalid control char",
                           luaO_pushfstring(LS->L, "char(%d)", c));
          next(LS);
          return c;  /* single-char tokens (+ - / ...) */
        }
      }
    }
  }
}


static void read_long_string (LexState *LS, SemInfo *seminfo) {
  int cont = 0;
  size_t l = 0;
  checkbuffer(LS, l);
  save(LS, '[', l);  /* save first `[' */
  save_and_next(LS, l);  /* pass the second `[' */
  if (LS->current == '\n')  /* string starts with a newline? */
    inclinenumber(LS);  /* skip it */
  for (;;) {
    checkbuffer(LS, l);
    switch (LS->current) {
      case EOZ:
        save(LS, '\0', l);
        luaX_lexerror(LS, (seminfo) ? "unfinished long string" :
                                   "unfinished long comment", TK_EOS);
        break;  /* to avoid warnings */
      case '[':
        save_and_next(LS, l);
        if (LS->current == '[') {
          cont++;
          save_and_next(LS, l);
        }
        continue;
      case ']':
        save_and_next(LS, l);
        if (LS->current == ']') {
          if (cont == 0) goto endloop;
          cont--;
          save_and_next(LS, l);
        }
        continue;

      /* added */
      /*------------------------------*/
      case '|':
        save(LS, ']', l);

        LS->lookahead.token = TK_CONCAT;
        goto endloop;
        continue;
      /*------------------------------*/

      case '\n':
        save(LS, '\n', l);
        inclinenumber(LS);
        if (!seminfo) l = 0;  /* reset buffer to avoid wasting space */
        continue;
      default:
        save_and_next(LS, l);
    }
  } endloop:
  save_and_next(LS, l);  /* skip the second `]' */
  save(LS, '\0', l);
  if (seminfo)
    seminfo->ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff) + 2, l - 5);
}

--Sam Lie

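Note: the above patch does not work in 5.1. Make sure that f [[Hello |name|, welcome to |get_company_name()|. ]] translates to f([[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]). Or perhaps translate it to f([[Hello ]], name, [[, welcome to ]], get_company_name(), [[. ]]). Maybe use [[ ]] rather than | | to escape out of the string, since nested [[ ]] are deprecated in Lua 5.1 and raise an error by default, so we are free to redefine their semantics. --DavidManura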

MetaLua

For a MetaLua implementation, see "String Interpolation" in MetaLuaRecipes.

Var Expand

VarExpand - an advanced version of bash-like inline variable expansion.

Custom Searcher with Preprocessing

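See [gist1338609] (--DavidManura), which installs a custom searcher function that preprocesses modules as they are loaded. The example preprocessor provided performs string interpolation: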

--! code = require 'interpolate' (code)

local M = {}

local function printf(s, ...)
  local vals = {...}
  local i = 0
  s = s:gsub('\0[^\0]*\0', function()
    i = i + 1
    return tostring(vals[i])
  end)
  print(s)
end

function M.test()
  local x = 16
  printf("value is $(math.sqrt(x)) ")
end

return M
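
For printf above to work, the preprocessor has to turn each $(expr) inside a string literal into a \0-delimited marker and append the expression as an extra argument. The sketch below is a guess at such a rewrite and is not the code from the gist. The name interpolate_source is hypothetical, and only double-quoted, single-line literals without embedded quotes are handled.

```lua
-- Hypothetical preprocessor: rewrites "... $(expr) ..." into
-- "... \000expr\000 ...", expr so that a printf like the one above can
-- substitute the evaluated arguments for the markers, in order.
function interpolate_source(code)
  return (code:gsub('"([^"\n]*)"', function(body)
    local exprs = {}
    local marked = body:gsub('%$(%b())', function(group)
      local expr = group:sub(2, -2)       -- strip the outer parentheses
      exprs[#exprs + 1] = expr
      -- Emit the three-digit escape \000 so that a following digit cannot
      -- be absorbed into the escape sequence.
      return '\\000' .. expr .. '\\000'
    end)
    if #exprs == 0 then return nil end    -- no $(...): keep the literal as is
    return '"' .. marked .. '", ' .. table.concat(exprs, ', ')
  end))
end

print(interpolate_source('printf("value is $(math.sqrt(x)) ")'))
```

The printed source is printf("value is \000math.sqrt(x)\000 ", math.sqrt(x)), so the embedded expression is evaluated in the scope of the call and can refer to local variables such as x.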

Other Ideas

Other Possible Applications

Embedding expressions inside strings can have these applications:


Last edited January 20, 2023 6:57 pm GMT