Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

When answering this question, I wrote this code to iterate over the UTF-8 byte sequences in a string:

local str = "KORYTNAČKA"
for c in str:gmatch("[\0-\x7F\xC2-\xF4][\x80-\xBF]*") do
    print(c)
end

It works in Lua 5.2, but in Lua 5.1, it reports an error:

malformed pattern (missing ']')

As I recall, the string escape \xhh is not supported in Lua 5.1, so I changed it to use decimal escapes:

local str = "KORYTNAČKA"
for c in str:gmatch("[\0-\127\194-\244][\128-\191]*") do
    print(c)
end

But the error stays the same. How can I fix it?


1 Answer

I strongly suspect this happens because of the \0 in the pattern. Basically, the string that holds your pattern is treated as null-terminated, so it ends earlier than it should, and what the Lua pattern matcher actually parses is just: [. That is clearly a malformed pattern and should trigger the error you are getting.
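
One way to check this suspicion, as a rough sketch: feed a much smaller pattern containing a zero byte to string.find through pcall. The subject string "abc" and the pattern below are illustrative only and are not taken from the question.

```lua
-- Sketch: probe whether this Lua build accepts a zero byte inside a pattern.
-- The subject "abc" and the tiny pattern are illustrative assumptions.
local ok, err = pcall(string.find, "abc", "[\0-\127]")
if ok then
    -- Lua 5.2 and later: patterns may contain "\0".
    print("pattern accepted")
else
    -- Lua 5.1: the matcher stops at the zero byte and sees only "[".
    print("pattern rejected: " .. tostring(err))
end
```

On Lua 5.1 this should take the second branch and report the same "malformed pattern" message as in the question.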

To test this, I made a small change to the pattern:

local str = "KORYTNAČKA"
for c in str:gmatch("[\x0-\x7F\xC2-\xF4][\x80-\xBF]*") do
    print(c)
end

That compiled and ran as expected on Lua 5.1.4.

Note: I have not actually looked at what the pattern does. I just got rid of the \0 by writing it as \x0. Since Lua 5.1 does not recognize \x escapes, the output of the modified code might not be what you expect.

Edit: As a workaround, you might consider replacing \0 with \\0 (to avoid the null termination) in your second code example:

local str = "KORYTNAČKA"
for c in str:gmatch("[\\0-\127\194-\244][\128-\191]*") do
    print(c)
end
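
Another option, offered as a sketch rather than a definitive fix: the Lua 5.1 reference manual says a pattern cannot contain embedded zeros and suggests the %z class instead, which matches the zero byte. The helper name utf8_chars is hypothetical, and the test string is written with decimal escapes so it does not depend on the source file's encoding.

```lua
-- Sketch: split a string into UTF-8 byte sequences on Lua 5.1.
-- %z stands for the zero byte, which a 5.1 pattern cannot contain literally.
-- utf8_chars is a hypothetical helper name used for illustration.
local function utf8_chars(s)
    local out = {}
    for c in s:gmatch("[%z\1-\127\194-\244][\128-\191]*") do
        out[#out + 1] = c
    end
    return out
end

-- "\196\140" is the two-byte UTF-8 encoding of U+010C (Č).
local chars = utf8_chars("KORYTNA\196\140KA")
print(#chars)  -- 10
for _, c in ipairs(chars) do
    print(c)
end
```

%z is marked as deprecated from Lua 5.2 on, where patterns may contain "\0" directly, so this form is mainly of interest when the code has to run on 5.1.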
