[CS] Let's Build Our Own Interpreter! (1): Building a Lexer

🔊 This post is a set of personal notes written after reading the book Writing An Interpreter In Go (published in Korean as 밑바닥부터 만드는 인터프리터 in go). All material used in this post was reworked and written up by the author.
 
I recently started dabbling in C, and it left me thirsty to study low-level programming more. But once I had learned the basics of C, the most frustrating part was that I rarely touch C at work, so there was no project worth doing with it. Of course I could read the source code of CPython, one of the implementations of Python, but that would only be "looking at it"; it did not feel like it would produce anything tangible or really sink in as my own knowledge. Meanwhile, working with Kubernetes and with Milvus, a vector database written in Go (although its search engine is written in the C family), made me feel the need to study Go. I also wanted to study more CS and was recommended the book below, and then I discovered that the book is written in Go! So I decided to study it in depth: understanding CS concepts at the source-code level while learning Go, the very language I wanted to pick up. So, once again, let's take it slowly, one step at a time!
 

[Image: book cover. Source: Yes24]


The Python most people use, that is, Python as implemented by CPython, is classed as an interpreted language. Internally there is a step that compiles the code to bytecode, but it is still not a compiled language in the sense that C or Go is, where the whole program is compiled to native code ahead of time (Java sits somewhere in between, since it also compiles to bytecode). As a developer who has mainly used Python, I was curious: how on earth is an interpreter built?
 
An interpreter, in itself, is really quite simple. As the English word 'interpret' suggests, it "interprets, that is, translates, something". In other words, the moment it receives some input, some output pops right out. This is what we usually call an interactive interpreter: you type the python command in a terminal to start it, and every time you enter code the result is printed to the console. So how does an interpreter that performs such a simple job actually work? That is what we are going to find out.
 
Of course, not every interpreter simply translates as just described. CPython, too, converts code to bytecode internally (the files produced by this step have the .pyc extension). In other words, it does not evaluate the input directly; it first compiles it into an internal representation called bytecode and then evaluates that. More advanced still are JIT (Just-in-Time) interpreters, which compile the input to native machine code on the spot and then execute it.
 
We will not study how such production-grade, advanced interpreters work. Instead, we will learn the principles by building an interpreter ourselves that rests on simpler ideas, even though it will not perform as well. The approach is to parse the source code the user enters, build an abstract syntax tree (AST) from it, and evaluate that tree: a so-called 'tree-walking interpreter'.
 
To implement a tree-walking interpreter we need to build a lexer, a parser, a tree representation, and an evaluator. In this post we build the first of these components: the lexer.
 
For what it's worth, the original author says the book can be understood without knowing any Go at all. But, as the Korean translator also mentions in the book, I recommend getting the basics of Go down before reading it. I picked up the basics of Go quickly before starting, and that made the reading much smoother. Personally, I recommend at least working through the Tutorials in the official Go documentation and A Tour of Go alongside the book.


1. Lexical Analysis: Analyzing the Vocabulary of Source Code

To do anything with source code, we first need to convert it into a form that is easier to work with. That conversion takes the steps shown in the figure below.
 

[Figure: how source code becomes an abstract syntax tree (source code → lexing → tokens → parsing → AST)]

 
First, the source code has to be split into a number of tokens. Now that we are in the era of LLMs, the word 'token' is probably familiar to most people, and the meaning here is similar to that of a token in natural language processing. After all, source code is just a string. That string has to be chopped into pieces of some unit, and in interpreter terminology this process is called lexical analysis, or lexing. The component that performs lexing is the lexer.
 
The finely chopped tokens are then converted into a tree-shaped data structure, the abstract syntax tree. This is called parsing, and the component that performs it is the parser.
 
Of the steps above, what we build in this post is the lexer, which performs the lexing.

2. First Things First: Defining Tokens

Before we start lexing in earnest, the first thing to do is define the tokens in advance. Looking back at the figure from the previous section, this may seem odd: if tokens are what comes out of lexing, why define them first? The reason is that defining tokens up front gives the computer a guide for mapping "this string means this". To put it in machine-learning terms: when doing natural language processing or encoding categorical features, we map each word to a number, and the data structure that holds this word-to-number mapping is called a vocabulary.
 
In other words, in order to tell the computer things like "if the character '=' appears in the source code, it means a value is being assigned" or "if the character '+' appears, it means addition", we define in advance the tokens that state what each string appearing in the source code means. As we keep extending the interpreter, making the computer understand a string it did not understand before begins with adding it to this set of tokens.
 
Now, to define the tokens, let's first look at some source code in our own programming language. It is written in Monkey, the language introduced in the book: not a language in production use anywhere, but one of our own. We are going to build a lexer that lexes source code written in Monkey.
 

let five = 5;
let ten = 10;

let add = fn(x, y) {
  x + y;
};

let result = add(5, 10);

 
Looking at the source code above, which tokens do we need to define in advance?

  • 5, 10๊ณผ ๊ฐ™์€ ๊ฐ’์„ ๋ณด๊ณ  ์ˆซ์ž ํ† ํฐ ํƒ€์ž…์ด ํ•„์š”. ์ฐธ๊ณ ๋กœ ์ •์ˆ˜ํ˜•์— ๋Œ€ํ•œ ํ† ํฐ์ด๋ฉด ๋จ. ๋ ‰์„œ์™€ ํŒŒ์„œ๋Š” ์ˆซ์ž์ธ์ง€๋งŒ ์‹๋ณ„ํ•˜๋ฉด ๋จ. ์ˆซ์ž๊ฐ€ 5์ธ์ง€ 10์ธ์ง€๋Š” ์‹ ๊ฒฝ์“ฐ์ง€ ์•Š์•„๋„ ๋จ
  • x, y, five, ten, add, result ๋ฅผ ๋ณด๊ณ  ๋ณ€์ˆ˜ ์ด๋ฆ„์ด๋ผ๊ณ  ์ถ”์ธกํ•  ์ˆ˜ ์žˆ๋‹ค. ์ฆ‰, ์‹๋ณ„์ž ์ค‘ '์‚ฌ์šฉ์ž ์ •์˜ ์‹๋ณ„์ž'๋ฅผ ์œ„ํ•œ ํ† ํฐ์ด ํ•„์š”
  • fn, let ๊ณผ ๊ฐ™์ด ์‹๋ณ„์ž์ด๊ธด ํ•˜์ง€๋งŒ, ๋ณ€์ˆ˜๊ฐ€ ์•„๋‹Œ ์ผ๋ช… ์˜ˆ์•ฝ์–ด์— ๋Œ€ํ•œ ํ† ํฐ์ด ํ•„์š”
  • { } , ( ) ; = ๊ณผ ๊ฐ™์ด ํŠน์ˆ˜๋ฌธ์ž์— ๋Œ€ํ•œ ํ† ํฐ์ด ํ•„์š”

Now let's define the tokens. We use a struct to hold a token so that the lexer can work with it.
 

// token/token.go

package token

type TokenType string

type Token struct {
	Type    TokenType
	Literal string   // holds the piece of source text that was tokenized
}

const (
	ILLEGAL = "ILLEGAL"
	EOF     = "EOF"

	// Identifiers + literals
	IDENT = "IDENT" // add, result, x, y, ...
	INT   = "INT"   // 1343456

	// Operators
	ASSIGN   = "="
	PLUS     = "+"

	// Delimiters
	COMMA     = ","
	SEMICOLON = ";"

	LPAREN = "("
	RPAREN = ")"
	LBRACE = "{"
	RBRACE = "}"

	// Keywords
	FUNCTION = "FUNCTION"
	LET      = "LET"
)

 
Two of the tokens defined above are special: ILLEGAL and EOF. ILLEGAL is the type used when the lexer encounters a token or character it does not know. EOF is short for End Of File and, as the name says, marks the end of the input source code. It also serves as the signal that tells the parser we build later to stop parsing.
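To make the Token struct a little more concrete, here is a minimal sketch of the token stream we would hope to get for the single statement `let five = 5;`. The helper name expectedTokens is hypothetical, and the constants are given an explicit TokenType here so the file compiles on its own; neither detail comes from the book.

```go
package main

import "fmt"

type TokenType string

type Token struct {
	Type    TokenType
	Literal string
}

const (
	EOF       TokenType = "EOF"
	IDENT     TokenType = "IDENT"
	INT       TokenType = "INT"
	ASSIGN    TokenType = "="
	SEMICOLON TokenType = ";"
	LET       TokenType = "LET"
)

// expectedTokens is a hypothetical helper: it hand-writes the token stream
// that a finished lexer should produce for the statement `let five = 5;`.
func expectedTokens() []Token {
	return []Token{
		{LET, "let"},
		{IDENT, "five"},
		{ASSIGN, "="},
		{INT, "5"},
		{SEMICOLON, ";"},
		{EOF, ""},
	}
}

func main() {
	// Print one token per line: its type, then its literal in quotes.
	for _, tok := range expectedTokens() {
		fmt.Printf("%-9s %q\n", tok.Type, tok.Literal)
	}
}
```

Note how the INT token carries "5" only as text in Literal; at this stage nobody has turned it into a number yet.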

2. Tokenizing Source Code: Lexing

Now it is time to build the lexer that actually performs the lexing. We start with something that can lex only very simple source code and make it more capable step by step. Before that, I introduce the test code that checks whether our lexer works. After that, we go through the functions used in the test one by one, as code snippets.
(Note that how to write tests in Go is not covered here. The official documentation explains it well, so be sure to read it if you are not familiar with it.)
 

// lexer/lexer_test.go

package lexer

import (
	"monkey/token"
	"testing"
)

func TestNextToken(t *testing.T) {
	input := `=+(){},;`

	tests := []struct {
		expectedType    token.TokenType
		expectedLiteral string
	}{
		{token.ASSIGN, "="},
		{token.PLUS, "+"},
		{token.LPAREN, "("},
		{token.RPAREN, ")"},
		{token.LBRACE, "{"},
		{token.RBRACE, "}"},
		{token.COMMA, ","},
		{token.SEMICOLON, ";"},
		{token.EOF, ""},
	}

	src := New(input)

	for i, tt := range tests {
		tok := src.NextToken()
		if tok.Type != tt.expectedType {
			t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q",
				i, tt.expectedType, tok.Type)
		}
		if tok.Literal != tt.expectedLiteral {
			t.Fatalf("tests[%d] - literal wrong. expected=%q, got=%q",
				i, tt.expectedLiteral, tok.Literal)
		}
	}
}

 
First, look at the input variable: it holds a string written with backticks (`). In Go, a backtick-quoted string is a raw string literal, so backslashes and line breaks inside it are taken literally. Either way, this backtick-quoted string is the source code our lexer will lex. That is, we will lex the following source code.

=+(){},;
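As a brief aside on the backtick syntax: the sketch below writes the same characters once as an ordinary (interpreted) string literal and once as a raw string literal, so the difference shows up in the byte counts. The function name rawAndInterpreted is made up for this illustration.

```go
package main

import "fmt"

// rawAndInterpreted returns the characters a, \, n, b written two ways:
// as an interpreted literal (the escape is processed) and as a raw literal
// (every character is kept verbatim).
func rawAndInterpreted() (string, string) {
	interpreted := "a\nb" // \n becomes a single newline byte: 3 bytes in total
	raw := `a\nb`         // backslash and 'n' stay as they are: 4 bytes in total
	return interpreted, raw
}

func main() {
	interpreted, raw := rawAndInterpreted()
	fmt.Println(len(interpreted), len(raw)) // 3 4
}
```

This is why the multi-line Monkey programs later in the post can be pasted into the test as-is.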

 
Next, the tests variable is defined as a slice of structs. The struct has two members: one of type token.TokenType, defined in the previous section, and one of type string. The slice is then filled with struct values, which are in fact the 'source code to be lexed' split into tokens ahead of time. In a sense, we have defined the answer key in advance. With this answer key, the for loop checks whether the lexer we built behaves correctly.
 
The src variable is the result of passing the source string to a function called New. This function builds the struct that makes the string lexable; it is introduced below. Now look at the for loop. It walks through the predefined expected tokens in tests one at a time and checks that our lexer lexes the source string correctly.
 
Now let's go through, one by one, the functions used in the test that have not been introduced yet.

2-1. Turning the Input into Something Lexable

First, the source string received as input (the input variable) has to be turned into something our lexer can lex. Let's start with the code.
 

// lexer/lexer.go

package lexer

type Lexer struct {
	input        string
	position     int
	readPosition int
	ch           byte
}

func New(input string) *Lexer {
	l := &Lexer{input: input}
	return l
}

 
First we define a struct named Lexer. It has four members, described below.

  • input : ์ž…๋ ฅ๋œ ์†Œ์Šค์ฝ”๋“œ ๋ฌธ์ž์—ด
  • position : ์ž…๋ ฅ๋œ ์†Œ์Šค์ฝ”๋“œ ๋ฌธ์ž์—ด์—์„œ ํ˜„์žฌ ์œ„์น˜
  • readPosition : ์ž…๋ ฅ๋œ ์†Œ์Šค์ฝ”๋“œ ๋ฌธ์ž์—ด์—์„œ ํ˜„์žฌ ์œ„์น˜์˜ ๋‹ค์Œ ์œ„์น˜
  • ch : ํ˜„์žฌ ๋ ‰์‹ฑํ•˜๊ณ  ์žˆ๋Š” ๋Œ€์ƒ ๋ฌธ์ž(๋ฌธ์ž์—ด ์•„๋‹˜! ์— ์ฃผ์˜. ๊ทธ๋ž˜์„œ ์œ„์—์„œ ํƒ€์ž…์„ byte๋กœ ์ •์˜ํ•จ. Go์–ธ์–ด์—์„œ byte๋Š” ๋ถ€ํ˜ธ๊ฐ€ ์—†๋Š” 8๋น„ํŠธ ์ฆ‰, uint8 ํƒ€์ž…์ด๋ฉฐ ๋‚˜ํƒ€๋‚ผ ์ˆ˜ ์žˆ๋Š” ์ˆซ์ž ๊ฒฝ์šฐ์˜ ์ˆ˜๊ฐ€ 0~255์ด๋‹ค. ์šฐ๋ฆฌ๊ฐ€ ๋งŒ๋“ค ๋ ‰์„œ๋Š” ASCII ๋ฌธ์ž ๋ฒ”์œ„ ๋‚ด์— ํ‘œ๊ธฐ๋˜๋Š” ์†Œ์Šค์ฝ”๋“œ๋ฅผ ๋ ‰์‹ฑํ•˜๋Š” ๊ฒƒ์ด ๋ชฉํ‘œ์ด๊ธฐ ๋•Œ๋ฌธ์— rune(32๋น„ํŠธ) ํƒ€์ž…์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š์Œ)

Next, look at the New function. It takes the input source string as-is and wraps it in a Lexer struct. Note that what it returns is a pointer to the struct.
 
We are now going to add one more function call to New. Let's look at the code first.
 

// lexer/lexer.go

func New(input string) *Lexer {
	l := &Lexer{input: input}
	l.readChar()
	return l
}

func (l *Lexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition += 1
}

 
New now calls readChar, a method whose receiver is a pointer to the Lexer struct. What does readChar do? As the name says, it reads a character. Look at the if/else branch. If the Lexer's readPosition is greater than or equal to the length of the source string, that is, if readPosition has run past the end of the string, the character being lexed (the ch member) is set to 0, the ASCII NUL byte.
 
Otherwise, the character to be lexed is taken from the source string and assigned to ch. In both cases, position is then set to readPosition, and readPosition is advanced by one so that it points at the character after the one just read.
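To see how position, readPosition, and ch move together, here is a small trace over the two-character input "=+". The Lexer, New, and readChar are the ones from the post; snapshot is a hypothetical helper added only for printing.

```go
package main

import "fmt"

type Lexer struct {
	input        string
	position     int
	readPosition int
	ch           byte
}

func New(input string) *Lexer {
	l := &Lexer{input: input}
	l.readChar()
	return l
}

func (l *Lexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0 // past the end of the input
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

// snapshot is a hypothetical helper that formats the lexer's current state.
func (l *Lexer) snapshot() string {
	return fmt.Sprintf("position=%d readPosition=%d ch=%q",
		l.position, l.readPosition, l.ch)
}

func main() {
	l := New("=+")
	fmt.Println(l.snapshot()) // position=0 readPosition=1 ch='='
	l.readChar()
	fmt.Println(l.snapshot()) // position=1 readPosition=2 ch='+'
	l.readChar()
	fmt.Println(l.snapshot()) // position=2 readPosition=3 ch='\x00'
}
```

The last line shows the end-of-input case: ch becomes 0, which NextToken later turns into the EOF token.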

2-2. Tokenizing the Source String

So far we have worked out how to locate the character to lex within the source string. Now that we can locate it, we have to decide which token that character maps to. Again, let's look at the code first. (Because NextToken uses the token package, lexer.go also needs to import "monkey/token", just as the test file does.)
 

// lexer/lexer.go

func (l *Lexer) NextToken() token.Token {
	var tok token.Token
	
	switch l.ch {
	case '=':
		tok = newToken(token.ASSIGN, l.ch)
	case ';':
		tok = newToken(token.SEMICOLON, l.ch)
	case '(':
		tok = newToken(token.LPAREN, l.ch)
	case ')':
		tok = newToken(token.RPAREN, l.ch)
	case ',':
		tok = newToken(token.COMMA, l.ch)
	case '+':
		tok = newToken(token.PLUS, l.ch)
	case '{':
		tok = newToken(token.LBRACE, l.ch)
	case '}':
		tok = newToken(token.RBRACE, l.ch)
	case 0:
		tok.Type = token.EOF
		tok.Literal = ""
	}
	l.readChar()
	return tok
}

func newToken(tokenType token.TokenType, ch byte) token.Token {
	return token.Token{Type: tokenType, Literal: string(ch)}
}

 
First, look at newToken. It takes a token type and the character being lexed as arguments, builds a token.Token struct from them, and returns it.
 
Now look at NextToken. It checks which character the current character (ch) matches, converts it into the corresponding token, and returns that token. The point to note is that after the switch statement finishes, readChar is called once to move on to the next character.
 
Once you have written everything up to this function, run the lexer_test.go test file we looked at above. If no test failure is reported, it passed!
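Besides running the test, it can help to watch the lexer emit tokens. The sketch below squeezes the token definitions and the lexer from this section into a single file so it runs on its own. Tokenize is a hypothetical convenience wrapper, not something from the book, and the constants are typed explicitly to keep the file self-contained.

```go
package main

import "fmt"

type TokenType string

type Token struct {
	Type    TokenType
	Literal string
}

const (
	EOF       TokenType = "EOF"
	ASSIGN    TokenType = "="
	PLUS      TokenType = "+"
	COMMA     TokenType = ","
	SEMICOLON TokenType = ";"
	LPAREN    TokenType = "("
	RPAREN    TokenType = ")"
	LBRACE    TokenType = "{"
	RBRACE    TokenType = "}"
)

type Lexer struct {
	input        string
	position     int
	readPosition int
	ch           byte
}

func New(input string) *Lexer {
	l := &Lexer{input: input}
	l.readChar()
	return l
}

func (l *Lexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

func newToken(t TokenType, ch byte) Token {
	return Token{Type: t, Literal: string(ch)}
}

// NextToken handles only the single-character tokens from this section.
func (l *Lexer) NextToken() Token {
	var tok Token
	switch l.ch {
	case '=':
		tok = newToken(ASSIGN, l.ch)
	case '+':
		tok = newToken(PLUS, l.ch)
	case ',':
		tok = newToken(COMMA, l.ch)
	case ';':
		tok = newToken(SEMICOLON, l.ch)
	case '(':
		tok = newToken(LPAREN, l.ch)
	case ')':
		tok = newToken(RPAREN, l.ch)
	case '{':
		tok = newToken(LBRACE, l.ch)
	case '}':
		tok = newToken(RBRACE, l.ch)
	case 0:
		tok = Token{Type: EOF, Literal: ""}
	}
	l.readChar()
	return tok
}

// Tokenize is a hypothetical wrapper: it calls NextToken until EOF and
// returns every token, including the final EOF token.
func Tokenize(input string) []Token {
	l := New(input)
	var toks []Token
	for {
		tok := l.NextToken()
		toks = append(toks, tok)
		if tok.Type == EOF {
			return toks
		}
	}
}

func main() {
	for _, tok := range Tokenize("=+(){},;") {
		fmt.Printf("%q %q\n", tok.Type, tok.Literal)
	}
}
```

The loop ends because readChar eventually sets ch to 0, so NextToken is guaranteed to produce an EOF token sooner or later.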

3. Lexing More Realistic Source Code

In the previous section we lexed very simple source code. This time, let's improve the lexer so that it can lex the actual syntax of Monkey, our own programming language. Again, let's look at the test code first. Compared with the previous test, only the input and tests variables have changed.
 

// lexer/lexer_test.go

func TestNextToken(t *testing.T) {
	input := `let five = 5;
let ten = 10;

let add = fn(x, y) {
 x + y;
};
let result = add(five, ten);`

	tests := []struct {
		expectedType    token.TokenType
		expectedLiteral string
	}{
		{token.LET, "let"},
		{token.IDENT, "five"},
		{token.ASSIGN, "="},
		{token.INT, "5"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "ten"},
		{token.ASSIGN, "="},
		{token.INT, "10"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "add"},
		{token.ASSIGN, "="},
		{token.FUNCTION, "fn"},
		{token.LPAREN, "("},
		{token.IDENT, "x"},
		{token.COMMA, ","},
		{token.IDENT, "y"},
		{token.RPAREN, ")"},
		{token.LBRACE, "{"},
		{token.IDENT, "x"},
		{token.PLUS, "+"},
		{token.IDENT, "y"},
		{token.SEMICOLON, ";"},
		{token.RBRACE, "}"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "result"},
		{token.ASSIGN, "="},
		{token.IDENT, "add"},
		{token.LPAREN, "("},
		{token.IDENT, "five"},
		{token.COMMA, ","},
		{token.IDENT, "ten"},
		{token.RPAREN, ")"},
		{token.SEMICOLON, ";"},
		{token.EOF, ""},
	}

	src := New(input)

	for i, tt := range tests {
		tok := src.NextToken()
		if tok.Type != tt.expectedType {
			t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q",
				i, tt.expectedType, tok.Type)
		}
		if tok.Literal != tt.expectedLiteral {
			t.Fatalf("tests[%d] - literal wrong. expected=%q, got=%q",
				i, tt.expectedLiteral, tok.Literal)
		}
	}
}

 
The source string in input now looks quite like a real programming language! To match it, the tests variable gets the corresponding expected tokens.
 
The lexer we have built so far cannot handle this input. What it is missing are identifiers, keywords, and numbers. Identifiers fall broadly into user-defined identifiers (variable names and the like) and keywords. User-defined identifiers are things like five, ten, and add in the source code above. Keywords are things like let, which says "this is a variable", and fn, which says "this is a function". Finally, numbers are things like 5 and 10; when lexing, all that matters is recognizing that they are numbers, because whether the value is 5, 1, or 2 is not the lexer's concern.
 
Now let's add the improvements to the lexer one at a time.

3-1. Distinguishing User-Defined Identifiers, Keywords, and Numbers

The lexer must correctly tell apart user-defined identifiers, keywords, and numbers in order to lex the input source string properly. For that, NextToken has to be improved. Let's look at the improved NextToken first.
 

// lexer/lexer.go

func (l *Lexer) NextToken() token.Token {
	var tok token.Token

	switch l.ch {
	case '=':
		tok = newToken(token.ASSIGN, l.ch)
	case ';':
		tok = newToken(token.SEMICOLON, l.ch)
	case '(':
		tok = newToken(token.LPAREN, l.ch)
	case ')':
		tok = newToken(token.RPAREN, l.ch)
	case ',':
		tok = newToken(token.COMMA, l.ch)
	case '+':
		tok = newToken(token.PLUS, l.ch)
	case '{':
		tok = newToken(token.LBRACE, l.ch)
	case '}':
		tok = newToken(token.RBRACE, l.ch)
	case 0:
		tok.Type = token.EOF
		tok.Literal = ""
	default:
		if isLetter(l.ch) {
			tok.Literal = l.readIdentifier()
			tok.Type = token.LookupIdent(tok.Literal)
			return tok
		} else {
			tok = newToken(token.ILLEGAL, l.ch)
		}
	}
	l.readChar()
	return tok
}

 
The part that changed is the default branch, which is taken when none of the case clauses above match. The first thing that appears there is a function called isLetter. Let's see what it does.
 

// lexer/lexer.go

func isLetter(ch byte) bool {
	return 'a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z' || ch == '_'
}

 
It is simple: it takes a character of type byte and returns true only if the character is a letter a to z, A to Z, or an underscore (_). More important than what the function does, though, is understanding why isLetter needs to exist. To begin with, 'characters' include 'letters'. In Venn-diagram terms, the set of letters sits inside the set of characters.
 

 
So why do we need to separate characters that are 'letters' from 'characters that are not letters'? To pick out identifiers. For example, the =+(){},; we saw as input source code in the first example are 'characters'. Things like add, five, and ten are also strings made up of 'characters'. However, =+(){},; are 'characters that are not letters', whereas add, five, and ten consist of 'letters'. To tell the two apart, we have to determine whether a given character is a 'letter' or a 'character that is not a letter', and isLetter is the function that does exactly that.
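The split between letters and other characters can be made visible with a tiny experiment. isLetter is copied from the post; classify is a hypothetical helper that marks each byte of an input with L (letter) or . (not a letter).

```go
package main

import "fmt"

func isLetter(ch byte) bool {
	return 'a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z' || ch == '_'
}

// classify is a hypothetical helper: it marks each byte of the input with
// 'L' when isLetter accepts it and '.' when it does not.
func classify(input string) string {
	marks := make([]byte, len(input))
	for i := 0; i < len(input); i++ {
		if isLetter(input[i]) {
			marks[i] = 'L'
		} else {
			marks[i] = '.'
		}
	}
	return string(marks)
}

func main() {
	fmt.Println(classify("add(x_1);")) // LLL.LL...
}
```

One consequence worth noticing: digits are not letters under this definition, so with this lexer an identifier such as x_1 would be cut off after x_.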
 
Let's move on. The next function is readIdentifier, a method whose receiver is a pointer to the Lexer struct. Let's look at its source code first.
 

// lexer/lexer.go

func (l *Lexer) readIdentifier() string {
	position := l.position
	for isLetter(l.ch) {
		l.readChar()
	}
	return l.input[position:l.position]
}

 
As the name suggests, this function extracts an 'identifier' from the string being lexed (the input source string). Starting from the current position, it keeps looping for as long as the character at that position is a 'letter'. Inside the loop it calls readChar, which moves from the current position to the next one. When a 'character that is not a letter' comes up, the loop ends, and the function returns the slice of the input from the starting position up to, but not including, that character.
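A short run shows where readIdentifier leaves the lexer, which also explains a detail of NextToken: the identifier branch returns early instead of falling through to the readChar call at the bottom. The input string "five=5" is just an example chosen for this sketch.

```go
package main

import "fmt"

type Lexer struct {
	input        string
	position     int
	readPosition int
	ch           byte
}

func New(input string) *Lexer {
	l := &Lexer{input: input}
	l.readChar()
	return l
}

func (l *Lexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

func isLetter(ch byte) bool {
	return 'a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z' || ch == '_'
}

func (l *Lexer) readIdentifier() string {
	position := l.position // remember where the identifier starts
	for isLetter(l.ch) {
		l.readChar()
	}
	return l.input[position:l.position]
}

func main() {
	l := New("five=5")
	ident := l.readIdentifier()
	// After the call the lexer already sits on '=', the first non-letter.
	fmt.Printf("%q %d %q\n", ident, l.position, l.ch) // "five" 4 '='
}
```

Because the lexer is already positioned on the character after the identifier, calling readChar once more in NextToken would skip the '='. That is why the default branch returns tok directly.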
 
Finally, let's look at token.LookupIdent. This function decides whether an identifier is a user-defined identifier (IDENT) or a keyword. Here is its source code.
 

// token/token.go

var keywords = map[string]TokenType{
	"fn":  FUNCTION,
	"let": LET,
}

func LookupIdent(ident string) TokenType {
	if tok, ok := keywords[ident]; ok {
		return tok
	}
	return IDENT
}

 
The first step is to define a map called keywords. This map is a table recording which identifiers extracted from the source string are keywords; in other words, a lookup table. We will extend it later: whenever we want the lexer to recognize another keyword, we just add it to this map.
 
Now look at LookupIdent. The logic is simple. It takes a string and checks whether that string exists as a key in the keywords map, that is, whether it is in the list of keywords we specified in advance. If so, it returns the keyword's token type; if not, the string is treated as a user-defined identifier and IDENT is returned.
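A quick way to convince yourself of this behavior is to call LookupIdent with a few strings. The sketch below copies the map and function from the post into a standalone file (with explicitly typed constants so it compiles by itself); the sample inputs are arbitrary.

```go
package main

import "fmt"

type TokenType string

const (
	IDENT    TokenType = "IDENT"
	FUNCTION TokenType = "FUNCTION"
	LET      TokenType = "LET"
)

var keywords = map[string]TokenType{
	"fn":  FUNCTION,
	"let": LET,
}

// LookupIdent returns the keyword token type if ident is a keyword,
// and IDENT otherwise.
func LookupIdent(ident string) TokenType {
	if tok, ok := keywords[ident]; ok {
		return tok
	}
	return IDENT
}

func main() {
	for _, word := range []string{"let", "fn", "letter", "five"} {
		fmt.Println(word, LookupIdent(word))
	}
}
```

Note that "letter" comes back as IDENT even though it starts with "let": the map lookup uses the whole identifier, which is why readIdentifier has to read to the end of the word first.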
 
Finally, the else clause inside default exists because the case clauses already cover every 'character that is not a letter' that we handle. Any character that ends up in this else clause is therefore one our lexer cannot process, so as a form of error handling it is given the ILLEGAL token. If this token is returned, you can consider the lexer test to have failed.

3-2. Handling Whitespace and Numbers

The next improvement is to handle whitespace in the string being lexed and to recognize strings of digits (e.g. 5, 10) as number tokens. First, to handle whitespace, add the code below.
 

// lexer/lexer.go

func (l *Lexer) skipWhitespace() {
	for l.ch == ' ' || l.ch == '\t' || l.ch == '\n' || l.ch == '\r' {
		l.readChar()
	}
}

 
This function is simple too. As long as the current character is a space, a tab, a newline, or a carriage return (\r), it moves on to the next position.
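Here is a minimal check of skipWhitespace in isolation. The input mixes every whitespace character the function knows about; the string itself is only an example made up for this sketch.

```go
package main

import "fmt"

type Lexer struct {
	input        string
	position     int
	readPosition int
	ch           byte
}

func New(input string) *Lexer {
	l := &Lexer{input: input}
	l.readChar()
	return l
}

func (l *Lexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

func (l *Lexer) skipWhitespace() {
	for l.ch == ' ' || l.ch == '\t' || l.ch == '\n' || l.ch == '\r' {
		l.readChar()
	}
}

func main() {
	// Six whitespace bytes (space, tab, CR, LF, space, space), then "let".
	l := New(" \t\r\n  let")
	l.skipWhitespace()
	fmt.Printf("%d %q\n", l.position, l.ch) // 6 'l'
}
```

Calling skipWhitespace when the current character is not whitespace does nothing, so it is safe to call it at the start of every NextToken.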
 
Next, let's look at the code that recognizes numbers.
 

// lexer/lexer.go

func (l *Lexer) readNumber() string {
	position := l.position
	for isDigit(l.ch) {
		l.readChar()
	}
	return l.input[position:l.position]
}

func isDigit(ch byte) bool {
	return '0' <= ch && ch <= '9'
}

 
isDigit is simple enough that it needs no explanation. readNumber follows the same logic as readIdentifier, the function for extracting identifiers that we looked at above; the only difference is that the loop condition uses isDigit.
 
Adding these two functions to NextToken gives the following.
 

// lexer/lexer.go

func (l *Lexer) NextToken() token.Token {
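	var tok token.Token

	// Skip whitespace first; otherwise a space or newline would fall through
	// to the default branch and come back as ILLEGAL.
	l.skipWhitespace()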

	switch l.ch {
	case '=':
		tok = newToken(token.ASSIGN, l.ch)
	case ';':
		tok = newToken(token.SEMICOLON, l.ch)
	case '(':
		tok = newToken(token.LPAREN, l.ch)
	case ')':
		tok = newToken(token.RPAREN, l.ch)
	case ',':
		tok = newToken(token.COMMA, l.ch)
	case '+':
		tok = newToken(token.PLUS, l.ch)
	case '{':
		tok = newToken(token.LBRACE, l.ch)
	case '}':
		tok = newToken(token.RBRACE, l.ch)
	case 0:
		tok.Type = token.EOF
		tok.Literal = ""
	default:
		if isLetter(l.ch) {
			tok.Literal = l.readIdentifier()
			tok.Type = token.LookupIdent(tok.Literal)
			return tok
		} else if isDigit(l.ch) {
			tok.Literal = l.readNumber()
			tok.Type = token.INT
			return tok
		} else {
			tok = newToken(token.ILLEGAL, l.ch)
		}
	}
	l.readChar()
	return tok
}

 
With the code written as above, running the test should show that this fairly realistic Monkey source code is lexed correctly!

3-3. Lexing Two-Character Tokens and More Keywords

This time, let's lex Monkey source code that adds more kinds of keywords than the test in [Section 3-2] covered. Once again, we start with the test code.

 

// lexer/lexer_test.go

func TestNextToken(t *testing.T) {
	input := `let five = 5;
let ten = 10;

let add = fn(x, y) {
 x + y;
};
let result = add(five, ten);
!-/*5;
5 < 10 > 5;
if (5 < 10) {
  return true;
} else {
  return false;
}

10 == 10;
10 != 9;
`

	tests := []struct {
		expectedType    token.TokenType
		expectedLiteral string
	}{
		{token.LET, "let"},
		{token.IDENT, "five"},
		{token.ASSIGN, "="},
		{token.INT, "5"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "ten"},
		{token.ASSIGN, "="},
		{token.INT, "10"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "add"},
		{token.ASSIGN, "="},
		{token.FUNCTION, "fn"},
		{token.LPAREN, "("},
		{token.IDENT, "x"},
		{token.COMMA, ","},
		{token.IDENT, "y"},
		{token.RPAREN, ")"},
		{token.LBRACE, "{"},
		{token.IDENT, "x"},
		{token.PLUS, "+"},
		{token.IDENT, "y"},
		{token.SEMICOLON, ";"},
		{token.RBRACE, "}"},
		{token.SEMICOLON, ";"},
		{token.LET, "let"},
		{token.IDENT, "result"},
		{token.ASSIGN, "="},
		{token.IDENT, "add"},
		{token.LPAREN, "("},
		{token.IDENT, "five"},
		{token.COMMA, ","},
		{token.IDENT, "ten"},
		{token.RPAREN, ")"},
		{token.SEMICOLON, ";"},
		{token.BANG, "!"},
		{token.MINUS, "-"},
		{token.SLASH, "/"},
		{token.ASTERISK, "*"},
		{token.INT, "5"},
		{token.SEMICOLON, ";"},
		{token.INT, "5"},
		{token.LT, "<"},
		{token.INT, "10"},
		{token.GT, ">"},
		{token.INT, "5"},
		{token.SEMICOLON, ";"},
		{token.IF, "if"},
		{token.LPAREN, "("},
		{token.INT, "5"},
		{token.LT, "<"},
		{token.INT, "10"},
		{token.RPAREN, ")"},
		{token.LBRACE, "{"},
		{token.RETURN, "return"},
		{token.TRUE, "true"},
		{token.SEMICOLON, ";"},
		{token.RBRACE, "}"},
		{token.ELSE, "else"},
		{token.LBRACE, "{"},
		{token.RETURN, "return"},
		{token.FALSE, "false"},
		{token.SEMICOLON, ";"},
		{token.RBRACE, "}"},
		{token.INT, "10"},
		{token.EQ, "=="},
		{token.INT, "10"},
		{token.SEMICOLON, ";"},
		{token.INT, "10"},
		{token.NOT_EQ, "!="},
		{token.INT, "9"},
		{token.SEMICOLON, ";"},
		{token.EOF, ""},
	}

	src := New(input)

	for i, tt := range tests {
		tok := src.NextToken()
		if tok.Type != tt.expectedType {
			t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q",
				i, tt.expectedType, tok.Type)
		}
		if tok.Literal != tt.expectedLiteral {
			t.Fatalf("tests[%d] - literal wrong. expected=%q, got=%q",
				i, tt.expectedLiteral, tok.Literal)
		}
	}
}

 

Looking at the input source string, the main additions are the characters !-/*<>, the two-character tokens != and ==, and further keywords such as if, else, and return. Note that the source string contains the following part.

 

!-/*5;
5 < 10 > 5;

 

Of course, even for Monkey, our very own language, this is not grammatically valid. But the test includes such input precisely to stress that the lexer's job is "merely to tokenize the input source string". In other words, the lexer only has to care about turning the source string into tokens. Whether that source string is grammatically correct is decided at a later stage.

 

First, to lex single-character tokens such as !-/*<>, we begin by adding those characters to the token definitions. For the newly added keywords such as if, else, and return, we add not only the tokens but also entries in keywords, the lookup table that maps each word to its keyword.

 

// token/token.go

const (
	ILLEGAL = "ILLEGAL"
	EOF     = "EOF"

	// Identifiers + literals
	IDENT = "IDENT" // add, result, x, y, ...
	INT   = "INT"   // 1343456

	// Operators
	ASSIGN   = "="
	PLUS     = "+"
	MINUS    = "-"
	BANG     = "!"
	ASTERISK = "*"
	SLASH    = "/"

	LT = "<"
	GT = ">"
	// Delimiters
	COMMA     = ","
	SEMICOLON = ";"

	LPAREN = "("
	RPAREN = ")"
	LBRACE = "{"
	RBRACE = "}"

	// Keywords
	FUNCTION = "FUNCTION"
	LET      = "LET"
	TRUE     = "TRUE"
	FALSE    = "FALSE"
	IF       = "IF"
	ELSE     = "ELSE"
	RETURN   = "RETURN"
)

var keywords = map[string]TokenType{
	"fn":  FUNCTION,
	"let": LET,
	"true": TRUE,
	"false": FALSE,
	"if":   IF,
	"else": ELSE,
	"return": RETURN,
}
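The NextToken function further down calls token.LookupIdent to decide whether a word is a keyword or an ordinary identifier. Its definition is not repeated in this part of the post, so the following is only a minimal sketch of how such a lookup could work on top of the keywords table. The body of LookupIdent and the trimmed constant list are my assumptions for illustration.

```go
package main

import "fmt"

// TokenType is string-based, as in the token package of this post.
type TokenType string

const (
	IDENT    TokenType = "IDENT"
	FUNCTION TokenType = "FUNCTION"
	LET      TokenType = "LET"
	TRUE     TokenType = "TRUE"
	FALSE    TokenType = "FALSE"
	IF       TokenType = "IF"
	ELSE     TokenType = "ELSE"
	RETURN   TokenType = "RETURN"
)

// keywords maps the source spelling of each keyword to its token type.
var keywords = map[string]TokenType{
	"fn":     FUNCTION,
	"let":    LET,
	"true":   TRUE,
	"false":  FALSE,
	"if":     IF,
	"else":   ELSE,
	"return": RETURN,
}

// LookupIdent (assumed shape) returns the keyword token type for ident,
// or IDENT when the word is not registered in the keywords table.
func LookupIdent(ident string) TokenType {
	if tok, ok := keywords[ident]; ok {
		return tok
	}
	return IDENT
}

func main() {
	for _, word := range []string{"if", "else", "return", "foobar"} {
		fmt.Printf("%s -> %s\n", word, LookupIdent(word))
	}
}
```

With a table like this, supporting a new keyword only takes one new constant and one new map entry. The lexer itself does not have to change.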

 

๊ทธ๋ฆฌ๊ณ  ์ด์ œ ๋ ‰์‹ฑํ•  ๋Œ€์ƒ ๋ฌธ์ž๊ฐ€ ์–ด๋–ค ํ† ํฐ์ธ์ง€ ํ™•์ธํ•˜๊ณ  ๊ทธ์— ๋งž๋Š” ํ† ํฐ์„ ์ƒ์„ฑํ•˜๋Š” NextToken ํ•จ์ˆ˜์˜ switch-case ๊ตฌ๋ฌธ์— ๋ฐฉ๊ธˆ ์ถ”๊ฐ€ํ•œ ๋ฌธ์ž๋“ค์„ ๋ฐ˜์˜ํ•ด๋ณด์ž. ์ฐธ๊ณ ๋กœ ์ƒˆ๋กญ๊ฒŒ ์ถ”๊ฐ€๋œ ์˜ˆ์•ฝ์–ด์˜ ๊ฒฝ์šฐ๋Š” ๊ธฐ์กด NextToken ํ•จ์ˆ˜์˜ default ๊ตฌ๋ฌธ์„ ๊ทธ๋Œ€๋กœ ์ด์šฉํ•˜๋ฉด ๋œ๋‹ค.

 

// lexer/lexer.go

func (l *Lexer) NextToken() token.Token {
	var tok token.Token

	switch l.ch {
	case '=':
		tok = newToken(token.ASSIGN, l.ch)
	case ';':
		tok = newToken(token.SEMICOLON, l.ch)
	case '(':
		tok = newToken(token.LPAREN, l.ch)
	case ')':
		tok = newToken(token.RPAREN, l.ch)
	case ',':
		tok = newToken(token.COMMA, l.ch)
	case '+':
		tok = newToken(token.PLUS, l.ch)
	case '-':
		tok = newToken(token.MINUS, l.ch)
	case '!':
		tok = newToken(token.BANG, l.ch)
	case '*':
		tok = newToken(token.ASTERISK, l.ch)
	case '/':
		tok = newToken(token.SLASH, l.ch)
	case '<':
		tok = newToken(token.LT, l.ch)
	case '>':
		tok = newToken(token.GT, l.ch)
	case '{':
		tok = newToken(token.LBRACE, l.ch)
	case '}':
		tok = newToken(token.RBRACE, l.ch)
	case 0:
		tok.Type = token.EOF
		tok.Literal = ""
	default:
		if isLetter(l.ch) {
			tok.Literal = l.readIdentifier()
			tok.Type = token.LookupIdent(tok.Literal)
			return tok
		} else if isDigit(l.ch) {
			tok.Literal = l.readNumber()
			tok.Type = token.INT
			return tok
		} else {
			tok = newToken(token.ILLEGAL, l.ch)
		}
	}
	l.readChar()
	return tok
}

 

However, the lexer in this state still has one limitation: it cannot tokenize operators made up of two non-letter characters, such as != or ==. (Other examples are <=, >=, ++, and --, but the book does not cover lexing those operators.)

 

So how can we tokenize these two-character operators? First, we need a function that checks whether we are really looking at one of these two-character operators. Suppose the character being lexed is =. We then have to check whether another = follows it. Let's start with the peekChar function that performs this check.

 

// lexer/lexer.go

func (l *Lexer) peekChar() byte {
	if l.readPosition >= len(l.input) {
		return 0
	} else {
		return l.input[l.readPosition]
	}
}

 

ํ•ด๋‹น ํ•จ์ˆ˜์˜ ๋กœ์ง ์ž์ฒด๋Š” ๊ฐ„๋‹จํ•˜๋‹ค. ๋‹จ์ˆœํžˆ ํ˜„์žฌ ๋ ‰์‹ฑ ๋Œ€์ƒ ๋ฌธ์ž์˜ ๋‹ค์Œ ์œ„์น˜์— ์žˆ๋Š” ๋ฌธ์ž๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ด์ œ ์œ„ ํ•จ์ˆ˜๋ฅผ NextToken ํ•จ์ˆ˜์— ๋ฐ˜์˜ํ•ด๋ณด๋„๋ก ํ•˜์ž. ์šฐ๋ฆฌ๊ฐ€ ๊ณ ๋ คํ•˜๋Š” 2๊ฐœ ๋ฌธ์ž๋Š” == ๋˜๋Š” != ์ด๊ธฐ ๋•Œ๋ฌธ์— switch-case ๊ตฌ๋ฌธ์—์„œ = ๋˜๋Š” ! ๋ฌธ์ž๊ฐ€ ๋‚˜์™”์„ ๋•Œ์˜ ๋ถ€๋ถ„์— ๋ฐ˜์˜ํ•ด์ฃผ๋ฉด ๋œ๋‹ค.

 

// lexer/lexer.go

func (l *Lexer) NextToken() token.Token {
	var tok token.Token

	switch l.ch {
	case '=':
		if l.peekChar() == '=' {
			ch := l.ch
			l.readChar()
			literal := string(ch) + string(l.ch)
			tok = token.Token{Type: token.EQ, Literal: literal}
		} else {
			tok = newToken(token.ASSIGN, l.ch)
		}
        
	...(omitted)...
    
	case '!':
		if l.peekChar() == '=' {
			ch := l.ch
			l.readChar()
			literal := string(ch) + string(l.ch)
			tok = token.Token{Type: token.NOT_EQ, Literal: literal}
		} else {
			tok = newToken(token.BANG, l.ch)
		}
        
	...(omitted)...
    
	l.readChar()
	return tok
}
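To watch the one-character lookahead in action without the rest of the project, here is a trimmed-down, self-contained sketch. The miniLexer type, its next method, and the tokenTypes helper are hypothetical names made up for this example. It only knows =, ==, ! and !=, and it uses plain strings instead of the token package.

```go
package main

import "fmt"

// Token is a simplified stand-in for token.Token.
type Token struct {
	Type    string
	Literal string
}

// miniLexer is a hypothetical lexer that only handles =, ==, ! and !=.
type miniLexer struct {
	input        string
	readPosition int  // index of the next character to read
	ch           byte // character currently being examined
}

func newMiniLexer(input string) *miniLexer {
	l := &miniLexer{input: input}
	l.readChar()
	return l
}

// readChar advances the lexer by one character (0 means end of input).
func (l *miniLexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.readPosition++
}

// peekChar returns the next character without advancing the lexer.
func (l *miniLexer) peekChar() byte {
	if l.readPosition >= len(l.input) {
		return 0
	}
	return l.input[l.readPosition]
}

// next returns one token, consuming two characters for == and !=.
func (l *miniLexer) next() Token {
	var tok Token
	switch l.ch {
	case '=':
		if l.peekChar() == '=' {
			l.readChar() // consume the second '='
			tok = Token{Type: "==", Literal: "=="}
		} else {
			tok = Token{Type: "=", Literal: "="}
		}
	case '!':
		if l.peekChar() == '=' {
			l.readChar() // consume the '=' that follows '!'
			tok = Token{Type: "!=", Literal: "!="}
		} else {
			tok = Token{Type: "!", Literal: "!"}
		}
	case 0:
		tok = Token{Type: "EOF", Literal: ""}
	default:
		tok = Token{Type: "ILLEGAL", Literal: string(l.ch)}
	}
	l.readChar()
	return tok
}

// tokenTypes lexes input and joins the token types with single spaces.
func tokenTypes(input string) string {
	l := newMiniLexer(input)
	out := ""
	for tok := l.next(); tok.Type != "EOF"; tok = l.next() {
		if out != "" {
			out += " "
		}
		out += tok.Type
	}
	return out
}

func main() {
	fmt.Println(tokenTypes("===!=!")) // prints "== = != !"
}
```

The input "===!=!" is split greedily from the left: the first two = characters become ==, the third one stands alone as =, and then != and ! follow. The lexer still does not care whether that sequence means anything.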

 

Finally, let's add the == and != tokens to the constants, as shown below.

 

// token/token.go

const (
	...(omitted)...
    
	ASTERISK = "*"
	SLASH    = "/"
	EQ       = "=="
	NOT_EQ   = "!="

	...(omitted)...
    
	ELSE     = "ELSE"
	RETURN   = "RETURN"
)

4. Building a REPL for the Lexer

Up through section 3, we built our own lexer! Now let's build a more user-friendly REPL (Read, Eval, Print, Loop) to test it. A REPL is also called a console or an interactive mode. If you are familiar with Python, think of it as the same thing as the interactive mode you enter when you type the python command in a terminal.

 

REPL์„ ๋งŒ๋“œ๋Š” ์ฝ”๋“œ๋Š” ์šฐ๋ฆฌ๊ฐ€ ๊ตฌ์ฒด์ ์ธ ๋กœ์ง์„ ์ž‘์„ฑํ•˜๋Š” ๊ฒƒ์€ ๊ฑฐ์˜ ์—†๊ณ , ํŒจํ‚ค์ง€๋ฅผ ๊ฐ€์ ธ๋‹ค ์“ฐ๊ฑฐ๋‚˜ ์˜ˆ์™ธ์ฒ˜๋ฆฌํ•˜๋Š” ๋กœ์ง์ด ์ „๋ถ€๋‹ค. ๋”ฐ๋ผ์„œ ์ฝ”๋“œ๋งŒ ์ฒจ๋ถ€ํ•˜๊ณ  ๊ธ€์„ ๋งˆ๋ฌด๋ฆฌํ•˜๋ ค๊ณ  ํ•œ๋‹ค.

 

First, create a repl package and enter the source code below.

 

// repl/repl.go

package repl

import (
	"bufio"
	"fmt"
	"io"
	"monkey/lexer"
	"monkey/token"
)

const PROMPT = ">> "

func Start(in io.Reader, out io.Writer) {
	scanner := bufio.NewScanner(in)

	for {
		fmt.Fprintf(out, PROMPT)
		scanned := scanner.Scan()
		if !scanned {
			return
		}
		line := scanner.Text()
		l := lexer.New(line)

		for tok := l.NextToken(); tok.Type != token.EOF; tok = l.NextToken() {
			fmt.Fprintf(out, "%+v\n", tok)
		}
	}
}
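Because Start takes an io.Reader and an io.Writer rather than reading os.Stdin directly, the loop can also be driven from memory, which is handy for a quick check without a terminal. The sketch below shows that idea under a few assumptions: startWith and runSession are hypothetical helpers, and strings.Fields stands in for the real lexer so that the example runs on its own.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

const prompt = ">> "

// startWith is a hypothetical variant of Start: it runs the same
// read-print loop, but the tokenizer is passed in as a function so that
// the example does not depend on the monkey/lexer package.
func startWith(in io.Reader, out io.Writer, tokenize func(string) []string) {
	scanner := bufio.NewScanner(in)
	for {
		fmt.Fprint(out, prompt)
		if !scanner.Scan() {
			return // end of input (or a read error) ends the session
		}
		for _, tok := range tokenize(scanner.Text()) {
			fmt.Fprintln(out, tok)
		}
	}
}

// runSession feeds input to the loop and returns everything it printed.
// strings.Fields is only a whitespace splitter, not a real lexer.
func runSession(input string) string {
	var out bytes.Buffer
	startWith(strings.NewReader(input), &out, strings.Fields)
	return out.String()
}

func main() {
	fmt.Print(runSession("let x = 5;\n"))
}
```

Note that the prompt is printed once more after the last line, because the loop only notices the end of input on the following Scan call. The real Start function behaves the same way when stdin is closed.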

 

๊ทธ๋ฆฌ๊ณ  ์ด์ œ ์‚ฌ์šฉ์ž์˜ ์—”ํŠธ๋ฆฌํฌ์ธํŠธ์ธ main ํŒจํ‚ค์ง€๋ฅผ ๋งŒ๋“ค๊ณ , main ํ•จ์ˆ˜๋ฅผ ์ž‘์„ฑํ•ด๋ณด์ž.

 

// main.go

package main

import (
	"fmt"
	"monkey/repl"
	"os"
	"os/user"
)

func main() {
	user, err := user.Current()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Hello, %s!\n", user.Username)
	fmt.Printf("Feel free to type in commands\n")
	repl.Start(os.Stdin, os.Stdout)
}

 

Then, with the file above in the current directory, run the go run command. We can now test our lexer in interactive mode!


์ด์ œ ๋‹ค์Œ ํฌ์ŠคํŒ…์—์„œ๋Š” ์ด๋ฒˆ ํฌ์ŠคํŒ…์—์„œ ๋งŒ๋“  ๋ ‰์„œ๊ฐ€ ํ† ํฐํ™”์‹œํ‚จ ํ† ํฐ๋“ค์„ ์ถ”์ƒ๊ตฌ๋ฌธํŠธ๋ฆฌ๋กœ ๋งŒ๋“œ๋Š” ํŒŒ์‹ฑ(Parsing)์„ ํ•ด๋ณผ ์ฐจ๋ก€๋‹ค.

๋ฐ˜์‘ํ˜•