Kaleidoscope: Tutorial Introduction and Lexer

Translated from http://llvm.org/docs/tutorial/LangImpl01.html

This tutorial walks through the implementation of a simple language.

1.1 Tutorial Introduction #

1.2 The Basic Language #

Kaleidoscope is a procedural language. It lets you define functions, use conditionals, do math, and so on.

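Over the course of this tutorial, we will extend Kaleidoscope to support the if/then/else construct, a for loop, user-defined operators, JIT compilation with a simple command-line interface, and so on.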

Because we want to keep things simple, the only datatype in Kaleidoscope is a 64-bit floating point type (a "double" in C). As a result, all values are implicitly of type double.

Here is a simple example that computes Fibonacci numbers in Kaleidoscope:

# Compute the x'th fibonacci number.
def fib(x)
    if x < 3 then 
        1
    else
        fib(x-1) + fib(x-2)

# This expression will compute the 40th number.
fib(40)
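
To make the "every value is a double" rule concrete, here is a rough C++ analogue of the definition above. It is only a sketch for comparison and is not part of the tutorial's code; the name fib simply mirrors the Kaleidoscope example.

```cpp
#include <cassert>

// C++ analogue of the Kaleidoscope fib: the argument and the result are both
// double, mirroring Kaleidoscope's single 64-bit floating point type.
double fib(double x) {
  if (x < 3)
    return 1;
  return fib(x - 1) + fib(x - 2);
}
```

Note that the comparison x < 3 and the subtractions x-1 and x-2 all happen in floating point, just as they would in Kaleidoscope.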

Kaleidoscope can also call standard library functions. To do so, declare the function with the "extern" keyword before you use it.

For example:

extern sin(arg);
extern cos(arg);
extern atan2(arg1 arg2);

atan2(sin(.4), cos(42))

Chapter 6 includes a more interesting example: a small Kaleidoscope application that displays a Mandelbrot Set at various levels of magnification.

Now let's dive into the implementation of Kaleidoscope!

1.3 Lexer #

Token types:

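#include <cctype>   // isspace, isalpha, isalnum, isdigit
#include <cstdio>   // getchar, EOF
#include <cstdlib>  // strtod
#include <string>

// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
// of these for known things.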
enum Token {
  tok_eof = -1,

  // commands
  tok_def = -2,
  tok_extern = -3,

  // primary
  tok_identifier = -4,
  tok_number = -5,
};

static std::string IdentifierStr; // Filled in if tok_identifier
static double NumVal;             // Filled in if tok_number
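
Because known tokens are negative and unknown characters come back as their own value in [0-255], a caller can tell the two apart from the integer alone. The sketch below illustrates this. describeToken is a hypothetical helper that does not appear in the tutorial, and the enum is repeated only so the snippet compiles on its own.

```cpp
#include <cassert>
#include <string>

// Same values as the Token enum above, repeated so this sketch stands alone.
enum Token {
  tok_eof = -1,
  tok_def = -2,
  tok_extern = -3,
  tok_identifier = -4,
  tok_number = -5,
};

// Hypothetical helper (not in the tutorial): turn a gettok()-style return
// value into a readable label.
std::string describeToken(int Tok) {
  switch (Tok) {
  case tok_eof:        return "eof";
  case tok_def:        return "def";
  case tok_extern:     return "extern";
  case tok_identifier: return "identifier";
  case tok_number:     return "number";
  default:
    // Anything else is an unknown character returned as its own value.
    return std::string("char '") + static_cast<char>(Tok) + "'";
  }
}
```

For instance, describeToken('+') yields the label for a raw character, while describeToken(tok_def) yields "def". For tok_identifier and tok_number the caller would additionally read IdentifierStr or NumVal.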

gettok returns the next token from standard input. It starts by skipping whitespace:

// gettok - Return the next token from standard input.
static int gettok() {
  static int LastChar = ' ';

  // Skip any whitespace.
  while (isspace(LastChar))
    LastChar = getchar();

Next, gettok recognizes identifiers and the keywords def and extern:

if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
  IdentifierStr = LastChar;
  while (isalnum((LastChar = getchar())))
    IdentifierStr += LastChar;

  if (IdentifierStr == "def")
    return tok_def;
  if (IdentifierStr == "extern")
    return tok_extern;
  return tok_identifier;
}

Numeric literals are recognized in a similar way:

if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
  std::string NumStr;
  do {
    NumStr += LastChar;
    LastChar = getchar();
  } while (isdigit(LastChar) || LastChar == '.');

  NumVal = strtod(NumStr.c_str(), 0);
  return tok_number;
}
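
One thing to be aware of: the pattern [0-9.]+ also accepts input such as "1.23.45.67", and strtod converts only the longest valid prefix, so that input would be read as 1.23 without any error. The sketch below shows one possible way to detect this by checking strtod's end pointer. parseNumber is a hypothetical helper, not part of the tutorial's lexer.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not in the tutorial): parse NumStr with strtod and
// report whether the whole string was consumed.
bool parseNumber(const std::string &NumStr, double &Out) {
  char *End = nullptr;
  Out = std::strtod(NumStr.c_str(), &End);
  // strtod stops at the first character it cannot use. If nothing was
  // converted, or if End is not at the terminating '\0', the input was not
  // a single well-formed number.
  return End != NumStr.c_str() && *End == '\0';
}
```

A lexer could call such a helper in place of the bare strtod call and report an error when it returns false. How to report that error is a design choice this sketch leaves open.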

Comments run from '#' to the end of the line and are skipped:

if (LastChar == '#') {
  // Comment until end of line.
  do
    LastChar = getchar();
  while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');

  if (LastChar != EOF)
    return gettok();
}

Finally, gettok checks for end of file; any other character is returned as its ASCII value:

// Check for end of file.  Don't eat the EOF.
if (LastChar == EOF)
  return tok_eof;

// Otherwise, just return the character as its ascii value.
int ThisChar = LastChar;
LastChar = getchar();
return ThisChar;
}
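
To see the pieces working together, the sketch below repackages the same lexing rules so that they read from a string instead of standard input, which makes them easy to exercise in isolation. StringLexer, nextChar, and lexAll are hypothetical names introduced for this illustration; the tutorial itself uses the global gettok() with getchar() shown above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

enum Token {
  tok_eof = -1,
  tok_def = -2,
  tok_extern = -3,
  tok_identifier = -4,
  tok_number = -5,
};

// Hypothetical wrapper (not in the tutorial): same rules as gettok(), but the
// characters come from a string rather than from getchar().
class StringLexer {
  std::string Input;
  std::size_t Pos = 0;
  int LastChar = ' ';

  // Stand-in for getchar(): next character of Input, or EOF when exhausted.
  int nextChar() {
    return Pos < Input.size() ? static_cast<unsigned char>(Input[Pos++]) : EOF;
  }

public:
  std::string IdentifierStr; // Filled in if tok_identifier
  double NumVal = 0;         // Filled in if tok_number

  explicit StringLexer(std::string Text) : Input(std::move(Text)) {}

  int gettok() {
    // Skip any whitespace.
    while (isspace(LastChar))
      LastChar = nextChar();

    if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
      IdentifierStr = static_cast<char>(LastChar);
      while (isalnum(LastChar = nextChar()))
        IdentifierStr += static_cast<char>(LastChar);
      if (IdentifierStr == "def")
        return tok_def;
      if (IdentifierStr == "extern")
        return tok_extern;
      return tok_identifier;
    }

    if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
      std::string NumStr;
      do {
        NumStr += static_cast<char>(LastChar);
        LastChar = nextChar();
      } while (isdigit(LastChar) || LastChar == '.');
      NumVal = strtod(NumStr.c_str(), nullptr);
      return tok_number;
    }

    if (LastChar == '#') { // Comment until end of line.
      do
        LastChar = nextChar();
      while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
      if (LastChar != EOF)
        return gettok();
    }

    if (LastChar == EOF) // Don't eat the EOF.
      return tok_eof;

    // Otherwise, return the character itself.
    int ThisChar = LastChar;
    LastChar = nextChar();
    return ThisChar;
  }
};

// Lex Text to completion and return every token code, including tok_eof.
std::vector<int> lexAll(const std::string &Text) {
  StringLexer Lexer(Text);
  std::vector<int> Tokens;
  int Tok;
  do {
    Tok = Lexer.gettok();
    Tokens.push_back(Tok);
  } while (Tok != tok_eof);
  return Tokens;
}
```

Calling lexAll("def fib(x) x+1 # c\n") produces tok_def, tok_identifier, '(', tok_identifier, ')', tok_identifier, '+', tok_number, and tok_eof, in that order. The comment is dropped entirely, and the punctuation characters pass through as their own values.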