Introduction and Analysis
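tree-sitter is a parser generator that produces syntax trees. An eof rule matches whitespace from the current position through the end of the file; any non-whitespace character along the way makes the match fail. tree-sitter does not provide an eof rule itself, so the following implements one using the external scanner feature.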
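The matching behaviour described above can be sketched on a plain string, independent of tree-sitter. This is only an illustration: rest_is_blank is a hypothetical helper name, and it relies on isspace in the default "C" locale, which accepts exactly space, \t, \n, \v, \f and \r.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of tree-sitter: returns true when every
 * byte from `pos` to the end of `text` is whitespace, which is the
 * condition under which the eof rule is meant to succeed. */
bool rest_is_blank(const char *text, size_t len, size_t pos) {
  assert(text != NULL && pos <= len);
  for (size_t i = pos; i < len; i++) {
    /* Cast to unsigned char: passing a negative char to isspace is undefined. */
    if (!isspace((unsigned char)text[i])) {
      return false; /* a non-whitespace byte before the end: no match */
    }
  }
  return true; /* reached the end seeing only whitespace */
}
```

Called with the position right after world, the helper returns true for "world \n\n" and false for "world \n x".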
Implementation
. denotes the project root.
Remember to replace your_language in the code with the name of your language.
In ./binding.gyp:
{
  "targets": [
    {
      ...
      "sources": [
        "bindings/node/binding.cc",
        "src/parser.c",
        "src/scanner.c", // add this line
      ],
      ...
    }
  ]
}
The scanner.c file has to be created by hand.
For how external scanner rules work in detail, see the tree-sitter external scanner documentation.
./src/scanner.c
#include "tree_sitter/alloc.h"
#include "tree_sitter/array.h"
#include "tree_sitter/parser.h"

static bool scan_eof(TSLexer *lexer);

enum TokenType { Eof };

void *tree_sitter_your_language_external_scanner_create(void) { return NULL; }

void tree_sitter_your_language_external_scanner_destroy(void *payload) {}

unsigned tree_sitter_your_language_external_scanner_serialize(void *payload,
                                                              char *buffer) {
  return 0;
}

void tree_sitter_your_language_external_scanner_deserialize(void *payload,
                                                            const char *buffer,
                                                            unsigned length) {}

bool tree_sitter_your_language_external_scanner_scan(void *payload,
                                                     TSLexer *lexer,
                                                     const bool *valid_symbols) {
  if (valid_symbols[Eof]) {
    return scan_eof(lexer);
  }
  return false; // no match
}

static bool scan_eof(TSLexer *lexer) {
  // Mark the token end at the starting position
  lexer->mark_end(lexer);
  while (!lexer->eof(lexer)) {
    // Check for a non-whitespace character; which whitespace characters
    // are ignored can be adjusted here
    if (lexer->lookahead != ' ' && lexer->lookahead != '\n' &&
        lexer->lookahead != '\r' && lexer->lookahead != '\t' &&
        lexer->lookahead != '\f' && lexer->lookahead != '\v') {
      return false; // no match
    }
    lexer->advance(lexer, true); // consume the character (skip = true)
    lexer->mark_end(lexer);      // move the token end forward
  }
  // Reached the end of the file: the match succeeds
  lexer->result_symbol = Eof;
  return true;
}
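The comment in scan_eof notes that the set of ignored characters can be adjusted. One way to keep that choice in a single place is a small predicate over the code point; lexer->lookahead is an int32_t, so it can be passed in directly. In the sketch below, the name is_eof_blank and the two extra code points (U+00A0 and U+FEFF) are illustrative assumptions, not something the original code uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: decides which code points the eof rule may skip. */
static bool is_eof_blank(int32_t c) {
  switch (c) {
    case ' ': case '\n': case '\r': case '\t': case '\f': case '\v':
      return true;  /* the six characters from the original check */
    case 0x00A0:    /* no-break space (assumed extra, for illustration) */
    case 0xFEFF:    /* byte order mark (assumed extra, for illustration) */
      return true;
    default:
      return false; /* anything else makes the eof match fail */
  }
}
```

With such a predicate, the condition inside the loop would shrink to if (!is_eof_blank(lexer->lookahead)) return false; and the rest of scan_eof would stay the same.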
Finally, declare the new rule in externals; after that it can be used like any other rule.
./grammar.js
module.exports = grammar({
  name: "your_language",
  externals: $ => [
    $.eof,
  ],
  rules: {
    source_file: $ => seq("world", $.eof), // a simple example
  }
});
Running
Create a simple text file example.txt; it may end with any amount of trailing whitespace.
world
Run the following commands in a console:
tree-sitter generate
tree-sitter parse ./example.txt
The output below shows that the eof rule works:
(source_file [0, 0] - [6, 0]
  (world [0, 0] - [0, 5])
  (eof [6, 0] - [6, 0]))
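The bracketed pairs in this output are zero-based [row, column] positions. The sketch below shows how an end position such as [6, 0] can arise, assuming example.txt contains world followed by six newlines (the exact trailing whitespace is not shown in the original, so this is a guess). Point and end_point are hypothetical local names that only mirror the shape of tree-sitter's TSPoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in with the same fields as tree-sitter's TSPoint:
 * a zero-based row and a zero-based byte column. */
typedef struct {
  uint32_t row;
  uint32_t column;
} Point;

/* Hypothetical helper: position reached after reading all of `text`. */
Point end_point(const char *text, size_t len) {
  assert(text != NULL);
  Point p = {0, 0};
  for (size_t i = 0; i < len; i++) {
    if (text[i] == '\n') {
      p.row++;      /* a newline starts the next row ... */
      p.column = 0; /* ... at column zero */
    } else {
      p.column++;   /* any other byte moves one column right */
    }
  }
  return p;
}
```

Under that assumption, end_point gives row 6, column 0 for the whole file and row 0, column 5 for world alone, which lines up with the ranges above. The eof node being zero-width at the very end is consistent with the whitespace being consumed through advance with skip set to true.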