Embedded Scripting Languages
Table of Contents
- 1. Embedded Scripting Languages
- 1.1. Overview
- 1.2. Embedded Scripting Languages Selection
- 1.3. MuParser - Math expression parser
- 1.4. Exprtk - Math expression parser
- 1.5. Lua scripting engine
- 1.6. Squirrel Scripting Language
- 1.7. Duktape - Embeddable Javascript Engine
- 1.8. QuickJS - ES20 Javascript Engine
- 1.9. Chaiscript
- 1.10. Python Engine via Pybind11
- 1.11. Scheme-like lisp interpreter
- 1.12. Expression interpreter
- 1.13. Operator Precedence Climbing Algorithm
1 Embedded Scripting Languages
1.1 Overview
Embedded scripting languages, sometimes called extension languages, are programming languages available as self-contained software libraries and designed to be embedded in a host programming language, often a compiled statically typed language. Embedded scripting languages allow adding new features, modifying the behavior of a program at runtime without recompilation and source code modification. Since embedded scripting languages are available as libraries, the host language can inspect variables defined by a scripting language program and call functions defined in the scripting language for possibly allowing end-users to customize and configure the application. Functions, also called subroutines or procedure, or performance-intensive functions, that would not be suitable to implement in a scripting language, registered by the host language can also be called by the embedded scripting language.
Embedded scripting is useful for a wide range of user-case, including, video games, interactivity, repl, configuration languages, sandboxing and DSL - domain-specific programming languages. In video game programming, languages such as Lua, Moonscript or GDScript (Godot game engine) allow non-programmers such as game designers or game modders to configure, extend and add new functionality to games written in C++ at runtime without any recompilation and dealing with low level programming. These programming languages can also be useful for adding interactivity, faster iteration cycle and feedback. An example of this case is the beloved text editor Emacs that uses Elisp as its embedded scripting language. End-users can inspect, modify and customize Emacs behavior at runtime with instantaneous feedback by just evaluating Elisp functions or modifying variables at the scratch buffer or Elisp interactive REPL, the IELM repl. Another case where embedded scripting can be found is web browsers, that are usually written in C++ and use JavaScript, also known by the official name of ECMascript, and html (non-turing complete) as its embedded scripting languages. Html defines how the user interface look like and JavaScript modifies the browser behavior and the html interface at runtime allowing curious users to evaluate JavaScript, in the development tools console for inspecting and interacting with html DOM (Domain Object Model) objects with instantaneous feedback and synchronization between a modified DOM element and its graphical representation.
Common API Functionalities
Some common API functionalities of a embedded scripting language API are evaluation of source code provided as string, file or input stream; ability to register functions of the host language and call them from the scripting language and possiblity insptecting variables defined in the scripting language. Another common feature is the ability to call functions defined in the scripting language from the host language. The following code block provide some examples about theses features.
Engine engine; // Ability to evaluate source code at runtime value* result = engine.eval("10 * 3 - 5 / 100 * x3 + 2.51 * if(x > 0, 1/x, 0/0) ;") if( result == nullptr){ printf("There was an error => shutdown the application."); exit(-1); } printf(" [INFO] Result = %f ", result->to_float()); // Interactive Python-like REPL for(;;){ std::string user_input = getline_from_console(): if(user_input == "quit" || user_input == "exit"){ break; } result = engine.eval(line); assert ( result == nullptr ); printf(" RESULT>> %s ", result->to_string().to_cstr()); } // ---- Ability to register functions of the host language -------// value* udf_user_defined_function(size_t size, value* args) { // .... ... do computations here... return result; } engine.add_function("user_defined_function, "&udf_user_defined_function); result = egine.eval(r"( x = 10.2512; y = user_defined_function('hello', x); z = x + y; return z; )"; // ------- Ability to call functions defined in the scripting language -------// engine.eval(" def myfunc(x, y) = 3 * x + 5.1 * y; ") FuncObject* = engine.getFunction("myfunc"); assert FuncObject != nullptr; auto args = engine.make_args(); args->add_int(10); args->add_float(20.0); result = FuncObject->call(args);
Reasonable Features of Embedded Scripting Languages
This section lsits some reasonable features that a language designed to be embedded should have:
- Ligthweight - Small footprint, small size and memory requirements.
- Scripting Engine as a library:
- Must be available as C or C++ library which exposes the interpreter API allowing the client code to evaluate scripting code from strings or files and also to retrieve objects from the virtual machine memory.
- Sandboxing or capability limitations
- A reasonable requirement for an embedded scripting language is limiting the capabilities and APIs which can be used possibly allowing the execution of non-trusted scripts. Example: the Javascript engines of most web browsers do not allow the script to interact with file systems or operating system APIs.
- Permissive License for static linking
- Use Cases
- DSL - Domain Specific Languages
- Game Engines
- Configuration files => Data Description language.
- User content
- Allow application runtime changes without recompilation.
- User extension without modification of source code.
Examples of embedded scripting languages usage and use-cases
- TCL - Tool Command Language => Used by many EDA - Electronic Design Automation Sofware in electronic engineering.
- JavaScript => Used in Web Browsers, which are written mostly in C++, for controlling user interaction, animation and etc.
- TinyScheme => Used by GNU GIMP drawing application ans Apple's MacOSX sandbox configuration.
- Lua language => Used by many game engines and also by applications such as: Nginx web server; Linux Conky; Geany Editor; NMap editor and so on.
- Squirrel Language => Used in game engines
- Python
- => Python is used as embedded scripting language by GDB - GNU Debugger ; WinDBG - Windows Debugger; IDA - Debugger for Reverse Engineering.
- Disadvantages: Python has a large footprint; it was not designed as an embedded scripting language and it is not possible to forbid the interpreter from calling file system and process creation APIs.
- AutoLisp [proprietary] => Lisp-like language used in Autocad.
- SQL (Structured Query Language) => Many databases are implemented in C or C++ uses SQL as scripting language and domain specific language (DSL) for querying and storing data. Example: SQLite, Postgres SQL, …
- Emacs Lisp => Emacs core, including the LISP engine/interpreter, is written in C and other parts are written in Emacs Lisp. This lisp dialect allows extending Emacs at with custom extensions (plugins) and also modifying the application behavior at runtime.
- VBA - Visual Basic for Application [proprietary] => Used in Microsft Office Suite, specially in Microsft Excel.
Considerations for choosing embedded scripting languages
- Small footprint and small overhead
- Stability of C or C++ API
- C++ API bindings
- Permissive license for static linking or option of dual license for static linking.
- Documentation and of C or C++ binding APIs, examples about binding code
- Garbage collector implementation
- Heavy duty computations with significant overhead should be performed on the C++-side, specially loops.
- JIT - Just-In-Time compiler => can increase the performance by
translating bytecodes into machine code.
- Example: Lua JIT, Javascript V8 engine used by Chrome browser.
- Features for running untrusted scripts or configuration:
- Ability for restricting capabilities such as creating process, accessing the file system and so on.
- Non-turing complete => better for configuration
- Familiarity of the targe audience
- Lua is pervasive in Games
- TCL is pervasive on Electronic Design Automation Software
- Javascript is widely used on the Web, NodeJS and also as embedded scripting language of web browsers.
Further Reading
1.2 Embedded Scripting Languages Selection
Selection of embedded scripting languages and engines available as libraries:
Non categorized:
- TCL - Tool Command Language
- License: akin to BSD
- Implementation: C
- Syntax Type: N/A
- Note: pervasive in EDA - Electronic Design Automation software.
- See:
- mruby (Embeddable Ruby implementation)
- License: BSD
- Implementation: C
- Syntax type: Ruby
- Wren
- License: MIT
- Implementation: C
- Syntax Type: N/A
- Jinx
- License: MIT
- Implementation: C++17
- Syntax Type: N/A
- Gravity
- License: -
- Implementation: C
- Syntax type: Apple's SWIFT language
- DaScript
- License: -
- Implementation: C++14
- Syntax type: Akin to Python
Lua or similar to Lua (mimics Lua syntax)
- Lua
- License: MIT
- Implementation: C
- Syntax Type: syntax inspired by scheme.
- LuaJIT (Lua with JIT - Just-in-Time compiler which translates
bytecodes to native machine code for better performance.)
- License: MIT
- Implementation: C
- Syntax type: lua
- GameMonkey Script
- License: MIT
- Implementation: C++
- Syntax type: Akin to Lua
- Squirrel
- License: MIT
- Implementation: C++
- Syntax type: Lua-like
- Note: Despite be implemented in C++, does not expose a C++ API, it exposes a C API.
- Squilu (Squirrel fork)
- License: MIT
- Implementation: C++
- Syntax type: Lua-like
Similar to C++ (syntax that mimics C++)
- AngelScript (Note: statically typed)
- License: Zlib
- Implementation: C++
- Syntax type: C++-like
Smiliar to Javascript or subset of Javascript (ECMAScript)
- SpiderMonkey - Java Script Engine (VM - Virtual Machine)
used by Mozzila Firefox.
- License: MPL 2.0
- Implementation: C and C++
- Syntax tyope: JavScript (ECMAscript)
- Used by:
- Firefox Web Browser
- MongoDB Document Database
- CoucheDB
- Adobe Acrobat Reader
- See:
- Repository:
- Duktape
- License: MIT
- Implementation: C
- Syntax type: Javascript (aka ECMAScript) E5/E5.1
- Espruino
- License: MPL 2
- Implementation: C
- Syntax type: ES5 - Javascript (ECMaScript)
- JerryScript
- License: Apache v2
- Implementation: C
- Syntax type: Javascript (ECMAScript)
- ChaiScript
- License: BSD
- Implementation: C++
- Syntax type: Javascript-like
Similar to Lisp or Scheme
- ECL - Embeddable Common Lisp
- License: LGPL - 2
- Implementation: C
- Syntax type: Lisp, Common Lisp
- TinyScheme
- License: BSD
- Implementation: C
- Syntax tyope: scheme, lisp
- Note: Used in GNU GIMP as scripting language and Apple MacOSX's sandbox as configuration language.
- S7 Scheme (Variant of TinyScheme)
- License: BSD
- Implementation: C
- Syntax type: C
- Chibi Scheme
- Implementation: C
- Syntax type: Scheme, Lisp
- Janet Language
- License: MIT
- Implementation: C
- Syntax type: Clojure, Lisp
- ArkScript
- License: MPL license
- Implementation: C
- Syntax type: Lisp-like, more clojure-like
License obligations and requirements:
- LGPL allows dynamically linking of closed source applications, but static linking requires source code disclosure and release under the same license. Some LGPL libraries, such as QT, allows static linking via commercial license.
- MIT, BSD, APACHE and so on => Add a copy of the license; Give credit.
1.3 MuParser - Math expression parser
MuParser is a non-turing complete embedded scripting engine for evaluating math expressions.
Web Site:
Repository:
Conan Reference:
Sample Project
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.9) project(cppexperiments) set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) # ============= Conan Bootstrap =============================# # Download automatically, you can also just copy the conan.cmake file if(NOT EXISTS "${CMAKE_BINARY_DIR}/conan.cmake") message(STATUS "Downloading conan.cmake from https://github.com/conan-io/cmake-conan") file(DOWNLOAD "https://github.com/conan-io/cmake-conan/raw/v0.13/conan.cmake" "${CMAKE_BINARY_DIR}/conan.cmake") endif() include(${CMAKE_BINARY_DIR}/conan.cmake) # Possible values "default" and "llvm8" set(CONAN_PROFILE default) conan_cmake_run(REQUIRES muparser/2.2.6@conan/stable BASIC_SETUP BUILD missing) #======= Targets Settings ===============+# add_executable(muparser1_formula muparser1_formula.cpp) target_link_libraries(muparser1_formula muparser)
File: muparser1_formula.cpp
#include <iostream> #include <string> #include <cmath> #include <muParser.h> // Defined for UNIX-like: Linux, Mac OSX, QNX, ... #if __unix__ #include <unistd.h> #endif bool isTTY_terminal() { #if __unix__ return ::ttyname(STDIN_FILENO) != nullptr; #else return false; #endif } double future_value(double PV, double rate, double nper) { return PV * std::pow(1 + rate / 100.0, nper); } int main() { mu::Parser p1; std::puts("\n====== EXPERIMENT 1 - Simple math expression ======="); p1.SetExpr("2^1 + 2^3 + 2^4"); std::cout << " [*] p1-A value: " << p1.GetExpr() << " = " << p1.Eval() << std::endl; p1.SetExpr("3.5 * 10 - sin(3.1415) * 2.0 + sqrt(10) / 100.0 + 2^3"); std::cout << " [*] p1-B value: " << p1.GetExpr() << " = " << p1.Eval() << std::endl; std::puts("\n====== EXPERIMENT 2 - Expression with variables ======="); p1.DefineConst("pi", 3.1415); double x = 10.0, y = 4.5; p1.DefineVar("x", &x); p1.DefineVar("y", &y); p1.SetExpr(" 3 * pi + sin(pi) + 4 * x + y - 5"); std::cout << " [*] p1-C value: " << p1.GetExpr() << " = " << p1.Eval() << std::endl; std::puts("\n====== EXPERIMENT 3 - Expression with custom function ======="); p1.DefineFun("fv", &future_value); p1.DefineFun("payoff", [](double S, double K){ return std::max(S - K, 0.0); }); p1.SetExpr("fv(100, 2, 8) + payoff(100, 90)"); std::cout << " [*] p1-D value: " << p1.GetExpr() << " = " << p1.Eval() << std::endl; std::puts("\n====== EXPERIMENT 4 - Parser error handling ======="); // When an error happens it throws an exception: mu::ParserError try{ p1.SetExpr("10.2 * 5.0 + a * 3"); double value = p1.Eval(); std::cout << " [*] p1-E value: " << value << std::endl; } catch (mu::ParserError const& ex) { std::cerr << " [ERROR] p1-C Parser error: " << ex.GetMsg() << std::endl; } std::puts("\n====== EXPERIMENT 5 - Calutor Interactive Shell ======"); if(isTTY_terminal()) { std::cout << " === Calculator Started OK. =====" << std::endl; mu::Parser p2; double ans = 0.0; p2.DefineVar("ans", &ans); std::string line; while(std::cin.good()) { std::cout << " EXPR => "; std::getline(std::cin, line); if(line == "") continue; if(line == "quit") break; p2.SetExpr(line); try { ans = p2.Eval(); std::cout << " => ans = " << ans << "\n\n"; } catch(mu::ParserError const& ex) { std::cerr << " [ERROR] Parser error " << ex.GetMsg() << std::endl; } } } #if _WIN32 std::cout << "Enter RETURN to exit. " << std::endl; std::cin.get(); #endif return 0; }
Program output:
$ ./muparser1_formula ====== EXPERIMENT 1 - Simple math expression ======= [*] p1-A value: 2^1 + 2^3 + 2^4 = 26 [*] p1-B value: 3.5 * 10 - sin(3.1415) * 2.0 + sqrt(10) / 100.0 + 2^3 = 43.0314 ====== EXPERIMENT 2 - Expression with variables ======= [*] p1-C value: 3 * pi + sin(pi) + 4 * x + y - 5 = 48.9246 ====== EXPERIMENT 3 - Expression with custom function ======= [*] p1-D value: fv(100, 2, 8) + payoff(100, 90) = 127.166 ====== EXPERIMENT 4 - Parser error handling ======= [ERROR] p1-C Parser error: Unexpected token "a" found at position 13. ====== EXPERIMENT 5 - Calutor Interactive Shell ====== === Calculator Started OK. ===== EXPR => 9.81 * sin(3.1415 / 2.0) + 100 * sqrt(285.6) + exp(3.65) => ans = 1738.26 EXPR => ans / 100.0 - 80.0 => ans = -62.6174 EXPR => ans * ans => ans = 3920.94 EXPR =>
1.4 Exprtk - Math expression parser
Exprtk is a MIT-license single-header, header-only and turing-complete math expression parsing engine.
Official Web Site:
Official repository:
Features:
- Header-only library in a single file.
- Turing-complete
- Sandboxed
- Supports binding to defined-functions in C++-side.
- Functions can operate on vectors
- Control structures: for-loop; if-else; ternary operator.
Drawbacks:
- The library exprtk.hpp has one megabyte (1 mb) in a single header file which significantly slows down the compile-time. Header-only design should only be used for small libraries.
Sample Project
- GIST: https://gist.github.com/2ff870b653b2ec1519bf0423165db1c5
- Note: The static library target 'mparser' makes the compile-time of the client code 'main.cpp' faster by using the PIMPL (Pointer-To-Implementation) design pattern and not exposing exprtk.hpp in the library header file.
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.9) project(exprk-parser) #========== Global Configurations =============# #----------------------------------------------# set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) set(CMAKE_CXX_EXTENSIONS OFF) #----------- Add dependencies --------------------------# #============= Functions and macros ===========================# macro(Download_Single_Headerlib FILE URL) file(DOWNLOAD ${URL} ${CMAKE_BINARY_DIR}/include/${FILE}) IF(NOT Download_Single_Headerlib_flag) include_directories(${CMAKE_BINARY_DIR}/include) set(Download_Single_Headerlib_flag TRUE) ENDIF() endmacro() Download_Single_Headerlib( exprtk.hpp https://github.com/ArashPartow/exprtk/raw/d81ac1a2ddd9877a7981d32c731fd9a75544ec68/exprtk.hpp ) #----------- Target settings -------------------------------# add_library( mparser mparser.cpp mparser.hpp ) add_executable( main main.cpp ) target_link_libraries( main mparser )
File: mparser.hpp (cmake target: mparser static library)
#ifndef _MPARSER_HPP_ #define _MPARSER_HPP_ #include <string> #include <memory> #include <functional> // Unique-ownership smart pointer for implementing PimPl // // Note: It was created because std::uniqe_ptr does // not support 'incomplete types'. // // Note: PimPl (Pointer-To-Implementation) idiom // is a technique for reducing compile-time and // maintaining ABI stability which mitigates the // fragile-ABI problem. // template<typename T, void (*disposer) (T*)> class pimpl_ptr { T* m_hnd; public: pimpl_ptr() : m_hnd(nullptr) //, m_disp(disp) { } pimpl_ptr(T* hnd) : m_hnd(hnd) { } ~pimpl_ptr() { this->release(); } // Disable copy constructor pimpl_ptr(pimpl_ptr const&) = delete; // Disable copy- pimpl_ptr& operator=(pimpl_ptr const&) = delete; // Move Ctor pimpl_ptr(pimpl_ptr&& rhs) { std::swap(m_hnd, rhs.m_hnd); } // Move assignment operator pimpl_ptr& operator=(pimpl_ptr&& rhs) { std::swap(m_hnd, rhs.m_hnd); } T* get() const { return m_hnd; } void release() { // Note: it is not possible to delete incomplete type // in this way: 'delete m_hnd;' disposer(m_hnd); m_hnd = nullptr; } void reset(T* hnd) { this->release(); m_hnd = hnd; } bool empty() const { return m_hnd == nullptr; } T& operator* () const { return *m_hnd; } T* operator-> () const { return m_hnd; } }; class MathEvaluator { struct impl; static void dispose(impl* p); pimpl_ptr<impl, MathEvaluator::dispose> m_pimpl; public: MathEvaluator(); ~MathEvaluator() = default; MathEvaluator& add_var(std::string name, double& ref); // Register function pointer callback or non-capture lambda MathEvaluator& add_function(std::string, double fptr (double) ); // Register function of two variables MathEvaluator& add_function(std::string, double fptr (double, double) ); double eval_code(std::string code); bool compile(std::string code); double value(); void repl(); }; #endif
File: mparser.cpp (cmake target: mparser static library)
#include "mparser.hpp" #include <iostream> #include <string> #include <exprtk.hpp> // --------------------------------------------------// struct MathEvaluator::impl { exprtk::expression<double> expr; exprtk::symbol_table<double> symbol_table; exprtk::parser<double> parser; std::string code; // Println function exprtk::rtl::io::println<double> println{}; }; // static method void MathEvaluator::dispose(impl* p) { delete p; } MathEvaluator::MathEvaluator(): m_pimpl(new impl) { m_pimpl->symbol_table.add_constants(); m_pimpl->expr.register_symbol_table(m_pimpl->symbol_table); m_pimpl->symbol_table.add_function("println", m_pimpl->println); } MathEvaluator& MathEvaluator::add_var(std::string name, double& ref) { m_pimpl->symbol_table.add_variable(name, ref); return *this; } MathEvaluator& MathEvaluator::add_function(std::string name, double fptr (double) ) { m_pimpl->symbol_table.add_function(name, fptr); return *this; } MathEvaluator& MathEvaluator::add_function(std::string name, double fptr (double, double) ) { m_pimpl->symbol_table.add_function(name, fptr); return *this; } bool MathEvaluator::compile(std::string code) { bool r = m_pimpl->parser.compile(code, m_pimpl->expr); if(!r){ std::string err = "Error: "; err = err + m_pimpl->parser.error(); std::cerr << " Error: " << err << "\n"; throw std::runtime_error(" [PARSER] Unable to parse expression."); } return r; } double MathEvaluator::value() { return m_pimpl->expr.value(); } void MathEvaluator::repl() { std::string line; double result; while( std::cin.good() ) { std::cout << " EXPRTK $ >> "; std::getline(std::cin, line); if(line.empty()) continue; if(line == "exit") return; try { this->compile(line); std::cout << this->value() << '\n'; } catch (std::runtime_error& ex) { std::cerr << "Error: " << ex.what() << '\n'; } } }
File: main.cpp (cmake target: main)
#include <iostream> #include <string> #include <cassert> #include "mparser.hpp" double myfun(double a, double b); void test_engine(MathEvaluator& engine, double& x); int main(int argc, char** argv) { std::puts(" [TRACE] I am up and running Ok. "); MathEvaluator engine; double x = 1.0, y = 2.0, z = 3.0; engine.add_var("x", x) .add_var("y", y) .add_var("z", z) .add_function("myfun", &myfun); assert(argc == 2); auto command = std::string(argv[1]); if(command == "test" ){ test_engine(engine, x); } if(command == "repl" ){ engine.repl(); } std::cerr << " [TRACE] Shutdown engine Ok. " << '\n'; return 0; } // -----------------------------------------// double myfun(double a, double b) { std::cerr << " [TRACE] a = " << a << "\n"; std::cerr << " [TRACE] b = " << b << "\n"; double r = 3.0 * a + 5.0 * b; std::cerr << " [TRACE] result = " << r << "\n"; std::cerr << "---------------------------------\n"; return r; } void test_engine(MathEvaluator& engine, double& x) { std::string code = R"( // Define local variables var a := 2.0 / exp(x) * x^2 + y; var b := 10.0 * sqrt(x) + z; // println('\n => x = ', x); // println('\n => y = ', y); // Call custom function var k := myfun(x, y); // Comment: the last expression is returned 4.0 * a + 3 * b + 10 * z + k; )"; engine.compile(code); x = 3.0; std::cout << " => x = " << x << " ; engine = " << engine.value() << "\n"; x = 5.0; std::cout << " => x = " << x << " ; engine = " << engine.value() << "\n"; x = 15.0; std::cout << " => x = " << x << " ; engine = " << engine.value() << "\n"; x = -15.0; std::cout << " => x = " << x << " ; engine = " << engine.value() << "\n"; x = 20.0; std::cout << " => x = " << x << " ; engine = " << engine.value() << "\n"; std::string code2 = R"( // Vector/array variable var xs [6] := { 2.0, 10.2, -2.50, 9.256, 100.0, 25.0 }; var ys [6] := { -2.0, 1.225, -5.56, 19.000, 125.0, 125.0 }; println(' => xs =', xs); println(' => ys = ', ys); println(' => 3 * xs + 4 * ys = ', 3 * xs + 4 * ys); println(' => sum(xs) = ', sum(ys) ); println(' => sum(ys) = ', sum(xs) ); )"; engine.compile(code2); engine.value(); }
Building
$ git clone https://gist.github.com/2ff870b653b2ec1519bf0423165db1c5 gist && cd gist
$ cmake --config Debug -H. -B_build
$ cmake --build _build --target
Running
Running 'test' subcommand.
# Redirect std::cerr or stderr to file $ _build/main test 2> out.log [TRACE] I am up and running Ok. => x = 3 ; engine = 121.546 => x = 5 ; engine = 140.43 => x = 15 ; engine = 218.19 => x = -15 ; engine = -nan => x = 20 ; engine = 251.164 => xs = 2.00000 10.20000 -2.50000 9.25600 100.00000 25.00000 => ys = -2.00000 1.22500 -5.56000 19.00000 125.00000 125.00000 => 3 * xs + 4 * ys = -2.00000 35.50000 -29.74000 103.76800 800.00000 575.00000 => sum(xs) = 262.66500 => sum(ys) = 143.95600 # View log file $ _build/main test 2> out.log [TRACE] I am up and running Ok. => x = 3 ; engine = 121.546 => x = 5 ; engine = 140.43 => x = 15 ; engine = 218.19 => x = -15 ; engine = -nan => x = 20 ; engine = 251.164 => xs = 2.00000 10.20000 -2.50000 9.25600 100.00000 25.00000 => ys = -2.00000 1.22500 -5.56000 19.00000 125.00000 125.00000 => 3 * xs + 4 * ys = -2.00000 35.50000 -29.74000 103.76800 800.00000 575.00000 => sum(xs) = 262.66500 => sum(ys) = 143.95600
Running 'repl' subcommand for interactive shell.
rlwrap _build/main repl
[TRACE] I am up and running Ok.
EXPRTK $ >> sin(pi) * 2.0 + 5.3 * cos(3/2 * pi)
-7.28665e-16
EXPRTK $ >> exp(3.1)
22.198
EXPRTK $ >> var a := exp(2.5); var b := a * 3.0 + 10.0; 3 * a + b
83.095
EXPRTK $ >>
EXPRTK $ >> println(' x = ', x)
x = 1.00000
0
EXPRTK $ >> myfun(5.1, 2.56)
[TRACE] a = 5.1
[TRACE] b = 2.56
[TRACE] result = 28.1
---------------------------------
28.1
EXPRTK $ >>
EXPRTK $ >> myfun(5.1, 2.0 * pi + 10.0)
[TRACE] a = 5.1
[TRACE] b = 16.2832
[TRACE] result = 96.7159
---------------------------------
96.7159
EXPRTK $ >>
EXPRTK $ >> exit
[TRACE] Shutdown engine Ok.
1.5 Lua scripting engine
1.5.1 Overview
Lua (moon in Portuguese) is a ligthweight multi paradigm scripting language written in C. Due to Lua be available as small footprint library, this language is widely used as embedded scripting by many applications for scripting and as configuration or data description language.
Use cases:
- Scripting for C or C++ applications
- Extension language => Add new functionality and updates without recompilation.
- Provide interactive REPL or shell to C or C++ applications.
- Program configuration (settings)
- Data description language
Some applications using Lua:
- Nmap network scanner
- MediaWiki (engine used by Wikipedia)
- Nginx Web Server
- Redis Database
- Linux Conky
- LuaTex
- NetBSD
- Cheat Engine
- Geany text editor
- Xmake building system for C, C++, Objective-C, Swift, Assembly, Golang, Rust, Dlang and Cuda
- Lots of games use lua for scripting and allow non-programmers, such as end-users and game artists to contribute to the game development, create animations, movements, finite state machines and so on.
- … …
More at:
Repository and C++ Binding Libraries
- Official Repository Mirror
- Lua Bind
- Sol2 (means 'sun' 2 in Portuguese)
1.5.2 Further Reading
Documentation
- Lua 5.1 Reference Manual - Documentation
- Lua (programming language) - Wikipedia
Lua C API
- Programming in Lua : 24.1
- lua-users wiki: Binding Code To Lua
- lua-users wiki: Simple Lua Api Example
- Programming in Lua : 24
- Exposing C functions to Lua
Lua embedded in Geany Text Editor
- https://plugins.geany.org/geanylua/geanylua-intro.html
- https://github.com/geany/geany-plugins/tree/master/geanylua/examples
- https://github.com/DGivney/geany-lua-scripts
- Collection of Lua scripts for Geany text editor
- https://github.com/geany/geany-plugins/tree/master/geanylua
Lua scripting in NMap Network Scanner
- Script Language | Nmap Network Scanning
- "The core of the Nmap Scripting Engine is an embeddable Lua interpreter. Lua is a lightweight language designed for extensibility. It offers a powerful and well-documented API for interfacing with other software such as Nmap. The second part of the Nmap Scripting Engine is the NSE Library, which connects Lua and Nmap. This layer handles issues such as initialization of the Lua interpreter, scheduling of parallel script execution, script retrieval and more. It is also the heart of the NSE network I/O framework and the exception handling mechanism. It also includes utility libraries to make scripts more powerful and convenient. The utility library modules and extensions are described in the section called 'NSE Libraries'."
- Nmap Scripting Engine Source Code / GITHUB
- Extending Nmap With Lua - DEV
Lua Scripting in Wireshark - Network Capture Application
Lua Scripting in XMake Building System
Lua scripting on NetBSD Kernel
- Lua in teh NetBSD Kernel
- Lua Kernel Scripting in NetBSD - Fast, Flexible Packet Filtering
- Towards a Lua scripted operating system
- NPF scripting with Lua by Lourival Vieira Neto
Lua scripting on Redis Database
Lua scripting on Nginx Web Server
- Lua on NGINX web server
- Pushing Nginx to its limit with Lua - CloudFlare
- Implementing a Fileserver with Nginx and Lua
Lua scripting on Unreal Engine / game engine
- Learn Lua in 15 Minutes
- Lua: The Little Language That Could
- Roberto De Ioris - Scriptiamo Unreal Engine con Lua - Codemotion Rome…
- GitHub - rdeioris/LuaMachine: Unreal Engine 4 Plugin for Lua APIs implementation
- LuaMachine by Roberto De Ioris in Code Plugins - UE4 Marketplace
- "Unreal Engine 4 Plugin for adding Lua scripting to your projects: If you want modders to customize your game/project, or you need to allow game designers to script parts of the logic, this plugin is for you; Contrary to the other Unreal Engine 4 Lua plugins, this one does not try to expose the Unreal Engine 4 api, but completely hides it exposing to the user/scripter only the features the developer decided to include (via Blueprints or C++); Currently Windows 64bit, Mac, Linux x86_64 (both Runtime and Editor), Linux AArch64, Android 32bit, Android 64bit, iOS (Runtime only) are supported."
Lua for DSL - Domain Specific Language
- lua-users wiki: Lua Data Formats
- "Lua can be used as a language to represent data, not just as a general programming language. Different languages have been devised for different types of data representation in text format."
- Writing a DSL in Lua
- A visual DSL toolkit in Lua: Past, present and future
- Choosing Lua as the data description and configuration language
- Declarative Internal DSLs in Lua / A Game-Changing Experience
- Lua Programming/Tables - Wikibooks, open books for an open world
- Lua/Tables - Wikiversity
- Lua declarative programming basics / Sudo Null IT News
- The basics and design of lua table
Videos
- CppCon 2017: Andreas Weis "Howling at the Moon: Lua for C++ Programmers"
- "C++ is a great tool for solving complex problems in a thorough way. But every once in a while, the desire for a simpler language emerges. For those parts of our code where performance is of secondary concern, but the ability to perform rapid iterations over the code is paramount, a scripting language might be a tempting choice. But integrating a second language besides C++ and managing the interaction between the two is also scary. Lua is a lightweight, dynamic language that was designed to be used as an embedded language within existing applications. It is easy to learn, has very reasonable runtime performance, and a memory footprint small enough that it is usable even on embedded systems. Furthermore, it is almost trivial to integrate with C++. This talk will give a brief introduction to the Lua scripting language, highlighting specifically how it can complement C++'s language features to enrich a developer's toolbox. In the second part of the talk, we will look at Lua's C API and give suggestions how to integrate it with a modern C++17 codebase. In particular we will focus on how to interface with the dynamic language Lua without compromising the benefits of C++'s strong type system."
- CppCon 2018: JeanHeyd Meneide “Scripting at the Speed of Thought: Lua and C++ with sol3”
- "A big part of accelerating development and promoting collaboration often translates to deferring a lot of the typical programmer work to a scripting language, to allow for those with more design-oriented ideas and experience to handle some of the workload. What happens, then, when you have to bind a scripting language like Lua into C++ to allow for this workflow? This session is going to be all about how you enable non-developers and developers alike to rapidly increase their development productivity by turning routines and algorithms into data that can be shipped alongside your product in Lua. We will talk primarily how you can use the library sol2 to abstract away the muck of working with the Lua C API and instead focus on using and describing both Lua and C++ with each other in a simple manner. We will demonstrate some of the more interesting properties of sol such as Overloading Support, Container Support, Usertypes – C++ classes made available with automatic support for unique/shared pointers – and Tables. By the time this session is over, you will have a firm grasp over what a good Lua Binding can offer you, and how you can accelerate your C++ development with it."
General
1.5.3 Example project with Sol2 C++ binding library
This sample project builds a C++ statically linked executable embedding the Lua scripting engine using the Sol2 binding library, which is header only. Neither Sol2 nor Lua libraries need to be installed before building this sample project as the CMake scripts take care of downloading and building all dependencies.
Lua repostiory mirror:
Sol2 library repository:
Sol2 library documentation:
Sample Project
GIST:
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.0) project(duktape-cc-trial) set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) include(lua-lib.cmake) #----- Target Definitions ----------------------------# add_executable( embed-lua-sol embed-lua-sol.cpp) target_link_libraries( embed-lua-sol lua::lualib ) # Lua REPL executable built from static library liblua.a (Linux) # Note: the main() function is in the file main.c in the lua sources directory add_executable( lua-repl $<TARGET_OBJECTS:lua::lualib> ) target_link_libraries( lua-repl m pthread )
File: lua-lib.cmake
- CMake Script for downloading sol2 binding library and lua library sources.
include(FetchContent) # Note: the 'add_subriectory' line was commented becuyase # library that will be downloaded does not have # a CMakeListst.txt file at the root directory. macro(Download_Library_Git NAME TAG REPOSITORY_URL) FetchContent_Declare( ${NAME} GIT_REPOSITORY ${REPOSITORY_URL} GIT_TAG ${TAG} ) FetchContent_GetProperties(${NAME}) if(NOT cpputest_POPULATED) FetchContent_Populate(${NAME}) message("${NAME}_SOURCE_DIR} = ${${NAME}_SOURCE_DIR}") # => Disable following line: the library does not have a CMakeLists.txt # at the root directory. # add_subdirectory(${${NAME}_SOURCE_DIR} ${${NAME}_BINARY_DIR}) endif() endmacro() # ====>> Download Lua library <<==========================# Download_Library_Git( lua v5.3.5 https://github.com/lua/lua ) file(GLOB_RECURSE lua_sources "${lua_SOURCE_DIR}/*.c") file(GLOB_RECURSE lua_headers" ${lua_SOURCE_DIR}/*.h") message( [TRACE] " lua_SOURCE_DIR = ${lua_SOURCE_DIR} ") add_library( lua STATIC ${lua_sources} ${lua_headers} ) target_include_directories( lua PUBLIC ${lua_SOURCE_DIR} ) add_library( lua::lualib ALIAS lua) # ====>> Download Sol C++ binding library <<====================# FetchContent_Declare( sol2 GIT_REPOSITORY https://github.com/ThePhD/sol2 GIT_TAG v3.2.0 ) FetchContent_MakeAvailable( sol2 ) include_directories( ${sol2_SOURCE_DIR}/include )
File: xmake.lua
- Building script for XMake building system, which uses lua as embedded scripting language and as a building system DSL (Domain Specific Language) .
- Xmake packages: sol2, lua, luajit
add_rules("mode.debug", "mode.release") includes_lua = false add_requires("sol2 v3.2.1") target("embed-lua-sol") set_kind("binary") set_languages("c++17") add_files("./embed-lua-sol.cpp") add_packages("sol2")
File: embed-lua-sol.cpp
#include <iostream> #include <string> #include <vector> #include <algorithm> #include <sol/sol.hpp> class Counter { private: std::string m_name; int m_counter; public: // Ctor [1] => Default ctor Counter(): Counter("untitled", 0) { } // Ctor [2] Counter(std::string name, int counter) : m_name{std::move(name)}, m_counter{counter} { std::cout << " [TRACE] Counter created with => { " << " name = " << m_name << " ; counter = " << m_counter << " } \n"; } int getCounter() const { return m_counter; } void setCounter(int n) { m_counter = n; std::cout << " [TRACE] I was set to value " << n << std::endl; } void increment() { m_counter++; std::cout << " [TRACE] increment event =>> counter = { " << m_name << " ; " << m_counter << " } " << std::endl; } }; double add_fun(double a, double b) { std::cout << " [TRACE] addfun() => a = " << a << " ; b = " << b << std::endl; return a + b; } void sol_eval(sol::state& ctx, std::string code) { try { ctx.script( std::move(code) ); } catch ( sol::error const& ex) { std::cerr << " [SOL ERROR] " << ex.what() << "\n"; } } int main() { // Create an instance of lua Engine (aka virtual Machine) sol::state ctx{}; // Load basic libraries (such as print) ctx.open_libraries(sol::lib::base, sol::lib::coroutine, sol::lib::string, sol::lib::io); // Register function pointer ctx.set_function("add_fun", &add_fun); // Register lambda object ctx.set_function("make_vector", [](int n ) -> std::vector<int> { auto vec = std::vector<int>(n); for(int i = 0; i < n; i++ ) vec[i]= i * 3; return vec; }); // Set variables in the Lua engine ctx["points"] = 2000; ctx.set("character", "<Marcus Tulius Cicero>"); /* ===========>>> E X P E R I M E N T / 1 <<=========*/ std::puts("\n [EXPERIMENT 1] ==>> Evaluating string as code "); { // ===>>> Eval code as string <<=== // Throws exception sol::error ctx.script(R"( print(" [LUA] Hello world lua "); x = add_fun(10, 20); print("\n [LUA] result = " .. x); v = make_vector(5); print("\n [LUA] Printing a vector "); for i = 1, 5 do print(" -> v[" .. i .. " ] = " .. v[i] ); end print(" [LUA] VAR points = " .. points ); print(" [LUA] VAR character = " .. character ); )"); } /* ===========>>> E X P E R I M E N T / 2 <<=========*/ std::puts("\n\n [EXPERIMENT 2] ==>> Reading configuration "); { ctx.script(R"( -- Simulation of user configuration from script asset_path = "C:\\\\Users\\myuser\\data\\files\\"; user_credits = 2000; width = 200.561; )"); auto asset_path = ctx.get<std::string>("asset_path"); int user_credits = ctx["user_credits"]; std::cout << " [*] => asset_path = " << asset_path << "\n"; std::cout << " [*] => user_credits = " << user_credits << "\n"; } /* ===========>>> E X P E R I M E N T / 3 <<=========*/ std::puts("\n\n [EXPERIMENT 3] ==>> Register C++ classes "); struct StatefulFunctor { int state = 0; StatefulFunctor(int state): state(state){ } int operator()(){ std::cout << " *=>> [StatefulFunctor] My new state is = " << this->state << "\n"; return state++; } }; auto stateful = StatefulFunctor(10); ctx.set_function("stateful", stateful); ctx.script(R"( stateful(); stateful(); stateful(); )"); ctx.new_usertype<Counter>( // Class name "Counter" // --- Register methods ------ // ,"getCounter", &Counter::getCounter ,"setCounter", &Counter::setCounter ,"increment", &Counter::increment // --- Register properties ---- // , "value", sol::property( &Counter::getCounter , &Counter::setCounter) ); sol_eval(ctx, R"( print("\n ----->>> Calling C++ classes from Lua <----- "); -- Create new instance (object) of C++ class Counter counter = Counter.new(); counter:increment(); counter:increment(); counter:increment(); x = counter:getCounter(); print(" [*] value of counter is equal to = " .. x); counter.value = 2000; print(" [*] Counter value is equal to = " .. counter.value ); )"); Counter* ptr = ctx.get<Counter*>("counter"); std::cout << " [FROM C++] counter value = " << ptr->getCounter() << "\n"; return 0; }
Building => Debug build:
$ git clone https://gist.github.com/17a37d905d3d71c0ae66661a189481b5 lua-sol && cd lua-sol
$ cmake --config Debug -H. -B_build
$ cmake --build _build --target
Building => Release build:
$ git clone https://gist.github.com/17a37d905d3d71c0ae66661a189481b5 lua-sol && cd lua-sol
$ cmake --config Release -H. -B_build
$ cmake --build _build --target
Check executable:
# Confirm whether the executable is statically linked against LuaLib $ ldd _build/embed-lua-sol linux-vdso.so.1 (0x00007ffdf0fc4000) libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fe4490e6000) libm.so.6 => /lib64/libm.so.6 (0x00007fe448fa0000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe448f85000) libc.so.6 => /lib64/libc.so.6 (0x00007fe448dbb000) /lib64/ld-linux-x86-64.so.2 (0x00007fe4492fa000) # Static library $ file _build/liblua.a _build/liblua.a: current ar archive
Program output:
$ _build/embed-lua-sol [EXPERIMENT 1] ==>> Evaluating string as code [LUA] Hello world lua [TRACE] addfun() => a = 10 ; b = 20 [LUA] result = 30.0 [LUA] Printing a vector -> v[1 ] = 0 -> v[2 ] = 3 -> v[3 ] = 6 -> v[4 ] = 9 -> v[5 ] = 12 [LUA] VAR points = 2000 [LUA] VAR character = <Marcus Tulius Cicero> [EXPERIMENT 2] ==>> Reading configuration [*] => asset_path = C:\\Users\myuser\data\files\ [*] => user_credits = 2000 [EXPERIMENT 3] ==>> Register C++ classes *=>> [StatefulFunctor] My new state is = 10 *=>> [StatefulFunctor] My new state is = 11 *=>> [StatefulFunctor] My new state is = 12 ----->>> Calling C++ classes from Lua <----- [TRACE] Counter created with => { name = untitled ; counter = 0 } [TRACE] increment event =>> counter = { untitled ; 1 } [TRACE] increment event =>> counter = { untitled ; 2 } [TRACE] increment event =>> counter = { untitled ; 3 } [*] value of counter is equal to = 3 [TRACE] I was set to value 2000 [*] Counter value is equal to = 2000 [FROM C++] counter value = 2000
Run Lua repl executable (defined in CMake):
$ rlwrap _build/lua-repl Lua 5.3.5 Copyright (C) 1994-2018 Lua.org, PUC-Rio > > > print(" Hello world Lua / Luna / Moon REPL ") Hello world Lua / Luna / Moon REPL > for i = 1, 5 do print(" i = " .. i ) end i = 1 i = 2 i = 3 i = 4 i = 5 function myfunction(a, b) return math.sin(a) * math.exp(b) / a - a * b end > myfunction(3.5, 2.0) -7.7405591279893 function add (a) local sum = 0 for i,v in ipairs(a) do sum = sum + v end return sum end > add({ 2.5, 10.2, -2.51, 8.251, 10.56}) 29.001 function add (a) local sum = 0 for i,v in ipairs(a) do sum = sum + v end return su
Building and running with Xmake
Building with Xmake:
$ >> xmake build -P . -v [ 25%]: ccache compiling.release embed-lua-sol.cpp /usr/bin/gcc -c -m64 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -std=c++17 \ -isystem /home/user/.xmake/packages/s/sol2/v3.2.1/5fa1980e381b40ed942748bdef09488a/include \ -isystem /home/user/.xmake/packages/l/lua/v5.4.4/4de5c31814a64c0d9242d1e524082873/include/lua \ -DNDEBUG -o build/.objs/embed-lua-sol/linux/x86_64/release/embed-lua-sol.cpp.o embed-lua-sol.cpp [ 50%]: linking.release embed-lua-sol /usr/bin/g++ -o build/linux/x86_64/release/embed-lua-sol build/.objs/embed-lua-sol/linux/x86_64/release/embed-lua-sol.cpp.o \ -m64 -L/home/user/.xmake/packages/l/lua/v5.4.4/4de5c31814a64c0d9242d1e524082873/lib -s -llua -ldl -lm [100%]: build ok!
Generating a CMakeLists.txt file:
$ >> xmake project -v -k cmake -P . # Overrides CMakeLists.txt if it already exists configure { plat = linux buildir = build arch = x86_64 mode = release ccache = true ndk_stdcxx = true host = linux kind = static } create ok!
Running the application:
$ >> xmake run -P . [EXPERIMENT 1] ==>> Evaluating string as code [LUA] Hello world lua [TRACE] addfun() => a = 10 ; b = 20 [LUA] result = 30.0 [LUA] Printing a vector -> v[1 ] = 0 -> v[2 ] = 3 -> v[3 ] = 6 -> v[4 ] = 9 -> v[5 ] = 12 [LUA] VAR points = 2000 [LUA] VAR character = <Marcus Tulius Cicero> [EXPERIMENT 2] ==>> Reading configuration [*] => asset_path = C:\\Users\myuser\data\files\ [*] => user_credits = 2000 [EXPERIMENT 3] ==>> Register C++ classes *=>> [StatefulFunctor] My new state is = 10 *=>> [StatefulFunctor] My new state is = 11 *=>> [StatefulFunctor] My new state is = 12 ----->>> Calling C++ classes from Lua <----- [TRACE] Counter created with => { name = untitled ; counter = 0 } [TRACE] increment event =>> counter = { untitled ; 1 } [TRACE] increment event =>> counter = { untitled ; 2 } [TRACE] increment event =>> counter = { untitled ; 3 } [*] value of counter is equal to = 3 [TRACE] I was set to value 2000 [*] Counter value is equal to = 2000 [FROM C++] counter value = 2000
1.6 Squirrel Scripting Language
1.6.1 Overview
- Squirrel is a embedded scripting language, similar to Lua, but with C-like syntax, designed to be embedded in larger C or C++ applications such as game engines. Squirrel is written in C++, but it only exposes a C API, which makes binding C++ code cumbersome. However, there are many libraries which simplifies the embedding of squirrel in C++ codebases.
Official Web Site
Official Repository
Squirrel fork with a more C++-like syntax
Articles about squirrel language
- Squirrel Programming Guide | Dev Center
- Speaking UNIX: The Squirrel portable shell and scripting language
- http://wiki.ogre3d.org/Squirrel+Scripting+Language
Applications using Squirrel
- CodeBlocks IDE for C and C++
- OpenTTD Game - http://www.openttd.org/en/
Libraries for simplifying embedding squirrel in C++ code
Libraries for simplifying squirrel embedding in C++ code (binding C++ code):
1.6.2 Building Squirrel standalone REPL interpreter
Download and build:
$ mkdir ~/build && cd build
$ git clone https://github.com/albertodemichelis/squirrel
$ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug
$ cmake --build _build --target
Play with squirrel interactive shell (REPL):
$ _build/bin/sq Squirrel 3.1 stable Copyright (C) 2003-2017 Alberto Demichelis (64 bits) sq> print(" === Hello world Squirrel === ") === Hello world Squirrel === sq> function add_to_10(x){ return x + 10; } sq>print(add_to_10(25)) 35 sq> x <- cos(3.1415 / 2) + 10 sq>print(" x = " + x.tostring()) x = 10 sq> for(local i = 0; i < 5; i++) print(" \n [TRACE] i = " + i.tostring()); [TRACE] i = 0 [TRACE] i = 1 [TRACE] i = 2 [TRACE] i = 3 [TRACE] i = 4
1.6.3 Example - embedding Squirrel with Squall Library
This example demonstrates how to embed the Squirrel programming language in a C++ application using the Squall header-only library.
- Squal Repository: https://github.com/jonigata/squall
- Note: This project is self-contained, no library needs to be installed on the system as Squall automatically fetches Squirrel sources using Cmake FetchContent.
Sample Project
File: CMakeLists.txt
cmake_minimum_required (VERSION 3.11) project(squirrel_squall_test) set(CMAKE_CXX_STANDARD 17) set(CMAKE_CXX_STANDARD_REQUIRED on) set(BUILD_EXAMPLES off) # -----------------------------------------------# include(FetchContent) FetchContent_Declare( squall URL https://github.com/jonigata/squall/archive/master.zip ) FetchContent_MakeAvailable(squall) #-------- TARGET DEFINITIONS --------------------# message([TRACE] " squall_SOURCE_DIR = ${squall_SOURCE_DIR}/squall ") add_executable ( squirrel-test squirrel_test.cpp) target_link_libraries ( squirrel-test squirrel_static sqstdlib_static) target_include_directories ( squirrel-test PUBLIC ${squirrel_SOURCE_DIR}/include ${squall_SOURCE_DIR} )
File: squirrel_test.cpp
#include <iostream> #include <string> #include <algorithm> #include <vector> #include <squall/squall_vmstd.hpp> #include <squall/squall_klass.hpp> void some_cpp_fun(int n ) { for(int i = 0; i < n; i++) std::printf("\n [some_cpp_function] => i = %d ", i); } class ChartXY { int m_width; int m_height; public: ChartXY(): m_width(20), m_height(50) { std::cout << " [ChartXY] Ctor() - I was created!. OK. " << std::endl; } void set_width(int x){ m_width = x; } void set_height(int x){ m_height = x; } void draw() const { std::printf(" [ChartXY] draw() => Draw chart with: width = %d ; height = %d" , m_width, m_height); } }; int main() { // Create a virtual-machine object for Squirrel language // Note: throws squall::squirrel_error squall::VMStd vm; // ------------------------------------------------------------------------// // [EXPERIMENT 1] Evaluate scripts provided as strings // //-------------------------------------------------------------------------// std::puts(" =>> [EXPERIMENT] 1 - Evaluating code as string. "); std::puts(" ---------------------------------------------\n"); try { vm.dostring(R"( // --- Squirrel Comment ----- // print(" <SQUIRREL> =>> Hello world squirrel!"); function myfunc(x) { local a = x + 5; local b = 7 * a + x; return b - a + 10; } function myfunc2() { print(" \n <SQUIRREL> I was called by the C++ code "); } print("\n <SQUIRREL> =>> myfunc(4) = " + myfunc(4).tostring() ); print("\n\n <SQUIRREL> --- For Loop test ---- "); for(local i = 0; i < 5; i++) { print("\n i = " + i.tostring() ); } )"); } catch( squall::squirrel_error const& ex ) { std::cerr << "\n [SQUIRREL ERROR] Error => " << ex.what() << std::endl; } // ------------------------------------------------------------------------// // [EXPERIMENT 2] Evaluate scripts provided as strings // //-------------------------------------------------------------------------// std::puts("\n =>> [EXPERIMENT] 2 - Getting variables defined in the code."); std::puts(" -----------------------------------------------------------\n"); { vm.dostring(R"( // ---- Global varibles for configuration ------ // ::myvar_width <- 100; ::myvar_float <- 122.56161; ::myvar_string <- "/path/to/interpreter.exe"; )"); squall::TableBase table = vm.root_table(); auto myvar_float = table.get<float>("myvar_float"); auto myvar_width = table.get<int>("myvar_width"); auto myvar_string = table.get<std::string>("myvar_string"); std::cout << " =>> myvar_width = " << myvar_width << std::endl; std::cout << " =>> myvar_float = " << myvar_float << std::endl; std::cout << " =>> myvar_string = " << myvar_string << std::endl; } // ------------------------------------------------------------------------// // [EXPERIMENT 3] Call functions defined in the script (Virtual Machine ) // //-------------------------------------------------------------------------// std::puts("\n\n =>> [EXPERIMENT] 3 - Calling functions defined in the script (VM)"); std::puts(" ------------------------------------------------------------------\n"); // Throws: 'squall::squirrel_error' int result = vm.call<int>("myfunc", 10); std::cout << " =>>> myfunc(4) = " << result << std::endl; vm.call<void>("myfunc2"); // ------------------------------------------------------------------------// // [EXPERIMENT 4] Call C++ functions from the script // //-------------------------------------------------------------------------// std::puts("\n\n =>> [EXPERIMENT] 4 - Calling functions defined in the script (VM)"); std::puts(" ------------------------------------------------------------------\n"); // Register C++ function pointer vm.defun("some_cpp_fun", &some_cpp_fun); vm.dostring(R"( print(" \n [SQUIRREL] => Call C++ function some_cpp_fun() "); some_cpp_fun(5); )"); // Register C++ lambda object vm.defun("call_me", [=](std::string const& param) { std::cout << "\n [TRACE] call_me() Parameter = " << param << "\n"; return " name = " + param; }); vm.dostring(R"( local x = call_me("<SQUIRREL-INTERPRETER>"); print(" [SQUIRREL] \n x <- " + x); )"); // ------------------------------------------------------------------------// // [EXPERIMENT 5] Call C++ classes from Squirrel-side // //-------------------------------------------------------------------------// std::puts("\n\n =>> [EXPERIMENT] 5 - Calling C++ classes from Squirrel "); std::puts(" ------------------------------------------------------------------\n"); // Create metaobject 'k' that describes ChartXY class squall::Klass<ChartXY> k(vm, "ChartXY"); k.func("set_width", &ChartXY::set_width); k.func("set_height", &ChartXY::set_height); k.func("draw", &ChartXY::draw); vm.dostring(R"( function manipulate_chart(ch){ ch.set_width(25); ch.set_height(10); ch.draw(); } function draw_with(ch, w, h) { print(" \n [SQUIRREL LOG] Function draw_with called. OK. \n"); ch.set_width(w); ch.set_height(h); ch.draw(); } )"); ChartXY mychart; vm.call<void>("manipulate_chart", &mychart); vm.call<void>("draw_with", &mychart, 100, 200); std::cout << "\n\n"; squall::TableBase table = vm.root_table(); // Pass object to Squirrel side table.set("mychart", mychart); vm.dostring(R"( mychart.set_width(250); mychart.set_width(600); mychart.draw(); )"); #if 0 // Segmentation Falt Coredump if the C++ object // is created on the Squirrel-side. vm.dostring(R"( local c = ChartXY(); c.set_width(150); c.set_height(175); c.draw(); )"); #endif return 0; }
Build:
$ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug $ cmake --build _build --target
Check executable dependencies:
$ ldd _build/squirrel-test linux-vdso.so.1 (0x00007ffe34dd6000) libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fd3ebd2a000) libm.so.6 => /lib64/libm.so.6 (0x00007fd3ebbe4000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fd3ebbc9000) libc.so.6 => /lib64/libc.so.6 (0x00007fd3eb9ff000) /lib64/ld-linux-x86-64.so.2 (0x00007fd3ebf3e000)
Run application:
$ _build/squirrel-test =>> [EXPERIMENT] 1 - Evaluating code as string. --------------------------------------------- <SQUIRREL> =>> Hello world squirrel! <SQUIRREL> =>> myfunc(4) = 68 <SQUIRREL> --- For Loop test ---- i = 0 i = 1 i = 2 i = 3 i = 4 =>> [EXPERIMENT] 2 - Getting variables defined in the code. ----------------------------------------------------------- =>> myvar_width = 100 =>> myvar_float = 122.562 =>> myvar_string = /path/to/interpreter.exe =>> [EXPERIMENT] 3 - Calling functions defined in the script (VM) ------------------------------------------------------------------ =>>> myfunc(4) = 110 <SQUIRREL> I was called by the C++ code =>> [EXPERIMENT] 4 - Calling functions defined in the script (VM) ------------------------------------------------------------------ [SQUIRREL] => Call C++ function some_cpp_fun() [some_cpp_function] => i = 0 [some_cpp_function] => i = 1 [some_cpp_function] => i = 2 [some_cpp_function] => i = 3 [some_cpp_function] => i = 4 [TRACE] call_me() Parameter = <SQUIRREL-INTERPRETER> [SQUIRREL] x <- name = <SQUIRREL-INTERPRETER> =>> [EXPERIMENT] 5 - Calling C++ classes from Squirrel ------------------------------------------------------------------ [ChartXY] Ctor() - I was created!. OK. [ChartXY] draw() => Draw chart with: width = 25 ; height = 10 [SQUIRREL LOG] Function draw_with called. OK. [ChartXY] draw() => Draw chart with: width = 100 ; height = 200 [ChartXY] draw() => Draw chart with: width = 600 ; height = 200
1.7 Duktape - Embeddable Javascript Engine
1.7.1 Overview
- Duktape is a small footprint embeddable Javascript (ECMAScript) engine, written in C, which can be used for providing scripting capabilities for C or C++ applications.
- License: MIT
- Possible Use Cases:
- Configuration
- Data description language
- User plugins
- User extensions
- Scripting for games
- Features:
- Embeddable, portable, compact: can run on platforms with 160kB flash and 64kB RAM
- Built-in debugger
- Built-in regular expression engine
- Minimal, retargetable platform dependencies
- Combined reference counting and mark-and-sweep garbage collection with finalization
- Bytecode dump/load for caching compiled functions
- Distributable includes an optional logging framework, CommonJS-based module loading implementations, etc
Official Website
Official Repository
C++ Binding Libraries
1.7.2 Example project with DukGlue C++ binding library
This following project CMakeLists.txt automatically downloads dukglue binding library and duktape engine sources and builds a C++ demonstration code embedding duktape JavaScript engine.
- DukGlue Binding Library: https://github.com/Aloshi/dukglue
- Advantage:
- Easy to use and lots of examples.
- Drawbacks:
- Lack of namespaces which enhances API discoverability
- Lack of C++ wrappers to some Duktape C-types
- Lack of a CMakeLists.txt at the top directory.
- Advantage:
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.0) project(duktap-embed) include(FetchContent) # Download library archive (zip, *.tar.gz, ...) from URL macro(Download_Library_Url NAME URL) FetchContent_Declare(${NAME} URL ${URL}) FetchContent_GetProperties(${NAME}) if(NOT ${NAME}_POPULATED) FetchContent_Populate(${NAME}) # add_subdirectory(${${NAME}_SOURCE_DIR} ${${NAME}_BINARY_DIR}) endif() endmacro() # ====>> Duktape JavaScript Engine Configuration <<===========# Download_Library_Url(duktape "https://duktape.org/duktape-2.5.0.tar.xz" ) # FetchContent_MakeAvailable(duktape) message( [TRACE] " =>> duktape_SOURCE_DIR = ${duktape_SOURCE_DIR} ") file(GLOB_RECURSE duktape_sources "${duktape_SOURCE_DIR}/src/*.c") file(GLOB_RECURSE duktape_headers "${duktape_SOURCE_DIR}/src/*.h") message( [TRACE] " duktape_sources = ${duktape_sources} ") add_library (duktape ${duktape_sources} ${duktape_headers} ) target_include_directories(duktape PUBLIC ${duktape_SOURCE_DIR}/src ) # ----------- DukGlue Library ----------------------------# FetchContent_Declare( dukglue URL https://github.com/Aloshi/dukglue/archive/master.zip ) FetchContent_MakeAvailable(dukglue) #----- Main Target Definition ----------------------------# add_executable(duktape-embed duktape-embed.cpp) target_link_libraries(duktape-embed duktape dukglue)
File: duktape-embed.cpp
#include <iostream> #include <string> #include <vector> #include <cassert> // Repository: https://github.com/Aloshi/dukglue #include <dukglue/dukglue.h> void print_number(int x) { std::cout << " [TRACE] number passed is = " << x << std::endl; } void log_text(std::string const& text) { std::cout << " =>> [C++-LOG] - " << text << std::endl; } int eval_code(duk_context* ctx, std::string const& code) { return duk_peval_string(ctx, code.c_str()); } void plot_points(std::vector<float> const& points) { std::cout << " =>> [TRACE] Plot points =>> "; for(auto const& x: points) { std::cout << " x = " << x; } std::cout << " \n"; } class Counter { private: std::string m_name; int m_counter; public: // Ctor [1] => Default ctor Counter(): Counter("untitled", 0) { } // Ctor [2] Counter(std::string name, int counter) : m_name{std::move(name)}, m_counter{counter} { std::cout << " [TRACE] Counter created with => { " << " name = " << m_name << " ; counter = " << m_counter << " } \n"; } int getCounter() const { return m_counter; } void setCounter(int n) { m_counter = n; std::cout << " [TRACE] I was set to value " << n << std::endl; } void increment() { m_counter++; std::cout << " [TRACE] increment event =>> counter = { " << m_name << " ; " << m_counter << " } " << std::endl; } }; int main() { // Create Duktape Virtual machine duk_context* ctx = duk_create_heap_default(); /* ========================== EXPERIMENT 1 =============*/ std::puts("\n === [EXPERIMENT 1] ==>> Register and call C++ functions <<===== "); { // Register pointer to functions function (function pointer) in // the JS engine (aka virtual machine) dukglue_register_function(ctx, &print_number, "print_number"); dukglue_register_function(ctx, log_text, "log_text"); dukglue_register_function(ctx, plot_points, "plot_points"); const char* code1 = R"( print_number(10); log_text(" Hello world from Javascript" ); log_text(" Toke is equal to " + 100 ); log_text( " " + 1000 ); plot_points( [ 20.5, 100.23, -125.254, 8.251, 100.0 ]); )"; // Evaluate code, returns false on error auto n = eval_code(ctx, code1); if(n) { std::cerr << " [ERROR] A duktape evaluation error has happened. " << std::endl; } } /* ========================== EXPERIMENT 2 ====================*/ std::puts("\n === [EXPERIMENT 2] ==>> Register and call C++ classes <<===== \n"); { // Register class counter dukglue_register_constructor<Counter>(ctx, "Counter"); dukglue_register_constructor<Counter, std::string, int>(ctx, "Counter"); dukglue_register_method(ctx, &Counter::getCounter , "getCounter"); dukglue_register_method(ctx, &Counter::setCounter , "setCounter"); dukglue_register_method(ctx, &Counter::increment , "increment"); dukglue_register_property(ctx // Pointer to engine (VM) , &Counter::getCounter // Getter , &Counter::setCounter // Setter , "number" // Property name ); int ret = eval_code(ctx, R"( var counter = new Counter("mycounter", 10); for(i = 0 ; i < 5; i++) { counter.increment(); } var n = counter.getCounter(); log_text(" [BEFORE] Counter value = " + n ); counter.setCounter(100); log_text(" [AFTER 1 ] Counter value = " + counter.getCounter() ); counter.number = 400; log_text(" [AFTER 2] Counter value = " + counter.number ); )"); assert( ret == 0 ); } /* ======= Calling Javascript Engine from C++ ====================*/ // Note: It is useful for reading data or user configuration std::puts("\n === [EXPERIMENT 3] ==>> Calling engine objects from C++ <<===== \n"); { const char* code = R"( // Global variables for configuration points = 200; asset_path = "C:\\\\Users\\dummy\\data\\graphics"; function my_js_function(n){ log_text( " <my_js_function> =>> n = " + n ); var k = 20 * n + 100; return k; } )"; eval_code(ctx, code); // Throws error: DukErrorException auto points = dukglue_peval<int>(ctx, "points"); std::cout << " [*] =>> points = " << points << std::endl; // Throws error: DukErrorException auto asset_path = dukglue_peval<std::string>(ctx, "asset_path"); std::cout << " [*] =>> asset_path = " << asset_path << std::endl; auto jsexpr = dukglue_peval<double>(ctx, "3.51 * 10.52 - 8.251 / 100"); std::cout << " [*] jsexpr = " << jsexpr << std::endl; // Call Javascript function from C++ auto func = dukglue_peval<DukValue>(ctx, "my_js_function"); int res = dukglue_pcall<int>(ctx, func, 20); std::cout << " [*] res = " << res << std::endl; } // Release Javascript engine object (aka virtual machine) ::duk_destroy_heap(ctx); return 0; }
Build:
$ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug $ cmake --build _build --target
Program output:
$ ./build/duktape-embed
=== [EXPERIMENT 1] ==>> Register and call C++ functions <<=====
[TRACE] number passed is = 10
=>> [C++-LOG] - Hello world from Javascript
=>> [C++-LOG] - Toke is equal to 100
=>> [C++-LOG] - 1000
=>> [TRACE] Plot points =>> x = 20.5 x = 100.23 x = -125.254 x = 8.251 x = 100
=== [EXPERIMENT 2] ==>> Register and call C++ classes <<=====
[TRACE] Counter created with => { name = mycounter ; counter = 10 }
[TRACE] increment event =>> counter = { mycounter ; 11 }
[TRACE] increment event =>> counter = { mycounter ; 12 }
[TRACE] increment event =>> counter = { mycounter ; 13 }
[TRACE] increment event =>> counter = { mycounter ; 14 }
[TRACE] increment event =>> counter = { mycounter ; 15 }
=>> [C++-LOG] - [BEFORE] Counter value = 15
[TRACE] I was set to value 100
=>> [C++-LOG] - [AFTER 1 ] Counter value = 100
[TRACE] I was set to value 400
=>> [C++-LOG] - [AFTER 2] Counter value = 400
=== [EXPERIMENT 3] ==>> Calling engine objects from C++ <<=====
[*] =>> points = 200
[*] =>> asset_path = C:\\Users\dummy\data\graphics
[*] jsexpr = 36.8427
=>> [C++-LOG] - <my_js_function> =>> n = 20
[*] res = 500
1.7.3 Example project with Duktape-CC binding library
- Duktape-CC binding library: https://github.com/stfwi/duktape-cc/
- Benefits
- Namespace
- RAII for duktape C-API
- Javascript common known APIs such as console.log()
- Disadvantage:
- No possible to bind lambda function.
- No possible to bind C++ classes or objects
- No CMakeLists.txt at top directory, which makes the library usage easier, but the following cmake scripts solves this problem.
- Benefits
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.0) project(duktape-cc-trial) set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) include(duktape.cmake) #----- Main Target Definition ----------------------------# add_executable( duktape-script duktape-script.cpp) target_link_libraries( duktape-script duktape-cc )
File: duktape.cmake
include(FetchContent) # Note: the 'add_subriectory' line was commented becuyase # library that will be downloaded does not have # a CMakeListst.txt file at the root directory. macro(Download_Library_Git NAME TAG REPOSITORY_URL) FetchContent_Declare( ${NAME} GIT_REPOSITORY ${REPOSITORY_URL} GIT_TAG ${TAG} ) FetchContent_GetProperties(${NAME}) if(NOT cpputest_POPULATED) FetchContent_Populate(${NAME}) message("${NAME}_SOURCE_DIR} = ${${NAME}_SOURCE_DIR}") # => Disable following line: the library does not have a CMakeLists.txt # at the root directory. # add_subdirectory(${${NAME}_SOURCE_DIR} ${${NAME}_BINARY_DIR}) endif() endmacro() # ====>> Duktape JavaScript Engine Configuration <<===========# Download_Library_Git( duktape-cc 51fed200b0c3353a60fa560aa8a13a480f0ec0c7 https://github.com/stfwi/duktape-cc/ ) file(GLOB_RECURSE duktape_sources "${duktape-cc_SOURCE_DIR}/duktape/*.c") file(GLOB_RECURSE duktape_headers1 "${duktape-cc_SOURCE_DIR}/duktape/*.hh") file(GLOB_RECURSE duktape_headers2 "${duktape-cc_SOURCE_DIR}/duktape/*.h") add_library( duktape-cc ${duktape_sources} ${duktape_headers1} ${duktape_headers2} ) target_include_directories( duktape-cc PUBLIC ${duktape-cc_SOURCE_DIR} )
File: duktape-script.cpp
#include <iostream> #include <string> #include <vector> #include <fstream> #include <duktape/duktape.hh> #include <duktape/mod/mod.stdio.hh> int main() { std::puts(" [TRACE] Program started Ok. "); // Create Duktape Engine object (Virtual Machine) auto ctx = duktape::engine{}; // Load all functions from stdio module // ==> Note: It is necessary for console.log() work duktape::mod::stdio::define_in(ctx); std::puts("\n [EXPERIMENT 1] ======= Evaluate string as code ========"); ctx.eval<void>(R"( console.log(" [INFO] Hello world Javascript Engine "); var i = 0; while(i < 5) { console.log(" [TRACE] <ducktape> i = " + i); i++; } )"); std::puts("\n [EXPERIMENT 2] == Read/write values to the engine the engine ="); // Write or pass values to the engine. ctx.define("app.version", "0.251"); ctx.define("user.points", 1000); ctx.define("array1", std::vector<double>{ 4.51, 9.25, -25.154, 205.2 }); ctx.define("array2", std::vector<std::string>{ "C++", "ADA-Spark", "Rust", "Dlang", "OCaml" }); std::string script_file = "/tmp/myscript.js"; const char* script_code = R"( console.log(" => app.version = " + app.version ); console.log(" => user.points = " + user.points ); console.log(" => array1 = " + array1); console.log(" => array2 = " + array2); myconfig_path = "/Users/data/osx/config"; user_credits = 1020; vector = [100.25, 90.251, -120.5150]; )"; // Write script code to file auto fs = std::ofstream(script_file); // Flush forces writing to the IO fs << script_code << std::flush; // Execute script from file ctx.include(script_file); std::cout << " ---- Read configuration from file " << std::endl; // Throws exception: duktape::detail::basic_script_error<void> auto myconfig_path = ctx.eval<std::string>("myconfig_path"); auto credits = ctx.eval<int>("user_credits"); auto vec = ctx.eval<std::vector<double>>("vector"); std::cout << "\n\n[*] my_config_path = " << myconfig_path << "\n"; std::cout << "[*] user_credits = " << credits << "\n"; std::cout << "[*] vec[0] = " << vec[0] << " ; vec[1] = " << vec[1] << "\n"; return 0; }
Build:
$ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug $ cmake --build _build --target
Program output:
$ _build/duktape-script [TRACE] Program started Ok. [EXPERIMENT 1] ======= Evaluate string as code ======== [EXPERIMENT 2] == Read/write values to the engine the engine = [INFO] Hello world Javascript Engine [TRACE] <ducktape> i = 0 [TRACE] <ducktape> i = 1 [TRACE] <ducktape> i = 2 [TRACE] <ducktape> i = 3 [TRACE] <ducktape> i = 4 => app.version = 0.251 => user.points = 1000 => array1 = 4.51,9.25,-25.154,205.2 => array2 = C++,ADA-Spark,Rust,Dlang,OCaml ---- Read configuration from file [*] my_config_path = /Users/data/osx/config [*] user_credits = 1020 [*] vec[0] = 100.25 ; vec[1] = 90.251
1.8 QuickJS - ES20 Javascript Engine
QuickJS is small and lightweight embeddable Javascript engine, written in C, which supports ES2020 technical specification. This engine was created by Fabrice Bellard, creator of many widely used open source projects, namely, QEMU emulator used for emulation of operating systems and embedded systems hardware; FFmpeg tool for video and audio conversion; TCC (Tiny C Compiler). Some features supported by the engine are: modules, proxies, BigInt and asynchronous generators.
- QuickJS Web Site:
- QuickJS Repository:
- QuickJSpp C++ Wrapper Repository (License: CC0)
See also:
Sample CMake project
The following sample CMake projects demonstrates how to embed QuickJS engine in a C++ code by using the QuickJSpp C++ wrapper. The CMake script (CMakeLists.txt) downloads the QuickJS source code from its repository and creates static library target for the JavaScript engine which is then linked against the sample application embedding QuickJS. This CMakeLists.txt script fully automates all steps, which relieves the library user from installing QuickJS manually by using GNU make, which is the original building system used by the engine.
GIST Containing all sources:
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.9) project(QuickJS-Experiment) #========== Global Configurations =============# #----------------------------------------------# set( CMAKE_CXX_STANDARD 17 ) set( CMAKE_VERBOSE_MAKEFILE ON ) set( CMAKE_CXX_EXTENSIONS OFF) # ------------ Download CPM CMake Script ----------------# ## Automatically donwload and use module CPM.cmake file(DOWNLOAD https://raw.githubusercontent.com/TheLartians/CPM.cmake/v0.26.2/cmake/CPM.cmake "${CMAKE_BINARY_DIR}/CPM.cmake") include("${CMAKE_BINARY_DIR}/CPM.cmake") #----------- Add dependencies --------------------------# CPMAddPackage( NAME quickjs GITHUB_REPOSITORY bellard/quickjs GIT_TAG 204682fb87ab9312f0cf81f959ecd181180457bc # DOWNLOAD_ONLY YES ) # Add this directory where is this file (CMakeLists.txt) to include path. include_directories( ${CMAKE_CURRENT_LIST_DIR} ) # =============== QuickJS settings ====================================# include_directories( ${quickjs_SOURCE_DIR}/ ) message([TRACE] " quickjs source = ${quickjs_SOURCE_DIR} ") file(GLOB quickjs_hpp ${quickjs_SOURCE_DIR}/*.h ) file(GLOB quickjs_src ${quickjs_SOURCE_DIR}/quickjs.c ${quickjs_SOURCE_DIR}/libregexp.c ${quickjs_SOURCE_DIR}/libunicode.c ${quickjs_SOURCE_DIR}/cutils.c ${quickjs_SOURCE_DIR}/quickjs-libc.c ${quickjs_SOURCE_DIR}/libbf.c ) add_library( qjs-engine ${quickjs_src} ${quickjs_hpp} ) target_compile_options( qjs-engine PRIVATE -MMD -MF -Wno-sign-compare -Wno-missing-field-initializers -Wundef -Wuninitialized -Wundef -Wuninitialized -Wwrite-strings -Wchar-subscripts ) target_compile_definitions( qjs-engine PUBLIC CONFIG_BIGNUM=y CONFIG_VERSION="2020-11-08" _GNU_SOURCE ) if(UNIX) target_link_libraries( qjs-engine PRIVATE m pthread dl) endif() # =========== Target Settings =========================================# # QuickJS compiler. add_executable( qjsc ${quickjs_SOURCE_DIR}/qjsc.c ) target_compile_definitions( qjsc PUBLIC CONFIG_BIGNUM=y CONFIG_VERSION="2020-11-08" _GNU_SOURCE ) target_link_libraries( qjsc qjs-engine ) # Sample application that embeds the quickJS Javascript engine. add_executable( main main.cpp ) target_link_libraries( main qjs-engine )
File: main.cpp
#include <iostream> #include <quickjspp.hpp> class ChartXY { private: double x = 0.0, y = 0.0; double width = 100.0, height = 100.0; public: ChartXY() { } ChartXY(double w, double h): width(w), height(h) { } void show() const { std::cout << " [ĆhartXY Object] x = " << x << " ; y = " << y << " ; width = " << width << " height = " << height << '\n'; } void set_width(double width) { this->width = width; std::fprintf(stdout, " [ChartXY] Width set to %f \n", width); } void set_height(double height) { this->height = height; std::fprintf(stdout, " [ChartXY] Height set to %f \n", height); } double get_height() const { return this->height; } double get_width () const { return this->width; } void plot_points(std::vector<double> const& points) { std::cout << " [ChartXY] Plotting points =>> "; for(auto p : points) { std::cout << " " << p; } std::cout << "\n"; } }; qjs::Value try_eval_module( qjs::Context& context , qjs::Runtime& runtime , std::string const& code) { try { return context.eval(code, "<eval>", JS_EVAL_TYPE_MODULE); } catch( const qjs::exception& ex) { //js_std_dump_error(ctx); auto exc = context.getException(); std::cerr << (exc.isError() ? "Error: " : "Throw: ") << (std::string)exc << std::endl; if((bool)exc["stack"]) std::cerr << (std::string)exc["stack"] << std::endl; js_std_free_handlers(runtime.rt); return context.newObject(); } } int main(int argc, char** argv) { std::cout << " [INFO] Started Ok" << std::endl; using namespace qjs; Runtime runtime; //JSRuntime* rt = runtime.rt; Context context(runtime); //JSContext* ctx = context.ctx; js_std_init_handlers(runtime.rt); /* loader for ES6 modules */ JS_SetModuleLoaderFunc(runtime.rt, nullptr, js_module_loader, nullptr); js_std_add_helpers(context.ctx, argc - 1, argv + 1); /* system modules */ js_init_module_std(context.ctx, "std"); js_init_module_os(context.ctx, "os"); std::fprintf(stderr, " [TRACE] Before loading code. \n"); const char* str = R"( /* import * as std from 'std'; import * as os from 'os'; globalThis.std = std; globalThis.os = os; */ console.log(" [QUICJS] => =>> Script loaded. Ok. \n"); for(n = 1; n <= 5; n++){ console.log(` [QUICKJS-TRACE] n = ${n}/5 `); } // ----- Define user variables here ---- asset_path = "/Users/mydir-macosx/data/blackjack.txt"; game_score = 0.25156; let x = 10.352; datapoints = [ 0.251, 19.2363, 9.262, 100.125 ]; console.log(`\n [QUICKJS] asset_path = ${asset_path}` ); console.log(` [QUICKJS] score = ${100.0 * game_score} (in percent) \n`); console.log(` [QUICKJS] data points = ${datapoints} `) )"; try { context.eval(str); //, "", JS_EVAL_TYPE_MODULE); } catch( const qjs::exception& ex) { //js_std_dump_error(ctx); auto exc = context.getException(); std::cerr << (exc.isError() ? "Error: " : "Throw: ") << (std::string)exc << std::endl; if((bool)exc["stack"]) std::cerr << (std::string)exc["stack"] << std::endl; js_std_free_handlers(runtime.rt); return 1; } std::fprintf(stderr, " [TRACE] After loading code. \n"); int number = (int) context.eval(" 10 * (3 + 1 + 10 ) - 1000 * 2"); std::cout << " [RESULT] number = " << number << '\n'; std::puts("\n [*] ===== Read configuration variables defined in the js code. ====\n"); { auto var_asset_path = context.global()["asset_path"].as<std::string>(); std::cout << " =>> asset_path = " << var_asset_path << '\n'; auto score = context.global()["game_score"].as<double>(); std::cout << " =>> game_score (%) = " << 100.0 * score << '\n'; auto points = context.global()["datapoints"].as<std::vector<double>>(); std::cout << " ==>> datapoints = [" << points.size() << "]( "; for(auto p : points) { std::cout << p << ' '; } std::cout << " ) \n"; } std::puts("\n [*] ===== Define variables in C++-side ====\n"); { context.global()["user_name"] = context.newValue("Gaius Julius Caesar"); context.global()["user_points"] = context.newValue(101235); auto data = std::vector<std::string>{ "ADA", "RUST", "C++11", "C++17", "C++20" , "Dlang", "OCaml", "C#(Csharp)" }; context.global()["user_data"] = context.newValue(data); // Note: This code should be within an exception handler. context.eval(R"( console.log(` [STEP 2] user_name = ${user_name} ; points = ${user_points} `); console.log(` [STEP 2] user_data = ${user_data} ; type = ${ typeof(user_data) } `); console.log(` [STEP 2] user_data[5] = ${ user_data[5] } `) // Iterate over the array for(let x in user_data){ console.log(user_data[x]); } )"); } std::puts("\n [*] ===== Register class ChartXY ====\n"); auto& module = context.addModule("chart"); module.class_<ChartXY>("ChartXY") .constructor() .constructor<double, double>() .fun<&ChartXY::show>("show") .fun<&ChartXY::set_height>("set_height") .fun<&ChartXY::set_width>("set_width") .fun<&ChartXY::plot_points>("plot_points") .property<&ChartXY::get_width, &ChartXY::set_width>("width") .property<&ChartXY::get_height, &ChartXY::set_height>("height") ; module.add("user_path", "/Users/data/assets/game/score/marks"); module.add("user_points", 1023523); module.function("myfunc", [](double x, double y){ return 4.61 * x + 10 * y * y; }); const char* module_code = R"( import { ChartXY } from "chart"; import * as chart from "chart" console.log(` [SCRIPT] chart.user_path = ${chart.user_path} \n\n`); console.log(` [SCRIPT] chart.user_points = ${chart.user_points} \n\n`); console.log(` [SCRIPT] Result = ${ chart.myfunc(5.61, 9.821) } \n`); let ch = new ChartXY(200, 600); ch.show(); ch.set_width(800.0); ch.set_height(700.0) ch.show(); console.log(" [QUICKJS] Change chart dimensions using properties "); ch.width = 500; ch.height = 660; console.log(`\n <QUICKJS> Chart width = ${ch.width} ; Chart height = ${ch.height} \n`); ch.plot_points( [ 10.522, 8.261, -100.24, 7.2532, 56.123, 89.23 ] ); )"; try_eval_module(context, runtime, module_code); js_std_loop(context.ctx); // ----- Shutdown virtual machine ---------------// js_std_free_handlers(runtime.rt); return 0; }
Building and Running
Download sources:
$ git clone https://gist.github.com/1abdd1d36cd3e973cd1f11f5c20ef7eb qqjs && cd qqjs
$ ls
CMakeLists.txt main.cpp quickjspp.hpp
Building:
$ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug $ cmake --build _build --target
Running qjsc (QuickJS - transpiler or C code generator)
$ _build/qjsc QuickJS Compiler version 2020-11-08 usage: qjsc [options] [files] options are: -c only output bytecode in a C file -e output main() and bytecode in a C file (default = executable output) -o output set the output filename -N cname set the C name of the generated data -m compile as Javascript module (default=autodetect) -D module_name compile a dynamically loaded module or worker -M module_name[,cname] add initialization code for an external C module -x byte swapped output -p prefix set the prefix of the generated C names -S n set the maximum stack size to 'n' bytes (default=262144)
Running main application, which embeds QuickJS JS engine:
$ >> _build/main [INFO] Started Ok [TRACE] Before loading code. [QUICJS] => =>> Script loaded. Ok. [QUICKJS-TRACE] n = 1/5 [QUICKJS-TRACE] n = 2/5 [QUICKJS-TRACE] n = 3/5 [QUICKJS-TRACE] n = 4/5 [QUICKJS-TRACE] n = 5/5 [QUICKJS] asset_path = /Users/mydir-macosx/data/blackjack.txt [QUICKJS] score = 25.156 (in percent) [QUICKJS] data points = 0.251,19.2363,9.262,100.125 [TRACE] After loading code. [RESULT] number = -1860 [*] ===== Read configuration variables defined in the js code. ==== =>> asset_path = /Users/mydir-macosx/data/blackjack.txt =>> game_score (%) = 25.156 ==>> datapoints = [4]( 0.251 19.2363 9.262 100.125 ) [*] ===== Define variables in C++-side ==== [STEP 2] user_name = Gaius Julius Caesar ; points = 101235 [STEP 2] user_data = ADA,RUST,C++11,C++17,C++20,Dlang,OCaml,C#(Csharp) ; type = object [STEP 2] user_data[5] = Dlang ADA RUST C++11 C++17 C++20 Dlang OCaml C#(Csharp) [*] ===== Register class ChartXY ==== [SCRIPT] chart.user_path = /Users/data/assets/game/score/marks [SCRIPT] chart.user_points = 1023523 [SCRIPT] Result = 990.3825099999999 [ĆhartXY Object] x = 0 ; y = 0 ; width = 200 height = 600 [ChartXY] Width set to 800.000000 [ChartXY] Height set to 700.000000 [ĆhartXY Object] x = 0 ; y = 0 ; width = 800 height = 700 [QUICKJS] Change chart dimensions using properties [ChartXY] Width set to 500.000000 [ChartXY] Height set to 660.000000 <QUICKJS> Chart width = 500 ; Chart height = 660 [ChartXY] Plotting points =>> 10.522 8.261 -100.24 7.2532 56.123 89.23
1.9 Chaiscript
Scripting engine available as a header-only library that has Javascript-like syntax and easy integration to C++ codebases.
Drawbacks:
- Header-only => Slow compile-time and large executable size due to the intesive use of templates. Extern templates C++ language feature could reduce the compile time.
Repository:
Web site:
Files
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.0) project(chaiscript-eval) set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) include( FetchContent ) set( BUILD_SAMPLES OFF CACHE BOOL "") set( BUILD_SAMPLES OFF CACHE BOOL "") set( RUN_FUZZY_TESTS OFF CACHE BOOL "") set( RUN_PERFORMANCE_TESTS OFF CACHE BOOL "") FetchContent_Declare( chaiscript GIT_REPOSITORY https://github.com/ChaiScript/ChaiScript/ GIT_TAG v6.1.0 ) FetchContent_MakeAvailable(chaiscript) include_directories( chaiscript-runtime PUBLIC ${chaiscript_SOURCE_DIR}/include ) add_executable( runner chaiscript-eval.cpp) if(UNIX) target_link_libraries( runner PUBLIC pthread dl ) endif()
File: chaiscript-eval.cpp
#include <iostream> #include <string> #include <chaiscript/chaiscript.hpp> void scriptable_function(const std::string& label, int w) { std::cout << "\n [CALLED] label = " << label << " ; w = " << w << '\n'; } class Robot { std::string name; float x = 0, y = 0; public: Robot(){ } Robot(float x, float y){} void setPosition(float x, float y) { this->x = x; this->y = y; std::fprintf(stderr, " [INFO] Robot moved to x = %f ; y = %f \n", x, y); } void showPosition() { std::fprintf(stderr, " [INFO] Robot position (x = %f, y = %f ) \n", x, y); } }; int main() { // Create script engine object chaiscript::ChaiScript chai; // Register user function chai.add( chaiscript::fun(&scriptable_function) , "scriptable_function"); chai.add( chaiscript::constructor<Robot()>(), "Robot" ); chai.add( chaiscript::fun(&Robot::showPosition), "showPosition"); chai.add( chaiscript::fun(&Robot::setPosition), "setPosition" ); const char* code = R"( // It supports C++-like syntax for(var i = 0; i < 5; ++i) { print(i); } puts(" ========= Line ================= \n"); scriptable_function("Moon", 200); scriptable_function("Mars", 500); var robot = Robot(); robot.setPosition(200, 400); robot.showPosition(); // User configuration function will be called // by the script engine. def on_init_hook() { puts("\n [TRACE] User function called. Ok."); } )"; // Attempt to evaluate code try { chai.eval(code); } catch(chaiscript::exception::eval_error const& ex) { std::cout << " [TRACE] Exception = " << ex.what() << std::endl; } std::puts(" ==== Get robot object from script ========="); auto robot = chai.eval<std::shared_ptr<Robot>>("robot"); robot->showPosition(); robot->setPosition(400, 1000); } // ---- End of main() ----------//
Check Executable
$ du -h build/runner 23M build/runner # Remove debugging symbols $ strip build/runner $ du -h build/runner 7.6M build/runner $ file build/runner build/runner: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2
Running
$ ./runner 0 1 2 3 4 ========= Line ================= [CALLED] label = Moon ; w = 200 [CALLED] label = Mars ; w = 500 [INFO] Robot moved to x = 200.000000 ; y = 400.000000 [INFO] Robot position (x = 200.000000, y = 400.000000 ) ==== Get robot object from script ========= [INFO] Robot position (x = 200.000000, y = 400.000000 ) [INFO] Robot moved to x = 400.000000 ; y = 1000.000000
1.10 Python Engine via Pybind11
Documentation:
Advantages:
- High popularity
- Lost of libraries
- Easy usage
Drawbacks:
- Hard to static link
- Not ligthweight and designed to be embedded as Lua.
- It is not possible to run multiple instances of the Python interpreter.
- It is not possible to sandbox the interpreter and restrict accessing files or process manipulation APIs.
- Requires pre-installation of Python development headers. But, it can be mitigated by using Anaconda or miniconda Python distributions.
Known Cases:
- GDB - GNU Debugger
- IDA Debugger
- Sublime Text Editor
Sample Project
GIST containing the sources:
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.9) project(embed-python-scripting) #========== Global Configurations =============# #----------------------------------------------# set(CMAKE_CXX_STANDARD 17) set(CMAKE_VERBOSE_MAKEFILE ON) set(CMAKE_CXX_EXTENSIONS OFF) # ------------ Download CPM CMake Script ----------------# ## Automatically donwload and use module CPM.cmake file(DOWNLOAD https://raw.githubusercontent.com/TheLartians/CPM.cmake/v0.26.2/cmake/CPM.cmake "${CMAKE_BINARY_DIR}/CPM.cmake") include("${CMAKE_BINARY_DIR}/CPM.cmake") #----------- Add dependencies --------------------------# find_package(PythonLibs REQUIRED) CPMAddPackage( NAME pybind11 URL https://github.com/pybind/pybind11/archive/v2.5.zip DOWNLOAD_ONLY true ) # add_subdirectory( {pybind11_SOURCE_DIR} ) include_directories( ${pybind11_SOURCE_DIR}/include ${PYTHON_INCLUDE_PATH} ) message( [TRACE] " pybind11_SOURCE_DIR = ${pybind11_SOURCE_DIR} ") # configure_file(script.py ${CMAKE_BINARY_DIR} COPYONLY ) #----------- Set targets -------------------------------# add_executable(app1 app1.cpp) target_link_libraries( app1 ${PYTHON_LIBRARIES} ) add_custom_command( TARGET app1 POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_SOURCE_DIR}/script.py ${CMAKE_CURRENT_BINARY_DIR}/script.py)
File: app.cpp
#include <iostream> #include <string> #include <sstream> #include <fstream> #include <cassert> #include <vector> #include <pybind11/embed.h> namespace py = pybind11; // Requires: <string>, <stream>, <sstream> std::string readFile(std::string const& file) { auto is = std::ifstream(file); if( !is.good() ){ throw std::runtime_error("Error: stream has errors."); } std::stringstream ss; ss << is.rdbuf(); return ss.str(); } int main(int argc, char** argv) { extern std::string pycode; auto guard = py::scoped_interpreter{}; // ------ EXPERIMENT 1 ------------------------------------// std::puts(" [EXPERIMENT 1] ===== Execute Python code ======================\n"); auto code = readFile("./script.py"); // std::cout << " code = " << code << "\n"; auto g = py::globals(); // Define global variables for the interpreter global scope g["game_assets"] = "/Users/myuser/game_assets"; g["speed"] = 20.151; g["z_value"] = 100; // Evaluate Python code try { py::exec( code, g ); } catch(pybind11::error_already_set const& ex) { std::cerr << " [PYBIND11 ERROR] " << ex.what() << std::endl; return EXIT_FAILURE; } // ------ EXPERIMENT 2 ------------------------------------// std::puts("\n [EXPERIMENT 2] == Read user defined configuration variables ===\n"); auto v_x = g["x"].cast<double>(); auto v_path = g["path"].cast<std::string>(); std::cout << " [*] v_x = " << v_x << std::endl; std::cout << " [*] v_path = " << v_path << std::endl; return EXIT_SUCCESS; } // ----- Internal Python Embedded Module ---------------------------// const char* version() { return "SampleModule Version 3.451-ZETA"; } // Sample "function-object class" class LinearFunctor { public: double A = 0, B = 0; LinearFunctor(); LinearFunctor(double a, double b): A(a), B(b){ } double GetA() const { return A; } void SetA(double a) { A = a; } double GetB() const { return B; } void SetB(double b) { B = b; } void show() const { std::cout << " LinearFunction: y(x) = A * x + B" << std::endl; std::cout << " => A = " << this->A << " ; B = " << this->B << std::endl; } std::string toString() const { std::stringstream ss; ss << " LinearFunction: y(x) = A * x + B" << std::endl; ss << " => A = " << this->A << " ; B = " << this->B << std::endl; return ss.str(); } // Function-call operator double operator()(double x) { return A * x + B; } }; // ---- Internal Module -----------------------// PYBIND11_EMBEDDED_MODULE(SampleModule, m) { // optional module docstring m.doc() = "Sample Python built with C++ CeePlusPlus "; m.def("version", &version, "Show Library Version"); m.def("cppLambda" ,[](double x, double y){ return 3.0 * x + y;} ,"A C++ lambda object or functor" //,py::arg("x"), py::args("y") = 15 ); // Register LinearFunction py::class_<LinearFunctor>(m, "LinearFunctor") .def(py::init<double, double>()) // Register overloaded consructor .def("GetA", &LinearFunctor::GetA) // Reister method GetA() .def("GetB", &LinearFunctor::GetB) // Register method GetB() .def("SetA", &LinearFunctor::SetA) // Reister method GetA() .def("SetB", &LinearFunctor::SetB) // Register method GetB() .def("show", &LinearFunctor::show) // Register method show .def("call", &LinearFunctor::operator()) // Register function-call operator with name 'call' .def("__call__", &LinearFunctor::operator ()) // Register fun-call operator .def("__repr__", &LinearFunctor::toString) // Register strin representation .def_readwrite("A", &LinearFunctor::A) // Register field A .def_readwrite("B", &LinearFunctor::B); // Register field B } /** --- End of PYBIND11_MODULE registration --- */
File: script.py
print(" => game_assets = ", game_assets) print(" => speed = ", speed) print(" => z_value = ", z_value) x: float = 10.0 * 20.51 / 200 path = "C:\\\\Users\\dummy\\Documents\\data" print(" [PYTHON] The value of x = ", x) for i in range(5): print(" [PYTHON] i = ", i) # It is not possible to restrict the interpreter! import os print(" [*] Current path = ", os.getcwd() ) print("\n ------------------------------------------") print("\n =>>> Test Python Internal Module (C++) <<=\n") import SampleModule as m from SampleModule import LinearFunctor print(f" -> Module Information = [{m.__doc__}] ") print( " -> Module Version = ", m.version()) print( " -> m.cppLambda(100, 25) = ", m.cppLambda(100, 25) ) functor = LinearFunctor(8.0, -10.0) print(f"\n C++ Functor -> ${functor} ") print(" functor(5.0) = ", functor(5.0)) print(" functor(8.0) = ", functor(8.0))
Build and running:
$ cd /tmp $ git clone https://gist.github.com/d7fda02034757374a0b0114e54c7daff python-embed-script $ cd python-embed-script $ cmake -H. -B_build -DCMAKE_BUILD_TYPE=Debug $ cmake --build _build --target
Program output:
$ _build/app1
[EXPERIMENT 1] ===== Execute Python code ======================
=> game_assets = /Users/myuser/game_assets
=> speed = 20.151
=> z_value = 100
[PYTHON] The value of x = 1.0255
[PYTHON] i = 0
[PYTHON] i = 1
[PYTHON] i = 2
[PYTHON] i = 3
[PYTHON] i = 4
[*] Current path = /home/mxpkf8/temp-projects/python-embed-script
------------------------------------------
=>>> Test Python Internal Module (C++) <<=
-> Module Information = [Sample Python built with C++ CeePlusPlus ]
-> Module Version = SampleModule Version 3.451-ZETA
-> m.cppLambda(100, 25) = 325.0
C++ Functor -> $ LinearFunction: y(x) = A * x + B
=> A = 8 ; B = -10
functor(5.0) = 30.0
functor(8.0) = 54.0
[EXPERIMENT 2] == Read user defined configuration variables ===
[*] v_x = 1.0255
[*] v_path = C:\\Users\dummy\Documents\data
1.11 Scheme-like lisp interpreter
1.11.1 Overview
Scheme is a simple dialect of lisp with several functional-programming features. Functions are first class citizens, they can be passed as argument to other functions and returned from functions. Everything is an expression and evaluates to something, even assignments and if-else statements. Some versions of scheme also feature tail-call optimization that allow writing tail-recursive functions to be converted to loop without creating excessive stack frames.
This section presents a lisp-like interpreter based on Scheme, implemented using modern C++ features and object oriented design patterns with a handwritten parser and lexer. This lisp interpreter is based on lispy, a lisp interpreter written in Python. Just like lispy, this implementation is easy to understand and can be easily be embedded in C++ applications for adding scripting capabilities or ursing S-expressions as a DSL (Domain-Specific Language) or data description language.
Techniques used in this implementation:
- Smart pointers are used for memory managment.
- Composite design pattern for representing the AST
- Visitor design pattern is used for traversing the AST and implementing AST printing. In a functional programming language with proper sum types (also known as variants or discriminated unions), the visitor could be replaced by pattern matching.
- Inheritance is used for representing sum types or discriminated unitons, common in functional programming languages. The AST (Abstract Syntax Tree) nodes, including atoms and lists, are represented by classes inheriting from IEXpr abstract class.
- The interpreter uses direct recursive evaluation of the AST just like the inspiration lispy.
- Lists are not implemented using linked lists, instead they are implemented using std::vector container.
- Besides the recursive abstract syntax tree evaluation implementation, the interpreter could also be implemented using a SECD (stack-environment-control-dump) virtual machine.
- This implementation uses the fn keyword instead of lambda; def instead of define for creating named functions; and set instead of define for defining variables.
Possible implementation techniques of Lisp-like languages:
- Recursive AST (Abstract Syntax Tree) evaluation
- SECD (stack-environment-control-dump) virtual/abstract machine
- Meta-circular evaluator => Implement lisp-like dialects on top of an existing lisp implementation such as Common Lisp.
- Compile lisp dialect to bytecodes of an existing virtual machine including, JVM (Java Virtual Machine); CLR (Common Language Runtime) - .NET virtual machine; parrot virtual machine or WASM - Web Assembly.
- JIT (Just-In-Time) compiler => Compile itself to machine code at runtime for improving performance and speed. This technique is used by several common lisp implementations.
- Transpile S-expressions to another language, including C or C++.
Lisp terminology:
- Sexp - S-Expression
- Stands for symbolic expressions
- Homoiconicity
- Code is data. The lisp code represents the AST (Abstract Syntax Tree) that is just nested lists of lists or atoms.
- atoms
- Everything that is not a list is an atom. An indivisible value, including strings, symbols, keywords, numbers and so on.
- car - comes from 'Contents of the Address part of the Register'
- cdr - comes from 'Contents of the Decrement part of the Register'
- cons - abbreviation of the word 'construct'
- Lisp 1 =>> There is a single namespace for variables and functions. Example: Scheme variants.
- Lisp 2 =>> There are separate namespaces for variables and functions. Example: Common Lisp and Elisp - Emacs' lisp.
- Special forms
- They are forms or primitives not evaluated as functions, instead arguments may not be evaluated at all. Most special forms are primitives or control structures such as 'if', 'set', 'define', 'lambda' and so on. For instance the 'if' special form, that has three arguments, the first argument is a predicate, the second argument is the action that happens when the predicate is true and the third argument is the action that happens when the predicate is false. If the predicate is true, the second argument is not evaluated and the if expression is evaluated to the value of the first argument. If the predicate is false, the first argument is not evaluated and the if expression is evaluated to the value of the second argument.
- DSL - Domain Specific Lanaguage
- (if <PREDICATE> <THEN-EXPR> <ELSE-EXPR> )
- The <THEN-EXPR> or then-expression is evaluated if the predicate evaulates to true that happens when the predicate does not evaluate to #f or nil. If the predicate is evaluated to any other value, the else-expression is evaluated.
- User-defined procedures
- User-defined functions using the keyword fn (equivalent to lambda) or def equivalent to Scheme's define keyword.
- Primitive procedures
- Built-in functions or procedures, for instance (+), (*), apply, map, list and so on.
- Literals
- Boolean literals => #t or #f - in this implementation
- nil literal: nil => For designating empty return value.
- symbol literal: 'a-symbol
- keyword literal: :keyword1, :keyword2
- String literal - between quotes: "my string"
Known uses of Lisp-like languages as configuration language, scripting language, embedded scripting language or extension language:
- Emacs text editor - uses its own Lisp dialect ELisp as a configuration language and scripting language. Note: Elisp is older than Scheme and Common Lisp.
- GNU Guix functional package manager that uses GNU GUile as a configuration and scripting language. GNU Guix solves the Linux dependency hell problem by allowing reproducible installation of any package. This package manager also allows multiple versions of the same software to cohexis.
- Autocad® from Autodesk - CAD (Computer Aided Design) software for engineering drawing that uses Autolisp, its own lisp dialect based on common lisp, as an embedded scripting language.
- GNU Gimp - Image editor that uses TinyScheme as embedded scripting language.
- Apple Sandbox - Uses TinyScheme as configuration language.
C++ Features useful for implementation of lexers, AST (Abstract Syntax Trees) and interpreters:
- std::shared_ptr
- std::basic_istream and std::istream
- std::getline()
- std::vector container
- std::string container
- std::map container
- std::function container
- Enum class (scoped enumeration)
- C++ Core Guidelines: Rules for Enumerations - ModernesCpp.com
- c++ - Why is enum class preferred over plain enum? - Stack Overflow
- std::static_pointer_cast (downcasting shared pointers)
- std::make_shared()
- std::enable_shared_from_this - cppreference.com
- Solution for Multiple enable_shared_from_this in Inheritance Tree - CodeProject
- Using enabled_shared_from_this. enable_shared_from_this allows to…
- Remarks on enable_shared_from_this - Christian Aichinger's posts
Typical AST - Abstract Syntax Tree of Lisp/Scheme implementations:
Expr - Expression | | +--------------+-------------------------+ | | Cons(Expr, Expr) Atom - anything that is not a pair (cons cell) (cons pair, | cons linked list +--------+--------+---+------+-----------+---------+--------------+ or cons cell ) | | | | | | | Nil Number String Boolean Symbol Keyword Function literal literal Literal Literal | -------------+------------- | | Native Lisp Function Function implemented in Lisp data Expr = -- Cons cell Cons Expr, Expr -- Nil literal - represents an empty value | Nil -- Boolean literal | Bool Bool -- Number literal | Num Double -- String literal | Str String -- Symbol | Sym String -- keyword | Key String -- Native function written in the hosting language | NativeFun ([Expr] -> Expr) -- Function written in Lisp | LispFunc (environment: Dict<String,Expr>, args: <String>, name: <String>, call: [Expr] -> Expr) Example =>> The pair or cons cell '( hello . 100) is reprented as: Cons( Sym("hello"), Num(100)) Example ->> The list (1 hello "world") is represented as a linked list in the following format: Cons( Num(1), Cons( Sym("Hello"), Cons( Str("hello"), Nil )))
AST - Abstract Syntax Tree used in this implementation:
Expression | | +-----------+------------------------+ | | | | List of Expr Atom - anything that is not a list | +-------+---------+------+--+---+--------------+----------+ | | | | | | | Nil Number String Boolean Symbol Keyword Function Literal Literal Literal Literal :keyword object #t (true) | #f (false) +----------+---------------+ | | Native Function Lisp Function written in C or C++ written in Lisp or in the hosting language Haskell-like notation: data Expr = -- List data structure - can be represented by a linked list or vector List [Expr] -- Nil literal - represents an empty value | Nil -- Boolean literal | Bool Bool -- Number literal | Num Double -- String literal | Str String -- Symbol | Sym String -- keyword | Key String -- Native function written in the hosting language | NativeFun ([Expr] -> Expr) -- Function written in Lisp | LispFunc (environment: Dict<String,Expr>, args: <String>, name: <String>, call: [Expr] -> Expr)
EBNF Grammar:
sexp: atom | list ; list: "(" sexp* ")" ; atom: SYMBOL | KEYWORD // Example: :x, :keyword | NUMBER | STRING | BOOLEAN | NIL ; NIL: "nil"; BOOLEAN: "true" | "false";
1.11.2 Code
File: code.lisp - Sample Lisp scripting code
; Lisp test code ;-------------------------------- (comment "Comment special form is evaluated to nil and discarded.") ;; Create a sample list (set mylist (list 10 #t #f "hello world" 'hello 'world ; Operation 1 :keyword1 (+ 10 25 6) '(+ 10 25 6) ; Operation 2 :keyword2 (* 4 2 6) '(* 4 2 6) )) ; Factorial function (set code-fact " ; Factorial function (def fact (n) (if (= n 1) 1 ;; Base case (* n (fact (- n 1))) ;; Recursion case )) ") ;; S-expression for fibbonaci function (def fib(n) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)) ) )) (def make-adder (k) (fn (x) (+ k x))) ;; Define a function that adds 10 to a number (set add10 (make-adder 10)) ;; Defines a function that adds 10 to a number (set add50 (make-adder 50)) (set func (let ( (a (sqrt 125)) (b (+ 20 a)) (c (* a b)) ) (comment "Create a function using lexical scope. The variable (a, b and c) are not visible outside the function. ") (fn (x) (/ (+ a b x) c)) ) ) ;; Compute many trigonometric properties of an angle (def compute-trig (angle-degrees) (let ( (angle (/ (* angle-degrees PI) 180) ) ) (list :angle angle-degrees :cos (cos angle) :sin (sin angle) :tan (tan angle)) ) )
File: gui-dsl.lisp - Hypothetical configuration file written in Lisp that represents a html-like user interface.
;; Theoretical GUI DSL (Domain Specific Language) (div :class "myclass-csss-style" :bgcolor "#ff81a" ;; <p>Paragraph 1</p> (p "Paragraph 1 ") ;; <img src="http://..../myimage.png" /> (img :src "http://www.mydomain/images/myimage.png" :caption "Image 1 data") ;; <button id="btn-submit" class="my-style-button" >Submit</button> (button :label "Submit" :id "btn-submit" :class "my-style-button" :onclick (fn (event) (display "Button was clicked") )) )
File: cpplisp.cpp (Lisp-like interpreter) - about 1600 lines of code.
#include <iostream> #include <string> #include <sstream> #include <fstream> #include <cctype> #include <optional> #include <cassert> #include <memory> #include <vector> #include <map> #include <iomanip> #include <functional> #include <cmath> #include <stack> enum class TokenType { SYM // Symbol , KEYW // Keyword Examples => :keyword, :x , STR // String literal , NUM // Number literal , BOOL // Boolean , QUOTE // Quote , RPAREN // '(' Right parenthesis , LPAREN // ')' Left parenthesis , EOFF // End of File (It was named 'EOFF' since it conflicts with #define 'EOF') , ERR // Indicates error }; struct Token { TokenType type; std::string text; int pos; int lin; int col; Token(): type(TokenType::ERR), text(""), pos(0), lin(0), col(0) {} Token(TokenType type_, std::string const& text_, int pos_, int lin_, int col_): type(type_), text(text_), pos(pos_), lin(lin_), col(col_) { } bool isEOF() { return type == TokenType::EOFF; } }; bool isNumber(std::string const& text) { if( text[0] != '-' && !std::isdigit(text[0]) ) { return false; } auto n = text.length(); for(size_t k = 1; k < n; k++){ if( !std::isdigit( text[k] ) ){ return false;} } return true; } class Tokenizer { std::istream& _is; // Position of cursor in the inout stream int _pos = 0; // Current line in the input stream int _lin = 0; // Column in the input stream int _col = 0; public: Tokenizer(std::istream& is): _is(is){ } // Read a single character and consumes it from the stream char next() { char chr = _is.get(); _pos++; _col++; // Assumes that the new line character is just '\n' if( chr == '\n' ){ _col = 0; _lin++; } return chr; } // Read next character from input string without reading it char peek(){ return _is.peek(); } bool isBlankChar(){ char ch = peek(); return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n'; } bool isParenthesis(){ char ch = peek(); return ch == '(' || ch == ')'; } // Is end of line bool isEOL(){ return peek() == '\n'; } bool isEOF(){ return _is.eof(); } void skipBlankChars() { while( isBlankChar() ){ next(); } } void skipLineComment(){ char ch = peek(); while( ch != '\n' && !this->isEOF() ){ ch = next(); } } // Consume stream until a blank character is found std::string consumeUntilBlank() { // std::fprintf(stderr, " [TRACE] Consuming until blank \n"); char ch = peek(); std::string text; while( !this->isBlankChar() && !this->isParenthesis() && !this->isEOF()) { ch = this->next(); // fprintf(stderr, " [TRACE] ConsumeUntilBlank ch = %d \n", ch); text = text + ch; } // std::fprintf(stderr, " [TRACE] consumeUntilBlank = %s \n", text.c_str()); return text; } std::vector<Token> readTokens() { std::vector<Token> out{}; Token tok; while( !tok.isEOF() ){ tok = this->nextToken(); if( !tok.isEOF() ){ out.push_back(tok); } } auto stack = std::stack<Token>{}; // Check if parentheses are balanced for(auto const& tok: out) { // Push left parentheses to the stack if(tok.type == TokenType::LPAREN){ stack.push(tok); } // If the stack is not empty and current parenthesis is a // right parentheses, if the top of stack is left parentheses // then pop the stack. if(tok.type == TokenType::RPAREN ) { if( !stack.empty() && stack.top().type == TokenType::LPAREN ) { stack.pop(); } else { throw std::runtime_error("Error: unbalanced parentheses."); } } } if( !stack.empty() ){ auto tok = stack.top(); std::fprintf(stderr, " [ERROR] Unbalanced parentheses at line %d and column %d \n", tok.lin, tok.col); throw std::runtime_error("Error: unbalanced parentheses."); } return out; } Token nextToken() { // std::fprintf(stderr, " [TRACE] Removing blank characters \n"); skipBlankChars(); char ch = next(); // std::fprintf(stderr, " [TRACE] nextChar = %c \n", ch); if( this->isEOF() ){ return Token(TokenType::EOFF, "", _pos, _lin, _col); } // Skip comments if( ch == ';' ){ // std::fprintf(stderr, " [TRACE] Skiping comment \n"); skipLineComment(); // std::fprintf(stderr, " [TRACE] Skiping comment 2 \n"); return nextToken(); } if( ch == '(') { return Token(TokenType::LPAREN, "(", _pos, _lin, _col); } if( ch == ')') { return Token(TokenType::RPAREN, ")", _pos, _lin, _col); } if( ch == '\'') { return Token(TokenType::QUOTE, "'", _pos, _lin, _col); } // Boolean value if( ch == '#'){ auto c = this->next(); auto text = std::string("#") + c; if( c!= 'f' && c !='t' ){ return Token(TokenType::ERR, std::string("Error - invalid boolean literal => ") + text, _pos, _lin, _col); } return Token(TokenType::BOOL, text, _pos, _lin, _col); } // String literal if( ch == '"' ) { // std::fprintf(stderr, " [TRACE] Reading string literal \n"); std::string out = ""; int col = _col; int pos = _pos; int lin = _lin; char c = 'x'; while( c != '"' && !this->isEOF() ) { c = this->next(); if(c == '"'){ break; } out = out + c; } if( c!= '"'){ return Token(TokenType::ERR, "Error - non closed quote.", _pos, _lin, _col); } else { return Token(TokenType::STR, out, pos, lin, col); } } // assert( ) int col = _col; int pos = _pos; int lin = _lin; std::string text = ch + this->consumeUntilBlank(); if( text == ":"){ return Token(TokenType::ERR, "illegal terminating character after a colon:" ,pos, lin, col); } // Keyworkd if( text[0] == ':' ) { // std::fprintf(stderr, " [TRACE] Tokenizer =>> keyword = %s \n", text.c_str() ); return Token(TokenType::KEYW, text, pos, lin, col); } if( text == "-"){ return Token(TokenType::SYM, text, pos, lin, col); } if( std::isdigit(text[0]) ) { if( isNumber(text) ) { return Token(TokenType::NUM, text, pos, lin, col); } else { return Token(TokenType::ERR, std::string("Error - invalid number => ") + text, pos, lin, col); } } else if( text[0] == '-' ) { if( isNumber(text) ) { return Token(TokenType::NUM, text, pos, lin, col); } else { return Token(TokenType::ERR, std::string("Error - invalid number => ") + text, pos, lin, col); } } // Anything else is a symbpl return Token(TokenType::SYM, text, pos, lin, col); } }; std::ostream& operator<<(std::ostream& os, TokenType type) { if(type == TokenType::BOOL) { os << "[BOOL]"; } if(type == TokenType::EOFF) { os << "[EOF]"; } if(type == TokenType::STR) { os << "[STR]"; } if(type == TokenType::LPAREN) { os << "[LPAREN]"; } if(type == TokenType::RPAREN) { os << "[RPAREN]"; } if(type == TokenType::NUM) { os << "[NUM]"; } if(type == TokenType::SYM) { os << "[SYM]"; } if(type == TokenType::ERR) { os << "[ERR]"; } return os; } enum class ExprType { ERR // Error , NUM // Number literal , STR // String literal , BOOL // Boolean literal , SYM // Symbol , KEY // Keyword , NIL // Nil or Null value , FUN // Function , LST // List literal }; // Forward declaration struct ExprNil; struct ExprStr; struct ExprNum; struct ExprSym; struct ExprKey; struct ExprErr; struct ExprBool; struct ExprFun; struct ExprLst; // Visitor design pattern struct IVisitor{ public: virtual ~IVisitor() = default; virtual void visit(ExprNil& expr) = 0; virtual void visit(ExprStr& expr) = 0; virtual void visit(ExprNum& expr) = 0; virtual void visit(ExprSym& expr) = 0; virtual void visit(ExprKey& expr) = 0; virtual void visit(ExprBool& expr) = 0; virtual void visit(ExprErr& expr) = 0; virtual void visit(ExprFun& expr) = 0; virtual void visit(ExprLst& expr) = 0; }; // Lisp-like AST - Abstract syntax tree // IEXpr stands for Interface Expression struct IExpr { virtual ~IExpr() = default; virtual ExprType type() const = 0; // Returns true if AST node is ATOM virtual bool isAtom() const { return true; } // Returns true if AST node is list virtual bool isList() const { return false; } // Returns true if AST node is error virtual bool isErr() const { return false; } // Returns true if AST node is symbol virtual bool isSym() const { return false; } // Returns true if AST node is string virtual bool isStr() const { return false; } // Returns true if AST node is keyword virtual bool isKey() const { return false; } // Returns true if AST node is number virtual bool isNum() const { return false; } // Returns true if AST node is boolean virtual bool isBool() const { return false; } // Returns true if AST node is nil virtual bool isNil() const { return false; } // Returns true if AST node is function virtual bool isFun() const { return false; } /// Returns the string value of a node virtual std::string strValue() const = 0; virtual double numValue() const = 0; // Evaluates to false if nil or #f and evaluates to true if it is anything else. virtual bool boolValue() const = 0; // Method for accepting visitor (Related to visitor design pattern) virtual void accept(IVisitor& v) = 0; // Evaluate AST node and return a value (AST Node) virtual std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) = 0; }; // Represents a null value (empty value) equivalent to void // In Lisp just as other functional-like programming languages // everything is evaluated to something and can be replaced be a value. struct ExprNil: public IExpr { ExprNil(){ } ExprType type() const { return ExprType::NIL; } bool isNil() const { return true; } std::string strValue() const { return "";} double numValue() const { return 0; } bool boolValue() const { return false;} void accept(IVisitor& v) { v.visit(*this); } virtual std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprNil>(); } }; // Represents a string literal struct ExprStr: public IExpr { std::string value; ExprStr(std::string const& value_): value(value_){ } ExprType type() const { return ExprType::STR; } std::string strValue() const { return value;} double numValue() const { return -1; } bool boolValue() const { return true;} bool isStr() const { return true; } void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprStr>(this->value); } }; // Represents a number struct ExprNum: public IExpr { double value; ExprNum(double value_): value(value_){ } ExprType type() const { return ExprType::NUM; } bool isNum() const { return true; } std::string strValue() const { return "";} double numValue() const { return value; } bool boolValue() const { return true;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprNum>(this->value); } }; // Represents a boolean literal (logical value) struct ExprBool: public IExpr { bool value; ExprBool(bool value_): value(value_){ } ExprType type() const { return ExprType::BOOL; } bool isBool() const { return true; } std::string strValue() const { return "";} double numValue() const { return -1; } bool boolValue() const { return value;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprBool>(this->value); } }; // Represents a lisp symbol struct ExprSym: public IExpr { std::string value; ExprSym(std::string value_): value(value_){ } ExprType type() const { return ExprType::SYM; } bool isSym() const { return true; } std::string strValue() const { return value;} double numValue() const { return -1; } bool boolValue() const { return true;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { auto it = env.find(value); if (it == env.end()) { throw std::runtime_error(" [ERROR] Unbound variable " + value); //return std::make_shared<ExprErr>(10, "Error - unbound variable " + value); // return nullptr; } return it->second; } }; // Represents a lisp keyword such as :x, :y, :keyword1 struct ExprKey: public IExpr { std::string value; ExprKey(std::string value_): value(value_){ } ExprType type() const { return ExprType::KEY; } bool isKey() const { return true; } std::string strValue() const { return value;} double numValue() const { return -1; } bool boolValue() const { return true;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprKey>(this->value); } }; // Represents an Error (Reserved for future use) struct ExprErr: public IExpr { int code; std::string info; ExprErr(){} ExprErr(int code_, std::string const& info_) : code(code_), info(info_) { } ExprType type() const { return ExprType::ERR; } bool isErr() const { return true; } std::string strValue() const { return info;} double numValue() const { return -1; } bool boolValue() const { return false;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { return std::make_shared<ExprErr>(code, info); } }; using LispFunc = std::function<std::shared_ptr<IExpr> (std::vector<std::shared_ptr<IExpr>> const& args)>; using LispEnv = std::map<std::string, std::shared_ptr<IExpr>>; // Represents a function struct ExprFun: public IExpr { std::string name; std::string doc; // std::vector<std::string> args; // std::shared_ptr<IExpr> body; // ExprFun(std::vector<std::string> const& args_): args(args_) { } ExprType type() const { return ExprType::FUN; } bool isFun() const { return true; } double numValue() const { return -1; } bool boolValue() const { return true;} virtual bool isNativeFun() const = 0; void accept(IVisitor& v) { v.visit(*this); } virtual std::shared_ptr<IExpr> call( std::vector<std::shared_ptr<IExpr>>& params , std::map<std::string, std::shared_ptr<IExpr>>& env_) = 0; ~ExprFun() = default; }; // Non-native lisp function implemented in lisp struct ExprFunLisp: public ExprFun { std::vector<std::string> args; std::vector<std::shared_ptr<IExpr>> body; std::map<std::string, std::shared_ptr<IExpr>> env; std::string strValue() const { return "(function) " + this->name;} bool isNativeFun() const { return false; } ExprFunLisp(){} ExprFunLisp(std::vector<std::string> const& args_, std::vector<std::shared_ptr<IExpr>> const& body_) : args(args_), body(body_) { } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { // Invoke copy constructor return std::make_shared<ExprFunLisp>(*this); } std::shared_ptr<IExpr> call( std::vector<std::shared_ptr<IExpr>>& params ,std::map<std::string, std::shared_ptr<IExpr>>& env_) { if( params.size() != args.size() ) { throw std::runtime_error("Error: Function call error. "); } // Create local environment auto envv = std::map<std::string, std::shared_ptr<IExpr>>{}; // Copy global environment to temporary environment for(auto& k : env_){ envv[k.first] = k.second; } // Copy local environment to temporary environment for(auto& k : env){ envv[k.first] = k.second; } // Pass parameters to local environment for(size_t n = 0; n < args.size(); n++ ) { auto a = args[n]; envv[a] = params[n]; } auto N = body.size(); for(size_t n = 0; n < N - 1; n ++ ) { body[n]->eval(envv); } auto result = body[N - 1]->eval(envv); if( result->isFun() && !std::static_pointer_cast<ExprFun>(result)->isNativeFun() ) { auto func = std::static_pointer_cast<ExprFunLisp>(result); for(size_t n = 0; n < args.size(); n++ ) { auto a = args[n]; func->env[a] = params[n]; } } return result; } }; // Native function implemented in C or C++ struct ExprFunNative: public ExprFun { LispFunc func; std::string strValue() const { return "(native function) " + this->name;} bool isNativeFun() const { return true; } ExprFunNative(LispFunc func_): func(func_){ } std::shared_ptr<IExpr> eval( std::map<std::string, std::shared_ptr<IExpr>>& env) { // Invoke copy constructor return std::make_shared<ExprFunNative>(func); } std::shared_ptr<IExpr> call( std::vector<std::shared_ptr<IExpr>>& params ,std::map<std::string, std::shared_ptr<IExpr>>& env_) { // std::fprintf(stderr, " [TRACE] Calling native function \n"); return this->func(params); } }; // Represents a list struct ExprLst: public IExpr { std::vector<std::shared_ptr<IExpr>> lst; ExprLst(std::vector<std::shared_ptr<IExpr>> const& lst_) : lst(lst_){ } ExprType type() const { return ExprType::LST; } bool isAtom() const { return false; } bool isList() const { return true; } std::string strValue() const { return "";} double numValue() const { return -1; } bool boolValue() const { return true;} void accept(IVisitor& v) { v.visit(*this); } std::shared_ptr<IExpr> eval(std::map<std::string, std::shared_ptr<IExpr>>& env) { if (lst.size() == 0) { return std::make_shared<ExprNil>(); } auto type = lst[0]->type(); if ( type != ExprType::SYM && type != ExprType::LST ) { throw std::runtime_error(" [ERROR] =>> Object is not applicable."); } std::string symbol = lst[0]->strValue(); // Ignore comment special form if(symbol == "comment") { return std::make_shared<ExprNil>(); } if (symbol == "set") { if (lst.size() != 3) { throw std::runtime_error("Ill-formed syntax for SET special form"); } if (lst[1]->type() != ExprType::SYM) { throw std::runtime_error("Ill-formed syntax for SET special form. Expected symbol. "); } auto res = lst[2]->eval(env); std::string name = lst[1]->strValue(); std::fprintf(stderr, " [TRACE] Defining symbol = %s . \n", name.c_str()); env[name] = res; //;std::make_shared<EnvVal>(res); return std::make_shared<ExprNil>(); } // Handle special form (if <cond> <then-expr>) or (if <cond> <then-expr> <else-expr>) if( symbol == "if" ) { if( lst.size() != 3 && lst.size() != 4) { throw std::runtime_error("Ill-formed syntax for IF special form"); } auto cond = lst[1]; auto cond_result = cond->eval(env); if( lst.size() == 3 && !cond_result->boolValue() ) { return std::make_shared<ExprNil>(); } if( lst.size() == 4 && !cond_result->boolValue() ) { // Evaluate the else-expr auto res = lst[3]->eval(env); // this->eval( *lst[3] ); return res; } assert( cond_result->boolValue() ); auto res = lst[2]->eval(env); //this->eval( *lst[2] ); return res; } // Handle quote special form if( symbol == "quote" || symbol == "$q") { if( lst.size() != 2 ) { throw std::runtime_error("Ill-formed syntax for QUOTE special form"); } auto arg = lst[1]; return arg; } // Equivalent to scheme or lisp lambda or scheme lambda // (fn (a0 a1 ... a[N-1]) (expr1) (expr2) ... (expr[K-1]) ) if( symbol == "fn" ) { if (lst.size() < 3) { throw std::runtime_error("Ill-formed syntax for lambda special form"); } if( !lst[1]->isList() ) { throw std::runtime_error("Ill-formed syntax for lambda. Expected arguments."); } // Function/lambda arguments auto args = std::static_pointer_cast<ExprLst>(lst[1]); // Function/lambda body (list of expressions) auto body = lst; // Remove first element ('lambda' symbol) body.erase(body.begin()); // Remove frst element - function arguments body.erase(body.begin()); std::vector<std::string> _args{}; for(auto& a: args->lst) { if( !a->isSym() ) { throw std::runtime_error("Ill-formed syntax for lamba. Arguments must be strings."); } _args.push_back( a->strValue() ); } auto fun = std::make_shared<ExprFunLisp>(_args, body); return fun; } /** Sepcial form def * (def function (a0 a1 ... a[N-1]) (expr1) (expr2) ... (expr[K-1]) ) * * Example: * (def addxy(x y ) * (disp x) * (disp y) * (+ x y)) * */ if( symbol == "def" ) { if (lst.size() < 4) { throw std::runtime_error("Ill-formed syntax for def() special form"); } if( !lst[1]->isSym() ) { throw std::runtime_error("Ill-formed syntax for def(). Expected function name."); } if( !lst[2]->isList() ) { throw std::runtime_error("Ill-formed syntax for def(). Expected arguments."); } // Function name auto name = lst[1]->strValue(); // Function/lambda arguments auto args = std::static_pointer_cast<ExprLst>(lst[2]); // Function/lambda body (list of expressions) auto body = lst; // Remove first element ('lambda' symbol) body.erase(body.begin()); body.erase(body.begin()); body.erase(body.begin()); std::fprintf(stderr, " [TRACE] Defining function = %s . \n", name.c_str()); std::vector<std::string> _args{}; for(auto& a: args->lst) { if( !a->isSym() ) { throw std::runtime_error("Ill-formed syntax for lamba. Arguments must be strings."); } _args.push_back( a->strValue() ); } auto fun = std::make_shared<ExprFunLisp>(_args, body); fun->name = name; env[name] = fun; return std::make_shared<ExprNil>(); } // Handle special form (begin (action1 ....) (action 2) (action N-1)) if( symbol == "begin" ) { // arguments auto args = lst; // Remove first element args.erase(args.begin()); if( args.size() == 0) { return std::make_shared<ExprNil>(); } for(size_t n = 0; n < args.size() - 1; n++) { args[n]->eval(env); } // Revaluate last element auto result = args[ args.size() -1 ]->eval(env); return result; } /** (let (binding lists) (exp1) (exp2) .... (expN-1) ) * * (let ( (a 20) * (b 30 )) * (display a) (display b) (+ a b) ) * * The previous expression evaluates to 30. */ if( symbol == "let" ) { if( lst.size() < 3 ) { throw std::runtime_error("Error: ill-formed sytax for let expression."); } if( !lst[1]->isList() ) { throw std::runtime_error("Error: ill-formed sytax for let expression. Expected list of bindings."); } auto bindings = std::static_pointer_cast<ExprLst>(lst[1])->lst; // Create a new temporary environment auto envv = std::map<std::string, std::shared_ptr<IExpr>>{}; auto envs = std::map<std::string, std::shared_ptr<IExpr>>{}; // Copy environment for(auto& k : env){ envv[k.first] = k.second; } for(auto const& b: bindings){ if( !b->isList() ) { throw std::runtime_error("Error: ill-formed sytax for let expression. Expected list as binding argument."); } auto bb = std::static_pointer_cast<ExprLst>(b)->lst; if( bb.size() != 2) { throw std::runtime_error("Error: ill-formed sytax for let expression."); } if( !bb[0]->isSym() ) { throw std::runtime_error("Error: ill-formed sytax for let expression. Expected symbol."); } auto sym = bb[0]->strValue(); auto val = bb[1]->eval(envv); // Add symbols from let binding to the new temporary environment envv[sym] = val; envs[sym] = val; } for(size_t k = 2; k < lst.size() - 1; k ++) { lst[k]->eval(envv); } auto result = lst[lst.size() - 1]->eval(envv); // Implement closure - function cature variables from local environment if( result->isFun() && !std::static_pointer_cast<ExprFun>(result)->isNativeFun() ) { auto func = std::static_pointer_cast<ExprFunLisp>(result); func->env = envs; } return result; } /** For-loop similar to common lisp => (repeat 10 n (action n)) */ if( symbol == "repeat" ) { if( lst.size() != 4 ) { throw std::runtime_error("Error: ill-formed syntax for repeat special form. Expected 4 arguments"); } if( !lst[2]->isSym() ) { throw std::runtime_error("Error: ill-formed syntax for repeat special form. Expected symbol."); } auto n = lst[1]->eval(env); if( !n->isNum() ) { throw std::runtime_error("Error: expected number as first argument."); } auto ntimes = (int) n->numValue(); if( ntimes < 0) { throw std::runtime_error("Error: ill-formed syntax for repeat special form . Iterations cannot be negative."); } auto sym = lst[2]->strValue(); auto action = lst[3]; // Copy environment auto envv = env; for(int k = 0; k < ntimes; k++) { envv[sym] = std::make_shared<ExprNum>(k); action->eval(envv); } return std::make_shared<ExprNil>(); } // ---------- Function Application --------------------// auto it = lst[0]->eval(env); if( !it->isFun() ){ throw std::runtime_error("Error - object not callable."); } auto function = std::static_pointer_cast<ExprFun>(it); // Function arguments auto args = lst; // Remove first element args.erase(args.begin()); std::vector<std::shared_ptr<IExpr>> evaluatedArgs; // evaluatedArgs.reserve(args.size()); for (auto const &arg : args) { auto res = arg->eval(env); evaluatedArgs.push_back(res); } auto result = function->call(evaluatedArgs, env); return result; } }; // Parser auto readTokens(std::vector<Token>& tokens) -> std::shared_ptr<IExpr> { // auto tokens = tokens_; if( tokens.size() == 0){ throw std::runtime_error("Error: Unexpected EOF - End of File"); } auto tok = tokens[0]; // Remove first element of vector tokens.erase(tokens.begin()); // std::cout << " [readToken] " << tok.type << " ; " << tok.text // << " ; line = " << tok.lin << " ; col = " << tok.col // << "\n"; if( tok.type == TokenType::ERR ) { throw std::runtime_error("[ERROR] " + tok.text); } if( tok.type == TokenType::QUOTE ) { auto lst = std::vector<std::shared_ptr<IExpr>>{}; lst.push_back( std::make_shared<ExprSym>("quote") ); auto next = readTokens(tokens); lst.push_back(next); return std::make_shared<ExprLst>(lst); } if( tok.type == TokenType::LPAREN ) { auto lst = std::vector<std::shared_ptr<IExpr>>{}; while( tokens[0].type != TokenType::RPAREN ) { auto expr = readTokens(tokens); lst.push_back( expr ); } // Drop off ')' right parentheis tokens.erase(tokens.begin()); return std::make_shared<ExprLst>(lst); } if( tok.type == TokenType::RPAREN ) { throw std::runtime_error("Error: Unexpected ')' right parenthesis "); } if( tok.type == TokenType::NUM) { return std::make_shared<ExprNum>( std::stoi(tok.text) ); } if( tok.type == TokenType::SYM && tok.text == "nil" ) { return std::make_shared<ExprNil>(); } if( tok.type == TokenType::SYM ) { return std::make_shared<ExprSym>( tok.text ); } if( tok.type == TokenType::KEYW ) { return std::make_shared<ExprKey>( tok.text ); } if( tok.type == TokenType::BOOL ) { return std::make_shared<ExprBool>( tok.text == "#t" ? true : false ); } if( tok.type == TokenType::STR ) { return std::make_shared<ExprStr>( tok.text ); } return std::make_shared<ExprErr>(-1, "BUG Found => Edge case found in the parser."); } // Parse Lisp-like (scheme-like) S-Expression by reading a string source code auto parseSexp(std::string const& code) -> std::shared_ptr<IExpr> { std::stringstream ss; ss << code; Tokenizer tokenizer(ss); auto tokens = tokenizer.readTokens(); auto result = readTokens(tokens); return result; } // Parse Lisp-like (scheme-like) S-Expression by reading an input stream auto parseSexp(std::istream& is) -> std::shared_ptr<IExpr> { Tokenizer tokenizer(is); auto tokens = tokenizer.readTokens(); auto result = readTokens(tokens); return result; } class PrintVisitor: public IVisitor { public: void visit(ExprNil& expr) override { std::cout << "nil"; } void visit(ExprStr& expr) override { std::cout << "\"" << expr.value << "\""; } void visit(ExprNum& expr) override { std::cout << expr.value; } void visit(ExprBool& expr) override { std::cout << (expr.value ? "#t" : "#f"); } void visit(ExprSym& expr) override { std::cout << expr.value; } void visit(ExprKey& expr) override { std::cout << expr.value; } void visit(ExprErr& expr) override { std::cout << "<< [ERROR] " << expr.info; } void visit(ExprFun& expr) override { std::cout << expr.strValue(); } void visit(ExprLst& expr) override { std::cout << "( "; for(auto node : expr.lst){ std::cout << " "; node->accept(*this); } std::cout << " )"; } }; class WriterVisitor: public IVisitor { std::ostream& out; public: WriterVisitor(std::ostream& os_): out(os_){} void visit(ExprNil& expr) override { out << "nil"; } void visit(ExprStr& expr) override { out << "\"" << expr.value << "\""; } void visit(ExprNum& expr) override { out << expr.value; } void visit(ExprBool& expr) override { out << (expr.value ? "#t" : "#f"); } void visit(ExprSym& expr) override { out << expr.value; } void visit(ExprKey& expr) override { out << expr.value; } void visit(ExprErr& expr) override { out << "<< [ERROR] " << expr.info; } void visit(ExprFun& expr) override { out << expr.strValue(); } void visit(ExprLst& expr) override { // out << "\n ( "; for(auto const& node : expr.lst) { out << " "; node->accept(*this); //if( node->isList() ){ out << '\n'; } } out << " )"; } }; // Print S-Expression void printSexp(IExpr& expr) { PrintVisitor visitor; expr.accept(visitor); std::cout << std::endl; } /// Lisp interpreter's primitive functions namespace primitives { // All Lisp functions passed to the interpreter must have this type signature // Lisp function primitive 'list' => Example: (list 10 20 (+ 1 2 5) "Hello" #f #t nil) std::shared_ptr<IExpr> list(std::vector<std::shared_ptr<IExpr>> const& args) { return std::make_shared<ExprLst>(args); } // Return the number of elements of alist std::shared_ptr<IExpr> length(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: length() requires 1 argument of type list."); } if( !args[0]->isList() ) { throw std::runtime_error("Error: length() expects 1 argument of type list."); } auto node = static_cast<ExprLst const&>( *args[0] ); auto lst = node.lst; return std::make_shared<ExprNum>(lst.size()); } // car => Returns first element of the list or nil std::shared_ptr<IExpr> car(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: car() requires 1 argument of type list."); } if( !args[0]->isList() ) { throw std::runtime_error("Error: car() expects 1 argument of type list."); } auto node = static_cast<ExprLst const&>( *args[0] ); auto lst = node.lst; if( lst.size() == 0 ){ return std::make_shared<ExprNil>(); } return lst[0]; } std::shared_ptr<IExpr> cdr(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: cdr() requires 1 argument of type list."); } if( !args[0]->isList() ) { throw std::runtime_error("Error: cdr() expects 1 argument of type list."); } auto node = static_cast<ExprLst const&>( *args[0] ); auto lst = node.lst; if( lst.size() == 0 ){ return std::make_shared<ExprNil>(); } // Remove first list argument lst.erase(lst.begin()); return std::make_shared<ExprLst>(lst); } // cons => List constructor std::shared_ptr<IExpr> cons(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: cons() requires 2 arguments."); } if( !args[1]->isList() && !args[1]->isNil() ) { throw std::runtime_error("Error: pairs construction not allowed. Second argument can only be a list or nil."); } if( args[1]->isList() ) { auto lst = std::static_pointer_cast<ExprLst>( args[1] )->lst; lst.insert(lst.begin(), args[0]); return std::make_shared<ExprLst>(lst); } assert( args[1]->isNil() ); auto lst = std::vector<std::shared_ptr<IExpr>>{ args[0] }; return std::make_shared<ExprLst>(lst); } // Get the nth element of a list std::shared_ptr<IExpr> nth(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: cons() requires 2 arguments."); } if( !args[0]->isNum() ) { throw std::runtime_error("Error: Expected number "); } if( !args[1]->isList() && !args[1]->isNil() ) { throw std::runtime_error("Error: Expected list or nil."); } if( args[1]->isNil() ) { return std::make_shared<ExprNil>(); } auto n = (int) args[0]->numValue(); auto lst = std::static_pointer_cast<ExprLst>(args[1])->lst; if( n < lst.size() && n >= 0 ){ return lst[n]; } return std::make_shared<ExprNil>(); } // (= 10 20) => #f ; (= 10 10) => #t std::shared_ptr<IExpr> equal_number(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: = requires 2 arguments."); } if( !args[0]->isNum() && !args[1]->isNum() ) { throw std::runtime_error("Error: Expected number "); } auto a = args[0]->numValue(); auto b = args[1]->numValue(); return std::make_shared<ExprBool>(a == b); } std::shared_ptr<IExpr> less_than(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: = requires 2 arguments."); } if( !args[0]->isNum() && !args[1]->isNum() ) { throw std::runtime_error("Error: Expected number "); } auto a = args[0]->numValue(); auto b = args[1]->numValue(); return std::make_shared<ExprBool>(a < b); } // Parse a string returning a S-Expression std::shared_ptr<IExpr> read(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: read() requires 1 argument of type string."); } if( args[0]->type() != ExprType::STR ) { throw std::runtime_error("Error: read() expects 1 argument of type string."); } std::string code = args[0]->strValue(); auto result = parseSexp(code); return result; } // Primitive predicate function list? => Return true if the argument is of type list std::shared_ptr<IExpr> is_list(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: list?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->isList() ); } // Primitive predicate function symbol? => Return true if the argument is of type symbol std::shared_ptr<IExpr> is_symbol(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: symbol?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->type() == ExprType::SYM ); } std::shared_ptr<IExpr> is_keyword(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: keyword?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->type() == ExprType::KEY ); } // Primitive predicate function number? => Return true if the argument is of type number std::shared_ptr<IExpr> is_number(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: number?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->isNum() ); } // Primitive predicate function boolean? => Return true if the argument is of type number std::shared_ptr<IExpr> is_boolean(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: boolean?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->type() == ExprType::BOOL ); } // Primitive predicate function nil? => Return true if the argument is of type nil std::shared_ptr<IExpr> is_nil(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: nil?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->isNil() ); } // Primitive predicate function string? => Return true if the argument is of type string std::shared_ptr<IExpr> is_string(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: string?() requires 1 argument."); } return std::make_shared<ExprBool>( args[0]->type() == ExprType::STR ); } // Display an S-Expression std::shared_ptr<IExpr> disp(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ){ throw std::runtime_error("Error: function disp() requires 1 argument."); } PrintVisitor visitor; args[0]->accept(visitor); std::cout << "\n"; return std::make_shared<ExprNil>(); } // Lisp function (+) => Example: (+ 1 2 3 4 5) = 15 std::shared_ptr<IExpr> add(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() == 0 ){ throw std::runtime_error("Error: this function requires at least 1 argument."); } double acc = 0.0; for(auto const& a: args){ if( a->type() != ExprType::NUM){ throw std::runtime_error("Invalid argument. Expected a number"); } acc = acc + a->numValue(); } return std::make_shared<ExprNum>(acc); } // Lisp function (*) => Example: (* 1 2 3 4 5) = 120 std::shared_ptr<IExpr> mul(std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() == 0 ){ throw std::runtime_error("Error: this function requires at least 1 argument."); } double acc = 1.0; for(auto const& a: args){ if( a->type() != ExprType::NUM){ throw std::runtime_error("Invalid argument. Expected a number"); } acc = acc * a->numValue(); } return std::make_shared<ExprNum>(acc); } // List function (-) => Example: (- 100 20 30 ) = 100 - 20 - 30 = 50 std::shared_ptr<IExpr> sub(std::vector<std::shared_ptr<IExpr>> const& args) { // std::fprintf(stderr, " [TRACE] Called Lispfun sub() \n "); if (args.size() == 0) { throw std::runtime_error("Error: this function requires at least 1 argument."); } if (args.size() == 1) { if (args[0]->type() != ExprType::NUM) { throw std::runtime_error("Error: invalid argument type. Expected number"); } assert(args[0]->type() == ExprType::NUM); return std::make_shared<ExprNum>(-args[0]->numValue()); } if (args[0]->type() != ExprType::NUM) { throw std::runtime_error("Error: invalid argument type. Expected number"); } double acc = args[0]->numValue(); // std::fprintf(stderr, " [TRACE] acc = %f \n", acc); auto args_ = args; // Remove first element args_.erase(args_.begin()); for (auto const &a : args_) { if (a->type() != ExprType::NUM) { throw std::runtime_error("Invalid argument. Expected a number"); } double x = a->numValue(); // std::fprintf(stderr, " [TRACE] x = %f \n", x); acc = acc - x; } return std::make_shared<ExprNum>(acc); } std::shared_ptr<IExpr> div(std::vector<std::shared_ptr<IExpr>> const& args) { // std::fprintf(stderr, " [TRACE] Called Lispfun sub() \n "); if (args.size() == 0) { throw std::runtime_error("Error: this function requires at least 1 argument."); } if (args.size() == 1) { if (args[0]->type() != ExprType::NUM) { throw std::runtime_error("Error: invalid argument type. Expected number"); } assert(args[0]->type() == ExprType::NUM); return std::make_shared<ExprNum>(1.0 / args[0]->numValue()); } if (args[0]->type() != ExprType::NUM) { throw std::runtime_error("Error: invalid argument type. Expected number"); } double acc = args[0]->numValue(); // std::fprintf(stderr, " [TRACE] acc = %f \n", acc); auto args_ = args; // Remove first element args_.erase(args_.begin()); for (auto const &a : args_) { if (a->type() != ExprType::NUM) { throw std::runtime_error("Invalid argument. Expected a number"); } double x = a->numValue(); // std::fprintf(stderr, " [TRACE] x = %f \n", x); acc = acc / x; } return std::make_shared<ExprNum>(acc); } } class Eval { // Reference to itself in oder to provide code completion Eval& self = *this; public: using Env = std::map<std::string, std::shared_ptr<IExpr>>; // Lisp environment Env _env; Eval() { self.addFunction("disp", &primitives::disp); // Primitive list functions self.addFunction("list", &primitives::list); self.addFunction("length", &primitives::length); self.addFunction("car", &primitives::car); self.addFunction("cdr", &primitives::cdr); self.addFunction("cons", &primitives::cons); self.addFunction("nth", &primitives::nth); // Primitive predicate functions self.addFunction("number?", &primitives::is_number); self.addFunction("boolean?", &primitives::is_boolean); self.addFunction("symbol?", &primitives::is_symbol); self.addFunction("nil?", &primitives::is_nil); self.addFunction("keyword?", &primitives::is_keyword); self.addFunction("list?", &primitives::is_list); // Primtive math functions self.addFunction("=", &primitives::equal_number); self.addFunction("<", &primitives::less_than); self.addFunction("+", &primitives::add); self.addFunction("*", &primitives::mul); self.addFunction("-", &primitives::sub); self.addFunction("/", &primitives::div); self.addMath1ArgFun("sin", static_cast<double (*) (double)>(&std::sin)); self.addMath1ArgFun("cos", static_cast<double (*) (double)>(&std::cos)); self.addMath1ArgFun("tan", static_cast<double (*) (double)>(&std::tan)); self.addMath1ArgFun("sqrt", static_cast<double (*) (double)>(&std::sqrt)); self.addMath1ArgFun("inv", [](double x){ return 1.0 / x; }); self.addVariable("PI", 3.1415); // Parse string into a SExp - S-Expression self.addFunction("read", &primitives::read); // Evaluate a List (AST - Abstract Syntax Tree) or SEXP returning a SEXP (S-Expression) self.addFunction("eval", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 ) { throw std::runtime_error("Error: eval(sexp) requires 1 argument of type list."); } if( !args[0]->isList() ) { throw std::runtime_error("Error: eval(sexpr) expects 1 argument of type list."); } auto expr = static_cast<ExprLst const&>( *args[0] ); return expr.eval(this->_env); }); // load a file self.addFunction("load-file", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 && !args[0]->isStr() ) { throw std::runtime_error("Error: load-file() requires 1 argument of type string."); } auto file = args[0]->strValue(); auto expr = this->evalFile(file); return expr; }); // Read a single S-Expression from file without evaluating it self.addFunction("read-file", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 && !args[0]->isStr() ) { throw std::runtime_error("Error: eval(sexp) requires 1 argument of type string."); } auto file = args[0]->strValue(); auto ifs = std::ifstream(file); auto sexp = parseSexp(ifs); return sexp; }); // Write a single S-Expression to a file self.addFunction("write-file", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 && !args[0]->isStr() ) { throw std::runtime_error("Error: eval(sexp) requires 1 argument of type string."); } auto file = args[0]->strValue(); auto ofs = std::ofstream(file); WriterVisitor visitor(ofs); args[1]->accept(visitor); return std::make_shared<ExprNil>(); }); // Read a multiple S-Expression from file without evaluating it self.addFunction("read-file-multiple", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1 && !args[0]->isStr() ) { throw std::runtime_error("Error: eval(sexp) requires 1 argument of type string."); } auto file = args[0]->strValue(); std::ifstream is(file); if( !is.good() ){ throw std::runtime_error("Error: stream has errors."); } std::stringstream ss; ss << is.rdbuf(); auto code = "( \n" + ss.str() + "\n )"; auto sexp = parseSexp(code); return sexp; }); self.addFunction("apply", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: apply() requires 2 arguments."); } if( !args[0]->isFun() ) { throw std::runtime_error("Error: expected function as first argument."); } if( !args[1]->isList() ) { throw std::runtime_error("Error: expected list as second argument."); } auto func = std::static_pointer_cast<ExprFun>(args[0]); auto params = std::static_pointer_cast<ExprLst>(args[1]); return func->call(params->lst, this->_env); }); self.addFunction("map", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: apply() requires 2 arguments."); } if( !args[0]->isFun() ) { throw std::runtime_error("Error: expected function as first argument."); } if( !args[1]->isList() ) { throw std::runtime_error("Error: expected list as second argument."); } auto func = std::static_pointer_cast<ExprFun>(args[0]); auto params = std::static_pointer_cast<ExprLst>(args[1]); auto result = std::vector<std::shared_ptr<IExpr>>{}; auto args_ = std::vector<std::shared_ptr<IExpr>>{ std::make_shared<ExprNil>() }; result.reserve( params->lst.size() ); for(auto const& p: params->lst){ args_[0] = p; result.push_back( func->call( args_, this->_env ) ); } return std::make_shared<ExprLst>(result); }); self.addFunction("for-each", [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 2 ) { throw std::runtime_error("Error: apply() requires 2 arguments."); } if( !args[0]->isFun() ) { throw std::runtime_error("Error: expected function as first argument."); } if( !args[1]->isList() ) { throw std::runtime_error("Error: expected list as second argument."); } auto func = std::static_pointer_cast<ExprFun>(args[0]); auto params = std::static_pointer_cast<ExprLst>(args[1]); for(auto const& p: params->lst){ auto args_ = std::vector<std::shared_ptr<IExpr>>{ p }; func->call( args_, this->_env ); } return std::make_shared<ExprNil>(); }); } void addVariable(std::string const& name, double value) { auto expr = std::make_shared<ExprNum>(value); _env[name] = expr; } void addVariable(std::string const& name, std::shared_ptr<IExpr> expr) { _env[name] = expr; } void addFunction(std::string const& name, LispFunc func) { auto fn = std::make_shared<ExprFunNative>(func); fn->name = name; _env[name] = fn; } template<typename Func> void addMath1ArgFun(std::string const& name, Func funcMath) { auto func = [=](std::vector<std::shared_ptr<IExpr>> const& args) { if( args.size() != 1){ throw std::runtime_error(" [ERROR] This function requires exactly one argument."); } auto x = args[0]; if( x->type() != ExprType::NUM ){ throw std::runtime_error(" [ERROR] Invalid arguument type. Number required. "); } double res = funcMath(x->numValue()); return std::make_shared<ExprNum>(res); }; auto val = std::make_shared<ExprFunNative>(func); val->name = name; _env[name] = val; } std::shared_ptr<IExpr> eval(IExpr& expr) { return expr.eval( this->_env ); } // Interactive read-print eval loop0 void repl() { std::string line; std::string arg, command; for(;;) { std::cout << " $ lisp>> "; if( !std::getline(std::cin, line) ){ break; } if( line == "" ){ continue; } std::stringstream ss(line); ss >> command >> arg; try { if( command == ":file") { auto res = this->evalFile(arg); continue; } if( command == ":block") { std::fprintf(stderr, " [INFO] Type a multi-line s-expression and type (;;) as the last line when you are done. \n"); std::string code; while( line != ";;" ){ std::getline(std::cin, line); code = code + "\n" + line; } auto res = this->evalCode(code); self.addVariable("ans", res); printSexp(*res); continue; } auto res = this->evalCode(line); self.addVariable("ans", res); printSexp(*res); } catch(const std::exception& e) { std::cerr << e.what() << '\n'; } } } /// Eval lisp-like code given as string std::shared_ptr<IExpr> evalCode(std::string const& code) { auto expr = parseSexp(code); auto result = expr->eval(_env); return result; } std::shared_ptr<IExpr> evalFile(std::string const& file) { std::ifstream is(file); if( !is.good() ){ throw std::runtime_error("Error: stream has errors."); } std::stringstream ss; ss << is.rdbuf(); auto code = "(begin \n" + ss.str() + "\n )"; auto expr = parseSexp(code); auto result = expr->eval(_env); return result; } }; int main(){ std::cout << " [TRACE] Testing " << std::endl; const char* text = R"( ; comment1 ; comment 2 ; comment 3 ( print->list ; comment 4 " Hello world" A8 B100 ; Next comment #t #f (+ 20 30 -200(* 20 -20 x y z 8000)) ) )"; std::stringstream ss; #if 0 std::cout << " =========== Tokenizer ======================== " << std::endl; ss << text; ss << "100"; Tokenizer tokenizer(ss); Token tok; while( !tok.isEOF() ){ tok = tokenizer.nextToken(); std::cout << " => " << tok.type << " ; " << tok.text << " ; line = " << tok.lin << " ; col = " << tok.col << "\n"; } #endif auto expr = parseSexp(text); std::cout << std::boolalpha << " [TRACE] expr->isList() = " << expr->isList() << "\n"; printSexp(*expr); std::cout << " ====== Evaluate Code ======================== " << std::endl; const char* code1 = R"( (list 10 #t #f "hello world" ; Operation 1 (+ 10 25 6) ; Operation 2 (* 4 2 6) ) )"; Eval eval; auto res = eval.evalCode(code1); std::cout << " [TRACE] Result = "; printSexp(*res); std::cout << "============ Start LISP REPL ======================" << std::endl; eval.repl(); return 0; }
Building the lisp interpreter:
$ g++ cpplisp.cpp -o cpplisp.bin -g -Wall -Wextra -std=c++1z
Running:
$ >> rlwrap ./cpplisp.bin [TRACE] Testing [TRACE] expr->isList() = true ( print->list " Hello world" A8 B100 #t #f ( + 20 30 -200 ( * 20 -20 x y z 8000 ) ) ) ====== Evaluate Code ======================== [TRACE] Result = ( 10 #t #f "hello world" 41 48 ) ============ Start LISP REPL ====================== $ lisp>> (list '(+ 1 2 3 4 5) (+ 1 2 3 4 5) '(* 4 100) (* 4 100) :keyword #t #f nil "Hello world") ( ( + 1 2 3 4 5 ) 15 ( * 4 100 ) 400 :keyword #t #f nil "Hello world" ) $ lisp>> '(list (+ 1 2 3 4 5) (+ 1 2 3 4 5) '(* 4 100) (* 4 100) :keyword #t #f nil "Hello world") ( list ( + 1 2 3 4 5 ) ( + 1 2 3 4 5 ) ( quote ( * 4 100 ) ) ( * 4 100 ) :keyword #t #f nil "Hello world" ) $ lisp>> (cons 10 20) Error: pairs construction not allowed. Second argument can only be a list or nil. $ lisp>> (cons 20 nil) ( 20 ) $ lisp>> (cons 10 (cons 20 nil)) ( 10 20 ) $ lisp>> (cons "hello" (cons 10 (cons 20 nil))) ( "hello" 10 20 ) $ lisp>> $ lisp>> (cons 'world (cons "hello" (cons 10 (cons 20 nil)))) ( world "hello" 10 20 ) $ lisp>> (cons :keyword (cons 'world (cons "hello" (cons 10 (cons 20 nil))))) ( :keyword world "hello" 10 20 ) $ lisp>> (cons sqrt (cons :keyword (cons 'world (cons "hello" (cons 10 (cons 20 nil)))))) ( (native function) sqrt :keyword world "hello" 10 20 ) $ lisp>> $ lisp>> (+ 1 2 3 4 5 6) 21 $ lisp>> (* 1 2 3 4 5 6) 720 $ lisp>> (list 1 2 3 5 6 7 8) ( 1 2 3 5 6 7 8 ) $ lisp>> (set x 10) [TRACE] Defining symbol = x . nil $ lisp>> ( (if (< x 5) + *) 100 20) 2000 $ lisp>> ( (if (< x 50) + *) 100 20) 120 $ lisp>> (set code "(apply + (list 1 3 4 5 6 7))") [TRACE] Defining symbol = code . nil $ lisp>> code "(apply + (list 1 3 4 5 6 7))" $ lisp>> (set expr (read code)); read SExp - S-Expression from string [TRACE] Defining symbol = expr . nil $ lisp>> expr ( apply + ( list 1 3 4 5 6 7 ) ) $ lisp>> (eval expr) ;; Evaluate S-Expression 26 $ lisp>> :block [INFO] Type a multi-line s-expression and type (;;) as the last line when you are done. ; Execute a let-binding expression (let ( (a 10) (b (+ (* a a) 10)) (c (sqrt (+ a b))) ) (disp a) (disp b) (disp c) (list :x a :y b :result3 c :z (+ a b c))) ;; 10 110 10.9545 ( :x 10 :y 110 :result3 10.9545 :z 130.954 ) $ lisp>> a [ERROR] Unbound variable a $ lisp>> b [ERROR] Unbound variable b $ lisp>> c [ERROR] Unbound variable c $ lisp>> $ lisp>> (load-file "code.lisp") ; Load a lisp file [TRACE] Defining symbol = mylist . [TRACE] Defining symbol = code-fact . [TRACE] Defining function = fib . [TRACE] Defining function = make-adder . [TRACE] Defining symbol = add10 . [TRACE] Defining symbol = add50 . [TRACE] Defining symbol = func . [TRACE] Defining function = compute-trig . nil $ lisp>> $ lisp>> mylist ( 10 #t #f "hello world" hello world :keyword1 41 ( + 10 25 6 ) :keyword2 48 ( * 4 2 6 ) ) $ lisp>> (car mylist) 10 $ lisp>> (cdr mylist) ( #t #f "hello world" hello world :keyword1 41 ( + 10 25 6 ) :keyword2 48 ( * 4 2 6 ) ) $ lisp>> (car (cdr mylist)) #t $ lisp>> (nth 5 mylist) world $ lisp>> (add10 20) 30 $ lisp>> (add50 80) 130 $ lisp>> (map add10 (list 1 2 3 4 5 6 7 )) ( 11 12 13 14 15 16 17 ) $ lisp>> (map add50 (list 1 2 3 4 5 6 7 )) ( 51 52 53 54 55 56 57 ) $ lisp>> (map (fn (x) (* x x)) (list 1 2 3 4 5 6 7 )) ; Map a lambda ( 1 4 9 16 25 36 49 ) $ lisp>> (compute-trig 45) ( :angle 45 :cos 0.707123 :sin 0.70709 :tan 0.999954 ) $ lisp>> (compute-trig 90) ( :angle 90 :cos 4.63268e-05 :sin 1 :tan 21585.8 ) $ lisp>> (compute-trig 30) ( :angle 30 :cos 0.866033 :sin 0.499987 :tan 0.57733 ) $ lisp>> (map compute-trig (list 30 90 45)) ( ( :angle 30 :cos 0.866033 :sin 0.499987 :tan 0.57733 ) ( :angle 90 :cos 4.63268e-05 :sin 1 :tan 21585.8 ) ( :angle 45 :cos 0.707123 :sin 0.70709 :tan 0.999954 ) ) $ lisp>> (func 10) 0.1502 $ lisp>> (func 1) 0.124383 $ lisp>> a [ERROR] Unbound variable a $ lisp>> b [ERROR] Unbound variable b $ lisp>> c [ERROR] Unbound variable c $ lisp>> $ lisp>> (for-each disp (list "Hello" 'world :lisp 100 nil #t #f)) "Hello" world :lisp 100 nil #t #f $ lisp>> (set make-multiplier-function (fn (mult) (fn (x) (* mult x)))) [TRACE] Defining symbol = make-multiplier-function . nil $ lisp>> (set mult-by-10 (make-multiplier-function 10)) [TRACE] Defining symbol = mult-by-10 . nil $ lisp>> (set mult-by-6 (make-multiplier-function 6)) [TRACE] Defining symbol = mult-by-6 . nil $ lisp>> (map mult-by-10 (list 1 2 3 4 5)) ( 10 20 30 40 50 ) $ lisp>> (map mult-by-6 (list 1 2 3 4 5)) ( 6 12 18 24 30 ) $ lisp>> ( (make-multiplier-functon 4) (list 1 2 3 4 5)) [ERROR] Unbound variable make-multiplier-functon $ lisp>> (map (make-multiplier-function 4) (list 1 2 3 4 5)) ( 4 8 12 16 20 ) $ lisp>>
Read file gui-dsl.lisp:
$ lisp>> (set expr (read-file "gui-dsl.lisp")) [TRACE] Defining symbol = expr . nil $ lisp>> expr ( div :class "myclass-csss-style" :bgcolor "#ff81a" ( p "Paragraph 1 " ) ( img :src "http://www.mydomain/images/myimage.png" :caption "Image 1 data" ) ( button :label "Submit" :id "btn-submit" :class "my-style-button" :onclick ( fn ( event ) ( display "Button was clicked" ) ) ) ) $ lisp>> (car expr) div $ lisp>> (car (cdr expr)) :class $ lisp>> (nth 5 expr) ( p "Paragraph 1 " ) $ lisp>> (car (nth 5 expr)) p $ lisp>> (cdr (nth 5 expr)) ( "Paragraph 1 " ) $ lisp>> (car (cdr (nth 5 expr))) "Paragraph 1 " $ lisp>> (nth 6 expr) ( img :src "http://www.mydomain/images/myimage.png" :caption "Image 1 data" ) $ lisp>> (nth 7 expr) ( button :label "Submit" :id "btn-submit" :class "my-style-button" :onclick ( fn ( event ) ( display "Button was clicked" ) ) )
1.11.3 dotted list sexp parser algorithm
In traditional Scheme and Lisp implementations, lists are not reprsented as vectors or arrays, instead they are represented as linked list of cons, also called dotted pairs cells. Each cons cell has two parts. The first part car points to another s-expression, that either can be an atomic element or other cons cell. The second part of a dotted pair, cons can point to another dotted pair, an empty list or an atomic element. If the cons of the final dotted pair of a list points to an empty list (nil in common lisp), the list is said to be a proper list. If the dotted pair points to an atomic element, the list is said to be an improper list.
The terms car, cdr and cons, come from the following abbreviations:
- car - comes from 'Contents of the Address part of the Register'
- cdr - comes from 'Contents of the Decrement part of the Register'
- cons - abbreviation of the word 'construct'
A single dotted pair (a . b) denoting an association of two atoms 'a' and 'b' can be represented with the following diagram.
(a . b) or (cons a b) +-----+-----+ | car | cdr | ---> b +-----+-----+ | \ / a
According to this dotted-pairs or cons-cell linked lista approach, a list (a b c) is represented as:
representation 1: (cons a (cons b (cons c nil))) representation 2: (a . (b . (c . nil))) +-----+-----+ +-----+-----+ +-----+-----+ | car | cdr | --->| car | cdr | --->| car | cdr |---> NIL +-----+-----+ +-----+-----| +-----+-----+ => Empty list in Scheme '()) | | | => nil (null pointer) in Common Lisp \ / \ / \ / a b c
A proper list, such as (a b c d), is terminated in nil as shown in the following diagram:
cons(a, cons(b, cons(c, cons(d, nil))) (a . (b . (c . (d . nil))))
An improper list, such as (a b . c), is terminated in a non nil or non empty-list atom. The next diagram show how this list is represented in memory.
cons(a, cons(b, c)) (a . (b . c)) +-----+-----+ +-----+-----+ | car | cdr | --->| car | cdr | ---> c +-----+-----+ +-----+-----| | | \ / \ / a b
The EBNF grammar of this dotted-pairs linked list can be specified as follow:
sexp := atom | pair | list pair := "(" sexp "." sexp ")" list := "(" sexp* ")" atom := NUMBER | SYMBOL | STRING | BOOLEAN | SYMBOL | NULL
The AST can be represnted using OCaml algebraic data type notation as:
type sexp = num of int // number literal | sym of string // symbol, identifier or keyword | str of string // string literal between quotes "" | bool of boolean // boolean literak #t or #f | nil // represents null value | cons of sexp * sexp // cons cell => It can represent a null-terminated
The previous grammar can be parsed using this pascal-like pseudocode:
// Token Types // //------------------------------------------------// // Delimiters T_LPAR "(" // left parenthesis T_RPAR ")" // right parenthesis T_DOT "." // Dot - found in dotted pair '(a . b) T_QUOTE "'" // quote (') // Literal token types (terminal elements of grammar) T_NUM // number T_STR // string literal "something else" T_SYM // symbol 'my-symbol T_NIL // nil => null element and empty list T_BOOL // boolean #t (true) or #f (false) // Sentinel Token Types T_EOF // End Of File T_ERR // Error T_NDF // Non Defined - non initialized token // -------------------------------------------------------// // Those tokens come from the tokenizer which skips blank characters, white spaces and new lines, and // breaks the source code into terminal elements of the grammar. tokens := [ Token(T_LPAR, "(", Token(T_NUM, "-1245.13e3"), Token(T_SYM, "x1") ,Token(T_STR, "hello world"), Token(T_NIL, "nil"), Token(T_BOOL, "#f") ,Token(T_BOOL, "#t"), Token(T_DOT, "."), Token(T_RPAR, ")") ] // ------- Global Variables ==> They should be contained in an object. // Current position of cursors index := 0 // Current token current := Token(T_NDF, "") // Advance to next token FUNCTION advance(): void IF index >= len(tokens) THEN RETURN Token(T_EOF, "") END current := tokens[index] index := index + 1 END // Check whether the current token type has the same type as the argument FUNCTION check(token_type): bool RETURN current.type == token_type END // Returns true if the current token matches the expected type and consume it. FUNCTION match(token_type): bool IF current.type == token_type THEN advance() RETURN true END RETURN false END // If the current token does not match the expected type, raise an error. FUNCTION expect(token_type) IF current.type == token_type advance() ELSE error("Expected token ", token_type, ", but given ", current) END END // Return true if the current token is EOF (End Of File) FUNCTION is_eof(): bool RETURN current.type == EOF END //--------------------------------------------------------// // Parse Lisp symbolic expression FUNCTION parse_sexp(): Ast BEGIN IF token.is_atom() THEN RETURN parse_atom() ELSE IF match(T_LPAR) THEN RETURN parse_list() ELSE IF match(T_QUOTE) THEN q = self.parse_sexp() RETURN AstCons( AstSymbol("quote"), AstCons(q, AstNil() ) ) ELSE error("Invalid token: ", token_string) END END FUNCTION parse_atom(): Ast BEGIN ast := null IF check(TOKEN_NUM) THEN ast := AstInt(parse_num(lexeme)) END ELSE IF check(TOKEN_STR) THEN ast := AstStr(lexeme) END ELSE IF check(TOKEN_SYM) THEN ast := AstSym(lexeme) END ELSE IF check(TOKEN_NULL) THEN ast := AstNil() END // If this line runs, it means that there is a bug. ELSE error("parse_atom() => invalid token: ", token_string) END // Advance token advance() RETURN ast END FUNCTION parse_list(): Ast IF match(T_RPAR) THEN RETURN AstNil() ELSE IF match(T_DOT) THEN IF match(T_RPAR) THEN error("Error: ill-formed dotted list.") END element := parse_sexp() expect(T_RPAR) RETURN element ELSE iF match(T_EOF) THEN error("Expected right parenthesis") END head := parse_sexp() tail := pase_list() pair := cons(head, tail) RETURN pair END //---------- Program entry point - main() --------// // FUNCTION main(): void // Lexical analysis, break text/source into tokens tokens := tokenize(source_code) // Initialize parser advance() // parser AST ast := parse_sexp() // Display result print(" Lisp AST: ", ast) END
The next code is an executable implementation of the presented parsing algorithm implemented in Python, but that can be easily translated to any other programming language.
File: sexp_parser1.py
from typing import Any, Callable, Dict, NamedTuple, Optional, List, Tuple # ============= >>>> Token types <<<<=================== # --- Literals ------------------------------------------------- T_INT = "T_INT" # Integer: 1002 -245 9025 T_FLT = "T_FLT" # Floating point number: -10.2522 9.25221 .0252e3 102.5E3 T_STR = "T_STR" # String literal: "world" T_SYM = "T_SYM" # Symbol: 'hello T_NIL = "T_NIL" # Nil element or end of list T_BOOL = "T_BOOL" # Boolean: #f or #t # --- Delimiters ----------------------------------------------- T_LPAR = "T_LPAR" # left parenthesis T_RPAR = "T_RPAR" # right parenthesis T_DOT = "T_DOT" # dot '.' T_QUOTE = "T_QUOTE" # quote (') # --- Sentinel tokens --------------------------------------- T_NDF = "T_NDF" # Non defined, non initialized sentinel token T_EOF = "T_EOF" # End of file sentinel token class Token(NamedTuple): type: str # Token type text: str # Token text, also known as lexeme pos: int # Token position col: int # Token column - starts with 1 lin: int # Token line - starts with 1 TOKEN_EOF = Token(T_EOF, "", 0, 0, 0) NULLCHAR = '\0' class Tokenizer: def __init__(self, source: str) -> None: # Source code to be tokenized self._source = source # Current position of cursor self._pos = 0 # Current line of cursor self._lin = 1 # Current column of cursor self._col = 1 # Position of first token character self._token_pos = 0 # Column of first token character self._token_col = 1 # Line of first token chracter self._token_lin = 1 # Current character self._chr = '' # Current token self.advance() def advance(self): if self._pos >= len(self._source): self._chr = '\0' return self._chr = self._source[self._pos] self._col += 1 if self._chr == '\n': self._lin += 1 self._col = 1 self._pos += 1 def is_eof(self): """Return true if character is end of input""" return self._chr == NULLCHAR def has_next(self) -> bool: """Return true if tokenizer has next token""" return not self.is_eof() def is_space(self): """Return true if current character is space or blank character. """ return self._chr.isspace() def is_digit(self): """Return true if current character is digit.""" return self._chr.isdigit() def is_delimiter(self): return self._chr in ["(", ")", ".", "'"] def check(self, chr: str): return self._chr == chr def peek(self): """Return current character""" return self._chr def lookahead(self, n: int = 1): """Returns next character in the stream without consuming it""" if self._pos + n >= len(self._source): return NULLCHAR return self._source[self._pos + n - 1] def expect(self, chr: str): if self._chr == chr: self.advance() return raise RuntimeError(f"Expected character {chr}, but given {self._chr}.") def match(self, chr: str): if self._chr == chr: self.advance() return True return False def next(self) -> Token: """Return next token""" if self.is_eof(): return self._token(T_EOF, "") # Skip whitespace self._skip_whitespace() self.set_token_position() if self.match("("): return self._token(T_LPAR, "(") elif self.match(")"): return self._token(T_RPAR, ")") elif self.match("."): return self._token(T_DOT, ".") elif self.match("'"): return self._token(T_QUOTE, "'") elif self.check('-') and self.lookahead().isdigit(): return self._next_number() elif self.is_digit(): return self._next_number() elif self.check("\""): return self._next_string() else: return self._next_symbol_or_bool() def tokens(self) -> List[Token]: lst = [] while not self.is_eof(): t = self.next() if t.type == T_EOF: break lst.append(t) return lst def set_token_position(self): """Store position of beginning of token until this function is called again.""" self._token_pos = self._pos - 1 self._token_col = self._col - 1 self._token_lin = self._lin def _skip_whitespace(self): while self.is_space() and not self.is_eof(): self.advance() def _token(self, type: str, lexeme: str) -> Token: return Token(type, lexeme ,self._token_pos ,self._token_col ,self._token_lin) def _next_symbol_or_bool(self) -> Token: lexeme = "" self.set_token_position() while not self.is_eof() and not self.is_space() and not self.is_delimiter(): lexeme += self.peek() self.advance() if lexeme == "": return self._token(T_EOF, "") if lexeme == "#t" or lexeme == "#f": return self._token(T_BOOL, lexeme) return self._token(T_SYM, lexeme) def _next_number(self) -> Token: lexeme = "" self.set_token_position() # In grammars where '-' negative sign is not part of literal # remove the next line if self.match('-'): lexeme += "-" if not self.check('.'): lexeme += self._next_integer() if not self.check('.') and not self.check('e') and not self.check('E'): if self.peek().isalpha(): raise RuntimeError("Error: invalid number, cannot end with letter") # Identified integer type return self._token(T_INT, lexeme) if self.match('.'): lexeme += "." + self._next_integer() letter = self.peek() if self.match('e') or self.match('E'): lexeme += letter if not self.match('-') and not self.is_digit(): raise RuntimeError("Error: invalid floating point number: " + lexeme) if self.match('-'): lexeme += "-" if not self.is_digit(): raise RuntimeError("Error: invalid floating point number. Expected digit, but got: " + lexeme) lexeme += self._next_integer() if self.peek().isalpha(): raise RuntimeError("Error: invalid number, it cannot end with letter") # Return a lexeme (string representation) of a floating point number literal return self._token(T_FLT, lexeme) def _next_integer(self) -> str: assert self.is_digit() lexeme = "" while self.is_digit() and not self.is_eof(): lexeme += self.peek() self.advance() return lexeme def _next_string(self) -> Token: self.expect('"') lexeme = "" self.set_token_position() while not self.is_eof() and not self.check('"'): if self.match('\\'): if self.match('n'): lexeme += "\n"; continue if self.match('t'): lexeme += "\t"; continue if self.match('v'): lexeme += "\v"; continue if self.match('"'): lexeme += "\""; continue if self.match('\\'): lexeme += "\\"; continue raise RuntimeError("Tokenizer error, invalid escape character: '" + self._ch + "'") lexeme += self.peek() self.advance() self.expect('"') return self._token(T_STR, lexeme) EXPR_CONS = "EXPR_CONS" EXPR_NIL = "EXPR_NIL" EXPR_BOOL = "EXPR_BOOL" EXPR_STR = "EXPR_STR" EXPR_SYM = "EXPR_SYM" EXPR_NUM = "EXPR_NUM" EXPR_ERR = "EXPR_ERR" class Expr: """Reprsents a S-Expression symbolic expression LISP linked list.""" def to_repr(self) -> str: return "<NOT IMPLEMENTED>" def to_str(self) -> str: return "<NOT IMPLEMENTED>" def type(self) -> str: return "<NOT IMPLEMENTED>" def to_num(self) -> float: return 0.0 def is_atom(self) -> bool: return False def is_pair(self) -> bool: return False def is_nil(self) -> bool: return False def is_num(self) -> bool: return False def is_sym(self) -> bool: return False def is_bool(self) -> bool: return False def is_str(self) -> bool: return False def is_err(sefl) -> bool: return False def car(self) -> 'Expr': raise RuntimeError("Not implemented") def cdr(self) -> 'Expr': raise RuntimeError("Not implemented") def __repr__(self) -> str: return self.to_repr() class ExprNil(Expr): """This AST node represents an empty list or null element.""" def is_atom(self) -> bool: return True def is_nil(self) -> bool: return True def to_repr(self) -> str: return "nil" def to_str(self) -> str: return "nil" def type(self) -> str: return EXPR_NIL class ExprCons(Expr): """AST node represents a cons cell.""" def type(self) -> str: return EXPR_CONS def is_pair(self) -> bool: return True def __init__(self, car: Expr, cdr: Expr) -> None: self._car = car self._cdr = cdr def car(self) -> 'Expr': return self._car def cdr(self) -> 'Expr': return self._cdr def to_repr(self) -> str: head = self._car.to_repr() tail = self._cdr.to_repr() return f"cons({head}, {tail})" def to_str(self) -> str: return self.to_repr() def type(self) -> str: return EXPR_CONS class ExprSym(Expr): """AST node represents a lisp symbol.""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_sym(self) -> bool: return True def to_repr(self): return "'" + self._value def to_str(self): return self._value def type(self) -> str: return EXPR_SYM class ExprNum(Expr): """AST node represents a lisp number.""" def __init__(self, value: int): self._value = value def is_atom(self) -> bool: return True def is_sym(self) -> bool: return True def to_repr(self): return str(self._value) def to_str(self): return str(self._value) def to_num(self) -> float: return self._value def type(self) -> str: return EXPR_NUM class ExprStr(Expr): """AST node represents a lisp symbol.""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_str(self) -> bool: return True def to_repr(self): return '"' + self._value + '"' def to_str(self): return self._value def type(self) -> str: return EXPR_STR class ExprBool(Expr): """AST node represents a lisp boolean value.""" def __init__(self, value: bool): self._value = value def is_atom(self) -> bool: return True def is_bool(self) -> bool: return True def to_repr(self): return "#t" if self._value else "#f" def to_str(self): return "#t" if self._value else "#f" def type(self) -> str: return EXPR_BOOL class ExprErr(Expr): """AST node represents a runtime error value (not an AST value).""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_err(sefl) -> bool: return True def to_repr(self): return "err => " + self._value def to_str(self): return self._value def type(self) -> str: return EXPR_ERR class Parser: def __init__(self, source: str) -> None: self._source = source tok = Tokenizer(source) self._tokens = tok.tokens() self._pos = 0 self._current = Token(T_NDF, "", 0, 0, 0) self.advance() def peek(self): return self._current def lookahead(self, n: int = 1): """Returns next character in the stream without consuming it""" if self._pos + n >= len(self._tokens): return Token(T_EOF, 0, 0, 0, 0) return self._tokens[self._pos + n - 1] def advance(self): """Advance to next token""" if self._pos >= len(self._tokens): self._current = Token(T_EOF, 0, 0, 0, 0) return self._current = self._tokens[self._pos] self._pos += 1 def match(self, type: str): """Returns true if the current token matches the expected type and consume it.""" if self._current.type == type: self.advance() return True return False def check(self, type: str): """Returns true if the current token matches the expected type and consume it.""" return self._current == type def expect(self, type: str): """Consume current token if it matches the expected type. Otherwise raise an error.""" if self._current.type == type: self.advance() return raise RuntimeError(f"Expected token of type {type}, but given {self._current}.") def is_eof(self): """Return true if current token is T_EOF""" return self._current.type == T_EOF # ------- End of reusable parser code ---------------_# def is_atom(self): """Return true if current token is atom""" return self.peek().type in [ T_BOOL, T_FLT, T_INT, T_SYM, T_NIL, T_STR ] def parse_sexp(self) -> Expr: if self.is_atom(): return self.parse_atom() elif self.match(T_LPAR): return self.parse_list() # 'x is parsed as(quote x) # '(+ x 10) is parsed as (quote (+ x 10)) elif self.match(T_QUOTE): q = self.parse_sexp() return ExprCons(ExprSym("quote"), ExprCons(q, ExprNil())) else: raise RuntimeError("Invalid token: " + str(self.peek())) def parse_atom(self) -> Expr: lexeme = self.peek().text if self.match(T_INT): return ExprNum(float(lexeme)) elif self.match(T_FLT): return ExprNum(float(lexeme)) elif self.match(T_STR): return ExprStr(lexeme) elif self.match(T_SYM): return ExprSym(lexeme) elif self.match(T_BOOL): return ExprBool(lexeme == "#t") else: raise RuntimeError("Invalid token: " + lexeme) def parse_list(self) -> Expr: # self.expect(T_LPAR) if self.match(T_RPAR): return ExprNil() elif self.match(T_DOT): if self.match(T_RPAR): raise RuntimeError("Ill-formed dotted list.") elem = self.parse_sexp() self.expect(T_RPAR) return elem elif self.match(T_EOF): raise RuntimeError("Expected right parenthesis") car = self.parse_sexp() cdr = self.parse_list() ast = ExprCons(car, cdr) return ast def sexp_parse(source: str) -> Expr: p = Parser(source) return p.parse_sexp() def sexp_equal(lhs: Expr, rhs: Expr) -> bool: """Determine whether two S-Expressions are equal""" if lhs.type() != rhs.type(): return False elif lhs.is_atom() and rhs.is_atom(): if lhs.is_num(): return lhs.to_num() == rhs.to_num() elif lhs.is_str(): return lhs.to_str() == rhs.to_str() elif lhs.is_sym(): return lhs.to_str() == rhs.to_str() elif lhs.is_bool(): return lhs._value == rhs._value elif lhs.is_nil(): return True else: raise RuntimeError("Error: edge case => code has bugs. ") else: t = sexp_equal(lhs.car(), rhs.car()) if not t: return False k = sexp_equal(lhs.cdr(), rhs.cdr()) return k def sexp_to_str(sexp: Expr) -> Expr: """Pretty printer of S-Expression""" ## import pdb; pdb.set_trace() if sexp.is_str(): return sexp.to_repr() elif sexp.is_atom(): return sexp.to_str() elif sexp.is_pair(): s = sexp text = "(" while s.is_pair(): head = s.car() s = s.cdr() text = text + " " + sexp_to_str(head) if s.is_nil(): text = text + ")" elif s.is_atom(): text = text + " . " + s.to_str() + " )" else: RuntimeError("Edge case => this branch is impossible to happen.") return text else: raise RuntimeError("Not implemented for this Ast element: " + sexp) def test_expr1(): """unit test of SEXP parser.""" src = ' ( 9.214 "world" a . end )' expected = ExprCons(ExprNum(9.214), ExprCons(ExprStr("world"), ExprCons( ExprSym('a'), ExprSym("end"))) ) ast = sexp_parse(src) assert sexp_equal(ast, expected) def test_expr2(): """Test in-meory AST representation of sexp""" src = '( "hello world" (100 a . b) x1 z )' expected = ExprCons( ExprStr("hello world") , ExprCons( ExprCons(ExprNum(100.0), ExprCons(ExprSym("a"), ExprSym("b"))) , ExprCons(ExprSym("x1") , ExprCons(ExprSym("z"), ExprNil()))) ) ast = sexp_parse(src) assert sexp_equal(ast, expected) def test_expr3(): """Test quote quoted expression""" src = ' \'(+ x y z)' lst = ExprCons( ExprSym("+") , ExprCons(ExprSym("x"), ExprCons(ExprSym("y"), ExprCons( ExprSym("z"), ExprNil())))) expected = ExprCons( ExprSym("quote"), ExprCons(lst, ExprNil())) ast = sexp_parse(src) assert sexp_equal(ast, expected) def test_expr4(): """Test quoted expression""" src = ' \'(+ x y z)' expected = "cons('quote, cons(cons('+, cons('x, cons('y, cons('z, nil)))), nil))" ast = sexp_parse(src) assert str(ast) == expected def test_exrp5(): """Test string representation of sexp""" src = ' \'(+ x y z)' s = sexp_parse(src) # Pretty print representation of SEXP p = sexp_to_str(s) expected = '( quote ( + x y z))' assert p == expected def test_expr6(): src = """ (fn myfunction(x y . lst) (print x) (print y) (map print lst) ) """ expected = '( fn myfunction ( x y . lst ) ( print x) ( print y) ( map print lst))' ast = sexp_parse(src) # Pretty print representation of SEXP p = sexp_to_str(ast) assert p == expected
Run unit tests:
$ pytest -q -v -rA sexp_parser1.py ============================================== test session starts ============================================== platform linux -- Python 3.8.10, pytest-7.3.1, pluggy-1.0.0 rootdir: /home/user/formula-parser plugins: anyio-3.5.0, xonsh-0.13.3 collected 6 items sexp_parser1.py ...... [100%] ==================================================== PASSES ===================================================== ============================================ short test summary info ============================================ PASSED sexp_parser1.py::test_expr1 PASSED sexp_parser1.py::test_expr2 PASSED sexp_parser1.py::test_expr3 PASSED sexp_parser1.py::test_expr4 PASSED sexp_parser1.py::test_exrp5 PASSED sexp_parser1.py::test_expr6 =============================================== 6 passed in 0.04s ===============================================
Interact with the parser in the REPL:
$ python3 -i sexp_parser1.py src = """ ( a x->1 "My string literal" (m1 . w ) (1.25 -9.251e3 0.255 nil) z2->w . object:method ) """ ast = sexp_parse(src) >>> ast cons('a, cons('x->1, cons("My string literal", cons(cons('m1, 'w), cons(cons(1.25, cons(-9251.0, cons(0.255, cons('nil, nil)))), cons('z2->w, 'object:method)))))) >>> ast.car() 'a >>> ast.cdr() cons('x->1, cons("My string literal", cons(cons('m1, 'w), cons(cons(1.25, cons(-9251.0, cons(0.255, cons('nil, nil)))), cons('z2->w, 'object:method))))) >>> ast.cdr().cdr() cons("My string literal", cons(cons('m1, 'w), cons(cons(1.25, cons(-9251.0, cons(0.255, cons('nil, nil)))), cons('z2->w, 'object:method)))) >>> ast.cdr().cdr().cdr() cons(cons('m1, 'w), cons(cons(1.25, cons(-9251.0, cons(0.255, cons('nil, nil)))), cons('z2->w, 'object:method))) >>> sexp_to_str(ast) '( a x->1 "My string literal" ( m1 . w ) ( 1.25 -9251.0 0.255 nil) z2->w . object:method )' >>> sexp_to_str(ast.cdr()) '( x->1 "My string literal" ( m1 . w ) ( 1.25 -9251.0 0.255 nil) z2->w . object:method )' >>> sexp_to_str(ast.cdr().cdr()) '( "My string literal" ( m1 . w ) ( 1.25 -9251.0 0.255 nil) z2->w . object:method )' >>> sexp_to_str(ast.cdr().cdr().cdr()) '( ( m1 . w ) ( 1.25 -9251.0 0.255 nil) z2->w . object:method )' >>> sexp_to_str(ast.cdr().cdr().cdr().car()) '( m1 . w )' >>> sexp_to_str(ast.cdr().cdr().cdr().car().car()) 'm1' >>> sexp_to_str(ast.cdr().cdr().cdr().car().cdr()) 'w'
1.11.4 sexp parser algorithm without dotted list
Besides the dotted-pair approach that represents S-expressions as linked lists of cons cells, S-expresions can also be represented by nested lists or arrays. This approach is more suitable for data exchange and serialization. Other advantage is the easier and quickier implementation of a lisp-like embedded scripting languages hosted in a garbage collected statically typed languages such as Java or C#. Unlike the dotted-pair AST - abstract syntax tree, this type of AST cannot represent improper lists.
EBNF for Lisp/Scheme Sexp S-Expression without cons pairs linked list:
sexp := atom | list list := "(" sexp* ")" atom := NUMBER | BOOLEAN | SYMBOL | STRING | KEYWORD
Pascal-like pseudocde for S-expression parser implementation:
// Token Types // //------------------------------------------------// // Delimiters T_LPAR "(" // left parenthesis T_RPAR ")" // right parenthesis T_DOT "." // Dot - found in dotted pair '(a . b) T_QUOTE "'" // Quote // Literal token types (terminal elements of grammar) T_NUM // number T_STR // string literal "something else" T_SYM // symbol 'my-symbol T_NIL // nil => null element and empty list T_BOOL // boolean #t (true) or #f (false) // Sentinel Token Types T_EOF // End Of File T_ERR // Error T_NDF // Non Defined - non initialized token // -------------------------------------------------------// // Those tokens come from the tokenizer which skips blank characters, white spaces and new lines, and // breaks the source code into terminal elements of the grammar. tokens := [ Token(T_LPAR, "(", Token(T_NUM, "-1245.13e3"), Token(T_SYM, "x1") ,Token(T_STR, "hello world"), Token(T_NIL, "nil"), Token(T_BOOL, "#f") ,Token(T_BOOL, "#t"), Token(T_DOT, "."), Token(T_RPAR, ")") ] // ------- Global Variables ==> They should be contained in an object. // Current position of cursors index := 0 // Current token current := Token(T_NDF, "") // Advance to next token FUNCTION advance(): void IF index >= len(tokens) THEN RETURN Token(T_EOF, "") END current := tokens[index] index := index + 1 END // Check whether the current token type has the same type as the argument FUNCTION check(token_type): bool RETURN current.type == token_type END // Returns true if the current token matches the expected type and consume it. FUNCTION match(token_type): bool IF current.type == token_type THEN advance() RETURN true END RETURN false END // If the current token does not match the expected type, raise an error. FUNCTION expect(token_type) IF current.type == token_type advance() ELSE error("Expected token ", token_type, ", but given ", current) END END // Return true if the current token is EOF (End Of File) FUNCTION is_eof(): bool RETURN current.type == EOF END //--------------------------------------------------------// // Algebraic data types in languages like OCaml or Scala; // or class hierarchy in languages like Java Ast is root class -> AstNum(val: int) // number value -> AstStr(val: str) // string value -> AstSym(val: str) // symbol value -> AstBool(val: bool) // #t (true) or #f (false) -> AstNil() // nil -> AstList(val: [Ast]] // List instead of a cons cell -> AstErr(val: str) // Represents parser error or runtime error FUNCTION parse_sexp(): Ast BEGIN IF check(T_LPAR) THEN RETURN parse_list() ELSE IF is_atom(token) THEN RETURN parse_atom() ELSE IF check(QUOTE) THEN sexp := parse_sexp() RETURN AstList([ AStSym("quote"), sexp]) ELSE error("Invalid token") END FUNCTION parse_atom(): Ast BEGIN ast := null // Lexeme is that string that a tokens represents IF check(TOKEN_NUM) THEN ast := AstInt(parse_num(lexeme)) END ELSE IF check(TOKEN_STR) THEN ast := AstStr(lexeme) END ELSE IF check(TOKEN_SYM) THEN ast := AstSym(lexeme) END ELSE IF check(TOKEN_NULL) THEN ast := AstNil() END // If this line runs, it means that there is a bug. ELSE error("parse_atom() => invalid token: ", token_string) END // Advance token advance() RETURN ast END FUNCTION parse_list(): Ast expect(T_LPAR) list: [Ast] = [] WHILE NOT check(T_RPAR) and NOT is_eof() THEN sexp = parse_sexp() ASSERT sexp != error append_to_list(list, sexp) END expect(T_RPAR) RETURN list END //---------- Program entry point - main() --------// // FUNCTION main(): void // Lexical analysis, break text/source into tokens tokens := tokenize(source_code) // Initialize parser advance() // parser AST ast := parse_sexp() // Display result print(" Lisp AST: ", ast) END
Parser algorithm implementation in Python:
File: sexp_parser2.py
from tokenize import Double from typing import Any, Callable, Dict, NamedTuple, Optional, List, Tuple # ============= >>>> Token types <<<<=================== # --- Literals ------------------------------------------------- T_INT = "T_INT" # Integer: 1002 -245 9025 T_FLT = "T_FLT" # Floating point number: -10.2522 9.25221 .0252e3 102.5E3 T_STR = "T_STR" # String literal: "world" T_SYM = "T_SYM" # Symbol: 'hello T_NIL = "T_NIL" # Nil element or end of list T_BOOL = "T_BOOL" # Boolean: #f or #t # --- Delimiters ----------------------------------------------- T_LPAR = "T_LPAR" # left parenthesis T_RPAR = "T_RPAR" # right parenthesis T_DOT = "T_DOT" # dot '.' T_QUOTE = "T_QUOTE" # quote (') # --- Sentinel tokens --------------------------------------- T_NDF = "T_NDF" # Non defined, non initialized sentinel token T_EOF = "T_EOF" # End of file sentinel token class Token(NamedTuple): type: str # Token type text: str # Token text, also known as lexeme pos: int # Token position col: int # Token column - starts with 1 lin: int # Token line - starts with 1 TOKEN_EOF = Token(T_EOF, "", 0, 0, 0) NULLCHAR = '\0' class Tokenizer: def __init__(self, source: str) -> None: # Source code to be tokenized self._source = source # Current position of cursor self._pos = 0 # Current line of cursor self._lin = 1 # Current column of cursor self._col = 1 # Position of first token character self._token_pos = 0 # Column of first token character self._token_col = 1 # Line of first token chracter self._token_lin = 1 # Current character self._chr = '' # Current token self.advance() def advance(self): if self._pos >= len(self._source): self._chr = '\0' return self._chr = self._source[self._pos] self._col += 1 if self._chr == '\n': self._lin += 1 self._col = 1 self._pos += 1 def is_eof(self): """Return true if character is end of input""" return self._chr == NULLCHAR def has_next(self) -> bool: """Return true if tokenizer has next token""" return not self.is_eof() def is_space(self): """Return true if current character is space or blank character. """ return self._chr.isspace() def is_digit(self): """Return true if current character is digit.""" return self._chr.isdigit() def is_delimiter(self): return self._chr in ["(", ")", ".", "'"] def check(self, chr: str): return self._chr == chr def peek(self): """Return current character""" return self._chr def lookahead(self, n: int = 1): """Returns next character in the stream without consuming it""" if self._pos + n >= len(self._source): return NULLCHAR return self._source[self._pos + n - 1] def expect(self, chr: str): if self._chr == chr: self.advance() return raise RuntimeError(f"Expected character {chr}, but given {self._chr}.") def match(self, chr: str): if self._chr == chr: self.advance() return True return False def next(self) -> Token: """Return next token""" if self.is_eof(): return self._token(T_EOF, "") # Skip whitespace self._skip_whitespace() self.set_token_position() if self.match("("): return self._token(T_LPAR, "(") elif self.match(")"): return self._token(T_RPAR, ")") elif self.match("."): return self._token(T_DOT, ".") elif self.match("'"): return self._token(T_QUOTE, "'") elif self.check('-') and self.lookahead().isdigit(): return self._next_number() elif self.is_digit(): return self._next_number() elif self.check("\""): return self._next_string() else: return self._next_symbol_or_bool() def tokens(self) -> List[Token]: lst = [] while not self.is_eof(): t = self.next() if t.type == T_EOF: break lst.append(t) return lst def set_token_position(self): """Store position of beginning of token until this function is called again.""" self._token_pos = self._pos - 1 self._token_col = self._col - 1 self._token_lin = self._lin def _skip_whitespace(self): while self.is_space() and not self.is_eof(): self.advance() def _token(self, type: str, lexeme: str) -> Token: return Token(type, lexeme ,self._token_pos ,self._token_col ,self._token_lin) def _next_symbol_or_bool(self) -> Token: lexeme = "" self.set_token_position() while not self.is_eof() and not self.is_space() and not self.is_delimiter(): lexeme += self.peek() self.advance() if lexeme == "": return self._token(T_EOF, "") if lexeme == "#t" or lexeme == "#f": return self._token(T_BOOL, lexeme) return self._token(T_SYM, lexeme) def _next_number(self) -> Token: lexeme = "" self.set_token_position() # In grammars where '-' negative sign is not part of literal # remove the next line if self.match('-'): lexeme += "-" if not self.check('.'): lexeme += self._next_integer() if not self.check('.') and not self.check('e') and not self.check('E'): if self.peek().isalpha(): raise RuntimeError("Error: invalid number, cannot end with letter") # Identified integer type return self._token(T_INT, lexeme) if self.match('.'): lexeme += "." + self._next_integer() letter = self.peek() if self.match('e') or self.match('E'): lexeme += letter if not self.match('-') and not self.is_digit(): raise RuntimeError("Error: invalid floating point number: " + lexeme) if self.match('-'): lexeme += "-" if not self.is_digit(): raise RuntimeError("Error: invalid floating point number. Expected digit, but got: " + lexeme) lexeme += self._next_integer() if self.peek().isalpha(): raise RuntimeError("Error: invalid number, it cannot end with letter") # Return a lexeme (string representation) of a floating point number literal return self._token(T_FLT, lexeme) def _next_integer(self) -> str: assert self.is_digit() lexeme = "" while self.is_digit() and not self.is_eof(): lexeme += self.peek() self.advance() return lexeme def _next_string(self) -> Token: self.expect('"') lexeme = "" self.set_token_position() while not self.is_eof() and not self.check('"'): if self.match('\\'): if self.match('n'): lexeme += "\n"; continue if self.match('t'): lexeme += "\t"; continue if self.match('v'): lexeme += "\v"; continue if self.match('"'): lexeme += "\""; continue if self.match('\\'): lexeme += "\\"; continue raise RuntimeError("Tokenizer error, invalid escape character: '" + self._ch + "'") lexeme += self.peek() self.advance() self.expect('"') return self._token(T_STR, lexeme) EXPR_LIST = "EXPR_LIST" EXPR_NIL = "EXPR_NIL" EXPR_BOOL = "EXPR_BOOL" EXPR_STR = "EXPR_STR" EXPR_SYM = "EXPR_SYM" EXPR_NUM = "EXPR_NUM" EXPR_ERR = "EXPR_ERR" class Expr: """Reprsents a S-Expression, Lisp symbolic expression (but not linked list)""" # Get Representation showing object type, for instance Sym[x] if x is a symbol # Num[10], Str["my string"] def to_repr(self) -> str: return "<NOT IMPLEMENTED>" # Get S-Expression representation def to_sexp(self) -> str: return "<NOT IMPLEMENTED>" def type(self) -> str: return "<NOT IMPLEMENTED>" def to_num(self) -> float: return 0.0 def to_bool(self) -> bool: return True # Attempt to convert AST node to string def to_str(self) -> str: raise RuntimeError("Not implemented.") # Return nth element of expr list def nth(self, n: int) -> 'Expr': raise RuntimeError("Not valid for this type") # Return nodes of a ExprList, but this method is only valid for this type def nodes(self) -> List['Expr']: raise RuntimeError("Method only valid for ExprList") def is_atom(self) -> bool: return False def is_list(self) -> bool: return False def is_nil(self) -> bool: return False def is_num(self) -> bool: return False def is_sym(self) -> bool: return False def is_bool(self) -> bool: return False def is_str(self) -> bool: return False def is_err(sefl) -> bool: return False def car(self) -> 'Expr': raise RuntimeError("Not implemented") def cdr(self) -> 'Expr': raise RuntimeError("Not implemented") def __repr__(self) -> str: return self.to_repr() class ExprNil(Expr): """This AST node represents an empty list or null element.""" def is_atom(self) -> bool: return True def is_nil(self) -> bool: return True def to_repr(self) -> str: return "nil" def to_sexp(self) -> str: return "nil" def type(self) -> str: return EXPR_NIL def to_bool(self) -> bool: return False class ExprList(Expr): """AST node represents a list.""" def type(self) -> str: return EXPR_LIST def is_list(self) -> bool: return True def __init__(self, nodes: List[Expr]) -> None: self._nodes = nodes def car(self) -> 'Expr': return self._car def cdr(self) -> 'Expr': return self._cdr def nth(self, n: int) -> 'Expr': return self._nodes[n] def nodes(self) -> List[Expr]: return self._nodes def to_repr(self) -> str: nodes = ", ".join([ n.to_repr() for n in self._nodes ]) return f"List[ {nodes} ]" def to_sexp(self) -> str: nodes = " ".join([ n.to_sexp() for n in self._nodes ]) return f"({nodes})" def type(self) -> str: return EXPR_LIST class ExprSym(Expr): """AST node represents a lisp symbol.""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_sym(self) -> bool: return True def to_repr(self): return f"Sym[{self._value}]" def to_sexp(self): return self._value def to_str(self) -> str: return self._value def type(self) -> str: return EXPR_SYM class ExprNum(Expr): """AST node represents a lisp number.""" def __init__(self, value: int): self._value = value def is_atom(self) -> bool: return True def is_sym(self) -> bool: return True def to_repr(self): return f"Num[{self._value}]" def to_sexp(self): return str(self._value) def is_num(self) -> bool: return True def to_num(self) -> float: return self._value def to_str(self) -> str: return str(self._value) def type(self) -> str: return EXPR_NUM class ExprStr(Expr): """AST node represents a lisp symbol.""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_str(self) -> bool: return True def to_repr(self): return 'Str["' + self._value + '"]' def to_sexp(self): return '"' + self._value + '"' def to_str(self) -> str: return str(self._value) def type(self) -> str: return EXPR_STR class ExprBool(Expr): """AST node represents a lisp boolean value.""" def __init__(self, value: bool): self._value = value def is_atom(self) -> bool: return True def is_bool(self) -> bool: return True def to_repr(self): return "Bool[" + ("#t" if self._value else "#f") + "]" def to_sexp(self): return "#t" if self._value else "#f" def type(self) -> str: return EXPR_BOOL def to_bool(self) -> bool: return self._value def to_str(self) -> str: return "#t" if self._value else "#f" class ExprErr(Expr): """AST node represents a runtime error value (not an AST value).""" def __init__(self, value: str): self._value = value def is_atom(self) -> bool: return True def is_err(sefl) -> bool: return True def to_repr(self): return "err => " + self._value def to_sexp(self): return self._value def type(self) -> str: return EXPR_ERR def to_str(self) -> str: return self._value class Parser: def __init__(self, source: str) -> None: self._source = source tok = Tokenizer(source) self._tokens = tok.tokens() self._pos = 0 self._current = Token(T_NDF, "", 0, 0, 0) self.advance() def peek(self): return self._current def lookahead(self, n: int = 1): """Returns next character in the stream without consuming it""" if self._pos + n >= len(self._tokens): return Token(T_EOF, 0, 0, 0, 0) return self._tokens[self._pos + n - 1] def advance(self): """Advance to next token""" if self._pos >= len(self._tokens): self._current = Token(T_EOF, 0, 0, 0, 0) return self._current = self._tokens[self._pos] self._pos += 1 def match(self, type: str): """Returns true if the current token matches the expected type and consume it.""" if self._current.type == type: self.advance() return True return False def check(self, type: str): """Returns true if the current token matches the expected type and consume it.""" return self._current.type == type def expect(self, type: str): """Consume current token if it matches the expected type. Otherwise raise an error.""" if self._current.type == type: self.advance() return raise RuntimeError(f"Expected token of type {type}, but given {self._current}.") def is_eof(self): """Return true if current token is T_EOF""" return self._current.type == T_EOF # ------- End of reusable parser code ---------------_# def is_atom(self): """Return true if current token is atom""" return self.peek().type in [ T_BOOL, T_FLT, T_INT, T_SYM, T_NIL, T_STR ] def parse_sexp(self) -> Expr: if self.is_atom(): return self.parse_atom() elif self.check(T_LPAR): return self.parse_list() # 'x is parsed as(quote x) # '(+ x 10) is parsed as (quote (+ x 10)) elif self.match(T_QUOTE): q = self.parse_sexp() return ExprList([ExprSym("quote"), q]) else: raise RuntimeError("Invalid token: " + str(self.peek())) def parse_atom(self) -> Expr: lexeme = self.peek().text if self.match(T_INT): return ExprNum(float(lexeme)) elif self.match(T_FLT): return ExprNum(float(lexeme)) elif self.match(T_STR): return ExprStr(lexeme) elif self.match(T_SYM): if lexeme == "nil": return ExprNil() return ExprSym(lexeme) elif self.match(T_BOOL): return ExprBool(lexeme == "#t") else: raise RuntimeError("Invalid token: " + lexeme) def parse_list(self) -> Expr: self.expect(T_LPAR) nodes: List[Expr] = [] while not self.check(T_RPAR) and not self.is_eof(): sexp = self.parse_sexp() nodes.append(sexp) self.expect(T_RPAR) ast = ExprList(nodes) return ast def sexp_parse(source: str) -> Expr: """Parse S-Expression.""" p = Parser(source) ast = p.parse_sexp() return ast def sexp_equal(lhs: Expr, rhs: Expr) -> bool: """Compare two S-expressions returning true if they are equal. This function is useful for unit testing. """ if rhs.type() != lhs.type(): return False elif rhs.is_atom() and lhs.is_atom(): if rhs.is_num(): return rhs.to_num() == lhs.to_num() elif rhs.is_bool(): return rhs.to_bool() == lhs.to_bool() elif rhs.is_sym(): return rhs.to_sexp() == lhs.to_sexp() elif rhs.is_str(): return rhs.to_sexp() == lhs.to_sexp() elif rhs.is_nil(): return True elif rhs.is_list(): a = lhs.nodes() b = rhs.nodes() if len(a) != len(b): return False for n in range(0, len(a)): if not sexp_equal(a[n], b[n]): return False return True else: raise RuntimeError("Error - reached edge case.") def sexp_eval(sexp: Expr) -> Expr: """Non-turing complete S-Expression evaluator. It can only evaluate simple expressions without branching. """ # import pdb; pdb.set_trace() if sexp.is_num(): return sexp elif sexp.is_str(): return sexp elif sexp.is_nil(): return sexp elif sexp.is_bool(): return sexp elif sexp.is_err(): return sexp # Function or special-form evaluation elif sexp.is_list(): # import pdb; pdb.set_trace() if len(sexp.nodes()) == 0: return ExprNil() nodes = sexp.nodes() head = nodes[0] args = nodes[1:] if not head.is_sym(): return ExprErr(f"Error: {func} not callable.") # Obtain function name func = head.to_sexp() # Special form quote if func == "quote": if len(args) != 1: return ExprErr("Error: ill-formed quote special form") return args[0] # Sepcial form if: (if <cond> <then_form> <else_form>) if func == "if": if len(args) > 4 or len(args) < 1: return ExprErr("Syntax error - Ill-formed special form if") cond = args[0] cond_eval = sexp_eval(cond) if cond_eval.to_bool(): # evaluate the then-form res = sexp_eval(args[1]) return res if len(args) == 3: # evaluate the else-form res = sexp_eval(args[2]) return res return ExprNil() # Evaluate all arguments before function application evaluated_args = [] for a in args: res = sexp_eval(a) # print(" [TRACE] res = ", res) # Abort computation if the evaluation of any argument is an error if res.is_err(): return res evaluated_args.append(res) result = ExprNil() if func == "list": result = ExprList(evaluated_args) elif func == "car": if len(evaluated_args) != 1: return ExprErr("Function car() expects 1 argument of type list or nil") lst: Expr = evaluated_args[0] if lst.is_nil(): return ExprNil() if not lst.is_list(): return Expr("Function car() expects argument of type list.") n = lst.nodes() if len(n) == 0: return ExprNil() return n[0] elif func == "cdr": if len(evaluated_args) != 1: return ExprErr("Function cdr() expects 1 argument of type list or nil") lst: Expr = evaluated_args[0] if lst.is_nil(): return ExprNil() if not lst.is_list(): return ExprErr("Function cdr() expects argument of type list.") n = lst.nodes() if len(n) == 0: return ExprNil() return ExprList(n[1:]) elif func == "eval": if len(evaluated_args) != 1: return ExprErr("Function eval()) expects 1 argument.") expr = evaluated_args[0] result = sexp_eval(expr) elif func == "read": if len(evaluated_args) != 1: return ExprErr("Function eval()) expects 1 argument.") src = evaluated_args[0].to_str() ast = sexp_parse(src) return ast elif func == "+": result = fold_list(lambda x, acc: x + acc, evaluated_args) elif func == "-": result = fold_list(lambda x, acc: x - acc, evaluated_args) elif func == "*": result = fold_list(lambda x, acc: x * acc, evaluated_args) elif func == "/": result = fold_list(lambda x, acc: x / acc, evaluated_args) elif func == ">": if len(evaluated_args) != 2: return ExprErr("Error - function (>) expects two arguments") if evaluated_args[0].type() != EXPR_NUM: return ExprErr("Error - first argument should be a number") if evaluated_args[1].type() != EXPR_NUM: return ExprErr("Error - first argument should be a number") a = evaluated_args[0].to_num() b = evaluated_args[1].to_num() result = ExprBool(a > b) elif func == "<": if len(evaluated_args) != 2: return ExprErr("Error - function (>) expects two arguments") if evaluated_args[0].type() != EXPR_NUM: return ExprErr("Error - first argument should be a number") if evaluated_args[1].type() != EXPR_NUM: return ExprErr("Error - first argument should be a number") a = evaluated_args[0].to_num() b = evaluated_args[1].to_num() result = ExprBool(a < b) else: return ExprErr(f"Error unbound function '{func}' ") return result else: return ExprErr(f"Evaluator not implemented for {sexp}") #def fold_list(func: Callable[[Double, Double], Double], args: List[Expr]) -> Expr: def fold_list(func, args: List[Expr]) -> Expr: #print(" [TRACE] args = ", args) if len(args) < 1: return ExprErr("Error: expected at least 1 argument") if not args[0].is_num(): return ExprErr("Error: expected numeric argument") acc = args[0].to_num() for a in args[1:]: if a.is_err(): return a if not a.is_num(): return ExprErr("Expected numeric argument, but given: ", a) value = a.to_num() try: acc = func(acc, value) except ZeroDivisionError as ex: return ExprErr(str(ex)) result = ExprNum(acc) return result def repl(): show_ast = False while True: line = input(" SEXP> ").strip() if line == "": continue elif line == ":show_ast": show_ast = True continue elif line == ":hide_ast": show_ast = False continue ast = sexp_parse(line) if show_ast: print(" [TRACE] ast = ", ast) print(" [TRACE] ast = ", ast.to_sexp()) result = sexp_eval(ast) print(" := ", result.to_sexp()) # ----------- Unit Testing --------------------------------------------------# def test_sexp_parser(): """Test S-Expression parser""" src = "(list 'expr1 (* 3 5 6) 'expr2 (+ 1 2 (* 3 4)) )" expr1 = ExprList([ ExprSym("quote"), ExprSym("expr1")] ) expr2 = ExprList([ ExprSym("quote"), ExprSym("expr2")] ) lst1 = ExprList([ ExprSym("*"), ExprNum(3.0), ExprNum(5.0), ExprNum(6.0) ]) lst2 = ExprList([ ExprSym("+"), ExprNum(1), ExprNum(2), ExprList([ ExprSym("*"), ExprNum(3.0), ExprNum(4.0)])]) expected = ExprList([ ExprSym("list"), expr1, lst1, expr2, lst2 ]) ast = sexp_parse(src) assert sexp_equal(ast, expected) def test_sexp_evaluator1(): """Test S-expression interpreter""" src = "(list 'expr1 (* 3 5 6) 'expr2 (+ 1 2 (* 3 4)) )" expected = ExprList([ExprSym("expr1"), ExprNum(90.0), ExprSym("expr2"), ExprNum(15.0)]) ast = sexp_parse(src) output = sexp_eval(ast) assert sexp_equal(output, expected) def test_sexp_evaluator2(): """Test S-expression interpreter""" src = " (read \n \"(+ 1 2 3 4)\" )" expected = ExprList([ExprSym("+"), ExprNum(1.0), ExprNum(2.0), ExprNum(3.0), ExprNum(4.0)]) ast = sexp_parse(src) output = sexp_eval(ast) assert sexp_equal(output, expected) #-------------------------- End of Unit Testing ------------------------------# src = """ (fn myfunc(x y z) "Compute the distance from origin to point x, y, z" (sqrt (+ (* x x) (* y y) (* z z)) ) ) """ ast = sexp_parse(src)
Run the unit testing:
$ pytest -q -v -rA sexp_parser2.py =============================== test session starts ================================ platform linux -- Python 3.8.10, pytest-7.3.1, pluggy-1.0.0 rootdir: /home/user/formula-parser collected 3 items sexp_parser2.py ... [100%] ====================================== PASSES ====================================== ============================= short test summary info ============================== PASSED sexp_parser2.py::test_sexp_parser PASSED sexp_parser2.py::test_sexp_evaluator1 PASSED sexp_parser2.py::test_sexp_evaluator2 ================================ 3 passed in 0.04s =================================
Test the parser in a interactive way:
>>> ast List[ Sym[fn], Sym[myfunc], List[ Sym[x], Sym[y], Sym[z] ] , Str["Compute the distance from origin to point x, y, z"] , List[ Sym[sqrt], List[ Sym[+], List[ Sym[*], Sym[x], Sym[x] ] , List[ Sym[*], Sym[y], Sym[y] ], List[ Sym[*], Sym[z], Sym[z] ] ] ] ] >>> type(ast) <class '__main__.ExprList'> >>> ast.is_num() False >>> ast.is_list() True >>> ast.is_bool() False >>> ast.is_sym() False >>> ast.nth(0) Sym[fn] >>> ast.nth(1) Sym[myfunc] >>> ast.nth(2) List[ Sym[x], Sym[y], Sym[z] ] >>> ast.nth(2).to_sexp() '(x y z)' >>> ast.nth(2).nth(0) Sym[x] >>> ast.nth(2).nth(1) Sym[y] >>> ast.nth(2).nth(2) Sym[z] >>> ast.nth(3) Str["Compute the distance from origin to point x, y, z"] >>> ast.nth(4) List[ Sym[sqrt], List[ Sym[+], List[ Sym[*], Sym[x], Sym[x] ], List[ Sym[*], Sym[y], Sym[y] ], List[ Sym[*], Sym[z], Sym[z] ] ] ] >>> ast.nth(4).to_sexp() '(sqrt (+ (* x x) (* y y) (* z z)))'
Test the REPL (Read-Print-Eval-Loop). In order to keep, the code as short as possible, a turing-complete Lisp-evaluator was not included. As a result, the evaluator can only deal with simple arithmetic expressions.
>>> repl() SEXP> (list 'x (* 3 5 1) 'y (+ (* 3 5) (/ 100 5) 10) '(+ 100 x (* z w)) ) := (x 15.0 y 45.0 (+ 100.0 x (* z w))) SEXP> SEXP> (read "(+ (* 3 5) (+ 1 2 3 ))") := (+ (* 3.0 5.0) (+ 1.0 2.0 3.0)) SEXP> (eval (read "(+ (* 3 5) (+ 1 2 3 ))")) := 21.0 SEXP> (/ 10 0) := float division by zero SEXP> (if (> 3 15) (+ 2 5) (/ 10 0)) := float division by zero SEXP> (if (> 20 15) (+ 2 5) (/ 10 0)) := 7.0 SEXP> (if (> 3 15) (+ 2 5) (/ 10 0)) := float division by zero SEXP> (+ 100 (if (> 20 15) (+ 2 5) (/ 10 0))) := 107.0 SEXP> (+ 100 (if (> 10 15) (+ 2 5) (/ 10 0))) := float division by zero
1.11.5 Interpreter for Kotlin - Java Platform
There ara many lisp-like programming languages for the Java Platform and JVM, including Clojure, ABCL common lisp and Kawa Scheme. All those implementations compile to JVM bytecodes, instructions of Java Virtual Machine at runtime, but they don't work on Android, that uses Dalvik virtual machine, a different virtual machine than JVM.
This code is an incomplete Scheme-like embedded tree walking interpreter implemented in Kotlin language, that is able to compile to both JVM and Android Dalvik virtual machine. As a result, the code can be embedded in Java desktop application or android apps for faster experimentation and iteration since the interpreter allows instanting and calling Java objects at runtime without any time wasted on compilation.
File: sexpr.kt (Almost: 2130 lines of code)
// Embedded lisp-like interpreter hosted on Kotlin/JVM (Java Virtual Machine) // // Compile this program with: // $ kotlinc sexpr.kt -include-runtime -d sexpr.ja // // Run this sample program with: // $ java -jar sexpr.jar // $ rlrwrap java -jar sexpr.jar // // ---------------------------------------------------------------------------- typealias Func = (List<Ast>) -> Ast enum class AstType { INT, FLT, STR, SYM, KEY, OBJ, LST, ENV, BOOL, NIL , VOID, ERR, FUNC_NATIVE, FUNC_UDF } // Lisp Ast - Abstract Syntax Tree and Runtime values // This is equivalent to an AGDT - Algebraic Data Type sealed class Ast { open fun type(): AstType { throw Exception("Not implemented for this type of AST node.") } open fun toInt(): Int { return 0 } open fun toFlt(): Double { return 0.0 } open fun toBool(): Boolean { throw Exception("Not valid for this type of AST node.") } open fun toStr(): String { throw Exception("Not valid for this type of AST node.") } open fun toObj(): Object { throw Exception("Not valid for this type of AST node.") } open fun nth(n: Int): Ast { throw Exception("Not valid for this type of AST node. ") } open fun nodes(): List<Ast> { throw Exception("Not valid for this type of AST node. ") } open fun size(): Int { throw Exception("Not valid for this type of AST node.")} open fun head(): Ast { throw Exception("Not valid for this type of AST node. ")} open fun tail(): List<Ast> { throw Exception("Not valid for this type of AST node. ")} open fun isAtom(): Boolean { return false } // Return true if object is callable, in other words, the // object is a primtive function (defined in the host language) // or a UDF (User-Defined-Function) defined in the hosted language open fun isCallable(): Boolean { return false } // Return true if AST node is a runtime error open fun isErr(): Boolean { return false } // Return true is AST node is a symbol open fun isSym(): Boolean { return false } // Return true is AST node is a string literal open fun isStr(): Boolean { return false } // Return true if AST node is a keyword literal open fun isKey(): Boolean { return false } // Return true if AST node is a list open fun isList(): Boolean { return false } // Return true if AST node is a integer or floating point open fun isNum(): Boolean { return false} // Return true if AST node is a nil (null) object open fun isNil(): Boolean { return false } // Return true if AST node is void open fun isVoid(): Boolean { return false } // If true, then the runtime value represents a wrapped Java Object open fun isObj(): Boolean { return false } // Anything that is not false or nil is regarded as true // in if-else special forms (if <cond> <then> <else>) open fun asBool(): Boolean { return true } // Null-element ASt - emtpy AST for initialializing variables of type AST object AstVoid: Ast() { override fun isVoid(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun type(): AstType { return AstType.VOID } } // Lisp integer literal data class AstInt(val value: Int): Ast(){ override fun toInt(): Int { return value } override fun toFlt(): Double { return value.toDouble() } override fun isNum(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun type(): AstType { return AstType.INT } } // Lisp floating point literal data class AstFlt(val value: Double): Ast(){ override fun toFlt(): Double { return value } override fun isNum(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun toStr(): String { return value.toString() } override fun type(): AstType { return AstType.FLT } } // Lisp string literal data class AstStr(val value: String): Ast(){ override fun toStr(): String { return value } override fun isAtom(): Boolean { return true } override fun isStr(): Boolean { return true } override fun type(): AstType { return AstType.STR } } // Lisp symbol literal => 'hello-world, string->num data class AstSym(val value: String): Ast() { override fun toStr(): String { return value } override fun isSym(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun type(): AstType { return AstType.SYM } } // Lisp keyworkd literal, example => :width, :size data class AstKey(val value: String): Ast() { override fun toStr(): String { return value } override fun isAtom(): Boolean { return true } override fun isKey(): Boolean { return true } override fun type(): AstType { return AstType.KEY } } // Lisp boolean value data class AstBool(val value: Boolean): Ast(){ override fun toBool(): Boolean { return value } override fun asBool(): Boolean { return value } override fun isAtom(): Boolean { return true } override fun toStr(): String { return if(value) "#t" else "#f" } override fun type(): AstType { return AstType.BOOL } } // Ast - indicates null or empty list object AstNil: Ast() { override fun asBool(): Boolean { return false } override fun isNil(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun toStr(): String { return "nil"} override fun type(): AstType { return AstType.NIL } } // Lisp list data class AstLst(val value: List<Ast>): Ast() { override fun nth(n: Int): Ast { return value[n] } override fun size(): Int { return value.size } override fun nodes(): List<Ast> { return value } override fun head(): Ast { return value[0] } override fun tail(): List<Ast> { return value.takeLast( value.size - 1 )} override fun isList(): Boolean { return true } override fun type(): AstType { return AstType.LST } } // ------ Intepreter Runtime Values ---------------------------// // They don't represent AST nodes. They are the result of AST evaluation // by the interpreter. // Runtime error data class AstErr(val value: String): Ast() { override fun toStr(): String { return value } override fun isErr(): Boolean { return true } override fun type(): AstType { return AstType.ERR } } // Lisp environment contains bindings between variables and // runtime values class AstEnv(val parent: AstEnv? = null) : Ast() { public val env: MutableMap<String, Ast> = mutableMapOf<String, Ast>() public var name: String = "" override fun type(): AstType { return AstType.ENV } override fun isAtom(): Boolean { return true } fun has(name: String): Boolean { if( env.contains(name) ){ return true} if( parent == null ){ return false } val r = parent!!.has(name) return r } fun set(name: String, value: Ast){ env.put(name, value) } // Set only an existing binding fun setExisting(name: String, value: Ast): Boolean { if( env.contains(name) ){ env.put(name, value) return true } else if( parent != null){ val b = parent!!.setExisting(name, value) return b } return false } // Remove all elements from current environment fun reset(){ env.clear() } fun get(name: String): Ast? { // print(" [TRACE] (109) AstEnv.get() => name = ${name} => ${this}") if(env.contains(name)) { return env!!.get(name) } else if(parent == null) { return AstErr("Error: (line 113) unbound variable ${name}") } val r = parent!!.get(name) // println(" [TRACE] (116) AstEnv.get() => ${name} = ${r}") return r } } data class AstFunc(val name: String, val func: Func): Ast() { override fun isCallable(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun type(): AstType { return AstType.FUNC_NATIVE } } // Represents a user-defined function data class AstFuncUDF( val name: String , val args: List<String> , val body: List<Ast> , var env: AstEnv ): Ast() { override fun isCallable(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun type(): AstType { return AstType.FUNC_UDF } } // This AST node represents a Java object data class AstObj(val value: Object): Ast() { override fun isObj(): Boolean { return true } override fun isAtom(): Boolean { return true } override fun toObj(): Object { return value } override fun type(): AstType { return AstType.OBJ } } } enum class Type { T_Flt // Floating point number literal , T_Int // Integer , T_Str // String literal , T_Sym // Symbol literal ` , T_Key // keyword literal , T_Nil // Nil literal , T_Bool // Boolean literal , T_Quote // (') , T_Lpar // '(' Left parenthesis , T_Rpar // ')' Right parenthesis , T_EOF // Type EOF token , T_ERR // Token error , T_EMPTY } // Strings that matches the terminal element of a grammar data class Token( val type: Type , val text: String , val pos: Int , val lin: Int , val col: Int ) { fun isErr(): Boolean { return type == Type.T_ERR } fun isEof(): Boolean { return type == Type.T_EOF } } class Tokenizer(val source: String) { // Null character constant (sentinel value) val NULL_CHAR = 0.toChar() val QUOTE_CHAR = 34.toChar() // '"' breaks the syntax highlight // Initialize current character with NullChar var ch: Char = NULL_CHAR // Current position var pos: Int = 0 // Current line var lin: Int = 1 // Current column var col: Int = 1 private val TOKEN_EMPTY = Token(Type.T_EMPTY, "", 0, 0, 0) init { advance() } fun peek(): Char { return ch } fun lookahed(n: Int): Char { if (pos + n >= source.length){ return NULL_CHAR } return source[pos + n - 1] } fun isEof(): Boolean { return ch == NULL_CHAR } fun match(c: Char): Boolean { if(c == ch){ advance() return true } return false } fun check(c: Char): Boolean { return c == ch } fun expect(c: Char): Token { if(c == ch){ advance() return TOKEN_EMPTY } // throw Exception("Lexical error. Invalid character. Expected '${c}', but given '${ch}' ") return error("Lexical error. Invalid character. Expected '${c}', but given '${ch}' ") } fun advance() { if (pos >= source.length) { ch = 0.toChar() } else { ch = source[pos] pos = pos + 1 } } fun isSpace(): Boolean { return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n' } fun isDelimiter(): Boolean { return ch == ')' || ch == '(' || ch == '\'' || ch == '`' || ch == ',' || ch == '@' } fun isDigit(): Boolean { return ch.isDigit() } fun token(type: Type, lexeme: String): Token { val tok = Token(type, lexeme, pos, lin, col) return tok } // Returns an error sentinel token. It is an alternative to exceptions. fun error(message: String): Token { val tok = token(Type.T_ERR, message) return tok } fun tokens(): List<Token> { val tokens = mutableListOf<Token>() while( !isEof() ){ val t = next() if(t.type == Type.T_EOF){ break } tokens.add(t) } return tokens } fun next(): Token { // Return sentinel token when at end of input if (isEof()) { return token(Type.T_EOF, "") } while (isSpace() && !isEof()) { advance() } assert( !isSpace() ) // Skip comment if( match(';')){ while( !match('\n') && !this.isEof() ){ advance() } return next() } if ( match('(') ) return token(Type.T_Lpar, "(") if ( match(')') ) return token(Type.T_Rpar, ")") if ( match('\'') ) return token(Type.T_Quote, "'") if ( check('-') && lookahed(1).isDigit() ){ return nextNumber() } if ( isDigit() ) { return nextNumber() } if ( check(QUOTE_CHAR) ) { return nextString() } return nextSymbolOrBool() // throw Exception("edge case") } private fun nextSymbolOrBool(): Token { var lexeme = "" while( !isEof() && !isSpace() && !isDelimiter() ) { lexeme += peek() advance() } if( lexeme == ""){ return token(Type.T_EOF, "") } if( lexeme == "#t" || lexeme == "#f"){ return token(Type.T_Bool, lexeme)} if( lexeme.startsWith(":") ){ return token(Type.T_Key, lexeme.drop(1)) } return token(Type.T_Sym, lexeme) } private fun nextString(): Token { // println(" [TRACE] peek() = " + peek()) var err = expect(QUOTE_CHAR) if(err.isErr()){ return err} var lexeme = "" while( !isEof() && !check(QUOTE_CHAR)) { if( match('\\') ){ if( match('n') ){ lexeme += "\n"; continue } if( match('t') ){ lexeme += "\t"; continue } // if( match('v') ){ lexeme += "\v"; continue } if( match('\\') ){ lexeme += "\\"; continue } if( match(QUOTE_CHAR) ){ lexeme += "\""; continue } //throw Exception("Tokenizer error, invalid escape character.") return error("Tokenizer error, invalid escape character.") } lexeme += peek() advance() } // println(" [TRACE] lexeme = '${lexeme}' ") err = expect(QUOTE_CHAR) if(err.isErr()){ return err} return token(Type.T_Str, lexeme) } private fun nextNumber(): Token { var lexeme = "" if( match('-')){ lexeme += "-" } if(!check('.')){ lexeme += nextInteger() } if(!check('.') && !check('e') && !check('E') ) { if( peek().isLetter() ){ // throw Exception("Invalid number. Numbers cannot terminate with letter.") return error("Invalid number. Numbers cannot terminate with letter.") } return token(Type.T_Int, lexeme) } if ( match('.') ) { lexeme += '.' + nextInteger() } val letter = peek() if( match('e') or match('E')) { lexeme += letter if( !match('-') && !isDigit()){ // throw Exception("Invalid floating point number: " + lexeme) return error("Invalid floating point number: " + lexeme) } if( match('-') ){ lexeme += "-" } if( !isDigit() ){ // throw Exception("Invalid floating point. Expected digit, but got: " + lexeme) return error("Invalid floating point. Expected digit, but got: " + lexeme) } lexeme += nextInteger() } if( peek().isLetter() ){ // throw Exception("Invalid number. It cannot end with letter.") return error("Invalid number. It cannot end with letter.") } return token(Type.T_Flt, lexeme) } private fun nextInteger(): String { var lexeme = "" while( this.peek().isDigit() && !this.isEof() ) { lexeme += peek() advance() } return lexeme } } // ----- End of class Tokenizer() -------------- // class Parser(val source: String) { // List of tokens to be processed val tokens = Tokenizer(source).tokens() // Initialize current token with a null element var current: Token = Token(Type.T_EMPTY, "", 0, 0, 0) // Position of current tokens var pos = 0 // Initialize parser state init { advance() } fun peek(): Token { return current } fun advance() { if( pos >= tokens.size ){ current = Token(Type.T_EOF, "", 0, 0, 0) return } current = tokens[pos] pos = pos + 1 } fun match(type: Type): Boolean { if( current.type == type ){ advance() return true } return false } fun check(type: Type): Boolean { return current.type == type } fun expect(type: Type): Ast { if( current.type == type ) { advance(); return Ast.AstVoid } return Ast.AstErr("Parser Error: expected token " + type + ", but given " + current) } private fun isEof(): Boolean { return current.type == Type.T_EOF } private fun isAtom(): Boolean { return peek().type == Type.T_Bool || peek().type == Type.T_Flt || peek().type == Type.T_Int || peek().type == Type.T_Sym || peek().type == Type.T_Str || peek().type == Type.T_Key } fun parseSexp(): Ast { if ( isAtom() ) { return parseAtom() } else if( check(Type.T_Lpar) ){ return parseList() } else if( match(Type.T_Quote) ){ val s = parseSexp() val lst = listOf<Ast>( Ast.AstSym("quote"), s ) val r = Ast.AstLst(lst) return r } else if( check(Type.T_ERR) ){ return Ast.AstErr(" [LEXICAL ERROR] " + current ) } return Ast.AstErr("Parser Error: unexpected invalid token: " + peek() ) } private fun parseAtom(): Ast { val lexeme = peek().text if ( match(Type.T_Int) ){ return Ast.AstInt( lexeme.toInt() ) } else if( match(Type.T_Flt) ){ return Ast.AstFlt( lexeme.toDouble() ) } else if( match(Type.T_Str) ){ return Ast.AstStr(lexeme) } else if( match(Type.T_Bool) ){ return Ast.AstBool(lexeme == "#t") } else if( match(Type.T_Key) ){ return Ast.AstKey(lexeme)} else if( match(Type.T_Sym) ){ if( lexeme == "nil" ){ return Ast.AstNil } return Ast.AstSym(lexeme) } else { return Ast.AstErr("Parser Errror: Invalid token: ${peek()}") } } private fun parseList(): Ast { var err = expect(Type.T_Lpar) if(err.isErr()){ return err} val nodes = mutableListOf<Ast>() while( !check(Type.T_Rpar) && !isEof() ) { val sexp = parseSexp() nodes.add(sexp) } err = expect(Type.T_Rpar) if(err.isErr()){ return err} val ast = Ast.AstLst(nodes) return ast } } // --- End of class Parser() ---------// object Sexp { // Get S-Expression lisp-like string representation of AST fun toStr(r: Ast): String = when(r) { // Emtpy AST => Null is Ast.AstVoid -> "<void>" // Boolean value is Ast.AstBool -> if(r.value) "#t" else "#f" // Floating point literal is Ast.AstFlt -> r.value.toString() // Integer point literal is Ast.AstInt -> r.value.toString() // Symbol literal is Ast.AstSym -> r.value // Keyword literal is Ast.AstKey -> ":" + r.value // Nul - null value is Ast.AstNil -> "nil" // String is Ast.AstStr -> "\"" + r.value + "\"" // List is Ast.AstLst -> "(" + r.value.map { toStr(it) }.joinToString( separator = " ") + ")" // Runtime error is Ast.AstErr -> "EROR: " + r.value // Environment - contains bindings between variables and values is Ast.AstEnv -> "<Env: ${r.env}>" // Primitive function is Ast.AstFunc -> "<FuncPRM: ${r.name}>" // User-defined function is Ast.AstFuncUDF -> "<FuncUDF: ${r.name}>" // Java Object is Ast.AstObj -> { // val klass = r.value::class.java.getName() // "<Object: ${klass}>" "<Obj: ${r.value}>" } //else -> "Not implemented" } fun parse(source: String): Ast { val p = Parser(source) val ast = p.parseSexp() return ast } // Checks whether two S-Expressions are equal fun equal(lhs: Ast, rhs: Ast): Boolean { if( lhs.type() != rhs.type() ){ return false } else if ( rhs.isAtom() && lhs.isAtom() ) { val t = rhs.type() if ( t == AstType.INT ){ return lhs.toInt() == rhs.toInt() } else if ( t == AstType.FLT ){ return lhs.toFlt() == rhs.toFlt() } else if ( t == AstType.STR ){ return lhs.toStr() == rhs.toStr() } else if ( t == AstType.SYM ){ return lhs.toStr() == rhs.toStr() } else if ( t == AstType.KEY ){ return lhs.toStr() == rhs.toStr() } else if ( t == AstType.ERR ){ return lhs.toStr() == rhs.toStr() } else if ( t == AstType.BOOL ){ return lhs.toBool() == rhs.toBool() } else if ( t == AstType.NIL ){ return true } else if ( t == AstType.VOID ){ return true } else { throw Exception("Edge case reached not implemented for => lhs = ${lhs} and rhs ${rhs}") } } else if( rhs.type() == AstType.LST ) { val a = lhs.nodes() val b = rhs.nodes() if( a.size != b.size){ return false } for(n in 0..(a.size - 1)) { if( !equal(a[n], b[n]) ){ return false } } return true } else { throw Exception("Reached edge case => This exception is never supposed to be thrown.") } } } class Evaluator { var logOnError = false // Default environment val _env = Ast.AstEnv() // Special forms val _forms = mutableMapOf<String, (List<Ast>, Ast.AstEnv) -> Ast>() init { _env.name = "global" reset() } fun reset() { this._env.reset() // ---- Register primitive special forms --------------// // Sepcial form for updating an already defined variable addSpecialForm("set", { args, env -> this.form_set(args, env) } ) // Special form for defining a new variable addSpecialForm("def", { args, env -> this.form_def(args, env )} ) // If-else special form addSpecialForm("if", { args, env -> this.form_if(args, env) } ) // Function defintion or lambda special form addSpecialForm("fn", { args, env -> this.form_fn(args, env) } ) // Quote (quote (args ...)) sepcial form addSpecialForm("quote", { args, env -> this.form_quote(args, env) } ) // Do special-form borrowed from Clojure: (do (println "x") (println "z") (+ x y z)) addSpecialForm("do", { args, env -> this.form_do(args, env)} ) // Call object default constructor (new javax.swing.Swing.JFrame) addSpecialForm("new", { args, env -> this.form_new(args, env) } ) // Clojure-like doto special form addSpecialForm("doto", { args, env -> this.form_doto(args, env)} ) // // ---- Register primitive functions ------------------ // /// addFuncMathMany("+", { a, b -> a + b }) addPrimitive("+", ::primitive_add) addPrimitive("-", ::primitive_sub) addPrimitive("*", ::primitive_mul) addPrimitive("/", ::primitive_div) addPrimitive("eval", ::primitive_eval) addPrimitive("load-script", ::primitive_loadScript) addPrimitive("read", ::primitive_read) addPrimitive("list", { args -> Ast.AstLst(args) }) addPrimitive("car", ::primitive_car ) addPrimitive("cdr", ::primitive_cdr ) addPrimitive("cons", ::primitive_cons ) addPrimitive("first", ::primitive_car ) addPrimitive("rest", ::primitive_cdr ) addPrimitive("nth", ::primitive_nth ) // Apply a function over a list addPrimitive("apply", ::primitive_apply) // Map a function over list addPrimitive("map", ::primitive_map ) // Get element of a plist (property list) that matches a keyword argument // (getf :x (list :z 200 :h 300 :x "hello")) => returns "hello" addPrimitive("getf", ::primitive_getf) // Concatenate strings addPrimitive("concat", ::primitive_concat) // addPrimitive("for-each", ::primitive_foreach ) addPrimitive("println", ::primitive_println ) addMath1Arg("m/sin", { java.lang.Math.sin(it) } ) addMath1Arg("m/cos", { java.lang.Math.cos(it) } ) addMath1Arg("m/tan", { java.lang.Math.tan(it) } ) addMath1Arg("m/exp", { java.lang.Math.exp(it) } ) addMath1Arg("m/sqrt", { java.lang.Math.sqrt(it) } ) addMath1Arg("m/log", { java.lang.Math.log(it) } ) addMath1Arg("m/log10", { java.lang.Math.log10(it) } ) // Show all avaialabe methods of a Java object AstObj addPrimitive("ref/show-methods", ::primitive_showMethods ) // Get a java class meta-object by its name as string addPrimitive("ref/class-by-name", ::primitive_getClassByName ) // Get class meta-object of a java object addPrimitive("ref/get-obj-class", ::primitive_getObjClass ) //Cast a java object to an interface by reflection // (ref/cast java.awt.Container object-here-Ast.AstObj) addSpecialForm("ref/cast", { args, env -> this.form_cast(args, env) } ) addSpecialForm("ref/static-field", { args, env -> this.form_staticField(args, env) } ) addPrimitive("action-listener", ::primitive_toActionListener) addPrimitive("java-null", ::primitive_javaNull) // PI constant set("m/PI", Ast.AstFlt( 3.141592653589793)) // Variable that references this environment object set("env", _env) } // Eval AST (Abstract Syntax Tree) by walking over it // and generate a interpreter runtime value fun eval(s: Ast, env: Ast.AstEnv): Ast = when(s){ is Ast.AstErr -> s is Ast.AstVoid -> s is Ast.AstBool -> s is Ast.AstFlt -> s is Ast.AstInt -> s is Ast.AstKey -> s is Ast.AstNil -> s is Ast.AstStr -> s is Ast.AstSym -> { // println(" [TRACE] envName = ${env.name} ; Evaluator.get =>> ${s} \n") env.get(s.toStr()) ?: Ast.AstErr("Error: (line 549) unbound symbol ${s.toStr()} in ${env.name}") } is Ast.AstEnv -> s is Ast.AstFunc -> s is Ast.AstFuncUDF -> s is Ast.AstObj -> s is Ast.AstLst -> evalApp(env, s) } fun eval(source: String): Ast { val ast = Sexp.parse(source) val res = eval(ast, _env) return res } fun eval(ast: Ast): Ast { val res = eval(ast, _env) return res } fun loadScript(fileName: String): Ast { var source: String? = null try { source = java.io.File(fileName).inputStream().readBytes().toString(Charsets.UTF_8) } catch(ex: java.io.FileNotFoundException) { return error(ex.toString()) } source = "(do\n" + source + "\n)" val ast = Sexp.parse(source) if( ast.isErr() ){ return ast } val res = this.eval(ast, _env) return res } // Get value from environment fun get(name: String): Ast? { return _env.get(name) } // Set value in environment fun set(name: String, value: Ast) { _env.set(name, value) } // Interactive Read-Print-Eval-Loop fun repl() { val sc = java.util.Scanner(System.`in`) var count = 0 while(true){ System.out.print(" In[${count}]> ") System.out.flush() // val line = sc.nextLine() var line = readLine() ?: "" if(line == ""){ continue } if(line == ".h"){ println(""" The interpreter has the following commands: + .h or .help => To display this help menu. + .q or .quit => To quit the interpreter. + .multi => To paste multi-line S-Expression in the repl + .reset or .r => To reset the interpreter state + .vars => To display all variables + .funcs => To show all functions + .verbose-on => Enable verbose mode + .verbose-off => Disable verbose mode """) continue } if(line == ".verbose-on"){ println(" [INFO] Verbose mode enabled.") this.logOnError = true continue } if(line == ".verbose-off"){ println(" [INFO] Verbose mode disabled.") this.logOnError = true continue } if(line == ".quit" || line == ".q"){ System.exit(0); break } if(line == ".reset" || line == ".r"){ this.reset() println(" [INFO] Reset repl. All variables removed. Ok.") continue } if(line == ".multi"){ println(" [INFO] Type a multi-line expression and then type (;;) when you are done.") var entry = "" while(true){ val lin = readLine() ?: "" if(lin == ";;"){ break } entry = entry + lin } println(" [INFO] Exit multi-line mode. ") line = entry } // Display all defined variables in current environment if(line == ".vars") { println(" =>> Variables: ") _env.env.forEach{ (k, v) -> if( !v.isCallable()){ println(" " + k) } } continue } // Display all defined functions in current environment if(line == ".funcs") { println(" =>> Functions: ") _env.env.forEach{ (k, v) -> if( v.isCallable()){ println(" " + k) } } continue } val result = this.eval(line) if( result.isErr() ){ println("Error: " + result.toStr()) continue } // History variables $0, $1, $2, ... // containing the evaluation results of // previous inputs entered by the user val label = "\$${count}" println(" ${label} = " + Sexp.toStr(result)) count += 1 // Variable answer always contains the result // of the last computation this.set("ans", result) this.set(label, result) } } // Evaluate application of function or special form // (function-or-form arg0 arg1 ... argN-1) private fun evalApp(env: Ast.AstEnv, lst: Ast): Ast { if(lst.size() == 0){ return Ast.AstNil } val head = lst.head() val args = lst.tail() if( head is Ast.AstSym ){ val x = head.toStr() if(x == "comment") { return Ast.AstVoid } if( _forms.contains(x) ){ val handler = _forms!!.get(x) // Apply Special form val res = handler!!.invoke(args, env) return res } } // Check whether argument is function val obj = eval(head, env) // Abort computation on error if( obj.isErr() ){ return obj } // Check whether object is a Java object (instance of a Java class) if( obj.isObj() ){ // Perform method call val r = invokeObjectMethod(obj as Ast.AstObj, args, env) return r } // Abort computation if object is not callable if( !obj.isCallable() ){ return error("Object not callable ${head}") } val res = funCall(env, obj, args) return res } // Handle function call and evalute all arguments before function application. // funCall(environment, function, list-of-arguments) private fun funCall(env: Ast.AstEnv, obj: Ast, args: List<Ast>): Ast { val evaluatedArgs = mutableListOf<Ast>() // Evaluate arguments and abort evaluation on error for(arg in args){ val r = this.eval(arg, env) if( r.isVoid() ){ continue } if( r.isErr() ){ return r} evaluatedArgs.add(r) } // ---- Evaluation of primtive function call in host language -------- // // Object is primtive function (defined in host language) if(obj is Ast.AstFunc){ // println(" [TACE] Call primitive function ${obj}") val func = obj as Ast.AstFunc val result = func.func(evaluatedArgs) return result } // ------_ Evaluation of user-defined function call -----------// // Object is UDF - User-Defined Function val func = obj as Ast.AstFuncUDF val nargs = func.args.size if(nargs != args.size ){ return error("Invalid number of arguments passed to function. It expects $nargs arguments. ") } // Create temporary environment for function evaluation val tenv = Ast.AstEnv(func.env) // Set arguments in the temporary environment for(k in (0..(nargs - 1))){ val argName = func.args[k] tenv.set(argName, evaluatedArgs[k]) } var last: Ast = Ast.AstNil // Evaluate function body and return evaluation of last form for(form in func.body) { last = this.eval(form, tenv) // abort evaluation on error if(last.isErr()){ return last } } return last } private fun error(message: String): Ast { if(this.logOnError){ System.err.println(" [ERROR] " + message) } return Ast.AstErr(message) } // Add primitive function fun addPrimitive(name: String, func: (List<Ast>) -> Ast) { _env.set(name, Ast.AstFunc(name, func)) } // Register primitive math function of 1 argument fun addMath1Arg(name: String, func: (Double) -> Double) { val func = { args: List<Ast> -> if(args.size != 1){ error("Function ${name} expects 1 argument. ") } else if( !args[0].isNum() ) { error("Expected numeric argument") } else { val a = args[0].toFlt() val r = func(a) Ast.AstFlt( r ) } } addPrimitive(name, func) } // Add special form (macro or control structure) fun addSpecialForm(name: String, form: (List<Ast>, Ast.AstEnv) -> Ast) { _forms.put(name, form) } // ----- Functions and special forms called by the interpreter --------// // Add math function of many arguments that is evaluated as: // (func a b c d e f g h ...) = // a func b func c func d func e ... // func(a, func(b, func(c, func(e, ...)))) // // For instance: '+' // (+ 10 2 5 6) = 10 + 2 + 5 + 6 private fun addFuncMathMany(name: String, func: (Double, Double) -> Double) { val value = Ast.AstFunc(name, { args -> fold1(func, args) } ) _env.set(name, value) } private fun fold1(func: (Double, Double) -> Double, args: List<Ast>): Ast { if(args.size == 0){ return Ast.AstErr("Error: function expects at leat 1 argument.") } val head = args[0] if( !(head is Ast.AstInt || head is Ast.AstFlt) ){ return error("Error: function requires numeric argument, given ${head}.") } var acc = head.toFlt() for(k in 1..(args.size - 1)){ val a = args[k] if( !(a is Ast.AstInt || a is Ast.AstFlt) ){ return error("Error: function requires numeric argument, given ${a}.") } val x = a.toFlt() acc = func(acc, x) } return Ast.AstFlt(acc) } // Read S-expression from string private fun primitive_read(args: List<Ast>): Ast { if(args.size != 1){ return error("Error: function read() requires 1 (one) agument of type string.") } if( !(args[0] is Ast.AstStr)){ return error("Function read() expects 1 argument of type string.") } val code = args[0].toStr() val ast = Sexp.parse(code) return ast } private fun primitive_eval(args: List<Ast>): Ast { if(args.size != 1){ return error("Error: function eval() requires 1 (one) agument of type string.") } val result = this.eval(args[0], _env) return result } // Get list head private fun primitive_car(args: List<Ast>): Ast { if(args.size != 1){ return error("Error: function 'car' requires at least 1 (one) agument.") } if( !args[0].isList() ){ return error("Function car expects 'list' argument. ") } val lst = args[0] as Ast.AstLst if( lst.isNil() || lst.size() == 0){ return Ast.AstNil } // The rendudant variable is 'r' is useful for debugging // since debuggers work in a line by line manner. val r = lst.head() return r } // Get list tail private fun primitive_cdr(args: List<Ast>): Ast { if(args.size != 1){ return error("Error: function 'car' requires at least 1 (one) agument.") } if( !args[0].isList() ){ return error("Function car expects list argument. ") } val lst = args[0] as Ast.AstLst if( lst.isNil() || lst.size() == 0){ return Ast.AstNil } // The rendudant variable is 'r' is useful for debugging // since debuggers work in a line by line manner. val r = Ast.AstLst(lst.tail()) return r } private fun primitive_cons(args: List<Ast>): Ast { if(args.size != 2) { return error("Error: function 'cons' requires at least 2 agument.") } if( !args[0].isAtom() ) { return error("Function cons() expects first argument be an atom.") } if( !args[1].isList() ) { return error("Function cons() expects second argument be a list or nil.") } val result = listOf(args[0]).plus( args[1].nodes() ) return Ast.AstLst(result) } // Get nth element of a list // (nth 0 lst) => Get element of list 'lst' private fun primitive_nth(args: List<Ast>): Ast { if(args.size != 2){ return error("Error: function 'nth' requires at least 2 agument.") } if( !(args[0] is Ast.AstInt) ){ return error("Function nth expects number as 1st argument. ") } if( !args[1].isList() ){ return error("Function nth expects list as 2nd argument. ") } val n = args[0].toInt() val lst = args[1] as Ast.AstLst if( lst.isNil() || lst.size() == 0){ return Ast.AstNil } if(n >= lst.size()){ return error("Error: out of bounds access.") } val r = lst.nth(n) return r } // Apply => Apply a function to a list of arguments // (apply + (list 1 2 3 5 6)) // ( (fn (x y) (+ x y)) (list 3 5)) => 7 private fun primitive_apply(args: List<Ast>): Ast { if(args.size != 2){ return error("Error: function apply() requires at least 2 agument.") } if( !args[0].isCallable() ){ return error("Function apply() expects a function as 1st argument. ") } if( !(args[1] is Ast.AstLst) ){ return error("Function apply() expects a list as 2nd argument. ") } val res = funCall(_env, args[0], args[1].nodes()) return res } // Example: (+ 1 2 5 100.252) fun primitive_add(args: List<Ast>): Ast { if(args.size == 0){ return error("Function + (add) expects at least 1 argument. No argument was given.")} var result: Ast = Ast.AstInt(0) for(x in args