Attila's Blog

About writing your own programming language in C, Go and Swift

The interpreter framework

Aug 6, 2018

Categories: aspl

Before we start, I’d like to say a few words about the environment I’m gonna work with. This project is developed on macOS, with CMake as the build system and CLion as my IDE. Of course, this does not mean the source code cannot be ported to other platforms; I’m just saying that I’m not going to do it for you. ☺

The language is standard C11, and there are no external dependencies other than the C standard library.

With that said, you should be able to follow along on Linux, BSD or even Windows as well, but be prepared to reconfigure CMake when needed.

Setting up the interpreter framework

No more small talk, let’s write some code! First we need some boilerplate to set up the basic structure of our application. After that, we can start working on lexing the ASPL language.

I’m not going to show and explain every single line of code, just the important details. Everything else can be easily decoded from the source code which I will provide at the bottom of every post that deals with actual programming tasks.

Let’s start with defining some common constants we will use throughout the application.

//
// Created by Attila Haluska on 8/1/18.
//

#ifndef ASPL_C_COMMON_H
#define ASPL_C_COMMON_H

#define CANNOT_READ_FILE (55)
#define INVALID_ARGUMENTS (60)

#endif //ASPL_C_COMMON_H

The first constant, CANNOT_READ_FILE, is the exit code we return when reading a source file fails.

The other one, INVALID_ARGUMENTS, indicates that the interpreter was started with an invalid number of arguments.

Now, let’s see the entry point of our application.

//
// Created by Attila Haluska on 8/1/18.
//

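#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#include "common.h"

#define MAX_LINE_LENGTH (1024)

// Forward declarations for the helpers shown later in this post.
static void signal_handler(int signum);
static void run_file(const char* file_path);
static void run_repl(void);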

int main(int argc, const char* argv[]) {
    signal(SIGINT, &signal_handler);
    signal(SIGTERM, &signal_handler);

    if (argc > 2) {
        fprintf(stderr, "Usage: aspl [script]\n");
        exit(INVALID_ARGUMENTS);
    } else if (argc == 2) {
        run_file(argv[1]);
    } else {
        run_repl();
    }

    return EXIT_SUCCESS;
}

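The constant MAX_LINE_LENGTH is the size of the buffer that holds one line of keyboard input. Since fgets reserves one byte for the string terminator, a single read stores at most MAX_LINE_LENGTH - 1 characters. It would be nicer to allow arbitrary length, but it is unlikely that anyone would need longer lines, and a fixed buffer is easier.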

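The main function is quite simple. We install signal handlers so the interpreter can shut down gracefully when the user presses Ctrl+C (SIGINT) or the process is asked to terminate (SIGTERM). After that, we check the arguments: with one argument we run it as a source file, with none we start the REPL, and with more than one we print a usage message and exit.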

The signal handler looks like this:

static void signal_handler(int signum) {
    (void) signum; // unused; renamed so it no longer shadows signal()
    fprintf(stdout, "Bye\n");
    exit(EXIT_SUCCESS);
}

It is simple at the moment: all it does is print a message and exit the process.
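One caveat worth knowing: the C standard only allows a handful of library functions (abort, _Exit, quick_exit and signal) to be called from a handler for an asynchronous signal, and fprintf and exit are not among them. A more conservative pattern is to let the handler set a flag and have the main loop check it. The sketch below illustrates that pattern; the names stop_requested, request_stop, should_stop and install_stop_handlers are hypothetical and not part of the ASPL sources.

```c
#include <signal.h>
#include <stdbool.h>

// Set from the signal handler, read from the main loop.
// volatile sig_atomic_t is the object type standard C guarantees
// can be written safely from a handler.
static volatile sig_atomic_t stop_requested = 0;

// Hypothetical handler: it only records that a stop was requested.
static void request_stop(int signum) {
    (void) signum; // unused
    stop_requested = 1;
}

// Hypothetical helper the REPL loop could poll between lines.
static bool should_stop(void) {
    return stop_requested != 0;
}

// Hypothetical setup helper mirroring the two signal() calls in main.
static void install_stop_handlers(void) {
    signal(SIGINT, request_stop);
    signal(SIGTERM, request_stop);
}
```

The REPL would then print its goodbye message and return normally once should_stop() reports true. Whether a blocked fgets call returns early when the signal arrives depends on the platform, so with this approach the loop may only notice the request after the next line of input.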

The file reading mechanism is a bit trickier in C.

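static size_t file_size(FILE* file) {
    // Seek to the end and read the offset; ftell returns -1L on failure.
    long size = (fseek(file, 0L, SEEK_END) == 0) ? ftell(file) : -1L;

    if (size < 0) {
        fprintf(stderr, "Could not determine file size\n");
        exit(CANNOT_READ_FILE);
    }

    rewind(file);
    return (size_t) size;
}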

static char* read_file(const char* file_path) {
    FILE* file = fopen(file_path, "rb");

    if (file == NULL) {
        fprintf(stderr, "Could not read file \"%s\"\n", file_path);
        exit(CANNOT_READ_FILE);
    }

    size_t size = file_size(file);

    char* buffer = (char*) malloc(size + 1);

    if (buffer == NULL) {
        fprintf(stderr, "Could not read file \"%s\"\n", file_path);
        exit(CANNOT_READ_FILE);
    }

    size_t bytes_read = fread(buffer, sizeof(char), size, file);

    if (bytes_read != size) {
        fprintf(stderr, "Could not read file \"%s\"\n", file_path);
        exit(CANNOT_READ_FILE);
    }

    buffer[bytes_read] = '\0';

    fclose(file);

    return buffer;
}

static void run_file(const char* file_path) {
    const char* source = read_file(file_path);

    // TODO Implement running aspl source files

    free((void*) source);
}

First we need a way to tell the size of the file, because we need to know the size of the buffer when we read the file into memory. This can be done by seeking to the end of the file and reading the current position in the stream. This is what file_size does.

Next, in read_file we read the file into the memory buffer. The only thing to be aware of is that we have to allocate one extra byte for the string terminator character (\0), and then actually append it to the end of the string with buffer[bytes_read] = '\0';.
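The seek-and-tell approach needs a seekable file, so it will not work if the input is a pipe. As an alternative, here is a minimal sketch that reads a stream in chunks and grows the buffer as needed, always keeping one byte free for the terminator. The function name read_all and the initial capacity of 4096 are my own choices for illustration, not part of the ASPL sources.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: reads everything from `stream` into a
// NUL-terminated heap buffer. Returns NULL on allocation or read failure.
// The caller owns the returned buffer and must free() it.
static char* read_all(FILE* stream, size_t* out_length) {
    size_t capacity = 4096;
    size_t length = 0;
    char* buffer = malloc(capacity);

    if (buffer == NULL) {
        return NULL;
    }

    for (;;) {
        // Always keep one byte in reserve for the terminator.
        size_t wanted = capacity - length - 1;
        size_t got = fread(buffer + length, 1, wanted, stream);
        length += got;

        if (got < wanted) {
            break; // short read: end of file or error
        }

        // The buffer is full: double it and keep reading.
        char* grown = realloc(buffer, capacity * 2);
        if (grown == NULL) {
            free(buffer);
            return NULL;
        }
        buffer = grown;
        capacity *= 2;
    }

    if (ferror(stream)) {
        free(buffer);
        return NULL;
    }

    buffer[length] = '\0';
    if (out_length != NULL) {
        *out_length = length;
    }
    return buffer;
}
```

Unlike read_file, this sketch reports failure by returning NULL instead of exiting, which leaves the choice of exit code to the caller.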

The function that actually executes our source file (run_file) will be implemented later.

And lastly, the implementation of our REPL.

static void run_repl(void) {
    char line[MAX_LINE_LENGTH];
    memset(line, 0, MAX_LINE_LENGTH);

    fprintf(stdout, "\nASPL REPL\n\nType 'Ctrl+C', 'quit' or 'q' to exit.\n\naspl> ");

    while (true) {
        if (fgets(line, MAX_LINE_LENGTH, stdin) == NULL) {
            fprintf(stdout, "\n");
            break;
        }

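        // Strip the line ending and compare whole lines; a prefix check
        // would also treat input such as "qux" as a quit command.
        line[strcspn(line, "\r\n")] = '\0';

        if (strcmp(line, "quit") != 0 && strcmp(line, "q") != 0) {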
            // TODO Implement interpreting source lines
            fprintf(stdout, "aspl> ");
        } else {
            fprintf(stdout, "Bye\n");
            break;
        }
    }
}

This will keep reading lines from the keyboard until the user enters quit or q, presses Ctrl+C, or the input ends (for example, Ctrl+D on a Unix terminal).
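Because the buffer is limited to MAX_LINE_LENGTH bytes, a longer line comes back from fgets in several pieces, and each piece would be handled as if it were a separate line. The sketch below shows one way to detect that case and discard the rest of the line. The helper read_line and its truncated flag are hypothetical, not part of the ASPL sources.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: reads one line into `line` (capacity bytes) and
// strips the trailing newline. Returns false at end of input.
// If the line did not fit, the remainder is discarded and *truncated
// is set to true so the caller can report it.
static bool read_line(FILE* input, char* line, size_t capacity, bool* truncated) {
    *truncated = false;

    if (fgets(line, (int) capacity, input) == NULL) {
        return false;
    }

    size_t length = strlen(line);

    if (length > 0 && line[length - 1] == '\n') {
        line[length - 1] = '\0'; // complete line
        return true;
    }

    // No newline stored: either the line was too long or the input
    // ended without one. Skip ahead to the next newline or EOF.
    int c;
    while ((c = fgetc(input)) != '\n' && c != EOF) {
        *truncated = true;
    }

    return true;
}
```

In the REPL, a call such as read_line(stdin, line, MAX_LINE_LENGTH, &truncated) could replace the fgets call, with a warning printed whenever truncated is set.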

How to build from sources using CMake

Here is a little help with compiling the sources:

First, you will need to install CMake, Ninja and ccache. Ninja and ccache are optional, but CMake is configured to use them, so if you don’t want them you will have to reconfigure CMake.

To do that, open <project folder>/CMakeLists.txt, and disable or delete the following lines:

set(CMAKE_MAKE_PROGRAM /usr/local/bin/ninja)
set(MAKECOMMAND /usr/local/bin/ninja -k 100)
set(CMAKE_C_COMPILER /usr/local/opt/ccache/libexec/clang)
set(CMAKE_CXX_COMPILER /usr/local/opt/ccache/libexec/clang++)

This disables both Ninja and ccache.

On macOS, the easiest way to install the dependencies is Homebrew. Install it first if you don’t have it yet.

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

After that, install all the dependencies I mentioned.

brew install cmake ninja ccache

If everything goes well, you can compile the sources.

cd <project folder>/build
bash-3.2$ cmake -G Ninja ..
-- The C compiler identification is AppleClang 10.0.0.10001040
-- Check for working C compiler: /Applications/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/attilahaluska/Downloads/x/aspl-c/build
bash-3.2$ ninja
[5/5] Linking C executable ../bin/aspl
bash-3.2$ 

Now you will have the binary in <project folder>/bin. If not, please let me know in the comment section so I can help you fix it.

Grab the source code of the current state of the project if you like.

Next time, we implement lexing.

Stay tuned. I’ll be back.
