Writing a custom linter in Go

Go, Gin, Linter, React, React Router, MongoDB

tl;dr

  • Use the go/ast package to traverse the Abstract Syntax Tree of a given file
  • Traverse the file system to collect the results for all files in a directory
  • Define a validator data structure to add and modify linting rules
  • Store the results in MongoDB and display them on a web page built with React and React Router

When the time came to work on my bachelor's thesis, I decided to implement a linter for source code written in Go, primarily because I wanted to learn a new language and had always wondered whether I could write a linting application myself.

I want to explain some of the concepts I have picked up while working on the project.

Linting using an Abstract Syntax Tree (AST)

Abstract Syntax Trees decompose your source code and represent it in a tree form. This way, you can easily traverse the resulting tree, enabling you to collect the information you are interested in.

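In Go, the standard library provides the go/ast package, which already implements many of the types and functions you would typically need for representing and traversing such a tree. Building the tree from source code is handled by the companion package go/parser.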
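
To make this concrete, here is a minimal sketch of parsing a source string with go/parser and walking the resulting tree with ast.Inspect. The sample source and the countIfStatements helper are hypothetical names chosen for illustration; they are not part of the linter itself.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// exampleSource is a made-up input used only for this sketch.
const exampleSource = `package demo

func check(a, b int) bool {
	if a > 0 && b > 0 {
		return true
	}
	return false
}
`

// countIfStatements parses src and returns the number of if statements in it.
func countIfStatements(src string) (int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return 0, err
	}

	count := 0
	ast.Inspect(file, func(n ast.Node) bool {
		if _, ok := n.(*ast.IfStmt); ok {
			count++
		}
		return true // keep descending into child nodes
	})
	return count, nil
}

func main() {
	count, err := countIfStatements(exampleSource)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("if statements:", count)
}
```

ast.Inspect visits every node in depth-first order, so a type assertion inside the callback is usually enough to pick out the node kinds a rule cares about.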

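For one of my use cases, I needed to measure how complex a condition is by counting the parts it is composed of. I then used that count to suggest decomposing the condition into smaller pieces if it exceeds a limit.

The following code shows how a conditional expression is evaluated recursively. Strictly speaking, it counts binary expressions (one per operator such as && or ==) rather than individual operands, which works just as well as a measure of complexity:

// evaluateConditionalExpression returns the number of binary expressions
// contained in expr, looking through any parentheses.
func (validator DecomposeConditionsValidator) evaluateConditionalExpression(expr ast.Expr) int {
    switch expression := expr.(type) {
    case *ast.BinaryExpr:
        // count this operator, then recurse into both sides
        return 1 + validator.evaluateConditionalExpression(expression.X) + validator.evaluateConditionalExpression(expression.Y)
    case *ast.ParenExpr:
        // parentheses add nothing themselves; look inside them
        return validator.evaluateConditionalExpression(expression.X)
    default:
        // identifiers, literals, calls, unary expressions, etc. end the recursion
        return 0
    }
}

Every binary and parenthesized expression is traversed recursively, and the total count is available once the recursion has reached the leaves of the tree.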
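
To see what the recursion yields for concrete conditions, the sketch below re-implements the method above as a free function and feeds it expressions parsed with parser.ParseExpr. The names countBinaryExprs and countInCondition are hypothetical, and the sample conditions are made up.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// countBinaryExprs mirrors the recursive method above as a free function.
func countBinaryExprs(expr ast.Expr) int {
	switch e := expr.(type) {
	case *ast.BinaryExpr:
		return 1 + countBinaryExprs(e.X) + countBinaryExprs(e.Y)
	case *ast.ParenExpr:
		return countBinaryExprs(e.X)
	default:
		return 0
	}
}

// countInCondition parses a single Go expression and counts its binary expressions.
func countInCondition(condition string) (int, error) {
	expr, err := parser.ParseExpr(condition)
	if err != nil {
		return 0, err
	}
	return countBinaryExprs(expr), nil
}

func main() {
	conditions := []string{
		"ready",
		"a && b",
		"a && (b || c) && d",
		"x > 0 && y > 0",
	}
	for _, condition := range conditions {
		n, err := countInCondition(condition)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%q -> %d\n", condition, n)
	}
}
```

Note that comparison operators are binary expressions too, so "x > 0 && y > 0" counts as 3, not 1. Whether that is desirable depends on how the limit for the rule is chosen.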

Traversing the file system

Linting applications usually process directories recursively, starting from a root directory. For each file, an AST is constructed and then passed as input to the rules (validators) the linter's author has defined.

Most programming languages offer packages that abstract the file system and provide methods to interact with it. I was surprised to see how neatly Go supports recursive traversal with path/filepath.

Here's a simplified view of how I used recursive file traversal:

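err = filepath.Walk(linter.BaseDir, func(path string, info os.FileInfo, err error) error {
  if err != nil {
    return err // abort the walk if the path could not be read
  }
  if info.IsDir() {
    return nil // nothing to lint, keep walking
  }

  // construct the AST for the file
  // initialize a data structure to store the results

  // for each rule
  //    check the AST
  //    append to the results

  // add the results to a linting report structure
  return nil
})

// check err, then return the linting report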
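
For a version that actually runs, here is a minimal sketch of the same idea: walk a directory with filepath.Walk, skip everything that is not a Go file, and parse the rest. The helpers parseGoFiles and writeDemoTree are hypothetical and exist only for this example; the linter itself would run its validators where the comment indicates.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"os"
	"path/filepath"
	"strings"
)

// parseGoFiles walks baseDir recursively, parses every .go file it finds,
// and returns the paths of the parsed files.
func parseGoFiles(baseDir string) ([]string, error) {
	fset := token.NewFileSet()
	var parsed []string

	err := filepath.Walk(baseDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err // abort the walk if a path cannot be read
		}
		if info.IsDir() || !strings.HasSuffix(path, ".go") {
			return nil // skip directories and non-Go files
		}
		if _, err := parser.ParseFile(fset, path, nil, parser.ParseComments); err != nil {
			return fmt.Errorf("parsing %s: %w", path, err)
		}
		// A real linter would hand the AST to its validators here.
		parsed = append(parsed, path)
		return nil
	})
	return parsed, err
}

// writeDemoTree creates a temporary directory with two Go files and one
// non-Go file. The returned function removes the directory again.
func writeDemoTree() (string, func(), error) {
	dir, err := os.MkdirTemp("", "lint-demo")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.RemoveAll(dir) }

	files := map[string]string{
		"main.go":                       "package main\n\nfunc main() {}\n",
		filepath.Join("pkg", "util.go"): "package pkg\n",
		"README.md":                     "not Go source\n",
	}
	for name, content := range files {
		full := filepath.Join(dir, name)
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			cleanup()
			return "", nil, err
		}
		if err := os.WriteFile(full, []byte(content), 0o644); err != nil {
			cleanup()
			return "", nil, err
		}
	}
	return dir, cleanup, nil
}

func main() {
	dir, cleanup, err := writeDemoTree()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()

	files, err := parseGoFiles(dir)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println("parsed Go files:", len(files))
}
```

Returning a non-nil error from the callback stops the walk, so this sketch gives up on the first file that fails to parse. A linter might prefer to record the failure in its report and continue instead.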

Defining the rules (validators)

We have now established how a linter traverses the file system and how source code is represented by an abstract syntax tree. Another crucial part of the system is the implementation of the validators (or whatever you want to call them).

For example, let's say you wanted to check the length of each variable name to suggest more expressive names when the length is only 1 (let's discard valid reasons you might use such short names - e.g., Point.X or Point.Y coordinates). You would then express this rule as functions/methods/classes (you choose), usually defined in separate files. As an example, you can check how ESLint organizes its rules.

For the use case above, you could, for example, represent "fail when the variable name length is below a limit" as follows:

for _, elem := range statement.Lhs {
    switch identifier := elem.(type) {
    case *ast.Ident:
        if len(identifier.Name) < validator.Rule.Limit {
            result = stats.Failed
            identifierNames = append(identifierNames, identifier.Name)
        }
    }
}

The basic idea is to build your project with separate validators where each of them represents a rule as source code that can be executed by your linting application.
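
One way to organize such validators is behind a small interface. The sketch below is an assumption about how this could look, not the structure of my actual project: Finding, Validator, ShortNameValidator, and lintSource are hypothetical names, and skipping the blank identifier "_" is an extra choice made for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Finding is a hypothetical record of one rule violation.
type Finding struct {
	Rule    string
	Message string
}

// Validator is a hypothetical interface that every rule implements.
type Validator interface {
	Name() string
	Validate(file *ast.File) []Finding
}

// ShortNameValidator flags identifiers on the left-hand side of an
// assignment whose names are shorter than MinLength.
type ShortNameValidator struct {
	MinLength int
}

func (v ShortNameValidator) Name() string { return "short-variable-name" }

func (v ShortNameValidator) Validate(file *ast.File) []Finding {
	var findings []Finding
	ast.Inspect(file, func(n ast.Node) bool {
		assign, ok := n.(*ast.AssignStmt)
		if !ok {
			return true
		}
		for _, lhs := range assign.Lhs {
			ident, ok := lhs.(*ast.Ident)
			if !ok || ident.Name == "_" {
				continue // not a plain identifier, or the blank identifier
			}
			if len(ident.Name) < v.MinLength {
				findings = append(findings, Finding{
					Rule:    v.Name(),
					Message: fmt.Sprintf("variable name %q is too short", ident.Name),
				})
			}
		}
		return true
	})
	return findings
}

// lintSource parses src and runs every validator against the resulting AST.
func lintSource(src string, validators []Validator) ([]Finding, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var all []Finding
	for _, v := range validators {
		all = append(all, v.Validate(file)...)
	}
	return all, nil
}

// sampleSource is a made-up input: x is too short, total is fine.
const sampleSource = `package demo

func run() int {
	x := 1
	total := x + 2
	return total
}
`

func main() {
	findings, err := lintSource(sampleSource, []Validator{ShortNameValidator{MinLength: 2}})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range findings {
		fmt.Printf("[%s] %s\n", f.Rule, f.Message)
	}
}
```

With an interface like this, adding a rule means adding one more type to the slice of validators, and the traversal code does not need to change.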

Displaying the results

To decouple the presentation of the linting results from the creation, I defined a shared report structure to store the program's output. Here is a sample structure I have used for my linter:

type Report struct {
    Id           primitive.ObjectID `bson:"_id"`
    Timestamp    time.Time          `bson:"timestamp"`
    FileCount    int                `bson:"fileCount"`
    ErrorCount   int                `bson:"errorCount"`
    WarningCount int                `bson:"warningCount"`
    PassedCount  int                `bson:"passedCount"`
    Files        FileResultList     `bson:"files"`
}

Apart from the file results themselves, some metadata is stored for later processing. While the structure above may not be the most extensible one, it was sufficient for me to implement the project's requirements.
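
The counters in the report can be derived from the per-file results. The following sketch shows one possible way to do that using only the standard library. Status, FileResult, Summary, and summarize are hypothetical stand-ins (the real FileResultList is not shown here), and counting one status per checked rule is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Status is a hypothetical outcome of checking one rule against one file.
type Status int

const (
	Passed Status = iota
	Warning
	Failed
)

// FileResult is a hypothetical, simplified per-file result.
type FileResult struct {
	Path     string
	Statuses []Status // one entry per rule checked against the file
}

// Summary carries the same metadata fields as the report above,
// without the MongoDB-specific parts.
type Summary struct {
	Timestamp    time.Time
	FileCount    int
	ErrorCount   int
	WarningCount int
	PassedCount  int
}

// summarize aggregates the per-file statuses into the report counters.
func summarize(files []FileResult) Summary {
	summary := Summary{Timestamp: time.Now(), FileCount: len(files)}
	for _, file := range files {
		for _, status := range file.Statuses {
			switch status {
			case Failed:
				summary.ErrorCount++
			case Warning:
				summary.WarningCount++
			case Passed:
				summary.PassedCount++
			}
		}
	}
	return summary
}

// sampleResults returns made-up results for two files.
func sampleResults() []FileResult {
	return []FileResult{
		{Path: "main.go", Statuses: []Status{Passed, Failed, Passed}},
		{Path: "util.go", Statuses: []Status{Warning, Passed}},
	}
}

func main() {
	s := summarize(sampleResults())
	fmt.Printf("%d files: %d errors, %d warnings, %d passed\n",
		s.FileCount, s.ErrorCount, s.WarningCount, s.PassedCount)
}
```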

To display the results later, I stored them in MongoDB and exposed them through a REST API, which is called by a React frontend application.
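
As a rough illustration of the API side, here is a sketch of an endpoint that serves a report summary as JSON. It deliberately uses plain net/http instead of a web framework and leaves out MongoDB entirely, so it is a simplification rather than my actual setup. The reportSummary type, the reportHandler and fetchReport helpers, and the /reports/latest path are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// reportSummary is a hypothetical, JSON-friendly view of the report counters.
type reportSummary struct {
	FileCount    int `json:"fileCount"`
	ErrorCount   int `json:"errorCount"`
	WarningCount int `json:"warningCount"`
	PassedCount  int `json:"passedCount"`
}

// reportHandler returns a handler that serves the given summary as JSON.
// A real implementation would load the report from the database instead.
func reportHandler(report reportSummary) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(report); err != nil {
			http.Error(w, "encoding failed", http.StatusInternalServerError)
		}
	}
}

// fetchReport calls the handler in-process, without opening a network port,
// and returns the status code and response body.
func fetchReport(h http.Handler, method string) (int, string) {
	req := httptest.NewRequest(method, "/reports/latest", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	handler := reportHandler(reportSummary{FileCount: 3, ErrorCount: 1, WarningCount: 2, PassedCount: 5})
	code, body := fetchReport(handler, http.MethodGet)
	fmt.Println(code)
	fmt.Print(body)
}
```

The frontend only needs to know the JSON shape, which is what keeps the presentation decoupled from how the report was produced.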

Conclusion

It is interesting to see how many components need to interact to provide valuable feedback to developers. First, we traverse the file system and construct an abstract syntax tree for each file. Then we lint the source code with validators and finally display the results to the users.

However, given the limited time available, I couldn't implement all of the capabilities a good linter usually provides. Some of the missing ones are:

  • Annotations to ignore lines or blocks of source code
  • Automatic fixes for the warnings/errors found
  • Extensive configuration options
  • IDE integrations

I have laid out some of the core components of a linter here, and I hope to eventually return to the project and learn more about the inner workings of the mechanisms outlined above.