June 23, 2020
Question Answering NLP in Go
@anthonycorletti

I've wanted to do more software development in Go but have found myself bouncing back to Python or Ruby out of familiarity with their libraries, web frameworks, and ML/AI tools.

About a week ago I stumbled onto spago, an ML library written in Go that's designed to support neural network architectures for NLP tasks.

I figured this is a great way to start teaching myself more about the language, given that more and more tools like this are enabling robust ML/AI applications in Go. I'm unsure whether anything will be as robust as TensorFlow or PyTorch, but for now, working with something like spago and golearn is a great start. See my previous post on building a K-nearest-neighbors implementation with Go and golearn.

So let's walk through an example that illustrates how we can build a simple service that does question answering NLP with spago.

In this walkthrough, I'm going to use Docker to build and run a spago service that takes in a passage of text and a question about it, and provides an answer.

For building models, transformers, and binaries, I'm going to work locally (on a Mac), as the Docker-based builds aren't working as expected at the time of this post (I'll be opening an issue soon).

To get spago set up, first clone the repo and build the container image:

cd ~
git clone https://github.com/nlpodyssey/spago.git
cd spago
docker build -t spago:main . -f Dockerfile

Now, what's really cool about spago is that it lets you use a pre-trained model for inference, train one from scratch, or fine-tune an existing one. However, training a language model (i.e., the transformer objective) well enough to get competitive results can be prohibitively expensive. That's true in general, but even more so with spago, as it does not currently use GPUs.

Pre-trained transformer models fine-tuned for question answering exist for several languages and are publicly hosted on the Hugging Face models repository. In particular, these exist for BERT and ELECTRA, the two types of transformers currently supported by spago.

To import a pre-trained model, run the hugging_face_importer, indicating both the name of the model you'd like to import (including the organization) and a local directory where your models will be stored (for our example, ~/.spago).

We'll build our bert_server and hugging_face_importer binaries locally rather than with Docker. I'm building on a Mac, hence the darwin build flags:

mkdir -p ~/.spago
GOOS=darwin GOARCH=amd64 go build -o hugging_face_importer cmd/huggingfaceimporter/main.go
GOOS=darwin GOARCH=amd64 go build -o bert_server cmd/bert/main.go
./hugging_face_importer --model=deepset/bert-base-cased-squad2 --repo=~/.spago

If you're feeling maverick and decide to do this with Docker anyway, I've noticed that once the PyTorch model is downloaded, the spago model never gets built: the build hangs for 30 minutes and leaves a zombie Python process running. If this happens to you, kill the Docker container and the zombie Python process:

docker ps -a
# find the ID of the container you want to kill
docker kill <CONTAINER_ID>
# kill any zombie spago processes
ps aux | grep -i spago | awk '{print $2}' | xargs kill -9

Once the import completes, we can start our server and send it a request:

docker run --rm -it -p 1987:1987 -v ~/.spago:/tmp/spago spago:main ./bert_server run --model=/tmp/spago/deepset/bert-base-cased-squad2
PASSAGE="BERT is a technique for NLP developed by Google. BERT was created and published in 2018 by Jacob Devlin and his colleagues from Google."
QUESTION1="Who is the author of BERT?"
QUESTION2="When was BERT created?"
curl -s -k -H "Content-Type: application/json" "https://127.0.0.1:1987/answer?pretty" -d '
    {
        "question": "'"$QUESTION1"'",
        "passage": "'"$PASSAGE"'"
    }' | jq
{
  "answers": [
    {
      "text": "Jacob Devlin",
      "start": 91,
      "end": 103,
      "confidence": 0.9641588621246571
    }
  ],
  "took": 1712
}
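Since bert_server speaks plain JSON over HTTP, it's easy to call from Go as well. Here's a minimal client sketch: the structs simply mirror the response shape shown above, and the client skips certificate verification just like curl's -k flag does (the local server presents a self-signed certificate). The question and passage are the same ones from the example.

package main

import (
	"bytes"
	"crypto/tls"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Answer mirrors a single entry in bert_server's "answers" array.
type Answer struct {
	Text       string  `json:"text"`
	Start      int     `json:"start"`
	End        int     `json:"end"`
	Confidence float64 `json:"confidence"`
}

// AnswerResponse mirrors the JSON body returned by the /answer endpoint.
type AnswerResponse struct {
	Answers []Answer `json:"answers"`
	Took    int64    `json:"took"` // processing time as reported by the server
}

func main() {
	payload, err := json.Marshal(map[string]string{
		"question": "Who is the author of BERT?",
		"passage":  "BERT is a technique for NLP developed by Google. BERT was created and published in 2018 by Jacob Devlin and his colleagues from Google.",
	})
	if err != nil {
		log.Fatal(err)
	}

	// The local server uses a self-signed certificate, so skip
	// verification here, just like curl's -k flag does.
	client := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}

	resp, err := client.Post("https://127.0.0.1:1987/answer", "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var answerResp AnswerResponse
	if err := json.NewDecoder(resp.Body).Decode(&answerResp); err != nil {
		log.Fatal(err)
	}

	// Answers come back ranked by confidence; print them all.
	for _, a := range answerResp.Answers {
		fmt.Printf("%s (confidence: %.2f)\n", a.Text, a.Confidence)
	}
}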

SO COOL RIGHT?!

And for our second question:

curl -s -k -H "Content-Type: application/json" "https://127.0.0.1:1987/answer?pretty" -d '
    {
        "question": "'"$QUESTION2"'",
        "passage": "'"$PASSAGE"'"
    }' | jq
{
  "answers": [
    {
      "text": "2018",
      "start": 83,
      "end": 87,
      "confidence": 0.9924210921706913
    }
  ],
  "took": 1544
}

And now for some harder questions:

FRANK_LYRICS="I've got the world on a string, sittin' on a rainbow. Got the string around my finger. What a world, what a life, I'm in love. I've got a song that I sing. I can make the rain go, anytime I move my finger. Lucky me, can't you see, I'm in love. Life is a beautiful thing, as long as I hold the string. I'd be a silly so and so, if I should ever let go. I've got the world on a string, sittin' on a rainbow. Got the string around my finger. What a world, what a life, I'm in love. Life is a beautiful thing, as long as I hold the string. I'd be a silly so and so, if I should ever let go. I've got the world on a string, sittin' on a rainbow. Got the string around my finger. What a world. Man this is the life. And now I'm so in love."
QUESTION="Where am I sitting?"
curl -s -k -H "Content-Type: application/json" "https://127.0.0.1:1987/answer?pretty" -d '
    {
        "question": "'"$QUESTION"'",
        "passage": "'"$FRANK_LYRICS"'"
    }' | jq
{
  "answers": [
    {
      "text": "a rainbow",
      "start": 43,
      "end": 52,
      "confidence": 0.4979075157972422
    },
    {
      "text": "rainbow",
      "start": 45,
      "end": 52,
      "confidence": 0.2740512170539793
    },
    {
      "text": "sittin' on a rainbow",
      "start": 32,
      "end": 52,
      "confidence": 0.2280412671487786
    }
  ],
  "took": 11391
}
QUESTION="What have I got the world on?"
curl -s -k -H "Content-Type: application/json" "https://127.0.0.1:1987/answer?pretty" -d '
    {
        "question": "'"$QUESTION"'",
        "passage": "'"$FRANK_LYRICS"'"
    }' | jq
{
  "answers": [
    {
      "text": "a string",
      "start": 22,
      "end": 30,
      "confidence": 0.35698908428398957
    },
    {
      "text": "on a string",
      "start": 19,
      "end": 30,
      "confidence": 0.31648355551323776
    }
  ],
  "took": 22928
}
QUESTION="Am I in love?"
curl -s -k -H "Content-Type: application/json" "https://127.0.0.1:1987/answer?pretty" -d '
    {
        "question": "'"$QUESTION"'",
        "passage": "'"$FRANK_LYRICS"'"
    }' | jq
{
  "answers": [
    {
      "text": "I'm in love",
      "start": 231,
      "end": 242,
      "confidence": 0.8533213569424318
    }
  ],
  "took": 19945
}

Kinda wonky but HOLY COW 🐮

So incredible. And I'm running all of this on my Mac, which is nothing too wild. Specs for reference: 2.3 GHz quad-core Intel Core i7 processor, 16 GB 1600 MHz DDR3 memory.

To fully stop every process associated with spago, I've had to run ps aux | grep -i spago | awk '{print $2}' | xargs kill -9.

What's great about spago is that it could easily be deployed on a Kubernetes cluster and exposed as a service (sketched below). With the modularity of integrating pre-trained models via hugging_face_importer, spago is an exemplary foundation for doing all sorts of cool ML in Go. For more examples of what spago can do, check out the GitHub repo.
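To make the Kubernetes idea concrete, here's a minimal sketch of what a Deployment and Service for bert_server could look like. It assumes you've pushed the spago:main image to a registry your cluster can pull from and that the imported model lives on a persistent volume; the registry path and claim name here are hypothetical, and the command and port simply mirror the docker run invocation above.

# bert-server.yaml: a hypothetical Deployment + Service for bert_server
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bert-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bert-server
  template:
    metadata:
      labels:
        app: bert-server
    spec:
      containers:
        - name: bert-server
          image: registry.example.com/spago:main  # hypothetical registry path
          command: ["./bert_server", "run", "--model=/tmp/spago/deepset/bert-base-cased-squad2"]
          ports:
            - containerPort: 1987
          volumeMounts:
            - name: models
              mountPath: /tmp/spago
      volumes:
        - name: models
          persistentVolumeClaim:
            claimName: spago-models  # hypothetical PVC holding the imported model
---
apiVersion: v1
kind: Service
metadata:
  name: bert-server
spec:
  selector:
    app: bert-server
  ports:
    - port: 1987
      targetPort: 1987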