Oct 25 2022 · Uppu Rajesh Kumar, Piotr Płoński

Convert Jupyter Notebook to spaCy NLP web application

Jupyter notebooks are a convenient way to prototype Machine Learning models. They are easy to use and well suited to teaching data science concepts, because code cells show the code and its output side by side. However, to test several examples of some functionality, we have to edit the code and re-run the cells each time. Anyone without a good programming background may also find the code difficult to handle.

Apart from these problems, sharing a Jupyter Notebook among several users is also tedious. Each person needs a copy of the notebook and has to install the tools required to open it. But what if we could share the Jupyter Notebook as a web application? What if we could use widgets to control the code without editing it for every input value? This way, we wouldn't need a separate copy of the notebook for each user, and we wouldn't need to change the code for every input value.

Mercury addresses these problems. With it, we can convert a Jupyter Notebook into a web application and host it on a cloud platform. We can add interactive widgets to see the output for different input values, share the notebook online, and demo it easily. In this article, we will convert a Jupyter Notebook into an NLP web application.

Installation of required libraries

First, we install the necessary libraries. Please create a requirements.txt file with the following content:

mljar-mercury
simple_colors
spacy==3.0.8
spacytextblob==4.0.0
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz

Please set up a virtual environment and install the required packages:

virtualenv spenv --python=python3.8
source spenv/bin/activate
pip install -r requirements.txt
python -m ipykernel install --user --name spenv

We use the Mercury library to convert the Jupyter Notebook into a web application. We use spaCy and spacytextblob to perform several tasks on a given text:

  • Named Entity Recognition,
  • Sentiment Analysis,
  • Dependency Parser.

The simple-colors library is used to print colored text in Python. In the next step, please open a new Jupyter Notebook to start building the application. Remember to select the spenv kernel for the new notebook.

Create Jupyter Notebook

Open a Jupyter Notebook and import the necessary libraries as shown below:

import spacy
from spacy import displacy
from spacytextblob.spacytextblob import SpacyTextBlob
from simple_colors import *

The application that we are building has three functionalities: Sentiment Analysis, Named Entity Recognition, and Dependency Parsing. We create a single function that performs whichever of these tasks are selected, so that calling it produces the output for each of them. The code is shown below:

def tasks(task):
    # Runs every selected task on the global `doc`. The widget variables
    # (compact, collapse_punctuation, collapse_phrases, entities) are notebook globals.
    if 'Dependency Parser' in task:
        # displaCy options for the dependency graph
        options = {
            "compact": compact,
            "collapse_punct": collapse_punctuation,
            "collapse_phrases": collapse_phrases,
            "color": "black",
            "bg": "linear-gradient(90deg, #aa9cfc, #fc9ce7)",
            "distance": 100
        }
        print(red('Dependency Parser', ['bold', 'underlined']))
        displacy.render(doc, style="dep", jupyter=True, options=options)
    if 'Named Entity Recognition' in task:
        print(red('Named Entities', ['bold', 'underlined']))
        try:
            # Only the entity types chosen in the sidebar are highlighted
            options = {"ents": entities}
            if len(doc.ents) == 0:
                print('spaCy has not detected any entities in the doc object.')
            else:
                displacy.render(doc, style="ent", jupyter=True, options=options)
        except Exception:
            print('No entity is chosen. Please select at least one entity from the list on the sidebar.')
    if 'Sentiment Analysis' in task:
        print(red('Sentiment Analysis', ['bold', 'underlined']))
        polarity = doc._.blob.polarity  # polarity score provided by spacytextblob
        if polarity > 0:
            print('Positive:' + ' Polarity score is ' + str(polarity))
        elif polarity < 0:
            print('Negative:' + ' Polarity score is ' + str(polarity))
        else:
            print('Neutral:' + ' Polarity score is ' + str(polarity))

The Dependency Parser comes with the options compact, collapse_punctuation, and collapse_phrases. For Named Entity Recognition, we can choose which kinds of entities should be displayed. For Sentiment Analysis, we print Positive if the polarity score is greater than zero, Negative if it is less than zero, and Neutral otherwise.
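The sign rule for the sentiment label can be pulled into a small helper, which makes it easy to check without loading spaCy. This is only a sketch: polarity_label and describe_polarity are hypothetical names that do not appear in the original notebook, and the score is assumed to be the float that doc._.blob.polarity returns.

```python
def polarity_label(score: float) -> str:
    """Map a polarity score to the label printed by the notebook."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def describe_polarity(score: float) -> str:
    """Build a message in the same format that tasks() prints."""
    return f"{polarity_label(score)}: Polarity score is {score}"


print(describe_polarity(0.5))
print(describe_polarity(-0.25))
print(describe_polarity(0.0))
```

Keeping the rule in one place means a different neutral band (for example, treating very small scores as Neutral) would be a one-line change.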

The Dependency Parser outputs a dependency graph using spaCy's displaCy visualizer. For Named Entity Recognition, we print a message if no entities are detected. If the user doesn't choose an entity type, a prompt asks them to select at least one.
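The two fallback messages for Named Entity Recognition can be sketched as a plain function that decides what to show before anything is rendered. The helper names normalize_selection and ner_status are hypothetical, and the sketch assumes that the entity widget may hand over a single label string, a list of labels, or nothing at all. It also checks the selection first, which is a slightly different order from the notebook's try/except.

```python
from typing import Iterable, List, Union


def normalize_selection(value: Union[str, Iterable[str], None]) -> List[str]:
    """Return the selected entity labels as a list of upper-case strings."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string is treated as one label, not as a sequence of characters.
        return [value.upper()] if value else []
    return [label.upper() for label in value]


def ner_status(detected_labels: Iterable[str], selected) -> str:
    """Decide which branch of the NER task applies.

    detected_labels: labels of the entities found in the text
                     (in the notebook: [ent.label_ for ent in doc.ents]).
    selected:        the raw value of the entity widget.
    """
    chosen = normalize_selection(selected)
    if not chosen:
        return "no entity chosen"
    if not list(detected_labels):
        return "no entities detected"
    return "render"


print(ner_status(["ORG", "GPE", "MONEY"], "ORG"))
print(ner_status(["ORG"], []))
print(ner_status([], ["PERSON"]))
```

The list returned by normalize_selection is in the shape that the "ents" entry of the displaCy options expects, namely a list of entity labels.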

All this functionality lives in the function tasks(task). Next, we need to load a spaCy pipeline (the nlp object) and create a doc instance from the input text, as spaCy requires. We do that in the next cell:

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe('spacytextblob')
doc = nlp(sent)

In the next cell, we call our function:

tasks(task)
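Because the task widget allows multiple selections, each `in` test inside tasks() fires independently of the others. The same idea can be sketched with a lookup table instead of a chain of if statements. This is an illustrative alternative, not the notebook's code: run_tasks and HANDLERS are hypothetical names, the handlers are placeholders, and the selection is assumed to arrive either as a list of task names or as a single string.

```python
def run_tasks(selected, handlers):
    """Run the handler of every selected task, in registration order."""
    return [handler() for name, handler in handlers.items() if name in selected]


# Placeholder handlers; in the notebook these would render or print results.
HANDLERS = {
    "Dependency Parser": lambda: "dependency graph",
    "Named Entity Recognition": lambda: "entity highlights",
    "Sentiment Analysis": lambda: "polarity score",
}

# A list of names: membership is tested per element.
print(run_tasks(["Sentiment Analysis", "Dependency Parser"], HANDLERS))

# A single string: `in` becomes a substring test, which still matches the full name.
print(run_tasks("Named Entity Recognition", HANDLERS))
```

The output order follows the order in which handlers are registered, not the order in which the user picked the tasks, which mirrors the fixed order of the if blocks in tasks().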

So far, we have imported the libraries and defined the function that performs the tasks we choose. We have also created a doc object for the input text and called our function. To use Mercury, we need to add a YAML configuration to the Jupyter Notebook.

Add YAML to the Jupyter Notebook

Create a new cell at the very top of the notebook, so that it becomes the first cell, and change its type to Raw NBConvert. The title of the application goes under the title parameter and its description under the description parameter.

We need several widgets: a text box for the input text, a select box for choosing the tasks to perform, another select box for choosing the entity types, and checkboxes for the dependency parser options. You can find more about these widgets in the official Mercury documentation. We add the YAML to the Jupyter Notebook as shown below:

---
title: NLP with Spacy
description: This application shows different tasks of Spacy
show-code: False
params:
    sent:
        input: text
        label: Enter the text to analyze
        value: Apple is looking at buying U.K. startup for $1 billion
    task:
        input: select
        label: Choose the tasks to do
        choices: [Dependency Parser, Named Entity Recognition, Sentiment Analysis]
        value: Named Entity Recognition
        multi: True
    compact:
        input: checkbox
        label: Compact(If you chose Dependency Parser, check the below boxes accordingly)
        value: False
    collapse_punctuation:
        input: checkbox
        label: Collapse Punctuation
        value: False
    collapse_phrases:
        input: checkbox
        label: Collapse Phrases
        value: False
    entities:
        input: select
        label: If you chose Named Entity Recognition task then select the named entities you want to see
        choices: [PERSON, ORG, DATE, PRODUCT, LOC, GPE, LANGUAGE, FAC, NORP, MONEY]
        multi: True
---

In the next code cell, define the parameters as shown below. The variable names must match the parameter names used in the YAML:

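sent = 'Notebook in watch mode. All changes to Notebook will be automatically visible in Mercury'
task = ['Named Entity Recognition']  # assumed default mirroring the YAML value; tasks(task) needs it
compact = False
collapse_punctuation = True
collapse_phrases = False
entities = ['PERSON']  # a list, because displaCy's "ents" option expects a list of labels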
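Assuming the widget values reach the code through variables named after the YAML parameters, a forgotten variable only surfaces later as a NameError. A small check like the sketch below can catch that early. The helper missing_params is hypothetical, and the list of names is typed by hand to mirror the params section of the YAML rather than parsed from it. The stand-in namespace leaves out task on purpose to show what the check reports.

```python
# Names copied by hand from the `params` section of the YAML header.
YAML_PARAMS = ["sent", "task", "compact", "collapse_punctuation",
               "collapse_phrases", "entities"]


def missing_params(param_names, namespace):
    """Return the YAML parameter names that have no variable in `namespace`."""
    return [name for name in param_names if name not in namespace]


# Stand-in for the notebook's variables; inside a notebook cell,
# globals() could be passed instead.
example_namespace = {
    "sent": "Apple is looking at buying U.K. startup for $1 billion",
    "compact": False,
    "collapse_punctuation": True,
    "collapse_phrases": False,
    "entities": ["PERSON"],
}

print(missing_params(YAML_PARAMS, example_namespace))
```

An empty list means every widget has a matching variable in the parameters cell.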

We can also add Markdown cells that explain the concepts used in the application and anything else that needs an explanation. This makes the application easier to follow for its audience. Now that we have made all the necessary changes to the Jupyter Notebook, let's convert it to a web application.

spaCy notebook

Run Jupyter Notebook as Web Application

Please run the following command in the directory containing the notebook:

mercury run

That's all. Your notebook is now served as a web application. Please open your web browser at http://127.0.0.1:8000, where you should see a page with your notebook. The code is hidden, so users can interact with the notebook only through the widgets in the left side panel. The executed notebook can be downloaded as a PDF or HTML file.

spaCy web application

All the code is available in the public GitHub repository github.com/pplonski/nlp-spacy-web-app-mercury.

Conclusion

In this article, we saw several uses of the spaCy library. We converted a spaCy Jupyter Notebook into a web application and looked at how spaCy handles the different tasks.

Jupyter Notebooks play an important role in prototyping Machine Learning models, but tweaking the code in a notebook for every new input can be cumbersome. In such situations, a simple Jupyter Notebook can be converted into a web application, and the parameters can be adjusted using widgets. This lets us explore the functionality of a library in a simple way and share a Jupyter Notebook tutorial. All this is made possible by mljar-mercury.