Docker Image: Lean Python 3.7 with NLTK

I published a new lean Docker image for Python 3.7 with the NLTK library that is only 38 MB in size. It has a detailed write-up with the Dockerfile attached (because you shouldn’t trust unsigned public images). The motivation was that I needed a simple Python container with NLTK, but some of the existing images […]

TeBaC-NET Design Considerations

In my previous post on TeBaC-NET, I talked about why I created it. In this post, I talk about why I built it the way it is. Design Considerations #1 Cross Platform: One of the most important considerations is that it should be platform agnostic. A simple tool that can run on any […]

Text Based Custom Named Entity Tagger (TeBaC-NET)

I was recently exploring spaCy for some NLP work, and found that the default model was not sufficient for tagging entities in the domain I was exploring. The documentation was very helpful in explaining how I could train the statistical model of the named entity recognizer, but I needed training and evaluation data. While I could […]

Paddle-SG: Software Architecture and Infrastructure

In this post, I elaborate further on the software architecture and infrastructure that keep Paddle-SG up and running. I will start from the lowest level (i.e. hardware) and slowly move up the “stack”. Infrastructure Setup Physical Hardware I started developing and testing the scripts on my laptop. However, for the Minimum Viable Product (MVP), […]