WaveNet-Inspired Language Model
A language model inspired by the WaveNet architecture for efficient text generation.
In this project, I coded a character-level autoregressive language model from scratch in pure PyTorch and trained it on a large dataset of company names to generate novel names for my imaginary startup :). I was inspired by Google DeepMind's WaveNet paper and thought it would be fun to bring the idea of causal convolutions into a language model. It turned out to improve model performance, which makes intuitive sense: the model can capture longer-range dependencies in the text. Although this is a small character-level example, the same idea could be scaled up to words and sentences. I also wanted to deploy the model in a Marimo notebook and make it interactive and configurable, so that anyone can tune the hyperparameters and see how they affect performance.
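To make the causal-convolution idea concrete, here is a minimal sketch in plain Python (no PyTorch, so it runs anywhere). It is not the project's actual code: the helper names `causal_conv1d`, `receptive_field`, and `stack`, the kernel size of 2, and the dilations 1, 2, 4 are all assumptions chosen for illustration. It shows that each output depends only on current and past inputs, and how stacking layers with growing dilation widens the context one output can see.

```python
def causal_conv1d(x, weights, dilation=1):
    """Causal 1-D convolution over a list of numbers (hypothetical helper).

    The output at position t depends only on x[t], x[t - d], x[t - 2d], ...
    Positions before the start of the sequence count as zeros (left padding).
    """
    k = len(weights)
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(weights):
            # weights[-1] multiplies the current step, weights[0] the oldest one
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """How many input positions a single output of the stacked layers can see."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def stack(x, weights, dilations):
    """Apply one causal convolution per dilation, reusing the same weights."""
    for d in dilations:
        x = causal_conv1d(x, weights, dilation=d)
    return x


if __name__ == "__main__":
    # A single impulse at position 0 shows how far information travels forward.
    impulse = [1.0] + [0.0] * 9
    print(stack(impulse, [1.0, 1.0], [1, 2, 4]))
    print(receptive_field(2, [1, 2, 4]))
```

Under these assumed settings, three layers are enough for one output to see eight characters of history, and nothing from the future ever leaks into an earlier position. That property is what lets the model be trained to predict the next character at every position in parallel.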