BART from Scratch

GitHub
NLP · Transformers · PyTorch · Sequence-to-Sequence · BART

TL;DR

A complete BART transformer implementation from scratch, trained on the CNN/Daily Mail dataset for text summarization.

Detailed

Tech Stack:

PyTorch, HuggingFace tokenizer, CNN/Daily Mail dataset, Google Colab

Goal:

Implement the BART (Bidirectional and Auto-Regressive Transformers) model from scratch for text summarization.

What I did:

  • Built the complete BART encoder-decoder architecture without relying on pre-trained weights (see the configuration-driven sketch after this list)
  • Implemented greedy and beam search decoding strategies (greedy decoding sketched below)
  • Trained on the CNN/Daily Mail dataset for article summarization (data-preparation sketch below)
  • Made the architecture configurable via a JSON file
  • Provided both a Jupyter notebook and a structured component implementation
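
Concretely, here is a minimal sketch of what the JSON-driven architecture might look like, built on PyTorch's `nn.Transformer`. The config field names and the `BartFromScratch` class are illustrative assumptions, not the project's actual schema:

```python
import json
import torch
import torch.nn as nn

class BartFromScratch(nn.Module):
    """Minimal BART-style encoder-decoder; all sizes come from the JSON config."""

    def __init__(self, cfg):
        super().__init__()
        self.tok_emb = nn.Embedding(cfg["vocab_size"], cfg["d_model"])
        # BART uses learned (not sinusoidal) position embeddings.
        self.pos_emb = nn.Embedding(cfg["max_len"], cfg["d_model"])
        self.transformer = nn.Transformer(
            d_model=cfg["d_model"],
            nhead=cfg["nhead"],
            num_encoder_layers=cfg["num_encoder_layers"],
            num_decoder_layers=cfg["num_decoder_layers"],
            dim_feedforward=cfg["dim_feedforward"],
            dropout=cfg["dropout"],
            batch_first=True,
        )
        self.lm_head = nn.Linear(cfg["d_model"], cfg["vocab_size"])

    def embed(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.tok_emb(ids) + self.pos_emb(pos)

    def forward(self, src_ids, tgt_ids):
        # Causal mask: each decoder position may only attend to earlier positions.
        t = tgt_ids.size(1)
        tgt_mask = torch.triu(
            torch.full((t, t), float("-inf"), device=tgt_ids.device), diagonal=1
        )
        hidden = self.transformer(
            self.embed(src_ids), self.embed(tgt_ids), tgt_mask=tgt_mask
        )
        return self.lm_head(hidden)  # (batch, tgt_len, vocab) logits

# config.json might contain, e.g.:
# {"vocab_size": 50265, "d_model": 512, "nhead": 8, "num_encoder_layers": 6,
#  "num_decoder_layers": 6, "dim_feedforward": 2048, "dropout": 0.1, "max_len": 1024}
with open("config.json") as f:
    model = BartFromScratch(json.load(f))
```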
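
Greedy decoding can be sketched as follows, assuming the model interface above and hypothetical `bos_id`/`eos_id` special-token ids. Beam search works the same way but keeps the k highest-scoring partial summaries at each step instead of a single one:

```python
@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=128):
    """Greedy decoding: at each step, append the single most likely next token."""
    model.eval()
    # Start every sequence in the batch with the beginning-of-sequence token.
    tgt = torch.full(
        (src_ids.size(0), 1), bos_id, dtype=torch.long, device=src_ids.device
    )
    for _ in range(max_len - 1):
        logits = model(src_ids, tgt)                        # (batch, tgt_len, vocab)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)   # most likely next token
        tgt = torch.cat([tgt, next_tok], dim=1)
        if (next_tok == eos_id).all():  # stop once every sequence has emitted EOS
            break
    return tgt
```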
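
And a data-preparation sketch, assuming the HuggingFace `datasets` library and a pretrained BART tokenizer (only the tokenizer is reused; the model weights are trained from scratch). The `article`/`highlights` column names are the dataset's actual fields; the length limits are illustrative:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# "facebook/bart-base" is an assumption; any BART-compatible tokenizer works.
tok = AutoTokenizer.from_pretrained("facebook/bart-base")
raw = load_dataset("cnn_dailymail", "3.0.0")

def encode(batch):
    # CNN/Daily Mail stores the full text in "article" and the
    # reference summary in "highlights".
    model_in = tok(batch["article"], max_length=1024, truncation=True)
    labels = tok(batch["highlights"], max_length=128, truncation=True)
    model_in["labels"] = labels["input_ids"]
    return model_in

dataset = raw.map(encode, batched=True, remove_columns=raw["train"].column_names)
```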

What was achieved:

A working BART model that learns to summarize news articles. References: the BART paper (Lewis et al., 2019) and "Attention Is All You Need" (Vaswani et al., 2017).