Summarization using Transformer Models (Including LLaMA 2)

This repository contains code and resources for text summarization with Transformer-based models, including LLaMA 2 alongside BERT, DistilBERT, Pegasus, T5, and RoBERTa. The project demonstrates fine-tuning these models on the CNN/Daily Mail dataset to generate concise, coherent summaries of news articles.

Features

  • Fine-tuning of various Transformer models, including LLaMA 2, for text summarization tasks.
  • Memory-efficient training using the QLoRA and LoRA fine-tuning techniques (see Quantization Techniques below).
  • Evaluation metrics to measure summarization performance (a ROUGE sketch follows this list).
  • Instructions for replicating the experiments with custom datasets.
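
ROUGE is the standard metric family for summarization on CNN/Daily Mail, so it is the most likely candidate for the evaluation mentioned above; the README does not name the metric, so treat this as an assumption. A minimal sketch using the Hugging Face evaluate library, with placeholder predictions and references:

```python
import evaluate

# ROUGE measures n-gram overlap between generated and reference summaries.
# Assumption: ROUGE is the metric this repo reports; the README does not say.
rouge = evaluate.load("rouge")

predictions = ["the cat sat on the mat"]       # placeholder model outputs
references = ["a cat was sitting on the mat"]  # placeholder reference summaries

results = rouge.compute(predictions=predictions, references=references)
print(results)  # rouge1, rouge2, rougeL, rougeLsum scores
```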

Models Implemented

  • LLaMA 2 7B (Large Language Model Meta AI)
  • BERT (Bidirectional Encoder Representations from Transformers)
  • DistilBERT (A smaller, faster, cheaper version of BERT)
  • Pegasus (Pre-training with Extracted Gap-sentences for Abstractive Summarization)
  • T5 (Text-To-Text Transfer Transformer)
  • RoBERTa (Robustly Optimized BERT Pretraining Approach)
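
As a quick illustration of the task, the sketch below runs summarization inference with the Hugging Face pipeline API. The google/pegasus-cnn_dailymail checkpoint is a public Pegasus model fine-tuned on CNN/Daily Mail, used here as a stand-in; this repository's own fine-tuned checkpoints may differ:

```python
from transformers import pipeline

# Load a summarization pipeline; any compatible checkpoint can be swapped in.
summarizer = pipeline("summarization", model="google/pegasus-cnn_dailymail")

article = (
    "The city council voted on Tuesday to approve a new transit plan. "
    "The plan adds three bus routes and extends light-rail service to the "
    "airport, with construction expected to begin next spring."
)

summary = summarizer(article, max_length=64, min_length=16, do_sample=False)
print(summary[0]["summary_text"])
```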

Dataset

The models are fine-tuned on the CNN/Daily Mail dataset, a widely used text summarization benchmark consisting of news articles paired with human-written summaries.

Dataset Structure

  • Articles: Long-form news articles.
  • Summaries: Short summaries corresponding to each article.
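
A minimal sketch of loading the dataset with the Hugging Face datasets library. The cnn_dailymail dataset on the Hub exposes the articles as article and the summaries as highlights, with "3.0.0" as the commonly used configuration; whether this repo loads it exactly this way is an assumption:

```python
from datasets import load_dataset

# Load the CNN/Daily Mail dataset; "3.0.0" is the standard configuration.
dataset = load_dataset("cnn_dailymail", "3.0.0")

example = dataset["train"][0]
print(example["article"][:500])  # long-form news article (truncated)
print(example["highlights"])     # the paired human-written summary
```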

Quantization Techniques

To optimize the training process, the following quantization techniques are employed:

  • QLoRA: fine-tunes LoRA adapters on top of a 4-bit quantized base model, sharply reducing GPU memory usage while maintaining performance.
  • LoRA: a low-rank adaptation technique that freezes the base model and trains small rank-decomposition matrices, greatly reducing the number of trainable parameters and speeding up training.
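
A minimal QLoRA setup sketch using transformers, bitsandbytes, and peft. The rank, alpha, dropout, and target modules below are illustrative assumptions, not this repository's exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the base model: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LLaMA 2 is a gated model; access must be requested on the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters: only these small low-rank matrices are trained.
lora_config = LoraConfig(
    r=16,                                 # illustrative rank
    lora_alpha=32,                        # illustrative scaling factor
    lora_dropout=0.05,                    # illustrative dropout
    target_modules=["q_proj", "v_proj"],  # common choice for LLaMA-family models
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows how few parameters are trainable
```

The quantized base model stays frozen and only the adapter weights receive gradients, which is what makes fine-tuning a 7B model feasible on a single GPU.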
