  1. Dataset & Prep

Dataset: IMDb Movie Reviews.

Why: A widely used benchmark for binary (positive/negative) sentiment classification.

Prep: Lowercasing, punctuation removal, tokenization.
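The three prep steps can be sketched as below. This is a minimal illustration rather than the code in assignment.py: the function name `preprocess` is hypothetical, punctuation removal is assumed to cover ASCII punctuation only, and tokenization is assumed to be a plain whitespace split.

```python
import string


def preprocess(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    lowered = text.lower()
    # Build a translation table that deletes every ASCII punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Whitespace tokenization (an assumption; the source does not name a tokenizer).
    return no_punct.split()


tokens = preprocess("Great movie, I LOVED it!")
print(tokens)
```

Note that removing punctuation also collapses contractions ("don't" becomes "dont"), which may matter for negation handling.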

  1. Prompt Engineering

Prompt 1: “Classify the sentiment of this review: [text]” → direct label.

Prompt 2: “Does the reviewer sound happy or upset? Review: [text]” → natural response.

Prompt 3: “Return JSON with {label, confidence} for review: [text]” → structured output.
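One way to keep the three prompts side by side is a small template table. The sketch below is an assumption about how they might be organized; the names `PROMPTS`, `build_prompt`, and `parse_structured` are hypothetical, and the parser assumes the model replies to Prompt 3 with a valid JSON object containing `label` and `confidence` keys.

```python
import json

# The three prompt styles from the list above; {text} is the review placeholder.
PROMPTS = {
    "direct": "Classify the sentiment of this review: {text}",
    "natural": "Does the reviewer sound happy or upset? Review: {text}",
    # Doubled braces render as literal { } after str.format().
    "structured": "Return JSON with {{label, confidence}} for review: {text}",
}


def build_prompt(style: str, text: str) -> str:
    """Fill the chosen template with the review text."""
    return PROMPTS[style].format(text=text)


def parse_structured(reply: str) -> tuple[str, float]:
    """Parse a Prompt 3 reply, assuming it is a JSON object with both keys."""
    data = json.loads(reply)
    return str(data["label"]), float(data["confidence"])


print(build_prompt("direct", "Loved every minute."))
```

Structured output is the easiest of the three to score automatically, since the other two styles need some mapping from free text to a label.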

  1. Evaluation

Model: distilbert-base-uncased-finetuned-sst-2-english.

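Metrics (tiny dataset): Accuracy 1.00, Precision 1.00, Recall 1.00, F1 1.00. Perfect scores on a sample this small are not a reliable estimate of performance on the full dataset.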
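The four metrics can be computed from predictions and gold labels as sketched here. This is an illustrative, dependency-free version, not the script's actual code: `evaluate` and `keyword_stub` are hypothetical names, the stub stands in for the real model, and the four example reviews are invented purely to exercise the function. The label strings `POSITIVE` and `NEGATIVE` follow the output labels of the SST-2 fine-tuned DistilBERT checkpoint.

```python
from typing import Callable, Iterable


def evaluate(
    classify: Callable[[str], str],
    examples: Iterable[tuple[str, str]],
    positive: str = "POSITIVE",
) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tp = fp = fn = correct = total = 0
    for text, gold in examples:
        pred = classify(text)
        total += 1
        correct += int(pred == gold)
        if pred == positive and gold == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif gold == positive:
            fn += 1
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def keyword_stub(text: str) -> str:
    """Stand-in classifier for illustration only; not the real model."""
    return "POSITIVE" if "great" in text.lower() else "NEGATIVE"


# Invented examples; the third one is a deliberate trap for the stub.
examples = [
    ("Great film", "POSITIVE"),
    ("Awful plot", "NEGATIVE"),
    ("Great cast, awful script", "NEGATIVE"),
    ("Boring", "NEGATIVE"),
]
scores = evaluate(keyword_stub, examples)
print(scores)
```

With transformers installed, `classify` could instead wrap `pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")`, which returns a list of dicts with `label` and `score` keys. scikit-learn's `accuracy_score` and `precision_recall_fscore_support` compute the same quantities.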

  1. Troubleshooting

Issue: Sarcasm and negation can confuse the model.

Fix: Add sarcastic examples, or prompt the model to “consider sarcasm.”
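One reading of this fix is a few-shot prompt that combines both ideas: an explicit instruction plus a couple of labeled sarcastic reviews. The sketch below assumes a prompted, instruction-following model (a fine-tuned classifier such as the DistilBERT checkpoint takes no prompt). The function name `build_sarcasm_aware_prompt` is hypothetical and the example reviews are invented for illustration.

```python
# Invented sarcastic reviews with their intended labels (illustrative only).
SARCASM_EXAMPLES = [
    ("Oh great, another two hours of my life I'll never get back.", "negative"),
    ("Wow, what a masterpiece. I fell asleep twice.", "negative"),
]


def build_sarcasm_aware_prompt(
    text: str,
    examples: list[tuple[str, str]] = SARCASM_EXAMPLES,
) -> str:
    """Build a few-shot prompt that asks the model to consider sarcasm."""
    lines = [
        "Classify the sentiment of each review as positive or negative.",
        "Consider sarcasm and negation before answering.",
        "",
    ]
    for review, label in examples:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # Leave the final label blank for the model to complete.
    lines.append(f"Review: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)


prompt = build_sarcasm_aware_prompt("Sure, because what I wanted was a third sequel.")
print(prompt)
```

Whether this helps should be checked on held-out sarcastic reviews, not the ones used in the prompt.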

  1. Run

Install dependencies: pip install transformers scikit-learn

Run the script: python assignment.py