A reproducible evaluation pipeline for assessing toxic content in language model outputs using Detoxify, developed as part of a 10-week research project.
python nlp machine-learning transformers text-analysis research-project language-model ai-safety responsible-ai detoxify ethical-ai toxicity-detection prompt-learning text-toxicity
Updated Aug 17, 2025 - Python