somos-ubb/Lyrics_Gender_Violence

Corpus of Spanish song lyrics labeled for gender-based violence detection

Code folder

Includes a Google Colab notebook ([https://github.com/somos-ubb/Lyrics_Gender_Violence/blob/main/code/BETO/model.ipynb]) used to fine-tune a model for our purpose: detecting gender-based violence against women. The base model is BETO, the Spanish BERT model available at ([https://github.com/dccuchile/beto]).

Corpus folder

It includes two versions of GBV_dataset: GBV_dataset1000.csv, a corpus of 1,000 song lyrics labeled as {0: without gender-based violence; 1: with gender-based violence}, and GBV_dataset1400_.csv, a corpus of 1,400 song lyrics. Its construction was based on newly collected examples and on previous work relabeled by an expert in gender-based approaches:

Sources

References

[1] Calbullanca Viluñir, R., Segura Navarrete, A., Vidal-Castro, C., & Martínez-Araneda, C. (2024). Corpus of song lyrics in Spanish labeled for gender-based violence against women (1.0.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13370289

[2] Gutiérrez, R., Segura Navarrete, A. A., Martínez-Araneda, C., & Vidal-Castro, C. (2024). Augmented DataSet [Data set]. Zenodo. https://doi.org/10.5281/zenodo.12802358

[3] Casanovas-Buliart, L., Álvarez-Cueva, P., & Castillo, C. (2024). Evolution over 62 years: an analysis of sexism in the lyrics of the most-listened-to songs in Spain. Cogent Arts & Humanities, 11(1). https://doi.org/10.1080/23311983.2024.2436723
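Checking the label distribution

The binary labeling described in the Corpus folder section ({0: without gender-based violence; 1: with gender-based violence}) can be sanity-checked with a few lines of Python. This is only a minimal sketch: the column names `lyrics` and `label`, the UTF-8 encoding, and the helper `count_labels` are assumptions made for illustration and are not taken from the repository, so adjust them to the actual CSV header. The sample rows are placeholders, not data from the corpus.

```python
import csv
import io
from collections import Counter

# Label meanings as stated in the Corpus folder description.
LABEL_NAMES = {
    "0": "without gender-based violence",
    "1": "with gender-based violence",
}


def count_labels(csv_file, label_column="label"):
    """Count rows per label value in an open CSV file.

    `label_column` is an assumed column name; check the real header first.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        counts[row[label_column].strip()] += 1
    return counts


# Tiny in-memory stand-in with placeholder rows (not real corpus data).
sample = io.StringIO(
    "lyrics,label\n"
    '"placeholder text a",0\n'
    '"placeholder text b",1\n'
    '"placeholder text c",0\n'
)
counts = count_labels(sample)
for value, name in LABEL_NAMES.items():
    print(f"{value} ({name}): {counts[value]}")

# With the real file (encoding and column name are assumptions):
# with open("GBV_dataset1000.csv", newline="", encoding="utf-8") as f:
#     counts = count_labels(f, label_column="label")
```

Reading the file with the standard `csv` module keeps the check dependency-free; the same count could be obtained with pandas if it is already installed.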

How to cite the GBV_dataset1000 and GBV_dataset1400?

Calbullanca Viluñir, R., Segura-Navarrete, A., Vidal-Castro, C., & Martínez-Araneda, C. (2024). Corpus of Song Lyrics in Spanish Labeled for Gender-Based Violence against Women (Version 1.0.0) [Data set]. Zenodo. [https://doi.org/10.5281/zenodo.13370289]

Segura-Navarrete, A., Martínez-Araneda, C., Quintana-Reyes, C., Vidal-Castro, C., & Gómez-Meneses, P. (2026). somos-ubb/Lyrics_Gender_Violence: Gender-based violence DataSet (GBV_dataset1400) (1.0.1) [Data set]. Zenodo. [https://doi.org/10.5281/zenodo.18157160]

date-updated: January 5, 2026

Grupo de Investigación SoMoS (SOftware MOdelling & Science)
