ba11b0y/candi

Understanding Privacy Preserving Knowledge in models via Mechanistic Interpretability

This project studies the privacy-preserving properties of LLMs with a real-world application in mind. The idea is to take a model deployed in a setting where it has access to PII and is asked to perform a task (summarization, drawing insights, etc.) without leaking that PII (by masking it), and to check whether the data can still leak through the model's internal activations and the top-k tokens it generates. A minimal sketch of this kind of probe follows.
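The snippet below is a minimal sketch (not code from this repository) of the described setup, assuming a Hugging Face causal LM: a prompt containing synthetic PII asks the model to summarize with masking, and we then inspect the hidden activations and top-k next-token candidates to see whether the PII still surfaces internally. The model name, prompt, and synthetic PII are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM that exposes hidden states works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Synthetic PII for illustration only.
prompt = (
    "Patient record: John Doe, SSN 123-45-6789.\n"
    "Summarize the record, masking all personal identifiers:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs)

# Top-k candidate tokens at the final position: does the model still rank
# PII-related tokens highly even when instructed to mask them?
k = 10
top = torch.topk(out.logits[0, -1], k)
for score, tok_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(tok_id):>15}  logit={score.item():.2f}")

# Per-layer internal activations that a mechanistic-interpretability probe
# (e.g. a linear classifier) could be trained on to test for leaked PII.
hidden_states = out.hidden_states  # tuple of (num_layers + 1) tensors [batch, seq, d_model]
print(f"{len(hidden_states)} activation tensors, last shape {tuple(hidden_states[-1].shape)}")

In practice the same inspection would be run at every generation step (and every layer), since PII masked in the visible output may still dominate the candidate distribution or be linearly decodable from intermediate activations.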
