aragy/README.md

Hi, I'm Roberto Aragy! 👋

Welcome to my GitHub profile! I am a Data Scientist and Machine Learning Engineer with over 10 years of experience in software development and more than 6 years in machine learning. I currently work at the Court of Justice, where I develop Generative AI (GenAI) and retrieval-augmented generation (RAG) systems for document summarization, question answering (QA), and legal text simplification.

🔧 My Key Expertise:

  • Programming Languages: Python, SQL
  • Data Science & Machine Learning: pandas, NumPy, scikit-learn, TensorFlow, PyTorch, Keras, Hugging Face, spaCy
  • NLP & LLMs: 6+ years of experience working with Natural Language Processing (NLP) and Large Language Models (LLMs), including fine-tuning models like Mistral and Llama for legal and business applications
  • Cloud Platforms & DevOps: Google Cloud Platform (GCP), Vertex AI, BigQuery, Docker, Kubernetes
  • MLOps & Deployment: Expertise in building and managing ML pipelines using MLOps principles, including model serving and orchestration with Kubernetes and Vertex AI
  • Legal AI Systems: Building AI-driven tools to simplify judicial text, generate case summaries, and automate decision-making insights

💼 Professional Experience:

I have extensive experience contributing to cutting-edge AI projects in the legal tech space, helping organizations build smarter, more efficient processes. I've led projects that involve everything from data preparation and annotation to model deployment and API integration, following best practices in MLOps.

Recent projects include:

  • GenAI and RAG systems: Developing advanced generative models and retrieval-augmented generation systems to summarize legal documents, generate reports, and simplify complex legal language.
  • Automated Legal Summarization Systems: Implementing tools to extract meaningful insights from court case documents using NLP techniques and LangChain.

📚 Ongoing Projects:

  • Fine-tuning LLMs: Continuously experimenting with Supervised Fine-Tuning (SFT) on models like Mistral and Llama to adapt them to specific legal and business contexts.

🌱 Personal Growth:

I'm always learning new tools and technologies to stay on top of the rapidly evolving data engineering landscape. Currently, I'm exploring LangGraph, CrewAI, and Ollama.

📫 How to reach me:

I’m always open to new challenges, so feel free to connect and explore collaboration opportunities!

Popular repositories

  1. ProjetoGenJustica_chatDoc Public

    Python 1

  2. ProjetoGenJustica Public

    Python 1

  3. Anon Public

  4. Introduction-to-Python Public

    Jupyter Notebook

  5. pytorch-Deep-Learning Public

    Forked from Atcold/NYU-DLSP20

    Deep Learning (with PyTorch)

    Jupyter Notebook

  6. language-models Public

    Forked from piegu/language-models

    pre-trained Language Models

    Jupyter Notebook