thomasjmann23/README.md

Hi, I'm Thomas Mann

Welcome to my GitHub. I'm a data-driven developer focused on Generative AI, currently based in Madrid and completing my Master’s in Business Analytics and Data Science at IE University with a specialization in Advanced AI.

My work bridges automation, machine learning, and financial systems. I have hands-on experience building apps with LLMs, RAG pipelines, and OCR-based automation tools. I enjoy streamlining processes and taking ideas from concept to deployment.

Current Focus

  • Generative AI applications using LLMs
  • Retrieval-Augmented Generation
  • Using LLMs & NLP to make internal process decisions
  • Continual learning in MLOps and large-scale model deployment

Featured Projects

A Streamlit app that extracts SEC 10-K filings and uses Gemini and LangChain to generate insights from key-value pairs. Built for competitive intelligence and financial analysis.
Tech: Python, LangChain, Gemini, FAISS, Streamlit, SEC EDGAR API, HTML/CSS

An AI-powered ESG risk assessment platform that uses IoT smart grid data to provide real-time environmental scoring and risk classification. Features machine learning models trained on IoT sensor data for transparent, data-driven ESG ratings.

  • Dashboard: Real-time risk monitoring with interactive visualizations and project classification
  • Assessment Tool: Individual project ESG scoring with personalized recommendations
  • ML Analysis: Random Forest model with feature importance analysis and performance metrics

Tech: Python, Streamlit, Scikit-learn, Plotly, Pandas, IoT Data Processing, HTML/CSS

An NLP application that allows users to ask legal questions in any language about any country.

  • Query Pipeline: Detects the input language and target country, translates the question into the target language, retrieves the top-matching sections from the country's civil code using TF-IDF, summarizes results with Gemini, and translates the answer back to the original language.
  • Civil Code Ingestion: Uploads and processes civil code PDFs: parsing, chunking, and vectorizing them (TF-IDF) for accurate legal retrieval.

Tech: Python, NLP, TF-IDF, Gemini, Streamlit, PDF Processing, Multi-language Support
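
The chunking and TF-IDF retrieval steps described above can be illustrated with a minimal, standard-library-only sketch. This is not the project's actual code: the function names (`chunk_text`, `build_index`, `top_k`), the word-count chunking rule, the smoothed IDF formula, and the sample sections are all assumptions made for illustration, and the language detection, translation, and Gemini summarization steps are left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words (hypothetical chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build an L2-normalized TF-IDF vector (as a dict) over terms known to the index."""
    tf = Counter(t for t in tokens if t in idf)
    weights = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def build_index(chunks):
    """Return (idf, vectors) for a list of text chunks."""
    tokenized = [tokenize(c) for c in chunks]
    n = len(chunks)
    # Document frequency: number of chunks containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = [vectorize(toks, idf) for toks in tokenized]
    return idf, vectors


def top_k(query, chunks, idf, vectors, k=2):
    """Rank chunks by cosine similarity to the query and return the best k with scores."""
    q = vectorize(tokenize(query), idf)
    scores = [sum(w * v.get(t, 0.0) for t, w in q.items()) for v in vectors]
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [(chunks[i], scores[i]) for i in ranked[:k]]


# Invented sample sections standing in for parsed civil code text.
sections = [
    "A lease is a contract by which one party grants another the use of property for rent.",
    "A will must be signed by the testator in the presence of two witnesses.",
    "Parents are obliged to support their minor children.",
]
idf, vectors = build_index(sections)
results = top_k("How many witnesses must sign a will?", sections, idf, vectors, k=1)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

In a pipeline like the one described, the top-ranked sections would then be passed to the summarization step. A library vectorizer could replace the hand-rolled TF-IDF here; the pure-Python version is used only to keep the sketch self-contained.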

Automated Credit Memos

Led a team of interns to automate the generation of SBA credit memos and commitment letters using VBA and Python with LLM input. Reduced manual effort and streamlined underwriting.
Tech: VBA, Python, OpenAI API

Skills

Languages
Python, SQL, VBA, HTML, CSS, JavaScript
Spanish (B2), English (Native)

ML and AI
LangChain, OpenAI API, Gemini API, FAISS, PyTorch, Keras, Scikit-learn, Transformers (Hugging Face), SentenceTransformers, NLP

Data Tools
Pandas, NumPy, Polars, Seaborn, Matplotlib, Plotly, Tableau, Power BI, Advanced Excel

Cloud and Deployment
AWS, Streamlit Cloud, Docker, Hugging Face Spaces, Render

Big Data Technologies
Apache Spark, Apache Kafka, Apache NiFi, MinIO

Education

IE University
Master in Business Analytics and Data Science, Specialization: Advanced AI
Madrid, Spain 2024 – 2025

University of St Andrews
BSc (Hons) in Management with an emphasis in Statistics
Scotland, UK 2019 – 2023

What I'm Learning Now

  • A/B Testing of Prompt Engineering
  • Use of GenAI tools to make internal decisions in automated processes
  • Scaling RAG pipelines with complex databases to store all company data/metadata
  • Using vector databases and model fine-tuning

Let's Connect

Reach out via email, connect with me on LinkedIn, or check out what I’m building here on GitHub.

Pinned

  1. DeepDiligence

    Automatic competitor analysis from reports filed within the SEC EDGAR database.

    Python

  2. mrtngo/MLOps_Group_Project

    Python

  3. sedewind/Stock_prices

    Jupyter Notebook

  4. santirhbf/NLP_Project

    Python