pranshuchaurasia/ReadMe.md

Pranshu Chaurasia

Data Scientist — Responsible & Applied AI Research @ SocGen AI

Specializing in Document Intelligence and Large Language Models. I build end-to-end AI solutions for complex document processing, RAG systems, and information retrieval at scale.

About Me

I am a Data Scientist at SocGen AI, focused on accelerating AI integration to automate and optimize banking processes. My work emphasizes responsible AI deployment, regulatory compliance (EU AI Act), and building innovative solutions with a strong focus on operational efficiency.

My core expertise involves:

  • Document AI: Multimodal transformers and layout analysis.
  • Information Retrieval: Enterprise-grade indexing systems and vector search.
  • Generative AI: LLM reasoning, Chain of Thought approaches, and RAG architectures.

Technical Skills

Languages

  • Python, SQL

NLP & LLMs

  • RAG Architectures, LLMOps
  • Prompt Engineering, Fine-tuning (PEFT, LoRA), RLHF
  • Frameworks: LangChain, LlamaIndex

Machine Learning & Frameworks

  • PyTorch, TensorFlow, Scikit-learn
  • Computer Vision: Document Layout Analysis, OCR, Vision Transformers (ViT), YOLO

Data Infrastructure & MLOps

  • Vector Databases: Qdrant, Vespa, FAISS, PostgreSQL
  • Cloud & Operations: AWS, Azure AI Studio, Docker, Git

Current Focus

  • Experimenting with reasoning models and advanced inference strategies.
  • Improving efficiency in document indexing and retrieval systems.
  • Exploring multimodal architectures for enhanced document understanding.

Connect



Popular repositories

  1. image-indexing-and-retrival-with-qdrant (Public)

    Code for efficient image indexing and retrieval with Qdrant, using models such as ColPali, ColQwen, VDR-2B-Multi-V1, and Jina Embeddings v4, enhancing multimodal search capab…

    Python 8 1

  2. layout-pdf-extraction (Public)

    Python 1

  3. Python-Classes-From-Basic-to-Advanced-Production-Usage (Public)

    1

  4. pranshuchaurasia (Public)

    Config files for my GitHub profile.

  5. iNeuron-Full-Stack-Data-Science-Bootcamp-Course-Notes (Public)

    data-science-bootcamp

    Jupyter Notebook

  6. iNeuron-Full-Stack-Data-Science-Bootcamp-Assignment (Public)

    Data science bootcamp assignments

    Jupyter Notebook