saradune6/README.md

👋 Hi, I'm Sara Choudhary

🚀 AI Engineer | Data Scientist | LLM Engineer | Deep Learning & Transformers Enthusiast

I specialize in LLM Agents, Deep Learning, and Transformers, working on AI-driven solutions for fintech, analytics, and automation. My expertise spans NLP, ML models, and scalable data solutions that power intelligent systems.


🌍 Connect with Me LinkedIn


🏒 Work Experience

Associate Data Scientist III

📍 CARS24 | Gurugram, India | Mar 2025 - Present
✅ Aadhaar Masking Automation

  • Built a blurriness detection model using Swin Transformer to assess scanned document quality.
  • Automated Aadhaar number and QR code masking using RapidOCR, PyTesseract, and Google Vision API.
  • Enhanced KYC automation with real-time PII redaction, reducing manual effort.
  • Delivered ₹10L/month cost savings through efficient document processing.
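
As a rough illustration of the number-masking step, the sketch below masks Aadhaar-like numbers in text that OCR has already extracted. It is not the production pipeline: the pattern, the `mask_aadhaar` name, and the choice to keep only the last four digits are assumptions made for this example. Image-level masking and QR code handling are not shown.

```python
import re

# Assumed pattern: 12 digits, optionally grouped 4-4-4 with spaces or hyphens.
AADHAAR_PATTERN = re.compile(r"\b(\d{4})[ -]?(\d{4})[ -]?(\d{4})\b")


def mask_aadhaar(text: str) -> str:
    """Replace the first eight digits of each Aadhaar-like number with X.

    Hypothetical helper for illustration; only the last four digits are kept.
    """
    return AADHAAR_PATTERN.sub(lambda m: f"XXXX XXXX {m.group(3)}", text)


if __name__ == "__main__":
    print(mask_aadhaar("Aadhaar No: 1234 5678 9012"))
```

A text-only pattern like this can also match other 12-digit strings, so a real system would likely combine it with layout or checksum validation.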

✅ FCU (Fraud Control Unit) Automation

  • Designed a fully automated FCU pipeline combining image quality checks and OCR-based data extraction.
  • Built parallel Selenium automation APIs to validate user identity via government portals.
  • Integrated Google Gemini API for real-time CAPTCHA solving in the verification flow.
  • Improved fraud detection accuracy, resulting in ₹85L/year cost savings.

✅ Insurance Propensity Model

  • Developed and deployed a CatBoost classification model to predict insurance purchase likelihood.
  • Achieved a lift of 2 across deciles, enabling targeted sales interventions.
  • Drove ₹50L/month incremental revenue impact through improved targeting.
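
The README does not say how lift was measured. One common reading is the conversion rate within a score decile divided by the overall conversion rate. The sketch below computes that quantity in plain Python; the `decile_lift` name and the toy data are assumptions for illustration, not the model's actual evaluation code.

```python
def decile_lift(scores, labels, n_bins=10):
    """Return lift per bin, highest-scored bin first.

    Lift = (positive rate inside the bin) / (overall positive rate).
    Hypothetical helper; inputs are parallel lists of scores and 0/1 labels.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no positives")

    # Rank customers from most to least likely to purchase.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    lifts = []
    for i in range(n_bins):
        bucket = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        if not bucket:
            lifts.append(float("nan"))  # fewer rows than bins
            continue
        rate = sum(label for _, label in bucket) / len(bucket)
        lifts.append(rate / overall_rate)
    return lifts


if __name__ == "__main__":
    toy_scores = [i / 20 for i in range(20)]
    toy_labels = [1 if i >= 16 else 0 for i in range(20)]
    print(decile_lift(toy_scores, toy_labels))
```

A lift above 1 in the top bins means the model concentrates likely buyers there, which is what makes targeted outreach worthwhile.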

✅ Feature Mart by Agentic AI

  • Built an Agentic AI–powered feature mart with 10,000+ features across Bureau, Buy, Sell, Marketing, Calling, and Service data.
  • Automated Snowflake queries, data flattening, and LLM-driven feature creation, cutting analytics turnaround by 40%.
  • Delivered validated, LLM-generated features improving model accuracy and campaign targeting effectiveness.

AI Engineer

📍 Legal-Pythia | Glasgow, UK | Jan 2025 - Mar 2025
✅ Chatbot Development for FinTech

  • Built an AI chatbot using LangChain & RAG for financial queries.
  • Integrated Hugging Face & Ollama for enhanced contextual understanding.
  • Developed APIs for seamless integration with financial systems.

✅ Automated Reconciliation & Analytics

  • Automated transaction reconciliation using BigQuery, SQL, and Python.
  • Designed Looker dashboards for real-time insights and anomaly detection.

Data Scientist

📍 BharatPe | Gurugram, India | Jul 2024 - Jan 2025
✅ KYC Stack (Name Matching with PAN & Aadhaar)

  • Built a BERT-based name-matching model, improving KYC accuracy by 15%.
  • Integrated Siamese BERT & FuzzyWuzzy for better transliteration handling.
  • Automated real-time KYC workflows, reducing verification time by 20%.
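
To give a feel for the fuzzy string-matching side of name matching, here is a minimal sketch that uses Python's standard-library `difflib` as a stand-in for FuzzyWuzzy. It does not reproduce the BERT or Siamese BERT models. The `normalize_name` and `name_similarity` helpers and the token-sorting step are assumptions for this example.

```python
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name)
    return " ".join(sorted(cleaned.lower().split()))


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


if __name__ == "__main__":
    # Same tokens in a different order, then a common spelling variant.
    print(name_similarity("Sara Choudhary", "CHOUDHARY SARA"))
    print(name_similarity("Sara Choudhary", "Sara Chaudhary"))
```

Character-level ratios like this handle spelling variants reasonably well but know nothing about phonetics or scripts, which is one motivation for adding a learned model on top.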

✅ Loan Propensity Model

  • Developed an XGBoost-based loan propensity model for 400,000+ merchants.
  • Achieved an AUC of 0.80-0.85, increasing loan conversions by 6%.

✅ Speaker Notification System

  • Built an NLP-powered notification system using Mistral & GLiNER for entity recognition.
  • Replaced regex-based classification, significantly improving alert accuracy.

💻 Tech Stack

Languages & Tools I Use:

Python, PowerShell, Windows Terminal, Bash Script, JavaScript, AWS, Google Cloud, Django, FastAPI, Flask, OpenCV, PyTorch, TensorFlow, Docker, Git, GitHub, GitLab, MySQL, Microsoft SQL Server, Canva, Matplotlib, NumPy, Pandas, Plotly, scikit-learn, SciPy, Power BI




Pinned

  1. Agentic-Chatbot (Public, Python)
  2. Chatgbt-Clone (Public, Python)
  3. BlindSight-AI-Scene-Describer (Public, Python)
  4. AI_Conversation_Coach (Public, HTML)