prakashjayy/README.md

Hi there 👋

My name is Prakash,

Currently, I am an AI Research Scientist at invideo.ai, where I have led the Generative AI team for the last 1.5 years, building multimodal systems. Previously:

  • At qure.ai, I was Senior Director - Data Science, leading the 3D (CT scan: chest and brain) team for 3 years.
  • At fractal.ai, I worked as a Data Scientist for one year and then as a Senior Data Scientist for 3 years.

You can find me at

X (Twitter) LinkedIn Medium

About me 🕴️

I have been working in the deep learning field for over 9 years. I enjoy learning things from scratch and love writing and teaching about them. I am known for my attention to detail—you'll often find me documenting my findings in thorough reports. For any project, I start by gathering information, collecting data, and establishing a solid evaluation framework. I focus on understanding the fundamentals and take a mathematical approach to every problem.

Projects I worked on that went into production:

  • Talking heads: Multimodal generative AI system. Trained GAN models on 5k+ hours of audio+video data. Currently experimenting with diffusion foundation models. [2025]
  • SyncNet: Contrastive learning, CLIP-type model for audio+video alignment. Trained on 10k+ hours of audio+video data. [2024]
  • Identifying defects from product X-rays (non-destructive testing): self-supervision (200k+ images) + segmentation (25k+ images).
  • 3D object detection: Finding nodules on Chest CT. Trained on 200k CT Scans. [2023]
  • 2D object detection: Identifying humans in drone footage from thermal and normal cameras. Trained on 100+ hours of data. [2022]
  • 2D object detection and long-tail classification: Quantifying brand presence on retail shelves, detecting 200+ objects from a single image and identifying 5000+ SKUs. Trained on 100k+ images. [2020-21]
  • Image classification: Identifying diabetic retinopathy from eye images. Trained on 50k images. [2018] This was more of a research project.

Competitions

Below are some competitions where I finished in the Top-25.

Blogs

I have written the following blogs on various platforms.

  • All my generative AI blogs are written here, covering diffusion, flow, score, and GAN-based models in detail.
  • Some of my blogs on foundation models are written here.

GenAI

Foundation models

Object detection

Image classification

Others

Engineering

Languages and tools

linux, docker, python, pytorch

  • 💼 Any freelance work? Do reach out by email :)
  • 💬 Ask me about anything; I am happy to help.

Pinned

  1. computer_vision Public

    A principled approach to improving computer vision models.

    Jupyter Notebook 4

  2. av_july_2017 Public

    Work on AV July 2017 Fractal Hiring Hackathon.

    Python 6 3

  3. av_mckinesy_recommendation_challenge Public

    Analytics Vidhya McKinsey Analytics Recommendation Challenge. Uses neural networks.

    Python 6 1

  4. medium_blogs Public

    Medium blog supplementary material | Backprop | ResNet & ResNeXt | RNN

    Jupyter Notebook 72 41

  5. Time_Series_Course Public archive

    Time Series course material

    Jupyter Notebook