Multimodal

You Do the Math: Fine Tuning Multimodal Models (CLIP) to Match Cartoon Images to Joke Captions
Learn how to fine-tune multimodal models like CLIP to match images to text captions.
  • Dave Berenbaum
  • Sep 12, 2024
  • 9 min read