Artificial Intelligence

As GenAI Fever Fades - Time to Prioritize Robust Engineering Over Overblown Promises
Improved engineering and data management will carry GenAI into maturity.
  • Dmitry Petrov
  • Oct 23, 2024
  • 3 min read
Scalable PDF Document Processing with DataChain and Unstructured.io
Extract and parse text from documents and create vector embeddings in a scalable, distributed way (in fewer than 70 lines of code).
  • Tibor Mach
  • Sep 30, 2024
  • 7 min read
Post-modern AI Data Stack
How and why generative AI will change the modern data stack.
  • Daniel Kharitonov
  • Sep 24, 2024
  • 7 min read
You Do the Math: Fine Tuning Multimodal Models (CLIP) to Match Cartoon Images to Joke Captions
Learn how to fine-tune multimodal models like CLIP to match images to text captions.
  • Dave Berenbaum
  • Sep 12, 2024
  • 9 min read
Announcing DataChain
Introducing DataChain, a new open-source tool to curate and process unstructured data using local ML models and LLM calls.
  • Dmitry Petrov
  • Jul 23, 2024
  • 4 min read