    Iterative Blog

    Find DataChain and DVC news, findings, interesting reads, community takeaways, and deep dives into machine learning workflows, from data versioning and processing to model productionization.

    As GenAI Fever Fades - Time to Prioritize Robust Engineering Over Overblown Promises
    Improved Engineering and Data Management will be what carries GenAI into maturity
    • Dmitry Petrov
    • Oct 23, 2024 (3 min read)
    Scalable PDF Document Processing with DataChain and Unstructured.io
    Extract and parse text from documents and create vector embeddings in a scalable and distributed way (in fewer than 70 lines of code).
    • Tibor Mach
    • Sep 30, 2024 (7 min read)
    Post-modern AI Data Stack
    How and Why Generative AI will change the modern data stack.
    • Daniel Kharitonov
    • Sep 24, 2024 (7 min read)
    You Do the Math: Fine Tuning Multimodal Models (CLIP) to Match Cartoon Images to Joke Captions
    Learn how to fine-tune multimodal models like CLIP to match images to text captions.
    • Dave Berenbaum
    • Sep 12, 2024 (9 min read)
    Enforcing JSON Outputs in Commercial LLMs
    The results of our tests on the structured outputs of Google Gemini Pro, Anthropic Claude, and OpenAI GPT. DataChain was used for evaluation.
    • Daniel Kharitonov
    • Sep 06, 2024 (10 min read)
    Announcing DataChain
    Introducing DataChain, a new open-source tool to curate and process unstructured data using local ML models and LLM calls.
    • Dmitry Petrov
    • Jul 23, 2024 (4 min read)
    Dataset Factory - A Toolchain for Generative Computer Vision Datasets
    Learn about our latest approach to mastering your unstructured data and metadata.
    • Jeny De Figueiredo
    • Mar 25, 2024 (1 min read)