DataChain Blog

Here you will find DataChain news, findings, interesting reads, community takeaways, and deep dives into machine learning workflows, from data versioning and processing to model productionization.

Enforcing JSON Outputs in Commercial LLMs
The results of our tests on the structured outputs of Google Gemini Pro, Anthropic Claude, and OpenAI GPT, with DataChain used for evaluation.
  • Daniel Kharitonov
  • Sep 06, 2024
  • 10 min read
Announcing DataChain
Introducing DataChain - a new open-source tool to curate and process unstructured data using local ML models and LLM calls.
  • Dmitry Petrov
  • Jul 23, 2024
  • 4 min read
Dataset Factory - A Toolchain for Generative Computer Vision Datasets
Learn about our latest approach to mastering your unstructured data and metadata.
  • Jeny De Figueiredo
  • Mar 25, 2024
  • 1 min read