LLM

Extract and parse text from documents and create vector embeddings in a scalable, distributed way (in fewer than 70 lines of code).

The results of our tests on the structured outputs of Google Gemini Pro, Anthropic Claude, and OpenAI GPT, with DataChain used for the evaluation.

Introducing DataChain - a new open-source tool to curate and process unstructured data using local ML models and LLM calls.