| | Open Source (Free) | Teams (Contact us) |
|---|---|---|
| **Core Features** | ||
| Unstructured Storages | ||
| Unstructured Data Types | ||
| No Data Duplication | ||
| Metadata Extraction | ||
| Structured Data in DBs | ||
| Data Versioning & Lineage | ||
| Semantic Search & Filters | ||
| Flexible Python Pipelines | ||
| Parallel Processing | ||
| **High-Scale Datasets** | ||
| Size of data | Terabyte scale | Petabyte scale |
| Dataset Cardinality | Up to 30M items | Up to 1B+ items |
| Metadata engine | ||
| **Coding Copilot** | ||
| Cursor & GitHub Copilot | Coming soon | Early Access |
| Lineage-Aware Autocomplete | Coming soon | Early Access |
| MCP | Coming soon | Early Access |
| **High-Scale Processing** | ||
| Distributed Processing | ||
| Cloud Support | ||
| Auto-scaling | ||
| **Team Collaboration** | ||
| Shared Dataset Registry | ||
| Web UI | ||
| SSO/SAML | ||
| RBAC for Data | ||
| **Deployments** | ||
| Local | ||
| SaaS | ||
| Bring Your Own Cloud | ||
| On-Premise | ||