Find DataChain and DVC news, findings, interesting reads, community takeaways, and deep dives into machine learning workflows, from data versioning and processing to model productionization.
Monthly updates are here! You will find a link to Chip Huyen's new book, great guides and frameworks on the iterative nature of AI, tons of company news, Dmitry on TFIR, use cases beyond machine learning, and more! Welcome to May!
In this final part, we will focus on leveraging cloud infrastructure with CML; enabling automatic reporting (graphs, images, reports and tables with performance metrics) for PRs; and the eventual deployment process.
In part 1, we talked about effective management and versioning of large datasets and the creation of reproducible ML pipelines.
Here we'll learn about experiment management: generating many experiments by tweaking configurations and hyperparameters; comparing experiments based on their performance metrics; and persisting the most promising ones.
In most cases, training a well-performing Computer Vision (CV) model is not the hardest part of building a Computer Vision-based system. The hardest part is usually incorporating this model into a maintainable application that runs in a production environment and brings value to our customers and our business.
In this guide, we will show how you can use CML to automatically retrain a model and save its outputs to your GitHub repository using a provisioned AWS EC2 runner.