This holiday season, show your loved ones you care with our new shirt.
News
Welcome to the December Heartbeat! Let's dive in with some news from the team.
We're still hiring
Our search continues for two roles:
- A Senior Software Engineer for the core DVC team: someone with strong Python development skills who can build and ship essential DVC features.
- A Developer Advocate to support and inspire developers by creating new content like blogs, tutorials, and videos, and to lead outreach through meetups and conferences.
Does this sound like you or someone you know? Get in touch!
Video docs complete!
As you may have heard last month, we've been working on adding complete video docs to the "Getting Started" section of the DVC site. We now have 100% coverage! We have videos that mirror the tutorials for:
- Data versioning - how to use Git and DVC together to track different versions of a dataset
- Data access - how to share models and datasets across projects and environments
- Pipelines - how to create reproducible pipelines to transform datasets to features to models
- Experiments - how to do a git diff for models that compares and visualizes metrics
The full playlist is on our YouTube channel, where, by the way, we've recently passed 2,000 subscribers! Thanks so much for your support. There's much more coming up soon.
Collaboration with GitLab
We recently published a new blog post with GitLab, all about using CML with GitLab CI.
The team behind https://t.co/At942BC7sF released an open source project called CML (continuous machine learning).
— 🦊 GitLab (@gitlab) December 3, 2020
Learn more about GitLab ➕ @DVCorg! https://t.co/eD8loo4mT5
You may notice that the tweet spelled our name differently, and since Twitter doesn't have an edit button, I think that means we're "Interative" now. Hurry up and get your merch!
Workshops
We gave a workshop at a virtual meetup held by the Toronto Machine Learning Society, and you can catch the video recording if you missed it. This workshop was all about getting started with GitHub Actions and CML! It starts with a high-level overview and then moves into live coding.
From the community
There's no shortage of cool things to report from the community:
The DVC Udemy Course
Now you can learn the fundamentals of machine learning engineering, from experiment tracking to data management to continuous integration, with DVC and Udemy! Data scientists and DVC ambassadors Mikhail Rozhkov and Marcel Ribeiro-Dantas created a course full of practical tips and tricks for learners of all levels.
Machine Learning Experiments and Engineering with DVC
A proposal for Git-flow with DVC
Fabian Rabe at Universität Augsburg wrote a killer doc about his team's tried-and-true approach to creating a workflow for a DVC project. He writes,
Over the past couple of months we have started using DVC in our small team. With a handful of developers all coding, training models & committing in the same repository, we soon realized the need for a workflow.
The post outlines three strategies his team adopted:
- Create a "debugging dataset" containing a subset of your data, with which you can test your complete DVC pipeline locally on a developer's machine
- Use CI-Runners to execute the DVC pipeline on the full dataset
- Adopt a naming convention for Git branches that correspond to machine learning experiments, in addition to the usual feature branches
Agree? Disagree? Fabian is actively soliciting feedback on his proposal (and possible solutions for some unresolved issues), so please read and chime in on our discussion board.
Git Flow for DVC
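To make the first and third strategies a little more concrete, here is a minimal Python sketch. It is not taken from Fabian's post: the function names, the `exp/` and `feature/` branch prefixes, and the 5% default are all assumptions chosen for illustration, and his team's actual convention may look quite different.

```python
import random

# Hypothetical branch prefixes; the proposal's actual naming convention may differ.
EXPERIMENT_PREFIX = "exp/"
FEATURE_PREFIX = "feature/"


def sample_debug_subset(paths, fraction=0.05, seed=0):
    """Pick a small, reproducible subset of data files for local pipeline tests."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(paths)  # sort first so the result does not depend on input order
    if not ordered:
        return []
    count = max(1, round(len(ordered) * fraction))  # always keep at least one file
    rng = random.Random(seed)  # fixed seed: every developer gets the same subset
    return sorted(rng.sample(ordered, count))


def classify_branch(name):
    """Label a Git branch as an experiment, a feature, or something else."""
    if name.startswith(EXPERIMENT_PREFIX):
        return "experiment"
    if name.startswith(FEATURE_PREFIX):
        return "feature"
    return "other"
```

A team could, for example, write the sampled file list into a small directory that a developer runs the pipeline against locally, while a CI job uses `classify_branch` to decide whether a push should trigger a run on the full dataset.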
Channel 9 talks Machine Learning and Python
The AI Show on Channel 9, part of the Microsoft DevRel universe, put out an episode all about ML and scientific computing with Python featuring Tania Allard and Seth Juarez. The episode covers how DVC can fit into this development toolkit, so check it out!
A nice tweet
We'll end on a tweet we love:
I learned quite a bit in @visenger's talk about 10 fundamental practices for Machine Learning engineering.
— Joy Heron (@iamjoyheron) December 9, 2020
Here is my #sketchnote #INNOQTechnologyDay pic.twitter.com/tQjRrJq993
This beautiful diagram, made by Joy Heron in response to a talk by Dr. Larysa Visengeriyeva about MLOps, is a wonderful encapsulation of the many considerations (at many scales) that go into ML engineering. Do you see DVC in there? 🕵️
Thank you for reading, and happy holidays to you! ❄️ 🎁 ☃️