What I Do as a Machine Learning Engineer

Henrique Peixoto Machado
3 min read · Jan 29, 2021

The data field is relatively new, and I still feel that many people trying to break into it are unsure about what exactly each role does.

So, to explain what a machine learning engineer does, I first need to explain the other two most common roles in data science.

Data engineer:

Let’s start with the data engineer. I see a lot of companies wanting to hire a data scientist when what they actually need is a data engineer. Data engineers are responsible for structuring where and how the data will be stored. They do this by building data pipelines that load the data into a tool where it is easy to access, and that correct mistakes in the data before it is stored.
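To make the idea of "correct mistakes, then store" more concrete, here is a minimal sketch of such a pipeline. Everything in it is an assumption made for illustration: the `customer` and `amount` fields, the cleaning rules, and the use of an in-memory SQLite database as the storage tool. A real pipeline would depend on the company's own data and infrastructure.

```python
import sqlite3

def clean_row(row):
    """Fix common mistakes in one raw record; return None if it cannot be used."""
    # Hypothetical rules: trim and capitalize names, accept a decimal comma.
    name = row.get("customer", "").strip().title()
    try:
        amount = float(str(row.get("amount", "")).replace(",", "."))
    except ValueError:
        return None  # the amount cannot be parsed, so the record is dropped
    if not name or amount < 0:
        return None
    return (name, amount)

def run_pipeline(raw_rows, conn):
    """Clean each raw record and store the valid ones in a queryable table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    cleaned = [r for r in map(clean_row, raw_rows) if r is not None]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    conn.commit()
    return len(cleaned)

# Made-up input records with typical problems.
raw = [
    {"customer": "  alice ", "amount": "10,50"},
    {"customer": "bob", "amount": "not a number"},
    {"customer": "", "amount": "3"},
]
conn = sqlite3.connect(":memory:")
stored = run_pipeline(raw, conn)
```

Only the first record survives the cleaning step here; the other two are dropped instead of being stored with bad values.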

It’s worth pointing out that this is one of the most important roles in the whole data world, because without well-structured, available data, the machine learning engineer and the data scientist won’t be of much help.

I was once asked to build a time series model to predict monthly sales. I said that I needed at least 2 years of data, and they said no problem. But when they showed me the data, it was all in Excel files, about 100 of them. I spent a month organizing everything into a single file and about a week actually building the model.
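For readers wondering what that kind of consolidation can look like in code, here is a rough sketch. It is not what I did on that project: it assumes the spreadsheets have already been exported to CSV, that they all share the same columns, and the file names and the `month` and `sales` columns are made up for the example.

```python
import csv
import tempfile
from pathlib import Path

def combine_csv_files(paths, out_path):
    """Append the rows of many CSV files (same columns) into a single file."""
    total = 0
    writer = None
    with open(out_path, "w", newline="") as out:
        for path in sorted(paths):
            with open(path, newline="") as src:
                reader = csv.DictReader(src)
                if writer is None:
                    # Take the header from the first file and write it once.
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    total += 1
    return total

# Demo with two tiny, made-up monthly files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "2019-01.csv").write_text("month,sales\n2019-01,100\n")
    (folder / "2019-02.csv").write_text("month,sales\n2019-02,120\n")
    combined = folder / "all_sales.csv"
    n_rows = combine_csv_files(folder.glob("2019-*.csv"), combined)
    combined_text = combined.read_text()
```

In practice most of the time goes into the part this sketch skips: files whose columns, date formats, or units do not match each other.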

Data scientist:

The second well-known data science role is the data scientist. In most companies today, data scientists have only one goal: to build the best machine learning model they can think of. Most companies have custom problems, so they need custom solutions, and believe me, that is not always easy.

Machine learning engineer:

With these two roles explained, it is easy to say what a machine learning engineer does: a little bit of both.

When most data scientists today build an awesome custom model, they don’t think about how it will be put into production. Some still write code in R that won’t go into production. So after they build the model, the machine learning engineer is responsible for making it talk to all the other applications in the cloud, making sure the model itself stays the same after those changes while it also works with the cloud tools.

To make things clearer, let’s imagine an app that can identify faces. Here is how this product would be built:

1- First, the DevOps team would build the app, which would send the picture to the cloud.

2- In the cloud, the data engineer would create a backend to save the pictures and also store the app that was created.

3- The data scientist would focus only on building the best face recognition model.

4- After all of that, a machine learning engineer would put that model into the cloud and structure a pipeline so that every time a photo arrives in the cloud, it triggers the model and sends the response back to the app.
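Step 4 can be sketched in a few lines of code. This is only an illustration under loud assumptions: `Photo`, `fake_face_model`, and `PhotoPipeline` are hypothetical names, the "model" is a placeholder and not a real face recognizer, and a real system would be wired to the cloud provider's storage events and the app's API instead of a plain Python list.

```python
from dataclasses import dataclass

@dataclass
class Photo:
    photo_id: str
    pixels: bytes

def fake_face_model(photo):
    """Stand-in for the data scientist's model (placeholder logic only)."""
    return {"photo_id": photo.photo_id, "face_found": len(photo.pixels) > 0}

class PhotoPipeline:
    """Connects 'a photo arrived' to 'run the model' to 'answer the app'."""

    def __init__(self, model, send_response):
        self.model = model
        self.send_response = send_response

    def on_photo_uploaded(self, photo):
        """Called every time a new photo lands in storage."""
        prediction = self.model(photo)
        self.send_response(prediction)
        return prediction

# Here the "app" is just a list that collects the responses.
responses = []
pipeline = PhotoPipeline(model=fake_face_model, send_response=responses.append)
pipeline.on_photo_uploaded(Photo("img-001", b"\x89PNG"))
pipeline.on_photo_uploaded(Photo("img-002", b""))
```

Because the model and the response channel are passed in, the data scientist can swap in a better model later without the pipeline code changing.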

I am one of those people who love a challenge, so I love that part of my job today is to deal a little bit with everything that is being built. But I have to say, if you want to become a good machine learning engineer, you will have to keep learning all kinds of different skills. On some projects you will dive into DevOps, on others you will lose sleep over a pipeline, and on others you will go crazy deciding which model to use.

I hope this made it a little clearer what each data role does, and I hope you enjoyed it. Until next time!
