The data field is relatively new, and I still feel that many people trying to break into it are unsure about what each role actually does.

So, to explain what a machine learning engineer does, I first need to explain the other two most common roles in data science.

Data engineering:

Let’s start with data engineering. I see a lot of companies wanting to hire a data scientist when what they actually need is a data engineer. Data engineers are responsible for structuring where and how the data will be stored. …


Data Fusion is a Google Cloud solution for building data pipelines without any code. Although the solution has some limitations (so far), when used together with Cloud Composer it becomes a really powerful tool for building data lakes.


Let’s start talking about the problems with Data Fusion:

For me, the biggest limitation of Data Fusion is that you cannot pass dynamic parameters. The only dynamic parameters it accepts are dates, using the macro below:

${logicalStartTime(yyyy-MM-dd)}

This returns the current date. If you want something like yesterday, you can get it like this:

${logicalStartTime(dd/MM/yyyy,1d)}

But apart from that, you cannot pass dynamic parameters…
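To make the two date macros above concrete, here is a minimal Python sketch of the values they would produce for a given run. The helper `logical_start_time` and the sample run time are hypothetical. The helper only imitates the behavior described in this post (format the start time, optionally minus an offset in days). It is not Data Fusion’s implementation, and it takes Python `strftime` patterns rather than the Java-style patterns the macro uses.

```python
from datetime import datetime, timedelta


def logical_start_time(run_time: datetime, fmt: str, offset_days: int = 0) -> str:
    """Format run_time minus offset_days (hypothetical stand-in for the macro)."""
    return (run_time - timedelta(days=offset_days)).strftime(fmt)


# Assumed run time, chosen only for illustration.
run_time = datetime(2021, 3, 15, 8, 30)

# Roughly what ${logicalStartTime(yyyy-MM-dd)} would give for this run.
today = logical_start_time(run_time, "%Y-%m-%d")

# Roughly what ${logicalStartTime(dd/MM/yyyy,1d)} would give: one day earlier.
yesterday = logical_start_time(run_time, "%d/%m/%Y", offset_days=1)

print(today)
print(yesterday)
```

For the assumed run time, the first value is the run date and the second is the day before it, each in the pattern passed to the macro.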


If you work in the data world, the sentence you are going to hear most is that it’s 90% data and 10% model. But when you start looking at tutorials on the internet, 99% are about models. I’ve seen several data scientists who knew how to build a great model and were total newbies when they needed to check whether there were missing values in the columns. So this post will be about the most important part of data: the data.
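As a small illustration of the missing-value check mentioned above, here is a sketch that counts empty cells per column using only the Python standard library. The sample data and the helper name `count_missing` are made up for this example, and treating an empty string as "missing" is an assumption. Real datasets may encode missing values differently (for example as `NA` or `-1`).

```python
import csv
import io

# Made-up sample data; empty cells stand for missing values.
raw = """age,city,income
34,Lisbon,52000
,Porto,
29,,41000
"""


def count_missing(rows, columns):
    """Return {column: number of empty or whitespace-only cells}."""
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if (row.get(col) or "").strip() == "":
                missing[col] += 1
    return missing


reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)
missing = count_missing(rows, reader.fieldnames)
print(missing)
```

With pandas, the same per-column count is commonly obtained with `df.isna().sum()`.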


1- When exploring a new dataset, the first thing you should do is to check if the inputs…


I know that for the most experienced data scientists this question is as old as data science itself, but for beginners it is a real issue, so let’s begin the fight! 👊


In the beginning, when there was no data science as we know it today, statisticians used R for their analyses. So in the early 2000s, when computers became powerful enough and big datasets were being created, it was only natural that R was the default language for data. …


It’s normal at the beginning of the data science journey to start with some really basic stuff like housing price trends, then study a little bit of time series, and after that go straight into computer vision and other really hard stuff.

So far, as a data analyst, I have never faced a problem in the workplace that I solved with computer vision, but every week there is a time series problem.

So let’s make a deep dive here into time series:


The life of all data beginners

First of all, this…



So if you are going to take the TensorFlow certificate exam, the first thing you have to do is prepare the environment as described in the handbook. Believe it or not, I lost quite some time doing this, especially because I wasn’t used to PyCharm (Anaconda for life).

Here are some tips to prepare:

First things first, you need to update pip to the latest version. You can do this using the arrow under:

File -> Settings -> Project -> Interpreter:



Last week I wrote posts 1, 2, and 3 about computer vision and how it works. On reflection, though, I left out a very important technique for ConvNets: transfer learning.

In computer vision, the big challenge is finding large datasets of correctly labeled images. There have already been some attempts to use digitally created images:


In this post I will focus on the code used to create a ConvNet. If you want more details on the subject, I recommend reading post 1 and post 2, where I explain the concepts behind what is being done here.

For this example, I picked a Kaggle competition called Dogs vs. Cats. The goal of the challenge was quite simple: they provided 10k photos of dogs and cats, and we had to build a model that identified whether each photo showed a dog or a cat.

First, let’s import the packages we will need for this project:

import matplotlib.pyplot as…


This week I finished a specialization in computer vision and noticed that most of the content I studied was in English. There is a very large gap in content in our language (Portuguese), and to help with the production of national content, I will share what I learned in this course by deeplearning.ai.

I will split it into three parts: the first introduces the topic, the second focuses on the concepts, and the third presents my project showing the power of the tool.

So, to begin: what is computer vision?


Seeing the world is natural for us humans; when we are born we open our eyes and…


As mentioned in the previous post, computer vision was revolutionized by advances in Convolutional Neural Networks, but what are they?

A convolutional neural network (ConvNet/CNN) is a deep learning algorithm that analyzes an image and learns various aspects of the objects in it, and is thus able to learn autonomously the features that make up an object.

Example of a CNN structure

A CNN first applies a convolutional filter to the image (RGB matrices). The convolutional filter is a matrix whose values, when multiplied with the image matrices, highlight the features the algorithm is looking for.
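To illustrate how a filter highlights a feature, here is a minimal pure-Python sketch that slides a 3x3 filter over a tiny single-channel image. The image, the vertical-edge filter, and the function name `convolve2d` are all assumptions made for this example. A real CNN works on the three RGB channels and learns its filter values during training instead of using hand-picked ones.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the elementwise products.

    As in most deep learning libraries, the kernel is not flipped
    (strictly speaking this is cross-correlation).
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark pixels (0) on the left, bright pixels (9) on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Hand-picked vertical-edge filter, used here only for illustration.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The resulting feature map is zero over the flat regions and large only where the window straddles the dark-to-bright boundary, which is the sense in which the filter "highlights" a vertical edge.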

To make it more…

Henrique Peixoto Machado

Data scientist certified by Google as a TensorFlow developer and also trying to be an intergalactic hitchhiker.
