Hi~ Hi~ everyone.

Happy belated New Year 2020. This is my first post after a long break from blogging.

This post starts a series of notes from my data science training.

Let’s begin –

What is data science for?

Imagine that we are in a restaurant. I, as a data engineer, am like a storekeeper who takes care of fresh, quality ingredients. When a cook wants something for a dish, data engineers have to validate the ingredients and supply them.

So who are the cooks? Data scientists, of course. They have mastered how to make a dish more delicious, more nutritious, and more appetizing.

Collaboration between data engineers and data scientists benefits our customers and allows them to gain insights from our data products.

Tools for data scientists

Python

Python is one of the most popular languages for data manipulation, and we can find libraries for almost everything the work needs. One thing to make sure of is that we are using Python 3, because Python 2 reaches the end of its support around the time of writing (January 2020).
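A quick way to confirm which Python we are on is to ask the interpreter itself. Here is a minimal sketch using only the standard library. The helper name `is_python3` is my own, made up for illustration, and is not part of any library.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or newer.

    `is_python3` is a hypothetical helper name used for this example only.
    """
    return version_info[0] >= 3


if is_python3():
    # sys.version starts with the version number, e.g. "3.8.1 (default, ...)"
    print("OK: running Python", sys.version.split()[0])
else:
    print("This is Python 2. Please switch to Python 3.")
```

If the second message shows up, it is probably time to install Python 3 before going further.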

Besides Python, we can use the R language as well.

Free datasets

There are lots of public datasets, and they are free to play around with. My favorite sources are Kaggle.com/dataset and GitHub.

For Git, see my old blog post about it here.

IDE (integrated development environment) and Text Editor

Many IDEs and text editors are available for Python users, as listed here. Pick the one you like.

Personally, I prefer Jupyter.

Command line

This one is optional, but it comes in handy if we use Git in the terminal. Here is my old blog post about the command line.

Next time, we will look at preliminary data exploration.

See ya~

next: Note of data science training EP 2: Pandas & Matplotlib – from a thousand mile above