Data Science

Data science is a discipline focused on studying data, particularly quantitative data, whether structured or unstructured. Many programming languages support data processing, including R, Python, SQL, and JavaScript. Python is widely used for this work and offers libraries built specifically for it, one of which is Pandas. Data processing in Python is commonly done in an interactive environment such as Jupyter Notebook, which runs code in cells and shows the results alongside them.

Pandas DataFrame

The main tabular data structure in Pandas is the DataFrame: an ordered collection of named, typed columns. It resembles a database table, where each row represents a single instance and each column represents a specific attribute. A DataFrame can also be thought of as a dictionary of lists, because each column name acts as a key that maps to the list of values in that column.
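
The dictionary-of-lists view can be sketched in a few lines. This is a minimal illustration, not part of the module's dataset: the column names and values below are made up for the example, and it assumes Pandas is installed and imported as pd.

```python
import pandas as pd

# Hypothetical sample data (not from the course dataset):
# each key is a column name, each list holds that column's values.
data = {
    "name": ["Ana", "Budi", "Citra"],
    "age": [23, 31, 27],
    "city": ["Jakarta", "Bandung", "Surabaya"],
}

# Build the DataFrame: keys become columns, list positions become rows.
df = pd.DataFrame(data)

print(df)          # the table, one row per instance
print(df.dtypes)   # the type Pandas inferred for each column
print(df.loc[0])   # a single row: one instance with all its attributes
print(df["age"])   # a single column: one attribute across all instances

# Converting back gives the same dictionary of lists.
round_trip = df.to_dict(orient="list")
```

Selecting with df["age"] returns one column, while df.loc[0] returns one row, which matches the table reading above: columns are attributes and rows are instances.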

DATASET

Download Data

MODULE

Download Module 12