File Handling

File handling refers to the management of files in a program: opening, closing, reading, writing, appending to, and copying them. Python can open a file either as text (the default) or as binary (by adding "b" to the mode). To open a file in Python, the syntax is: file = open('filename', 'mode'). The four most commonly used modes are:

"r", for reading.
"w", for writing.
"a", for appending.
"r+", for reading and writing.

Don’t forget to close a file once you have finished working with it. Calling file.close() writes any buffered data to disk and releases the file.
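The four modes listed above differ mainly in what happens to content that is already in the file. Here's a minimal sketch of that difference; the scratch file name modes_demo.txt and the temporary directory are assumptions made for illustration, so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the tutorial's files).
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")


def read_all(file_path):
    """Open a file in "r" mode and return its whole content."""
    f = open(file_path, "r")
    content = f.read()
    f.close()
    return content


# "w" creates the file, or empties it first if it already exists.
f = open(path, "w")
f.write("first")
f.close()

# "a" keeps the existing content and writes at the end.
f = open(path, "a")
f.write(" second")
f.close()
after_append = read_all(path)

# "r+" does not empty the file; right after opening, writing starts
# at the beginning and overwrites the characters that are there.
f = open(path, "r+")
f.write("FIRST")
f.close()
after_overwrite = read_all(path)

# Opening with "w" again discards everything that was in the file.
f = open(path, "w")
f.write("new")
f.close()
after_truncate = read_all(path)

print(after_append)     # first second
print(after_overwrite)  # FIRST second
print(after_truncate)   # new
```

Note that "r" and "r+" both require the file to exist already, while "w" and "a" create it when it is missing.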

Here’s a simple example:

# Write File
file = open("Test.txt", "w")
file.write("Azhar Rizki Zulma")
file.close()

# Read File
file = open("Test.txt", "r")
text = file.read()
print(text)
file.close()

# Add Text
file = open("Test.txt", "a")
file.write(" - 24 Years - Jakarta")
file.close()

# Read & Write File ("r+" allows both; this example only reads)
file = open("Test.txt", "r+")
text = file.read()
print(text)
file.close()
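The example above closes every file by hand. A common alternative is the with statement, which closes the file automatically when the block ends, even if an error occurs inside it. Here's a minimal sketch of the same write, append, and read steps; the temporary directory is an assumption added so the sketch does not touch an existing Test.txt.

```python
import os
import tempfile

# Hypothetical location: a temporary directory, so an existing Test.txt is left alone.
path = os.path.join(tempfile.mkdtemp(), "Test.txt")

# Write File: the file is closed as soon as the indented block ends.
with open(path, "w") as file:
    file.write("Azhar Rizki Zulma")

# Add Text
with open(path, "a") as file:
    file.write(" - 24 Years - Jakarta")

# Read File
with open(path, "r") as file:
    text = file.read()

print(text)         # Azhar Rizki Zulma - 24 Years - Jakarta
print(file.closed)  # True
```

With this form there is no file.close() call to forget.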

File Handling Exercises

Exercise 1

Using file handling, create a text file named Biodata.txt that stores the following user input:
Name: Your Name
Age: Your Age
Address: Your Address
Email: Your Email
Put the write and read operations in separate functions to make the program more structured.

Exercise 2

Create a program that can create a file, read a file, and append text to a file. Both the file name and the text to add come from user input. Organize the program into functions, and use branching and looping so that it keeps running until the user chooses the “close” option.

Introduction to Data Science

Data Science

Data Science is a discipline that focuses on studying data, particularly quantitative data, whether structured or unstructured. Many programming languages support data processing, including R, Python, SQL, and JavaScript. Python offers several libraries for this purpose, one of which is Pandas. Jupyter Notebook, an interactive notebook environment that runs in the browser, is a popular choice for data processing work in Python.

Pandas Data Frame

The basic data structure in Pandas is called a DataFrame, which is a collection of ordered columns, each with a name and a type. It resembles a database table, where each row represents a single record and each column represents a specific attribute. A DataFrame can also be thought of as a dictionary of lists: each key is a column name, and each list holds the values of that column.
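To make the "dictionary of lists" idea concrete, here's a small sketch that builds a DataFrame from a dictionary and then looks at it by column and by row. The two-column sample data is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical sample: each key becomes a column name, each list becomes that column.
data = {
    "Country": ["Indonesia", "Japan", "India"],
    "Capital": ["Jakarta", "Tokyo", "New Delhi"],
}
df = pd.DataFrame(data)

columns = list(df.columns)          # column names, taken from the dictionary keys
shape = df.shape                    # (number of rows, number of columns)
capitals = df["Capital"].tolist()   # one column, converted back to a plain list
first_row = df.iloc[0].to_dict()    # one row, as a {column name: value} dictionary

print(columns)    # ['Country', 'Capital']
print(shape)      # (3, 2)
print(capitals)   # ['Jakarta', 'Tokyo', 'New Delhi']
print(first_row)  # {'Country': 'Indonesia', 'Capital': 'Jakarta'}
```

All lists in the dictionary need the same length, because every row must have a value for every column.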

Here’s a basic example of data processing with Pandas. The commented-out lines show how to read the data from a CSV file and how to write the result to one:

import pandas as pd

data = {
    "Country": ["Indonesia", "Japan", "India", "China", "United States", "Brazil", "Russia"],
    "Capital": ["Jakarta", "Tokyo", "New Delhi", "Beijing", "Washington DC", "Brasilia", "Moscow"],
    "Continent": ["Asia", "Asia", "Asia", "Asia", "America", "America", "Asia"],
    "Area": [1905, 377, 3287, 9597, 9834, 8515, 17098],
    "Population": [264, 143, 1252, 1357, 329, 210, 146]
}

# df = pd.read_csv('filename.csv', index_col=0) # Used to read a CSV file
df = pd.DataFrame(data)
mean = df.Population.mean()
std = df.Area.std()

print(df)
print(mean)
print(std)

# df.to_csv('newfile1.csv') # Used to create a file from the processed data in Python
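The two commented-out lines above can be tried together as a round trip: write a DataFrame to a CSV file, then read it back. Here's a minimal sketch; the sample data, the file name countries.csv, and the temporary directory are assumptions made so that the sketch runs without an existing CSV file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, a shortened version of the table above.
df = pd.DataFrame({
    "Country": ["Indonesia", "Japan", "India"],
    "Area": [1905, 377, 3287],
})

# Hypothetical file name inside a temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "countries.csv")

# By default, to_csv also writes the row index as the first column of the file.
df.to_csv(csv_path)

# index_col=0 tells read_csv to use that first column as the index again,
# instead of loading it as an extra unnamed data column.
loaded = pd.read_csv(csv_path, index_col=0)

print(loaded)
print(loaded["Area"].mean())
```

Without index_col=0, the loaded DataFrame would contain an additional column holding the old index values.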

Data Science Exercises

Exercise 1

Create a program that reads a CSV file containing at least 10 country entries into a DataFrame and displays the mean (average) and standard deviation.

Exercise 2

Create a program that stores dummy data in a dictionary, converts it to a DataFrame, and then writes it to a CSV file using the Pandas library.

Data for Practice

Download Data