Data Analysis with Python
Cheat Sheet: Importing Data Sets
Package/Method Description Code Example
Read CSV data set
Read the CSV file containing a data set into a pandas data frame
import pandas as pd
df = pd.read_csv(<CSV_path>, header=None)  # load without a header row
df = pd.read_csv(<CSV_path>, header=0)  # load using the first row as the header
Note: The labs in this course run in a JupyterLite environment. In JupyterLite, you need to download the required file to the local
environment and then use the local path to the file as the CSV_path. If you are using JupyterLab or any other Python environment on
your local machine, you can use the URL of the required file directly as the CSV_path.
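Example (a minimal sketch): the difference between header=None and header=0. The in-memory text csv_text and its contents are made up for illustration and stand in for a real file at CSV_path.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = "make,price\naudi,13950\nbmw,16430\n"

# header=None: every line is treated as data; columns are numbered 0, 1, ...
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# header=0: the first line supplies the column names.
df_with_header = pd.read_csv(io.StringIO(csv_text), header=0)

print(df_no_header.shape)
print(list(df_with_header.columns))
```

With header=None the header line is counted as a data row, so the first frame has one more row than the second.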
Print first few entries
Print the first few entries (default 5) of the pandas data frame
df.head(n)  # n = number of entries; default 5
Print last few entries
Print the last few entries (default 5) of the pandas data frame
df.tail(n)  # n = number of entries; default 5
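Example (a minimal sketch): head and tail on a small made-up data frame. The column name "id" and the ten rows are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data frame with ten rows, ids 0 through 9.
df = pd.DataFrame({"id": range(10)})

first_three = df.head(3)   # first 3 rows
last_two = df.tail(2)      # last 2 rows
default_head = df.head()   # no argument: first 5 rows

print(first_three)
print(last_two)
```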
Assign header names
Assign appropriate header names to the data frame
df.columns = headers
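Example (a minimal sketch): assigning header names after loading a file that has no header row. The sample text and the names in headers are hypothetical; the list needs one name per column.

```python
import io

import pandas as pd

# Hypothetical header-less CSV content.
csv_text = "audi,13950\nbmw,16430\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

# One name per column, in column order (names are made up for illustration).
headers = ["make", "price"]
df.columns = headers

print(df)
```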
Replace "?" with NaN
Replace "?" entries with NaN from the NumPy library
import numpy as np
df = df.replace("?", np.nan)
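Example (a minimal sketch): replacing "?" placeholders and counting the resulting missing values. The data frame contents are made up, and the isna().sum() check is an extra step added here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data in which "?" marks a missing value.
df = pd.DataFrame({
    "make": ["audi", "bmw", "?"],
    "price": ["13950", "?", "16430"],
})

# Replace every "?" entry with NaN so pandas treats it as missing.
df = df.replace("?", np.nan)

# Count the missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```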
Retrieve data types
Retrieve the data types of the data frame columns
df.dtypes
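Example (a minimal sketch): inspecting column data types on a made-up data frame. The column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data frame with text, integer, and floating-point columns.
df = pd.DataFrame({
    "make": ["audi", "bmw"],
    "doors": [4, 2],
    "price": [13950.0, 16430.0],
})

# dtypes returns one entry per column, indexed by column name.
column_types = df.dtypes
print(column_types)
```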
Retrieve statistical description
Retrieve the statistical description of the data set. By default, only numerical columns are summarized. Use include="all" to create a summary for all variables
df.describe()  # default: numerical columns only
df.describe(include="all")  # summarize all columns
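Example (a minimal sketch): comparing the default description with include="all". The data frame is made up for illustration.

```python
import pandas as pd

# Hypothetical data frame with one text column and one numerical column.
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi"],
    "price": [13950, 16430, 17450],
})

# Default: only numerical columns are summarized.
numeric_summary = df.describe()

# include="all": text columns are summarized too (count, unique, top, freq).
full_summary = df.describe(include="all")

print(numeric_summary)
print(full_summary)
```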
Retrieve data set summary
Retrieve a concise summary of the data frame in use (columns, non-null counts, and data types)
df.info()
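Example (a minimal sketch): df.info() prints its summary rather than returning it. Passing a text buffer through the buf parameter is one way to capture the output; the data frame here is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data frame with one missing value in "make".
df = pd.DataFrame({
    "make": ["audi", "bmw", None],
    "price": [13950, 16430, 17450],
})

# Prints the summary to standard output and returns None.
df.info()

# Capture the same summary as a string instead of printing it.
buffer = io.StringIO()
df.info(buf=buffer)
summary_text = buffer.getvalue()
```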
Save data frame to CSV
Save the processed data frame to a CSV file at a specified path
df.to_csv(<output CSV path>)
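Example (a minimal sketch): saving a data frame and reading it back. The temporary directory and the file name cars_clean.csv are hypothetical. Passing index=False is an optional choice made here, not part of the row above; without it, to_csv also writes the row index as the first column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical processed data frame.
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [13950, 16430]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output path inside a throwaway directory.
    out_path = os.path.join(tmp_dir, "cars_clean.csv")

    # index=False leaves the row index out of the file.
    df.to_csv(out_path, index=False)

    # Read the file back to confirm what was written.
    reloaded = pd.read_csv(out_path)

print(reloaded)
```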