Visualizations using the Seaborn library - exercise 2
The following tasks demonstrate different ways of using the Seaborn library to visualize variables in Pandas data frames, using the Titanic dataset.
We import the necessary libraries. Run the following cell first, before the imported objects pd, np, and sns are used.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# display graphs directly in the cells of the notebook
%matplotlib inline
# initialize the seaborn library
sns.set()
This time we won't load the data from the Seaborn library examples repository. Instead, we will use the dataset you worked with in the previous exercises, stored on disk in the file ../data/titanic-processed.csv. The dataset is already pre-processed from the previous exercise: it was cleaned of missing values, some attributes were removed, and others were transformed.
To work with data stored on disk, we first need to load it into a Pandas data frame, as in the previous exercises. After that, we can work with it in the same way as with data loaded from the Seaborn library examples repository.
So we read the Titanic dataset from the file stored on disk using the read_csv() function. As usual, we display the header and the first 5 records of the data frame using head().
titanic = pd.read_csv('../data/titanic-processed.csv')
titanic.head()
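If the file is not at hand, the same loading steps can be tried on CSV text held in memory. The following is only a minimal sketch: the three rows and the reduced set of columns are made up for illustration and are not taken from titanic-processed.csv. It also shows one way to confirm that a frame contains no missing values.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file: made-up rows, reduced set of columns.
csv_text = """pclass,survived,sex,age,fare
1,1,female,29.0,211.34
3,0,male,22.0,7.25
2,1,female,35.0,26.00
"""

# read_csv() accepts a path or any file-like object, so StringIO works as well.
sample = pd.read_csv(io.StringIO(csv_text))

# head() returns at most the first 5 rows (this sample has only 3).
print(sample.head())

# Count missing values per column; a cleaned dataset should report zeros.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

On the real data frame, the same check would be titanic.isna().sum().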
The pre-processed dataset contains the following attributes:
- pclass - class in which the passenger traveled
- survived - indicates whether the passenger survived the sinking of the ship (1) or not (0)
- sex - the gender of the passenger
- age - age
- sibsp - number of siblings or spouses of the passenger aboard
- parch - number of parents or children of the passenger aboard
- ticket - ticket number
- fare - fare amount
- cabin - the cabin in which the passenger was accommodated
- embarked - the port where the passenger boarded (Southampton, Cherbourg, Queenstown)
- family - number of family members with whom the passenger traveled
- has_family - indicates whether the passenger traveled with a family member or not
- fare_ordinal - discretized price
- title - title of the passenger
- title_short - short title
- age_ordinal - age grade - created by discretizing the age attribute
- deck - the deck on which the passenger was accommodated
First, we visualize the number of men and women who sailed on the Titanic.
g = sns.countplot(x = 'sex', data = titanic)
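A count plot draws one bar per category, with the bar height equal to the number of rows in that category. The sketch below illustrates this on a small made-up data frame (not the Titanic data) by printing the same counts with value_counts() before plotting them.

```python
import pandas as pd
import seaborn as sns

# Made-up sample for illustration only, not the Titanic data.
sample = pd.DataFrame({"sex": ["male", "female", "male", "male", "female"]})

# The numbers behind the plot: how many rows fall into each category.
counts = sample["sex"].value_counts()
print(counts)

# countplot() computes the same counts and returns the matplotlib Axes it drew on.
ax = sns.countplot(x="sex", data=sample)
ax.set_ylabel("number of passengers")
```

Because the function returns a matplotlib Axes, labels and titles can be adjusted afterwards with the usual Axes methods.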
Task 4.8
Visualize the averages and standard deviations of the ages of men and women traveling on the Titanic with a suitable graph.
# YOUR CODE HERE
Task 4.9
Show the number of male and female passengers in each travel class.
# YOUR CODE HERE
Task 4.10
Choose the appropriate graph and visualize how many of the passengers from each boarding location on the Titanic survived or did not survive the disaster.
# YOUR CODE HERE
Task 4.11
Choose an appropriate method for visualizing the number of passengers who survived or did not survive the disaster, depending on the class in which they were traveling. Plot the graphs for both genders at once.
# YOUR CODE HERE
Task 4.12
Let's also look at the passengers in terms of their age. Choose an appropriate way of displaying the age distribution (age) of the passengers according to the class in which they traveled.
# YOUR CODE HERE
Task 4.13
Expand the previous graph and choose a suitable way of visualizing the distribution of the age of passengers with respect to the class in which they traveled and with respect to the gender of the passenger.
# YOUR CODE HERE
Task 4.14
Create a graph that visualizes the average age of passengers within each class by gender.
# YOUR CODE HERE
Task 4.15
Visualize how many passengers survived/did not survive the sinking of the ship for each deck.
# YOUR CODE HERE
Task 4.16
Create a visualization that shows the numbers of male and female survivors/non-survivors of the shipwreck within different age groups (age_ordinal).
# YOUR CODE HERE
Task 4.17
Visualize the relationship between the age of the passengers and the amount of the fare. Is there any dependency between them?
# YOUR CODE HERE
Task 4.18
Visualize the correlations between the numerical attributes using a heatmap. Display the correlation coefficients in the graph.
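As a hint, the general pattern can be sketched on a small made-up data frame (the values below are invented and are not the Titanic data): select the numeric columns, compute their correlation matrix with corr(), and pass the matrix to heatmap(). The choice of colour map and value range is an assumption of this sketch, not a requirement of the task.

```python
import pandas as pd
import seaborn as sns

# Made-up sample for illustration only, not the Titanic data.
sample = pd.DataFrame({
    "sex":   ["male", "female", "female", "female", "male", "male"],
    "age":   [22.0, 38.0, 26.0, 35.0, 54.0, 2.0],
    "fare":  [7.25, 71.28, 7.93, 53.10, 51.86, 21.08],
    "sibsp": [1, 1, 0, 1, 0, 3],
})

# Keep only numeric columns; text columns such as "sex" cannot be correlated.
numeric = sample.select_dtypes(include="number")

# Pairwise Pearson correlation coefficients of the numeric columns.
corr = numeric.corr()
print(corr)

# annot=True writes each coefficient into its cell; fixing vmin/vmax to -1..1
# keeps the colour scale comparable between different plots.
ax = sns.heatmap(corr, annot=True, fmt=".2f", vmin=-1, vmax=1, cmap="coolwarm")
```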
# YOUR CODE HERE
Task 4.19
Choose one of the contingency tables from the tasks in the previous exercise and draw it using a heatmap.
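As a hint, a contingency table of two categorical columns can be built with pd.crosstab() and passed directly to heatmap(). The sketch below uses a made-up data frame and an arbitrarily chosen pair of columns; it is not one of the tables from the previous exercise.

```python
import pandas as pd
import seaborn as sns

# Made-up sample for illustration only, not the Titanic data.
sample = pd.DataFrame({
    "pclass":   [1, 1, 2, 3, 3, 3, 2, 1],
    "survived": [1, 0, 1, 0, 0, 1, 0, 1],
})

# Rows are classes, columns are survival outcomes, cells are row counts.
table = pd.crosstab(sample["pclass"], sample["survived"])
print(table)

# fmt="d" prints the annotated counts as integers.
ax = sns.heatmap(table, annot=True, fmt="d", cmap="Blues")
```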
# YOUR CODE HERE
Task 4.20
Use FacetGrid to draw a grid of age distributions of survivors/non-survivors on individual decks.
# YOUR CODE HERE