The dataset is available here. The following Python script shows the percentages of survivors for different groups:

```python
import pandas as pd

# Load the passenger data
df = pd.read_csv('titanic3.csv')

# Split the passengers by sex
df_male = df[df['sex'] == 'male']
df_female = df[df['sex'] == 'female']

# Average the numeric columns within each ticket class.
# numeric_only=True skips non-numeric columns, which recent
# pandas versions refuse to average.
df_class_group = df.groupby('pclass').mean(numeric_only=True)
df_class_group_male = df_male.groupby('pclass').mean(numeric_only=True)
df_class_group_female = df_female.groupby('pclass').mean(numeric_only=True)

# Survival rates: overall, female, male
print(df['survived'].mean())
print(df_female['survived'].mean())
print(df_male['survived'].mean())

# Per-class averages: all passengers, male, female
print(df_class_group)
print(df_class_group_male)
print(df_class_group_female)
```

Only 38% of the passengers survived the sinking, but this is only part of the story: we can dig further to see how belonging to different groups affected a passenger's chances of survival. If we divide the passengers into male and female, we can see that only 19% of male passengers survived, whereas 73% of female passengers survived.

We can also divide by class. There were three classes of ticket on the Titanic: first, second and third. The survival rates (male and female combined) by class were:
If we divide by both gender and class:
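The same breakdown can also be produced in a single step by grouping on both columns at once. The snippet below is only a sketch: the helper name `survival_rate_by` is made up for illustration, and the small `sample` table is invented data used to show the shape of the output, not figures from the Titanic dataset.

```python
import pandas as pd


def survival_rate_by(df, keys):
    """Return the survival rate in percent for each combination of `keys`."""
    # 'survived' holds 0 or 1, so its mean is the fraction of survivors.
    return (df.groupby(keys)['survived'].mean() * 100).round(1)


# Invented sample rows, only to demonstrate the output layout.
sample = pd.DataFrame({
    'sex': ['female', 'female', 'male', 'male', 'male', 'female'],
    'pclass': [1, 1, 1, 3, 3, 3],
    'survived': [1, 1, 0, 0, 1, 0],
})

rates = survival_rate_by(sample, ['sex', 'pclass'])

# Rows are sexes, columns are ticket classes.
print(rates.unstack('pclass'))
```

With the real data, calling `survival_rate_by(df, ['sex', 'pclass'])` on the `df` loaded from `titanic3.csv` should give one rate per sex and class.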
It is clear that first class female passengers had the best chance of survival. It is also interesting that the class divisions break down for male second and third class passengers: for men, holding a second class ticket did not improve the chances of survival compared with holding a third class ticket. The difference in ticket price:
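Ticket prices can be compared per class in the same way as survival rates. The following is a minimal sketch that assumes the CSV has a `fare` column. The helper name `fare_summary_by_class` is hypothetical, and the `sample` values are invented for illustration rather than taken from the dataset.

```python
import pandas as pd


def fare_summary_by_class(df):
    """Return the mean and median fare for each ticket class."""
    return df.groupby('pclass')['fare'].agg(['mean', 'median'])


# Invented fares, only to demonstrate the output layout.
sample = pd.DataFrame({
    'pclass': [1, 1, 2, 2, 3, 3],
    'fare': [80.0, 100.0, 20.0, 30.0, 7.0, 9.0],
})

summary = fare_summary_by_class(sample)
print(summary)
```

The median is included alongside the mean because a few very expensive tickets can pull the mean upwards.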