Since 2011, Stack Overflow has conducted an annual developer survey. The raw data can be downloaded from here. The following is a brief analysis of part of the 2017 survey data, specifically some of the factors affecting salary. David Robinson, a data scientist at Stack Overflow, found an unusual relationship between the use of tabs or spaces and salary. His R code is available here. I decided to see if I could recreate the results using Python and Pandas. My Python code is available here. The following graph clearly shows the relationship for the 2017 data.
The 2015 survey also includes spaces v tabs data, but the salary and experience data are in a different format, which makes a direct comparison with the 2017 data difficult. It is still possible to plot the data and see if there is a similar correlation. The next graph shows that there may be a similar relationship.
It is difficult to explain this relationship; however, correlation does not imply causation. I also looked at the 2017 data for Masters v Degrees. When you take the overall average, developers with a Masters degree have a slightly higher salary than developers with just a degree. However, when you divide the data by experience this is no longer true. Now it seems that having a Masters degree has a negative impact on your salary as you become more experienced; see the graph below:
Again this is not what I expected.
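The effect described above, where the overall average points one way and the averages within each experience band point the other, can be reproduced by grouping on two columns instead of one. The following is only a minimal sketch: the rows are invented for illustration and are not survey results, and the column names `education`, `experience` and `salary` are hypothetical stand-ins for the survey's own columns.

```python
import pandas as pd

# Toy rows invented purely for illustration; these are NOT survey results.
# Column names are hypothetical stand-ins for the survey's own columns.
df = pd.DataFrame({
    "education":  ["Bachelors", "Bachelors", "Bachelors", "Masters", "Masters", "Masters"],
    "experience": ["0-5", "0-5", "6-10", "0-5", "6-10", "6-10"],
    "salary":     [40000, 42000, 70000, 38000, 66000, 68000],
})

# Overall average salary per education level.
overall = df.groupby("education")["salary"].mean()

# Average salary per education level within each experience band
# (rows: experience band, columns: education level).
by_experience = df.groupby(["experience", "education"])["salary"].mean().unstack()

print(overall)
print(by_experience)
```

In this toy data the Masters group has the higher overall average only because more of its rows fall in the higher-paid experience band; within each band it is lower. The same grouping could be applied to the tabs v spaces comparison by swapping the education column for the tabs/spaces column.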
The dataset is available here.
The following Python script shows the percentages of survivors for different groups:
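import pandas as pd
# Load the passenger data; 'survived' is 1 for survivors and 0 otherwise
df = pd.read_csv('titanic3.csv')
df_male = df[df['sex'] == 'male']
df_female = df[df['sex'] == 'female']
# The mean of a 0/1 column is the survival rate; numeric_only skips text columns
df_class_group = df.groupby('pclass').mean(numeric_only=True)
df_class_group_male = df_male.groupby('pclass').mean(numeric_only=True)
df_class_group_female = df_female.groupby('pclass').mean(numeric_only=True)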
Only 38% of the passengers survived the sinking, but this is only part of the story. We can dig further to see how belonging to different groups affected a passenger's chances of survival. If we divide the passengers into male and female, we can see that only 19% of male passengers survived whereas 73% of female passengers survived. We can also divide by class: there were three classes of ticket on the Titanic (first, second and third). The survival rates (male and female combined) by class were:
If we divide by both gender and class:
It is clear that first class female passengers had the best chance of survival. It is also interesting that the class divisions break down for male second and third class passengers: for male passengers, travelling second class did not increase your chances of survival compared with travelling third class.
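The breakdown by both gender and class can also be computed in a single step, without first filtering into separate male and female data frames. This is a minimal sketch rather than the script used for the figures above: it reuses the `sex`, `pclass` and `survived` column names from that script, but the rows are invented for illustration and are not the real passenger data.

```python
import pandas as pd

# Toy passenger rows invented for illustration; NOT the real titanic3.csv data.
df = pd.DataFrame({
    "sex":      ["female", "female", "female", "female", "female",
                 "male", "male", "male", "male"],
    "pclass":   [1, 1, 2, 3, 3, 1, 2, 3, 3],
    "survived": [1, 1, 1, 0, 1, 1, 0, 0, 1],
})

# The mean of a 0/1 column is the survival rate; multiply by 100 for a percentage.
# unstack("pclass") turns the result into a table: rows are sex, columns are class.
rate = (
    df.groupby(["sex", "pclass"])["survived"]
      .mean()
      .mul(100)
      .unstack("pclass")
)

print(rate.round(1))
```

With the real file, replacing the toy data frame with the one loaded from the CSV should give a gender-by-class table of survival percentages.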
The difference in ticket price: