Discovering Data
  • Home
  • Blog

UK - anti Trump petition

1/30/2017

The UK government has a website that lets people start petitions on different subjects. If the number of 'signatures' on a petition exceeds 100,000, the UK government may debate the topic in Parliament. There is currently a petition entitled: Prevent Donald Trump from making a State Visit to the United Kingdom. It has over 1.2 million signatures. The petitions are supposed to be valid only for UK citizens resident in the UK. You can download a JSON file of the data from the site. I did a quick check on the 'signatures_by_country' field and found:
  • over 3,000 from Australia
  • almost 2,000 from Germany
  • sizeable numbers from all Western European countries, including 6 from the Vatican City
  • 36 from Vietnam
  • 22 from Cambodia
  • and even 3 from the British Antarctic Survey.
I'm sure some of the people who signed the petition were concerned UK citizens but a certain percentage were neither serious nor UK citizens resident in the UK. I wonder what percentage of the 1.2 million are 'bots'.
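A check like this could be sketched in a few lines of Python. This is only a minimal sketch under assumptions: it assumes the downloaded JSON nests the field as data, then attributes, then signatures_by_country, and that each entry carries 'name', 'code' and 'signature_count' keys. Those key names, the 'GB' country code and the helper names non_uk_signatures and load_petition are assumptions to verify against the actual file.

```python
import json


def non_uk_signatures(petition, uk_code="GB"):
    """Return (name, count) pairs for non-UK countries, largest first.

    ASSUMPTION: layout is data -> attributes -> signatures_by_country, and each
    entry is a dict with 'name', 'code' and 'signature_count' keys.
    """
    entries = petition["data"]["attributes"]["signatures_by_country"]
    pairs = [
        (entry["name"], entry["signature_count"])
        for entry in entries
        if entry["code"] != uk_code
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def load_petition(path):
    """Load a petition JSON file saved from the site (the path is hypothetical)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

With a file saved as, say, petition.json, calling non_uk_signatures(load_petition("petition.json")) would list the countries outside the UK in descending order of signatures.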

Effects of Brexit

1/29/2017

During the Brexit campaign the 'remainers' failed to put forward any convincing arguments for the UK remaining in the EU. Instead they operated 'project fear' which attempted to convince everyone of the economic dangers if people voted for Brexit. Their predicted economic catastrophe never materialised but there is a slow weakening or corrosion of the UK currency. Sterling was in decline against the dollar for a couple of years before Brexit but the 'leave' vote has accelerated that decline.
Picture
At the same time the Euro has been gaining ground on the pound.
Picture
The Euro zone still has huge economic problems and the gains made by the Euro could still be reversed. But it is clear that Sterling has been weakened by Brexit. While this may be good for exporters, it is not good for importers, and that includes the price of oil and food, much of which the UK has to buy in dollars and Euros. So inflation will probably rise in 2017, with food and petrol prices rising. This will impact the working class 'leavers' the most. But this is still only the beginning of the exit process. Over the next two years we could see a breakdown in the Northern Ireland peace process and a second divisive independence referendum in Scotland, both direct consequences of Brexit. The UK may also find itself having to take whatever trade deals it can get, even if those are not in the UK's long-term interest.

The rise and fall of Pokemon Go

1/27/2017

Pokemon Go was probably one of the most downloaded apps of 2016, if not the most downloaded, and an early attempt at using augmented reality technology in a game. Initially it caused a huge spike in Nintendo's share price, followed by an equally rapid collapse once people realised that Nintendo did not own the app.
Picture
Compare the above to the spike in Google searches:
Picture
While the number of searches has fallen to close to zero, Nintendo's share price remains about 67% higher than before the release of Pokemon Go (around 25,000 after, around 15,000 before). So overall it was a good event for Nintendo shareholders.

Asylum in the US

1/21/2017

The data set is available on Kaggle. Some data is also from Wikipedia.

Between 2005 and 2007 the US accepted about 40,000 asylum seekers. By comparison the UK accepted 30,000 and Canada 25,000. However, the population of the US is several times larger than the populations of the UK and Canada combined, so per capita the US was not taking in as many people as either the UK or Canada. There are two kinds of asylum granted in the US: affirmative, in which the application is accepted, and defensive, in which the original application is rejected and the case then goes in front of a judge, who can decide to accept or reject it.
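The per-capita comparison can be checked roughly in Python. The accepted totals are the ones quoted above; the population figures are my own rounded, approximate mid-2000s values and are an assumption, not part of the Kaggle data set. The helper name accepted_per_million is hypothetical.

```python
# Asylum seekers accepted 2005-2007, as quoted in the post.
accepted = {"US": 40_000, "UK": 30_000, "Canada": 25_000}

# ASSUMPTION: rough mid-2000s populations in millions, rounded by hand.
# These are not taken from the data set and are for illustration only.
population_millions = {"US": 300, "UK": 61, "Canada": 33}


def accepted_per_million(accepted, population_millions):
    """Accepted asylum seekers per million inhabitants, per country."""
    return {
        country: accepted[country] / population_millions[country]
        for country in accepted
    }


rates = accepted_per_million(accepted, population_millions)
for country, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{country}: about {rate:.0f} accepted per million people")
```

Under these rounded populations the US rate comes out lowest and Canada's highest, which is the ordering the paragraph describes.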
Picture
Picture
So since 2007 the number of people being granted affirmative asylum has increased significantly, while the number getting defensive asylum has decreased. The origins of these asylum seekers have also changed over the last decade or so.
Picture
Picture

Data skills are good skills to have

1/19/2017

LinkedIn has ranked data mining/statistical analysis as the number 2 most sought-after skill in employees. This is the same as the 2015 results. 2016 also saw data visualisation enter the top ten skills (number 8) for the first time in the UK. It has not yet appeared for other countries. The following countries have data mining/statistical analysis as either the number 1 or number 2 most in-demand skill: US, Ireland, Netherlands, Australia, Brazil, Canada, South Africa, UAE, UK, France, Germany. The following countries list it in the top 5 (number 3 to 5): Singapore, China, India.

Will interest in Data Science continue to grow?

1/17/2017

If we take Google searches as an indicator of interest in a subject, then Data Science has shown very significant growth in interest over the last five years or so.
Picture
But will this trend continue? If we focus on the last twelve months, there is some indication of a slowdown in the rate of growth of interest.
Picture
The top five countries for 'Data Science' searches are:
  1. Singapore
  2. India
  3. US
  4. Australia
  5. UK

Using machine learning to predict house prices

1/14/2017

The code is on my code blog. To test the script I used some new property listings not included in the training data. I used the script to try to predict house prices based on property type, rates* and number of bedrooms. I got the results below.

*Rates are an annual tax raised by local government in Northern Ireland. The amount payable is closely related to property value and function (commercial, residential, religious and so on).

Test 1
htype = 5; bedrooms = 3; rates = 729.9
predicted price = 104,199 to nearest int (-2.6%)
actual price = 107,000
Test 2
htype = 3; bedrooms = 5; rates = 1946.4
predicted price = 323,717 to nearest int (+27%)
actual price = 255,000
Test 3
htype = 6; bedrooms = 3; rates = 462.27
predicted price = 60,071 to nearest int (-16.6%)
actual price = 71,995
Test 4
htype = 3; bedrooms = 3; rates = 932.65
predicted price = 136,448 to nearest int (+5%)
actual price = 129,950
Test 5
htype = 0; bedrooms = 2; rates = 729.9
predicted price = 91,187 to nearest int (+30.3%)
actual price = 70,000
Test 6
htype = 4; bedrooms = 3; rates = 811
predicted price = 116,909 to nearest int (-8.7%)
actual price = 127,995

Half the tests are within 10% of the actual price. I believe the results could be improved by increasing the amount of training data, especially at the extremes (cheapest and most expensive properties). Increasing the number of features might also improve the accuracy. This experiment used linear regression to make the predictions; the relationship between the price and the features may not be linear at the extremes.
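The percentage errors can be recomputed from the predicted and actual prices listed in the six tests. The sketch below only does that arithmetic; it is not the prediction script itself, and the helper name percent_error is hypothetical.

```python
# (predicted, actual) prices for the six tests listed in the post.
tests = {
    1: (104_199, 107_000),
    2: (323_717, 255_000),
    3: (60_071, 71_995),
    4: (136_448, 129_950),
    5: (91_187, 70_000),
    6: (116_909, 127_995),
}


def percent_error(predicted, actual):
    """Signed error of the prediction as a percentage of the actual price."""
    return 100 * (predicted - actual) / actual


errors = {n: percent_error(p, a) for n, (p, a) in tests.items()}

# Tests whose prediction is within 10% of the actual price, either way.
within_10 = [n for n, e in errors.items() if abs(e) <= 10]

for n, e in errors.items():
    print(f"Test {n}: {e:+.1f}%")
print(f"{len(within_10)} of {len(tests)} tests within 10%")
```

A negative value means the model predicted below the actual price, a positive value means it predicted above.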

Taser deaths in the US

1/8/2017

The data set originates from The Guardian; it is available from Kaggle.

Before analysing the data set I had innocently believed that Tasers were non-lethal, but this is not the case. The Metropolitan Police (London, UK) describe Tasers as 'less lethal'. The US data shows 50 people died in 2015 in the US as a result of being shot by a Taser. This number dropped to 22 in 2016.

Some summary stats
number killed by police officers in 2015 in the US using a Taser = 50 (4% of total number of people killed by police)
number killed by police officers in 2016 in the US using a Taser = 22 (2% of total number of people killed by police)

Of the 50 killed in 2015, how many were armed:

No         46
Other       2
Knife       1
Firearm     1

Of the 22 killed in 2016, how many were armed:

No       18
Other     3
Knife     1

Compare these numbers to those who were shot dead by police using a gun:

in 2015, how many were armed

Firearm               551
Knife                 152
No                    114
Other                  61
Non-lethal firearm     47
Vehicle                44
Unknown                44
Disputed                4

in 2016, how many were armed

Firearm               496
Knife                 154
Unknown               105
No                     93
Other                  64
Non-lethal firearm     46
Vehicle                35
Disputed               11

Ethnicity of those killed by Taser in 2015:
White                     21
Black                     19
Hispanic/Latino            7
Asian/Pacific Islander     2
Native American            1

Ethnicity of those killed by Taser in 2016:
White              14
Black               5
Hispanic/Latino     2
Unknown             1

Gender, 2015:
Male      49
Female     1

Gender, 2016:
Male      21
Female     1

Age stats, 2015:
max: 62
median: 38
min: 18

Age stats, 2016:
max: 66
median: 39
min: 18

In summary, the majority of people who died after being shot by a Taser were black or white males and were unarmed.
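Tallies like the ones above could be produced with a short Python sketch along these lines. It assumes the CSV has a 'classification' column (with 'Taser' as one of its values) and an 'armed' column; those column names, the value 'Taser', and the helpers armed_counts and load_rows are assumptions to check against the actual file.

```python
import csv
from collections import Counter


def armed_counts(rows, classification="Taser"):
    """Count the 'armed' values for deaths with the given classification.

    ASSUMPTION: each row is a dict with 'classification' and 'armed' keys.
    """
    return Counter(
        row["armed"] for row in rows if row["classification"] == classification
    )


def load_rows(path):
    """Read a CSV file into a list of dicts (the path is hypothetical)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling armed_counts(load_rows(path)).most_common() on one year's file would give the counts sorted from most to least frequent, in the same shape as the lists above.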

Comparison of 2016 Olympic medal winners

1/1/2017

The data set is available on Kaggle.

Introduction - some basic Olympic stats

total number of athletes = 11538
male = 6333, 206 unique nationalities
female = 5205, 202 unique nationalities

984 men (77 unique nationalities) won at least one medal (15.5%)
873 women (66 unique nationalities) won at least one medal (16.8%)

If you want to win multiple medals then the best sport categories are:
aquatics: 5 male and 8 female athletes won 3 or more medals
athletics: 2 male and 3 female athletes won 3 or more medals
gymnastics: 2 male and 3 female athletes won 3 or more medals

5 countries sent an all-male team:
Nauru, Monaco, Iraq, Tuvalu, Vanuatu

1 country sent an all-female team:
Bhutan

Compare average ages of medal winners and non-medal winners per sport category

aquatics:

mean age of male non-medal winners --- 24.0
mean age of male medal winners ------- 25.3

mean age of female non-medal winners - 22.5
mean age of female medal winners ----- 23.7

athletics:

mean age of male non-medal winners --- 26.5
mean age of male medal winners ------- 26.2

mean age of female non-medal winners - 26.2
mean age of female medal winners ----- 26.7

gymnastics:

mean age of male non-medal winners --- 24.9
mean age of male medal winners ------- 24.1

mean age of female non-medal winners - 20.8
mean age of female medal winners ----- 20.6

In summary
There are differences in the average age of winners and losers between male and female versions of sports and between different sports. The youngest winners (on average) for the three sport categories are female gymnasts. The oldest are female athletes. Athletics shows the least difference between the ages of male and female athletes; gymnastics shows the biggest difference.
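The male-female age gaps behind this summary can be recomputed from the mean ages listed above. This is a small arithmetic sketch using only those published means, not the original analysis; the helper name medal_winner_gap is hypothetical.

```python
# Mean ages from the tables above: (non-medal winners, medal winners).
mean_age = {
    "aquatics": {"male": (24.0, 25.3), "female": (22.5, 23.7)},
    "athletics": {"male": (26.5, 26.2), "female": (26.2, 26.7)},
    "gymnastics": {"male": (24.9, 24.1), "female": (20.8, 20.6)},
}


def medal_winner_gap(sport):
    """Absolute male-female difference in mean age of medal winners."""
    male = mean_age[sport]["male"][1]
    female = mean_age[sport]["female"][1]
    return abs(male - female)


# Round to one decimal place to match the precision of the source figures.
gaps = {sport: round(medal_winner_gap(sport), 1) for sport in mean_age}
smallest = min(gaps, key=gaps.get)
largest = max(gaps, key=gaps.get)

for sport, gap in gaps.items():
    print(f"{sport}: {gap} years between male and female medal winners")
print(f"smallest gap: {smallest}; largest gap: {largest}")
```

The same comparison on the non-medal winners gives the same ordering of the three sports.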
