Discovering Data

Petition for a second EU referendum in the UK

6/26/2016

There is currently an online petition calling for the UK government to reject the outcome of the EU referendum, since turnout was less than 75% and the winning margin for Leave was less than 4%. The petition data can be downloaded as a .json file. Part of the process of signing the petition involves indicating where you reside. The vast majority of signatories gave a residence in the UK, but some gave a location outside the UK. The following are the top ten locations by number of signatures; the code used to get this data is available on my CodeBlog.
  1. United Kingdom, 2401768
  2. Vatican City, 36189
  3. North Korea, 21616
  4. France, 18225
  5. Australia, 11585
  6. United States, 11523
  7. Spain, 11483
  8. Germany, 6852
  9. Canada, 4308
  10. Gibraltar, 4043
I think it is possible that the second and third entries above represent people who were not being honest about their location: the permanent population of Vatican City is about 1,000, and I doubt there are 21,000 British citizens living in North Korea. Even if there were that many, they would not have access to the internet. The other entries are more plausible.
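
For anyone who wants to reproduce a tally like this, here is a minimal sketch in Python. It is illustrative only and is not the code from the CodeBlog. It assumes the downloaded file has a `signatures_by_country` list under `data` and then `attributes`, with each entry carrying a `name` and a `signature_count`; that layout, the function names and the sample dictionary are all assumptions.

```python
import json


def top_locations(petition, n=10):
    """Return the n locations with the most signatures as (name, count) pairs.

    Assumes the layout data -> attributes -> signatures_by_country, where each
    entry has 'name' and 'signature_count' keys (an assumption about the file).
    """
    entries = petition["data"]["attributes"]["signatures_by_country"]
    ranked = sorted(entries, key=lambda e: e["signature_count"], reverse=True)
    return [(e["name"], e["signature_count"]) for e in ranked[:n]]


def load_petition(path):
    """Read a petition .json file from disk (the path is a placeholder)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Small hand-made sample using three of the counts quoted in the list above.
sample = {
    "data": {
        "attributes": {
            "signatures_by_country": [
                {"name": "France", "signature_count": 18225},
                {"name": "United Kingdom", "signature_count": 2401768},
                {"name": "Vatican City", "signature_count": 36189},
            ]
        }
    }
}

for name, count in top_locations(sample, n=2):
    print(f"{name}, {count}")
```

With a real download, `top_locations(load_petition("petition.json"))` would give the top ten, where `petition.json` stands for wherever the file was saved.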

Note that the data was downloaded on the morning of June 26 (GMT).

Irish baby names in the US, 1880 - 2015

6/15/2016

The data used is the same as in the previous post. This time I'm looking specifically at parents giving their children Irish names. To begin, I need to define what I mean by Irish names. I am looking for names that derive specifically from Irish culture, mythology and the Irish language. Names like John or Emma may be popular in Ireland but they are not Irish names. I did not include the names Padraig (Patrick), Sean (John) or Seamus (James), as these are Gaelicised versions of non-Irish names. I used a list of the top 100 baby names in Ireland for 2014 and extracted the Irish names from that list. The names are:

female names:
Aoife, Caoimhe, Saoirse, Ciara, Niamh, Roisin, Clodagh

male names:
Conor, Rian (Ryan), Oisin, Cian, Darragh, Cillian, Fionn, Finn, Eoin, Aidan, Declan

Occurrence of the names in 1880:

There were no occurrences of any of the names in 1880.
By 1920 the name Ryan appears (this is an Anglicised version of Rian so is not strictly Irish). 
By 1950 the Irish version Rian has appeared, but still no other names.
By 1975 the names Ciara, Conor and Aidan had appeared.
By 1980 the list had grown to include Cian, Finn, Eoin, Declan, Niamh and Roisin.
By 1990 the name Darragh appeared. So by 1990 most of the male names in my list had appeared. Oisin and Fionn were still missing. About 60% of the female names had also appeared.
All of the remaining names had appeared by 2010.
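
The timeline above boils down to finding, for each name, the first year in which it occurs in the data. Here is a minimal sketch of that lookup, assuming the records have already been loaded as (year, name, count) tuples. The function name `first_appearance` is hypothetical, and the sample rows are made up to show the call; they are not real counts from the data set.

```python
def first_appearance(records, names):
    """Map each name to the earliest year it occurs, or None if it never does.

    records: iterable of (year, name, count) tuples.
    names:   the names to look for (matched exactly, case-sensitive).
    """
    wanted = set(names)
    first = {name: None for name in names}
    for year, name, _count in records:
        if name in wanted and (first[name] is None or year < first[name]):
            first[name] = year
    return first


# Made-up rows purely to show the call; not real counts from the data set.
rows = [
    (1950, "Rian", 5),
    (1975, "Conor", 12),
    (1990, "Conor", 300),
    (1980, "Niamh", 6),
]

result = first_appearance(rows, ["Conor", "Niamh", "Oisin"])
print(result)
```

A name that never occurs maps to None, which makes it easy to list the names still missing in a given year.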

Summary:
So all these Irish names can now be found in the US, although they are not common. Most of them appeared between the late 1970s and the mid 1990s.
Given that these names can be difficult for English speakers to pronounce, I am a little surprised that they all appear in the US lists.

​


Baby given names in the United States - 1880 to 2015

6/11/2016

The code used is available on the code blog.

This analysis is inspired by the book 'Python for Data Analysis' (O'Reilly) by Wes McKinney and this blog post.

There are two prominent changes in the distribution of characters for boys' names in the US. The following thumbnails show the changes:
Picture
The first change started to appear in the 1960s. Before the 1960s the letters l, o and n were approximately evenly distributed. After the 1960s, however, the letter n became increasingly popular. This trend continues to the present, as the graph for 2015 shows:
Picture
The second change is more recent, starting in the late 1980s or early 1990s. This time it involves the vowels a and e. Before the 1980s the letter e was more common than the letter a. That has now flipped, as the graph above for 2015 and the graph below for 1990 show:
Picture
This change in popularity for the letters a and e is not limited to male names; the same trend occurred for female names:
Picture
In the above thumbnails the two spikes on the left of each graph represent the letters a and e. Note that the letter n also became more popular for girls. Compare the two graphs below; the first is for 1880 and the second is for 1990:
Picture
Picture
However, this trend for the letter n seems to be declining in more recent times; the graph below is for 2015:
Picture
While n is still ahead of l, the difference is decreasing.

One further interesting comparison is the letter y which appears to be more popular in female names. 
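
The letter comparisons in this post come down to counting letters per year and sex. Below is a rough sketch of one way to do that. It assumes the letter of interest is the last letter of each name, as in the example in the McKinney book; the post does not state this explicitly. The record layout (name, sex, births) and the function name `last_letter_shares` are also assumptions, and the sample rows are invented.

```python
from collections import Counter


def last_letter_shares(records, sex):
    """Return {letter: share of births} for names of the given sex.

    records: iterable of (name, sex, births) tuples for a single year.
    Shares are weighted by number of births and sum to 1.
    """
    totals = Counter()
    for name, record_sex, births in records:
        if record_sex == sex and name:
            totals[name[-1].lower()] += births
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {letter: count / grand_total for letter, count in totals.items()}


# Invented rows for one year, only to demonstrate the call.
rows = [
    ("Aidan", "M", 60),
    ("Declan", "M", 20),
    ("Noah", "M", 20),
    ("Emma", "F", 50),
]

shares = last_letter_shares(rows, "M")
print(shares)
```

Running this once per year and plotting the shares for a handful of letters would give graphs comparable to the thumbnails, under the assumptions above.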

Comparing property availability - Carrickfergus and Glengormley

6/5/2016

Data was collected at the end of May, beginning of June 2016.

Findings summary: 

I divided the available properties into 5 groups. In Glengormley 43% of available properties were in the Semi-detached house category. So there is slightly less choice of property available in Glengormley. In Carrickfergus the properties were more evenly distributed between the 5 groups. 

I used the average number of bedrooms as an approximate indicator of property size. With the exception of bungalows, the averages in Carrick are either equal to or slightly higher than in Glengormley, but the differences are small.

With the exception of Houses, properties in Carrick are slightly cheaper than in Glengormley.

The level of rates for properties seems to depend on whether the property is detached or not. Houses and bungalows have higher rates than apartments, semi-detached houses and terraced houses.

Comparison of property types available in the two areas:

I reduced the number of types by grouping the property types. I did this because some of the smaller categories had only one or two properties, which are not representative samples. So:

bungalow includes bungalow, semi-detached bungalow and chalet
terrace house includes terrace house, town house and end terrace house
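
The grouping can be written as a simple lookup table. Here is a minimal sketch, assuming the raw type labels are plain strings that can be lower-cased before the lookup. The names `GROUPS` and `group_type` are illustrative, and only the two merges described above are encoded.

```python
# Raw property type -> grouped type. Only the merges described in the post
# are listed; any other type is kept as its own group.
GROUPS = {
    "bungalow": "bungalow",
    "semi-detached bungalow": "bungalow",
    "chalet": "bungalow",
    "terrace house": "terrace house",
    "town house": "terrace house",
    "end terrace house": "terrace house",
}


def group_type(raw_type):
    """Normalise a raw property type label and return its group."""
    key = raw_type.strip().lower()
    return GROUPS.get(key, key)


print(group_type("Chalet"))
print(group_type("End Terrace House"))
print(group_type("Apartment"))
```

Types that are not in the table, such as apartment, pass through unchanged, so the table only needs to list the types that get merged.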

Distribution of properties available in the two areas:
Property type          Glengormley (Gg)    Carrickfergus (C)
Bungalow                 16    13%            30    16%
Apartment                11     9%            18    10%
Terrace house            16    13%            37    20%
Semi-detached house      53    43%            41    22%
House                    26    21%            62    33%
Total                   122                  188
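
Each percentage in the table is the count for that group divided by the area total. The short sketch below recomputes them from the counts above; the dictionary and function names are illustrative.

```python
# Counts of available properties per group, copied from the table above.
glengormley = {
    "Bungalow": 16,
    "Apartment": 11,
    "Terrace house": 16,
    "Semi-detached house": 53,
    "House": 26,
}
carrickfergus = {
    "Bungalow": 30,
    "Apartment": 18,
    "Terrace house": 37,
    "Semi-detached house": 41,
    "House": 62,
}


def shares(counts):
    """Return each group's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {group: round(100 * n / total) for group, n in counts.items()}


print(sum(glengormley.values()), shares(glengormley))
print(sum(carrickfergus.values()), shares(carrickfergus))
```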

Average number of bedrooms per property type:
Property type          Glengormley (Gg)    Carrickfergus (C)
Apartment                1.9                 1.9
Bungalow                 3.4                 3.1
House                    3.8                 4.0
Semi-detached house      3.0                 3.1
Terrace house            3.0                 3.0

Average price per type:
Property type          Glengormley (Gg)    Carrickfergus (C)
Apartment                 92227               87714
Bungalow                 153709              142125
House                    192498              213521
Semi-detached house      117517              108414
Terrace house             95247               85177
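
To put a number on the price gap, the relative difference can be computed from the averages in the table above. This is a small sketch of that arithmetic; the variable and function names are illustrative.

```python
# Average price per type as (Glengormley, Carrickfergus), from the table above.
average_price = {
    "Apartment": (92227, 87714),
    "Bungalow": (153709, 142125),
    "House": (192498, 213521),
    "Semi-detached house": (117517, 108414),
    "Terrace house": (95247, 85177),
}


def percent_difference(prices):
    """Return Carrickfergus relative to Glengormley, in percent, per type.

    Negative values mean the Carrickfergus average is lower.
    """
    return {
        kind: 100 * (carrick - glengormley) / glengormley
        for kind, (glengormley, carrick) in prices.items()
    }


differences = percent_difference(average_price)
for kind, diff in differences.items():
    print(f"{kind}: {diff:+.1f}%")
```

Only the House group comes out positive, which matches the exception noted in the findings summary.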

Average rates per type

Property type          Glengormley (Gg)    Carrickfergus (C)
Apartment                 778                 810
Bungalow                 1035                 907
House                    1205                1274
Semi-detached house       773                 751
Terrace house             697                 614
