The raw data files are available on SonraíOscailteTÉ/OpenDataNI, mainly in GeoJSON format. I used the Python package pandas to process the data and create the visualisation. The railway system in Northern Ireland is quite limited, but it used to be much more extensive. Fig. 1 shows the current extent of NI Rail.
The dataset contains locations of industrial heritage sites across Northern Ireland. Plotting the rail-related data points gives Fig. 2:
yellow = old stations
red = viaducts
orange = rail bridges over roads and road bridges over rail
purple = level crossing
green = other
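A minimal sketch of how the colour legend above could be applied to GeoJSON point features with pandas. The sample features and the `category` property name are assumptions made for illustration; the real OpenDataNI files will use their own property names and values.

```python
import json

import pandas as pd

# Hypothetical GeoJSON snippet; the real files and property names will differ.
GEOJSON_TEXT = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [-7.30, 54.60]},
   "properties": {"category": "station"}},
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [-6.65, 54.35]},
   "properties": {"category": "viaduct"}},
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [-5.93, 54.60]},
   "properties": {"category": "signal box"}}
]}
"""

# Colour legend from the list above; anything unlisted falls back to green.
LEGEND = {
    "station": "yellow",
    "viaduct": "red",
    "bridge": "orange",
    "level crossing": "purple",
}

features = json.loads(GEOJSON_TEXT)["features"]

# GeoJSON stores point coordinates as [longitude, latitude].
sites = pd.DataFrame({
    "lon": [f["geometry"]["coordinates"][0] for f in features],
    "lat": [f["geometry"]["coordinates"][1] for f in features],
    "category": [f["properties"]["category"] for f in features],
})

# Map each category to its legend colour, defaulting to green ("other").
sites["colour"] = sites["category"].map(LEGEND).fillna("green")
print(sites)
```

The `lon`, `lat` and `colour` columns could then be handed to a scatter plot to reproduce a map like Fig. 2.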
The difference in the West is clear: virtually all the old railways there have gone. In counties Tyrone and Fermanagh, 100% of the rail network has gone.
The following Wikipedia page lists all closed stations across Ireland.
The following post on my code-examples blog shows the code used to generate the visualisation.
Secondary education in Northern Ireland is divided between grammar and non-grammar schools. Grammar schools use tests to select the most academically gifted students. If you fail the tests you can't go to a grammar school.
The political party Sinn Fein were opposed to selection and in 2011 they abolished the 11+ test. However, a number of schools and thousands of parents opposed the move. As a result, instead of one state-controlled test there are now several privately run tests used by different schools. Using a dataset from OpenDataNI, I looked for any change in the number of children going to grammar schools. If Sinn Fein's policy had been successful, we could expect to see a drop in the number of children attending grammar schools.
However, the above diagram shows a decline in the number of children attending non-grammar secondary schools, while grammar school numbers have remained constant.
Based on this analysis my conclusion is that Sinn Fein's policy failed to reduce the grammar school share of the secondary school student numbers.
Data source: https://rainforests.mongabay.com/amazon/deforestation_calculations.html
Percentage of rainforest lost since 1970 = 20%
Amount of destruction per year.
In 2017 another 6,624 sq km were destroyed. That's an improvement on the high of 29,000 sq km in 1995. However, 6,624 sq km is still eight times the size of New York City.
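The comparison above is simple arithmetic, sketched here as a quick check. The two loss figures come from the text; the New York City land area of roughly 784 sq km is an assumption that is not taken from the data source.

```python
# Figures quoted in the text (sq km of rainforest lost in that year).
LOSS_2017_SQ_KM = 6_624
LOSS_1995_SQ_KM = 29_000

# Assumption: approximate land area of New York City, not from the source.
NYC_LAND_AREA_SQ_KM = 784

# How many times New York City fits into the 2017 loss.
nyc_multiples = LOSS_2017_SQ_KM / NYC_LAND_AREA_SQ_KM

# Fractional reduction in annual loss between the 1995 peak and 2017.
drop_from_peak = 1 - LOSS_2017_SQ_KM / LOSS_1995_SQ_KM

print(f"2017 loss is about {nyc_multiples:.1f} times the area of New York City")
print(f"2017 loss is about {drop_from_peak:.0%} below the 1995 peak")
```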
Suicide rates are increasing:
Some of this increase might be explained by a growing population, but when the data is visualised per capita it is clear that population growth alone cannot explain the increase:
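The per-capita adjustment is a rate calculation: deaths divided by population, scaled to a fixed number of people. A minimal sketch follows; the yearly figures are hypothetical placeholders, not values from the dataset, and `rate_per_100k` is a helper name introduced here.

```python
def rate_per_100k(deaths: int, population: int) -> float:
    """Return the number of deaths per 100,000 people."""
    return deaths * 100_000 / population


# Hypothetical (year, deaths, population) rows for illustration only.
records = [
    (2000, 400, 4_000_000),
    (2010, 500, 4_400_000),
]

for year, deaths, population in records:
    print(year, round(rate_per_100k(deaths, population), 2))
```

In this made-up example the population grows by 10% while deaths grow by 25%, so the rate per 100,000 still rises. That is the pattern the per-capita chart is checking for.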
The majority of suicides are male:
Different age groups show different trends: the over-80s (dark green), 70 to 79 (purple), 60 to 69 (pink) and under-20s (blue) are all quite flat, whereas 20 to 29 (red), 30 to 39 (green), 40 to 49 (brown) and 50 to 59 (orange) all show increases.
The data for deaths from stabbing in London was found here.
The number of deaths to April 24 = 47
average age = 30, median age = 24
oldest = 70
youngest = 17
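Summary figures like those above can be produced with a single pandas aggregation. This is a sketch only: the `age` column name and the ages themselves are hypothetical and are not the real records.

```python
import pandas as pd

# Hypothetical ages for illustration only; not the real records.
victims = pd.DataFrame({"age": [18, 22, 25, 41, 64]})

# Mean, median, youngest and oldest in one call.
summary = victims["age"].agg(["mean", "median", "min", "max"])
print(summary)
```

When the mean sits well above the median, as in the figures quoted above, a small number of older victims is pulling the average up.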
How are stabbings distributed over days of the week?
Word cloud created from the comments in the data:
From the above, the areas in London most impacted include Camden, Peckham, Hackney, Southall and Islington. Note also that it does not just involve young men: one man was stabbed by a woman in her 20s.
Python code to generate some of the above:
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS
# Create a CSV file containing the data in your working directory,
# or pass the full path to pd.read_csv if the file lives elsewhere.
df = pd.read_csv('london_knife_crime.csv', parse_dates=['date'])
# Series.dt.weekday_name was removed in pandas 1.0; dt.day_name() replaces it.
df['day_of_week'] = df['date'].dt.day_name()
df['day_of_week'].value_counts().plot(kind='bar', title='day of week of stabbing')
plt.show()
# Join all comments into one string and build the word cloud from it.
text = df['comment'].str.cat(sep=' ')
stopwords = set(STOPWORDS)
wordcloud = WordCloud(background_color="green", stopwords=stopwords).generate(text)
# Display the word cloud image without axes.
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()
Google Trends for 'iPhone slow' clearly shows a periodic spike in interest.
These spikes occur in September 2013, September 2014, September 2015, ....
The release schedule for iPhones is September 2013, September 2014, September 2015, ....
Is it a coincidence that people search for 'iPhone slow' around the time of a new iPhone release? It seems not. Apple have now admitted (or partially admitted) that they do slow down older models. According to a BBC report, they claim it was "to prolong the life of the devices". Other people suspect Apple deliberately slowed older models just before the release of each new iPhone to encourage iPhone users to upgrade.
In the Google Trends graph above, searches for 'iPhone slow' used to spike rapidly and then gradually fall away, but in 2017 the pattern changed: the searches didn't fall away, they continued and grew stronger. Perhaps more people were becoming aware of what Apple has been up to.
Google Trends shows some increased interest in wildfires this year:
Some recent headlines include:
"Southern California wildfires trigger mass destruction, hurting families, economy", Fox News
"California wildfires by the numbers: $177M spent, more than 1,000 structures destroyed", CNN
"Christmas wildfires: How climate change puts California at risk all year round", the Independent
Source of Dataset used in this analysis: Kaggle.
citation for data: Short, Karen C. 2017. Spatial wildfire occurrence data for the United States, 1992-2015 [FPA_FOD_20170508]. 4th Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2013-0009.4
2017 seems to have been a bad year for California. Is this part of a trend? The dataset covers the period 1992 to 2015. If we plot the number of wildfires for each year in California in that period we get:
The data does not suggest a strong trend either up or down. Of course, we are missing data for 2016 and 2017, so a trend may have started developing in the last 3 years or so.
It is not just the number of fires that is important; the size of the fires matters too. How has this changed over the two decades covered by the data?
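Both questions, fires per year and average fire size per year, come down to one group-by. The sketch below uses a tiny made-up table. The column names `STATE`, `FIRE_YEAR` and `FIRE_SIZE` are assumed to match the Kaggle wildfire dataset and should be checked against the actual file.

```python
import pandas as pd

# Hypothetical rows for illustration; column names are assumed.
fires = pd.DataFrame({
    "STATE": ["CA", "CA", "CA", "OR", "CA"],
    "FIRE_YEAR": [1992, 1992, 1993, 1993, 1993],
    "FIRE_SIZE": [1.0, 3.0, 10.0, 5.0, 20.0],
})

# Keep only California, then count fires and average their size per year.
california = fires[fires["STATE"] == "CA"]
per_year = california.groupby("FIRE_YEAR")["FIRE_SIZE"].agg(
    fire_count="count",
    mean_size="mean",
)
print(per_year)
```

Calling `per_year["fire_count"].plot(kind="bar")` or `per_year["mean_size"].plot()` would then give charts of the kind discussed here.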
As with the number of fires, the average fire size does not show a definite trend. The third headline above claims that climate change puts California at risk of wildfires all year round. To test this we can extract data for two years, say 1993 and 2013 (20 years apart), and see if there is any difference in the number of fires per month:
Looking at just these two years we can say that the distribution of wildfires during the year is different. In 2013 wildfires are slightly more evenly distributed through the year while in 1993 they are slightly more concentrated in the summer months.
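The month-by-month comparison of two years can be sketched as a cross-tabulation. The rows below are hypothetical, and the column names `FIRE_YEAR` and `DISCOVERY_DOY` (day of year the fire was discovered) are assumed to match the dataset.

```python
import pandas as pd

# Hypothetical rows for illustration; column names are assumed.
fires = pd.DataFrame({
    "FIRE_YEAR": [1993, 1993, 1993, 2013, 2013, 2013],
    "DISCOVERY_DOY": [15, 200, 210, 20, 100, 330],
})

# Build a date from year + zero-padded day of year, e.g. "1993015".
date_text = (
    fires["FIRE_YEAR"].astype(str)
    + fires["DISCOVERY_DOY"].astype(str).str.zfill(3)
)
fires["month"] = pd.to_datetime(date_text, format="%Y%j").dt.month

# Rows are months, columns are years, cells are fire counts.
counts = pd.crosstab(fires["month"], fires["FIRE_YEAR"])

# Share of each year's fires falling in each month, so that years with
# different totals can be compared directly.
shares = counts / counts.sum()
print(shares)
```

Comparing shares rather than raw counts matters when the two years have different totals. With only two years the result is still only suggestive, and repeating the comparison across all years would be a stronger test.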
The majority of wildfires in CA are the result of human activity including arson. Natural fires due to lightning account for less than 15% of all wildfires in CA.
Is it possible to predict if a fire was started maliciously using Machine Learning?
The simple answer is yes. Using a Random Forest algorithm it is possible to get an accuracy of over 92% (for the data in the dataset). The algorithm uses the year, month and day of the week plus the latitude and longitude of the location where the fire started to predict if the fire was the result of arson.
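A rough sketch of this kind of classifier, using scikit-learn's `RandomForestClassifier`. Everything about the data here is an assumption: the rows are randomly generated, the column names are made up, and the label follows an arbitrary rule so that the example runs on its own. It does not reproduce the 92% figure and is not the original analysis code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the real features would come from the dataset.
rng = np.random.default_rng(0)
n_rows = 500
fires = pd.DataFrame({
    "year": rng.integers(1992, 2016, n_rows),
    "month": rng.integers(1, 13, n_rows),
    "day_of_week": rng.integers(0, 7, n_rows),
    "latitude": rng.uniform(32.5, 42.0, n_rows),
    "longitude": rng.uniform(-124.4, -114.1, n_rows),
})

# Hypothetical label rule, used only so the model has something to learn.
fires["is_arson"] = (fires["latitude"] > 37.0).astype(int)

FEATURES = ["year", "month", "day_of_week", "latitude", "longitude"]

# Hold back a quarter of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    fires[FEATURES], fires["is_arson"], test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

Accuracy alone can flatter a model when one class is much more common than the other, so on real data it is worth comparing the score against the share of the majority class.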
Does AI pose a threat to society?
What is AI?
AI is an approach to problem solving which differs significantly from traditional computing.
Say we have a robot in an empty room and we want it to find the door and leave the room. The traditional computing approach to this problem would require programming the robot with specific instructions such as move forward 5 units, turn right by ninety degrees and so on. This approach will work, but only for one starting position. It also requires precise knowledge about the location of the door and the starting point of the robot. The AI approach is to give the robot the ability to solve the problem by itself, so that the solution works for all starting positions. In this case machine vision might be one possible solution. The robot visually scans the room and the AI attempts to distinguish a door from walls and windows. Once it recognises a door, it moves in that direction.
This kind of AI is not self-aware nor does it understand the concept of a door or the concept of leaving a room. It was trained to recognise doors and once the door is identified it will attempt to move in that direction.
Warnings about AI
Bill Gates and Stephen Hawking have both issued warnings about AI becoming too powerful. Earlier this year there was also a Twitter spat between Elon Musk and Mark Zuckerberg over the risks of AI, followed by a letter to the UN, signed by a number of leading researchers and leaders in the tech industry, warning of the dangers of weaponised AI.
Narrow v General
While the concerns being raised about AI becoming too powerful are sensible, I think we are still a long way from the self-aware AI of science fiction movies which rises up and builds an army of human-hating killing machines. The AI we have now is narrow intelligence: at best it can perform a task to a level which is as good as or better than a human expert. Much of the best AI we have is built around neural networks and the back-propagation algorithm. We will never get to general AI (self-aware machines) from back-propagation. Geoffrey Hinton, one of the researchers who popularised the back-propagation algorithm, recently said: 'my view is throw it all away and start again'.
This, however, does not mean that things like machine learning and neural nets will have no negative impacts on us. In the next fifteen to thirty years AI could directly affect millions of workers through the loss of jobs to machines. A recent report from PwC predicts that up to 30% of existing jobs in the UK could be automated out of existence, and other industrialised countries can expect similar effects or worse. One report from the previous US administration put the figure closer to 50% in the US. If around one third to half of the working-age population suddenly finds itself unemployed and possibly unemployable, how will they react?
AI is also being used in areas such as law enforcement and banking. In the future it may be an algorithm, not a person, that decides if you are eligible for a loan or if you are likely to commit a crime.
Artificial stupidity is perhaps more dangerous than artificial intelligence, at least in the near term: badly designed and poorly tested algorithms making decisions that impact people in the real world. Good old-fashioned human ignorance also poses risks, with politicians, business people and the military making decisions about the use of AI even though they don't understand the tech. There is also the danger of AI being hacked, and of hackers developing their own AI.
The way forward
My answer to the title question, 'does AI pose a threat to society?', is yes, it does. But it also offers many potential positives, such as advances in medicine, engineering and business.
In the early 19th century a group of English workers, the Luddites, attempted to stop progress by smashing the machinery that was taking away their livelihood. They failed, and the technology destroyed their ability to earn money. AI will not be stopped. It will change the world whether people and governments are ready or not.
The OECD measures life satisfaction across a number of countries. One of their findings is a link between education and satisfaction: people with more education are more satisfied. You can download the OECD data and test this for yourself. The graph below shows life satisfaction plotted against years spent in education. There is some indication of a positive correlation, but the coefficient is only 0.38 (measured using Spearman's rank correlation).
There is a lot of scatter. If instead we look at earnings and satisfaction, the correlation coefficient is 0.74.
The correlation between earnings and education, at 0.41, is also stronger than the correlation between satisfaction and education.
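The three coefficients above can be computed in one step with pandas. This is a sketch on made-up numbers: the column names and values are hypothetical, so the output will not match the 0.38, 0.74 and 0.41 quoted above.

```python
import pandas as pd

# Hypothetical per-country values for illustration only.
countries = pd.DataFrame({
    "education_years": [10, 12, 14, 16, 18],
    "earnings": [20, 25, 24, 40, 50],
    "life_satisfaction": [5.0, 6.0, 5.5, 7.0, 7.5],
})

# Spearman correlation works on ranks, so it measures how consistently
# one variable rises with another, not whether the link is a straight line.
corr = countries.corr(method="spearman")
print(corr.round(2))
```

Reading the pairwise table side by side makes it easier to see whether education or earnings tracks satisfaction more closely. Correlation on its own cannot show which variable is driving the others.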
So perhaps the apparent correlation between Life Satisfaction and Education is actually due to a stronger correlation between wealth and satisfaction, the link being the more education you have the more likely you are to earn more. Maybe we are more materialistic than we like to admit.
This word cloud was generated by scraping Trump's Twitter account for July and August:
Compare to a word cloud from May:
'Great' is a keyword for Trump. Other recurring themes and key words include America(n), job, media, fake, news, Russia(n), election, healthcare/Obamacare and thank. Trump's Twitter account tends to be more positive than negative. Apart from Fox he doesn't trust the media, so Twitter is his main way of communicating his message, and that message is more positive than negative.