Discovering Data

Using NLTK to analyse vocabulary in political manifestos

6/5/2017

The technical process:
  1. Download the manifesto PDF files.
  2. Convert the PDF files to text files.
  3. Use NLTK to remove stopwords and find the most common words in each manifesto.
This required two scripts. The first converted the PDF files to text files; I used the script here. The second script, which does the NLTK analysis, is on my code blog.
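
For readers who want to try something similar, here is a minimal sketch of steps 2 and 3. It is not the script from the code blog. It rests on several assumptions: pdfminer.six is used for the PDF conversion (it may not be the converter referred to above), punctuation is stripped before counting (tokens such as 'irelands' in the results suggest something like this happened), and the function names pdf_to_text, load_stopwords and common_words are made up for illustration. It uses collections.Counter, which nltk.FreqDist subclasses, so most_common behaves the same way in both.

```python
import re
from collections import Counter
from pathlib import Path


def pdf_to_text(pdf_path):
    """Step 2: write the text of a PDF next to it as a .txt file.

    Assumption: uses pdfminer.six (pip install pdfminer.six), which may
    not be the converter used for the results in this post.
    """
    from pdfminer.high_level import extract_text  # lazy import: optional dependency
    txt_path = Path(pdf_path).with_suffix(".txt")
    txt_path.write_text(extract_text(str(pdf_path)), encoding="utf-8")
    return txt_path


def load_stopwords():
    """Return NLTK's English stopword list as a set.

    Needs nltk installed and a one-off nltk.download('stopwords').
    """
    from nltk.corpus import stopwords  # lazy import so the rest runs without NLTK
    return set(stopwords.words("english"))


def common_words(text, stop_words, n=30):
    """Step 3: return the n most frequent non-stopword tokens as (word, count) pairs."""
    # Lowercase and strip punctuation (assumption: "Ireland's" becomes "irelands").
    cleaned = re.sub(r"[^\w\s]", "", text.lower())
    # Split on whitespace and drop stopwords.
    tokens = [word for word in cleaned.split() if word not in stop_words]
    # Counter.most_common gives the same shape of output as the lists in the results.
    return Counter(tokens).most_common(n)
```

With a converted file in hand, something like common_words(Path("dup.txt").read_text(), load_stopwords()) would produce a list in the same format as the results below, where dup.txt is a hypothetical file name.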

The results (the 30 most common words in each manifesto, with their counts):
DUP
[('northern', 147), ('ireland', 102), ('n', 99), ('dup', 88), ('new', 40), 
('westminster', 29), ('uk', 26), ('support', 25), ('public', 25), ('national', 21), 
('people', 18), ('united', 18), ('deal', 17), ('needs', 17), ('manifesto', 16), 
('trade', 16), ('must', 14), ('also', 14), ('best', 14), ('wminster', 13), ('eu', 13), 
('across', 13), ('irelands', 13), ('economic', 13), ('identity', 13), ('military', 12), 
('would', 12), ('better', 12), ('positive', 12), ('believes', 12)]


UUP
[('northern', 51), ('ireland', 33), ('ulster', 23), ('unionist', 22), ('best', 18), 
('united', 16), ('party', 16), ('brexit', 13), ('westminster', 13), ('mps', 12), 
('local', 12), ('union', 12), ('work', 11), ('manifesto', 11), ('one', 11), ('remain', 9), 
('great', 8), ('need', 8), ('many', 8), ('yet', 8), ('would', 8), ('public', 7), 
('south', 7), ('tom', 7), ('executive', 7), ('kingdom', 7), ('people', 7), ('election', 7), 
('danny', 6), ('better', 6)]

Alliance
[('northern', 170), ('alliance', 110), ('ireland', 103), ('uk', 75), 
('support', 65), ('ensure', 51), ('would', 43), ('change', 40), ('public', 37), 
('economic', 35), ('also', 32), ('westminster', 32), ('work', 30), ('government', 30), 
('european', 29), ('direction', 27), ('need', 27), ('deal', 26), ('across', 26), 
('executive', 26), ('international', 25), ('political', 25), ('brexit', 24), 
('must', 24), ('eu', 24), ('significant', 23), 
('continue', 23), ('tax', 23), ('human', 22), ('welfare', 22)]

Sinn Fein
[('brexit', 44), ('eu', 35), ('sinn', 32), ('rights', 30), ('tory', 25), ('health', 23), 
('north', 23), ('irish', 19), ('access', 18), ('european', 17), ('vote', 16), 
('within', 16), ('party', 15), ('special', 14), ('services', 13), ('ireland', 13), 
('status', 13), ('election', 13), ('cuts', 12), ('funding', 12), ('designated', 12), 
('dup', 12), ('new', 11), ('unity', 10), ('trade', 9), ('priorities', 9), ('people', 9), 
('british', 8), ('good', 8), ('friday', 8)]


TUV
[('tuv', 97), ('northern', 95), ('ireland', 81), ('politics', 42), ('would', 41), 
('principled', 41), ('talking', 41), ('straight', 41), ('stormont', 35), ('sinn', 34), 
('fein', 30), ('people', 27), ('need', 27), ('dup', 22), ('government', 22), ('public', 20), 
('believes', 19), ('first', 19), ('uk', 19), ('must', 18), ('irish', 17), ('health', 16), 
('brexit', 15), ('eu', 15), ('one', 15), ('money', 15), ('assembly', 15), ('could', 14), 
('language', 14), ('mental', 14)]


SDLP
[('sdlp', 262), ('northern', 154), ('ireland', 104), ('new', 97), ('ensure', 85), 
('must', 58), ('support', 58), ('people', 55), ('believes', 51), ('also', 45), 
('investment', 39), ('public', 39), ('housing', 39), ('education', 39), ('better', 37), 
('government', 37), ('local', 35), ('development', 35), ('services', 32), ('strategy', 32), 
('economy', 31), ('across', 31), ('health', 30), ('areas', 30), ('community', 30), 
('sector', 29), ('economic', 28), ('work', 27), ('social', 27), ('future', 26)]
Note: the SDLP manifesto for the Westminster election could not be converted with the PDF-to-text script, so I used their Stormont manifesto from March 2017 instead.