According to GOV.UK, in England, Wales and Northern Ireland:
The police can stop and question you at any time - they can search you depending on the situation.
The rules are different in Scotland.
I wanted to test for any correlation between the number of people who were stopped and searched and the number of people subsequently arrested. If stop and search is effective at catching criminals, then I would expect to see some degree of positive correlation. In this case the data is buried in PDF files, probably the worst place to get data from. There are a few Python libraries which can help when trying to extract data from PDFs, for example pdfminer and the library I used, Tabula. I chose Tabula because it is relatively easy to use when extracting tables from PDF files. The code looks like this:
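The original snippet is not reproduced here, so what follows is only a minimal sketch of how such an extraction might look. It assumes the tabula-py wrapper (which needs Java installed); the file name, page number and helper names are hypothetical, not the ones actually used.

```python
import pandas as pd


def first_table(result):
    """Return a single DataFrame from whatever read_pdf handed back.

    Older tabula-py releases return one DataFrame, newer ones return a
    list of DataFrames (one per table found), so both cases are handled.
    """
    if isinstance(result, pd.DataFrame):
        return result
    return result[0]


def read_stop_search_table(pdf_path, pages):
    """Extract the first table found on the given page(s) of a PDF."""
    # Imported here so the helper above can be used without tabula-py/Java.
    import tabula

    return first_table(tabula.read_pdf(pdf_path, pages=pages))


# Hypothetical usage; the path and page number are placeholders:
# df = read_stop_search_table("stop_and_search.pdf", pages=1)
# print(df.head())
```

It is worth printing the first few rows after every extraction, because headers and merged cells in a PDF table often come out misaligned.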
The read_pdf function returns a pandas dataframe; you need to pass in the path to the file and the page or pages to read. The library worked for me, but that is not always the case: PDFs are not easy to work with.
The data is arranged by financial year rather than calendar year, so fy_12_13 covers April 2012 to March 2013.
Plotting the numbers arrested against the numbers stopped for the six years gives
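The figure itself is not reproduced here. A scatter plot of this kind could be drawn roughly as follows; this is a sketch that assumes matplotlib, and the column names and all of the numbers are made-up placeholders, not the real figures from the PDFs.

```python
import matplotlib

matplotlib.use("Agg")  # draw off-screen; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd


def plot_arrests_vs_stops(df, stopped_col="stopped", arrested_col="arrested"):
    """Scatter the number arrested against the number stopped, one point per year."""
    fig, ax = plt.subplots()
    ax.scatter(df[stopped_col], df[arrested_col])
    ax.set_xlabel("Number of people stopped and searched")
    ax.set_ylabel("Number of people arrested")
    return ax


# Placeholder values for six years, for illustration only (NOT the real data).
example = pd.DataFrame(
    {
        "stopped": [100, 120, 90, 150, 130, 110],
        "arrested": [12, 9, 11, 13, 10, 14],
    }
)
ax = plot_arrests_vs_stops(example)
# ax.figure.savefig("arrests_vs_stops.png")  # uncomment to write the image to disk
```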
There does not appear to be a strong correlation. Pandas can give the correlation coefficient:
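The original call is not shown here; a minimal sketch, assuming the extracted numbers sit in two dataframe columns, would use `Series.corr`, which computes the Pearson coefficient by default. The column names and the numbers are hypothetical placeholders.

```python
import pandas as pd


def pearson(df, x_col, y_col):
    """Pearson correlation coefficient between two columns of a dataframe."""
    return df[x_col].corr(df[y_col])


# Placeholder values for six years, for illustration only (NOT the real data).
example = pd.DataFrame(
    {
        "stopped": [100, 120, 90, 150, 130, 110],
        "arrested": [12, 9, 11, 13, 10, 14],
    }
)
r = pearson(example, "stopped", "arrested")
print(r)
```

The placeholder numbers will not reproduce the coefficient quoted next; that figure comes from the real data extracted from the PDFs.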
This gives a value of 0.3589143729477498, so there is not much correlation. This implies that increasing the number of people who are stopped and searched would not necessarily result in more arrests.