Discovering Data
  • Home
  • Blog

Pandas - group by

6/9/2018


 
The dataset contains climate data. Its structure, with the first row as an example, is:

year  month  temp max  temp min  air frost  rain (mm)  hours of sunshine
1948  1      6.6       1.3       8          170.8      40.1
I want to visualise the total hours of sunshine for the summer months (June, July, August) per year. The steps involved in preparing the data were:
  1. Keep only months 6, 7 and 8 by creating a subset of the full DataFrame.
  2. Drop the columns that are not needed, which reduces the memory and processing required.
  3. Group by year and sum.
The code:
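A minimal sketch of the three steps, assuming the data is already loaded into a DataFrame called `df` with the column names shown above. The variable names (`df`, `summer`, `summer_sunshine`) are my own, and every sample row apart from the 1948 January one is made up for illustration.

```python
import pandas as pd

# Illustrative data: only the first row (1948, month 1) comes from the dataset
# shown above; the remaining rows are invented so the example can run.
df = pd.DataFrame(
    {
        "year": [1948, 1948, 1948, 1948, 1949, 1949],
        "month": [1, 6, 7, 8, 6, 12],
        "rain (mm)": [170.8, 50.0, 60.0, 70.0, 55.0, 120.0],
        "hours of sunshine": [40.1, 180.0, 200.0, 170.0, 190.0, 35.0],
    }
)

# 1. Keep only the summer months (June, July, August).
summer = df[df["month"].isin([6, 7, 8])]

# 2. Keep only the columns needed for the plot.
summer = summer[["year", "hours of sunshine"]]

# 3. Group by year and sum the hours of sunshine.
summer_sunshine = summer.groupby("year")["hours of sunshine"].sum()

print(summer_sunshine)
```

The result is a Series indexed by year, with one total per year. Selecting the two columns in step 2 is one way to drop the rest; `DataFrame.drop(columns=[...])` would work as well.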

    
The data can then be plotted.
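One possible way to draw the plot, sketched with Matplotlib through the pandas plotting interface. The chart type, labels and title are assumptions, and the yearly totals below are placeholder values rather than real results.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder yearly totals standing in for the grouped result
# (hypothetical values, not taken from the dataset).
summer_sunshine = pd.Series(
    [550.0, 610.5, 498.7],
    index=pd.Index([1948, 1949, 1950], name="year"),
    name="hours of sunshine",
)

# One bar per year; the axis labels and title are illustrative choices.
fig, ax = plt.subplots()
summer_sunshine.plot(kind="bar", ax=ax)
ax.set_xlabel("Year")
ax.set_ylabel("Hours of sunshine (June to August)")
ax.set_title("Total summer sunshine per year")
fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure in an interactive session, and `fig.savefig(...)` writes it to a file.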
[Picture: total hours of sunshine for June, July and August, per year]


