The data comes from the university ranking datasets I mentioned in the last post. The CWUR dataset includes the number of patents for each university. I wanted to group the universities by country and total the patents for each country. The final table should be viewable in a browser.
Combining Pandas and Flask makes the task relatively simple. I based my code on the following example. The code:
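from flask import Flask, render_template
import pandas as pd
app = Flask(__name__)

@app.route('/')  # assumed: the route and view name are missing from the listing
def report():
    # Load the CWUR data and total the patents for each country
    df = pd.read_csv('cwurData.csv')
    df = df.groupby(by=['country'])['patents'].sum().to_frame()
    # to_html() returns the table markup as a string for the template
    return render_template('report.html', df=df.to_html())

if __name__ == '__main__':
    app.run()  # assumed: the body of this block is missing from the listing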
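The report.html template is not shown in this post. As a rough sketch of what it needs to do, the version below inlines a hypothetical template with render_template_string so it runs without a templates folder. The template text, the '/' route and the sample rows are all assumptions made for illustration. The one detail I am sure of is the safe filter: Flask escapes HTML in template variables by default, so without it the table markup would show up as literal text.

```python
from flask import Flask, render_template_string
import pandas as pd

app = Flask(__name__)

# Hypothetical stand-in for templates/report.html (not the template used in the post)
REPORT_TEMPLATE = """<!doctype html>
<title>Patents by country</title>
<h1>Patents by country</h1>
{{ df | safe }}
"""


@app.route('/')  # assumed route
def report():
    # Made-up rows standing in for cwurData.csv
    df = pd.DataFrame({
        'country': ['USA', 'USA', 'Japan'],
        'patents': [5, 1, 2],
    })
    totals = df.groupby(by=['country'])['patents'].sum().to_frame()
    # "| safe" in the template stops Jinja from escaping the <table> markup
    return render_template_string(REPORT_TEMPLATE, df=totals.to_html())


if __name__ == '__main__':
    app.run()
```

With the real files in place, the same template text would go in templates/report.html and the view would call render_template('report.html', ...) as in the code above.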
The output could be improved with CSS, but the raw table meets my basic requirements.
The line df = df.groupby(by=['country'])['patents'].sum().to_frame() does most of the work. The groupby() function is similar to GROUP BY in Oracle SQL. On their own, groupby() and sum() produce a Series, but to_html() is a DataFrame method, so I used to_frame() to convert the result. The example code I linked to above explains how the Flask side works.
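To see the Series and DataFrame steps separately, here is a small sketch that does not need the CSV file. The rows are made up for illustration, and the 'report' class name is a hypothetical hook for a stylesheet, not something from the original code.

```python
import pandas as pd

# Made-up rows standing in for cwurData.csv; only the two columns used here
df = pd.DataFrame({
    'country': ['USA', 'USA', 'Japan', 'Japan', 'France'],
    'patents': [5, 1, 2, 7, 4],
})

# Selecting one column and summing gives a Series indexed by country
totals = df.groupby(by=['country'])['patents'].sum()
print(type(totals).__name__)

# to_frame() wraps the Series in a one-column DataFrame
table = totals.to_frame()
print(type(table).__name__)

# classes= adds a CSS class to the <table> tag so a stylesheet can target it
html = table.to_html(classes='report')
print(html)
```

The classes argument is one way to start on the CSS improvements: a rule for table.report in a stylesheet would then apply to the generated table.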