Commit 7121f192 authored by td7g11

script to generate hours

%% Cell type:markdown id: tags:
# Process hours
1. Load the CSV file
2. Plot the weekly hour totals, split by project, as a stacked bar chart
3. List the projects worked on under each category
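
Step 3 does not appear in the cells shown in this commit. A minimal sketch of one way to do it, assuming the `Subject` column holds the project name and `Categories` holds the category, as in the export loaded later in this notebook. The toy rows are made up for illustration:

```python
import pandas as pd

# Toy stand-in for the calendar export (hypothetical rows, same column names).
toy = pd.DataFrame({
    "Subject": ["Swimming - Paper", "Swimming", "IP meeting", "Swimming - Paper"],
    "Categories": ["Swimming", "Swimming", "Admin", "Swimming"],
})

# For each category, collect the distinct subjects in order of first appearance.
projects_by_category = toy.groupby("Categories")["Subject"].unique()

for category, subjects in projects_by_category.items():
    print(category, "->", ", ".join(subjects))
```

The same `groupby(...)["Subject"].unique()` call should work on the filtered `df` once rows without a category have been dropped.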
%% Cell type:code id: tags:
``` python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
```
%% Cell type:code id: tags:
``` python
# load csv file with information
project_hours_file = pd.read_csv('C:\\Users\\td7g11\\Documents\\20200203_export.CSV', encoding = "ISO-8859-1")
project_hours_file.head()
```
%% Output
Subject Start Date Start Time \
0 INTERVIEW SEMINAR: Dr Jonathan Belnoue "Physic... 23/1/2020 10:00:00
1 INTERVIEW SEMINAR: Dr Chris Holmes "Making air... 24/1/2020 14:00:00
2 INTERVIEW SEMINAR: Dr Meisam Jalalvand "Fibre ... 24/1/2020 10:00:00
3 INTERVIEW SEMINAR: Dr Prodip K Das "Fabricatio... 23/1/2020 14:00:00
4 INTERVIEW SEMINAR: Dr Dezong Zhao "Low Carbon ... 22/1/2020 14:00:00
End Date End Time All day event Reminder on/off Reminder Date \
0 23/1/2020 11:00:00 False False 23/1/2020
1 24/1/2020 15:00:00 False False 24/1/2020
2 24/1/2020 11:00:00 False False 24/1/2020
3 23/1/2020 15:00:00 False False 23/1/2020
4 22/1/2020 15:00:00 False False 22/1/2020
Reminder Time Meeting Organizer ... Meeting Resources \
0 09:45:00 Mechanical Engineering ... NaN
1 13:45:00 Mechanical Engineering ... NaN
2 09:45:00 Mechanical Engineering ... NaN
3 13:45:00 Mechanical Engineering ... NaN
4 13:45:00 Mechanical Engineering ... NaN
Billing Information Categories \
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
Description Location Mileage \
0 Dear all,\r\n\r\n \r\n\r\nDr Jonathan Belnoue ... 100 / 5017 NaN
1 Dear all,\r\n\r\n \r\n\r\nDr Chris Holmes will... 85 / 2207 NaN
2 Dear all,\r\n\r\n \r\n\r\nQuick reminder: the ... 67 / 1001 NaN
3 Dear all,\r\n\r\n \r\n\r\nDr Prodip Das will b... 100 / 5013 NaN
4 Dear all,\r\n\r\n \r\n\r\nDr Dezong Zhao will ... 4 / 4005 NaN
Priority Private Sensitivity Show time as
0 Normal False Normal 1
1 Normal False Normal 1
2 Normal False Normal 1
3 Normal False Normal 1
4 Normal False Normal 1
[5 rows x 22 columns]
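%% Cell type:markdown id: tags:
The export has 22 columns, but the later cells only use a handful of them. A hedged sketch of loading just those columns follows; `load_export` and `NEEDED` are hypothetical names that are not part of this notebook, and the temporary file stands in for the real export:

```python
import os
import tempfile

import pandas as pd

# Columns the later cells rely on (hypothetical constant name).
NEEDED = ["Subject", "Start Date", "Start Time", "End Date", "End Time", "Categories"]


def load_export(path):
    """Read only the needed columns from the calendar CSV export."""
    return pd.read_csv(path, encoding="ISO-8859-1", usecols=NEEDED)


# Write a tiny made-up export to a temporary file so the sketch runs anywhere.
sample = (
    "Subject,Start Date,Start Time,End Date,End Time,Categories,Location\n"
    "Café meeting,14/1/2020,10:00:00,14/1/2020,12:00:00,Admin,Room 1\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="ISO-8859-1") as handle:
    handle.write(sample)
    tmp_path = handle.name

toy = load_export(tmp_path)
os.remove(tmp_path)
print(toy.columns.tolist())
```

Passing the same encoding on write and read is what keeps the accented character intact; `usecols` simply skips the unused columns at parse time.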
%% Cell type:code id: tags:
``` python
# keep only events with a category assigned, then build Start/End timestamps
df = project_hours_file.dropna(axis=1, how='all')  # drop columns that are entirely empty
df = df[df['Categories'].notnull()].copy()  # copy so new columns are not set on a view
fmt = "%d/%m/%Y %H:%M:%S"
df['Start'] = pd.to_datetime(df['Start Date'] + " " + df['Start Time'], format=fmt)
df['End'] = pd.to_datetime(df['End Date'] + " " + df['End Time'], format=fmt)
# event length in hours
df['Length'] = (df['End'] - df['Start']) / pd.Timedelta(hours=1)
# ISO week number as a zero-padded string, e.g. "03"
df['Week No'] = df['Start'].dt.strftime("%V")
df.sort_values('Week No', ascending=True)
```
%% Output
Subject Start Date Start Time End Date End Time \
11 Skeleton - helmet cap 14/1/2020 10:00:00 14/1/2020 12:00:00
12 Swimming - Paper 14/1/2020 08:00:00 14/1/2020 10:00:00
13 Swimming - Paper 14/1/2020 12:30:00 14/1/2020 16:30:00
14 Swimming - Paper 15/1/2020 08:00:00 15/1/2020 12:00:00
15 Swimming - Paper 16/1/2020 08:00:00 16/1/2020 12:00:00
.. ... ... ... ... ...
118 Swimming - Paper 9/3/2020 09:00:00 9/3/2020 10:30:00
87 IP meeting 10/3/2020 11:00:00 10/3/2020 11:30:00
58 MSc meetings 10/3/2020 10:00:00 10/3/2020 11:00:00
120 Swimming 10/3/2020 08:30:00 10/3/2020 10:00:00
121 Swimming 10/3/2020 11:30:00 10/3/2020 16:30:00
All day event Reminder on/off Reminder Date Reminder Time \
11 False False 14/1/2020 09:45:00
12 False False 14/1/2020 07:45:00
13 False False 14/1/2020 12:15:00
14 False False 15/1/2020 07:45:00
15 False False 16/1/2020 07:45:00
.. ... ... ... ...
118 False False 9/3/2020 08:45:00
87 False True 10/3/2020 10:45:00
58 False False 10/3/2020 09:45:00
120 False False 10/3/2020 08:15:00
121 False False 10/3/2020 11:15:00
Meeting Organizer ... Description \
11 NaN ... NaN
12 Dickson T.A.J. ... NaN
13 Dickson T.A.J. ... \r\n
14 Dickson T.A.J. ... NaN
15 Dickson T.A.J. ... \r\n
.. ... ... ...
118 Dickson T.A.J. ... \r\n
87 Turnock S.R. ... \r\n\r\n
58 Weymouth G.D. ... MSc weekly meeting. Konrad needs to skip this ...
120 Dickson T.A.J. ... \r\n
121 Dickson T.A.J. ... \r\n
Location Priority Private Sensitivity Show time as \
11 NaN Normal False Normal 2
12 NaN Normal False Normal 2
13 NaN Normal False Normal 2
14 NaN Normal False Normal 2
15 NaN Normal False Normal 2
.. ... ... ... ... ...
118 NaN Normal False Normal 2
87 176/3009 Normal False Normal 2
58 Gabe's office Normal False Normal 2
120 NaN Normal False Normal 2
121 NaN Normal False Normal 2
Start End Length Week No
11 2020-01-14 10:00:00 2020-01-14 12:00:00 2.0 03
12 2020-01-14 08:00:00 2020-01-14 10:00:00 2.0 03
13 2020-01-14 12:30:00 2020-01-14 16:30:00 4.0 03
14 2020-01-15 08:00:00 2020-01-15 12:00:00 4.0 03
15 2020-01-16 08:00:00 2020-01-16 12:00:00 4.0 03
.. ... ... ... ...
118 2020-03-09 09:00:00 2020-03-09 10:30:00 1.5 11
87 2020-03-10 11:00:00 2020-03-10 11:30:00 0.5 11
58 2020-03-10 10:00:00 2020-03-10 11:00:00 1.0 11
120 2020-03-10 08:30:00 2020-03-10 10:00:00 1.5 11
121 2020-03-10 11:30:00 2020-03-10 16:30:00 5.0 11
[108 rows x 24 columns]
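%% Cell type:markdown id: tags:
The `%V` directive used for the week number is passed through to the platform's C library, so its availability can vary between systems. A minimal sketch of the same computation using `Series.dt.isocalendar()` instead, assuming pandas 1.1 or later. The two events are made up, but use the export's day-first date format:

```python
import pandas as pd

# Two made-up events using the export's day-first date format.
toy = pd.DataFrame({
    "Start Date": ["14/1/2020", "10/3/2020"],
    "Start Time": ["10:00:00", "11:00:00"],
    "End Date": ["14/1/2020", "10/3/2020"],
    "End Time": ["12:00:00", "11:30:00"],
})

fmt = "%d/%m/%Y %H:%M:%S"
toy["Start"] = pd.to_datetime(toy["Start Date"] + " " + toy["Start Time"], format=fmt)
toy["End"] = pd.to_datetime(toy["End Date"] + " " + toy["End Time"], format=fmt)

# Duration in hours: dividing a timedelta Series by one hour gives floats.
toy["Length"] = (toy["End"] - toy["Start"]) / pd.Timedelta(hours=1)

# ISO week number as an integer, then zero-padded to match the "%V" strings.
iso_week = toy["Start"].dt.isocalendar().week
toy["Week No"] = iso_week.astype(int).astype(str).str.zfill(2)

print(toy[["Start", "Length", "Week No"]])
```

Zero-padding keeps the string sort order identical to the numeric order, which matters because the notebook sorts and groups on `Week No` as text.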
%% Cell type:code id: tags:
``` python
# total time per category
total_grouped = df.groupby(['Categories'])['Length'].sum()
total_grouped
```
%% Output
Categories
Admin              60.75
Sailing            41.00
Sailing;Admin       3.50
Skeleton           88.50
Skeleton;Admin      3.50
Swimming          123.75
Name: Length, dtype: float64
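%% Cell type:markdown id: tags:
Events tagged with two categories, such as `Sailing;Admin`, show up as groups of their own. If they should count towards the individual categories instead, one option is to split the label and share the hours. The sketch below divides each event's hours equally between its labels, which is an assumption rather than anything this notebook does, and the rows are made up:

```python
import pandas as pd

# Made-up rows; the second event carries two categories separated by ";".
toy = pd.DataFrame({
    "Categories": ["Sailing", "Sailing;Admin", "Admin"],
    "Length": [4.0, 1.0, 2.0],
})

# Split the label string into a list and count how many labels each event has.
labels = toy["Categories"].str.split(";")
toy = toy.assign(Category=labels, n_labels=labels.str.len())

# One row per (event, label); share the event's hours equally between labels.
exploded = toy.explode("Category")
exploded["Share"] = exploded["Length"] / exploded["n_labels"]

split_totals = exploded.groupby("Category")["Share"].sum()
print(split_totals)
```

Sharing the hours, rather than counting them in full for every label, keeps the grand total equal to the hours actually worked.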
%% Cell type:code id: tags:
``` python
plt.figure()
plt.bar(total_grouped.index, total_grouped.values)
plt.xlabel("Category")
plt.ylabel("Hours spent")
plt.title("Time since 15/01/2020")
plt.show()
```
%% Output
%% Cell type:code id: tags:
``` python
# weekly hours per category (the data behind the stacked bar chart)
weekly_times = df.groupby(['Categories', 'Week No'])['Length'].sum()
weekly_times
```
%% Output
Categories      Week No
Admin           03          1.50
                04          2.50
                05          2.50
                06         15.75
                07          7.00
                08          0.50
                09          7.50
                10         23.50
Sailing         04          1.00
                05         17.50
                06          8.00
                07          5.00
                08          0.50
                09          8.00
                10          0.50
                11          0.50
Sailing;Admin   06          0.50
                07          0.50
                08          0.50
                09          0.50
                10          0.50
                11          1.00
Skeleton        03          6.00
                04         39.00
                05         22.25
                06         13.75
                07          5.00
                10          2.50
Skeleton;Admin  07          3.50
Swimming        03         18.50
                07         16.00
                08         37.25
                09         23.50
                10         14.00
                11         14.50
Name: Length, dtype: float64
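%% Cell type:markdown id: tags:
The grouped result above is in long form, with one row per category and week. A small sketch of reshaping it into the wide table that a stacked bar chart needs, using `Series.unstack` on made-up hours:

```python
import pandas as pd

# Made-up hours in the same long layout as the grouped result above.
toy = pd.DataFrame({
    "Categories": ["Admin", "Admin", "Swimming", "Swimming"],
    "Week No": ["03", "04", "03", "03"],
    "Length": [1.5, 2.5, 4.0, 2.0],
})

weekly = toy.groupby(["Categories", "Week No"])["Length"].sum()

# Move the category level into columns; weeks with no hours become 0.
wide = weekly.unstack("Categories", fill_value=0.0)
print(wide)
```

This gives the same layout as `pivot_table` with `aggfunc='sum'` and `fill_value=0.`: weeks as rows, categories as columns.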
%% Cell type:code id: tags:
``` python
# pivot to one row per week and one column per category,
# then draw a stacked bar chart of weekly hours
fig, ax = plt.subplots(1, 1)
pt_weekly_numbers = df.pivot_table(index='Week No',
                                   columns='Categories',
                                   values='Length',
                                   fill_value=0.,
                                   aggfunc='sum')
pt_weekly_numbers.plot.bar(ax=ax, stacked=True)
# dashed red reference line at 37 hours per week
ax.axhline(y=37., color='r', linestyle='--', lw=2)
```
%% Output
<matplotlib.lines.Line2D at 0x28dd9ce21c8>
%% Cell type:code id: tags:
``` python
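# express each entry as a percentage of its column (category) total;
# DataFrame.apply works column-wise by default
pts_pt_weekly_numbers = pt_weekly_numbers.apply(lambda x: 100 * x / float(x.sum()))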
```
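%% Cell type:markdown id: tags:
`DataFrame.apply` runs column by column unless told otherwise, so the cell above expresses each week as a percentage of a category's total. If the aim is instead the share of each week's hours taken by each category, the table has to be normalised by row. A sketch of both on a made-up table; the variable names are illustrative only:

```python
import pandas as pd

# Made-up weekly table: rows are weeks, columns are categories.
wide = pd.DataFrame(
    {"Admin": [1.0, 3.0], "Swimming": [3.0, 1.0]},
    index=pd.Index(["03", "04"], name="Week No"),
)

# Column-wise: each column sums to 100 (what apply does by default).
pct_of_category = wide.apply(lambda col: 100 * col / col.sum())

# Row-wise: each week sums to 100.
pct_of_week = wide.div(wide.sum(axis=1), axis=0) * 100

print(pct_of_week)
```

Which of the two is appropriate depends on the question being asked; the rest of the diff is collapsed, so the intended use of the percentages is not visible here.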