Handling Time in Python
In this post, I'll summarize a generic way to find the number of seconds, minutes, hours or days between two points in time in Python, using functions from the powerful pandas and numpy libraries. Thanks to the brilliant R package reticulate, I can run Python code from R and write this post comfortably in R Markdown.
Preparing the environment
Loading required R libraries
# Install any packages that are missing, then load all of them
ipak <- function(pkg) {
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg))
    install.packages(new.pkg, dependencies = TRUE)
  try(sapply(pkg, require, character.only = TRUE), silent = TRUE)
}
packages <- c("reticulate")
ipak(packages)
## Warning: package 'reticulate' was built under R version 4.1.2
## reticulate
## TRUE
Set up the R-Python environment
library(reticulate)
# use_python(Sys.which("python"))
# use_condaenv("py3.7", required = TRUE)
use_condaenv("r-reticulate")
Install required Python libraries
py_install("pandas") # install pandas if it is not available
py_install("numpy")
Import required Python libraries
import pandas as pd
import numpy as np
Timestamps in pandas
In pandas, we can access the current time easily via
timestamp_now_UTC = pd.Timestamp.now(tz="UTC")
We can also construct an arbitrary timestamp by specifying the year, month, day, hour, minute and second, along with a time zone:
timestamp_UTC = pd.Timestamp(year=2020, month=8, day=1, hour=1, minute=1, second=25, tz="UTC")
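The same timestamp can also be built from an ISO 8601 string, which is often shorter to write. A minimal sketch, with illustrative variable names that are not part of the original example:

```python
import pandas as pd

# Keyword form, as in the example above
ts_keywords = pd.Timestamp(year=2020, month=8, day=1, hour=1, minute=1, second=25, tz="UTC")

# String form; tz="UTC" attaches the time zone to the parsed value
ts_string = pd.Timestamp("2020-08-01T01:01:25", tz="UTC")

# Both forms should describe the same instant
print(ts_keywords == ts_string)
```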
Subtracting one timestamp from the other gives the difference between them as a pandas Timedelta:
# subtract the earlier timestamp from the later one
timestamp_diff = timestamp_now_UTC - timestamp_UTC
print(timestamp_diff)
## 557 days 05:28:08.942866
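Because the current time changes on every run, a fixed pair of timestamps makes the subtraction easier to check. The sketch below assumes a hypothetical end time of 2022-02-09 06:29:33.942866 UTC, chosen so that the difference matches the output printed above; the variable names are illustrative.

```python
import pandas as pd

# Start time from the example above
start = pd.Timestamp(year=2020, month=8, day=1, hour=1, minute=1, second=25, tz="UTC")

# Hypothetical fixed "now", chosen to reproduce the printed difference
end = pd.Timestamp("2022-02-09T06:29:33.942866", tz="UTC")

diff = end - start  # a pandas Timedelta

# A Timedelta stores whole days, leftover seconds and leftover microseconds separately
print(diff)
print(diff.days, diff.seconds, diff.microseconds)
```

The separate days, seconds and microseconds fields show why the printed value is not yet a single number in one unit.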
However, the result is a Timedelta split into days and a clock time, not a single number expressed in days, hours or minutes. pandas can produce such numbers for some cases (for example with the Timedelta method total_seconds()), but a generic approach that works the same way for any unit is to leverage some functions in numpy.
Convert time difference into numbers using numpy
To express the difference in hours:
# to_numpy() gives a numpy timedelta64; cast it to whole hours, then to an integer
# (int64 avoids the overflow that int32 would hit for very long spans)
UNIT = "h"
timestamp_diff_h = timestamp_diff.to_numpy().astype(f"timedelta64[{UNIT}]").astype(np.int64)
print(timestamp_diff_h)
# the same whole-hour count as a float
# timestamp_diff_h_float = timestamp_diff.to_numpy().astype(f"timedelta64[{UNIT}]").astype(np.float64)
# print(timestamp_diff_h_float)
## 13373
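Since only the unit code changes between conversions, the cast can be wrapped in a small helper. This is a sketch rather than a definitive implementation: the function name timedelta_to_int is hypothetical, and the Timedelta is built directly from the difference printed earlier so the example is reproducible.

```python
import numpy as np
import pandas as pd


def timedelta_to_int(delta: pd.Timedelta, unit: str) -> int:
    """Return the whole number of units in delta.

    unit is a numpy timedelta64 code such as "D", "h", "m" or "s".
    """
    as_unit = delta.to_numpy().astype(f"timedelta64[{unit}]")
    return int(as_unit.astype(np.int64))


# The difference printed earlier, written out explicitly
diff = pd.Timedelta(days=557, hours=5, minutes=28, seconds=8, microseconds=942866)

for unit in ("D", "h", "m", "s"):
    print(unit, timedelta_to_int(diff, unit))
```

For this positive difference the cast keeps only whole units, so the fractional part of the last hour, minute or second is dropped.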
To express the difference in seconds:
# same cast as before, with the unit code changed to seconds
UNIT = "s"
timestamp_diff_s = timestamp_diff.to_numpy().astype(f"timedelta64[{UNIT}]").astype(np.int64)
print(timestamp_diff_s)
# the same whole-second count as a float
# timestamp_diff_s_float = timestamp_diff.to_numpy().astype(f"timedelta64[{UNIT}]").astype(np.float64)
# print(timestamp_diff_s_float)
## 48144488
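The casts above return whole units. When the fractional part matters, one option is to divide the numpy timedelta by a one-unit timedelta, which yields a float. The sketch below uses illustrative variable names and also computes how many years of seconds fit in a 32-bit integer, which is why a 64-bit integer is the safer target for long spans.

```python
import numpy as np
import pandas as pd

# The difference printed earlier, written out explicitly
diff = pd.Timedelta(days=557, hours=5, minutes=28, seconds=8, microseconds=942866)

# Dividing one timedelta64 by another yields a plain float
hours_exact = diff.to_numpy() / np.timedelta64(1, "h")
days_exact = diff.to_numpy() / np.timedelta64(1, "D")
print(hours_exact, days_exact)

# Longest span, in years of 365.25 days, whose second count fits in int32
int32_limit_years = np.iinfo(np.int32).max / (365.25 * 24 * 3600)
print(int32_limit_years)
```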