Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have a column that represents minutes and seconds as 20:00, or 19:58.

import pandas as pd
d = {'clock': ['19:58', '20:00']}
df = pd.DataFrame(data=d)

I want to truly express this as an object in a pandas column that is minutes and seconds.

I have attempted to use pd.to_datetime(df.clock, format="%M:%S"), and that doesn't error, but a time of 20:00 (twenty minutes, zero seconds) parses as 1900-01-01 00:20:00.

I am drawing a blank on how to retain just the minutes and seconds in the result so that I can look at the time elapsed, in seconds, between records.

The trick?

I am looking at sporting data, and there are 3 periods in the game, each of which counts backwards from twenty minutes, so it's not as if I can just append the game date and move on, as each time sequence can appear 3 times (or more).

question from:https://stackoverflow.com/questions/66055312/parse-mmss-from-a-string-column-in-pandas


1 Answer

There is a pd.to_timedelta function which you could try. But it seems to expect things in hh:mm:ss format, so I had to augment the input:

>>> import pandas as pd

>>> times = ['20:00', '19:58', '00:05']
>>> deltas = pd.to_timedelta(['00:' + t for t in times])  # prepend the hours to get hh:mm:ss
>>> deltas

TimedeltaIndex(['0 days 00:20:00', '0 days 00:19:58', '0 days 00:00:05'], dtype='timedelta64[ns]', freq=None)

What I mean is that calling pd.to_timedelta(times) directly on the mm:ss strings fails with ValueError: expected hh:mm:ss format. Unfortunately, I don't see a format parameter like the one pd.to_datetime has.
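The same prefix trick can be applied directly to the DataFrame column from the question. This is only a sketch: the column names remaining and remaining_s are made up for illustration and do not come from the question.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({"clock": ["19:58", "20:00"]})

# Prepend "00:" so each value reads as hh:mm:ss, then parse the whole column.
# "remaining" is a hypothetical column name.
df["remaining"] = pd.to_timedelta("00:" + df["clock"])

# Time left on the clock, in seconds (floats).
df["remaining_s"] = df["remaining"].dt.total_seconds()
```

On a Series the timedelta methods live under the .dt accessor, whereas the TimedeltaIndex in the snippet above exposes them directly.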

You can then call total_seconds() if you want to do elapsed-time calculations:

>>> deltas.total_seconds()

Float64Index([1200.0, 1198.0, 5.0], dtype='float64')
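Since the question mentions a clock that counts down within each of several periods, the elapsed time between records could be computed per period. The sketch below assumes a period column exists; that column, the sample rows, and the names games, remaining_s and elapsed_s are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data: "period" is an assumed column, not from the question.
games = pd.DataFrame({
    "period": [1, 1, 1, 2, 2],
    "clock": ["20:00", "19:58", "19:30", "20:00", "19:45"],
})

# Seconds remaining on the clock for each record.
games["remaining_s"] = pd.to_timedelta("00:" + games["clock"]).dt.total_seconds()

# The clock counts down, so the row-to-row difference is negative; negate it
# to get elapsed seconds. The first record of each period has no predecessor
# within that period and stays NaN.
games["elapsed_s"] = -games.groupby("period")["remaining_s"].diff()
```

Grouping by period keeps the jump from the end of one period back to 20:00 in the next from being counted as elapsed time.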

I think you could still use pd.to_datetime, but you would just have to add the step of subtracting the midnight timestamp on the date of each created timestamp, like so:

>>> import pandas as pd

>>> times = ['20:00', '19:58', '00:05']
>>> times = pd.to_datetime(times, format="%M:%S")
>>> deltas = times - times.floor('D')
# deltas is the same as above
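For a DataFrame column rather than a plain list, the same subtraction goes through the .dt accessor. A minimal sketch, using .dt.normalize() to get midnight of each parsed timestamp; the variable names are illustrative only.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({"clock": ["19:58", "20:00"]})

# Parse mm:ss; pandas fills in the default date 1900-01-01.
parsed = pd.to_datetime(df["clock"], format="%M:%S")

# Subtract midnight of the same day to leave only the minutes and seconds.
deltas = parsed - parsed.dt.normalize()

# Convert to seconds for elapsed-time arithmetic.
seconds = deltas.dt.total_seconds()
```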
