Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am trying to calculate the percent change between two dates in the values for each id in my df:

date        id  a  b
2021-01-01  1   6  4
2021-01-01  2   10 10
2021-01-02  1   3  2
2021-01-02  2   20 20

What I'd like to have as a result is:

id  ratio_a  ratio_b
1   -0.5     -0.5
2   1.0      1.0

I tried playing with pct_change() but I cannot understand how to use it for this.

The data for testing can be generated with this code:

import datetime
import pandas as pd

dti = pd.to_datetime(
    [
        datetime.datetime(2021, 1, 1),
        datetime.datetime(2021, 1, 1),
        datetime.datetime(2021, 1, 2),
        datetime.datetime(2021, 1, 2)
    ]
)

d = {
    'id': [1, 2, 1, 2],
    'a': [6, 10, 3, 20],
    'b': [4, 10, 2, 20]
}

df = pd.DataFrame(data=d, index=dti)

I tried df.groupby(by=['id']).pct_change(), which gives me:

              a    b
2021-01-01  NaN  NaN
2021-01-01  NaN  NaN
2021-01-02 -0.5 -0.5
2021-01-02  1.0  1.0

This is not quite what I want: I need a) the dates collapsed so that there is one row per id, and b) the ids kept in the result.

Question from: https://stackoverflow.com/questions/65936475/pandas-pct-change-using-time-series-and-preserving-ids

Fight the dragon too long and you become the dragon yourself; gaze into the abyss too long and the abyss will gaze back…

1 Answer

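The problem is that pct_change does not aggregate values, so the output has the same number of rows as the original. The first row of each group is a missing value, because there is no earlier row to compare against; the remaining rows hold the computed changes. To preserve id, append it to the index before grouping: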

df1 = df.set_index('id', append=True).groupby('id').pct_change()
print(df1)
                 a    b
           id          
2021-01-01 1   NaN  NaN
           2   NaN  NaN
2021-01-02 1  -0.5 -0.5
           2   1.0  1.0
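
The NaN rows follow from what pct_change computes: each value divided by the previous value in the same group, minus one. A minimal sketch of that arithmetic, assuming the df from the question (rebuilt here so the snippet runs on its own); the names indexed, previous, manual and builtin are only illustrative:

```python
import pandas as pd

# Rebuild the question's sample data
dti = pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"])
df = pd.DataFrame(
    {"id": [1, 2, 1, 2], "a": [6, 10, 3, 20], "b": [4, 10, 2, 20]},
    index=dti,
)

indexed = df.set_index("id", append=True)

# Previous row within each id; the first row of every group
# has no predecessor, so it becomes NaN
previous = indexed.groupby("id").shift(1)

# Same arithmetic as pct_change: current / previous - 1
manual = indexed / previous - 1

builtin = indexed.groupby("id").pct_change()
print(manual)
print(manual.equals(builtin))
```

Seen this way, the first row per id can never have a value, which is why dropping the all-NaN rows is the natural next step.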

If you need to remove the NaN rows and the first level of the MultiIndex:

df2 = (df.set_index('id', append=True)
         .groupby('id')
         .pct_change()
         .dropna(how='all')
         .droplevel(level=0))
print(df2)
      a    b
id          
1  -0.5 -0.5
2   1.0  1.0

An alternative that drops the date level with reset_index instead of droplevel:

df2 = (df.set_index('id', append=True)
         .groupby('id')
         .pct_change()
         .dropna(how='all')
         .reset_index(level=0, drop=True))
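
If only the change between the earliest and the latest date per id is wanted, with the ratio_a / ratio_b column names from the question, one possible sketch is to divide the last row of each group by the first. This is a different technique from pct_change, not part of the answer above: it assumes the index holds the dates, and it ignores any intermediate dates. The names grouped and ratios are illustrative.

```python
import pandas as pd

# Rebuild the question's sample data
dti = pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"])
df = pd.DataFrame(
    {"id": [1, 2, 1, 2], "a": [6, 10, 3, 20], "b": [4, 10, 2, 20]},
    index=dti,
)

# Sort by date so "first" and "last" mean earliest and latest row per id
grouped = df.sort_index().groupby("id")

# (latest - earliest) / earliest, one row per id.
# Note: first() and last() skip NaN entries within a column.
ratios = (grouped.last() / grouped.first() - 1).add_prefix("ratio_")
print(ratios)
```

With exactly two dates per id this gives the same numbers as the pct_change approach; with more dates it compares only the endpoints, whereas pct_change yields one change per consecutive pair.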
