Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I'm relatively new to Python and trying to learn how to write functions. The answer to this post highlights how to get certain stats from a DataFrame, and I would like to use it in a function.

This is my attempt, but it fails with AttributeError: 'SeriesGroupBy' object has no attribute 'test_for_B':

def test_multi_match(df_in,test_val):
    test_for_B = df_in == test_val
    contigious_groups = ((df_in == test_val) & (df_in != df_in.shift())).cumsum() + 1
    counts = df_in.groupby(contigious_groups).test_for_B.sum()
    counts.value_counts() / contigious_groups.max()

Can someone please help put this code in a function I can reuse on other DataFrames? Thanks.

Edit: Removed the long AttributeError output now that this has been answered.


1 Answer

Here you go:

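def repeat_stats(series, var):
    # True wherever the element equals the value of interest
    isvar = series == var
    # True wherever the element differs from the one before it
    wasntvar = series != series.shift()
    # Running count of run starts; 0 marks rows before the first run
    cont_grps = (isvar & wasntvar).cumsum()
    # Drop the rows before the first run, then count matches per run
    counts = isvar.loc[cont_grps.astype(bool)].groupby(cont_grps).sum()
    # Share of runs having each length
    return counts.value_counts() / cont_grps.max()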

repeat_stats(rng.initial_data, 'B')

3.0    0.5
2.0    0.5
Name: initial_data, dtype: float64
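
A note on why the attempt in the question failed, plus a self-contained sketch for trying this out. Writing .test_for_B after groupby(...) asks the groupby object for an attribute (or column) with that name; a local variable called test_for_B is not reachable that way, hence the AttributeError. The attempt also never returns its result. The sketch below is a variant of the function above, not a replacement for it. The name run_length_shares and the sample data are made up for illustration (the original rng.initial_data is not shown here), and it uses size() instead of summing booleans so the run lengths stay integers.

```python
import pandas as pd


def run_length_shares(series, value):
    """Share of consecutive runs of `value` that have each length."""
    is_value = series == value
    # A run starts where the value matches and the previous element differs.
    run_start = is_value & (series != series.shift())
    # Number the runs 1, 2, 3, ...; rows before the first run get 0.
    run_id = run_start.cumsum()
    # Keep only the matching rows and count how many fall in each run.
    run_lengths = series[is_value].groupby(run_id[is_value]).size()
    # Fraction of runs with each length, ordered by run length.
    return (run_lengths.value_counts() / run_id.max()).sort_index()


# Hypothetical sample: one run of three 'B's and one run of two 'B's.
sample = pd.Series(list("ABBBABBC"), name="initial_data")
shares = run_length_shares(sample, "B")
print(shares)
```

For this sample, half of the runs of 'B' have length 2 and half have length 3, so both shares come out as 0.5.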
