Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am writing a function that returns a dictionary keyed by the documents' creation year, where each value is the tuple returned by the do_get_citations_per_year function.

This function loads and processes the DataFrame:

import pandas as pd

def do_process_citation_data(f_path):
    global my_ocan  # module-level DataFrame that the other functions read

    my_ocan = pd.read_csv(f_path, names=['oci', 'citing', 'cited', 'creation', 'timespan', 'journal_sc', 'author_sc'],
                          parse_dates=['creation', 'timespan'])
    my_ocan = my_ocan.iloc[1:]  # to remove the first row
    my_ocan['creation'] = pd.to_datetime(my_ocan['creation'], format="%Y-%m-%d", yearfirst=True)
    # parse_timespan is defined elsewhere (not shown here)
    my_ocan['timespan'] = my_ocan['timespan'].map(parse_timespan)
    #print(my_ocan.info())
    print(my_ocan['timespan'])
    return my_ocan

Then I have this function, which does not trigger any error when I run it on its own:

def do_get_citations_per_year(data, year):
    result = tuple()
    my_ocan['creation'] = pd.DatetimeIndex(my_ocan['creation']).year

    len_citations = len(my_ocan.loc[my_ocan["creation"] == year, "creation"])
    timespan = round(my_ocan.loc[my_ocan["creation"] == year, "timespan"].mean())
    result = (len_citations, timespan)
    print(result)

    return result

When I run that function inside of another function:

def do_get_citations_all_years(data):
    mydict = {}
    s = set(my_ocan.creation)
    for year in s:
        mydict[year] = do_get_citations_per_year(data, year)

    return mydict

I get the error:

  File "/Users/lisa/Desktop/yopy/execution_example.py", line 28, in <module>
    print(my_ocan.get_citations_all_years())
  File "/Users/lisa/Desktop/yopy/ocan.py", line 35, in get_citations_all_years
    return do_get_citations_all_years(self.data)
  File "/Users/lisa/Desktop/yopy/lisa.py", line 112, in do_get_citations_all_years
    mydict[year] = do_get_citations_per_year(data, year)
  File "/Users/lisa/Desktop/yopy/lisa.py", line 99, in do_get_citations_per_year
    timespan = round(my_ocan.loc[my_ocan["creation"] == year, "timespan"].mean())
ValueError: cannot convert float NaN to integer

What can I do to solve the issue?

Thank you in advance

  Asked by Lisa Siurina (translated from SO)

1 Answer

This error means that my_ocan.loc[my_ocan["creation"] == year, "timespan"].mean() is NaN, and round() cannot convert NaN to an integer.
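
To see how the mean can become NaN here, the following is a minimal sketch on made-up data (the toy DataFrame and its values are assumptions, not the asker's file). It assumes the question's function is called more than once, so the line that overwrites 'creation' runs again on values that are already integer years. pandas then reads those integers as nanoseconds since 1970-01-01, every year turns into 1970, no row matches the requested year, and the mean of the empty selection is NaN.

```python
import math

import pandas as pd

# Toy frame standing in for my_ocan (values are made up for illustration).
df = pd.DataFrame({
    "creation": pd.to_datetime(["2019-03-01", "2020-07-15", "2020-11-02"]),
    "timespan": [10.0, 20.0, 40.0],
})

# First call: datetimes become integer years, as in the question.
df["creation"] = pd.DatetimeIndex(df["creation"]).year
first_pass = df["creation"].tolist()

# Second call: the integers are now read as nanoseconds since 1970-01-01.
df["creation"] = pd.DatetimeIndex(df["creation"]).year
second_pass = df["creation"].tolist()

# No row matches 2020 any more, so the selection is empty and its mean is NaN.
selection = df.loc[df["creation"] == 2020, "timespan"]
mean_value = selection.mean()

try:
    round(mean_value)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(first_pass, second_pass, mean_value, error_message)
```

A similar mismatch would occur if the loop iterates over datetime values while the column already holds integer years, since the comparison then matches nothing either.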

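pandas skips NaN values when it computes a mean, so a NaN result means there are no valid timespan values for that year, for example because no rows match the year. You can fill NaN values with 0 before calculating the mean, but note that this lowers the mean whenever NaN values are present, and an empty selection still yields NaN.

Here is an example of the fillna approach: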

timespan = my_ocan.loc[my_ocan["creation"] == year, "timespan"].fillna(0).mean()
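
If the selection can be empty, a guard before rounding may be more robust. The sketch below is a hypothetical rewrite, not the asker's code: the names citations_per_year and citations_all_years are made up, it assumes 'creation' holds datetimes and 'timespan' holds numbers, and it returns None for the mean when nothing valid is selected. It also derives the year on the fly instead of overwriting the 'creation' column.

```python
import pandas as pd


def citations_per_year(data, year):
    """Return (citation count, rounded mean timespan) for one year.

    Hypothetical rewrite: reads from `data` instead of a global and
    does not overwrite the 'creation' column.
    """
    years = data["creation"].dt.year           # derived on the fly, no mutation
    timespans = data.loc[years == year, "timespan"]
    mean_value = timespans.mean()              # NaN if nothing valid is selected
    rounded = None if pd.isna(mean_value) else round(mean_value)
    return len(timespans), rounded


def citations_all_years(data):
    """Map each creation year to its (count, rounded mean timespan) tuple."""
    years = data["creation"].dt.year.dropna().unique()
    return {int(y): citations_per_year(data, int(y)) for y in sorted(years)}


# Toy data standing in for the citation table (values are made up).
toy = pd.DataFrame({
    "creation": pd.to_datetime(["2019-03-01", "2020-07-15", "2020-11-02"]),
    "timespan": [10.0, 20.0, 40.0],
})
result = citations_all_years(toy)
print(result)
```

Returning None is only one option; depending on what the caller expects, 0 or skipping the year entirely could fit better.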
