Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I am trying to draw the lognormal distribution for my data. using the following code:

mu, sigma = 136519., 50405. # mean and standard deviation
hs = np.random.lognormal(mu, sigma, 1000) #mean, s dev , Size
count, bins, ignored = plt.hist(hs, 100, normed=True)     
x = np.linspace(min(bins), max(bins), 10000)
pdf = (math.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)))
#plt.axis('tight')
plt.plot(x, pdf, linewidth=2, color='r')

As you can see, my mean and sigma are big values, it creates the problem that hs goes to infinity that gives an error. While if I put something like mu =3 and sigma =1, it works, any suggestions for big numbers?

Update 1 :

I corrected my code with the first answer, but now I only get a straight line :

 mu, sigma = 136519 , 50405 # mean and standard deviation

    normal_std = np.sqrt(np.log(1 + (sigma/mu)**2))
    normal_mean = np.log(mu) - normal_std**2 / 2
    hs = np.random.lognormal(normal_mean, normal_std, 1000)
    print(hs.max())    # some finite number
    print(hs.mean())   # about 136519
    print(hs.std())    # about 50405

#    hs = np.random.lognormal(mu, sigma, 1000) #mean, s dev , Size
#    
    count, bins, ignored = plt.hist(hs, 100, normed=True) 

    x = np.linspace(min(bins), max(bins), 10000)
    pdfT = [];
    for el in range (len(x)):
        pdfTmp = (math.exp(-(np.log(x[el]) - mu)**2 / (2 * sigma**2)))
        pdfT += [pdfTmp]


    #plt.axis('tight')
    pdf = np.asarray(pdfT)
    plt.plot(x, pdf, linewidth=2, color='r')

image pdf

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
801 views
Welcome To Ask or Share your Answers For Others

1 Answer

The parameters mu and sigma in np.random.lognormal are not the mean and STD of the lognormal distribution. They are the mean and STD of the underlying normal distribution, that is of log(X). This means that by passing 136519 for the mean you ask NumPy to generate numbers of size exp(136519) which is about 10**60000, far beyond the double precision limits.

With a bit of algebra you can get the correct parameters for np.random.lognormal from the ones you have.

mu, sigma = 136519., 50405.
normal_std = np.sqrt(np.log(1 + (sigma/mu)**2))
normal_mean = np.log(mu) - normal_std**2 / 2
hs = np.random.lognormal(normal_mean, normal_std, 1000)
print(hs.max())    # some finite number
print(hs.mean())   # about 136519
print(hs.std())    # about 50405

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...