Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am trying to merge a datetime series with repository data, grouping by name and summing the values.

File1.csv 

Timeseries,Name,count
07/03/2015 06:00:00,Paris,100
07/03/2015 06:00:00,Paris,600
07/03/2015 06:00:00,Paris,700
07/03/2015 06:00:00,London,200
07/03/2015 06:00:00,London,100
07/03/2015 06:00:00,London,500
07/03/2015 06:00:00,Dublin,300
07/03/2015 06:00:00,Dublin,400
07/03/2015 06:00:00,Dublin,400

Output

Master_file.csv (append mode)

    Name,Timeseries(n-1),Timeseries(n)   #put the datetime series as header
    Paris,300,1400      #Sum of all the values with same Name
    London,200,800
    Dublin,400,1100

Program 

import pandas as pd 
import numpy as np

df = pd.read_csv('/home/lat_lon1.csv')
df1 = pd.read_csv('/home/lat_lon_master.csv')


gp = df.groupby('Name')['date timeseries'].sum().reset_index() 
df1.merge(gp, on='Name')

I am having trouble turning the datetime column into a header and putting the correct values under it. Names that are not found can be given NaN and filled in on later iterations.



1 Answer

Please check the pandas DataFrame documentation. Here is the code you are looking for.

Output

    Timeseries,Name,count
    07/03/2015 06:00:00,Dublin,1100
    07/03/2015 06:00:00,London,800
    07/03/2015 06:00:00,Paris,1400

    #!/usr/bin/env python
    import pandas as pd

    # Please enter the file location
    df = pd.read_csv('/home/saiharsh/Documents/Crowd Street/Transition_Data/Telecom_7.csv')

    # For each Name, keep the last timestamp and sum the counts
    result = df.groupby('Name', as_index=False).agg({'Timeseries': 'last', 'count': 'sum'})

    # Put the columns in the order Timeseries, Name, count
    result = result[['Timeseries', 'Name', 'count']]

    # Print the result as CSV text without the index column
    print(result.to_csv(index=False))
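
The code above gives one row per name. To get the layout the question asks for, with each timestamp as a column header and the sums merged into a master table, something like the following may work. It is only a sketch under assumptions: the helper name `add_snapshot`, the in-memory sample data, and the earlier timestamp `07/03/2015 05:00:00` are made up for illustration, and the master file is assumed to have a `Name` column plus one column per earlier timestamp.

```python
import io

import pandas as pd


def add_snapshot(master, snapshot):
    """Sum `count` per Name, turn each timestamp into a column, merge into master.

    `add_snapshot` is a hypothetical helper, not part of pandas.
    """
    # One row per Name, one column per distinct Timeseries value, cells = summed count
    wide = snapshot.pivot_table(
        index='Name', columns='Timeseries', values='count', aggfunc='sum'
    ).reset_index()
    wide.columns.name = None  # drop the leftover 'Timeseries' label on the column index

    if master is None or master.empty:
        return wide

    # Outer join keeps names that appear in only one of the two tables (NaN elsewhere)
    return master.merge(wide, on='Name', how='outer')


# Sample data copied from File1.csv in the question
snapshot = pd.read_csv(io.StringIO("""Timeseries,Name,count
07/03/2015 06:00:00,Paris,100
07/03/2015 06:00:00,Paris,600
07/03/2015 06:00:00,Paris,700
07/03/2015 06:00:00,London,200
07/03/2015 06:00:00,London,100
07/03/2015 06:00:00,London,500
07/03/2015 06:00:00,Dublin,300
07/03/2015 06:00:00,Dublin,400
07/03/2015 06:00:00,Dublin,400
"""))

# Assumed master table; the 05:00:00 column name is a placeholder for Timeseries(n-1)
master = pd.DataFrame({
    'Name': ['Paris', 'London', 'Dublin'],
    '07/03/2015 05:00:00': [300, 200, 400],
})

merged = add_snapshot(master, snapshot)
print(merged.to_csv(index=False))
```

Because every new snapshot adds a column rather than rows, the master file would probably need to be rewritten in full each time (for example with `merged.to_csv('Master_file.csv', index=False)`) instead of opened in append mode.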

Please feel free to reply if you run into any problems.

