Is there an implementation where I can join two arrays based on their keys? Speaking of which, is the canonical way to store keys in one of the NumPy columns (NumPy doesn't have an 'id' or 'rownames' attribute)?
See Question&Answers more detail:osIs there an implementation where I can join two arrays based on their keys? Speaking of which, is the canonical way to store keys in one of the NumPy columns (NumPy doesn't have an 'id' or 'rownames' attribute)?
See Question&Answers more detail:osIf you want to use only numpy, you can use structured arrays and the lib.recfunctions.join_by
function (see http://pyopengl.sourceforge.net/pydoc/numpy.lib.recfunctions.html). A little example:
In [1]: import numpy as np
...: import numpy.lib.recfunctions as rfn
...: a = np.array([(1, 10.), (2, 20.), (3, 30.)], dtype=[('id', int), ('A', float)])
...: b = np.array([(2, 200.), (3, 300.), (4, 400.)], dtype=[('id', int), ('B', float)])
In [2]: rfn.join_by('id', a, b, jointype='inner', usemask=False)
Out[2]:
array([(2, 20.0, 200.0), (3, 30.0, 300.0)],
dtype=[('id', '<i4'), ('A', '<f8'), ('B', '<f8')])
Another option is to use pandas (documentation). I have no experience with it, but it provides more powerful data structures and functionality than standard numpy, "to make working with “relational” or “labeled” data both easy and intuitive". And it certainly has joining and merging functions (for example see http://pandas.sourceforge.net/merging.html#joining-on-a-key).