I have a program that stores many instances of one class, say 10,000 or more. The instances have several properties that I need from time to time, but the most important one is the ID.
class Document
  attr_accessor :id
  def ==(document)
    document.id == self.id
  end
end
Now, what is the fastest way of storing thousands of these objects?
I used to put them all into an array of Documents:
documents = Array.new
documents << Document.new
# etc
Now an alternative would be to store them in a Hash:
documents = Hash.new
doc = Document.new
documents[doc.id] = doc
# etc
In my application, I mostly need to find out whether a document exists at all. Is the Hash's has_key? method significantly faster than a linear search of the Array with a comparison of Document objects? Are both O(n), or is has_key? even O(1)? Will I see the difference?
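A rough way to check this empirically could look like the sketch below. It assumes Ruby's standard Benchmark module; the size N, the repetition count, and the worst-case probe (matching the last element) are arbitrary choices made for illustration, not part of my real program.

```ruby
require 'benchmark'

class Document
  attr_accessor :id

  def ==(document)
    document.id == id
  end
end

N = 10_000 # assumed collection size, taken from the "up to 10,000" estimate

# Store the same documents in both containers.
array = []
hash  = {}
N.times do |i|
  doc = Document.new
  doc.id = i
  array << doc
  hash[doc.id] = doc
end

# Worst case for the Array: the probe equals the last element,
# so include? has to compare against every stored Document.
probe = Document.new
probe.id = N - 1

array_time = Benchmark.realtime { 100.times { array.include?(probe) } }
hash_time  = Benchmark.realtime { 100.times { hash.has_key?(probe.id) } }

puts "Array#include?: #{array_time} s"
puts "Hash#has_key?:  #{hash_time} s"
```

The absolute numbers will depend on the machine and Ruby version, so only the ratio between the two timings is meaningful here.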
Also, sometimes I need to add a Document that may already exist. With an Array, I would have to check with include? first; with a Hash, I would just use has_key? again. Same question as above.
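To make the two variants concrete, here is a minimal sketch of adding a document only when its ID is not stored yet. The helper names add_to_array and add_to_hash are hypothetical and exist only for this illustration.

```ruby
class Document
  attr_accessor :id

  def ==(document)
    document.id == id
  end
end

# Hypothetical helper: Array variant, include? compares via Document#==.
def add_to_array(documents, doc)
  documents << doc unless documents.include?(doc)
  documents
end

# Hypothetical helper: Hash variant, keyed by the document's ID.
def add_to_hash(documents, doc)
  documents[doc.id] = doc unless documents.has_key?(doc.id)
  documents
end

first = Document.new
first.id = 1
duplicate = Document.new
duplicate.id = 1 # same ID as first, so it should be rejected

array = []
add_to_array(array, first)
add_to_array(array, duplicate)

hash = {}
add_to_hash(hash, first)
add_to_hash(hash, duplicate)

puts array.size
puts hash.size
```

In both variants the second document is skipped because its ID is already present; the difference I am asking about is only in how expensive the existence check is.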
What are your thoughts? What is the fastest way of storing large amounts of data when 90% of the time I only need to know whether the ID exists (not the object itself)?
question from:https://stackoverflow.com/questions/5551168/performance-of-arrays-and-hashes-in-ruby