Comments on: Cache conscious hash tables http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/ Random tangents Fri, 14 Nov 2014 14:38:29 +0000

By: mk http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-823 Sat, 26 Jun 2010 11:21:32 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-823 In the function addString():

for (k = 0; k > 16) & 0xffff);

It looks like you’re null-terminating every string inserted into the
CChash, but the strings are length-encoded, so I don’t think you
have to do this. From what I gather, only the 2D-char array needs to
be nulled. You should try removing the nulls here, it will save
space and may increase performance too. To make this work, the two
bytes used to encode the “id” can be stored after the
length-encode: [length-encode][id][string]
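
A minimal sketch of the [length-encode][id][string] layout suggested above, with no terminator after each string. This is not the post's addString(): the class name, the one-byte length, the big-endian two-byte id, and the use of a byte array whose own length marks the end of the slot are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not the CCHashTable code from the post.
final class SlotLayoutSketch {
    // Assumed header: 1 length byte followed by 2 id bytes.
    static final int HEADER = 3;

    // Appends one entry [length][id][string bytes] to the slot; no null is written.
    static byte[] append(byte[] slot, String key, int id) {
        byte[] s = key.getBytes(StandardCharsets.UTF_8);
        if (s.length > 0xff || id < 0 || id > 0xffff) {
            throw new IllegalArgumentException("key or id does not fit this sketch's header");
        }
        int old = slot.length;
        byte[] out = Arrays.copyOf(slot, old + HEADER + s.length);
        out[old] = (byte) s.length;       // length prefix
        out[old + 1] = (byte) (id >>> 8); // id, high byte
        out[old + 2] = (byte) id;         // id, low byte
        System.arraycopy(s, 0, out, old + HEADER, s.length);
        return out;
    }

    // Walks the slot entry by entry using only the length prefixes.
    // Returns the stored id, or -1 if the key is absent.
    static int find(byte[] slot, String key) {
        byte[] s = key.getBytes(StandardCharsets.UTF_8);
        int pos = 0;
        while (pos < slot.length) {
            int len = slot[pos] & 0xff;
            if (len == s.length && sameBytes(slot, pos + HEADER, s)) {
                return ((slot[pos + 1] & 0xff) << 8) | (slot[pos + 2] & 0xff);
            }
            pos += HEADER + len; // skip to the next entry
        }
        return -1;
    }

    private static boolean sameBytes(byte[] slot, int from, byte[] s) {
        for (int i = 0; i < s.length; i++) {
            if (slot[from + i] != s[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] slot = new byte[0];
        slot = append(slot, "cat", 7);
        slot = append(slot, "horse", 300);
        System.out.println("id of horse: " + find(slot, "horse"));
        System.out.println("slot bytes:  " + slot.length);
    }
}
```

Because every entry carries its own length, the scan never needs a per-string terminator; only the end of the slot has to be known, here through the array length.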

By: Alex Radzie http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-727 Thu, 20 May 2010 20:03:07 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-727 It should be possible to make CCHashTable even more cache conscious by storing strings in byte arrays instead of char arrays; that way a string is likely to take half the space when UTF-8 encoded. The idea is simple: the less data you fetch, the more cache hits you get.
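
To put rough numbers on the space claim, here is a small sketch comparing the payload size of a Java char array (two bytes per char) with the UTF-8 encoding of the same string. The class and method names are made up for this example, and array object headers are ignored.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of CCHashTable.
final class StorageSizeSketch {
    // Payload bytes when the string is held as a char[] (UTF-16 code units).
    static int charArrayBytes(String s) {
        return s.toCharArray().length * Character.BYTES;
    }

    // Payload bytes when the string is held as a UTF-8 byte[].
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"hash table", "T\u00f6nnies"}) {
            System.out.println(s + ": char[] = " + charArrayBytes(s)
                    + " bytes, UTF-8 = " + utf8Bytes(s) + " bytes");
        }
    }
}
```

The halving holds for ASCII text. Characters outside ASCII take two or more bytes in UTF-8, so the saving shrinks for accented text and can turn into a loss for scripts whose characters need three bytes.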

And you don’t seem to cache the key’s hash code in the slot bytes along with the string length. Doing so would speed up string comparison and probably improve overall hash table performance.
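
One way the hash-caching idea could look, as a sketch only: each entry stores a one-byte tag taken from the key's hash next to its length, and the scan compares length and tag before touching the string bytes. The layout [length][tag][string], the names, and the choice of the top byte of String.hashCode() as the tag are assumptions, not anything taken from the post's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not the CCHashTable code from the post.
final class HashPrefixSketch {
    // Assumed header: 1 length byte followed by 1 hash tag byte.
    static final int HEADER = 2;

    // Uses the top byte of the hash, on the assumption that the low bits
    // already picked the slot and are therefore shared by its entries.
    static byte tag(String key) {
        return (byte) (key.hashCode() >>> 24);
    }

    static byte[] append(byte[] slot, String key) {
        byte[] s = key.getBytes(StandardCharsets.UTF_8);
        if (s.length > 0xff) {
            throw new IllegalArgumentException("key too long for a 1-byte length");
        }
        int old = slot.length;
        byte[] out = Arrays.copyOf(slot, old + HEADER + s.length);
        out[old] = (byte) s.length;
        out[old + 1] = tag(key);
        System.arraycopy(s, 0, out, old + HEADER, s.length);
        return out;
    }

    static boolean contains(byte[] slot, String key) {
        byte[] s = key.getBytes(StandardCharsets.UTF_8);
        byte t = tag(key);
        int pos = 0;
        while (pos < slot.length) {
            int len = slot[pos] & 0xff;
            // Cheap checks first: the byte-by-byte comparison only runs
            // when both the length and the cached tag agree.
            if (len == s.length && slot[pos + 1] == t && sameBytes(slot, pos + HEADER, s)) {
                return true;
            }
            pos += HEADER + len;
        }
        return false;
    }

    private static boolean sameBytes(byte[] slot, int from, byte[] s) {
        for (int i = 0; i < s.length; i++) {
            if (slot[from + i] != s[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] slot = append(append(new byte[0], "alpha"), "beta");
        System.out.println(contains(slot, "alpha"));
        System.out.println(contains(slot, "gamma"));
    }
}
```

The tag costs one extra byte per entry, so whether it pays off depends on how often same-length keys share a slot; that would need measuring.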

By: Mark Dennehy http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-489 Fri, 05 Mar 2010 19:10:03 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-489 In reply to George V. Reilly.

It was an artifact of the move from wordpress.com to wordpress.org. Should be fixed now.

By: George V. Reilly http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-478 Mon, 01 Mar 2010 02:16:36 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-478 The double-encoded HTML in the code samples is an unreadable mess. Please fix.

By: sandrar http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-46 Thu, 10 Sep 2009 13:58:05 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-46 Hi! I was surfing and found your blog post… nice! I love your blog. 🙂 Cheers! Sandra. R.

By: mdakin http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-45 Wed, 24 Jun 2009 10:17:27 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-45 OK, I also tried to implement a simple String set using the same tricks. I don’t know, but the standard Java 6 HashSet implementation consistently gave better results. The problem is that Set uses String references, and comparisons are basically free because the hash values of strings are cached. So this is basically not very useful if you have the strings stored in memory as well, but it could be an option for compact storage of character arrays. I also think your benchmark application could be flawed as well,

By: mdakin http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-44 Thu, 05 Feb 2009 20:41:49 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-44 I think performance gains are mainly coming from the encoding of integers in the contents instead of creating new Entry objects for each integer value. I would like to see the comparison of HashSet and this implementation (without integer values) under Java 6, my bet is they would be pretty close.

By: Implementing HAT-Tries in Java « Stochastic Geometry http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-43 Mon, 05 May 2008 23:06:23 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-43 […] HAT-Tries in Java Filed under: General, Java — Mark Dennehy @ 0:06 As I said in an earlier post detailing the CCHashTable, it formed a part of a larger data structure, the HAT-Trie. The HAT-Trie is a recent variant on the […]

By: Hagen http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-42 Tue, 01 Apr 2008 08:15:06 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-42 Hello Mr Dennehy,
while I was searching for additional information on the research paper written for the SPIRE 2005 conference by Nikolas Askitis and Justin Zobel, I found your blog.

I am a student in this research group: http://www.uni-weimar.de/cms/medien/webis/home.html. We are currently evaluating options for holding large dictionaries in main memory for our retrieval applications.

I would like to ask whether you would allow us to test your implementation against ours and do some further testing with your code.

Please let me know if you agree and if you would like to get back to me to discuss this a little.

Thanks a lot, best regards
Hagen Tönnies

By: Mark Dennehy http://178.63.27.54:8080/statictangents/2008/03/29/cache-concious-hash-tables/comment-page-1/#comment-41 Mon, 31 Mar 2008 12:58:15 +0000 http://stochasticgeometry.wordpress.com/?p=36#comment-41 Minor but important! Fixed and thanks mk.
