Wayback Machine
AUG Oct JUL
Previous capture 27 Next capture
2005 2006 2007
4 captures
18 Feb 06 - 11 Jul 07
sparklines
Close Help
  home · browse · search · game entities · user directory · message board · IRC | register

October 27, 2006, 4:40 pm PDT
username  
password  
forgot password?

Popular Resources
  • Half-Life 2 Mod FAQ
  • Valve Hammer Editor
  • Hammer 3.5 beta test
  • Half-Life Utilities
  • game data files
  • ZHLT 2.5.3 custom build
  • Half-Life SDK
  • Feedback
    If you've got any feedback, suggestions, or bugs to report regarding the Collective website, go here!

  • Feedback (301)
  • Newsletter
     
    Enter your email address in the above form to add yourself to the email newsletter list. Click here for more info.

    Hosted Sites
  • Valve ERC
  • Collective
  • TFMapped
  • Spirit of Half-Life
  • Selective Design
  • Pixel Reviews
  • recent articles

    NPC and Item Placement Theory
    17/03/05 11:35pm PST
    Non-Player Character (NPC) and item placement can influence both the gameflow and immersion of a level. This article aims to give some pointers on how to properly place them.
    - Hugh 'Hugh' Lloyd

    Got Props?
    13/03/05 08:32am PST
    A common problem in HL2 mapping is props not showing up in game. This article explains why and offers solutions.
    - Jeff 'Yesukai' Pritchard

    Simulating Randomness
    18/12/04 11:29pm PST
    This article focuses on how to properly simulate random events that should occur at a certain average frequency, or within a certain probability per period of time.
    - Skyler 'Zipster' York

    Adding Single-Player Weapons to Half-Life 2
    15/12/04 06:52pm PST
    Covers the process behind adding weapons to a single-player Half-Life 2 modification.
    - Skyler 'Zipster' York

    Your world in HL2
    06/12/04 12:17am PST
    This article gives tips and advice to anyone wanting to make custom photorealistic textures to be used in Half-Life 2.
    - Oksid

    Hiding in Shadow
    21/08/04 01:11pm PDT
    Describes how to create a function that has monsters disregard you if you are hiding in a certain level of "darkness," which can be set from within map properties.
    - Anders [Wolf] Jenbo (NoBody)

    XSI EXP for Half-Life 2 Tutorial - Camera Control
    23/09/04 12:43am PDT
    A SOFTIMAGE|XSI tutorial explaining all of the camera controls available to you in XSI!
    - Josh Enes

    Bump Mapping in Half-Life
    08/08/04 11:58am PDT
    Details a method of achieving real-time bump mapping in Half-Life, and provides an implementation of the algorithm.
    - Francis 'DeathWish' Woodhouse

    Real-Time "TRON 2.0" Glow For Low-Spec Hardware
    19/06/04 02:06pm PDT
    A sequel to the original "Real-Time 'TRON 2.0' Glow" article, this describes how to implement real-time glow that works on low-spec graphics cards.
    - Francis 'DeathWish' Woodhouse

    Hitboxes and Code
    05/06/04 06:25pm PDT
    How do I make only one part of a monster take damage? Learn about the relationship between model hitboxes and what you can do with them in a characters code.
    - Jonathan 'Teh_Freak' Smith

    Hash Tables
    [Tue Feb 12, 2002 / 01:55pm PST] Skyler 'Zipster' York - comments (6) comments enabled

    Welcome to the wonderful world of hash tables. Curiosly enough, hash has nothing to do with this article at all, and I have no idea why they call it a hash table anyway. Ironically enough, Webster's Dictionary defines hash as a synonym for muddle or "to confuse"; hodgepodge was the exact wording if I recall. But don't let that discourage you. Let me get back on topic. :) ^(/icebreaker)^

    Hash tables, hash tables... just what the heck are they? There are many different organization styles for a hash table, but this article will focus on one: chaining. It's a hybrid between an array and a linked list, and its efficiency over the latter two depends on the programmer's ability to create a good hashing function - don't worry, I'll explain everything.

    A hash table is an array of pointers, each which points to the beginning of a linked list. That's it! The efficiency of the system comes from the fact that array lookups are performed at godly speeds compared to linked lists iterations, but if the first array lookup doesn't satisfy the job, the linked list at that location is iterated until the requested information is found! If your array is designed well, you mimimize the number of lookups. Basically, it tackles the issue of speed associated with linked lists and the issue of memory usage associated with arrays. Here's a picture for you:

    I mentioned earlier that the programmer plays a big role in determining the efficiency of the hash table. Hash tables are commonly used to store records and other database related things. It follows that a hash table must have some way of quickly finding and storing data. But how? This is where the hashing function comes in. It's sole job is to translate raw data into an index for the hash array. So you can see that this function must be well designed in order to be fast, and have an algorithm that produces varied indexes - remember that we fall back on linked lists if there is more than one record at a certain index.

    The hashing function need not be overly complicated - just enough to quickly generate a nice range of indexes. Let's look at an example. Suppose we have a dictionary program that stores the definitions of words. One record consists of a word string and a definition string. The most logical organization of this array would be alphabetacally; thus our initial hash table will have 26 pointers, one for each letter of the alphabet. The following function will take a word (a.k.a "raw data") and convert it into an index:

    int HashIt(const char *word)
    {
    	return ((*word) - 'a');
    }
    

    This makes the assumption that the word is in lowercase, or at least the first letter, but you get the idea - the proper index is returned. Of course, you might want to do some error checking for characters that would return negative indexes or those larger than the array, but it should be kept outside our hashing function.

    At any rate, once you have the index, you jump to that point in the array (lookup #1) and iterate through the linked list at that position until you find the word you are looking for.

    That last example wasn't that practical. For example, I can think of hundreds (ok, maybe only a couple dozen) of words off the top of my head that start with the letter 's', and they'll all be linked at the same index. The chances that the desired word will be near the beginning of the list get slim, and it would degenerate into your everyday linked list iteration. But one thing's for sure. It beats a regular linked list for things such as word associations (or whatever the term I should use is).

    Hash table indexes don't really need to rely upon a logical relationship with the data it's "hashing". As long as it does its job, go ahead and use it. Lastly, there's the issue of how large to initialize the hash array to (hey, stop staring at my dangling participle!!). For something as simple as a dictionary, 26 works for basic alphabetizing. But let's suppose you want to store these words based on the first two letters. That would make our array 262 elements long, or 676. It works out that this array would require about 2.5 K more in memory for the pointers. That's nothing, yet we would optimize the speed of the hash table by reducing the number of words per index. Sweet. And if we go for three letters? 17576 entries, 66 K more - again, additional optimization with an unoticable increase in pointer memory.

    Well there you have it - hash tables. Not as notorious as linked lists or arrays (because they require a bit more in terms of implementation), but they serve their purpose well. Remember, they can be used for more than just character data. However, don't use them too much. They should be used primarily when record-style information (which is also refered to as "associated data" in the context of a hash table) is being stored and high-speed access in required. Like everything else in C++, and programming in general, they aren't the new solution to all our data storage needs. However, when used appropiatly, you can really whip up some nice applications.

    Recommended Reading: Hash Tables

    article created on Sat Feb 09, 2002 / 09:04pm PST
    this item has been viewed 3198 times
    [general / coding]

    Only registered users can post comments. Have you registered yet?

    user comments

    displaying comments of normal or higher rating

    1.

    Nicholas "Starwolf" Pitino
    Tue Nov 12, 2002 / 09:29am PST

      Well, that was all greek to me but never the less, good article Zippy.

    2.

    Fraktur
    Wed Jan 08, 2003 / 06:08am PST

      Uhm!! The size of the table doesnt depend on your hash function. Your example with the 2 first letters of a word; It gives 26^2 possible keys. but u dont need an table that big. it may be smaller but not bigger. simply divide the key by the table size. just ensure that your hash function is able to produce more keys than you'll need and divide. also be sure that your hash functions can produce every key so that every table position is usable.

    And you dont even need the lists. if u get a key that is already occupied you search the next free array element. or use fibonacci to determine next element to use to spread the data better. And dont forget to make modulo table size after adding something to the key or you'll get core dumped :D . Also only use that if u know the maximum amount of records u need to store.

    3.

    Skyler 'Zipster' York
    Wed Jan 08, 2003 / 09:32am PST

      The concept behind a hash table is to find a reasonable balance between the number of array indices and the speed of a lookup. True, the table could be less than 26^2 keys; it could follow my first example and only be 26 keys. Heck, why not make that just plain 2? However, the less number of keys you have, the longer it will take to find any given key on average. That is why I suggested you should evaluate just how much memory you are going to use for the pointers as you increase the size of the table, because it turns out that you can generate quite a large table and still not use that much money.

      Uhm!! The size of the table doesnt depend on your hash function.

    It doesn't depend on the hash function. Once again, it all comes down to speed vs. storage. The reason it seems that the hash function is so closely related to the size of the table is because the hash function is designed to produce a range of indices that spread across the entire table, just like you said. If a collision does occur, then you fall back on the linked list at that index. The only assumption made is that there are as many words beginning with "aa" as there are beginning with "zz", or "nn". To generate a better distribution would take some sort of active knowledge of language. For example, if you know there are more "bb" words than "xx" words, then you spread out the "bb" words across multiple indices with some clever hash. But without any knowledge of this, the best you can do is generate the widest range of indices possible and hope that it averages out.

    4.

    Brian 'Yamazaki' Dorner
    Wed Jan 08, 2003 / 12:07pm PST

      To further derail the discussion of has tables, I could give my quick solution to alphabetical data. A 26 pointer tree (I forget the scientific term for this, n-tree?), each pointer being for a specific letter in the alphabet. Time to search the tree structure and find any particular word is equal to the length of the word and has nothing to do with the number of words in the tree. I used this cheap trick on a class project. Everyone else had 2-pointer trees and there 60,000 word dictionaries took 30mins to create and sort, mine took 30 seconds. Saved me days of time when everyone else would run it, walk off for a coffee and return... and mine had already finished several times over and I had the chance to debug it. I even got a bonus point for specifically designing the tree to match what data it was designed to hold (dictionary), as opposed to using the generic one (binary).

    And that holds true for hashing (I managed to return to the topic! Yay!). Data structures for games should be specifically designed for the data they're meant to hold. You won't find generic insertion and deletion functions in a BSP tree, everything will be centered on quickly parsing the tree for geometry... not handling generic chunks of data.

    If you've got a game with several hundred AI controlled units in play, and you need to quickly find a specific unit and work with it somehow, you'll probably write up a hash function that can quickly find that unit based upon its type (air, ground, sea), behaviour (enemy, friend, neutral, non-AI, etc.) or whatever parameters would be best used to sort all your units.

    And so on :)

    5.

    Skyler 'Zipster' York
    Wed Jan 08, 2003 / 03:55pm PST

      Ahh, so they all used a binary tree while you used a smarter, 26 pointer tree. That's quite clever, and would make it much faster. :)

    In retrospect, I find that maybe the dictionary example wasn't the best usage of a hash table, but it's certainly the easiest way to explain what a hash table is and how it can be used.

    6.

    Fraktur
    Thu Jan 09, 2003 / 03:46am PST

      right. know the data and youre able to speed up. e.g. perfect hashing. if u know that every key is uniqe (e.g. IP numbers), finding one in a hashtable is O(1).
    I first learned perfect hashing. After i understood it i found the problem that different keys may produce same hash values. And the solution is no big deal.

    I like the 26 pointer tree. :) Lets say u want to store phone numbers. then u need more RAM for pointers then for phone numbers. Take such an empty tree and put Yamazakis phone# in. Then you have 208 pointers. All except 8 are NULL. :D
    comment modified on Thu Jan 09, 2003 / 03:56am PST

    VERC © 2004. All content copyright its respective owner, all rights reserved.
    script execution time: 0.200309038162 seconds