Welcome to the wonderful world of hash tables. Curiously
enough, hash has nothing to do with this article at all, and I have no
idea why they call it a hash table anyway. Ironically enough, Webster's
Dictionary defines hash as a synonym for muddle or "to confuse";
hodgepodge was the exact wording if I recall. But don't let that
discourage you. Let me get back on topic. :) (/icebreaker)
Hash tables, hash tables... just what the heck are they? There are
many different organization styles for a hash table, but this article
will focus on one: chaining. It's a hybrid between an array and a
linked list, and its efficiency over either of those two depends on the
programmer's ability to create a good hashing function - don't worry,
I'll explain everything.
A hash table is an array of pointers, each of which points to the
beginning of a linked list. That's it! The efficiency of the system
comes from the fact that array lookups are performed at godly speeds
compared to linked list iterations, but if the first array lookup
doesn't satisfy the job, the linked list at that location is iterated
until the requested information is found! If your array is designed
well, you minimize the number of lookups. Basically, it tackles the
issue of speed associated with linked lists and the issue of memory
usage associated with arrays. Here's a picture for you:
[Picture: an array of table slots, each pointing to a chain (linked list) of records.]
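To make that layout concrete, here is a minimal sketch of the structure in code. The names Record, Node, g_table and TABLE_SIZE are mine (not from the article), and the string sizes are arbitrary:

#define TABLE_SIZE 26            // one slot per letter, matching the example below

struct Record                    // the "associated data" being stored
{
    char word[32];
    char definition[256];
};

struct Node                      // one link in a chain
{
    Record data;
    Node  *next;                 // next record that hashed to the same index
};

// the hash table itself: an array of pointers, each the head of a linked list
Node *g_table[TABLE_SIZE] = { 0 };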
I mentioned earlier that the programmer plays a big role in
determining the efficiency of the hash table. Hash tables are commonly
used to store records and other database-related things. It follows
that a hash table must have some way of quickly finding and
storing data. But how? This is where the hashing function comes in.
Its sole job is to translate raw data into an index for the hash array.
So you can see that this function must be well designed in order to be
fast, and have an algorithm that produces varied indexes - remember
that we fall back on linked lists if there is more than one record at a
certain index.
The hashing function need not be overly complicated - just enough to
quickly generate a nice range of indexes. Let's look at an example.
Suppose we have a dictionary program that stores the definitions of
words. One record consists of a word string and a definition string.
The most logical organization of this array would be alphabetical;
thus our initial hash table will have 26 pointers, one for each letter
of the alphabet. The following function will take a word (a.k.a. "raw
data") and convert it into an index:
int HashIt(const char *word)
{
    // assumes the first letter is a lowercase 'a'-'z'
    return ((*word) - 'a');
}
This makes the assumption that the word is in lowercase, or at least
the first letter, but you get the idea - the proper index is returned.
Of course, you might want to do some error checking for characters that
would produce a negative index or one past the end of the array, but
that checking should be kept outside our hashing function.
At any rate, once you have the index, you jump to that point in the
array (lookup #1) and iterate through the linked list at that position
until you find the word you are looking for.
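To illustrate, a lookup along those lines might look something like the sketch below, reusing the Record, Node, g_table and HashIt names from earlier (FindWord is my own name, and strcmp comes from <string.h>):

#include <string.h>              // for strcmp

Record *FindWord(const char *word)
{
    int index = HashIt(word);

    // keep the range check outside the hashing function, as noted above
    if (index < 0 || index >= TABLE_SIZE)
        return 0;

    // lookup #1: jump straight to the array slot...
    for (Node *node = g_table[index]; node != 0; node = node->next)
    {
        // ...then walk the chain until the requested word turns up
        if (strcmp(node->data.word, word) == 0)
            return &node->data;
    }

    return 0;                    // the word isn't in the table
}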
That last example wasn't that practical. For example, I can think of
hundreds (ok, maybe only a couple dozen) of words off the top of my
head that start with the letter 's', and they'll all be linked at the
same index. The chances that the desired word will be near the
beginning of the list get slim, and it would degenerate into your
everyday linked list iteration. But one thing's for sure. It beats a
regular linked list for things such as word associations (or whatever
the term I should use is).
Hash table indexes don't really need to rely upon a logical
relationship with the data they're "hashing". As long as the function does its
job, go ahead and use it. Lastly, there's the issue of how large to
initialize the hash array to (hey, stop staring at my dangling
participle!!). For something as simple as a dictionary, 26 works for
basic alphabetizing. But let's suppose you want to store these words
based on the first two letters. That would make our array 26^2
elements long, or 676. It works out that this array would require
about 2.5 K more in memory for the pointers. That's nothing, yet we
would optimize the speed of the hash table by reducing the number of
words per index. Sweet. And if we go for three letters? 17,576 entries, 66 K more - again, additional optimization with an unnoticeable increase in pointer memory.
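For illustration, a two-letter version of the hashing function might look like this (HashIt2 is my own name for it, and it assumes the word has at least two lowercase letters):

int HashIt2(const char *word)
{
    // the first letter selects a block of 26, the second letter a slot within it
    return (word[0] - 'a') * 26 + (word[1] - 'a');   // 0 .. 675
}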
Well, there you have it - hash tables. Not as notorious as linked
lists or arrays (because they require a bit more in terms of
implementation), but they serve their purpose well. Remember, they can
be used for more than just character data. However, don't use them too
much. They should be used primarily when record-style information
(which is also referred to as "associated data" in the context of a hash
table) is being stored and high-speed access is required. Like
everything else in C++, and programming in general, they aren't the new
solution to all our data storage needs. However, when used appropriately,
you can really whip up some nice applications.
Recommended Reading: Hash Tables
article created on Sat Feb 09, 2002 / 09:04pm PST
this item has been viewed 3198 times
[general / coding]
user comments
1. Nicholas "Starwolf" Pitino
Tue Nov 12, 2002 / 09:29am PST
Well, that was all Greek to me but nevertheless, good article Zippy.
2. Fraktur
Wed Jan 08, 2003 / 06:08am PST
Uhm!! The size of the table doesn't
depend on your hash function. Your example with the first 2 letters of a
word gives 26^2 possible keys, but you don't need a table that big.
It may be smaller but not bigger; simply divide the key by the table
size. Just ensure that your hash function is able to produce more keys
than you'll need, and divide. Also be sure that your hash function can
produce every key, so that every table position is usable.
And you
don't even need the lists. If you get a key that is already occupied, you
search for the next free array element, or use Fibonacci to determine the next
element to use, to spread the data better. And don't forget to take the key modulo the
table size after adding something to it, or you'll get a core dump. Also, only use that if you know the maximum amount of records you need to store.
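For what it's worth, a bare-bones sketch of the open-addressing idea Fraktur describes above (linear probing with a modulo wrap) might look like this; the names and the table size are mine, not his:

#define PROBE_TABLE_SIZE 101

const char *g_probeTable[PROBE_TABLE_SIZE] = { 0 };

// returns the slot used, or -1 if the table is full
int ProbeInsert(const char *key, unsigned int hash)
{
    for (int i = 0; i < PROBE_TABLE_SIZE; i++)
    {
        int index = (hash + i) % PROBE_TABLE_SIZE;   // wrap with modulo
        if (g_probeTable[index] == 0)                // found a free element
        {
            g_probeTable[index] = key;
            return index;
        }
    }
    return -1;   // only safe if you know the maximum number of records up front
}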
3. Skyler 'Zipster' York
Wed Jan 08, 2003 / 09:32am PST
The concept behind a hash table is
to find a reasonable balance between the number of array indices and
the speed of a lookup. True, the table could be less than 26^2 keys; it
could follow my first example and only be 26 keys. Heck, why not make
that just plain 2? However, the fewer keys you have, the
longer it will take to find any given key on average. That is why I
suggested you should evaluate just how much memory you are going to use
for the pointers as you increase the size of the table, because it turns
out that you can generate quite a large table and still not use that
much memory.
"The size of the table doesn't depend on your hash function." True, it
doesn't depend on the hash function. Once again, it all comes down to
speed vs. storage. The reason it seems that the hash function is so
closely related to the size of the table is because the hash function is
designed to produce a range of indices that spread across the entire
table, just like you said. If a collision does occur, then you fall
back on the linked list at that index. The only assumption made is that
there are as many words beginning with "aa" as there are beginning with
"zz", or "nn". To generate a better distribution would take some sort
of active knowledge of the language. For example, if you know there are
more "bb" words than "xx" words, then you spread out the "bb" words
across multiple indices with some clever hash. But without any
knowledge of this, the best you can do is generate the widest range of
indices possible and hope that it averages out.
4. Brian 'Yamazaki' Dorner
Wed Jan 08, 2003 / 12:07pm PST
To further derail the discussion
of hash tables, I could give my quick solution to alphabetical data: a 26-pointer
tree (I forget the scientific term for this, n-tree?), each
pointer being for a specific letter in the alphabet. Time to search the
tree structure and find any particular word is equal to the length of
the word and has nothing to do with the number of words in the tree. I
used this cheap trick on a class project. Everyone else had 2-pointer
trees and their 60,000-word dictionaries took 30 minutes to create and sort;
mine took 30 seconds. Saved me days of time when everyone else would
run it, walk off for a coffee and return... and mine had already
finished several times over and I had the chance to debug it. I even got
a bonus point for specifically designing the tree to match the data it
was designed to hold (a dictionary), as opposed to using the generic one
(binary).
And that holds true for hashing (I managed to return to
the topic! Yay!). Data structures for games should be specifically
designed for the data they're meant to hold. You won't find generic
insertion and deletion functions in a BSP tree, everything will be
centered on quickly parsing the tree for geometry... not handling
generic chunks of data.
If you've got a game with several hundred
AI controlled units in play, and you need to quickly find a specific
unit and work with it somehow, you'll probably write up a hash function
that can quickly find that unit based upon its type (air, ground, sea),
behaviour (enemy, friend, neutral, non-AI, etc.) or whatever parameters
would be best used to sort all your units.
And so on.
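As a side note, the 26-pointer tree Yamazaki describes is usually called a trie (or prefix tree); a bare-bones version might look like the sketch below, with names of my own choosing and assuming lowercase words:

struct TrieNode
{
    TrieNode *children[26];      // one pointer per letter 'a'-'z'
    bool      isWord;            // true if the path from the root spells a word
};

// lookup time depends only on the length of the word, not on how many words are stored
bool Contains(const TrieNode *root, const char *word)
{
    const TrieNode *node = root;
    for (; node != 0 && *word != '\0'; word++)
        node = node->children[*word - 'a'];
    return node != 0 && node->isWord;
}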
5. Skyler 'Zipster' York
Wed Jan 08, 2003 / 03:55pm PST
Ahh, so they all used a binary
tree while you used a smarter, 26 pointer tree. That's quite clever,
and would make it much faster. 
In
retrospect, I find that maybe the dictionary example wasn't the best
usage of a hash table, but it's certainly the easiest way to explain
what a hash table is and how it can be used.
6. Fraktur
Thu Jan 09, 2003 / 03:46am PST
Right. Know the data and you're
able to speed up, e.g. perfect hashing. If you know that every key is
unique (e.g. IP numbers), finding one in a hash table is O(1). I first
learned perfect hashing. After I understood it, I found the problem that
different keys may produce the same hash values. And the solution is no big
deal.
I like the 26-pointer tree.
Let's say you want to store phone numbers; then you need more RAM for
pointers than for phone numbers. Take such an empty tree and put
Yamazaki's phone# in. Then you have 208 pointers. All except 8 are NULL.
comment modified on Thu Jan 09, 2003 / 03:56am PST