Hashing

Hashing

1. Introduction

     Hashing is a technique used for performing insertions, deletions, and finds in constant average time.

Hash function

     Each key is mapped into some number in the range 0  to TableSize -1 and placed in the appropriate cell. And this mapping is called a hash function since there are a finite number of cells and a virtually inexhaustible supply of keys, this is clearly impossible, and thus we seek a hash function that distributes the keys evenly among the cells.

     The only remaining problems :

  • deal with choosing a function
  • deciding what to do when two keys hash to the same value

2. Hash Function

  if the input keys are integers, then simply returning key mod TableSize is generally a reasonable strategy, unless Key happens to have some undesirable properties. In this case, the choice of hash function needs to be carefully considered. For instance, if the table size the is 10 and the keys all end in zero, then the standard hash function is a bad choice. For reasons we shall see later, and to avoid situations like the one above, it is often a good idea to ensure that the table size is prime.

  Usually, the Keys are strings; in this case, the hash function needs to be chosen carefully. The best way use Horner’s rule. If the keys are very long, the hash function will take too long to compute. A common practice in this case is not to use all the characters. Some programmers implement their hash function by using only the characters in the odd spaces, with the idea that the time saved computing the hash function will make up for a slightly less evenly distributed function.

       The main programming detail left is collision resolution. If, when an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it. There are several methods for dealing with this. We will discuss tow of the simplest:  separate chaining and open addressing.

 3. Separate Chaining

  The first strategy, commonly known as separate chaining, is to keep a list of all elements that hash to the same value.

  • To perform a search, we use the hash function to determine which list to traverse. we then search the appropriate list.
  • To perform a insert, we check the appropriate list to see whether the element is already in place. If the element turns out to be new, it can be inserted at the front of the list, since it is convenient and also because frequently it happens that recently inserted elements are the most likely to be accessed in the near future.

4. Hash Tables without Linked Lists

  Separate chaining hashing has the disadvantage of using linked lists. This could slow the algorithm down a bit because of the time required to allocate new cells, and also essentially requires the implementation of a second data structure.

  There are three common collision resolution strategies alternatively:

  • Linear Probing
  • Quadratic Probing
  • Double Hashing

5. Rehashing

  If the table gets too full, the running time for the operations will start taking too long and insertions might fail for open addressing hashing with quadratic resolution. This can happen if there are too many removals intermixed with insertions. A solution, then, is to build another talbe that is about twice as big and scan down the entire original hash table, computing the new hash value for each element and inserting it in the new table.

  This entire operation is called rehashing.

Rehashing can be implemented in several ways with quadratic probing:

  • rehash as soon as the table is half full
  • rehash only when an insertion fails
  • rehash when the table reaches a certain load factor, Since performance does degrade as the load factor increase, the third strategy, implemented with a good cutoff, could be best.

 

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。

发布者:全栈程序员-用户IM,转载请注明出处:https://javaforall.cn/110273.html原文链接:https://javaforall.cn

【正版授权,激活自己账号】: Jetbrains全家桶Ide使用,1年售后保障,每天仅需1毛

【官方授权 正版激活】: 官方授权 正版激活 支持Jetbrains家族下所有IDE 使用个人JB账号...

(0)


相关推荐

  • 第三章数据链路层_数据链路层链路管理包括

    第三章数据链路层_数据链路层链路管理包括冗余链路出现的背景由于公司对网络的可靠性的要求,大部分公司都会增加额外的交换机,防止在某台交换机出现故障时造成网络的无法使用的情况,例如形成如下图的拓扑的结构。假设W和X交换中的一台出现故

  • vim编辑器怎么退出_退出vim编辑器

    vim编辑器怎么退出_退出vim编辑器当编辑完文件,准备退出Vi返回到shell时,可以使用以下几种方法之一。  (1)在命令模式中,连按两次大写字母Z,若当前编辑的文件曾被修改过,则Vi保存该文件后退出,返回到shell;若当前编辑的文件没被修改过,则Vi直接退出,返回到shell。  (2)在末行模式下,输入命令:wVi保存当前编辑文件,但并不退出,而是继续等待用户输入命令。在使用w命令时,可以再给编辑文件起一个新的文件名。…

  • NeatUpload的安装使用

    NeatUpload的安装使用版本:NeatUpload-1.2.32,用于文件上传。可传大文件。1.在VS工具箱中点右键选“选择项”……将Brettle.Web.NeatUpload.dll添加到工具箱。可以在添加后的工具箱看到

  • JavaScript字符串截取

    JavaScript字符串截取一、常用方法说明1.substr2.substring3.slice二、举例说明1.substr2.substring3.slice

  • mybatis返回值_存储过程获取查询结果

    mybatis返回值_存储过程获取查询结果com.jerry.mapper.TestMapper.javapackagecom.jerry.mapper;importjava.util.List;importjava.util.Map;publicinterfaceTestMapper{ /** *查寻单个结果直接返回Map<String,Object> *@paramid *…………..

  • 电脑蓝屏0X000000ED_0X000000ED

    电脑蓝屏0X000000ED_0X000000ED说到电脑问题,就不得不提蓝屏的问题。最近有位朋友的电脑开机的时候,并没有进入正常的启动程序,反而进入了蓝色界面,显示代码0x000000ed,不知道为什么会这样,也不知道如何去解决。下面就来看看蓝屏0x000000ed的原因和解决方法详解吧!蓝屏代码0x000000ed的原因详解!蓝屏现象,是我们在使用电脑中最常见的一种启动问题,而蓝屏显示的代码就是帮助我们去了解蓝屏的原因以及解决方法的主要依据。…

发表回复

您的电子邮箱地址不会被公开。

关注全栈程序员社区公众号