Hashing

Hashing

1. Introduction

     Hashing is a technique used for performing insertions, deletions, and finds in constant average time.

Hash function

     Each key is mapped into some number in the range 0  to TableSize -1 and placed in the appropriate cell. And this mapping is called a hash function since there are a finite number of cells and a virtually inexhaustible supply of keys, this is clearly impossible, and thus we seek a hash function that distributes the keys evenly among the cells.

     The only remaining problems :

  • deal with choosing a function
  • deciding what to do when two keys hash to the same value

2. Hash Function

  if the input keys are integers, then simply returning key mod TableSize is generally a reasonable strategy, unless Key happens to have some undesirable properties. In this case, the choice of hash function needs to be carefully considered. For instance, if the table size the is 10 and the keys all end in zero, then the standard hash function is a bad choice. For reasons we shall see later, and to avoid situations like the one above, it is often a good idea to ensure that the table size is prime.

  Usually, the Keys are strings; in this case, the hash function needs to be chosen carefully. The best way use Horner’s rule. If the keys are very long, the hash function will take too long to compute. A common practice in this case is not to use all the characters. Some programmers implement their hash function by using only the characters in the odd spaces, with the idea that the time saved computing the hash function will make up for a slightly less evenly distributed function.

       The main programming detail left is collision resolution. If, when an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it. There are several methods for dealing with this. We will discuss tow of the simplest:  separate chaining and open addressing.

 3. Separate Chaining

  The first strategy, commonly known as separate chaining, is to keep a list of all elements that hash to the same value.

  • To perform a search, we use the hash function to determine which list to traverse. we then search the appropriate list.
  • To perform a insert, we check the appropriate list to see whether the element is already in place. If the element turns out to be new, it can be inserted at the front of the list, since it is convenient and also because frequently it happens that recently inserted elements are the most likely to be accessed in the near future.

4. Hash Tables without Linked Lists

  Separate chaining hashing has the disadvantage of using linked lists. This could slow the algorithm down a bit because of the time required to allocate new cells, and also essentially requires the implementation of a second data structure.

  There are three common collision resolution strategies alternatively:

  • Linear Probing
  • Quadratic Probing
  • Double Hashing

5. Rehashing

  If the table gets too full, the running time for the operations will start taking too long and insertions might fail for open addressing hashing with quadratic resolution. This can happen if there are too many removals intermixed with insertions. A solution, then, is to build another talbe that is about twice as big and scan down the entire original hash table, computing the new hash value for each element and inserting it in the new table.

  This entire operation is called rehashing.

Rehashing can be implemented in several ways with quadratic probing:

  • rehash as soon as the table is half full
  • rehash only when an insertion fails
  • rehash when the table reaches a certain load factor, Since performance does degrade as the load factor increase, the third strategy, implemented with a good cutoff, could be best.

 

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。

发布者:全栈程序员-用户IM,转载请注明出处:https://javaforall.cn/110273.html原文链接:https://javaforall.cn

【正版授权,激活自己账号】: Jetbrains全家桶Ide使用,1年售后保障,每天仅需1毛

【官方授权 正版激活】: 官方授权 正版激活 支持Jetbrains家族下所有IDE 使用个人JB账号...

(0)


相关推荐

  • python跳出循环重新开始_python怎么跳出循环

    python跳出循环重新开始_python怎么跳出循环本文主要讲下python中的break语句用法,常用在满足某个条件,需要立刻退出当前循环时(跳出循环),break语句可以用在for循环和while循环语句中。简单的说,break语句是会立即退出循环,在其后边的循环代码不会被执行。break语句的用法>>>x=1>>>whileTrue:>>>x+=1>>>…

  • 四轴平面机器人的手眼标定

    四轴平面机器人的手眼标定四轴平面机器人的手眼标定介绍在实际的机器人应用中,通常会给机器人配备视觉传感器,视觉传感器用于感知周围环境。但是,通过视觉传感器获取的场景坐标是基于视觉坐标系下的,机器人并不能直接使用,要获取机器人可以直接使用的坐标信息,必须将坐标转换到机器人坐标系下。因此,机器人手眼标定的目的是为了获取从视觉坐标系转换到机器人坐标系的转换矩阵。机器人手眼标定问题可以分为两类:1)eye-in-hand,…

  • 服务器系统sm总线控制器驱动,sm总线控制器驱动

    服务器系统sm总线控制器驱动,sm总线控制器驱动SM总线控制器是全称SystemManagement,是主板控制芯片上的一个通信控制器,主板芯片技术中的一种,如果你遇到设备管理器中quotm总线控制器quot有一黄色问号,下载您所使用的主板最新的系统所对应的驱动程序,在安装了正确的主机板驱动程序后,系统将能够正确识别您所有的芯片,问题即可解决。sm总线控制器是什么?它是SystemManagement的缩写,是主板芯片技术中的一种,主要是用…

  • PHP进程间通信-信号

    PHP进程间通信-信号

  • Away3D基础教程(二):加载外部模型[通俗易懂]

    Away3D基础教程(二):加载外部模型[通俗易懂]预览地址:http://leoas.host-home-idc.k5.fhfinance.com/tutorials/2/glass.html模型随鼠标转动,中键滚轮缩放。模型和完整源码下载:http

  • navicat premium 15 激活码【2021最新】[通俗易懂]

    (navicat premium 15 激活码)本文适用于JetBrains家族所有ide,包括IntelliJidea,phpstorm,webstorm,pycharm,datagrip等。IntelliJ2021最新激活注册码,破解教程可免费永久激活,亲测有效,下面是详细链接哦~https://javaforall.cn/100143.html…

发表回复

您的电子邮箱地址不会被公开。

关注全栈程序员社区公众号