在“The Cost of GUIDs as Primary Keys”
(
http://www.sinoprise.cn/showtopic-105.aspx)一文中,提出了“COMB”数
据类型的设想,即是将GUID的后6位改为时间,由于原GUID是无序的,这样一来就
便于索引排序,关于为什么采用GUID的后6位,而不是前6位,在原文中有这样一
段描述:
The guess that I had about the problem with the INSERT overhead for
GUIDs was that the lack of order in the Windows 2000–generated GUIDs
was giving SQL Server a hard time administering indexes under massive
INSERT periods. Therefore, I tried to create an order for the GUIDs
instead. I tried to CAST the current DATETIME to a BINARY(8) and put
that first in the GUID. Unfortunately, that had no effect. When I
investigated it further, I found that when I had a BINARY(16) value and
CASTed it to a UNIQUEIDENTIFIER, some bytes were scrambled. What to do?
You guessed it. I tried to compensate for the scrambling to see if that
had any positive effect on throughput, but no effect occurred. Then I
found out that it wasn't the first (high) byte that was important for
the new ordering, but the last (low) bytes. I also learned that I
didn't have to use BINARY(8) for the current DATETIME. BINARY(6) is
enough for the next 77 years, so I decided to occupy only the last
(low) 6 bytes with the current DATETIME. In Listing 5, you can see the
result of PRINT CAST(GETDATE() AS BINARY(8)). The first (high) bytes
are 0.
我曾经做过一次关于效率的测试,使用Delphi+Access做的,“Integer GUID和
Comb做主键的效率测试” (
http://www.sinoprise.cn/showtopic-104.aspx),
因为在其他的数据库中,没有所谓的GUID字段,所以应该把GUID(16个字节)转
化为字符串(32个字符),以字符串作为索引和主键,这样就比用整数做主键大
了整整8倍,不用测试,平经验就知道效率肯定赶不上整数做主键,但GUID字符串
做主键在某些应用上还是有优势的,特别是当多个数据库需要同步时,这是自动
加一的整数就忘尘莫及了。
既然字符串做为主键,我还是坚持我的看法,我觉得应该是前边的字符有序更利
于索引,所以这次一如既往,我将GUID的前6个字节替换掉了,这样生成的字符串
有序,在前6个字节中用两个字节表示但前时间距1970-1-1的天数,后4个字节表
示当前时间距0:0:0的时间点数,这样可以精确到1/300秒,我在数据库中做了1
千万条数据,无重复。在内存中用哈希表检测10亿条GUID无重复。按照这个算法
,前6个字节可以保证179.54794520547945205479452054795年无重复。
我的机器配置:
硬件: 迅驰1.8 G CPU,IBM T43 笔记本,2G内存
软件: .NET 1.1, 2.0,3.0,3.5
使用.NET 2.0测试
mysql 5.0
mysql provider 5.1(mysql官方驱动)
mysql数据库结构:
DROP TABLE IF EXISTS `maintable`;
CREATE TABLE `maintable` (
`szKey` varchar(32) NOT NULL default '',
`szValue` varchar(32) default NULL,
PRIMARY KEY (`szKey`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `subtable`;
CREATE TABLE `subtable` (
`szKey` varchar(32) NOT NULL default '',
`szFK` varchar(32) NOT NULL default '',
`szValue` varchar(32) default NULL,
PRIMARY KEY (`szKey`),
KEY `subtable_fk` (`szFK`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
应用程序界面: