ClamAV Virus Database Format Analysis
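Introduction to ClamAV
Clam AntiVirus (ClamAV) is free, open-source antivirus software; both the software and its virus signature updates are released free of charge by the community. ClamAV is currently used mainly on mail servers built on Unix-like systems such as Linux and FreeBSD, where it scans email for viruses. ClamAV itself runs in a text (command-line) interface, but many graphical front ends are available, and because it is open source it has also been ported to Windows and Mac OS X. (Source: ClamAV open-source homepage)
ClamAV's virus databases can be configured to update automatically after installation, or they can be fetched manually from the following URLs: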
http://db.cn.clamav.net/daily.cvd
http://db.cn.clamav.net/main.cvd
http://db.cn.clamav.net/safebrowsing.cvd
http://db.cn.clamav.net/bytecode.cvd
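Using sigtool
A downloaded .cvd database is a file compressed with the zlib library. Its first 512 bytes are a special header that records basic information about the database, including its name, build time, version number, and signature count. The individual database files are stored one after another behind the header, as a gzip-compressed tar archive. ClamAV ships a signature tool, sigtool, that can inspect, build, and unpack databases. At run time ClamAV unpacks .cvd files with the cli_untgz function; reading that code shows the process in detail.
Viewing CVD information with sigtool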
[root@0a75584e9acf updata]# ./sigtool -i main.cvd
File: main.cvd
Build time: 07 Jun 2017 17:38 -0400
Version: 58
Signatures: 4566249
Functionality level: 60
Builder: sigmgr
MD5: 57462fd73f1cfdb356b9dca66da2b732
Digital signature: KWRdhTG+Own6ohh0wn5+vqg1d8ULKCxxxQeKuSA155B3ijxBKgf+bV3IXPcmZrIBUDn1xi8FmyvB63UieykwN/Avq5mTjHIVO8zFnC7wVF7dhdcEYn9Nt+Pmk/HXXx0voylYkidvgZmrxI8jx4a/Re6n3hHQJoCZrkHM15GER8j
Verification OK.
[root@0a75584e9acf updata]#
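The fields printed by `sigtool -i` come from the 512-byte header. The following is a minimal Python sketch of reading that header. It assumes the header is a colon-separated ASCII string padded with spaces, with the fields in the order magic string, build time, version, signature count, functionality level, MD5, digital signature, builder, build timestamp. The function names and dictionary keys are illustrative and are not part of ClamAV.

```python
HEADER_SIZE = 512  # header length stated in the text

# Assumed field order of the colon-separated header; key names are illustrative.
FIELDS = ["magic", "build_time", "version", "signatures",
          "functionality_level", "md5", "digital_signature",
          "builder", "stime"]


def parse_cvd_header(raw: bytes) -> dict:
    """Split the first 512 bytes of a .cvd file into named fields."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("a CVD header is 512 bytes long")
    text = raw[:HEADER_SIZE].decode("ascii", errors="replace").rstrip()
    parts = text.split(":")
    if parts[0] != "ClamAV-VDB" or len(parts) < 8:
        raise ValueError("not a CVD header")
    header = dict(zip(FIELDS, parts))
    # Numeric fields are stored as decimal text.
    for key in ("version", "signatures", "functionality_level"):
        header[key] = int(header[key])
    return header


def read_cvd_header(path: str) -> dict:
    """Read and parse the header of the .cvd file at `path`."""
    with open(path, "rb") as fh:
        return parse_cvd_header(fh.read(HEADER_SIZE))
```

Calling `read_cvd_header("main.cvd")` on the file shown above should yield the same version, signature count, and functionality level that `sigtool -i` reports, provided the assumed field order holds.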
Unpacking main.cvd with sigtool
[root@0a75584e9acf updata]# ./sigtool -u main.cvd
[root@0a75584e9acf updata]# ll
total 470304
-rw-r--r--. 1 root root 17992 Mar 28 06:41 COPYING
-rw-r--r--. 1 root root 199693 Feb 23 10:31 bytecode.cvd
-rw-r--r--. 1 root root 55604157 Feb 23 10:31 daily.cvd
-rw-r--r--. 1 root root 44 Mar 28 06:41 main.crb
-rw-r--r--. 1 root root 117892267 Jan 10 06:54 main.cvd
-rw-r--r--. 1 root root 27584 Mar 28 06:41 main.fp
-rw-r--r--. 1 root root 3649543 Mar 28 06:41 main.hdb
-rw-r--r--. 1 root root 24806499 Mar 28 06:41 main.hsb
-rw-r--r--. 1 root root 1060 Mar 28 06:41 main.info
-rw-r--r--. 1 root root 255481374 Mar 28 06:41 main.mdb
-rw-r--r--. 1 root root 92 Mar 28 06:41 main.msb
-rw-r--r--. 1 root root 23505408 Mar 28 06:41 main.ndb
-rw-r--r--. 1 root root 87 Mar 28 06:41 main.sfp
-rw-r--r--. 1 root root 52 Jan 10 06:53 mirrors.dat
-rwxr-xr-x. 1 root root 363966 Mar 28 06:40 sigtool
[root@0a75584e9acf updata]#
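What `sigtool -u` does can be approximated in a few lines of Python. This is a sketch only: it assumes that everything after the 512-byte header is a gzip-compressed tar archive, it skips the digital-signature verification that sigtool performs, and `unpack_cvd` is a hypothetical helper name.

```python
import os
import shutil
import tarfile

HEADER_SIZE = 512  # header length stated in the text


def unpack_cvd(cvd_path: str, dest_dir: str) -> list:
    """Extract the database files stored after the CVD header."""
    names = []
    with open(cvd_path, "rb") as fh:
        fh.seek(HEADER_SIZE)  # skip the header
        # "r|gz" reads the rest of the file as a gzip-compressed tar stream.
        with tarfile.open(fileobj=fh, mode="r|gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                # Keep only the base name so nothing is written outside dest_dir.
                name = os.path.basename(member.name)
                source = archive.extractfile(member)
                with open(os.path.join(dest_dir, name), "wb") as out:
                    shutil.copyfileobj(source, out)
                names.append(name)
    return names
```

For example, `unpack_cvd("main.cvd", ".")` would be expected to produce files such as main.hdb and main.ndb, as in the directory listing above.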
ClamAV database formats (after unpacking a CVD)
A. Hash-based signatures
A.1 MD5 hash-based signatures, file extension *.hdb
Format:
HashString:FileSize:MalwareName
Example:
507d8f868c27feb88b18e6f8426adf1c:12391:Win.Exploit.CVE_2013_3163
Generating the signature with sigtool:
sigtool --md5 test.exe > test.hdb
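The HashString:FileSize:MalwareName layout is simple enough to reproduce by hand. The sketch below builds and checks such a line with Python's hashlib; `make_hdb_line` and `matches_hdb` are hypothetical helper names, and the sketch says nothing about how ClamAV itself stores or looks up these entries.

```python
import hashlib


def make_hdb_line(data: bytes, malware_name: str) -> str:
    """Build a HashString:FileSize:MalwareName line for the given content."""
    md5 = hashlib.md5(data).hexdigest()
    return f"{md5}:{len(data)}:{malware_name}"


def matches_hdb(line: str, data: bytes) -> bool:
    """Check whether `data` has the size and MD5 recorded in an .hdb line."""
    hash_string, file_size, _name = line.strip().split(":", 2)
    if int(file_size) != len(data):
        return False  # cheap size check first, hash only when sizes agree
    return hashlib.md5(data).hexdigest() == hash_string.lower()


line = make_hdb_line(b"hello", "Test.Sample")
print(line)
print(matches_hdb(line, b"hello"))
```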
A.2 SHA1 and SHA256 hash-based signatures, file extension *.hsb
Format:
HashString:FileSize:MalwareName
Example:
71e7b604d18aefd839e51a39c88df8383bb4c071dc31f87f00a2b5df580d4495:544:clam.exe
Generating the signature with sigtool:
sigtool --sha1 test.exe > test.hsb
sigtool --sha256 test.exe > test.hsb
A.3 PE section hash-based signatures, file extension *.mdb
Format:
PESectionSize:PESectionHash:MalwareName
Example:
23040:32f0b60a78f632259c16b0e94ae8ebbc:Win.Trojan.Qqpass-1714
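Generating the signature with sigtool:
sigtool --mdb test.exe > test.mdb
A.4 Hash signatures for files of unknown size
Supported since ClamAV 0.98: the wildcard * stands for an unknown file size, with one restriction: the signature must require functionality level 73 or higher.
Example:
Signatures that match an unknown size in the .hsb and .mdb formats above look like this, respectively: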
HashString:*:MalwareName:73
*:PESectionHash:MalwareName:73
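The matching rule for .hsb-style lines, including the `*` size wildcard and the trailing functionality level, can be sketched as follows. Inferring the hash algorithm from the digest length and the helper name `matches_hash_sig` are assumptions of this sketch, not documented ClamAV behavior.

```python
import hashlib

# Assumption of this sketch: the algorithm is inferred from the hex digest length.
ALGO_BY_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}


def matches_hash_sig(line: str, data: bytes, engine_flevel: int = 73) -> bool:
    """Match HashString:FileSize:MalwareName[:MinFL] against file content."""
    fields = line.strip().split(":")
    hash_string, size, _name = fields[:3]
    min_flevel = int(fields[3]) if len(fields) > 3 else 0
    if engine_flevel < min_flevel:
        return False  # the signature requires a newer engine
    if size != "*" and int(size) != len(data):
        return False  # "*" means the file size is unknown and is not checked
    algo = ALGO_BY_LENGTH[len(hash_string)]
    return hashlib.new(algo, data).hexdigest() == hash_string.lower()
```

With `*` as the size, only the hash is compared, so every candidate file has to be hashed; a concrete size lets most files be rejected by a length comparison alone.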
B. Body-based signatures
B.1 Hexadecimal format
sigtool --hex-dump reads a printable string from standard input and converts it to hexadecimal.
Example:
zolw@localhost:/tmp/test$ sigtool --hex-dump
How do I look in hex?
486f7720646f2049206c6f6f6b20696e206865783f0a
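The same conversion is easy to reproduce in Python, which helps when signatures are generated by a script. This is a minimal sketch; `hex_dump` is a hypothetical helper, and the trailing `0a` in the sigtool output above corresponds to the newline at the end of the input.

```python
def hex_dump(text: str) -> str:
    """Return the hexadecimal form of a string, as used in body signatures."""
    return text.encode("utf-8").hex()


def hex_to_text(hex_string: str) -> str:
    """Inverse operation: turn a plain hex string back into text."""
    return bytes.fromhex(hex_string).decode("utf-8")


print(hex_dump("How do I look in hex?\n"))
# prints 486f7720646f2049206c6f6f6b20696e206865783f0a
```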
B.2 Wildcards
The supported wildcards are:
• ??
Match any byte.
• a?
Match a high nibble (the four high bits).
IMPORTANT NOTE: The nibble matching is only available in libclamav
with the functionality level 17 and higher, therefore please only use
it with .ndb signatures followed by ":17".
• ?a
Match a low nibble (the four low bits).
• *
Match any number of bytes.
• {n}
Match n bytes.
• {-n}
Match n or less bytes.
• {n-}
Match n or more bytes.
• {n-m}
Match between n and m bytes (m > n).
• (aa|bb|cc|..)
Match aa or bb or cc..
• !(aa|bb|cc|..)
Match any byte except aa and bb and cc.. (ClamAV≥0.96)
• (aaaa|bbbb|cccc|..)
Match alternative strings aaaa or bbbb or cccc. Alternative strings must
have identical lengths.
• !(aaaa|bbbb|cccc|..)
Match any string except aaaa and bbbb and cccc. Alternative strings must
have identical lengths. (ClamAV≥0.98.2)
• HEXSIG[x-y]aa or aa[x-y]HEXSIG
Match aa anchored to a hex-signature, see
https://bugzilla.clamav.net/show_bug.cgi?id=776 for discussion and examples.
• (B)
Match word boundary (including file boundaries).
• (L)
Match CR, CRLF or file boundaries.
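One way to get a feel for these wildcards is to translate a hex signature into a regular expression over bytes. The sketch below handles only a subset (literal bytes, `??`, nibble wildcards, `*`, the `{...}` ranges, and `(aa|bb|..)` alternatives); negation, anchored bytes, `(B)`, and `(L)` raise an error. ClamAV uses its own matchers rather than regular expressions, so this is an illustration of the semantics, and all function names are hypothetical.

```python
import re

# One token of the hex-signature syntax (subset only).
TOKEN = re.compile(
    r"(?P<byte>[0-9a-fA-F?]{2})"                           # aa, ??, a?, ?a
    r"|(?P<star>\*)"                                        # *
    r"|(?P<range>\{(?P<lo>\d*)(?P<dash>-?)(?P<hi>\d*)\})"   # {n} {-n} {n-} {n-m}
    r"|\((?P<alt>[0-9a-fA-F|]+)\)"                          # (aa|bb|..)
)


def _byte(tok: str) -> str:
    """Regex for one byte token: literal, ?? or a nibble wildcard."""
    if tok == "??":
        return "."
    if tok[1] == "?":    # a? : high nibble fixed
        base = int(tok[0], 16) << 4
        values = range(base, base + 16)
    elif tok[0] == "?":  # ?a : low nibble fixed
        values = range(int(tok[1], 16), 256, 16)
    else:
        return f"\\x{int(tok, 16):02x}"
    return "[" + "".join(f"\\x{v:02x}" for v in values) + "]"


def _literal(hexstr: str) -> str:
    """Regex for a run of literal hex bytes."""
    if not hexstr or len(hexstr) % 2:
        raise ValueError(f"bad hex string: {hexstr!r}")
    pairs = (hexstr[i:i + 2] for i in range(0, len(hexstr), 2))
    return "".join(f"\\x{int(p, 16):02x}" for p in pairs)


def compile_hex_signature(sig: str) -> re.Pattern:
    """Translate a hex signature into a compiled bytes regex."""
    parts, pos = [], 0
    while pos < len(sig):
        m = TOKEN.match(sig, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        if m["byte"]:
            parts.append(_byte(m["byte"]))
        elif m["star"]:
            parts.append(".*")
        elif m["alt"]:
            options = m["alt"].split("|")
            parts.append("(?:" + "|".join(_literal(o) for o in options) + ")")
        else:
            lo, dash, hi = m["lo"], m["dash"], m["hi"]
            if not dash and not lo:
                raise ValueError("empty {} range")
            parts.append(f".{{{lo}}}" if not dash else f".{{{lo or 0},{hi}}}")
        pos = m.end()
    return re.compile("".join(parts).encode("ascii"), re.DOTALL)


pattern = compile_hex_signature("6c6f63*3a2f2f")  # "loc", anything, "://"
print(bool(pattern.search(b"location = 'http://example'")))  # True
```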
B.3 Basic signature format, file extension *.db
Format:
MalwareName=HexSignature
Example:
Trojan.URLspoof.gen(Clam)=6c6f636174696f6e2e687265663d756e6573636170652827*3a2f2f*25303140*2729
B.4 Extended signature format, file extension *.ndb
Format:
MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]]
Example:
WIN.Trojan.Lolu:1:*:6e23692300000000ffffffff0400000022202f6600000000ffffffff0100000041000000ffffffff0100000043000000ffffffff07000000636d*6f72644c756369
Options:
TargetType:
• 0 = any file
• 1 = Portable Executable, both 32- and 64-bit.
• 2 = file inside OLE2 container (e.g. image, embedded executable, VBA
script).
• 3 = HTML
• 4 = Mail file
• 5 = Graphics
• 6 = ELF
• 7 = ASCII text file (normalized)
• 8 = Unused
• 9 = Mach-O files
• 10 = PDF files
• 11 = Flash files
• 12 = Java class files
Offset:
• * = any
• n = absolute offset
• EOF-n = end of file minus n bytes
Signatures for PE, ELF and Mach-O files additionally support:
• EP+n = entry point plus n bytes (EP+0 for EP)
• EP-n = entry point minus n bytes
• Sx+n = start of section x’s (counted from 0) data plus n bytes
• SEx = entire section x (offset must lie within section boundaries)
• SL+n = start of last section plus n bytes
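Splitting an .ndb line into its fields is straightforward, because none of the fields contains a colon. The following sketch uses a hypothetical `NdbSignature` dataclass and a lookup table copied from the TargetType list above; it only parses the line and does not interpret the offset or the hex signature.

```python
from dataclasses import dataclass
from typing import Optional

# Names taken from the TargetType list above.
TARGET_TYPES = {
    0: "any file", 1: "Portable Executable", 2: "file inside OLE2 container",
    3: "HTML", 4: "Mail file", 5: "Graphics", 6: "ELF",
    7: "ASCII text file (normalized)", 8: "Unused", 9: "Mach-O files",
    10: "PDF files", 11: "Flash files", 12: "Java class files",
}


@dataclass
class NdbSignature:
    name: str
    target_type: int
    offset: str
    hex_signature: str
    min_flevel: Optional[int] = None
    max_flevel: Optional[int] = None


def parse_ndb_line(line: str) -> NdbSignature:
    """Parse MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]]."""
    fields = line.strip().split(":")
    if not 4 <= len(fields) <= 6:
        raise ValueError(f"expected 4 to 6 fields, got {len(fields)}")
    name, target, offset, hex_signature = fields[:4]
    min_fl = int(fields[4]) if len(fields) > 4 and fields[4] else None
    max_fl = int(fields[5]) if len(fields) > 5 and fields[5] else None
    return NdbSignature(name, int(target), offset, hex_signature, min_fl, max_fl)


sig = parse_ndb_line("my_test_vi_sig:1:EP+0:5589e581ec:17")
print(sig.name, TARGET_TYPES[sig.target_type], sig.offset, sig.min_flevel)
```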
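B.5 Logical signatures, file extension *.ldb
Logical signatures combine several extended-format signatures (up to 64) with logical operators, which allows more detailed and flexible pattern matching.
Format:
SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0;Subsig1;Subsig2;…
Example: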
Exploit.MS08-067;Target:1;(0&1)&(2|3|4|5);5c5c25735c495043;6e6361636e5f6e70;2e2e5c2e2e;2e2e5c5c2e2e;2e002e005c002e002e;2e002e005c005c002e002e
Trojan.Generic.FakeAV;Engine:51-255,Target:1,IconGroup2:FAKEAV;(0);EP+0:5589e581ec
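Options:
• TargetDescriptionBlock is a list of key:value pairs describing the engine or the file format. As of version 0.95, only Target:X (target file type, see the TargetType list in B.4) and Engine:X-Y (engine functionality level range) are supported.
• LogicalExpression: the numeric indices 0, 1, …, N stand for the subsignatures Subsig0, Subsig1, …, SubsigN.
Subexpressions A and B can be combined or qualified as follows: (A&B), (A|B), A=X, A=X,Y, A>X, A>X,Y, A<X, A<X,Y. The comparisons =, > and < apply to the number of matches; for example, A=X means that A matches exactly X times.
• The maximum number of subsignatures is 64.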
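The structure of a logical signature can be explored with a short parser and evaluator. The sketch below splits an .ldb line on semicolons and evaluates the boolean part of a LogicalExpression against the set of subsignature indices that matched. It is deliberately limited: the count modifiers (=, >, <) are rejected, and giving & higher precedence than | is an assumption of this sketch, so fully parenthesized expressions are the safe input. All names are hypothetical.

```python
import re


def parse_ldb_line(line: str) -> dict:
    """Split SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0;..."""
    name, tdb, expression, *subsigs = line.strip().split(";")
    target_description = dict(item.split(":", 1) for item in tdb.split(","))
    return {"name": name, "tdb": target_description,
            "expression": expression, "subsigs": subsigs}


def evaluate(expression: str, matched: set) -> bool:
    """Evaluate an expression made of indices, &, | and parentheses."""
    if re.fullmatch(r"[0-9()&|]+", expression) is None:
        raise ValueError("only indices, &, | and parentheses are handled")
    tokens = re.findall(r"\d+|[()&|]", expression)
    pos = 0

    def atom() -> bool:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = either()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if not token.isdigit():
            raise ValueError(f"unexpected token {token!r}")
        return int(token) in matched  # did subsignature N match?

    def both() -> bool:
        nonlocal pos
        value = atom()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = atom()
            value = value and rhs
        return value

    def either() -> bool:
        nonlocal pos
        value = both()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = both()
            value = value or rhs
        return value

    result = either()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


# Subsignatures 0, 1 and 4 matched: (0&1) holds and (2|3|4|5) holds.
print(evaluate("(0&1)&(2|3|4|5)", {0, 1, 4}))  # True
```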
C. PE file icon signatures, file extension *.idb
Supported since ClamAV 0.96; icons are matched approximately (fuzzy matching).
Format:
ICONNAME:GROUP1:GROUP2:ICON_HASH
Example:
Trojan.Bifrose.NSP.3204:UNUSED:BIFROSE:204bb0f0e35206053520f05000000000008000001000ff0000ff1400ff0013680c116e140b6f0c04680d0e570c0634040b00000000001500161700580b07
Options:
• ICONNAME is a unique string identifier for a specific icon
• GROUP1 is a string identifier for the first group of icons (IconGroup1)
• GROUP2 is a string identifier for the second group of icons (IconGroup2)
• ICON_HASH is a fuzzy hash of the icon image
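D. PE file version-information metadata signatures, reusing the *.ndb file extension
Since ClamAV 0.96, signatures can match the version information embedded in PE files. This information is stored under the name VS_VERSION_INFORMATION and has two parts:
1) Numbers and flags that identify the file version. They were originally intended to let an installer decide whether a PE file or an associated library needs to be updated or overwritten.
2) A simple list of key/value strings intended as information for the user. For example, ping.exe carries the company name "Microsoft Corporation", the description "TCP/IP Ping command", the internal name "ping.exe", and so on. Depending on the OS version, some keys may be displayed more prominently in the file properties dialog.
To match a key/value pair in the version information, a special file offset anchor, VI, is provided; it works like the other anchors (for example EP and SL).
For example, the data for a VI-based signature can be obtained by running clamscan --debug freecell.exe: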
[…]
VI:43006f006d00700061006e0079004e0061006d006500000000004d006900
630072006f0073006f0066007400200043006f00720070006f0072006100740
069006f006e000000
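VI-based signatures can be used directly inside logical signatures; they can also be written in an .ndb file in the following format:
my_test_vi_sig:1:VI:paste_your_hex_sig_here
To convert a VI-based signature into readable text, run:
echo hex_string | xxd -r -p | strings -el
For example: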
$ echo 460069006c0065004400650073006300720069007000740069006f006e000000000045006e007400650072007400610069006e006d0065006e00740020005000610063006b0020004600720065006500430065006c006c002000470061006d0065000000 | xxd -r -p | strings -el
FileDescription
Entertainment Pack FreeCell Game
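The `xxd -r -p | strings -el` pipeline converts hex to bytes and then reads 16-bit little-endian text. A rough Python equivalent, assuming the hex string holds NUL-separated UTF-16LE strings as in the example above (`decode_vi_hex` is a hypothetical helper name):

```python
def decode_vi_hex(hex_string: str) -> list:
    """Decode a VI hex signature into its readable UTF-16LE strings."""
    text = bytes.fromhex(hex_string).decode("utf-16-le")
    # Keys and values are separated by NUL characters; drop the empty pieces.
    return [part for part in text.split("\x00") if part]


def encode_vi_pair(key: str, value: str) -> str:
    """Build the hex for a key/value pair laid out like the example above."""
    return (key + "\x00\x00" + value + "\x00").encode("utf-16-le").hex()


print(decode_vi_hex(encode_vi_pair("FileDescription", "TCP/IP Ping command")))
```

The NUL padding between key and value in `encode_vi_pair` simply mirrors the FreeCell example; real files may pad differently, so hex taken from `clamscan --debug` output is the more reliable source.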
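E. Trusted and revoked certificates, file extension .crb (.crtdb is also seen)
Since version 0.98, ClamAV checks signed PE files for certificates and verifies each certificate against a database of trusted and revoked certificates.
Format: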
Name;Trusted;Subject;Serial;Pubkey;Exponent;CodeSign;TimeSign;CertSign;
NotBefore;Comment[;minFL[;maxFL]]
Example:
crtdb.4918813;0;41f84339593d0cdb2733a7f5b3c8952fd2ebd66e;1d9ee03b2102a9eab94acbcb086c89f3cbe2ca01;D85256A226BF048F7C65C570B8A82A40CAA755EBEB06F1F5A8D054A3EBD5400D708248E0CDCC957B51EEDE26F7A4826E3DD77B81506C97120D7981F3F62E275B2A6309EAE4C85C334D339AE12F3861D82A03CBEA37A6E2109FE572E0A7E6778520E423135F626A9E172E62BAC1017B3AD4500ACDCD93FE9C0D99333D80CFD6A02FE7398295662CD1B97205A0891F51738292CC27E8799AEE48C88554F9AC3F5F4C4C10BF3127C0BE5931C5E8425A263DBB130B3E9D6630CDB76A95B2370BC69C8C5FC8ABA658000C55D9889E24F98059BC571D57C945DAA6238179E9795E03F274B6F8E032B45BD9F25BC5436356C3CAF4D86D2083A5ED6EE014C4CCC48DB48D;010001;1;0;0;;hohhothandingtradeandbusiness-6ed2450ceac0f72e73fda1727e66e654
Options:
• Name: name of the entry
• Trusted: bit field, specifying whether the cert is trusted. 1 for trusted. 0
for revoked
• Subject: sha1 of the Subject field in hex
• Serial: the serial number as clamscan --debug --verbose reports
• Pubkey: the public key in hex
• Exponent: the exponent in hex. Currently ignored and hardcoded to 010001
(in hex)
• CodeSign: bit field, specifying whether this cert can sign code. 1 for true,
0 for false
• TimeSign: bit field. 1 for true, 0 for false
• CertSign: bit field, specifying whether this cert can sign other certs. 1 for
true, 0 for false
• NotBefore: integer, cert should not be added before this variable. Defaults
to 0 if left empty
• Comment: comments for this entry
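A .crb entry can be turned into a dictionary keyed by the field names listed above. This sketch assumes that no field contains a semicolon; `parse_crb_line` is a hypothetical helper, and the sample entry uses shortened placeholder values rather than real certificate data.

```python
CRB_FIELDS = ["Name", "Trusted", "Subject", "Serial", "Pubkey", "Exponent",
              "CodeSign", "TimeSign", "CertSign", "NotBefore", "Comment",
              "minFL", "maxFL"]


def parse_crb_line(line: str) -> dict:
    """Parse one certificate entry into a dict keyed by field name."""
    values = line.strip().split(";")
    if not 11 <= len(values) <= 13:
        raise ValueError(f"expected 11 to 13 fields, got {len(values)}")
    entry = dict(zip(CRB_FIELDS, values))
    # The bit fields use 1 for true and 0 for false.
    for key in ("Trusted", "CodeSign", "TimeSign", "CertSign"):
        entry[key] = entry[key] == "1"
    # NotBefore defaults to 0 if left empty.
    entry["NotBefore"] = int(entry["NotBefore"] or 0)
    return entry


# Placeholder values, not a real certificate.
entry = parse_crb_line("crtdb.example;0;aa11;bb22;cc33;010001;1;0;0;;sample-comment")
print(entry["Name"], "trusted" if entry["Trusted"] else "revoked")
```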
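F. Container metadata, file extension *.cdb
Since version 0.96, ClamAV supports signatures that match files stored inside containers that meet specific conditions. Container files include zip, rar, arj, 7z, tar, and similar archives.
Format: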
VirusName:ContainerType:ContainerSize:FileNameREGEX:
FileSizeInContainer:FileSizeReal:IsEncrypted:FilePos:
Res1:Res2[:MinFL[:MaxFL]]
Options:
• VirusName: Virus name to be displayed when signature matches
• ContainerType: one of CL_TYPE_ZIP, CL_TYPE_RAR, CL_TYPE_ARJ,
CL_TYPE_CAB, CL_TYPE_7Z, CL_TYPE_MAIL, CL_TYPE_(POSIX|OLD)_TAR,
CL_TYPE_CPIO_(OLD|ODC|NEWC|CRC) or * to match any of the container
types listed here
• ContainerSize: size of the container file itself (eg. size of the zip archive)
specified in bytes as absolute value or range x-y
• FileNameREGEX: regular expression describing name of the target file
• FileSizeInContainer: usually compressed size; for MAIL, TAR and
CPIO == FileSizeReal; specified in bytes as absolute value or range
• FileSizeReal: usually uncompressed size; for MAIL, TAR and CPIO ==
FileSizeInContainer; absolute value or range
• IsEncrypted: 1 if the target file is encrypted, 0 if it’s not and * to ignore
• FilePos: file position in container (counting from 1); absolute value or
range
• Res1: when ContainerType is CL_TYPE_ZIP or CL_TYPE_RAR this field is
treated as a CRC sum of the target file specified in hexadecimal format; for
other container types it’s ignored
• Res2: not used as of ClamAV 0.96
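Several of the fields above accept either an absolute value or a range x-y. A small helper makes that rule concrete. This is a sketch: accepting `*` for every numeric field is an assumption here (the list above documents it only for ContainerType and IsEncrypted), and `field_matches` is a hypothetical name.

```python
def field_matches(spec: str, value: int) -> bool:
    """Check a numeric .cdb field given as an absolute value or a range x-y."""
    if spec == "*":
        return True  # assumption: "*" ignores the field
    if "-" in spec:
        low, high = spec.split("-", 1)
        return int(low) <= value <= int(high)
    return int(spec) == value


# A container of 1500 bytes against a ContainerSize of 1000-2000,
# and the first file in the container against a FilePos of 1.
print(field_matches("1000-2000", 1500), field_matches("1", 1))
```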
H. Signatures based only on ZIP/RAR metadata, file extensions .zmd (zip) and .rmd (rar)
Format:
virname:encrypted:filename:normal size:csize:crc32:cmethod:
fileno:max depth
Example:
Worm.Bagle-8-zippwd:1::94126:83556::8:3:2
Rar.Suspect.ExecutableFax-rarpwd::(?i)(incomingfax|fax[0-9]{3,}).(exe|scr)$::::::*
Options:
• Virus name
• Encryption flag (1 – encrypted, 0 – not encrypted)
• File name (this is a regular expression - * to ignore)
• Normal (uncompressed) size (* to ignore)
• Compressed size (* to ignore)
• CRC32 (* to ignore)
• Compression method (* to ignore)
• File position in archive (* to ignore)
• Maximum number of nested archives (* to ignore)
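I. Whitelist databases
The whitelist database for MD5 signatures uses the .fp extension; the whitelist database for SHA1 and SHA256 signatures uses the .sfp extension.
J. Phishing signatures
This section draws on the English document https://github.com/vrtadmin/clamav-devel/tree/master/docs/phishsigs_howto.pdf; more information is available at http://www.antiphishing.org
J.1 Phishing URL/host signatures, file extension *.pdb
Format: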
R[Filter]:RealURL:DisplayedURL[:FuncLevelSpec]
H[Filter]:DisplayedHostname[:FuncLevelSpec]
Example:
R:.+.ebay.com([/?].)?:gotoebay.co.uk([/?].)?:17-
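Options:
- R: the entry is a regular expression
- H: the entry matches DisplayedHostname exactly
- Filter: ignored (kept for compatibility)
- RealURL: the real URL, for example the target of an HTML link
- DisplayedURL: the descriptive URL shown to the user
- FuncLevelSpec: has two forms: a) minlevel: the rule is loaded by every engine whose functionality level >= minlevel; b) minlevel-maxlevel: the rule is loaded by every engine whose functionality level >= minlevel and <= maxlevel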
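The FuncLevelSpec rule can be written as a short predicate. The sketch below covers the two forms described above and additionally treats an open-ended spec such as `17-` (as in the example rule) as having no upper bound, which is an assumption; `flevel_allows` is a hypothetical helper name.

```python
def flevel_allows(spec: str, engine_flevel: int) -> bool:
    """Decide whether an engine at `engine_flevel` loads a rule with `spec`."""
    if not spec:
        return True  # no FuncLevelSpec given: the rule is always loaded
    low, dash, high = spec.partition("-")
    if engine_flevel < int(low):
        return False
    # Assumption: "17-" (dash but no maxlevel) means no upper bound.
    if dash and high and engine_flevel > int(high):
        return False
    return True


print(flevel_allows("17-", 60), flevel_allows("17-20", 60))
```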
J.2 Phishing URL hash signatures, file extension *.gdb
These signatures come from the Google Safe Browsing database file safebrowsing.cvd.
Format:
S:P:HostPrefix[:FuncLevelSpec]
S:F:Sha256hash[:FuncLevelSpec]
S1:P:HostPrefix[:FuncLevelSpec]
S1:F:Sha256hash[:FuncLevelSpec]
S2:P:HostPrefix[:FuncLevelSpec]
S2:F:Sha256hash[:FuncLevelSpec]
S:W:Sha256hash[:FuncLevelSpec]
Example:
S2:F:00003d62bace3cc49cbd245aeb36e46e64841fdb9c74034d364c7bd31eb6a254
Options:
- S: These are hashes for Google Safe Browsing - malware sites, and should not be used for other purposes.
- S2: These are hashes for Google Safe Browsing - phishing sites, and should not be used for other purposes.
- S1: Hashes for blacklisting phishing sites. Virus name: Phishing.URL.Blacklisted
- S:W Locally whitelisted hashes.
- HostPrefix 4-byte prefix of the sha256 hash of the last 2 or 3 components of the host-name. If prefix doesn’t match, no further lookups are performed.
- Sha256hash sha256 hash of the canonicalized URL, or a sha256 hash of its prefix/suffix according to the Google Safe Browsing "Performing Lookups" rules. There should be a corresponding :P:HostkeyPrefix entry for the hash to be taken into consideration.
J.3 Phishing URL whitelist signatures, file extension *.wdb
These signatures come from the Google Safe Browsing database file safebrowsing.cvd.
Format:
X:RealURL:DisplayedURL[:FuncLevelSpec]
M:RealHostname:DisplayedHostname[:FuncLevelSpec]
X means that the URLs that follow are regular expressions; M means that what follows are host names.
Example:
M:info.searscard.com:sears.com
X:.+.ebay.com([/?].)?:gotoebay.co.uk([/?].)?:17-
Summary
The signature files described above are summarized below:
File extension | Signature type | Signature format |
---|---|---|
*.hdb | MD5 hash-based signature | HashString:FileSize:MalwareName |
*.hsb | SHA1 and SHA256 hash-based signature | HashString:FileSize:MalwareName |
*.mdb | PE section hash-based signature | PESectionSize:PESectionHash:MalwareName |
*.db | Body-based basic signature | MalwareName=HexSignature |
*.ndb | Body-based extended signature | MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]] |
*.ldb | Body-based logical signature | SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0;Subsig1;Subsig2;… |
*.idb | PE file icon signature | ICONNAME:GROUP1:GROUP2:ICON_HASH |
*.ndb | PE file version-information metadata signature | my_test_vi_sig:1:VI:paste_your_hex_sig_here |
*.crb, *.crtdb | Trusted and revoked certificate signature | Name;Trusted;Subject;Serial;Pubkey;Exponent;CodeSign;TimeSign;CertSign;NotBefore;Comment[;minFL[;maxFL]] |
*.cdb | Container metadata signature | VirusName:ContainerType:ContainerSize:FileNameREGEX:FileSizeInContainer:FileSizeReal:IsEncrypted:FilePos:Res1:Res2[:MinFL[:MaxFL]] |
*.zmd (zip) and *.rmd (rar) | Signature based only on ZIP/RAR metadata | virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth |
*.fp (MD5) and *.sfp (SHA1 and SHA256) | Whitelist database | HashString:FileSize:MalwareName |
*.pdb | Phishing URL/host signature | R[Filter]:RealURL:DisplayedURL[:FuncLevelSpec] or H[Filter]:DisplayedHostname[:FuncLevelSpec] |
*.gdb | Phishing URL hash signature | S:P:HostPrefix[:FuncLevelSpec] or S:F:Sha256hash[:FuncLevelSpec] or S1:P:HostPrefix[:FuncLevelSpec] or S1:F:Sha256hash[:FuncLevelSpec] or S2:P:HostPrefix[:FuncLevelSpec] or S2:F:Sha256hash[:FuncLevelSpec] or S:W:Sha256hash[:FuncLevelSpec] |
*.wdb | Phishing URL whitelist signature | X:RealURL:DisplayedURL[:FuncLevelSpec] or M:RealHostname:DisplayedHostname[:FuncLevelSpec] |
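Counting the signatures in the databases
After unpacking, each signature occupies one line, so counting lines gives the number of signatures. The following command counts the signatures in the three databases main.cvd, bytecode.cvd, and daily.cvd: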
./sigtool -u main.cvd && ./sigtool -u bytecode.cvd && ./sigtool -u daily.cvd && wc -l `ls main.* daily.* *.cbc | grep -v -e cvd`
The output is as follows:
[root@0a75584e9acf updata]# ./sigtool -u main.cvd && ./sigtool -u bytecode.cvd && ./sigtool -u daily.cvd && wc -l `ls main.* daily.* *.cbc | grep -v -e cvd`
59 3986187.cbc
26 3986188.cbc
39 3986206.cbc
102 3986212.cbc
14 3986214.cbc
14 3986215.cbc
...
...
...
59104 main.hdb
347038 main.hsb
10 main.info
4059199 main.mdb
1 main.msb
100555 main.ndb
1 main.sfp
6831572 total
[root@0a75584e9acf updata]#
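The same count can be produced from Python, which is convenient when the numbers feed into other tooling. This sketch mirrors the `wc -l` pipeline above: it counts newline characters in every main.*, daily.*, and *.cbc file except the .cvd archives. `count_signatures` is a hypothetical helper name.

```python
from pathlib import Path


def count_signatures(directory: str) -> dict:
    """Count lines per unpacked database file, as `wc -l` does."""
    counts = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix == ".cvd":
            continue  # skip the packed archives themselves
        wanted = path.name.startswith(("main.", "daily.")) or path.suffix == ".cbc"
        if not wanted:
            continue
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large files such as main.mdb stay cheap.
            counts[path.name] = sum(
                chunk.count(b"\n") for chunk in iter(lambda: fh.read(1 << 20), b"")
            )
    return counts


if __name__ == "__main__":
    totals = count_signatures(".")
    for name, lines in totals.items():
        print(lines, name)
    print(sum(totals.values()), "total")
```

As with the shell version, the total includes files such as main.info that hold metadata rather than signatures, so it is an upper bound on the real signature count.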