MalStone:
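It consists of two benchmarks, MalStone A-10 and MalStone B-10.
The input is 10 billion records generated with malgen. Every record carries a timestamp, and the timestamps are randomly distributed over one year. Each record is about 100 bytes.
Record format: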
Event ID | Timestamp | Site ID | Compromise Flag | Entity ID
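As a rough illustration of the layout above, the sketch below splits one record into its five fields. Several details are assumptions rather than facts from these notes: that fields are separated by "|" with optional surrounding spaces, that the timestamp looks like "2008-03-01 12:00:00", and that the compromise flag is written as "1" or "0". The names Record and parse_record are made up for this example.

```python
from datetime import datetime
from typing import NamedTuple


class Record(NamedTuple):
    """One log record; field order follows the layout line above."""
    event_id: str
    timestamp: datetime
    site_id: str
    compromised: bool
    entity_id: str


def parse_record(line: str, ts_format: str = "%Y-%m-%d %H:%M:%S") -> Record:
    """Split one pipe-delimited line into its five fields.

    The delimiter, the timestamp format, and the "1"/"0" flag encoding
    are assumptions made for this sketch.
    """
    fields = [f.strip() for f in line.split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    event_id, ts, site_id, flag, entity_id = fields
    return Record(
        event_id=event_id,
        timestamp=datetime.strptime(ts, ts_format),
        site_id=site_id,
        compromised=(flag == "1"),
        entity_id=entity_id,
    )
```

Only the site, the timestamp, and the flag feed into the statistic; the event and entity IDs are carried along unchanged.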
Pseudocode for the computation:
for record in read( data )
    ( site, date, compromised_indicator ) = parse( record )
group by site
for each site
    map: date --> timeslice
    total_compromised_to_date, total_seen_to_date = 0
    for each timeslice in sort( timeslices )
        total_compromised_to_date += compromised_for_timeslice
        total_seen_to_date += seen_for_timeslice
        statistic[site, timeslice] = 0 or total_compromised_to_date / total_seen_to_date
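The pseudocode above can be sketched in single-machine Python as follows. This is a minimal illustration, not the benchmark's reference implementation. It assumes the records are already parsed into (site, date, compromised) tuples, that a timeslice is a 7-day bucket counted from the start of the year (the notes do not state the slice width), and that "0 or ..." means the statistic is 0 when nothing has been seen yet. The names running_compromise_ratio and week_of_year are hypothetical.

```python
from collections import defaultdict
from datetime import date


def week_of_year(d: date) -> int:
    """Map a date to a 0-based 7-day bucket within its year (assumed slice width)."""
    return (d.timetuple().tm_yday - 1) // 7


def running_compromise_ratio(records, timeslice_of=week_of_year):
    """Return {(site, timeslice): compromised_to_date / seen_to_date}.

    `records` is an iterable of (site, date, compromised) tuples.
    """
    # Group by site, counting per timeslice.
    seen = defaultdict(lambda: defaultdict(int))
    compromised = defaultdict(lambda: defaultdict(int))
    for site, day, flag in records:
        ts = timeslice_of(day)
        seen[site][ts] += 1
        compromised[site][ts] += int(flag)

    statistic = {}
    for site, per_slice in seen.items():
        total_compromised_to_date = 0
        total_seen_to_date = 0
        # Walk the timeslices in order, accumulating running totals.
        for ts in sorted(per_slice):
            total_compromised_to_date += compromised[site][ts]
            total_seen_to_date += per_slice[ts]
            statistic[site, ts] = (
                total_compromised_to_date / total_seen_to_date
                if total_seen_to_date
                else 0.0
            )
    return statistic


if __name__ == "__main__":
    sample = [
        ("a", date(2008, 1, 1), True),
        ("a", date(2008, 1, 2), False),
        ("a", date(2008, 1, 10), False),
        ("a", date(2008, 1, 10), False),
        ("b", date(2008, 2, 1), False),
    ]
    for key, value in sorted(running_compromise_ratio(sample).items()):
        print(key, value)
```

With 10 billion records the per-site grouping would not fit in one process. The same two phases (count per site and timeslice, then a sorted running sum per site) would be distributed, for example as a map step keyed by site followed by a reduce step.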