Module that calculates and displays statistics on the filter's operation since it was last initialized: the number of spam messages received, the spam frequency and period, and the accuracy rates.
Display the statistics regarding the performance of the filter.
    # File buryspam.rb, line 4009
    def show
      get_stats
      if @total_spams.zero?
        puts "No spams received since reinitialization."
        return
      end

      unless @unprocessed_from.empty?
        print "Messages not processed by buryspam:\n "
        print @unprocessed_from.join(" ")
        puts ""
      end
      secs = (Time.now - @inittime)
      days = secs / SECS_PER_DAY
      spam_freq = "%.1f spam%s/day".pluralize(@total_spams / days)
      spam_period = time_units(secs / @total_spams).to_s + "/spam"
      misses_per_day, accuracy = rate_accuracy(@misses, days)
      false_neg = "%-6s" % @misses
      false_neg_day = "%-6s" % misses_per_day
      accuracy = "%-6s" % accuracy
      sas = ": "
      if @sa_installed
        sas = " (SA):"
        sa_misses_per_day, sa_accuracy = rate_accuracy(@sa_misses, days)
        false_neg << " (%s)" % @sa_misses
        false_neg_day << " (%s)" % sa_misses_per_day
        accuracy << " (%s)" % sa_accuracy
      end
      print "Last reinitialized: #{@inittime.strftime("%F %T")} (#{time_units(secs)} ago)
    Unprocessed messages: #{@unprocessed}
    Total spams: #{@total_spams}
    Spam frequency: #{spam_freq}
    Spam period: #{spam_period}
    False neg.#{sas} #{false_neg}
    False neg./day#{sas} #{false_neg_day}
    Accuracy#{sas} #{accuracy}
    "
    end
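The frequency and period figures printed by show are reciprocal views of the same ratio: spams per day, and elapsed time per spam. The following is a minimal sketch of that arithmetic only. The method name spam_rates is hypothetical, and the seconds-per-day value is assumed to match SECS_PER_DAY.

```ruby
# Hypothetical helper illustrating the arithmetic used by show.
# total_spams: number of spams counted; secs: seconds since initialization.
def spam_rates(total_spams, secs)
  secs_per_day = 86_400.0              # assumed value of SECS_PER_DAY
  days = secs / secs_per_day
  frequency = total_spams / days       # spams per day
  period = secs / total_spams.to_f     # seconds per spam
  [frequency, period]
end

freq, period = spam_rates(120, 172_800)
puts "%.1f spams/day" % freq
puts "%.0f seconds/spam" % period
```

In the real method the period is passed through time_units, so it is shown in minutes, hours or days rather than raw seconds.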
Scan the given mbox file and count the total number of spam messages, missed spam messages and unprocessed messages in the mbox. Used exclusively by the ::get_stats method.
    # File buryspam.rb, line 4093
    def file_stats(file)
      return unless Mbox.is_valid?(file) && File.mtime(file) > @inittime
      mbox = IO.binread(file)
      mbox.scan(FROM_HDR) { |from, hdr|
        msgtime = Message.extract_time(from)
        next if msgtime < @inittime
        if BURYSPAM_HDR.match(hdr)
          @total_spams += 1
          @misses += 1 if $1 == "No"
          if SA_SPAM_HDR.match(hdr)
            @sa_installed = true
          else
            @sa_misses += 1
          end
        else
          @unprocessed += 1
          @unprocessed_from << from
        end
      }
    end
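The counting logic can be tried on an in-memory mbox string. This is only a sketch: the three patterns below are hypothetical stand-ins (the real FROM_HDR, BURYSPAM_HDR and SA_SPAM_HDR are defined elsewhere in buryspam.rb and may differ), the X-Buryspam header name is invented for illustration, and the timestamp filter is omitted.

```ruby
# Hypothetical patterns, NOT the ones used by buryspam.rb.
FROM_HDR_SKETCH     = /^(From .*\n)((?:.+\n)*)/   # "From " line plus header lines
BURYSPAM_HDR_SKETCH = /^X-Buryspam: (Yes|No)/     # invented header name
SA_SPAM_HDR_SKETCH  = /^X-Spam-Flag: YES/

# Count spams, misses and unprocessed messages in an mbox string.
def mbox_stats_sketch(mbox)
  stats = { total_spams: 0, misses: 0, sa_misses: 0,
            sa_installed: false, unprocessed: 0, unprocessed_from: [] }
  mbox.scan(FROM_HDR_SKETCH) do |from, hdr|
    if (m = BURYSPAM_HDR_SKETCH.match(hdr))
      stats[:total_spams] += 1
      stats[:misses] += 1 if m[1] == "No"   # buryspam failed to flag it
      if SA_SPAM_HDR_SKETCH.match(hdr)
        stats[:sa_installed] = true
      else
        stats[:sa_misses] += 1
      end
    else
      stats[:unprocessed] += 1
      stats[:unprocessed_from] << from.chomp
    end
  end
  stats
end

sample = <<~MBOX
  From a@example.com Mon Jan 15 00:00:00 2024
  X-Buryspam: Yes
  X-Spam-Flag: YES
  Subject: one

  body

  From b@example.com Mon Jan 15 00:01:00 2024
  X-Buryspam: No
  Subject: two

  body

  From c@example.com Mon Jan 15 00:02:00 2024
  Subject: three

  body
MBOX

p mbox_stats_sketch(sample)
```

As in the original, a message counts as a SpamAssassin miss whenever it carries the buryspam header but no SpamAssassin spam header.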
Make a count of all the spam messages and false negatives that have occurred since the last initialization. Give an estimated comparison with the previous version of buryspam.
    # File buryspam.rb, line 4071
    def get_stats
      # It's a bit expensive to load in the entire bayesian database
      # just to get access to the initialization timestamp...
      db = Hashbase.load(Config.word_file)
      raise "Cannot load '#{Config.word_file}'" if db.nil?
      raise "No timestamp in '#{Config.word_file}'" if db[:timestamp].nil?
      @inittime = db[:timestamp]
      @total_spams = @misses = @sa_misses = @unprocessed = 0
      @sa_installed = false
      @unprocessed_from = []
      Config.bad_dirs.each { |dir|
        Dir[File.join(dir, '*')].each { |f| file_stats(f) }
      }
      file_stats(Config.missed_spam_file)
    end
Given the number of missed spam messages and the number of days since last (re)initialization of the filter, return the average number of misses per day and the accuracy of the filter with respect to the total number of spams received.
    # File buryspam.rb, line 4062
    def rate_accuracy(misses, days)
      misses_per_day = "%.1f" % (misses / days)
      accuracy = "%.2f%%" % (100 - misses / @total_spams.to_f * 100)
      return misses_per_day, accuracy
    end
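As a worked example of the formula, here is a standalone variant. It is a sketch under one assumption: the spam total is passed as a third argument (a hypothetical signature) instead of being read from @total_spams.

```ruby
# Standalone variant of rate_accuracy for illustration only.
# total_spams is an explicit parameter here; the original uses @total_spams.
def rate_accuracy_sketch(misses, days, total_spams)
  misses_per_day = "%.1f" % (misses / days.to_f)
  accuracy = "%.2f%%" % (100 - misses / total_spams.to_f * 100)
  [misses_per_day, accuracy]
end

per_day, accuracy = rate_accuracy_sketch(5, 10, 200)
puts per_day    # 5 misses over 10 days
puts accuracy   # 5 misses out of 200 spams
```

Both values are returned as formatted strings, which is why show can pad them with "%-6s" directly.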
Convert the given number of seconds to a string in the most appropriate time unit (seconds, minutes, hours or days).
    # File buryspam.rb, line 4116
    def time_units(secs)
      days = "%.1f" % (secs / SECS_PER_DAY)
      hours = "%.1f" % (secs / SECS_PER_HOUR)
      mins = "%.1f" % (secs / SECS_PER_MINUTE)
      secs = "%.1f" % secs
      case secs.to_i
      when 0...SECS_PER_MINUTE
        "%g second%s".pluralize(secs.to_f)
      when SECS_PER_MINUTE...SECS_PER_HOUR
        "%g minute%s".pluralize(mins.to_f)
      when SECS_PER_HOUR...SECS_PER_DAY
        "%g hour%s".pluralize(hours.to_f)
      else
        "%g day%s".pluralize(days.to_f)
      end
    end
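The unit selection can be exercised on its own with the sketch below. Two things are assumptions: the constant values (the conventional 60, 3600 and 86400), and pluralize_fmt, a hypothetical stand-in for buryspam's String#pluralize extension, whose exact behavior is not shown here.

```ruby
# Assumed values for the constants used by time_units.
SECS_PER_MINUTE = 60
SECS_PER_HOUR   = 60 * SECS_PER_MINUTE
SECS_PER_DAY    = 24 * SECS_PER_HOUR

# Hypothetical stand-in for String#pluralize: fills in the number and
# appends "s" unless the number is exactly 1.
def pluralize_fmt(fmt, n)
  fmt % [n, n == 1 ? "" : "s"]
end

# Pick the largest unit that fits and round to one decimal place.
def time_units_sketch(secs)
  case secs.to_i
  when 0...SECS_PER_MINUTE
    pluralize_fmt("%g second%s", secs.to_f.round(1))
  when SECS_PER_MINUTE...SECS_PER_HOUR
    pluralize_fmt("%g minute%s", (secs / SECS_PER_MINUTE.to_f).round(1))
  when SECS_PER_HOUR...SECS_PER_DAY
    pluralize_fmt("%g hour%s", (secs / SECS_PER_HOUR.to_f).round(1))
  else
    pluralize_fmt("%g day%s", (secs / SECS_PER_DAY.to_f).round(1))
  end
end

[30, 90, 3600, 172_800].each { |s| puts time_units_sketch(s) }
```

The "%g" format drops a trailing ".0", so whole values print without a decimal point.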