module Buryspam::MUN_IMAP

This module contains methods that connect to MUN's IMAP server and retrieve messages from it. The messages are either bulk filtered without procmail or filtered with procmail, depending on the invocation and environment.

Constants

LASTFOLDER

Regular expression to match the name of the folder in a .procmail.log file to which the last message was saved. The log line is assumed to have the form "Folder: <file> <size>", e.g.:

Folder: junk/spam                              2524
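
The value of LASTFOLDER is not shown on this page. As a rough sketch only, a pattern along the following lines would pull the folder name out of such a line. LASTFOLDER_SKETCH and the sample log text are assumptions made for illustration, not the module's actual constant or log output.

```ruby
# Hypothetical stand-in for LASTFOLDER; the real constant is not shown here.
# It captures the folder name from the final "Folder: <file> <size>" line.
LASTFOLDER_SKETCH = /^\s*Folder:\s+(\S.*?)\s+\d+\s*\z/

# Made-up tail of a procmail log file.
log_tail = "From alice@example.com  Mon Jan  1 00:00:00 2024\n" \
           " Subject: hello\n" \
           "  Folder: junk/spam                              2524\n"

# String#[] with a regexp and a capture index returns that capture group.
folder = log_tail[LASTFOLDER_SKETCH, 1]
puts folder
```

Anchoring the pattern at the end of the string (\z) makes it pick the last "Folder:" line rather than the first one found in the tail.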

PS_PROCMAIL

This is the line that should be returned by the /bin/ps command in the procmail_running? method if procmail is indeed running.

Public Class Methods

transfer()

Behaviour depends upon the value of Startup.cmd.mode:

  • If startup mode is :transfer, then transfer all messages from the IMAP server to the local system, then feed the messages to procmail for filtering.

  • If startup mode is :poll, then daemonize and periodically transfer messages in the same way as :transfer.

  • If mode is :bulk or the procmail executable is not found, then we'll do a bulk filter of all the messages without using procmail.

# File buryspam.rb, line 840
def transfer
  @imap = nil
  begin
    Lockfile.only_one { |lockfile|
      return unless valid_client?
      return unless procmail_test?

      Status.print("NOTE: Messages will be deleted from the IMAP server\n" +
                   "      after they have been transferred.\n\n")

      @procmail_pid = nil

      @server   = Config.imap_server || ""
      @username = Config.imap_username || ""
      @inbox    = Config.imap_inbox || ""
      raise "Blank imap server."     if @server.empty?
      raise "Blank imap username."   if @username.empty?
      raise "Blank imap inbox."      if @inbox.empty?

      return unless verify_credentials

      if Startup.cmd.mode == :poll
        poll(lockfile)
      else
        fetch # Handles both :transfer and :bulk modes.
      end
    }
  rescue Lockfile::AlreadyLockedError
    Status.error($!.message)
  end
end

Private Class Methods

append_msg(mbox_msg, file)

Append the given mbox_msg to the specified folder file and update the folder counts.

# File buryspam.rb, line 1262
def append_msg(mbox_msg, file)
  Logger.debug("Appending message to '#{file}'...")
  Lockfile.open(file, File::LOCK_EX) { |f|
    Logger.debug("Writing to '#{file}'")
    f.print(mbox_msg)
  }
  update_folders(file)
end

bulk_filter(mbox_msg)

During bulk filtering, we archive the message and store it in the spam file or the spool, depending upon how it was filtered. If the Bayesian database was deemed to be invalid or if no spam file was specified in the configuration, then all messages are stored in the spool.

# File buryspam.rb, line 1275
def bulk_filter(mbox_msg)
  Logger.debug("Bulk filtering message...")
  Mbox.archive(mbox_msg)
  mbox = Mbox.new(mbox_msg)
  if @valid_db
    mbox.filter { |msg, is_spam|
      # Put all spam in spool too if @spam.file is nil.
      folder = (is_spam ? @spam.file : @spool) || @spool
      append_msg(msg, folder)
    }
  else
    # No bayesian database: put all mail unfiltered in spool.
    mbox.each_msg { |msg|
      append_msg(msg, @spool)
    }
  end
end
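
The folder selection in bulk_filter can be tried in isolation. This is a minimal sketch; target_folder and the sample paths are names invented for the illustration and are not part of the module.

```ruby
# Hypothetical helper mirroring the expression used in bulk_filter:
# spam goes to the spam file when one is configured, everything else
# (and all mail when no spam file is set) goes to the spool.
def target_folder(is_spam, spam_file, spool)
  (is_spam ? spam_file : spool) || spool
end

puts target_folder(true,  "Mail/spam", "Mail/inbox")
puts target_folder(true,  nil,         "Mail/inbox")
puts target_folder(false, "Mail/spam", "Mail/inbox")
```

The trailing `|| spool` is what sends spam to the spool when the spam file is nil.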

fetch()

Retrieve messages from the configured IMAP server and filter them appropriately. Provide informational output as messages are transferred and filtered.

# File buryspam.rb, line 1383
def fetch
  imap_open {
    Logger.debug {
      "Opened IMAP connection to '%s' from '%s'" %
        [@server, Startup::HOSTNAME]
    }

    @imap.select(@inbox)
    @num_msgs = get_num_msgs(@inbox)
    if @num_msgs <= 0
      Status.puts("\nNo messages found in remote mailbox '%s'." % @inbox)
      return
    end

    Status.puts("\nTransferring %d message%s.".pluralize(@num_msgs))

    fwd_to_addr = Config.fwd_to || ""
    if fwd_to_addr.strip.empty?
      Logger.info("No forwarding address given; not forwarding.")
      @fwd = nil
    else
      @fwd = Forward.new(fwd_to_addr)
    end
    @processed = 0

    # Filter messages one at a time to allow for more granular error
    # recovery.  If there was a problem with +procmail+ or bulk
    # filtering a given message, then don't delete that message
    # from the IMAP server.
    #
    @folders = Hash.new(0)
    attrs = %w(RFC822 INTERNALDATE ENVELOPE UID)
    @imap.fetch(1..Config.max_msg_transfer, attrs).each { |msg|
      if msg.nil? || msg.attr.nil? || ! msg.attr.has_key?("UID")
        Status.error("Invalid message from IMAP server: #{msg.inspect}")
        next
      end
      uid = msg.attr["UID"]
      Logger.debug("Fetched message with UID '#{uid}'.")
      mbox_msg = mbox_fmt(msg)
      show_msg_details
      begin
        filter(mbox_msg)
        # Forward *after* filtering to prevent the forwarding address
        # from receiving duplicates if there was an exception during
        # filtering.  Don't forward messages that were bounced back
        # to (as determined by the fwd_inhibit configuration parameter
        # and the .procmailrc file).
        @fwd.send(msg) if @fwd
        # If an exception is raised by 'filter' or '@fwd.send', the message
        # will not be deleted.  This is intended behaviour.
        imap_delete(uid)
        @processed += 1
      rescue Exception
        # If there were problems with +procmail+ or with bulk
        # filtering then stop now rather than trying to fetch/process
        # other messages.
        Status.error($!)
        Status.error("Fetching terminated.")
        break
      end
    }
    Status.print("done.\n\n")
    show_summary
  }
end

filter(mbox_msg)

Use procmail if the procmail executable is available (and we are not bulk filtering: see procmail_test?). Otherwise, ignore procmail and just do in-process bulk filtering.

# File buryspam.rb, line 1346
def filter(mbox_msg)
  if @pm_avail
    procmail_filter(mbox_msg)
  else
    bulk_filter(mbox_msg)
  end
end

from_address(env)

Given the envelope of a message retrieved from an IMAP request, return the originating address in the form local-part@domain. The logic is a little twisted to accommodate group syntax and/or nonexistent originating fields.

# File buryspam.rb, line 1029
def from_address(env)
  return 'unknown@localhost' if env.nil? || env.from.nil?
  from = env.from[0]
  domain = from.host
  if domain.nil? || domain.empty?
    # Group syntax may have been used (from.mailbox should contain
    # group name -- ignore it).  Get mailbox/host from next address.
    Logger.warn("Group syntax probably used for 'from' address.")
    Logger.warn("Envelope:\n#{env.inspect}")
    from = env.from[1]
    if from.nil?
      Logger.warn("No from address in IMAP envelope.")
      return 'unknown@localhost'
    end
    domain = from.host
    # If domain is still nil, then something is wrong.
    if domain.nil? || domain.empty?
      Logger.warn("IMAP envelope from HOST is blank.")
      domain = 'localhost'
    end
  end
  local = from.mailbox
  if local.nil? || local.empty?
    Logger.warn("IMAP envelope MAILBOX is blank.  Premature end of group?")
    Logger.warn("Envelope:\n#{env.inspect}")
    return 'unknown@' + domain
  end
  # Quote the local-part of the email address if it contains "unusual"
  # characters.  See: http://en.wikipedia.org/wiki/E-mail_address
  if local.count("^a-zA-Z0-9!#%$&'*+/=?^_`{|}~.-") > 0 ||
     local.match(/^\.|\.$|\.\./)
    local = '"' + local + '"'
  end
  local + '@' + domain
end
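
The quoting rule at the end of from_address can be exercised on its own. In this sketch, quote_local_part is a hypothetical helper that reuses the same character set and dot checks as the listing above, and the sample local-parts are made up.

```ruby
# Hypothetical helper isolating the quoting rule from from_address.
def quote_local_part(local)
  # Count characters outside the set allowed in an unquoted local-part.
  unusual  = local.count("^a-zA-Z0-9!#%$&'*+/=?^_`{|}~.-") > 0
  # A leading dot, a trailing dot, or two consecutive dots also force quoting.
  bad_dots = !local.match(/^\.|\.$|\.\./).nil?
  (unusual || bad_dots) ? '"' + local + '"' : local
end

puts quote_local_part("jane.doe")   # left unquoted
puts quote_local_part("jane doe")   # quoted: contains a space
puts quote_local_part("jane..doe")  # quoted: consecutive dots
```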

get_logfile()

Determine procmail's logfile. Return nil if it can't be determined.

# File buryspam.rb, line 1364
def get_logfile
  begin
    return Procmail.logfile
  rescue
    Status.error($!)
  end
  return nil
end

get_num_msgs(mailbox)

Determine how many messages are in the given mailbox on the IMAP server.

# File buryspam.rb, line 1228
def get_num_msgs(mailbox)
  begin
    status = @imap.status(mailbox, ["MESSAGES"])
    return status["MESSAGES"]
  rescue
    Status.error($!)
    return 0
  end
end

imap_delete(msg_uid)

Delete the message with the specified UID from the IMAP server.

# File buryspam.rb, line 1374
def imap_delete(msg_uid)
  @imap.uid_store(msg_uid, "+FLAGS", [:Deleted])
  @imap.expunge
  Logger.info("Message with UID '#{msg_uid}' deleted from imap server.")
end

imap_open() { || ... }

Log on to the IMAP server and yield to the supplied block. Ensure that we log out.

# File buryspam.rb, line 1097
def imap_open
  begin
    login
    yield
  ensure
    logout
  end
end
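
The begin/ensure shape used by imap_open guarantees the cleanup step even when the block raises. The following stand-alone sketch shows that ordering; with_session and the log array are stand-ins invented for the example and do not talk to an IMAP server.

```ruby
# Hypothetical stand-in for imap_open: record the order of events
# instead of performing a real login/logout.
def with_session(log)
  log << :login
  yield
ensure
  # Runs whether the block returns normally or raises.
  log << :logout
end

log = []
begin
  with_session(log) { raise "boom" }
rescue RuntimeError
  log << :rescued
end
p log  # [:login, :logout, :rescued]
```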

lastfolder()

Look at the end of the procmail log file to determine to which folder the last message was stored. This is used only if we are filtering using procmail. If there is a better way to do this, please let me know.

# File buryspam.rb, line 1207
def lastfolder
  begin
    folder = nil
    end_contents = nil  # Declared here so the error message below can see it.
    Lockfile.open(@logfile) { |f|
      f.seek(-1024, IO::SEEK_END)
      end_contents = f.read
      folder = end_contents[LASTFOLDER, 1]
    }
    if folder.nil?
      raise "Cannot read last folder from '#{@logfile}'.\n" +
            "(contents: '#{end_contents}')"
    end
    return folder
  rescue
    Status.error($!)
    return "<unknown folder>"
  end
end
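
Note that seeking 1024 bytes back from the end raises an error when the log file is shorter than that; in lastfolder this is caught by the rescue clause. As a sketch of one possible alternative, the offset can be clamped to the start of the file. read_tail is a hypothetical helper and uses plain File.open rather than the module's Lockfile.open.

```ruby
require 'tempfile'

# Hypothetical helper: return up to the last nbytes of a file,
# or the whole file if it is shorter than nbytes.
def read_tail(path, nbytes = 1024)
  File.open(path, "rb") do |f|
    offset = [f.size - nbytes, 0].max
    f.seek(offset, IO::SEEK_SET)
    f.read
  end
end

# Try it on a short temporary file standing in for a procmail log.
Tempfile.create("procmail-log") do |tmp|
  tmp.write("  Folder: inbox  10\n")
  tmp.flush
  puts read_tail(tmp.path)
end
```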

login()

Log in to the IMAP server. Any exceptions related to connecting to the server are the responsibility of the caller. In particular, if this method takes too long to execute, a Timeout::Error exception is raised.

# File buryspam.rb, line 1110
def login
  begin
    Timeout.timeout(Config.imap_timeout) {
      Logger.debug("IMAP login...")
      unless @imap.nil?
        Logger.debug("already logged in.")
        return
      end

      if RUBY_VERSION.match(/^1\.8\./)
        # Hack to temporarily disable -w (verbose) mode to suppress warning:
        #   /usr/lib/ruby/1.8/net/imap.rb:901:
        #      warning: using default DH parameters.
        # http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/210670
        v, $VERBOSE = $VERBOSE, nil
        @imap = Net::IMAP.new(@server, Config.imap_port, Config.imap_use_ssl)
        $VERBOSE = v
      else
        opts = { :port => Config.imap_port }
        if Config.imap_use_ssl
          # IMAP connection requires certs under ruby 1.9.
          opts[:ssl] = { :ca_file => '/etc/ssl/ca-bundle.crt',
                         :ca_path => '/etc/ssl/certs',
                         :verify_mode => OpenSSL::SSL::VERIFY_PEER }
        end
        @imap = Net::IMAP.new(@server, opts)
      end

      @imap.login(@username, @t)
      Logger.debug("login successful.")
    }
  rescue Timeout::Error
    logout  # Try to clean up.
    raise Timeout::Error, "Timed out during IMAP login"
  end
end
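
The timeout handling in login follows a common pattern: wrap the slow operation in Timeout.timeout and re-raise with a clearer message. Here is a minimal sketch of that pattern, with a sleep standing in for the network call; with_deadline is a name invented for the example.

```ruby
require 'timeout'

# Hypothetical wrapper: run the block under a deadline and re-raise
# Timeout::Error with a more descriptive message.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  raise Timeout::Error, "Timed out after #{seconds}s"
end

begin
  with_deadline(0.05) { sleep 1 }  # stands in for a slow server
rescue Timeout::Error => e
  puts e.message
end
```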

logout()

Disconnect from the IMAP server.

# File buryspam.rb, line 1148
def logout
  Logger.debug("IMAP logout...")
  if @imap.nil?
    Logger.debug("already logged out.")
    return
  end
  begin
    @imap.disconnect unless @imap.disconnected?
  rescue Errno::ECONNRESET
    Status.warn($!.message + " during disconnect (warning ignored)")
  ensure
    # Set @imap to nil even if the disconnect was unsuccessful.
    # This will prevent login from re-using a potentially bogus IMAP
    # object next time we try to access the IMAP server.
    @imap = nil
  end
  Logger.debug("logout complete.")
end

mbox_fmt(msg)

Convert a message retrieved from an IMAP server to mbox format by prepending a "From " postmark line, escaping "From " lines in the rest of the body, translating CRLF to LF, and ensuring that there is a blank line at the end. (www.qmail.org/man/man5/mbox.html)

# File buryspam.rb, line 1069
def mbox_fmt(msg)
  envelope = msg.attr["ENVELOPE"]
  from = from_address(envelope)
  time = nil
  begin
    time_attr = msg.attr["INTERNALDATE"]
    time = Time.parse(time_attr)
  rescue
    Logger.warn("Invalid INTERNALDATE attr: #{time_attr.inspect}")
    time = Time.now
  end
  rfc822 = msg.attr["RFC822"]
  if rfc822.nil? || rfc822.empty?
    Logger.warn("IMAP RFC822 attribute not set.")
    # prepend newline to separate from "From " line with blank line.
    rfc822 = "\n(empty message?)"
  end
  @from_line =  "From %s  %s\n" % [from, time.asctime]
  @subject = envelope.subject || "(no subject)"
  mbox_msg = @from_line +
             rfc822.
                 gsub(/\r\n/, "\n").
                 gsub(/^From /, ">From ")
  mbox_msg << "\n" * (2 - mbox_msg[-2..-1].count("\n"))
end
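
The text transformations performed by mbox_fmt can be seen on a small message. This sketch leaves out the IMAP envelope handling; to_mbox and its arguments are hypothetical, and the sample message is made up.

```ruby
# Hypothetical helper showing only the text transformations of mbox_fmt.
def to_mbox(from, time, rfc822)
  body = rfc822.gsub(/\r\n/, "\n").        # CRLF -> LF
                gsub(/^From /, ">From ")   # escape "From " at line starts
  msg = ("From %s  %s\n" % [from, time.asctime]) + body
  # Pad with newlines so the message ends with exactly one blank line.
  msg << "\n" * (2 - msg[-2..-1].count("\n"))
end

raw = "Subject: hi\r\n\r\nFrom here on\r\nbye"
print to_mbox("a@example.com", Time.at(0).utc, raw)
```

Only the message text is escaped, so the postmark line itself keeps its leading "From ".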

poll(lockfile)

Daemonize this process so that we can periodically check for messages on the IMAP server in the background.

# File buryspam.rb, line 876
def poll(lockfile)
  Status.print("\nWill check for new messages every %s.\n\n" %
              Config.poll_timer)
  logout  # Using the previous IMAP login from 'transfer' seems
          # to confuse the IMAP library after the daemonization.

  Logger.debug("Daemonizing...")

  # Close the logger while daemonizing.  It'll be re-opened
  # automatically by the next debug/info/warn... write to it.
  Logger.close

  Process.daemon

  # Write the hostname/pid to the log file.
  Lockfile.write_pid_host(lockfile)

  loop do
    begin
      fetch if FileUtils.free_space?
    rescue Exception
      # Catch/log all exceptions and keep going.
      Logger.error($!)
      Logger.error("Cannot fetch messages via IMAP.")
    end
    # Close the log file while we sleep.  Otherwise, the log
    # file could get rotated underneath us.
    Logger.close

    # Do garbage collection before sleeping.
    GC.start
    Config.poll_timer.wait
  end
end

procmail_filter(mbox_msg)

Run procmail to sort the mail into the folders defined in the .procmailrc file.

# File buryspam.rb, line 1312
def procmail_filter(mbox_msg)
  if procmail_running?
    raise "procmail pid #{@procmail_pid} still running?"
  end

  @procmail_pid = nil
  Logger.info("Invoking procmail...")

  # Because running procmail may cause another buryspam process to
  # run, close the log file now so the two buryspam processes won't
  # conflict.
  Logger.close

  begin
    procmail_pipe = IO.popen(Config.procmail, "w")
    @procmail_pid = procmail_pipe.pid
    procmail_pipe.print(mbox_msg)
    procmail_pipe.flush  # IO#sync only reports the sync mode; flush writes the buffer.
  ensure
    procmail_pipe.close
    Logger.debug "'#{Config.procmail}' returned #$?"
    # If +procmail+ returns a non-zero exit code it may be best not
    # to continue.
    raise "Problem with procmail." unless $?.success?
  end

  Logger.info("procmail finished.")
  update_folders(lastfolder)
end

procmail_running?()

Return true if a procmail process we started is still running, false otherwise.

# File buryspam.rb, line 1295
def procmail_running?
  return false if @procmail_pid.nil?
  # Would like to be clever and use:
  #   Process.kill(0, @procmail_pid)
  # (http://www.ruby-forum.com/topic/99567)
  # But this creates problems if the user (or someone else) starts a
  # different process that just happens to be assigned the same process id
  # as the (terminated) procmail process, or if the procmail process
  # somehow changes ownership.
  ps_procmail = `/bin/ps ho ruid,comm -p #{@procmail_pid}`.chomp
  Logger.debug("Previous procmail pid = #{@procmail_pid}")
  Logger.debug("Testing '#{ps_procmail}' == '#{PS_PROCMAIL}'")
  ps_procmail == PS_PROCMAIL
end
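
For comparison, the signal-0 probe that the comment above mentions and rejects looks roughly like this. It only answers whether some process with that pid exists, which is the weakness the author describes. pid_alive? is a hypothetical helper, and the sketch assumes a Unix-like system.

```ruby
# Hypothetical helper using the signal-0 probe (the approach the
# module deliberately avoids). It cannot tell *which* program owns the pid.
def pid_alive?(pid)
  Process.kill(0, pid)  # signal 0 performs the existence check only
  true
rescue Errno::ESRCH
  false                 # no such process
rescue Errno::EPERM
  true                  # process exists but belongs to another user
end

puts pid_alive?(Process.pid)
```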

procmail_test?()

Determines whether or not to use procmail after retrieving the messages from the IMAP server. Returns:

  • true if the procmail executable exists and we are not doing bulk filtering.

  • true if procmail does not exist or we are in bulk mode, the spool file is okay for bulk processing, and the user wishes to continue.

  • false if procmail will not be used but the spam/spool file is invalid or the user does not wish to continue.

# File buryspam.rb, line 938
def procmail_test?
  @pm_avail = File.executable?(Config.procmail)

  # Return true if we will be using +procmail+.
  if @pm_avail && Startup.cmd.mode != :bulk
    Logger.debug("Found 'procmail' executable at '#{Config.procmail}'")
    Status.print("Messages will be filtered by '%s'.\n\n" % Config.procmail)
    @logfile = get_logfile  # Procmail's logfile used by lastfolder.
    return true     # Use procmail during transfer.
  end

  # At this point, procmail will NOT be invoked to filter messages.
  # Force @pm_avail to false.
  @pm_avail = false

  # If we are not in bulk mode, the user was likely implicitly
  # expecting to use +procmail+.  Explain why we aren't using it.
  if Startup.cmd.mode != :bulk
    errmsg = Config.procmail.nil? || Config.procmail.empty? ?
      "'procmail' configuration parameter not set.\n\n" :
      "Cannot find '#{Config.procmail}' executable.\n\n"
    Status.warn(errmsg)
  end

  # If there is no spool file specified in the configuration file, then
  # there's no point in continuing.  If there are problems later on writing
  # to the specified spool file, an exception will be raised by append_msg,
  # and transferring should stop.
  @spool = Config.mail_spool
  if @spool.nil? || @spool.empty?
    Status.puts("'mail_spool' configuration parameter not set.")
    return false
  end

  cfg_spam_file_blank = Config.spam_file.nil? || Config.spam_file.empty?

  # We won't be able to filter any messages if the Bayesian database
  # isn't present/valid.
  @valid_db = valid_bayesian_db?

  # Get a copy of the spam singleton object -- we may need to change
  # its filename because procmail (and therefore .procmailrc) are not
  # going to be used.  We'll use the 'spam_file' configuration parameter,
  # if it's specified, instead.
  @spam = Spam.instance

  # Note that we don't use the spam file from the ~/.procmailrc file.
  # Instead use the spam file specified in the configuration file.
  # If the spam_file configuration parameter is not specified or
  # if there is no Bayesian database to use for filtering, then
  # we'll append all messages to the spool file.
  if cfg_spam_file_blank || ! @valid_db
    @spam.file = nil
    if cfg_spam_file_blank
      Status.puts("'spam_file' configuration parameter not set.")
    end
    Status.puts("All messages will be appended to '%s'." % @spool)
  else
    # Inform the @spam singleton of the new spam file.  It will need to
    # know the name of the spam file to rotate if the file gets too large
    # while filtering.
    @spam.file = Config.spam_file
    Status.puts("All spam will be appended to '%s'." % @spam.file)
    Status.puts("All non-spam will be appended to '%s'." % @spool)
  end

  Status.print("Continue? (y/n) ")
  return false unless gets.yes?     # Abandon transfer.
  Status.puts
  true
end

show_msg_details()

Display the current message's 'From ' and subject lines and a percentage of how many messages have been or are currently being processed.

# File buryspam.rb, line 1249
def show_msg_details
  # @processed is incremented after we have finished processing
  # the message.  Add one to get the message number we are currently
  # processing.
  processing = @processed + 1
  pcnt = "%5.1f%%" % (100 * processing / @num_msgs.to_f)
  Status.puts("#{processing}/#{@num_msgs} (#{pcnt})")
  Status.print(@from_line)
  Status.puts(" Subject: %s" % @subject)
end
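
The progress line is plain format-string arithmetic. A small sketch with made-up counts follows; progress_line is a hypothetical helper that takes the counters as arguments instead of reading instance variables.

```ruby
# Hypothetical helper reproducing the progress computation.
def progress_line(processed, total)
  current = processed + 1  # the message being handled right now
  pcnt = "%5.1f%%" % (100 * current / total.to_f)
  "#{current}/#{total} (#{pcnt})"
end

puts progress_line(2, 8)  # 3/8 ( 37.5%)
```

The to_f conversion keeps the division from truncating to an integer.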

show_summary()

Display a summary of the number of transferred/filtered messages and the number of messages deposited into each folder.

# File buryspam.rb, line 1240
def show_summary
  Status.puts("Transferred %d message%s:".pluralize(@processed))
  @folders.each { |folder, count|
    Status.puts("%25s: %3d" % [folder, count])
  }
end
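
String#pluralize is not a core Ruby method, and its definition is not shown on this page. As an assumption about its behaviour, a helper like the following would fill both the count and the plural suffix. pluralize_fmt is invented for this sketch.

```ruby
# Hypothetical stand-in for the module's String#pluralize extension:
# fills %d with the count and %s with "" or "s".
def pluralize_fmt(fmt, count)
  fmt % [count, count == 1 ? "" : "s"]
end

puts pluralize_fmt("Transferred %d message%s:", 1)
puts pluralize_fmt("Transferred %d message%s:", 3)

# Per-folder lines are right-aligned in a 25-character column.
puts "%25s: %3d" % ["junk/spam", 3]
```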

update_folders(folder)

Display the folder to which the most recently fetched message was deposited and update the folders count.

# File buryspam.rb, line 1356
def update_folders(folder)
  @fwd.prevent = Config.fwd_inhibit.match(folder) if @fwd
  relfolder = folder.gsub(%r{#{ENV['HOME']}/}, "")
  Status.puts("  Folder: %s" % relfolder)
  @folders[relfolder] += 1
end
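
The per-folder tally relies on Hash.new(0), so a folder seen for the first time starts at zero. In this sketch, relative_folder is a hypothetical helper; unlike the listing above, it escapes the home directory and anchors the match at the start of the path, which is a design choice of the sketch and not the module's behaviour.

```ruby
# Hypothetical helper: strip a leading "<home>/" from a folder path.
def relative_folder(path, home)
  path.sub(/\A#{Regexp.escape(home)}\//, "")
end

counts = Hash.new(0)  # missing keys default to 0
%w(/home/u/Mail/inbox /home/u/Mail/junk /home/u/Mail/inbox).each do |path|
  counts[relative_folder(path, "/home/u")] += 1
end

counts.each { |folder, n| puts "%25s: %3d" % [folder, n] }
```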

valid_bayesian_db?()

Attempt to load the Bayesian database for filtering. If there's a problem, report the error and warn that spam cannot be filtered. Returns true if the Bayesian database is okay, false otherwise.

# File buryspam.rb, line 1014
def valid_bayesian_db?
  begin
    Bayesian.db
    return true
  rescue Bayesian::DatabaseError
    Status.error($!.message)
    Status.warn("WARNING: buryspam cannot filter spam.")
  end
  return false
end

valid_client?()

Don't allow transfers to be performed on the IMAP server itself (this may cause competition with the Inbox). Also, ensure that we are connecting to the IMAP server from the configured client machine.

# File buryspam.rb, line 914
def valid_client?
  host_ip = Socket.gethostbyname(Startup::HOSTNAME).last
  server_ip = Socket.gethostbyname(Config.imap_server).last
  if host_ip == server_ip
    Status.warn("Performing transfers on IMAP server forbidden.")
    return false
  end
  client = Config.imap_client.strip
  if ! client.empty? && Startup::HOSTNAME != client
    Status.print("Please transfer from '#{client}'.\n\n")
    return false
  end
  true
end

verify_credentials()

Ensure that the IMAP username and password are valid by logging onto the IMAP server. Returns true if login successful, false otherwise.

# File buryspam.rb, line 1170
def verify_credentials
  Status.print("Connecting to %s\n\n" % @server)
  begin
    Status.puts("login: %s" % @username)
    Status.print("Password: ")
    system("stty -echo")
    @t = gets.chomp
    system("stty echo")
    Status.puts
    login
  rescue Net::IMAP::NoResponseError => err
    logout # Just to be safe.
    Status.puts("\n#{err}")
    Status.print("Please try again...\n\n")
    retry
  rescue Timeout::Error
    Status.puts("\nTimed out trying to connect.")
    Status.puts("IMAP server may be experiencing difficulties.")
    Status.puts("Please try again later.")
    return false
  rescue Interrupt
    Status.puts("\n\nInterrupted.")
    return false
  rescue Exception
    Status.error("#$!\nTerminating.")
    return false
  ensure
    # Make sure we restore echo.
    system("stty echo")
  end
  Status.puts("Login successful.")
  true
end