Opt-In List Manager: Extract And Clean


This tool extracts e-mail addresses from source files and keeps only the syntactically correct ones.

What are syntactically correct e-mail addresses?

  1. A valid address contains only letters (a-z), numbers (0-9), hyphens ('-'), underscores ('_'), periods ('.') and only one 'at' sign ('@').
  2. A valid address must begin with a letter or a number.
  3. A valid address must not exceed the specified maximum address length. If none is specified, the maximum length is 45.
  4. There must be at least one '.' character in the address.
  5. There must be at least one character before the '.' and at least one character after it.
  6. A valid address must end with a letter (a-z).
  7. A valid address must have at least 2 characters before the '@' sign. For example, an address such as 1@mail.ru is rejected.
  8. For AOL addresses, OILM allows spaces to the left of the '@' sign. A valid AOL screen name (the part to the left of the '@' sign) must also be between 3 and 16 characters long and must begin with a letter. This conforms to the exact syntax for valid AOL addresses.
  9. A valid address must not contain '-' after the '@' sign.
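
The rules above can be sketched in code. The following Python function is only an illustration, not OILM's actual implementation: the function name is hypothetical, letters are assumed to be accepted in either case, rule 5 is read literally (a '.' can never be the first or last character once rules 2 and 6 hold), and the AOL exception in rule 8 is left out.

```python
import string

DEFAULT_MAX_LENGTH = 45  # rule 3: default used when no maximum is specified

_LETTERS = set(string.ascii_letters)
_ALNUM = _LETTERS | set(string.digits)
_ALLOWED = _ALNUM | set("-_.@")


def is_syntactically_correct(address, max_length=DEFAULT_MAX_LENGTH):
    """Hypothetical check of rules 1-7 and 9 (rule 8, AOL, is not covered)."""
    if not address or len(address) > max_length:    # rule 3
        return False
    if any(ch not in _ALLOWED for ch in address):   # rule 1: allowed characters
        return False
    if address.count("@") != 1:                     # rule 1: exactly one '@'
        return False
    if address[0] not in _ALNUM:                    # rule 2
        return False
    if address[-1] not in _LETTERS:                 # rule 6
        return False
    if "." not in address:                          # rule 4
        return False
    # Rule 5 needs no separate test here: rules 2 and 6 already keep
    # a '.' away from the first and last positions.
    local, domain = address.split("@")
    if len(local) < 2:                              # rule 7
        return False
    if "-" in domain:                               # rule 9
        return False
    return True
```

With this sketch, joe@aol.com passes, while 1@mail.ru (rule 7) and joe@my-host.com (rule 9) are rejected.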

There are two modes:

  • Extract And Clean All Addresses From These Files And Folders. Input files can be in any format: ASCII text, binary, or anything else. The output file will contain syntactically correct e-mail addresses, one per line. Output e-mail addresses cannot be longer than the Maximum Address Length, which is defined in the Options dialog box.
  • Clean Mail Lists. Extracts the lines that contain syntactically correct e-mail addresses and writes them to the output file. Unlike Extract And Clean All Addresses, this mode can process multicolumn mail lists. In this mode you can also set Clean Options to apply some formatting to the resulting list:
    1. Replace delimiters by TAB/COMMA: useful when columns are delimited by tabs in some input lists and by commas in others, but you want uniform output lists delimited by tabs only or by commas only.
    2. Remove quotes, Remove leading and trailing spaces from fields: remove unnecessary characters (quotes and spaces) from fields. Remember that other List Manager functions can be sensitive to quotes or spaces. For example, the two e-mails "joe@aol.com" (with quotes) and joe@aol.com (without quotes) are treated as different in comparison and sorting. For these functions to work correctly, the e-mail fields in your lists should not contain quotes or leading/trailing spaces.
    3. Move emails to the beginning, Remove empty fields, Reorder/Remove fields: let you rearrange the columns in the lists and remove unnecessary ones.
    4. Convert dates to system format (on the Date/Time Format tab): converts dates and times in different formats to a single format (the one specified in the system Regional Options). Enter the numbers of the date/time fields, separated by commas, in the edit box below the option. Some examples of date/time values that can be converted:
          2005-7-27 16:55
          7.27.2005
          20050120
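
The date conversion described in item 4 can be pictured with a short Python sketch. It is an assumption-laden illustration rather than OILM's own logic: the list of recognized formats covers only the three examples above, field numbers are assumed to start at 1, and a fixed target format stands in for the system Regional Options setting. All names are hypothetical.

```python
from datetime import datetime

# Assumed input formats, covering only the three example values above.
KNOWN_FORMATS = ["%Y-%m-%d %H:%M", "%m.%d.%Y", "%Y%m%d"]
# Stand-in for the format taken from the system Regional Options.
TARGET_FORMAT = "%Y-%m-%d %H:%M"


def convert_date(value, target=TARGET_FORMAT):
    """Return value rewritten in the target format, or unchanged if unrecognized."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime(target)
        except ValueError:
            continue
    return value


def convert_date_fields(fields, field_numbers):
    """Convert only the listed fields; field_numbers is a string such as "2,3"."""
    wanted = {int(n) for n in field_numbers.split(",")}
    return [convert_date(v) if i in wanted else v
            for i, v in enumerate(fields, start=1)]
```

For example, convert_date_fields(["joe@aol.com", "7.27.2005"], "2") leaves the address alone and rewrites the second field as 2005-07-27 00:00.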

Extract And Clean All Addresses From These Files And Folders

The input files and folders from which e-mail addresses will be extracted.
You can drag and drop files and folders from Windows Explorer into this list, or click the Add Files... or Add Dir... button to open the corresponding dialog box.

Output File Containing Clean Email Addresses

Specifies the output file in which the clean e-mail addresses will be saved.

Sort (Output File)

When this option is turned on, the output file is sorted.

Remove Duplicates (Output File)

With this option you can remove duplicate e-mail addresses from the output file. This option is available only if sorting of the output file is turned on.

Sort By Domain (Output File)

This option sorts the output file by domain. The Sort option must also be turned on.
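
As a rough illustration of how the three output options fit together, the Python sketch below sorts a list by domain and can drop exact duplicates first. The function name is hypothetical, and the tie-breaking order (domain, then user name, compared without regard to case) is an assumption; the exact ordering OILM produces is not documented here.

```python
def sort_by_domain(addresses, remove_duplicates=False):
    """Sort addresses by domain, then by user name (hypothetical helper)."""
    if remove_duplicates:
        addresses = set(addresses)  # drops exact duplicates only

    def key(address):
        local, _, domain = address.rpartition("@")
        # The full address is the last key element so ties sort predictably.
        return (domain.lower(), local.lower(), address)

    return sorted(addresses, key=key)
```

Given zed@aol.com, amy@mail.ru and bob@aol.com, the result is bob@aol.com, zed@aol.com, amy@mail.ru.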

Reject Any Addresses Longer Than

You can tell OILM to reject all addresses longer than a specified number of characters by checking this option and selecting the maximum allowed address length.

Allow Embedded Spaces In AOL Usernames

When this option is turned on, AOL addresses such as bill gates@aol.com are allowed. In the output file, the spaces are removed from the username portion of the e-mail address, so the example above becomes billgates@aol.com.
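
The space removal can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and treating the aol.com domain without regard to case is an assumption.

```python
def strip_aol_spaces(address):
    """Remove spaces from the username of an aol.com address (hypothetical helper)."""
    local, at, domain = address.rpartition("@")
    if at and domain.lower() == "aol.com":
        return local.replace(" ", "") + "@" + domain
    return address  # other domains are left untouched
```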

No Duplicate Domains

Check this option to keep only one e-mail address per domain in the output file.
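
A minimal Python sketch of this filter follows. Which address OILM keeps for each domain is not stated here, so keeping the first one encountered is an assumption, as is the function name.

```python
def one_address_per_domain(addresses):
    """Keep the first address seen for each domain (assumed policy)."""
    seen_domains = set()
    kept = []
    for address in addresses:
        domain = address.rpartition("@")[2].lower()
        if domain not in seen_domains:
            seen_domains.add(domain)
            kept.append(address)
    return kept
```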

Reject non-country domains with 2 or more dots and country domains with 3 or more dots

This option removes e-mail addresses of non-country domains that have 2 or more dots after the '@' sign, such as xxx@yyy.zzz.com, and e-mail addresses of country domains that have 3 or more dots, such as xxx@yyy.zzz.com.au. Click the Edit country domains button to edit the list of country domains.
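
The dot-count rule can be expressed as a short Python sketch. The country list below is a small hypothetical sample, not the list that ships with OILM, and the function name is likewise made up for illustration. A domain is assumed to count as a country domain when its last label appears in that list.

```python
# Hypothetical sample; the real list is edited via "Edit country domains".
COUNTRY_DOMAINS = {"au", "ru", "uk", "de"}


def has_too_many_dots(address, country_domains=COUNTRY_DOMAINS):
    """True if the address would be rejected by the dot-count rule."""
    domain = address.rpartition("@")[2].lower()
    dots = domain.count(".")
    last_label = domain.rpartition(".")[2]
    # Country domains are allowed one more dot than non-country domains.
    limit = 3 if last_label in country_domains else 2
    return dots >= limit
```

Under these assumptions, xxx@yyy.zzz.com and xxx@yyy.zzz.com.au are rejected, while xxx@zzz.com and xxx@zzz.com.au are kept.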

Reject domains that start with numbers

Check this option to reject e-mail addresses whose domains start with a number.

Multi Column Support

This option is available only in Clean Mail Lists mode and allows you to process multicolumn e-mail lists. See What is Multi Column Support for details.

Save Rejected Addresses Into The Following File

You can specify a file in which to keep the rejected addresses. Leave this field blank if you do not want to keep them.