If we are to target spammers effectively, proper identification of the target is important. If 90% of the complaints a provider gets are invalid, the provider tends to be less apt to give needed attention to the 10% that are valid.

Figuring out where a spam really came from is something of an art. The trick is to dig out the real information the spammer can't easily alter or forge while discarding the forged information. Forget the "To:", "Reply-To:", "From:", and "Message-ID:" lines, as any of them can be forged at will. The key is careful scrutiny of the "Received:" lines and a working knowledge of basic tools such as whois.

Each Received line will have something like:

    Received: from iac6.navix.net (iac6.navix.net [207.91.5.4]) by mx1.eskimo.com

The name outside the parentheses is the name the connecting host gave the SMTP host in the "HELO" command, where it identifies itself. The name inside the parentheses is the real host name: the SMTP host looks up the IP address (inside the brackets) and, if it can resolve the host name, puts that inside the parentheses. The top Received: line is the closest to you, i.e., it was added by your provider's host as it received the mail from some other host on the net.

If the names outside and inside the parentheses don't match, then the connecting machine in effect lied and is most likely the spammer's. In this case, use the IP address inside the parentheses and do a whois on the net block. To do this, use the first three octets and a trailing dot. For example, if it came from our server, say 204.122.16.4, you would do:

    whois -h whois.arin.net 204.122.16.

In this case, this would show a net block:

    OrgName:    Eskimo North
    OrgID:      ESKI
    Address:    P.O. Box 55816
    City:       Seattle
    StateProv:  WA
    PostalCode: 98155
    Country:    US

    NetRange:   204.122.16.0 - 204.122.31.255
    CIDR:       204.122.16.0/20
    NetName:    ESKIMO
    NetHandle:  NET-204-122-16-0-1
    Parent:     NET-204-0-0-0-0
    NetType:    Direct Allocation
    NameServer: MAIL.ESKIMO.COM
    NameServer: ESKIMO.COM
    NameServer: ISUMATAQ.ESKIMO.COM
    NameServer: ESKINEWS.ESKIMO.COM
    Comment:    ADDRESSES WITHIN THIS BLOCK ARE NON-PORTABLE
    RegDate:    1994-08-16
    Updated:    1996-02-22

    TechHandle: RD160-ARIN
    TechName:   Dinse, Robert
    TechPhone:  +1-206-361-1161
    TechEmail:  nanook@eskimo.com

From this you can see that I am the net admin for that block. You want to do this because it tells you not only which provider the mail actually originated from, but also who the upstream provider is, in case either the information from reverse DNS was bogus or the provider is not cooperative.

About 90% of the spams I have seen lately originate on a dial-up port of a major provider such as Compuserve, AOL, Netcom, ATT, MCI, Sprint, etc. Usually the top line will show a name coming from a dial-IP port. The next Received line will usually have the address of that port. There will usually be additional Received lines below this. An important thing to know is that no legitimate organization is going to have its MX servers on a dial-up PPP connection, so Received lines below this point are bogus.

Spammers typically make other mistakes when forging that reveal which information is forged. One typical mistake is putting the forged "Received" lines below other message header lines: any "Received" lines below a user header line like Subject or To are forged and should be ignored. Also look for mistakes in the Received lines themselves, such as invalid numerical components in the IP address (each octet must be in the range 0-255, so if you see an IP address like 954.23.22.42 you know it's forged).

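The octet check and the whois query described above are mechanical enough to sketch in a few lines of Python. This is only an illustration, not a finished tool: the function names (is_plausible_ipv4, whois_prefix, whois_command) are made up for this example, and the code only builds the whois command line from the text rather than running it.

```python
import re

# Four groups of one to three ASCII digits separated by dots.
_DOTTED_QUAD = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})", re.ASCII)


def is_plausible_ipv4(text):
    """Return True if text is a dotted quad with every octet in 0-255."""
    match = _DOTTED_QUAD.fullmatch(text)
    if match is None:
        return False
    return all(int(octet) <= 255 for octet in match.groups())


def whois_prefix(ip):
    """Return the first three octets plus a trailing dot, e.g. '204.122.16.'."""
    if not is_plausible_ipv4(ip):
        raise ValueError("not a plausible IPv4 address: %r" % (ip,))
    return ".".join(ip.split(".")[:3]) + "."


def whois_command(ip, server="whois.arin.net"):
    """Build (but do not run) the whois command line shown in the text."""
    return ["whois", "-h", server, whois_prefix(ip)]


if __name__ == "__main__":
    print(is_plausible_ipv4("954.23.22.42"))          # False: 954 is out of range
    print(" ".join(whois_command("204.122.16.4")))    # whois -h whois.arin.net 204.122.16.
```

If a whois client is installed, the list returned by whois_command could be handed to subprocess.run; that step is left out here so the sketch has no side effects.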
Look for format errors. This is an example of a properly formatted Received: line:

    Received: from iac6.navix.net (iac6.navix.net [207.91.5.4]) by mx1.eskimo.com

There should be a space after the HELO name before the left parenthesis, no space between the left parenthesis and the host name, a space after the host name before the bracket, no space after the bracket before the right parenthesis, and then a space before the 'by'.

By determining the first forged Received line you are also determining the last possible valid Received line (the one immediately above it).
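
The spacing rules above, the HELO comparison, and the "Received below Subject or To" check can be roughed out in Python as follows. Treat this as a sketch under assumptions: the regular expression covers only the single Received layout shown in this article (other mail software writes the line differently), the helper names are invented for the example, and the header lines are assumed to be already unfolded, one header per string.

```python
import re

# Matches only the layout described in the text:
#   Received: from HELO (RDNS [IP]) by HOST
RECEIVED_RE = re.compile(
    r"^Received: from (?P<helo>[^\s()]+) "
    r"\((?P<rdns>[^\s\[\]()]+) \[(?P<ip>[0-9.]+)\]\) "
    r"by (?P<by>\S+)"
)


def parse_received(line):
    """Return the fields of a Received line as a dict, or None if the
    spacing or punctuation differs from the expected layout."""
    match = RECEIVED_RE.match(line)
    return match.groupdict() if match else None


def helo_matches(fields):
    """True if the name given in HELO equals the looked-up host name."""
    return fields["helo"].lower() == fields["rdns"].lower()


def misplaced_received(header_lines):
    """Return indexes of Received lines that appear after a Subject or To line."""
    seen_user_header = False
    misplaced = []
    for index, line in enumerate(header_lines):
        name = line.split(":", 1)[0].strip().lower()
        if name in ("subject", "to"):
            seen_user_header = True
        elif name == "received" and seen_user_header:
            misplaced.append(index)
    return misplaced


if __name__ == "__main__":
    line = ("Received: from iac6.navix.net "
            "(iac6.navix.net [207.91.5.4]) by mx1.eskimo.com")
    fields = parse_received(line)
    print(fields["ip"], helo_matches(fields))
```

A line that fails parse_received, or whose two names differ, is only a candidate for a closer look by hand; the code does not decide on its own that a header is forged.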