I have found that one of my customer domains has problems sending messages outside its environment. Some messages got stuck in the queue for several hours or even days for no apparent reason. SMTP traffic to most other domains was OK, but a few had problems. I suspect the reason was multiple SPF TXT records published for a single domain. Example:
domain.com TXT="v=SPF1 mx host1.domain1.com ~all"
domain.com TXT="v=SPF1 mx host2.domain2.com ~all"
domain.com TXT="v=SPF1 mx host3.domain3.com ~all"
RFC 4408 states that a domain must not publish multiple SPF records:
3.1.2. Multiple DNS Records: A domain name MUST NOT have multiple records that would cause an authorization check to select more than one record. See Section 4.5 for the selection rules.
The explanation is quite logical: if more than one SPF record remains after record selection, a permanent error is returned.
4.5. Selecting Records

Records begin with a version section:

record = version terms *SP
version = "v=spf1"

Starting with the set of records that were returned by the lookup, record selection proceeds in two steps:

1. Records that do not begin with a version section of exactly "v=spf1" are discarded. Note that the version section is terminated either by an SP character or the end of the record. A record with a version section of "v=spf10" does not match and must be discarded.
2. If any records of type SPF are in the set, then all records of type TXT are discarded.

After the above steps, there should be exactly one record remaining and evaluation can proceed. If there are two or more records remaining, then check_host() exits immediately with the result of "PermError".

If no matching records are returned, an SPF client MUST assume that the domain makes no SPF declarations. SPF processing MUST stop and return "None".
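To make the selection rules easier to follow, here is a minimal Python sketch of the two steps quoted above. It is only an illustration, not code from any real SPF library: the function names `is_spf_version` and `select_record` and the "Selected" result label are my own, and the records are passed in as plain strings instead of being fetched from DNS. I assume the version tag is matched case-insensitively, because ABNF string literals are case-insensitive, so "v=SPF1" is treated the same as "v=spf1".

```python
def is_spf_version(record: str) -> bool:
    """Return True if the record begins with a version section of exactly "v=spf1".

    The version section ends at a space or at the end of the record,
    so "v=spf10 ..." does not match. Matching is case-insensitive
    (assumption: ABNF literals are case-insensitive).
    """
    lowered = record.lower()
    return lowered == "v=spf1" or lowered.startswith("v=spf1 ")


def select_record(txt_records, spf_records=()):
    """Apply the two selection steps to already-fetched record strings.

    Returns a (result, record) tuple. "None" and "PermError" are the
    result names used in the RFC text; "Selected" is a hypothetical
    label meaning that evaluation can proceed with the returned record.
    """
    # Step 1: discard records that do not start with the "v=spf1" version.
    txt = [r for r in txt_records if is_spf_version(r)]
    spf = [r for r in spf_records if is_spf_version(r)]

    # Step 2: if any records of type SPF remain, all TXT records are discarded.
    remaining = spf if spf else txt

    if not remaining:
        return ("None", None)       # the domain makes no SPF declaration
    if len(remaining) > 1:
        return ("PermError", None)  # more than one record is a permanent error
    return ("Selected", remaining[0])


# The three TXT records from the example at the top of this post.
example_records = [
    "v=SPF1 mx host1.domain1.com ~all",
    "v=SPF1 mx host2.domain2.com ~all",
    "v=SPF1 mx host3.domain3.com ~all",
]

print(select_record(example_records))
```

With the three example records, all of them survive step 1, so the sketch returns "PermError" without evaluating any of them. Merging them into a single record would let selection succeed.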
Well. The consequence of this "implementation" is that some messages sent from a domain with an invalid SPF setup to a domain that performs SPF checks might be lost (-all) or delayed. I am going to investigate this further. If you have any experience with a similar problem, please let me know.