When doing a “send quick test” from the EXM review tab, everything worked as expected for the first few sends. On subsequent sends (usually the 4th or 5th), however, the email was not sent and the spinner next to the send button kept spinning for a long time. The XHR request to /sitecore/api/ssc/EXM/ExecuteSendQuickTest eventually timed out and showed the error “We are very sorry, but there has been a problem, please contact your system administrator.” I thought about doing that... but that would probably cause some sideways looks, given I’d be talking to myself.
This only appeared to be an issue in a PaaS environment (we were unable to reproduce it locally) on Sitecore 9.0.1 (with the EXM cumulative hotfix). It did not appear to affect any dispatch tasks, just the “send quick test”.
Investigating the CM logs revealed that the later, failing sends logged the following entry, followed by not much at all:
ManagedPoolThread #6 07:36:59 INFO MessageTaskRunner is starting 0 e-mail dispatch worker threads.
Previous successful sends would show something more like:
INFO MessageTaskRunner is starting 10 e-mail dispatch worker threads.
Then a series of entries like the following, showing each thread spinning up and then exiting once no more work is required:
INFO E-mail dispatch worker thread ‘MessageTaskRunner worker thread 3’ is starting.
INFO E-mail dispatch worker thread ‘MessageTaskRunner worker thread 3’ did not find any active tasks and exits.
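If you want to check your own CM logs for the same symptom, a small script can pick out the sends where zero worker threads were started. This is only a rough sketch: the function name find_starved_sends is made up for illustration, and the regular expression assumes the log message matches the wording of the entries shown above.

```python
import re

# Matches the MessageTaskRunner start-up entry and captures the thread count.
# The wording is taken from the log entries quoted in this post.
WORKER_COUNT = re.compile(
    r"MessageTaskRunner is starting (\d+) e-mail dispatch worker threads"
)


def find_starved_sends(log_lines):
    """Return the log lines where MessageTaskRunner started zero worker threads."""
    starved = []
    for line in log_lines:
        match = WORKER_COUNT.search(line)
        if match and int(match.group(1)) == 0:
            starved.append(line.strip())
    return starved


if __name__ == "__main__":
    sample = [
        "ManagedPoolThread #6 07:36:59 INFO MessageTaskRunner is starting 0 e-mail dispatch worker threads.",
        "INFO MessageTaskRunner is starting 10 e-mail dispatch worker threads.",
    ]
    for entry in find_starved_sends(sample):
        print(entry)
```

In practice you would pass it the lines of a log file (for example `open(path, encoding="utf-8")`) rather than the hard-coded sample.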
As this appeared to be a threading issue, we experimented with various settings such as Numthreads & MaxGenerationThreads, but saw similar issues with each, just with a varying number of threads spinning up and exiting on the successful attempts.
After discussing with support, they were able to identify a bug and provide a patch (ref 214025). This issue applies to Sitecore 9.0.1 and 9.0.2.
This patch replaces the implementation of the SendEmail processor in the SendEmail pipeline so that the thread semaphore is fully released for test sends. Without the patch, EXM can only send as many test messages as the MaxGenerationThreads setting allows, because the threads from previous attempts are not properly released.
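To make the failure mode a bit more concrete, here is a toy model of a leaked semaphore. It is not Sitecore code and not the patch itself: the QuickTestSender class, its parameters and the timeout are all hypothetical, and Python's threading.BoundedSemaphore simply stands in for whatever EXM uses internally. The point is only that if a slot is taken per send and never handed back, the sends stop succeeding once the slot count is used up.

```python
import threading


class QuickTestSender:
    """Toy model of a sender guarded by a semaphore (hypothetical, not EXM code)."""

    def __init__(self, max_generation_threads, release_on_finish):
        # One slot per allowed generation thread.
        self._slots = threading.BoundedSemaphore(max_generation_threads)
        # False imitates the bug (slot never returned), True imitates the fix.
        self._release_on_finish = release_on_finish

    def send(self, timeout=0.1):
        """Try to send one test message; return False if no slot frees up in time."""
        if not self._slots.acquire(timeout=timeout):
            return False  # stands in for the request that hangs and times out
        try:
            pass  # the actual sending would happen here
        finally:
            if self._release_on_finish:
                self._slots.release()
        return True


if __name__ == "__main__":
    buggy = QuickTestSender(max_generation_threads=4, release_on_finish=False)
    print([buggy.send() for _ in range(6)])
    # [True, True, True, True, False, False]

    patched = QuickTestSender(max_generation_threads=4, release_on_finish=True)
    print([patched.send() for _ in range(6)])
    # [True, True, True, True, True, True]
```

With four slots and no release, the fifth send is the first to fail, which is the same shape as the behaviour described above: a handful of successful quick tests, then nothing until the process is recycled.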