How to Fix “Workflow failed to start” in SharePoint 2010

Problem:

Recently I have been troubleshooting some production workflow (WF) issues. Everything works fine in dev and staging, but not in production. We use code to trigger the workflow asynchronously (i.e. the SharePoint workflow timer job has to pick it up and invoke the workflow; the code is shown below). In production the workflow failed the first time and then started working about 10 minutes later. It also worked fine when we triggered the workflow manually. The first thing we did was enable workflow-related logging, as described in my previous blog post, but we found nothing.

Here is the code we use to invoke the workflow:

SPWorkflow wf = CurrentSite.WorkflowManager.StartWorkflow(item, workflowAssociation, "<Data></Data>", SPWorkflowRunOptions.Asynchronous);


Solution:

After some more research, we found that the topology of the production farm is different from staging: in staging we have 2 app servers and 2 web front-end servers. The Microsoft SharePoint Foundation Workflow Timer Service (SFTS) is started by default on all the servers in the farm (we have 6 servers: 2 WFE, 2 App/Crawl (Index), 2 DB), and we observed that the SFTS running on the Crawl and App servers was most likely causing the workflow failures. After stopping this service on the App servers, the workflow works like a charm.

Here is a summary of the issue and solution that I grabbed from the MSDN Forum:

Problem:

·         A state machine workflow is deployed on a multi-server SharePoint Server 2010 farm.

·         The workflow uses DelayActivity in multiple states.

·         The workflow logs an error in the workflow history list as “<workflow name> failed to run” (randomly, with no specific pattern).

·         The ULS logs and Event Viewer have no errors logged.

Analysis:

·         My understanding (I would love to be corrected if this is not the case) is that when the workflow processes a DelayActivity, a timer job is created and scheduled on (or possibly picked up by) the server(s) running the Microsoft SharePoint Foundation Workflow Timer Service.

·         When the timer job comes due, the server (WFE/Crawl/App) tries to process the instruction, which in this case is rescheduling the workflow execution, and this requires the workflow assembly to be available on that server. (Do read this very interesting post if you want to understand how workflows are executed: http://www.the14folder.com/2010/07/25/migrating-workflows-question)

·         Now you may be wondering:

   Should the workflow assembly be present on these server roles?

   If yes, then how did the workflow assembly go missing from the Crawl and App servers?

·         Well, the culprit was the value ‘WebFrontEnd’ of the ‘DeploymentServerType’ attribute in the solution manifest file. This caused the solution deployment process to copy the workflow assembly only to the WFEs and not to the Crawl and App server roles (http://msdn.microsoft.com/en-us/library/ms412929.aspx).

·         Since the Microsoft SharePoint Foundation Workflow Timer Service was running on the Crawl and App servers as well, timer job execution was failing there with “Feature not found” and/or “Assembly cannot be loaded” information in the ULS logs (you will find these only if you enable verbose-level logging: http://technet.microsoft.com/en-us/library/ee748656.aspx).
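To make the manifest check concrete, here is a minimal sketch that reads the ‘DeploymentServerType’ attribute from the root Solution element of a manifest.xml. The class and method names (ManifestCheck, GetDeploymentServerType, IsWebFrontEndOnly) are made up for illustration and are not part of any SharePoint API; it only uses System.Xml.Linq and assumes you have already extracted manifest.xml from the .wsp package.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper, not part of SharePoint: inspects a solution manifest.
public static class ManifestCheck
{
    // Returns the DeploymentServerType attribute of the root <Solution>
    // element, or null when the attribute is not present.
    public static string GetDeploymentServerType(string manifestXml)
    {
        XElement root = XDocument.Parse(manifestXml).Root;
        // Attributes are unqualified even though the element is in the
        // SharePoint namespace, so a plain name lookup is enough.
        XAttribute attr = root.Attribute("DeploymentServerType");
        return attr == null ? null : attr.Value;
    }

    // True when the manifest restricts deployment to web front ends,
    // which is the setting that caused the problem described above.
    public static bool IsWebFrontEndOnly(string manifestXml)
    {
        return string.Equals(
            GetDeploymentServerType(manifestXml),
            "WebFrontEnd",
            StringComparison.OrdinalIgnoreCase);
    }
}
```

You could call ManifestCheck.IsWebFrontEndOnly(File.ReadAllText("manifest.xml")) as a quick sanity check before deploying a workflow solution to a farm where the workflow timer service also runs on non-WFE servers.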

Solution:

·       We stopped the Microsoft SharePoint Foundation Workflow Timer Service on the Crawl and App server roles since, as Lily pointed out, it is not recommended to have this service running on App and Crawl server roles.
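Before stopping anything, it helps to see which servers actually run the service. The following is a rough sketch, not something from the forum thread: it walks the farm with the server object model and prints the status of the workflow timer service instance on each server. Matching on the display name is an assumption (it may differ on localized farms), and the class name is made up. It has to run on a farm server under an account with farm administrator rights.

```csharp
using System;
using Microsoft.SharePoint.Administration;

// Hypothetical console utility: reports where the workflow timer service runs.
public static class WorkflowTimerServiceReport
{
    // Display name as quoted in this post; matching on it is an assumption.
    private const string ServiceTypeName =
        "Microsoft SharePoint Foundation Workflow Timer Service";

    public static void Main()
    {
        foreach (SPServer server in SPFarm.Local.Servers)
        {
            foreach (SPServiceInstance instance in server.ServiceInstances)
            {
                if (instance.TypeName == ServiceTypeName)
                {
                    // Status is Online when the service is started on that server.
                    Console.WriteLine("{0} ({1}): {2}",
                        server.Name, server.Role, instance.Status);
                }
            }
        }
    }
}
```

Any App or Crawl server reported as Online here is a candidate for stopping the service, for example from the Services on Server page in Central Administration.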

Takeaways:

·         Be sure of the attributes that you choose in your solution manifest file.

·         Enable verbose-level logging if you do not see any errors in the ULS logs or Event Viewer.

·         Make sure that the services running on each server role are must-haves, and that you know why you have chosen it that way.

·         Stop the Microsoft SharePoint Foundation Workflow Timer Service on the server roles where you do not intend to deploy solutions that require this service.

References:

    http://social.technet.microsoft.com/Forums/en-US/sharepoint2010setup/thread/90e84e00-d956-492c-b096-56adb839494c/

    


  1. sharepointdiva
    May 11, 2012 at 7:45 am

    It’s so rare to find your exact problem described AND the answer for it in the same place. Thank you!

  2. September 5, 2012 at 12:47 am

    Thank you, this helped me alot!

  3. August 22, 2013 at 4:32 pm

    very helpful, thanks for posting Q&A in the same place.
