Welcome to

SqlServer-QA.net

Sign in | Join | Help

SQL Server Storage Engine & Tools (SSQA.net)

SQL Server Tools includes storage engine that includes a complete set of graphical tools and command prompt utilities that allow users, programmers, and administrators. SSMS, SSRS, SSIS & SSAS are most commonly used tools.
SQL Server error: Msg 6101, Level 16, State 1, Line 1 Process ID -2 is not a valid process ID. Choose a number between 1 and 2048 - update

Here is the quirky error that has caused a resources crunch on a SQL instance, Error : Process ID -2 is not a valid process ID.

The actual issue occurred during an update to a table that has been used by Online application to update the action log, intially the DBAs have thought it is a blocking problem and trying to catch up the SPID to kill with no success. As the existence of that process was on a minimum time and hard to catch hold, also another practice is to refer to SQL Server error log to see any DEADLOCK log is printed.

So the practice of inbuilt TSQL to catch orphaned transactions and the above problem has been occuring on regular basis causing more issues to the shared SQL Instance that is important to the business. TO catch up running SP_WHO2 identified that SPID as -2 causing a blocking chain with a dozen other transactions behind it trying to acquire resource. The obvious practice is to kill that SPID which is causing such a mayhem, but if you try to do so you will get specified on the subject line. Unless there is a big problem I wouldn't recommend to kill a SPID which might cause additional problems, when it is performing ROLLBACK operations and in this case say if you are trying to delete x 100 of rows on a busy table!

Ok let us go further in killing this issue then here is the fix the problem on killing this negative SPID, not that easy. One of the DBA suggested to restart the SQL Server services that will wipe this blocking and negative SPID, you bet it will not as it is treated as open transaction and further issues to recover the databases when you have a shared server platform, not a good practice at all.

In order to catch hold of that SPID from the system tables you could run the following TSQL:

select req_transactionUOW  from master..syslockinfo where req_spid = -2

As it is observed the req_transactionUOW is the new column, as per the BOL "Identifies the Unit of Work ID (UOW) of the DTC transaction. For non-MS DTC transactions, UOW is set to 0". Ok, a new indication of option here which is MSDTC, a bit of background on the topic is All distributed transactions not associated with a session are orphaned transactions. The Database Engine assigns all orphaned distributed transactions the SPID value of -2, which makes it easier for a user to identify blocking distributed transactions. I will cover later about how to resolve these inconsistent distributed transactions and coming back to resolve this negative SPID which is classed asorphan distributed transactions which are not associated with any real SPID.

Identify the Unit Of Work (UOW) for orphan distributed transaction :

use master
go
select req_transactionUOW from syslockinfo where req_spid = -2
go

Use KILL 'UOW' to terminate orphaned DTC transactions

KILL 'FE4A57F2-28C5-44F9-8416-B08760DFE7E9'

Once the KILL process completes successfully then you can be assured by referring to SQL error log for following text:

Spid 60 tried to abort the distributed transaction with UOW {FE4A57F2-28C5-44F9-8416-B08760DFE7E9}.

Confirm there no more locks held by -2 by running below in Query Analyzer, sp_lock -2. Ok now the actual process to see whether any distributed transactions are involved within your setup, if not you have to look at the SPID process that is running any TSQL/Stored procedure to optimize for further occurrence on the instance. Not all environments will have MSDTC service usage, but say when any transaction involves data which resides on more than one server, such as when a database record is replicated out to two or more servers, MSDTC needs to become involved where the SQL engine manages it efficiently. Also if such distributed transactions are involved between the servers then any network issue might trigger the inconsistency on SQL creating such a mayhem.

Now coming back to handling the resolution for  the inconsistent distributed transactions, SQL Server BOL has the topic about "Troubleshooting MS DTC Transactions". Also I have seen sometimes that restart of Distributed Transaction Co-ordinator (MSDTC) service might resolve the issue, but not in all the cases. The below steps should avoid the issue:

1.  Type mmc.exe in the Run dialog box to open Microsoft Management Console (MMC).
2.  In MMC, on the Console menu, click Add/Remove Snap-in .
3.  In the Add/Remove Snap-in dialog box, click Add .
4.  In the Add Standalone Snap-in dialog box, click Component Services , and then click Add to add the snap-in.
5.  Click Close .
6.  In the Add/Remove Snap-in dialog box, click OK .
7.  In the console root under Component Services , expand Component Services , point to Computers , point to My Computer , and then click Distributed Transaction Coordinator .
8.  Select Transaction List . You see transactions in the details pane. Right-click the transaction, and then click Resolve . You now have three options: Commit , Abort , or Forget . Use the one is appropriate for your situation.
Windows NT
1.  Click Start , click Run , type dac.exe , and then press enter. The MS DTC Administrative Console opens.
2.  On the Transactions tab, right-click the transaction, and then click Resolve . You now have three options: Commit , Abort , or Forget . Use the one is appropriate for your situation.
 

Additional update on the issue, the same problem appeared again and again causing more performance issue. So the root cause is hardware and network links that are associated with that server. In addition to the MSDTC issues that I referred above both of these problems are adding the fuel. One of the DBA made a mistake by killing that negative SPID causing some more problem due to the nature of ROLLBACK actions, the explanation was he found that PROFILER was showing more CPU resource time for that SPID. I would say use PROFILER as a last resort and the results shown on there is duration after that statement is complete, anyhow the resolution must be find the long running code and explicit KILL of SPID will not help. It is better to follow the results from the DMV sys.dm_exec_requests to get more information.

Posted: Monday, July 27, 2009 12:07 AM by SQL Master

Comments

Other SQL Server Blogs around the Web said:

Here is the quirky error that has caused a resources crunch on a SQL instance, Error : Process ID -2

# June 11, 2009 3:07 AM

Other SQL Server Blogs around the Web said:

Here is the quirky error that has caused a resources crunch on a SQL instance, Error : Process ID -2

# July 27, 2009 1:09 AM

SqlServerKudos said:

Kudos for a great Sql Server article - Trackback from SqlServerKudos

# July 27, 2009 7:37 PM
Anonymous comments are disabled