Friday, March 30, 2012

Recently Rebuilt SQL2K Cluster

I recently had to rebuild my two-node active/passive SQL Server 2000 SP3a
cluster running on Windows 2000 Advanced Server SP4. We ran into issues about
two weeks ago after a SAN migration (I'll spare all the gory details).

When my previous cluster died, I used one of the nodes as a stand-alone
instance of SQL Server (to run our production database). On the other former
node, I removed the OS clustering software and had to manually remove SQL
Server. I eventually (with the assistance of Donna L. at MS) was able to get
a single-node cluster of SQL2K running on physical server PRODSQL01 (the
virtual server name is PRODSQLCL01). While the cluster was running as a
single-node cluster, I applied SP3a and the 818 hotfix. I then migrated the
data from the stand-alone instance to PRODSQLCL01 and started to use it for
production.

I then turned my attention to the other box and removed and reinstalled the
Windows 2000 cluster software. This box is now called PRODSQL02. I was able
to add PRODSQL02 to the Windows cluster, and then successfully added it to
the virtual server via SQL Server setup. My issue now is that PRODSQL02 has
the RTM versions of the binaries. I am not sure if I have a setup issue or if
I misunderstand point 3.10 of the SP3a readme.

Currently the virtual server is running on PRODSQL01. If I attempt to follow
the steps under "If you need to rebuild a node in the failover cluster..."
from PRODSQL02, I am only able to select the Virtual Server, and if I
continue setup I get the error "all cluster disks available to this virtual
server are owned by other node(s)" and then "Setup was unable to verify the
state of the server for an upgrade. Verify the server is able to start and
that you provided a valid sa password and restart setup".

I understand that the virtual resources are only available on the currently
active node, but the way I read the instructions I should be able to run the
service pack installation on the inactive node. Do I need to rerun the SP
(and hotfix) setup on PRODSQL01 (since the virtual server is running there)?
Do I need to move the resources to PRODSQL02 and run the SP setup there? I
just need clarification of where to run the SP setup, since I just added
PRODSQL02 to this cluster.
You should be able to run setup on the non-host node and it will upgrade the
local binaries. Try rebooting the newly added RTM node and see if it helps.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
|||Geoff:
I've rebooted the newly added RTM node several times and it doesn't seem to
help. When I launch SP3 setup (via setup.bat in the local sql2ksp3
directory) on the newly added RTM node, I get to the Computer Name screen.
Local Computer is grayed out, and there is a box for the existing Virtual
Server name. When I type the virtual server name in the box and press Next,
I get the error messages I mentioned in my original post. Is there anything
else you can recommend?
Thanks,
Justin
|||This is how I wound up fixing this:
I ran sp3a setup from PRODSQL01, which was the machine in control of the SQL
Virtual server at the time. It updated the binaries on PRODSQL02, but in
order to do so, the SQL Server service was stopped. Once sp3a setup
completed, I had to reboot PRODSQL02. When it was back up, I verified the
version of the sqlservr.exe file, and it was 8.00.760 (right click,
properties, version). Once I established that sp3a took on the newly added
node, I started the setup for the 8.00.818 security hotfix. SQL Server went
down briefly again, and then the binaries on the newly added box were
updated. I then installed MDAC 2.8 on this node and rebooted. Once
PRODSQL02 was running again I tested moving the cluster resources from 01 to
02. It succeeded. While SQL was running on 02 I ran through setting up
Imceda SQL Lite Speed (again, it was on 01 but not 02). Once that completed,
I was able to verify that my normally scheduled transaction log backups
completed successfully. Both machines run SQL Server and perform the
backups correctly. I am not sure about the root cause of the issue I
encountered with not being able to run the sp and hotfix on the newly added
node when it wasn't running the Virtual SQL Server (the sp3a readme seems to
indicate that you can), but I finally have a 2 node active passive cluster
running again.
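
For anyone repeating the version check above on both nodes, here is a minimal
Python sketch that maps a version string such as 8.00.760 to a SQL Server
2000 patch level and compares the nodes. The function names and the node
dictionary are hypothetical, and the sketch assumes you have already read the
version off each node by hand (for example from the sqlservr.exe file
properties) and typed it in using the 8.00.xxx form quoted in this thread.

```python
# Widely documented SQL Server 2000 build numbers (third part of 8.00.xxx).
KNOWN_BUILDS = [
    (194, "RTM"),
    (384, "SP1"),
    (534, "SP2"),
    (760, "SP3/SP3a"),
    (818, "SP3a + 818 security hotfix"),
    (2039, "SP4"),
]


def patch_level(version):
    """Return the highest known patch level at or below the given build.

    Expects a string in the 8.00.xxx form (an assumption of this sketch).
    """
    parts = version.strip().split(".")
    if len(parts) < 3 or parts[0] != "8":
        raise ValueError("not a SQL Server 2000 version: %r" % version)
    build = int(parts[2])
    level = "unknown (below RTM)"
    for known_build, label in KNOWN_BUILDS:
        if build >= known_build:
            level = label
    return level


def nodes_match(versions):
    """True when every node in the dict reports the same patch level."""
    return len({patch_level(v) for v in versions.values()}) == 1


if __name__ == "__main__":
    # Hypothetical values: the state of the two nodes before the fix.
    nodes = {"PRODSQL01": "8.00.818", "PRODSQL02": "8.00.194"}
    for name, version in sorted(nodes.items()):
        print(name, version, patch_level(version))
    print("nodes match:", nodes_match(nodes))
```

A mismatch between the nodes is the situation described at the start of this
thread: failover might still work, but the passive node would be running
older binaries.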
|||Just to clarify, it is active passive, so only one node runs the Virtual SQL
Server at any time. I was trying to say that each node is able to run SQL
Server, and each node is able to run the SQL Lite Speed backups. I had been
concerned that somehow I would get PRODSQL02 all patched but then failing
over would not work correctly. Everything is working OK.
|||The current correct term is "Single-Instance". "Active-Active" and its
cousins all refer to technology used only in SQL Server 7.0.
Sometimes the binary-only upgrade doesn't work right. Your solution is the
fallback, but it has the obvious downside of taking SQL Server offline for a
short time during the upgrade.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator