Step by Step How to Failover in Cluster and Apply Patches in Cluster and Active Passive and Active Active Cluster Explanation
Cluster information
2 node Cluster Active-Passive Method
We have installed one SQLServer from New SQL Server Failover Cluster @ SQLNODE01
After completion on Failover cluster SQL Installation on SQLNODE01
We launched SQL installation from SQLNODE03 and use
add node to SQL failover cluster @ SQLNODE03
so only one SQL services are running on one node and other services will be in stop status it is called Active Passive Cluster method
Windows Cluster Name: WinClusterSQLTechi
Domain:SQLTechi.com
Roles:3 Clustered Roles ( 2 Alwayson and 1 Clustered SQL roles)
Nodes: 2 Node
Node info: SQLNode01, SQLNode03
SQL Cluster Name: SQLTechiCluster
Roles information
Node information
SQLNode01 info:
SQLNode03 info:
SQL Cluster information
SQL Cluster Services info NODE01
SQL Cluster Services info NODE03
Disk information:
Network information:
Automatic Failover:
sometime when hardware/network failures Cluster will automatically failed to possible node
How to failover Manually on Cluster SQL Services:
here you can see current owner node is SQLNODE03
Now we should go to Roles and right click on SQL cluster and choose Move and select Node/Possible Node
we can choose which node to SQL Services can run as see below
Note:
Here we can defined which is preferred owner
We can also whenever preferred node is ready we can mention to fail back and defined how much failover can happen from here
Apply Patches on SQL Server Active-Passive Cluster
We should need to first apply Patches on Passive Node (Because SQL Services already on stopped state)
Here you can see Current Owner node is SQLNODE03
so we can apply patches to SQLNODE01
Verify instance edition info
Verify SQL Services are stopped state
Step 1:
we need to copy patch setup file to appropriate node (our case NODE01)
Step 2:
Right Click and run as admin in setup file of SP3 2012
Step 3 :
Complete step by step wizard
now we have successfully applied Service Pack 3 on SQL 2012 @ SQLNODE01 (passive Node)
Now we need to failover to SQLNODE01 from SQLNODE03 to apply service pack on SQLNODE03 (NEW Passive)
Now we have successfully failed over to SQLNODE01 From SQLNODE03
we can verify it SP3 upgraded to SQLNODE01
We now ready to apply service pack on new passive Node (SQLNODE03)
You can see services stopped state on SQLNODE03
Now applying SQL2012 SP3 setup file run as admin @ SQLNODE03 (New Passive Node)
after failed over to SQLNODE03 We can SSMS to see Edition info upgraded to SP3 SQL2012 From
SQL2012 RTM
Active Active Cluster
We are going to Install another SQL Server Services and convert this Active Passive to Active Active Cluster
Current Mode (Active Passive) we can failover to any node like SQLNODE01/03
SQLNODE01 SQLNODE03
VSQL failover Installed added failover node to SQLNODE01
we are planning to install another SQL Installation VSQL1 on SQLNODE03 and add to
SQLNODE01 (Active Active)
SQLNODE01 SQLNODE03
Will add failover node to SQLNODE03 VSQL1 failover will Install
Starting Installation on SQLNODE03 for SQL 2014
We have installed SQL 2014 on SQLNODE03 with failover cluster installation
now we need to add node to this cluster on SQLNODE01
DONE . We have successfully added SQLNODE01 node to cluster
now failover cluster manager shows active active node as see below
SQL Services info @ SQLNODE03
here VSQL2014 is installed on SQLNODE03 and it is currently active owner node it will be in running status see below
SQL Services info @ SQLNODE01
here VSQL is installed on SQLNODE01 and currently it is owner node so it will be in running status see below
login via SSMS SQLNODE03
SQL EDITION information @ sqlnode03 and @sqlnode01
Trying with alwaysON
We need to enable alwayson availability group on services of SQL before we do here
After we failed over all VSQL & Vsql2014 to SQLNODE03 it looks like below
VSQL2014 Installed SQL 2014 (sqlnode03) and SQLNODE03 Installed SQL 2012
VSQL2014 Installed SQL 2014 (sqlnode03) and VSQL Installed SQL 2012
When we are trying SQLNODE01
Checking for compatibility of the database file location on the secondary replica resulted in an error. (Microsoft.SqlServer.Management.HadrTasks)
The following folder locations do not exist on the server instance that hosts secondary replica Node1\AGTest: S:\MSSQL11.AG1\SQL_DATA; Microsoft.SqlServer.Management.HadrTasks)
– This error occurs if the drive letters or the folder path does not match between primary and secondary replica.
– To resolve this error, make sure that same drive letter and folder path exists on both the servers or perform manual synchronization of secondary server databases using backup/restore.
Failed to create, join or add replica to availability group ‘AGTest’, because node ‘Node1’ is a possible owner for both replica ‘AGTest\AGTest’ and ‘Node1\AGTest1’. If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again. (Microsoft SQL Server, Error: 19405)
– This error occurs, if SQL Server instance where you are setting up AlwaysON Availability Group is part of SQL Server failover clustering and has both nodes as possible owners and then You tried to add another availability Group for another SQL cluster instance involving these nodes.
– To resolve this error, either follow the solution suggested in the error message “If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again” or instead of using cluster SQL instances, just use standalone instances.
– This error occurs, if SQL Server instance where you are setting up AlwaysON Availability Group is part of SQL Server failover clustering and has both nodes as possible owners and then You tried to add another availability Group for another SQL cluster instance involving these nodes.
– To resolve this error, either follow the solution suggested in the error message “If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again” or instead of using cluster SQL instances, just use standalone instances.
Error while trying to connect to AlwaysON Availability group using Listener Name
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 2)
– This error occurs if the port number of the Listener is not 1433. To resolve this issue either use the port number along with the listener name like SQLAGLSTNR,1467 or create an alias.
– Make sure listener port is open to communicate. Test using telnet listenername portnumber
– Communication only works when using TCP network protocol.
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 2)
– This error occurs if the port number of the Listener is not 1433. To resolve this issue either use the port number along with the listener name like SQLAGLSTNR,1467 or create an alias.
– Make sure listener port is open to communicate. Test using telnet listenername portnumber
– Communication only works when using TCP network protocol.
Database Mirroring login attempt by user ‘Domain\ComputerName$.’ failed with error:
‘Connection handshake failed. The login ‘Domain\ComputerName$’ does not have CONNECT permission on the endpoint. State 84.’
– This error will be logged in SQL errorlog when trying to setup AlwaysON or Database Mirroring. This error occurs if the SQL Server services are running under local system accounts or if the SQL Service account does not have connect permission on the endpoint.
– To resolve the error, Change the SQL Server services to run under a domain account and then grant connect permission on endpoint to SQL Server service account.
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\SQLServiceAccount]
GO
‘Connection handshake failed. The login ‘Domain\ComputerName$’ does not have CONNECT permission on the endpoint. State 84.’
– This error will be logged in SQL errorlog when trying to setup AlwaysON or Database Mirroring. This error occurs if the SQL Server services are running under local system accounts or if the SQL Service account does not have connect permission on the endpoint.
– To resolve the error, Change the SQL Server services to run under a domain account and then grant connect permission on endpoint to SQL Server service account.
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\SQLServiceAccount]
GO
Summary for the replica hosted by ServerName\INSTANCE2 Replica mode: Asynchronous commit
This replica will use asynchronous-commit availability mode and support only forced failover
(with possible data loss).
Note: This is a Failover Cluster Instance. Failover Cluster Instances do not support AlwaysOn automatic failover.
– This error occurs, if you are using SQL clustered instances to setup AlwaysON Availability Group.
– Cluster SQL Server instance will not support the automatic failover in availability groups. If you want automatic failover, install SQL server instance as standalone instance.
This replica will use asynchronous-commit availability mode and support only forced failover
(with possible data loss).
Note: This is a Failover Cluster Instance. Failover Cluster Instances do not support AlwaysOn automatic failover.
– This error occurs, if you are using SQL clustered instances to setup AlwaysON Availability Group.
– Cluster SQL Server instance will not support the automatic failover in availability groups. If you want automatic failover, install SQL server instance as standalone instance.
Cluster network name resource ‘TestAG’ failed to create its associated computer object in domain ‘testdomain.com’ during: Resource online.The text for the associated error code is: A constraint violation occurred.Please work with your domain administrator to ensure that:
The cluster identity ‘SQLClus$’ has Create Computer Objects permissions. By default all computer objects are created in the same container as the cluster identity ‘SQLClus$’.The quota for computer objects has not been reached. If there is an existing computer object, verify the Cluster Identity ‘SQLClus$’ has ‘Full Control’ permission to that computer object using the Active Directory Users and Computers tool.
– This error occurs if Cluster Name Object does not have rights to create a new Virtual Network Name Object. Refer below article for instructions to prestage the Virtual Name Object.
http://technet.microsoft.com/en-us/library/dn466519.aspx
The cluster identity ‘SQLClus$’ has Create Computer Objects permissions. By default all computer objects are created in the same container as the cluster identity ‘SQLClus$’.The quota for computer objects has not been reached. If there is an existing computer object, verify the Cluster Identity ‘SQLClus$’ has ‘Full Control’ permission to that computer object using the Active Directory Users and Computers tool.
– This error occurs if Cluster Name Object does not have rights to create a new Virtual Network Name Object. Refer below article for instructions to prestage the Virtual Name Object.
http://technet.microsoft.com/en-us/library/dn466519.aspx
Joining database on secondary replica resulted in an error.
(Microsoft.SqlServer.Management.HadrTasks)
Failed to join the database ‘AGTest’ to the availability group ‘AGGrpTest′ on the
availability replica ‘Replica2′. (Microsoft.SqlServer.Smo)
The connection to the primary replica is not active. The command cannot be processed.
(Microsoft SQL Server, Error: 35250 Level 16, State 7)
– Try below steps to fix the error
o Make sure that the alwaysON endpoint [Hadr_endpoint] is not blocked by firewall
o Verify and make sure that SQL Server service account of primary server is added as a login on all the secondary servers and vice-versa.
o If SQL Server service accunt is “Nt service\” or local system account then ensure system account (Domainname\systemname$) of each replica is added as a login to other replicas.
CREATE LOGIN Domain\replica2$] FROM WINDOWS
o Grant connect permission on alwaysON endpoints on each replicas for SQL Server service account of all other replicas On Secondary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica1$]
On primary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica2$]
o Make sure SQL Server name (select @@servername) matches with hostname of the server.
o Make sure cluster service startup account is added as SQL Server login.
(Microsoft.SqlServer.Management.HadrTasks)
Failed to join the database ‘AGTest’ to the availability group ‘AGGrpTest′ on the
availability replica ‘Replica2′. (Microsoft.SqlServer.Smo)
The connection to the primary replica is not active. The command cannot be processed.
(Microsoft SQL Server, Error: 35250 Level 16, State 7)
– Try below steps to fix the error
o Make sure that the alwaysON endpoint [Hadr_endpoint] is not blocked by firewall
o Verify and make sure that SQL Server service account of primary server is added as a login on all the secondary servers and vice-versa.
o If SQL Server service accunt is “Nt service\” or local system account then ensure system account (Domainname\systemname$) of each replica is added as a login to other replicas.
CREATE LOGIN Domain\replica2$] FROM WINDOWS
o Grant connect permission on alwaysON endpoints on each replicas for SQL Server service account of all other replicas On Secondary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica1$]
On primary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica2$]
o Make sure SQL Server name (select @@servername) matches with hostname of the server.
o Make sure cluster service startup account is added as SQL Server login.
Error while trying to setup AlwaysOn availability group
Operating System Error 1265(The system cannot contact a domain controller to service the authentication request. Please try again later.).
– This error can occur if the password of inter-domain trust account is not synchronized on both sides of the trust relationship.
– Refer below link to see if it helps fixing the error, else engage your domain Administrator.
http://support.microsoft.com/kb/816577/en-us
Operating System Error 1265(The system cannot contact a domain controller to service the authentication request. Please try again later.).
– This error can occur if the password of inter-domain trust account is not synchronized on both sides of the trust relationship.
– Refer below link to see if it helps fixing the error, else engage your domain Administrator.
http://support.microsoft.com/kb/816577/en-us
This is applicable on below versions of SQL Server
SQL Server 2012
SQL Server 2014
SQL Server 2014
Possible owners Windows 2008 R2 Cluster
In the Microsoft advice for applying a SQL Server 2008 R2 service pack on a cluster, it states that it is very important to “remove nodes from the possible owner nodes list in the failover cluster instance.” This was frustrating, because I didn’t know what that meant. Plus, I had trouble finding this in the Cluster Manager tool in Windows Server 2008 R2.
One problem I have in understanding this was that there are many possible services where you can remove Possible Owners. I chose to only do this on the cluster name, but maybe you would want to remove possible owners on the SQL Server service. Some more clarity is needed in the Microsoft documentation (obviously, otherwise I wouldn’t have had to write this blog post).
Eventually I poked around enough to find the Possible Owners check boxes. Under services and applications, navigate to Services and Applications, then SQL Server (MSSQLSERVER).
In the right-hand pane, right click on the server name, and select Properties.
Go to the Advanced Policies tab, and remove the possible owner, according to the instructions from Microsoft. Continue to follow the directions from Microsoft to install the service pack.
Preferred Owners are a different thing, and that is done in the left pane by right-clicking on SQL Server (MSSQLSERVER), then the preferred owners check boxes will show up on the General tab.
go to properties and check advanced policies and choose possible owners
Preferred Owner from Windows Server 2012:
Cluster Steps:
Disable firewall on all 4 (clusterDC,node1,2,3)
disable UAC on all 4
Add.net framework 3.5 on all 4
Add failover cluster feature on all 3 servers(node1,2,3)
update windows software
set timezone on all 4 (time zone should be same for all)
Need to change Computer name
Add application role on all 3 servers (node1,2,3)
Create Cluadmin on all 3
add Cluadmin to admin group
add IPADDRESS on all 4
Add additional adapter for Node1,2,3
we need to restart all server which changed ipaddress
add features on DC to add ACTIVE DOMAIN SERVICES
PROMOTE add domain userpc.local
Logged in with Domain\administrator ON DC
In DC set reverse lookup zone
join all 3 nodes to DC domain
Restart all 3 nodes
add cluadmin as domain controller
use active directory users and groups
Logged in with Domain\cluadmin on all 3
all 3 each nodes use iscsi initiator to use SHARE STORAGE
Tools-->iscsi initiator
use computer management --> disk managementon
all 3 nodes each to bring db online
drive letters to be same all 3 nodes (ISCSI drives should be same on node 1,2,3)
node1
create windows cluster from failover wizard
we need to give cluseripaddress
configure quorum
Sample IP Details:
DC Domain Controller
NODE01
NODE02
You cannot add a domain controller as a node in a Windows Server 2012 failover cluster environment
Ref:
https://support.microsoft.com/en-in/help/2795523/you-cannot-add-a-domain-controller-as-a-node-in-a-windows-server-2012
https://support.microsoft.com/en-in/help/2795523/you-cannot-add-a-domain-controller-as-a-node-in-a-windows-server-2012
When you build a Windows Server 2012 failover cluster environment, you cannot add a server that has the Active Directory Domain Services (AD DS) role as a node.
We do not support combining the AD DS role and the failover cluster feature in Windows Server 2012.
Although we do not recommend this, you can enable domain controllers as a cluster node in Windows Server versions earlier than Windows Server 2012. However, starting with Windows Server 2012, we no longer support this configuration.
Positive site, Lots of useful information here . This is what readers need to Know and Glad I found your Channel! This is something I have been considering for a year now. . I like it, thank you for showing your model and mindset on Windows Versions . You are the best because you are providing so much of value.
ReplyDelete