There are known knowns; there are things we know that we know.
There are known unknowns; that is to say, there are things that we now know we don’t know.
But there are also unknown unknowns – there are things we do not know we don’t know.
(Donald Rumsfeld, United States Secretary of Defense).
Last July saw me work my last release weekend for quite some time. But that weekend was not without its challenges: we were attaching a new JBOD to our SQL Cluster (upgrading from 8-disk RAID 5 to 20-disk RAID 10 for the OLAP databases; really excited to see how they work out), amongst some other big changes. You can plan for everything except the Unknown Unknowns.
Attaching this JBOD to a DAS, which in turn is connected to a SQL Cluster, required a restart of both servers in the cluster. When the servers came back up, all the clustered services and dependencies were up and running, except for one SSAS Instance. This SSAS Instance ran on the “passive” node; that is, we have a primary node with a Clustered Role of SQLDB and SSAS, and we utilize the other node by running another Clustered Role of just SSAS. (Should one of the instances fail over, a single node has the capacity to run both instances without too much performance degradation until we get the other node working again. It’s a great way to avoid leaving redundant hardware idle.)
I failed the SSAS Instance over between nodes and the error still occurred. I checked the Event Viewer and the instance log, which read:
- The service cannot be started: The following system error occurred: An attempt was made to access a socket in a way forbidden by its access permissions.
Weird. This instance had been working fine since February, and we had failed over several times, though not in a while. After 3 hours of fruitless Googling, checking and re-checking, I opened up SSMS and tried to connect, just to see the error. As expected, it was the typical error when SSAS is not up and running:
- “Error with connecting analysis services through SSMS, No connection could be made because the target machine actively refused it 127.0.0.1:2383”
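A quick way to reproduce the "actively refused" check without opening SSMS is to probe the ports directly. The following is a minimal Python sketch, not something from the original troubleshooting session; the function name `is_port_listening` is hypothetical, and it only tells you whether something accepts TCP connections on the port, not whether that something is SSAS.

```python
import socket


def is_port_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 2383 is the default SSAS port, 2382 is the SQL Browser port.
    for port in (2382, 2383):
        state = "listening" if is_port_listening("127.0.0.1", port) else "not listening"
        print(f"127.0.0.1:{port} is {state}")
```

Run on the node that owns the SSAS role, a "not listening" result for 2383 matches the refusal SSMS reported.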
Even though both Clustered Roles can run on one machine, SSMS was still trying to connect on 2383, the default port, which is fine (see below for explanation). Earlier, I had tried starting SSAS from the command line using this command:
- "C:\Program Files\Microsoft SQL Server\MSAS11.SSAS\OLAP\bin\msmdsrv.exe" -s "H:\OLAP\Config"
The error was the same, but I thought it worth looking in the config file “msmdsrv.ini”. I noticed that the <Port> element was set to 2382. 2382? That is the port SQL Browser listens on, so it looks like someone had manually configured this instance to point at the SQL Browser. This would not work: the port was already taken, which explains the socket access error. In a clustered environment, SSAS can listen only on port 2383. Setting the value back to 0 means that the instance listens on the default port, 2383. If both SSAS instances are running on the same box, then connections will be directed to SQL Browser, which will dynamically assign a port to listen on. So I set it back to 0, and started the service successfully.
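To catch this kind of misconfiguration before the next failover, the port setting can be read straight out of the config file. Here is a minimal Python sketch, assuming msmdsrv.ini is XML with a port element directly under the root element; the function names and the sample XML are hypothetical, and the real file contains many more settings.

```python
import xml.etree.ElementTree as ET

SQL_BROWSER_PORT = 2382  # port used by SQL Browser, per the post
DEFAULT_SSAS_PORT = 2383  # default SSAS port, per the post


def read_ssas_port(ini_text: str) -> int:
    """Return the configured port from msmdsrv.ini content (0 if unset)."""
    root = ET.fromstring(ini_text)
    for child in root:
        # Match the tag case-insensitively rather than assume its casing.
        if child.tag.lower() == "port":
            text = (child.text or "").strip()
            return int(text) if text else 0
    return 0


def describe_port(port: int) -> str:
    """Summarise what a configured port value means, following the post."""
    if port == 0:
        return f"0: use the default behaviour (port {DEFAULT_SSAS_PORT})"
    if port == SQL_BROWSER_PORT:
        return f"{port}: clashes with SQL Browser, reset to 0"
    return f"{port}: manually configured port"


if __name__ == "__main__":
    # Hypothetical, heavily trimmed stand-in for the real file.
    sample = "<ConfigurationSettings><Port>2382</Port></ConfigurationSettings>"
    print(describe_port(read_ssas_port(sample)))
```

To check a real instance, read the file from its config folder (H:\OLAP\Config in this case) and pass its contents to `read_ssas_port`.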
I’d love to find the culprit, but probably never will. The important thing was that I could get the instance running.