Monday, March 8, 2010

Yet another happy EqualLogic thread

http://episteme.arstechnica.com/eve/forums/a/tpc/f/833003030931/m/679006453041

Highlights

I've been playing the last couple of days.

The HIT kit and "Auto Snapshot Manager" tools seem to make it ridiculously easy to get hosts connected to the EQL and to take both LUN level and application (SQL/Exchange) aware snapshots of databases.

For example, I'm not a SQL admin/expert, but I put SQL on a test VM, created DB and Log LUNs on the EQL, created a test DB, took a DB-aware snapshot of it, deleted the original, and then restored it from within the ASM tools and watched it reappear in real time.

Maybe I'm easily impressed but I really do find this thing impressive given the cost/all-in licensing nature.

***

I've had EQ units for 5 years and was always happy with the improved functionality that came out with subsequent firmware updates. Even after Dell bought them, the SANHQ program moved light years ahead of the beta I had tested for almost a year.

***

The EqualLogic gear is super-easy to set up. I have a pair of them racked--just waiting for the switches to come in and I am golden.

Sunday, March 7, 2010

Important concept to understand with LH

http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1268007997457+28353475&threadId=1411427


Hello, we just purchased a LeftHand P4500 10.8 TB SAN. We created a Network RAID 1 volume (spanned across the two LeftHand nodes). We configured MPIO as documented and it correctly discovered 4 paths. However, we started to do failover tests by scheduling a power-off of one of the nodes (the node on which the Virtual Manager was running). Of course quorum is lost and access to the volume is lost, but we cannot manage to restore quorum by restarting the Virtual Manager on the other node. The CMC console asks us to delete the VM first, and to stop it beforehand, but it is marked as offline already. Could you please give me some advice about regaining quorum, or some documentation? Thanks.

Fran Garcia
Mar 2, 2010 09:10:24 GMT Unassigned
Hello Rodrigo :-) You need to configure a Failover Manager to achieve a resilient cluster. In order to have cluster quorum you need at least (n/2)+1 active nodes, and of course that cannot be done with a 2-node cluster. There is a FOM VMware appliance included in the LeftHand installation CD.
Mark...
Mar 2, 2010 09:53:31 GMT Unassigned
Hi, HP/LHN recommend a Failover Manager (FOM) at a third site, as mentioned above. With the FOM, your cluster should stay up with no disruption. You can use the Virtual Manager (VM), but the key with the VM is NOT to start it on a node: just create it. Then, in your config of two nodes, should one of the nodes fail, you start the VM on the remaining node. This keeps disruption to a minimum. If, as in your case, you have the VM started and it is on the node that crashes, you will not be able to start a new VM, as you have found out! Only one VM or FOM is allowed per management group.
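The majority rule quoted above, that you need at least (n/2)+1 active managers for quorum, is why a bare two-node cluster can never survive a node loss on its own. A minimal sketch of the arithmetic (generic majority-quorum math, not vendor code; the function name is mine):

```python
def has_quorum(total_managers: int, active_managers: int) -> bool:
    """True if the active managers form a strict majority of the total."""
    return active_managers >= total_managers // 2 + 1

# Two-node cluster, one node down: 1 of 2 is not a majority, so the
# volumes go offline.
print(has_quorum(2, 1))  # False

# Add a FOM (or Virtual Manager) as a third vote: now one storage node
# can fail and the surviving 2 of 3 keep quorum.
print(has_quorum(3, 2))  # True
```

This is exactly the situation in the question: with only two nodes and the Virtual Manager running on the node that was powered off, no combination of survivors could reach a majority.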

Thursday, March 4, 2010

Complexity of LH

This post (though take it with a grain of salt) points to possibly more complexity and lock-ups for LH.

Sorry, you will have to http://www.avivadirectory.com/bethebot/ in order to view it

http://www.experts-exchange.com/Hardware/Servers/Q_25200410.html

A couple of excerpts

Their units work well for the most part, but we've had far more issues over the past years with the units locking up than you'd expect, almost always due to incompatibilities between their OS and the underlying RAID controller. I believe at this point we've had to upgrade the controller on 3 different occasions in 2 years. The feature set from LeftHand is quite nice for the price, but it's missing a couple of things, like being able to manually move the cluster VIP between boxes (it's an automatic thing, so it only does it when you down a box).

Performance on the units is fairly good, but make sure you calculate what IOPS you'll get from the units versus what you need to run your servers and VMs off of them. Both vendors can give estimates of what the units will deliver, and what sort of numbers they'd expect in your environment.
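The IOPS sizing the poster recommends can be roughed out with the common RAID write-penalty rule of thumb. A hedged sketch (generic planning math, not EqualLogic or LeftHand specifications; the per-disk figure and workload mix below are illustrative assumptions):

```python
def frontend_iops(disks: int, iops_per_disk: float,
                  read_fraction: float, raid_write_penalty: int) -> float:
    """Estimate host-visible IOPS from raw spindle IOPS and workload mix."""
    raw = disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    # Each host write costs `raid_write_penalty` back-end I/Os
    # (rule of thumb: 2 for RAID 10, 4 for RAID 5); reads cost one.
    return raw / (read_fraction + write_fraction * raid_write_penalty)

# Example: 12 x 15k SAS disks at ~175 IOPS each, 70/30 read/write, RAID 10.
print(round(frontend_iops(12, 175, 0.70, 2)))  # ~1615 host-visible IOPS
```

Comparing a number like this against the aggregate IOPS your servers and VMs actually generate is the calculation the poster is urging you to do before buying.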

OR

Not a ton (we're still using them), but more than I feel we should have. The units don't fail over as gracefully as they should when one has a problem, so we've had to do a fair amount of manual work to get things back running when there's an issue.

Example: at our HQ, we're using 2 mirrored units to host a number of mid- to high-importance servers: a couple of VMs, the data store for a couple of important but low-use Linux servers, and a fileserver that hosts the folder redirection for laptops. When one unit hangs/locks up, it's supposed to fail over gracefully to the other unit, with no loss or connection drops. What actually happens is the fileserver dismounts the drive, and both the VMs' and the data store's file systems go into a read-only state. It's all fixable, but it causes significant scrambling to get things back to normal.

squigit, there are two ways to set up Network RAID on LeftHand: in one mode it stops if a node goes down, for a high level of data protection; in the other mode it keeps going. You can also migrate a LUN to another set of nodes by jiggling about with cluster membership. Maybe you just need to install the Failover Manager on a 3rd box to maintain quorum. Sounds like you need to go on the install/config course.
