Thursday, September 30, 2010

A Few Unknowns

DRS runs every 5 minutes to ensure per-host resource utilization stays below the threshold and is distributed evenly across the cluster. Starting with 4.1, DRS can manage FT-enabled VMs, provided EVC is enabled. If EVC is disabled, DRS will not move the primary or secondary VM for load-balancing purposes, but it can still do initial placement of the secondary VM.

Best Practices for DRS:

  • Leave some CPU headroom for vMotion tasks. When a vMotion is performed from one host to another, CPU is reserved on both the source and destination hosts. In vSphere 4.1 this reservation is 30% of a core for a 1 Gig NIC and 100% of a core for a 10 Gig NIC.
  • Keep a sufficient number of drmdump files. By default this directory is limited to 10 MB; for a large cluster one should consider increasing the directory size. drmdump files are needed by support personnel for troubleshooting DRS performance.

DPM uses three protocols (iLO, IPMI, and WoL) to power hosts off and on as needed. DPM can be disabled on individual hosts.

vCenter can now handle 500 concurrent tasks. By default most operations time out after 60 seconds, while Enter Maintenance Mode and Add Host operations time out after 120 seconds. By default a task is listed in the Recent Tasks pane for 10 minutes; after that it can be traced in the “Tasks & Events” tab. A total of 200 tasks can be listed in the Recent Tasks pane. Both of these values can be modified by editing the vpxd.cfg file.

vCenter 4.1 Important performance updates

  • Maximum number of ESX hosts per vCenter: 1,000
  • Maximum number of registered VMs per vCenter: 15,000
  • Maximum number of powered-on VMs per vCenter: 10,000
  • Maximum number of sessions (people) who can access vCenter via the vSphere Client: 100
  • Maximum hosts in a datacenter: 400
  • Maximum number of VMs in a cluster: 3,000
  • Maximum number of hosts in a cluster: 32
  • Maximum number of VMs per ESX host: 320

System Requirements for vCenter Server

  • Compute: dual-core, two-socket, 2.0 GHz
  • Memory: 3 GB
  • Disk: 3 GB
  • Network: 1 GbE NIC
  • OS: Windows 2003/2008 x64 ONLY

Design Considerations for vCenter

Performance of the vCenter server is directly affected by the number of hosts and powered-on VMs.

  • Medium-scale deployment (50 hosts, 500 VMs): 2 cores, 4 GB RAM, 5 GB disk
  • Large-scale deployment (300 hosts, 3,000 VMs): 4 cores, 8 GB RAM, 10 GB disk
  • Very-large-scale deployment (1,000 hosts, 10,000 VMs): 8 cores, 16 GB RAM, 10 GB disk
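The three tiers above can be turned into a small lookup: pick the smallest tier whose host and VM limits cover the planned inventory. The thresholds are taken from the notes; the function itself is an illustrative sketch, not an official VMware sizing tool.

```python
# vCenter 4.1 sizing tiers from the notes:
# (max_hosts, max_vms, cores, ram_gb, disk_gb, label)
TIERS = [
    (50, 500, 2, 4, 5, "medium"),
    (300, 3000, 4, 8, 10, "large"),
    (1000, 10000, 8, 16, 10, "very large"),
]

def vcenter_sizing(hosts, vms):
    """Pick the smallest tier whose limits cover the inventory."""
    for max_hosts, max_vms, cores, ram, disk, label in TIERS:
        if hosts <= max_hosts and vms <= max_vms:
            return {"tier": label, "cores": cores,
                    "ram_gb": ram, "disk_gb": disk}
    raise ValueError("inventory exceeds vCenter 4.1 maximums")

# 120 hosts exceeds the medium tier, so the large tier applies.
print(vcenter_sizing(120, 900))
```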

Host and vCenter Relationship

The host interacts with vCenter using two management agents: hostd and vpxa.

hostd: hostd is a daemon that starts when the host boots. It keeps a record of all transactions for host-level entities (e.g. VMs, datastores, networks) and acts as the implementer for vSphere API requests coming from vpxa. vCenter sends all host-level requests to the host over the network using SOAP; vpxa listens for these SOAP requests, receives them, and in turn dispatches them to hostd using the vCenter API. The vpxa agent is installed as part of adding a host to vCenter. Because vCenter communicates with hostd through this SOAP interface, one of the key contributors to operational latency is the number of network hops between vCenter and the ESX host: the more hops, the higher the latency.

Host Design Considerations:

  1. Virtual machine memory overhead reservation directly affects the number of VMs we can power on per host.
  2. Memory reservation for the vpxa and hostd agents. The maximum memory that can be reserved for the host agents is 1 GB. It can be configured at Host > Configuration > System Resource Allocation > Advanced > name of host agent.

A Few Unknowns to Me:

The heartbeat interval for HA is 1 second (das.failuredetectioninterval); it can be increased. If a host receives no heartbeat response from another host for 15 seconds (das.failuredetectiontime), it assumes that host has failed. Changes to these values take effect only after you disable and re-enable HA. For a high-latency network, you might wish to consider increasing the failure detection time. The heartbeat interval cannot be more than the failure detection value.
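The relationship between the two settings can be sketched in a few lines: heartbeats arrive every das.failuredetectioninterval milliseconds, and a host is declared failed after das.failuredetectiontime milliseconds of silence. The helper below is an illustrative sketch (not a VMware API), using the default values from the notes.

```python
# HA failure-detection defaults from the notes, in milliseconds
# (the units these advanced options are specified in).
FAILURE_DETECTION_INTERVAL_MS = 1000   # das.failuredetectioninterval
FAILURE_DETECTION_TIME_MS = 15000      # das.failuredetectiontime

def missed_heartbeats_before_failure(interval_ms=FAILURE_DETECTION_INTERVAL_MS,
                                     detection_ms=FAILURE_DETECTION_TIME_MS):
    """How many consecutive heartbeats must go unanswered before
    a host is considered failed."""
    # The interval may not exceed the detection window.
    if interval_ms > detection_ms:
        raise ValueError("heartbeat interval cannot exceed detection time")
    return detection_ms // interval_ms

print(missed_heartbeats_before_failure())      # 15 with the defaults
print(missed_heartbeats_before_failure(2000))  # 7 if the interval is raised to 2 s
```

Raising the interval on a high-latency network reduces false positives, but as the second call shows, fewer heartbeats then stand between a slow host and a failure verdict.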

You cannot vMotion, Storage vMotion, or use a vDS switch across datacenters.

You cannot start more than 32 VMs concurrently on a host when an HA failover occurs. This number (das.perhostconcurrentfailoverslimit) can be increased, but doing so will affect overall VM recovery time.

The maximum number of FT-enabled VMs per ESX host is only 4.