A logical partition, commonly called an LPAR, is a subset of a computer's hardware resources, virtualized as a separate computer. In effect, a physical machine can be partitioned into multiple logical partitions, each hosting a separate operating system.
The terms PR/SM and LPAR are often used interchangeably.
PR/SM is a type-1 hypervisor that runs directly on the hardware. PR/SM allocates system resources and allows multiple logical partitions to share physical resources such as CPUs, direct access storage devices (DASD), and memory.
Changing resource allocations without restarting the logical partition is called dynamic logical partitioning.
LPARs safely allow combining multiple test, development, quality assurance, and production work on the same server, offering advantages such as lower costs, faster deployment, and more convenience.
Friday, December 3, 2010
Tuesday, November 30, 2010
Enhanced VMotion Compatibility (EVC)
Enhanced VMotion Compatibility (EVC) simplifies VMotion compatibility issues across CPU generations. EVC automatically configures server CPUs with Intel FlexMigration or AMD-V Extended Migration technologies to be compatible with older servers.
After EVC is enabled for a cluster in the VirtualCenter inventory, all hosts in that cluster are configured to present identical CPU features and ensure CPU compatibility for VMotion.
vCenter Server does not permit the addition of hosts that do not provide support for EVC into an EVC-enabled cluster. To use EVC, you must be running ESX 3.5 Update 2 or higher with VirtualCenter 2.5 Update 2 or higher and have compatible processors in your servers. EVC does not allow for migration with VMotion between Intel and AMD processors.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005764
What is EVC?
EVC is short for Enhanced VMotion Compatibility. EVC allows you to migrate virtual machines between different generations of CPUs.
What is the benefit of EVC?
Because EVC allows you to migrate virtual machines between different generations of CPUs, with EVC you can mix older and newer server generations in the same cluster and be able to migrate virtual machines with VMotion between these hosts. This makes adding new hardware into your existing infrastructure easier and helps extend the value of your existing hosts.
How do I use EVC?
EVC is enabled for a cluster in the VirtualCenter or vCenter Server inventory. After it is enabled, EVC ensures that migration with VMotion is possible between any hosts in the cluster. Only hosts that preserve this property can be added to the cluster.
Does EVC allow AMD and Intel CPUs to be VMotion compatible?
No. An EVC-enabled cluster only allows CPUs from a single vendor in the cluster. VirtualCenter and vCenter Server do not allow you to add a host from a different vendor into an EVC-enabled cluster.
What is the difference between EVC and the old CPUID masking feature (accessed from the Virtual Machine Settings dialog box, Options tab, CPUID mask option)?
The older masking feature involved applying manual masks to individual virtual machines. EVC takes effect on a whole cluster and all virtual machines in the cluster. More accurately, EVC affects the hosts themselves, making all the hosts in the cluster appear to be the same type of CPU hardware, even if they are different.
What happens when a host is removed from an EVC-enabled cluster?
When a host leaves an EVC-enabled cluster, it reverts to its normal behavior. New virtual machines started on that host can access all the features of the CPU, and are not limited by the EVC mode that was in effect while the host was in the EVC cluster. Note that virtual machines that were once able to migrate to the host might no longer be permitted to do so.
Can I add an ESX/ESXi 3.5 Update 1 or earlier host to an EVC-enabled cluster?
No. EVC is supported only on ESX/ESXi 3.5 Update 2 and later.
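The idea that every host in an EVC cluster presents the same CPU features, and that only single-vendor hosts able to honor that baseline may join, can be pictured with a small conceptual model. This is only a sketch: the Host class, the host names, and the choice of feature flags are illustrative assumptions, and real EVC works with predefined per-generation baselines rather than arbitrary feature sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for an ESX host; not a VMware API object."""
    name: str
    vendor: str
    features: frozenset


def can_join(host, cluster_vendor, baseline):
    """A host may join only if it matches the cluster's CPU vendor
    and offers every feature in the cluster baseline."""
    return host.vendor == cluster_vendor and baseline <= host.features


def visible_features(host, baseline):
    """Inside the cluster, features beyond the baseline are hidden,
    so every host looks like the same type of CPU to its VMs."""
    return host.features & baseline


# Illustrative baseline and hosts (assumed values).
baseline = frozenset({"sse2", "sse3", "ssse3"})
old_intel = Host("esx-old", "Intel", frozenset({"sse2", "sse3", "ssse3"}))
new_intel = Host("esx-new", "Intel",
                 frozenset({"sse2", "sse3", "ssse3", "sse4_1", "sse4_2"}))
amd_host = Host("esx-amd", "AMD",
                frozenset({"sse2", "sse3", "ssse3", "sse4a"}))

for h in (old_intel, new_intel, amd_host):
    print(h.name, can_join(h, "Intel", baseline))
```

In this model the newer Intel host joins but exposes only the baseline features, while the AMD host is rejected regardless of its feature set, which mirrors the single-vendor rule described in the FAQ.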
All VM files
Detailed list of all the VM files
*.nvram file - This file contains the CMOS/BIOS for the VM. The BIOS is based on PhoenixBIOS 4.0 Release 6, one of the most successful and widely used BIOSes, and is compliant with all the major standards, including USB, PCI, ACPI, 1394, WfM and PC2001. If the NVRAM file is deleted or missing it will automatically be re-created when the VM is powered on. Any changes made to the BIOS via the Setup program (F2 at boot) will be saved in this file. This file is usually less than 10K in size and is in binary format, not text.
vmdk files - These are the disk files that are created for each virtual hard drive in your VM. There are 3 different types of files that use the vmdk extension:
o *flat.vmdk file - This is the actual raw disk file that is created for each virtual hard drive. Almost all of a .vmdk file's content is the virtual machine's data, with a small portion allotted to virtual machine overhead. This file will be roughly the same size as your virtual hard drive.
o *.vmdk file - This is not the file containing the raw data. Instead it is the disk descriptor file, which describes the size and geometry of the virtual disk file. This file is in text format and contains the name of the flat.vmdk file with which it is associated, as well as the hard drive adapter type, drive sectors, heads and cylinders, etc. One of these files will exist for each virtual hard drive that is assigned to your virtual machine. You can tell which flat.vmdk file it is associated with by opening the file and looking at the Extent Description field.
o *delta.vmdk file - This is the differential file created when you take a snapshot of a VM (also known as a REDO log). When you snapshot a VM it stops writing to the base vmdk and starts writing changes to the snapshot delta file. The snapshot delta will initially be small and then start growing as changes are made to the base vmdk file. The delta file is a bitmap of the changes to the base vmdk, thus it can never grow larger than the base vmdk. A delta file will be created for each snapshot that you create for a VM. These files are automatically deleted when the snapshot is deleted or reverted in snapshot manager.
*.vmx file - This file is the primary configuration file for a virtual machine. When you create a new virtual machine and configure its hardware settings, that information is stored in this file. This file is in text format and contains entries for the hard disk, network adapters, memory, CPU, ports, power options, etc. You can either edit these files directly if you know what to add, or use the VMware GUI (Edit Settings on the VM), which will automatically update the file.
*.vswp file - This is the VM swap file (earlier ESX versions had a per-host swap file) and is created to allow for memory overcommitment on an ESX server. The file is created when a VM is powered on and deleted when it is powered off. By default when you create a VM the memory reservation is set to zero, meaning no memory is reserved for the VM and it can potentially be 100% overcommitted. As a result, a vswp file is created equal to the amount of memory that the VM is assigned minus the memory reservation that is configured for the VM. So a VM that is configured with 2GB of memory will create a 2GB vswp file when it is powered on; if you set a memory reservation of 1GB, it will only create a 1GB vswp file. If you specify a 2GB reservation then it creates a 0 byte file that it does not use. When you do specify a memory reservation, physical RAM from the host will be reserved for the VM and not usable by any other VMs on that host. A VM will not use its vswp file as long as physical RAM is available on the host. Once all physical RAM on the host is used by its VMs and the host becomes overcommitted, VMs start to use their vswp files instead of physical memory. Since the vswp file is a disk file, it will affect the performance of the VM when this happens. If you specify a reservation and the host does not have enough physical RAM when the VM is powered on, the VM will not start.
*.vmss file - This file is created when a VM is put into Suspend (pause) mode and is used to save the suspend state. It is basically a copy of the VM's RAM and will be a few megabytes larger than the maximum RAM allocated to the VM. If you delete this file while the VM is in a suspended state, the VM will start from a normal boot instead of resuming from the state it was in when it was suspended. This file is not automatically deleted when the VM is brought out of Suspend mode. Like the vswp file, this file will only be deleted when the VM is powered off (not rebooted). If a vmss file exists from a previous suspend and the VM is suspended again, the previous file is re-used for the subsequent suspensions. Also note that if a vswp file is present it is deleted when a VM is suspended and then re-created when the VM is powered on again. The reason for this is that the VM is essentially powered off in the suspend state; its RAM contents are just preserved in the vmss file so it can be quickly powered back on.
*.log file - This is the file that keeps a log of the virtual machine activity and is useful in troubleshooting virtual machine problems. Every time a VM is powered off and then back on a new log file is created. The current log file for the VM is always vmware.log. The older log files are incremented with a -# in the filename and up to 6 of them will be retained (e.g. vmware-4.log). The older .log files can be deleted at will; the latest .log file can be deleted when the VM is powered off. As the log files do not take much disk space, most administrators let them be.
*.vmxf file - This is a supplemental configuration file in text format for virtual machines that are in a team. Note that the .vmxf file remains if a virtual machine is removed from the team. Teaming virtual machines is a VMware Workstation feature and includes the ability to designate multiple virtual machines as a team, which administrators can then power on and off, suspend and resume as a single object, making it particularly useful for testing client-server environments. This file still exists with ESX server virtual machines but only for compatibility purposes with Workstation.
*.vmsd file - This file is used to store metadata and information about snapshots. This file is in text format and will contain information such as the snapshot display name, uid, disk file name, etc. It is initially a 0 byte file until you create your first snapshot of a VM; from that point it is populated and updated whenever new snapshots are taken. This file is not cleaned up completely after snapshots are deleted. Once you delete a snapshot, the fields for that snapshot remain in the file; the uid is incremented and the name is set to Consolidate Helper, presumably to be used with Consolidated Backups.
*.vmsn file - This is the snapshot state file, which stores the exact running state of a virtual machine at the time you take that snapshot. This file will be either small or large depending on whether you choose to preserve the VM's memory as part of the snapshot. If you do choose to preserve the VM's memory then this file will be a few megabytes larger than the maximum RAM allocated to the VM. This file is similar to the vmss (Suspend) file. A vmsn file will be created for each snapshot taken on the VM; these files are automatically deleted when the snapshot is removed.
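The vswp sizing rule described above (configured memory minus the memory reservation) is simple arithmetic, and a short sketch makes the three cases from the text easy to check. The function name and the use of megabytes are assumptions made for illustration; this is not a VMware tool or API.

```python
def vswp_size_mb(configured_mb, reservation_mb=0):
    """Expected .vswp size in MB: configured memory minus the reservation.

    Hypothetical helper for illustration only.
    """
    if reservation_mb < 0 or reservation_mb > configured_mb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_mb - reservation_mb


# The three cases from the text, for a VM configured with 2GB of memory.
for reservation in (0, 1024, 2048):
    print(reservation, vswp_size_mb(2048, reservation))
```

With no reservation the swap file matches the configured memory, a 1GB reservation halves it, and a full reservation leaves a zero-size file.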
Understanding Snapshots
A good rule of thumb is to allow for disk space of at least 25% of the virtual machine's (VM's) total disk size. But this amount can vary depending upon the type of server, how long you keep the snapshots, and whether you plan on using multiple snapshots. If you plan on including the memory state with your snapshots, you'll also need to allow for extra disk space equal to the amount of RAM assigned to the VM.
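As a rough worked example of this rule of thumb, the sketch below adds 25% of the total disk size to the VM's RAM when the memory state is included. The function name, parameter names, and the sample VM sizes are assumptions; the 25% figure is only the starting estimate given above, not a guarantee.

```python
def snapshot_space_estimate_gb(total_disk_gb, ram_gb=0.0,
                               include_memory=False, fraction=0.25):
    """Rule-of-thumb space to set aside for snapshots, in GB.

    fraction defaults to the 25% guideline from the text; adjust it for
    busy servers or long-lived snapshots.
    """
    space = total_disk_gb * fraction
    if include_memory:
        # Memory-state snapshots need extra space equal to the VM's RAM.
        space += ram_gb
    return space


# Hypothetical VM: 100 GB of virtual disk and 4 GB of RAM.
print(snapshot_space_estimate_gb(100))
print(snapshot_space_estimate_gb(100, ram_gb=4, include_memory=True))
```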
A VM with only one snapshot requires no extra disk space when deleting, or committing, it. (The term committing is used because the changes saved in the snapshot's delta files are now committed to the original virtual machine disk file, or VMDK.) But if you have multiple snapshots, you will need extra disk space available when deleting all snapshots. This is because of the way they are merged back into the original disk file.
An alternate method of deleting multiple snapshots is to delete them one by one, starting with the snapshots farthest down the snapshot tree. This way, each snapshot grows individually as it is merged into the previous snapshot and is then deleted. Although a little more tedious, this method requires far less extra disk space.
Important: Don't run a Windows disk defragmentation while the VM has a snapshot running. Defragment operations change many disk blocks and can cause very rapid growth of snapshot files.
If you've created a snapshot of a VM, and run the VM, the snapshot is active. If a snapshot is active, the performance of the VM will be degraded because ESX writes to delta files differently and less efficiently than it does to standard VMDK files. Because there is a lock on the metadata, nothing else can be written to the delta file when a write is made to the disk. Also, each time the delta file grows by another 16 MB increment, it causes another metadata lock. This can affect your VMs and ESX hosts. How big an impact this has on performance depends on how busy your VM and ESX hosts are.
Never expand a disk file with a snapshot running. If you do expand a virtual disk using vmkfstools while a snapshot is active, the VM will no longer start and you will receive an error: "Cannot open the disk ".vmdk" or one of the snapshot disks it depends on. Reason: The parent virtual disk has been modified since the child was created."
delta.vmdk file - This is the differential file created when you take a snapshot of a VM (also known as a REDO log). When you snapshot a VM it stops writing to the base vmdk and starts writing changes to the snapshot delta file. The snapshot delta will initially be small and then start growing as changes are made to the base vmdk file. The delta file is a bitmap of the changes to the base vmdk, thus it can never grow larger than the base vmdk. A delta file will be created for each snapshot that you create for a VM. These files are automatically deleted when the snapshot is deleted or reverted in snapshot manager.
A snapshot can never grow larger than the original disk.
vSphere 4 Guided Consolidation
Guided Consolidation is a tool that allows you to monitor a physical computer and determine its potential for being added to your virtual environment.
It has an easy to use interface with a more simplified approach than using the full VMware Capacity Planner utility.
Once the tool begins to scan the system, it will gather data and display CPU information and utilization, memory information and utilization, and the computer name. It may take up to 1 hour before this process begins. Be prepared to allow 24 to 48 hours for this process to complete, as Guided Consolidation builds a confidence metric. Once a high enough confidence level is reached, the status of the system changes to "Ready for consolidation".
At this point, a consolidation plan is available by selecting the analyzed computer and clicking the "Plan Consolidation" button.
This plan will include a star rating that identifies how suitable the physical computer is for virtualization, and will even recommend a target host. The rating system ranges from 1 to 5 stars, with 5 stars indicating the system is a strong candidate for the proposed host. From the Consolidation Wizard, you can change several things: the name of the VM, the host it is assigned to, or even remove a VM from the list altogether.
Thursday, November 18, 2010
Difference between SAN and NAS
NAS | SAN |
NAS uses TCP/IP Networks: Ethernet, FDDI, ATM (perhaps TCP/IP over Fibre Channel someday) | SAN uses Fibre Channel. |
NAS uses File Server Protocols: NFS, CIFS, HTTP. | SAN uses Encapsulated SCSI. |
Almost any machine that can connect to the LAN (or is interconnected to the LAN through a WAN) can use NFS, CIFS or HTTP protocol to connect to a NAS and share files. | Only server class devices with SCSI Fibre Channel can connect to the SAN. The Fibre Channel of the SAN has a limit of around 10km at best. |
A NAS identifies data by file name and byte offsets, transfers file data or file meta-data (file's owner, permissions, creation date, etc.), and handles security, user authentication, and file locking. | A SAN addresses data by disk block number and transfers raw disk blocks. |
A NAS allows greater sharing of information especially between disparate operating systems such as Unix and NT. | File Sharing is operating system dependent and does not exist in many operating systems. |
File System managed by NAS head unit | File System managed by servers |
Backups and mirrors (utilizing features like NetApp's Snapshots) are done on files, not blocks, for a savings in bandwidth and time. A Snapshot can be tiny compared to its source volume. | Backups and mirrors require a block by block copy, even if blocks are empty. A mirror machine must be equal to or greater in capacity compared to the source volume. |
Wednesday, October 6, 2010
Service Console Memory, a common misunderstanding (ESX 4.0+)
ESX 4.x hosts – the default amount of RAM is dynamically configured to a value between 300MB and 800MB, depending on the amount of RAM that is installed in the host. For example, if the host has 32GB of memory the service console RAM will be set to 500MB, while a host which has 128GB of RAM will see the service console RAM set to about 700MB. The maximum has not changed from 800MB, which would be seen on hosts with 256GB of RAM or higher, if it is being dynamically allocated.
ESX Host – 8GB RAM -> Default allocated Service Console RAM = 300MB
ESX Host – 16GB RAM -> Default allocated Service Console RAM = 400MB
ESX Host – 32GB RAM -> Default allocated Service Console RAM = 500MB
ESX Host – 64GB RAM -> Default allocated Service Console RAM = 602MB
ESX Host – 96GB RAM -> Default allocated Service Console RAM = 661MB
ESX Host – 128GB RAM -> Default allocated Service Console RAM = 703MB
ESX Host – 256GB RAM -> Default allocated Service Console RAM = 800MB
Tuesday, October 5, 2010
Jumbo Frames
In computer networking, jumbo frames are Ethernet frames with more than 1500 bytes of payload. Conventionally, jumbo frames can carry up to 9000 bytes of payload, but variations exist and some care must be taken when using the term. Many Gigabit Ethernet switches and Gigabit Ethernet network interface cards support jumbo frames, but all Fast Ethernet switches and Fast Ethernet network interface cards support only standard-sized frames.
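To make the difference concrete, the sketch below counts how many Ethernet frames are needed to carry a given amount of payload at the standard 1500-byte size versus the conventional 9000-byte jumbo size. The function name is hypothetical, and the per-frame overhead of 18 bytes (14-byte Ethernet header plus 4-byte frame check sequence, ignoring preamble, inter-frame gap, VLAN tags, and any IP/TCP headers) is a simplifying assumption.

```python
ETH_OVERHEAD_BYTES = 18  # assumed: 14-byte header + 4-byte FCS only


def frames_needed(payload_bytes, max_payload):
    """Number of frames required to carry payload_bytes (ceiling division)."""
    if payload_bytes < 0 or max_payload <= 0:
        raise ValueError("payload must be >= 0 and max_payload > 0")
    return -(-payload_bytes // max_payload)


def framing_overhead_bytes(payload_bytes, max_payload):
    """Total header/FCS bytes added under the simplified overhead model."""
    return frames_needed(payload_bytes, max_payload) * ETH_OVERHEAD_BYTES


one_gib = 1024 ** 3
for size in (1500, 9000):
    print(size, frames_needed(one_gib, size),
          framing_overhead_bytes(one_gib, size))
```

Fewer frames for the same data means fewer headers on the wire and fewer per-frame processing events on the hosts and switches involved, which is the usual motivation for enabling jumbo frames end to end.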
Super jumbo frames (SJFs) are generally considered to be Internet packets which have a payload in excess of the tacitly accepted jumbo frame size of 9000 bytes.
Monday, October 4, 2010
VMware vCloud Director (vCD)
VMware vCloud Director is a new abstraction layer. vCloud Director not only abstracts and pools resources, it also adds a self-service portal. It is more or less bolted on top of vCenter/ESX(i).
As stated before, vCD abstracts resources which are managed by vCenter. Below each of the resource types I have mentioned what it links to on a vSphere layer so that it makes a bit more sense:
Compute
- clusters and resource pools
Network
- dvSwitches and/or portgroups
Storage
- VMFS datastores and NFS shares
As a vCD Administrator you can use the vCD portal to carve up these resources as required and assign these to a customer or department, often referred to in vCD as an “Organization”.
In order to carve up these resources a container will need to be created, and this is what we call a Virtual Datacenter. There are two different types of Virtual Datacenters:
Provider Virtual Datacenter (Provider vDC)
A Provider Virtual Datacenter is the foundation for your Compute Resources. When creating a Provider Virtual Datacenter you will need to select a resource pool; this can also be the root resource pool, aka your vSphere cluster. At the same time you will need to associate a set of datastores with the Provider vDC; generally speaking this will be all LUNs masked to your cluster. Some describe the Provider vDC as the object where you specify the SLA, and I guess that explains the concept a bit more.
Organization Virtual Datacenter (Org vDC)
After you have created a Provider vDC you can create an Org vDC and tie that Org vDC to a vCD Organization. Please note that an Organization can have multiple Org vDCs associated to it.
To summarize, vCD offers a self-service portal. This portal enables you to provision resources to a tenant and enables the tenant to consume these resources by creating vApps. A vApp is a container for one or more virtual machines and can contain isolated networks.
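One way to picture the relationships described above is as a small data model: a Provider vDC wraps a resource pool and a set of datastores, an Org vDC is carved out of a Provider vDC, and an Organization can hold several Org vDCs. The class names, field names, and sample values below are all hypothetical and chosen for illustration; they are not the vCloud Director API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProviderVDC:
    """Foundation for compute: a resource pool plus associated datastores."""
    name: str
    resource_pool: str      # may be the cluster's root resource pool
    datastores: List[str]


@dataclass
class OrgVDC:
    """A slice of a Provider vDC that is tied to one Organization."""
    name: str
    provider: ProviderVDC


@dataclass
class Organization:
    """A customer or department; may own multiple Org vDCs."""
    name: str
    org_vdcs: List[OrgVDC] = field(default_factory=list)


# Illustrative values only.
gold = ProviderVDC("gold", "cluster-01/root", ["vmfs-lun-01", "vmfs-lun-02"])
finance = Organization("Finance")
finance.org_vdcs.append(OrgVDC("finance-prod", gold))
finance.org_vdcs.append(OrgVDC("finance-test", gold))

for vdc in finance.org_vdcs:
    print(finance.name, vdc.name, vdc.provider.name)
```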
Cisco's Unified Computing System
The Cisco Unified Computing System (UCS) is Cisco's multi-chassis blade computing system based on Intel Xeon 5500 processors and lossless 10Gb unified fabric (FCoE and 10Gb Ethernet functions). UCS incorporates simplified management and service profiles to facilitate dynamic system provisioning.
Compute. New Cisco UCS B-Series blades based on Intel's Xeon Nehalem processors. The blades' extended memory promises to support more virtual machines per server than standard memory does.
Network. The system supports "wire once" unified fabric over 10 Gbps Ethernet. That network foundation consolidates LANs, SANs and high-performance computing networks to reduce the number of network adapters, switches, and cables.
Storage access. Support for unified fabric so the system can access storage over Ethernet, Fibre Channel, Fibre Channel over Ethernet, or iSCSI.
Management. Cisco UCS Manager provides a graphical user interface (GUI), command line interface (CLI) and an application programming interface (API) for all components of the system.
Unified Service Delivery (USD) unites the datacenter and IP-NGN for service providers to create a common infrastructure from which services can be deployed in a secure virtualized fashion. The Unified Computing System is leveraged by USD to deliver physical and virtual servers required by a service.
Wednesday, September 29, 2010
HA Limitations
Limitations
HA in vSphere 4.1 has these limitations:
- 320 virtual machines per host
- 3,000 virtual machines per cluster
- 32 hosts per cluster
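A planned cluster layout can be checked against the three limits listed above with a few lines of code. This is a minimal sketch: the function name and the input format (a list holding the VM count of each host) are assumptions, and the numbers are the vSphere 4.1 figures quoted in this post.

```python
# Limits as quoted above for HA in vSphere 4.1.
MAX_VMS_PER_HOST = 320
MAX_VMS_PER_CLUSTER = 3000
MAX_HOSTS_PER_CLUSTER = 32


def ha_limit_violations(vms_per_host):
    """Return a list of human-readable limit violations (empty if none).

    vms_per_host: list with the number of VMs on each host in the cluster.
    """
    problems = []
    if len(vms_per_host) > MAX_HOSTS_PER_CLUSTER:
        problems.append(f"{len(vms_per_host)} hosts exceeds "
                        f"{MAX_HOSTS_PER_CLUSTER} per cluster")
    total = sum(vms_per_host)
    if total > MAX_VMS_PER_CLUSTER:
        problems.append(f"{total} VMs exceeds "
                        f"{MAX_VMS_PER_CLUSTER} per cluster")
    for index, count in enumerate(vms_per_host):
        if count > MAX_VMS_PER_HOST:
            problems.append(f"host {index}: {count} VMs exceeds "
                            f"{MAX_VMS_PER_HOST} per host")
    return problems


# Hypothetical layout: 10 hosts with 100 VMs each.
print(ha_limit_violations([100] * 10))
```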
Managed services
Managed services is the practice of transferring day-to-day management responsibility to another party as a strategic method for improving the effectiveness and efficiency of operations. The person or organization who owns or has direct oversight of the organization or system being managed is referred to as the offerer, client, or customer. The person or organization that accepts and provides the managed service is regarded as the service provider.
Thursday, September 23, 2010
VMotion HA & FT
VMotion is for a situation where everything works, and continues working. It lets you move a VM to another ESX host, for load balancing or host evacuation. There are no dropped connections and no reboot. However, as soon as an ESX host isn't there anymore (HW crash, ...), forget about VMotion; it can't help you anymore.
HA is for a situation where an ESX goes down. The VMs that were running there go down with it, instantly. Other ESXes in the same cluster will react by restarting the crashed VMs. This involves a reboot of the OS, maybe a file system check, and the starting of the application. Obviously all connections that existed to the VM originally are dropped.
FT is for a situation where an ESX goes down. A protected VM runs on one ESX, but has an identical twin (secondary "shadow" VM) on another ESX. If the primary VM goes down because of an ESX crash (HW crash, ...), the secondary will instantly become primary and continue the workload. There's no reboot, no dropped connections. There are a lot of limitations to FT, including max 1 CPU in the VM, etc.
What is VMware Fault Tolerance?
VMware Fault Tolerance is a feature that allows a new level of guest redundancy. The feature is enabled on a per virtual machine basis.
What happens when I turn on Fault Tolerance?
In very general terms, a second virtual machine is created to work in tandem with the virtual machine you have enabled Fault Tolerance on. This virtual machine resides on a different host in the cluster, and runs in virtual lockstep with the primary virtual machine. When a failure is detected, the second virtual machine takes the place of the first one with the least possible interruption of service.
How do I tell if my environment is ready for Fault Tolerance?
The VMware SiteSurvey Tool is used to check your environment for compliance with VMware Fault Tolerance.
What happens during a failure?
When a host running the primary virtual machine fails, a transparent failover occurs to the corresponding secondary virtual machine. During this failover, there is no data loss or noticeable service interruption. In addition, VMware HA automatically restores redundancy by restarting a new secondary virtual machine on another host. Similarly, if the host running the secondary virtual machine fails, VMware HA starts a new secondary virtual machine on a different host. In either case there is no noticeable outage by an end user.
What is the logging time delay between the Primary and Secondary Fault Tolerance virtual machines?
The actual delay is based on the network latency between the Primary and Secondary. vLockstep executes the same instructions on the Primary and Secondary, but because this happens on different hosts, there could be a small latency, but no loss of state. This is typically less than 1 ms. Fault Tolerance includes synchronization to ensure that the Primary and Secondary are synchronized.
In a cluster with more than 3 hosts, can you tell Fault Tolerance where to put the Fault Tolerance virtual machine or does it choose on its own?
You can place the original (or Primary) virtual machine. You have full control with DRS or VMotion to assign it to any node. The placement of the Secondary, when created, is automatic based on the available hosts. But once the Secondary is created and placed, you can VMotion it to the preferred host.
What happens if the host containing the primary virtual machine comes back online (after a node failure)?
This node is put back in the pool of available hosts. There is no attempt to start or migrate the primary to that host.
Is the failover from the primary virtual machine to the secondary virtual machine dynamic or does Fault Tolerance restart a virtual machine?
The failover from primary to secondary virtual machine is dynamic, with the secondary continuing execution from the exact point where the primary left off. It happens automatically with no data loss, no downtime, and little delay. Clients see no interruption. After the dynamic failover to the secondary virtual machine, it becomes the new primary virtual machine. A new secondary virtual machine is spawned automatically.
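The role changes described in these answers can be traced with a toy model: if the primary's host fails, the secondary is promoted and a new secondary is spawned elsewhere; if the secondary's host fails, only a new secondary is needed. Everything here is an illustrative assumption, including the function name, the host names, and the choice of simply taking the first spare host; in a real cluster the placement of the new secondary is decided by VMware HA.

```python
def fail_host(primary, secondary, failed, hosts):
    """Return the (primary, secondary) host pair after `failed` goes down.

    Toy model only: the new secondary lands on the first spare host,
    whereas real placement is determined by HA.
    """
    if failed not in (primary, secondary):
        return primary, secondary  # FT pair is unaffected
    spares = [h for h in hosts if h not in (primary, secondary, failed)]
    if not spares:
        raise RuntimeError("no host left to run a new secondary")
    if failed == primary:
        # Secondary takes over as primary; a new secondary is spawned.
        return secondary, spares[0]
    # Secondary's host failed: primary keeps running, new secondary spawned.
    return primary, spares[0]


hosts = ["esx1", "esx2", "esx3"]
print(fail_host("esx1", "esx2", "esx1", hosts))
print(fail_host("esx1", "esx2", "esx2", hosts))
```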
Does Fault Tolerance support Intel Hyper-Threading Technology?
Yes, Fault Tolerance does support Intel Hyper-Threading Technology on systems that have it enabled. Enabling or disabling Hyper-Threading has no impact on Fault Tolerance.
http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&externalId=1013428
Additional
When VMware FT is enabled for a virtual machine ("the primary"), a second instance of the virtual machine (the "secondary") is created by live-migrating the memory contents of the primary using VMware® VMotion™. Once live, the secondary virtual machine runs in lockstep and effectively mirrors the guest instruction execution of the primary.
If either the primary or secondary dies, a new secondary is spawned and is placed on the candidate host determined by HA. The candidate host determined by HA may not be an optimal placement for balancing; however, one can manually VMotion either the primary or the secondary virtual machine to a different host as needed.
VMware Fault Tolerance is a feature that allows a new level of guest redundancy, The feature is enabled on a per virtual machine basis .
What happens when I turn on Fault Tolerance?
In very general terms, a second virtual machine is created to work in tandem with the virtual machine you have enabled Fault Tolerance on. This virtual machine resides on a different host in the cluster, and runs in virtual lockstep with the primary virtual machine. When a failure is detected, the second virtual machine takes the place of the first one with the least possible interruption of service.
How do I tell if my environment is ready for Fault Tolerance?
The VMware SiteSurvey Tool is used to check your environment for compliance with VMware Fault Tolerance.
What happens during a failure?
When a host running the primary virtual machine fails, a transparent failover occurs to the corresponding secondary virtual machine. During this failover, there is no data loss or noticeable service interruption. In addition, VMware HA automatically restores redundancy by restarting a new secondary virtual machine on another host. Similarly, if the host running the secondary virtual machine fails, VMware HA starts a new secondary virtual machine on a different host. In either case there is no noticeable outage by an end user.
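The failure handling described above can be sketched as a toy model. This is only an illustration of the roles, not how VMware implements it: the `FTPair` class, the `handle_host_failure` function and the host names are all made up for the example, and picking the first free host is a placeholder for the placement decision that HA actually makes.

```python
from dataclasses import dataclass


@dataclass
class FTPair:
    """Toy model of a Fault Tolerance pair: which host runs which copy."""
    primary_host: str
    secondary_host: str


def handle_host_failure(pair, failed_host, available_hosts):
    """Return the pair after a host failure, with redundancy restored.

    available_hosts stands in for the candidates HA would choose from.
    """
    if failed_host not in (pair.primary_host, pair.secondary_host):
        return pair  # failure elsewhere in the cluster; nothing to do

    # If the primary's host died, the secondary takes over as the new primary.
    # If the secondary's host died, the primary simply keeps running.
    survivor = (pair.secondary_host if failed_host == pair.primary_host
                else pair.primary_host)

    candidates = [h for h in available_hosts
                  if h not in (failed_host, survivor)]
    if not candidates:
        raise RuntimeError("no host left to run a new secondary")

    # Placeholder choice (assumption): the real placement is decided by HA.
    return FTPair(primary_host=survivor, secondary_host=candidates[0])
```

In both failure cases the pair ends up with a running primary and a fresh secondary on a third host, which is the "restores redundancy" step in the answer above.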
What is the logging time delay between the Primary and Secondary Fault Tolerance virtual machines?
The actual delay is based on the network latency between the Primary and Secondary. vLockstep executes the same instructions on the Primary and Secondary, but because this happens on different hosts, there could be a small latency, but no loss of state. This is typically less than 1 ms. Fault Tolerance includes synchronization to ensure that the Primary and Secondary are synchronized.
In a cluster with more than 3 hosts, can you tell Fault Tolerance where to put the Fault Tolerance virtual machine or does it choose on its own?
You can place the original (Primary) virtual machine yourself. You have full control with DRS or VMotion to assign it to any node. The placement of the Secondary, when created, is automatic based on the available hosts. Once the Secondary has been created and placed, you can VMotion it to the preferred host.
What happens if the host containing the primary virtual machine comes back online (after a node failure)?
This node is put back in the pool of available hosts. There is no attempt to start or migrate the primary to that host.
A VMware HA Cluster consists of nodes, primary and secondary nodes. Primary nodes hold cluster settings and all "node states" which are synchronized between primaries. Node states hold for instance resource usage information. In case that vCenter is not available the primary nodes will have a rough estimate of the resource occupation and can take this into account when a fail-over needs to occur. Secondary nodes send their state info to the primary nodes.
Nodes send a heartbeat to each other, which is the mechanism to detect possible outages. Primary nodes send heartbeats to primary nodes and secondary nodes. Secondary nodes send their heartbeats to primary nodes only. Nodes send out these heartbeats every second by default. However this is a changeable value: das.failuredetectioninterval. (Advanced Settings on your HA-Cluster)
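A minimal sketch of heartbeat-based failure detection, to make the idea concrete. It is not the AAM agent's code: the `HeartbeatMonitor` class and its `timeout_s` parameter are hypothetical, and the timeout value used by a real cluster is not given here, so the caller has to supply one.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (illustrative only)."""

    def __init__(self, timeout_s):
        # timeout_s is an assumed parameter: how long a node may stay
        # silent before it is suspected to have failed.
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record(self, node, now_s):
        """Note that a heartbeat from `node` arrived at time `now_s`."""
        self.last_seen[node] = now_s

    def suspected_failed(self, now_s):
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now_s - seen > self.timeout_s)
```

Timestamps are passed in rather than read from the clock so the logic can be exercised without waiting for real time to pass.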
The first 5 hosts that join the VMware HA cluster are automatically selected as primary nodes. All the others are automatically selected as secondary nodes. When you do a reconfigure for HA, the primary and secondary nodes are selected again, at random. The vCenter client does not show which host is a primary and which is not. This however can be revealed from the Service Console:
cat /var/log/vmware/aam/aam_config_util_listnodes.log
Another method of showing the primary nodes is:
/opt/vmware/aam/bin/Cli (ftcli on earlier versions)
AAM> ln
The limit of 5 is a soft limit, so you can manually add a 6th primary, but this is not supported.
To promote a node:
/opt/vmware/aam/bin/Cli (ftcli on earlier versions)
AAM> promotenode
To demote a node:
/opt/vmware/aam/bin/Cli (ftcli on earlier versions)
AAM> demotenode
The promotion of a secondary host only occurs when a primary host is either put in "Maintenance Mode", disconnected from the cluster, removed from the cluster or when you do a reconfigure for HA. If all primary hosts fail simultaneously, no HA-initiated restart of the VMs will take place. HA needs at least one primary host to restart VMs. This is why you can only take four host failures into account when configuring HA.
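The reasoning behind the four-host figure can be checked with a small model. This is a sketch under the assumptions stated in the text (five primaries, chosen by join order, at least one primary needed to restart VMs); the function names and host names are invented for the example.

```python
MAX_PRIMARIES = 5  # the soft limit described above


def assign_roles(hosts_in_join_order):
    """The first five hosts to join become primaries, the rest secondaries."""
    primaries = list(hosts_in_join_order[:MAX_PRIMARIES])
    secondaries = list(hosts_in_join_order[MAX_PRIMARIES:])
    return primaries, secondaries


def can_restart_vms(primaries, failed_hosts):
    """HA can restart VMs only while at least one primary is still alive."""
    return any(host not in failed_hosts for host in primaries)
```

With eight hosts, losing any four still leaves at least one primary, so restarts remain possible. Losing the five primaries at the same moment leaves none, even though three secondaries are still running.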
You will need at least one primary because the "fail-over coordinator" role will be assigned to this primary; this role is also described as "active primary". The fail-over coordinator coordinates the restart of VMs on the remaining primary and secondary hosts. The coordinator takes restart priorities into account. Keep in mind that when two hosts fail at the same time, it will handle the restarts sequentially. In other words, it restarts the VMs of the first failed host (taking restart priorities into account) and then restarts the VMs of the host that failed second (again taking restart priorities into account). If the fail-over coordinator fails, one of the other primaries will take over.
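The sequential restart behaviour can be illustrated with a short sketch. The data layout and the `restart_order` function are hypothetical, and the convention that a lower number means a higher restart priority is an assumption made for the example, not something HA exposes in this form.

```python
def restart_order(failed_hosts):
    """Return VM names in the order a coordinator would restart them.

    failed_hosts: list of (host_name, [(vm_name, priority), ...]) tuples,
    given in the order in which the hosts failed. A lower priority number
    is restarted earlier (assumed convention).
    """
    order = []
    for _host, vms in failed_hosts:
        # Finish all VMs of one failed host, by priority, before
        # moving on to the next failed host.
        for vm_name, _priority in sorted(vms, key=lambda item: item[1]):
            order.append(vm_name)
    return order
```

Note that a low-priority VM from the first failed host is restarted before a high-priority VM from the second one, which is the consequence of handling hosts one after the other.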
das.isolationaddress[x] – IP address the ESX host uses to check for isolation when no heartbeats are received, where [x] = 1‐10. VMware HA will use the default gateway as an isolation address and the provided value as an additional checkpoint. It is recommended to add an isolation address when a secondary service console is being used for redundancy purposes.
Power off – When a network isolation occurs all VMs are powered off. It is a hard stop.
Shut down – When a network isolation occurs all VMs running on that host are shut down via VMware Tools. If this is not successful within 5 minutes a "power off" will be executed.
Leave powered on – When a network isolation occurs on the host the state of the VMs remains unchanged.
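The three isolation responses above can be written down as a small decision function. This is only a sketch of the described behaviour: the `IsolationResponse` enum, the `action_for_vm` function and the returned strings are made up for illustration and do not correspond to any VMware API.

```python
from enum import Enum


class IsolationResponse(Enum):
    POWER_OFF = "power off"
    SHUT_DOWN = "shut down"
    LEAVE_POWERED_ON = "leave powered on"


SHUTDOWN_TIMEOUT_S = 5 * 60  # the five-minute limit mentioned above


def action_for_vm(response, shutdown_elapsed_s=0, guest_stopped=False):
    """Return what happens to one VM on a host that has become isolated."""
    if response is IsolationResponse.LEAVE_POWERED_ON:
        return "none"
    if response is IsolationResponse.POWER_OFF:
        return "hard power off"
    # SHUT_DOWN: try a clean guest shutdown first, fall back to a hard stop.
    if guest_stopped:
        return "none"
    if shutdown_elapsed_s >= SHUTDOWN_TIMEOUT_S:
        return "hard power off"
    return "guest shutdown via VMware Tools"
```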
http://www.yellow-bricks.com/vmware-high-availability-deepdiv/#HA-primariesandsecondaries
vSphere New Feature "FT"
HA, or High Availability, ensures that if one of your hosts dies (poof, gone, power failure, hardware failure, network failure, etc.) vCenter will detect the failure, take the VMs that were running on the failed host, and power them up on another host in the cluster. Prior to HA, if your host went down, those VMs stayed down until you repaired the host or manually registered them on another host and powered them up. HA powers them up automatically in the event of a host failure.
FT, or Fault Tolerance, takes the concept of HA to a new level. Setting FT on a VM causes a standby VM to be set up. That VM is updated constantly so that in the case of a host failure, the standby VM immediately assumes processing. So, from the guest OS perspective there is no power failure. The transition from primary FT VM to standby FT VM is nearly instantaneous. This ensures that there is no downtime for the FT VM.
The Machine SID Duplication Myth
An article by Mark Russinovich explaining why NewSID is irrelevant:
The reason that I began considering NewSID for retirement is that, although people generally reported success with it on Windows Vista, I hadn’t fully tested it myself and I got occasional reports that some Windows component would fail after NewSID was used. When I set out to look into the reports I took a step back to understand how duplicate SIDs could cause problems, a belief that I had taken on faith like everyone else. The more I thought about it, the more I became convinced that machine SID duplication – having multiple computers with the same machine SID – doesn’t pose any problem, security or otherwise. I took my conclusion to the Windows security and deployment teams and no one could come up with a scenario where two systems with the same machine SID, whether in a Workgroup or a Domain, would cause an issue. At that point the decision to retire NewSID became obvious.
http://blogs.technet.com/b/markrussinovich/archive/2009/11/03/3291024.aspx