Thursday, October 27, 2011

Exchange 2010 with NLB on HP ProCurve

Assuming you plan to implement Microsoft Exchange 2010 and have some budget left, what do you do?
We decided to support the project with an external Exchange-specialized consultant. So far so good.

After the design phase, we worked out the following Exchange installation plan:
- 2 mailbox servers in a database availability group (DAG)
- 2 combined Client Access (CAS) and Hub Transport (HT) servers with network load balancing (NLB)

The installation went more or less quickly and smoothly. But afterwards...

I'm picking out one (hopefully interesting) issue that we sorted out a few days later.

During an internal network scan, we were surprised by the following finding:
Broadcast traffic showed strong, intermittent spikes.

Further analysis showed that multicast traffic from NLB was flooding all ports within the same VLAN instead of only the designated ports.

How can this happen?
Cause #1: The Exchange specialist knew how to click through the NLB setup, but had no in-depth knowledge of it.
Cause #2: The network switch to which the Exchange NLB cluster was attached was an HP ProCurve.

Solution: A static MAC entry needs to be configured on the switch.

So we thought this could be fixed quite easily, but we were brought back down to earth very quickly.
When we tried to set the required static entry, the following error appeared:

 PROCURVE(config)# static 03bf0a-c8027e interface A2
 Value static is invalid.


It turned out that the HP ProCurve switches we used did not accept Microsoft NLB MAC addresses as static entries...
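For reference, NLB derives the cluster MAC address from the cluster IP address: a fixed prefix (02-BF in unicast mode, 03-BF in multicast mode, 01-00-5E-7F in IGMP multicast mode) followed by the IP octets. The following Python sketch illustrates that scheme. The function names are made up for this example, and the cluster IP 10.200.2.126 is simply what the MAC address in the error message above decodes to.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str = "multicast") -> str:
    """Return the NLB cluster MAC as 12 lowercase hex digits."""
    octets = ipaddress.IPv4Address(cluster_ip).packed  # the 4 address bytes
    if mode == "unicast":
        prefix, tail = b"\x02\xbf", octets
    elif mode == "multicast":
        prefix, tail = b"\x03\xbf", octets
    elif mode == "igmp":
        # IGMP multicast mode keeps only the last two octets of the IP.
        prefix, tail = b"\x01\x00\x5e\x7f", octets[2:]
    else:
        raise ValueError(f"unknown NLB mode: {mode}")
    return (prefix + tail).hex()


def procurve_notation(mac_hex: str) -> str:
    """Format 12 hex digits in the xxxxxx-xxxxxx style used on the ProCurve CLI."""
    return f"{mac_hex[:6]}-{mac_hex[6:]}"


print(procurve_notation(nlb_cluster_mac("10.200.2.126")))  # 03bf0a-c8027e
```

The leading 03 has the multicast (group) bit set, which is why a switch treats this address differently from an ordinary unicast station address.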


How to succeed?
Buy other switches, isolate the NLB cluster in its own VLAN, or enable the IGMP multicast option in NLB.



There is a sample configuration for Cisco switches available at vmware.com:
http://kb.vmware.com/kb/1006525

Thursday, October 6, 2011

Add a CustomAction to an MSI package

We used CustomActions to apply quick and easy (possibly dirty) changes just before the software installation finishes.

Start editing the MSI file (e.g. using Orca).
Best practice is not to edit an MSI directly, but to use a transform (MST) instead.

Open the CustomAction table.
Add a new row with the following values:

Action:  <unique name>
Type:    3110
         Type 38   = VBScript text stored in this sequence table.
         Type 3072 = Queues for execution at scheduled point within script. Executes with no user impersonation. Runs in system context.
         3072 + 38 = 3110
         http://msdn.microsoft.com/en-us/library/aa372048(VS.85).aspx
Source:  <empty>
Target:  <write a VBScript>
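The Type value is a sum of flag constants, so it can help to see how 3110 breaks down. The Python sketch below uses the numeric values of the Windows Installer msidbCustomActionType* constants. The Python names and the decode helper are made up for this illustration only.

```python
# Numeric values of the msidbCustomActionType* constants (Windows Installer).
VBSCRIPT = 0x006        # 6:    the action is VBScript
DIRECTORY = 0x020       # 32:   Source is empty/a directory, Target holds the script text
IN_SCRIPT = 0x400       # 1024: deferred, runs at its scheduled point in the script
NO_IMPERSONATE = 0x800  # 2048: no user impersonation, runs in system context

basic_type = VBSCRIPT | DIRECTORY        # 38
options = IN_SCRIPT | NO_IMPERSONATE     # 3072
custom_action_type = basic_type | options


def decode(type_value: int) -> dict:
    """Split a CustomAction Type value into the parts used in this post."""
    return {
        "base": type_value & 0x07,       # kind of action (6 = VBScript)
        "source": type_value & 0x30,     # where the code lives (0x20 = Target text)
        "in_script": bool(type_value & IN_SCRIPT),
        "no_impersonate": bool(type_value & NO_IMPERSONATE),
    }


print(basic_type, options, custom_action_type)  # 38 3072 3110
print(decode(3110))
```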

Example 1: Copying a file
Set FL = CreateObject("Scripting.FileSystemObject"):FL.CopyFile "\\server\share$\file.dat", "C:\Program Files\ExistingFolder\file.dat", TRUE

Example 2: Change registry settings
Set S = CreateObject("WScript.Shell"):X="HKLM\Software\JavaSoft\Java Update\Policy\":Y="REG_DWORD":S.RegWrite x&"EnableJavaUpdate",0,Y:S.RegWrite x&"EnableAutoUpdateCheck",0,Y:S.RegWrite x&"NotifyDownload",0,Y:S.RegWrite x&"NotifyInstall",0,Y:S.RegWrite x&"Frequency",0,Y:S.RegWrite x&"UpdateSchedule",0,Y

Switch to the InstallExecuteSequence table.
Add a new row with the following values:

Action:          <same name as used in CustomAction>
Condition:    NOT Installed
Sequence:   <number before InstallFinalize>
To figure out a working sequence number, sort the rows by sequence.
Find the "InstallFinalize" row and note its sequence number.
Subtract 1 from that value and check whether the result is already in use.
If not, you've found your sequence number.
If it is, renumber the neighboring rows until you have a free sequence number right before "InstallFinalize".
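The search described above can be sketched in a few lines of Python. This is only an illustration: the rows are sample data typed in by hand (not read from a real MSI), and free_sequence_before is a hypothetical helper name.

```python
def free_sequence_before(rows, anchor="InstallFinalize"):
    """Return anchor's sequence number minus 1 if it is unused, else None.

    rows is a list of (action, sequence) pairs as shown in the
    InstallExecuteSequence table.
    """
    by_action = dict(rows)
    used = set(by_action.values())
    candidate = by_action[anchor] - 1
    return candidate if candidate not in used else None


# Sample rows for illustration only; real packages have many more actions.
rows = [
    ("InstallInitialize", 1500),
    ("InstallFiles", 4000),
    ("InstallFinalize", 6600),
]
print(free_sequence_before(rows))  # 6599

# If the slot is taken, None signals that rows must be renumbered first.
print(free_sequence_before(rows + [("SomeOtherAction", 6599)]))  # None
```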

Wednesday, October 5, 2011

VMware VCB backup problem (with iSCSI LUN)

So you're using VMware Consolidated Backup (VCB) with iSCSI disks and the backup is no longer working?
First, check the output of a failed vcbMounter backup job:

[2011-06-30 10:17:24.315 'App' 5664 error] No path to device LVID:2bedb412-a987b654-1234-012a345b6cde/2bedb412-9bc87d6e-abcd-012a345b6cde/1 found.
[2011-06-30 10:17:24.315 'BlockList' 5664 error]
[2011-06-30 10:17:24.529 'vcbMounter' 5664 error] Error: Failed to open the disk: Cannot access a SAN/iSCSI LUN backing this virtual disk. (Hint: If you are using vcbMounter you can use the option "-m nbd" to switch to network based disk access if this is what you want.) If you were attempting file-level access, stop the vmount Service by typing "net stop vmount2" on a command prompt to force vmount to re-scan for SAN LUNs and re-try the command.
[2011-06-30 10:17:24.529 'vcbMounter' 5664 error] An error occurred, cleaning up

Executing the VCB SAN debug tool is another good idea to get more information:

C:\Program Files\VMware\VMware Consolidated Backup Framework>vcbsandbg
[2011-06-30 10:20:40.300 'App' 848 info] Current working directory: C:\Program Files\VMware\VMware Consolidated Backup Framework
[2011-06-30 10:20:40.316 'BaseLibs' 848 info] HOSTINFO: Seeing Intel CPU, numCoresPerCPU 1 numThreadsPerCore 2.
[2011-06-30 10:20:40.316 'BaseLibs' 848 info] HOSTINFO: This machine has 1 physical CPUS, 1 total cores, and 2 logical CPUs.
[2011-06-30 10:20:40.316 'App' 848 verbose] Building SCSI Device List...
[2011-06-30 10:20:40.378 'App' 848 trivia] Evaluating 1 paths.
[2011-06-30 10:20:40.378 'App' 848 trivia] Trying to open path \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}.
[2011-06-30 10:20:40.378 'App' 848 info] Now using Path \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}.
[2011-06-30 10:20:40.378 'App' 848 trivia] Reading 32256 bytes from offset 0.
[2011-06-30 10:20:40.394 'App' 848 trivia] Found 1 partition(s) on this device.
[2011-06-30 10:20:40.394 'App' 848 trivia] Evaluating 1 paths.
[2011-06-30 10:20:40.394 'App' 848 trivia] Trying to open path \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}.
[2011-06-30 10:20:40.394 'App' 848 info] Now using Path \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}.
[2011-06-30 10:20:40.394 'App' 848 trivia] Reading 32256 bytes from offset 0.
[2011-06-30 10:20:40.394 'App' 848 trivia] Found 1 partition(s) on this device.
[2011-06-30 10:20:40.394 'App' 848 error] Dumping SCSI Device/LUN List.
[2011-06-30 10:20:40.394 'App' 848 info] **** Begin SCSI Device LIst ****
[2011-06-30 10:20:40.394 'App' 848 info] Found SCSI Device: NAA:70a9800070654321098765432109876b5c432d102030
[2011-06-30 10:20:40.394 'App' 848 info] Visible on 1 paths:
[2011-06-30 10:20:40.394 'App' 848 info] Device Name: \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}, Bus: 0 Target: 0 Lun: 2
[2011-06-30 10:20:40.409 'App' 848 info] Lun does not contain any VMFS/LVM signatures.
[2011-06-30 10:20:40.409 'App' 848 info] Found SCSI Device: NAA:70a9800070654321098765432109876b5c432d102030
[2011-06-30 10:20:40.409 'App' 848 info] Visible on 1 paths:
[2011-06-30 10:20:40.409 'App' 848 info] Device Name: \\?\scsi#disk&ven_netapp__&prod_lun_____________&rev_7654#1&2abc3d45&6&000001#{23e45678-f9ab-12c3-45d6-01a2b34cde5f}, Bus: 0 Target: 0 Lun: 3
[2011-06-30 10:20:40.409 'App' 848 info] Lun does not contain any VMFS/LVM signatures.
[2011-06-30 10:20:40.409 'App' 848 info] **** End SCSI Device LIst ****

So VCB is no longer able to find the VMFS disk.
In Windows Disk Management, you will find the VMware disk in an Unallocated state.

Or in the vSphere Client (select a host, choose "Configuration", then "Storage", and change the view to "Devices"): no VMFS partition shows up there either.
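With many LUNs, the vcbsandbg output gets long. A small Python sketch like the following could list the LUNs that the tool reports as having no VMFS/LVM signature. It relies only on the two message formats visible in the output above; the sample text is abbreviated from that output (timestamps and device paths shortened), and the function name is made up.

```python
import re

LUN_RE = re.compile(r"Bus: (\d+) Target: (\d+) Lun: (\d+)")
NO_VMFS = "Lun does not contain any VMFS/LVM signatures"


def luns_without_vmfs(log_text: str) -> list:
    """Return (bus, target, lun) tuples flagged as lacking a VMFS signature."""
    missing = []
    current = None
    for line in log_text.splitlines():
        match = LUN_RE.search(line)
        if match:
            # Remember the most recently listed device.
            current = tuple(int(group) for group in match.groups())
        elif NO_VMFS in line and current is not None:
            missing.append(current)
            current = None
    return missing


# Abbreviated sample based on the output above.
SAMPLE = """\
['App' 848 info] Device Name: <device path>, Bus: 0 Target: 0 Lun: 2
['App' 848 info] Lun does not contain any VMFS/LVM signatures.
['App' 848 info] Device Name: <device path>, Bus: 0 Target: 0 Lun: 3
['App' 848 info] Lun does not contain any VMFS/LVM signatures.
"""
print(luns_without_vmfs(SAMPLE))  # [(0, 0, 2), (0, 0, 3)]
```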

What happened?
Most probably, the Windows diskpart automount feature (which is enabled by default) has written its own signature to the VMware disks.
http://technet.microsoft.com/en-us/library/cc753703(WS.10).aspx
By the way, it's generally a good idea to disable this feature (run "automount disable" in diskpart) on a server that is connected to iSCSI LUNs.

Solution: To change the disk signature back to VMFS, connect to the console of a vSphere host server.
Log in as 'root' and run the following commands.
(First, look for a disk that has no partition details listed.)
[root@host /]# fdisk -l
...
Disk /dev/sdd: 408.0 GB, 408063836160 bytes
255 heads, 63 sectors/track, 49610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 Device Boot      Start         End      Blocks   Id  System
                                                                                               <= missing information
[root@host /]#
[root@host /]# fdisk -u /dev/sdd
The number of cylinders for this disk is set to 49610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK):

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-614465535, default 63): 128
Last sector or +size or +sizeM or +sizeK (128-614465535, default 614465535):
Using default value 614465535

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fb
Changed system type of partition 1 to fb (VMware VMFS)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@host /]# vmkfstools -V
[root@host /]# 

For more information, visit the following site:
http://kb.vmware.com/kb/1002281

Attention: If you have several disks with missing VMFS signatures, change all disk signatures at the same time.
If you repair only one disk (e.g. for testing), you may have problems connecting to this 'repaired' disk.

Friday, September 2, 2011

Slow RDC / SQL performance using Windows 7 / 2008 R2

Are you experiencing slow remote desktop connections (RDC)?
Or is the database connection to your SQL Server not working properly?

Then have a look at the following Windows feature: Receive Window Auto-Tuning
http://support.microsoft.com/kb/934430/en-us

Right-click Command Prompt, and then click Run as Administrator.
To display the current setting, type the following command, and then press ENTER:

 netsh interface tcp show global

You will find "Receive Window Auto-Tuning Level" set to "normal".

To disable this feature, type the following command, and then press ENTER:

 netsh interface tcp set global autotuninglevel=disabled
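To check the setting on several machines, the output of the show command can be parsed from a script. The Python sketch below assumes an English-language Windows installation where the relevant line reads "Receive Window Auto-Tuning Level : <value>"; on localized systems the label differs. The function names and the sample text are made up for this example.

```python
import re
import subprocess


def parse_autotuning_level(netsh_output: str):
    """Extract the auto-tuning level from 'netsh interface tcp show global' output."""
    match = re.search(r"Receive Window Auto-Tuning Level\s*:\s*(\S+)", netsh_output)
    return match.group(1).lower() if match else None


def current_autotuning_level():
    """Run netsh (Windows only) and return the current auto-tuning level."""
    result = subprocess.run(
        ["netsh", "interface", "tcp", "show", "global"],
        capture_output=True, text=True, check=True,
    )
    return parse_autotuning_level(result.stdout)


# Assumed sample line for illustration; real output contains more settings.
sample = "Receive Window Auto-Tuning Level    : normal\n"
print(parse_autotuning_level(sample))  # normal
```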


Generally, it seems to be a good idea to disable this feature in a business environment.
I've seen this auto-tuning feature mess up network traffic too often.



For more information, visit the following site:
http://technet.microsoft.com/en-us/magazine/cc162519.aspx

dot1q - Configuration on Cisco/HP switches

Have you ever mixed Cisco and HP switches in a networking environment?
It's funny how similar features can be named differently, and vice versa (e.g. the meaning of 'trunk').

Now, assume you're planning to connect HP switches (4200 series) to Cisco switches (3500 series). The first steps for configuring such a 'trunk' on the Cisco switches are:



interface GigabitEthernet0/1
 switchport trunk encapsulation dot1q
 switchport mode trunk

exit 
interface GigabitEthernet0/2
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate

exit

It's recommended to disable Cisco's DTP signaling (using the nonegotiate option) when connecting to HP ProCurve switches.

By default, all VLANs are allowed to pass the Cisco trunk.
To manually specify the allowed VLANs (e.g. 1, 10 & 11), add the following command:
 switchport trunk allowed vlan 1,10,11
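With more than a couple of uplinks, it can be handy to generate the Cisco trunk stanzas instead of typing them. The following Python sketch only assembles the commands shown above into text; it does not talk to a switch, and the function name and parameters are made up for this example.

```python
def cisco_trunk_config(interfaces, allowed_vlans, nonegotiate=True):
    """Render dot1q trunk stanzas for the given Cisco interfaces."""
    vlan_list = ",".join(str(vlan) for vlan in allowed_vlans)
    lines = []
    for name in interfaces:
        lines.append(f"interface {name}")
        lines.append(" switchport trunk encapsulation dot1q")
        lines.append(" switchport mode trunk")
        if nonegotiate:
            # Disable DTP signaling, as recommended for links to HP ProCurve.
            lines.append(" switchport nonegotiate")
        lines.append(f" switchport trunk allowed vlan {vlan_list}")
        lines.append("exit")
    return "\n".join(lines)


print(cisco_trunk_config(["GigabitEthernet0/2"], [1, 10, 11]))
```

Review the generated text before pasting it into a switch session.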

On HP switches, ensure that the "Native VLAN" (usually VLAN1) is set as untagged on the uplink port (A2):
 vlan 1
 untagged A2
 exit

Checking the port status on Cisco:
show interface GigabitEthernet0/1 switchport


For more information, visit the following sites:

"Trunking" on Cisco switches
http://www.ciscopress.com/articles/article.asp?p=29803&seqNum=3

ProCurve / Cisco Interoperability Guide
http://www.tecnocael.it/ftp/docs/ProCurve_Cisco.pdf