Blog: Networking

During troubleshooting it is often necessary to see what traffic is being passed between two networks or two hosts. The ASA software now features a built-in packet capture tool.

Below are the steps you need to take:

For the sake of this tutorial, let’s assume that we are troubleshooting traffic between a host with the address of 192.168.1.1 and a host with the address of 10.10.10.1.

Step 1. Define the traffic that you are interested in seeing via an ACL named “cap”: [more]

ASA(config)#access-list cap extended permit ip host 192.168.1.1 host 10.10.10.1
ASA(config)#access-list cap extended permit ip host 10.10.10.1 host 192.168.1.1
ASA(config)#access-list cap extended permit icmp host 192.168.1.1 host 10.10.10.1
ASA(config)#access-list cap extended permit icmp host 10.10.10.1 host 192.168.1.1

Step 2. Create and start the packet capture process named “capin”:

ASA(config)#capture capin access-list cap
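
Note that depending on your ASA version, you may also need to tell the capture which interface to listen on; assuming the interface facing 192.168.1.1 is named "inside", it would look something like this:

ASA(config)#capture capin interface inside access-list cap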

Step 3. Generate some traffic between the two hosts.

Since our ACL in this case is set to match all IP and ICMP traffic between the two hosts, we can just start a simple ping between them.

From the host 192.168.1.1:
ping 10.10.10.1
From the host 10.10.10.1:
ping 192.168.1.1

Step 4. Analyze the packet capture.

ASA#show capture capin
*This will output all of the traffic that it captured.

Step 5. Turn off the packet capture and remove the ACL:

ASA(config)#no capture capin
ASA(config)#clear configure access-list cap

Miscellaneous notes/commands:

You can clear the capture log by using this command:
ASA#clear capture capin

You can also use the pipe functionality when viewing the capture output:
ASA#show capture capin | inc 192.168.1.1

This can also be done via the ASDM, but what fun is that?


 

Vista/Windows7/Windows Server 2008 introduce a new format for the administrative templates used by group policies.  Instead of replicating proprietary ADM files with the group policies, you now create a “central repository” for the ADMX (XML format) administrative templates.  Stored with the ADMX files are language-specific ADML files (in a subdirectory).  The trick here is that once you create the central repository, Windows 2008 group policy editors can no longer see the old ADM files, so if you have settings you wish to edit you have to create a complementary ADMX template.  Further, older OS versions cannot read the ADMX files, so you have to be careful to perform a cutover.  Either use ADM files, or use ADMX files and edit the group policies only on newer OS versions. [more]
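
As a rough sketch (the domain name is just an example), creating the central repository amounts to copying the ADMX files from a Vista/2008 machine into SYSVOL, with the language-specific ADML files in their own subfolder:

xcopy %SystemRoot%\PolicyDefinitions\*.admx \\example.com\SYSVOL\example.com\Policies\PolicyDefinitions\ /I /Y
xcopy %SystemRoot%\PolicyDefinitions\en-US\*.adml \\example.com\SYSVOL\example.com\Policies\PolicyDefinitions\en-US\ /I /Y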

ADMX step-by-step guide:  http://technet.microsoft.com/en-us/library/cc709647%28WS.10%29.aspx

“New Windows Vista–based or Windows Server 2008–based policy settings can be managed only from Windows Vista–based or Windows Server 2008–based administrative machines running Group Policy Object Editor or Group Policy Management Console.”

“Group Policy Object Editor on Windows Server 2003, Windows XP, or Windows 2000 machines will not display new Windows Vista Administrative Template policy settings that may be enabled or disabled within a GPO.”

Inside ADM and ADMX Templates for Group Policy

Win Server 2008 Directory Services, Group Policy Templates


 

We have noticed some problems when SQL grabs all the memory on a machine and leaves no memory for other processes. This is especially true if you are running multiple instances (named instances) of SQL Server on the same box. There is an MSDN article that describes the issue and the steps to remedy the problem (http://msdn.microsoft.com/en-us/library/ms178067.aspx). Here is a blurb from that article: [more]

Running Multiple Instances of SQL Server

When you are running multiple instances of the Database Engine, there are three approaches you can use to manage memory:

  • Use max server memory to control memory usage. Establish maximum settings for each instance, being careful that the total allowance is not more than the total physical memory on your machine. You might want to give each instance memory proportional to its expected workload or database size. This approach has the advantage that when new processes or instances start up, free memory will be available to them immediately. The drawback is that if you are not running all of the instances, none of the running instances will be able to utilize the remaining free memory.
  • Use min server memory to control memory usage. Establish minimum settings for each instance, so that the sum of these minimums is 1-2 GB less than the total physical memory on your machine. Again, you may establish these minimums proportionately to the expected load of that instance. This approach has the advantage that if not all instances are running at the same time, the ones that are running can use the remaining free memory. This approach is also useful when there is another memory-intensive process on the computer, since it would insure that SQL Server would at least get a reasonable amount of memory. The drawback is that when a new instance (or any other process) starts, it may take some time for the running instances to release memory, especially if they must write modified pages back to their databases to do so. You may also need to increase the size of your paging file significantly.
  • Do nothing (not recommended). The first instances presented with a workload will tend to allocate all of memory. Idle instances or instances started later may end up running with only a minimal amount of memory available. SQL Server makes no attempt to balance memory usage across instances. All instances will, however, respond to Windows Memory Notification signals to adjust the size of their buffer pools. As of Windows Server 2003 SP1, Windows does not balance memory across applications with the Memory Notification API. It merely provides global feedback as to the availability of memory on the system.

You can change these settings without restarting the instances, so you can easily experiment to find the best settings for your usage pattern.
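
For example, capping one instance at 4 GB looks like this (the value is purely illustrative; run it against each instance with whatever split you settle on):

sp_configure 'show advanced options', 1;
RECONFIGURE;
sp_configure 'max server memory (MB)', 4096;
RECONFIGURE;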


 

One of our new customers using VMware has not been happy with the performance of some of the virtual machines that were set up before they became our client. Specifically, a couple of their Citrix VMs and a SQL Server 2005 VM have been “sluggish since they were built,” according to the IT staff. I did some basic diagnostics on the SQL Server VM and it did seem to have some performance problems. However, since they had already bought a new physical server and started moving the databases to the new install, we didn’t spend much time trying to make the VM run well. We decided to upgrade to VMware vSphere 4.1 before attempting to address any of the performance issues, since less-than-stellar performance on virtual terminal servers was normal on VMware 3.5.

During the upgrade, I needed to vMotion some of the VMs around to take down one of the ESX hosts. I kept getting a very generic error on several of the VMs and the migration would fail. I must have looked at every setting a dozen times until I finally just shut the VM down and opened up the VMX file to see what might be causing the issue. The problem was that the person who had built the VMs originally had included processor affinity settings in the VMX file. This binds a VM to a specific subset of the physical processors/cores on the ESX host. For example, the SQL Server VM was bound to cores 0 and 1. With this setting, ESX was forced to schedule cores 0 and 1 for all operations even though the server had 8 cores. Additionally, on ESX the service console has processor affinity on cores 0 and 1, and it holds the highest priority. So the SQL Server VM (and the other VM I found that was co-scheduled on cores 0 and 1) was fighting with the service console for processor cycles. After removing the processor affinity, the CPU wait time counter in vCenter for that VM dropped by a factor of six. I ended up finding 10-12 VMs with processor affinity set, so that explained why the performance was terrible.
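
For reference, the affinity setting in the VMX file looks roughly like this (the core numbers are just an example); deleting the line, or setting it back to "all", hands scheduling back to ESX:

sched.cpu.affinity = "0,1"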

The moral of this story is not to manually assign processor affinity when you configure ESX or virtual machines.  Chances are the ESX scheduler will do a much better job than any manual configuration you could put together.


 

Upon rebooting a Terminal Server that had resource issues, we could not log back into the server through RDP.  We could log in through iLO, and it was apparent that the logins were working but they were very slow.  Upon examining the services, we could see that the IPSEC service was not started. 

Trying to manually start the service gave the following popup: "Could not start the IPSEC Services service on Local Computer.  Error 2: The system cannot find the file specified."  The event logs also showed that TCP/IP was in blocking mode. 

Disabling the service and rebooting restored all network communication, but trying to start the service would drop all connectivity again and slow down the server.  I found another article that said IPSEC may need to be rebuilt.  When I looked for the registry keys for IPSEC, they were not there.  After I performed the following steps, the registry keys were populated and IPSEC was able to run properly.

To rebuild IPSEC, follow these steps (a command-line equivalent is sketched after the list): [more]

  1. Click Start, click Run, type regedit, and then click OK.
  2. In Registry Editor, locate and then click the following subkey: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\IPsec\Policy\Local.  (In my case, the IPsec subkey did not exist at all.  If that is the case, skip to step 6.)
  3. On the Edit menu, click Delete.
  4. Click Yes to confirm that you want to delete the subkey.
  5. Quit Registry Editor.
  6. Click Start, click Run, type regsvr32 polstore.dll, and then click OK.
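
If you would rather script it, the same rebuild can be sketched from the command line (the key path is the one from step 2; double-check it against your registry before deleting anything):

reg delete "HKLM\SOFTWARE\Microsoft\Windows\IPsec\Policy\Local" /f
regsvr32 polstore.dll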

 

A Windows 7 user was unable to connect to or display network documents and was notified that the system could not load the user profile. The problem appeared to be caused by a corrupt profile. I logged the user off, logged in as the administrator, and removed her profile. Then I tried logging in as the user to reload the profile, but Windows still could not load it. I decided to rebuild the profile, but when I attempted to, it failed again.  I put the backed-up profile back, tried it on another system, and the profile worked there.

After some research I found this issue may occur if the user profile was manually deleted by using the command prompt or by using Windows Explorer. A profile that is manually deleted does not remove the security identifier (SID) from the profile list in the registry. If the SID is present, Windows will try to load the profile by using the “ProfileImagePath” value, which points to a nonexistent path, so the profile cannot be loaded. The profile had not been manually deleted, but I decided to check anyway. I opened the registry editor, navigated to “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList”, and found a SID for the user. I deleted it from the registry, then logged in as the user and the profile loaded fine.
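
To find the stale entry quickly, you can list the ProfileImagePath values under ProfileList and look for the SID whose path points at the old or missing profile folder:

reg query "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList" /s /v ProfileImagePath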


 

Recently a user at a customer's site was having trouble sending email.  I ran a script that connected to each mail server and specified the sender and recipient to see if any would return errors.  One refused to accept the email because the reverse DNS lookup on the source IP failed.  So the lesson here is: if something does not work, figure out where it is broken and look at exactly what is going on in that broken part.  But wait - that's not the end of the story, because the user was sending email to 27 recipients and none of the messages were being delivered.

Mr. Peabody, set the WABAC machine to February 2004.  Microsoft has just published a paper "The Coordinated Spam Reduction Initiative". [more]

http://old.openspf.org/caller-id/csri.pdf

Section 11 is about Computational Puzzles for Spam Deterrence.  The idea is to have the computer sending email solve a puzzle that requires a lot of resources, usually CPU time, while verifying that solution is fast.  The goal was to make it expensive for spammers to send out spam.  I know this sounds silly now, with botnets having thousands of machines sending spam.  But did you know Microsoft actually implemented this in Outlook 2003?  And did you know it is still in Outlook 2007?  And did you know it is still in Outlook 2010?  It's called postmarking now, but it is still the same computational puzzle.  It is only used when Outlook thinks your email might look like spam.

Ok, so the way this works is that Outlook or your Exchange server generates the puzzle solution and adds it to the email headers.  It uses the header "x-cr-hashedpuzzle".  RFC 2821 (Simple Mail Transfer Protocol) states "The maximum total length of a text line including the <CRLF> is 1000 characters".  This x-cr-hashedpuzzle is quite long, so it is broken up into several lines.  The first line is 1000 characters, but the continuation lines have a <tab> inserted at the front, causing them to be 1001 characters long.  If this happens to be going through an ASA with ESMTP inspection enabled, it will send out resets to close the connection because it violates the RFC.

This is why the user I was working with could not send email to a list of 27 recipients. I removed the ESMTP inspection on our ASA (which I had been wanting to do anyway) to work around this.
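
Assuming the default policy and class names (global_policy and inspection_default), removing ESMTP inspection looks roughly like this:

ASA(config)#policy-map global_policy
ASA(config-pmap)#class inspection_default
ASA(config-pmap-c)#no inspect esmtp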


 

I have been looking for a multi-purpose network monitoring tool for use at several customer network locations and came across a lightweight app called Spiceworks. It's free and has a number of nice features. Here is a list of some of the most useful ones:

  • Schedulable network scans based on domain name or subnet
  • Asset classification based on collected data
  • Software inventory (including MS patches)
  • Basic service/system alerts & monitoring
  • User knowledge base portal w/integrated ticketing system
  • Warranty lookup for monitored assets
  • Antivirus tracking (no AV, outdated definitions, versions, etc.) for a number of popular packages

 

After converting a site to a new MPLS provider, I began to experience about 20% packet loss to that site.  There were a lot of things that changed during the migration:

  • Added GRE Tunnels
  • Implemented EIGRP to handle routing all of the LAN subnets
  • Restricted BGP to only handle the WAN, or MPLS, interfaces

These are the troubleshooting steps I took to narrow down the problem:

  1. Ping from the tunnel interface at the main site to the tunnel interface at the branch site.  0% Packet Loss
  2. Ping from the LAN port on router at the main site to the tunnel interface at the branch site. 0% Packet Loss
  3. Ping from the LAN port on router at the main site to the LAN port at the branch site. 0% Packet Loss
  4. Ping from a client at the main site to the tunnel interface at the branch site. 0% Packet Loss
  5. Ping from a client at the main site to a client at the branch site. ~20% Packet Loss
  6. Ping from the LAN port on router at the main site to a client at the branch site. ~20% Packet Loss
  7. Ping from the tunnel interface at the main site to a client at the branch site. ~20% Packet Loss

This process seemed to narrow the problem down to the branch site.  I checked for negotiation errors in the logs of the switch and the routers.  BGP appeared to be working fine because the peer was up and I was receiving all the routes that I expected.  The ping loss seemed to be very random.  I then decided to enable debugging on the router and start a continuous ping from a client at the main site to a client at the branch site.  I quickly noticed that every time I saw packet loss, I also saw a BGP error message being logged.  There were a few different error messages, and each caused a different amount of ping loss.

Apparently, the ping loss wasn’t as random as I thought!  I spoke with a coworker about a BGP turn-up he was doing for another customer, and he suggested adding a static route on the branch router for the BGP peer.  Everything began working!  So, to make a long story short, it is best to add a specific static route for a BGP peer if that peer isn’t directly connected, even if that static route has the same next hop as the default route.
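
As an illustration (the addresses are made up), the fix on the branch router boils down to a host route for the BGP peer pointing at the MPLS-facing next hop:

! 10.255.0.1 = BGP peer, 192.0.2.1 = next hop out the MPLS interface (both illustrative)
ip route 10.255.0.1 255.255.255.255 192.0.2.1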


 

I was attempting to install SEP on a server that was local to me, but remote to the SEP manager. The problem here is that the SEP manager generates a 90 MB package before pushing it out to the machine and starting the install. This would’ve taken a good bit of time to copy over the VPN to the server here, so I decided to take a different approach. I had the installation media for an unmanaged copy of SEP, which I installed on the server. From there, I opened the SEP manager, went to Clients, and exported the communication settings into a file I named “sylink.xml”. Then I copied the sylink.xml file to the server here and opened the SEP client. Inside Help & Support, I clicked Troubleshooting and then imported the communication settings. This tells the client where to look for management. After waiting for a minute or two, I went back into Troubleshooting and saw that the client was looking in the correct location for the server policies.