Revision: $Revision: 499 $
A candidate should be able to identify and correct common network setup issues to include knowledge of locations for basic configuration files and commands.
Key files, terms and utilities include:
/sbin/ifconfig |
/sbin/route |
/bin/netstat |
/etc/network or /etc/sysconfig/network-scripts/ |
system log files such as /var/log/syslog and /var/log/messages |
/bin/ping |
/etc/resolv.conf |
/etc/hosts |
/etc/hosts.allow, /etc/hosts.deny |
/etc/hostname or /etc/HOSTNAME |
/sbin/hostname |
/usr/sbin/traceroute |
/usr/bin/nslookup |
/usr/bin/dig |
/bin/dmesg |
host |
The purpose of this chapter is not providing solutions to all possible network problems but more to point out that you have to do it methodically and with knowledge of the surrounding environment.
Knowledge of the surrounding environment is an absolute necessity because you can't solve a problem if you have no idea whatsoever where problems can occur. You have to know how you are connected to the network because every component that plays a role in facilitating your network access can fail.
The key files, terms and utilities will not be described in this chapter since most of them have already been described in Chapter 5, Networking Configuration (205) and other chapters.
Let's assume you're working on a PC which is connected to the Internet by means of a local area network and a firewall. Suddenly you can't view that web page anymore.
The first step in determining the problem is to make a list of the components that are involved, this could be something like:
+ your PC with eth0 network interface to the LAN
+ the firewall with eth0 interface to the LAN
and eth1 interface to the ISP
+ the site you are trying to reach
Then think about how everything works together. You enter the URL in the browser. Your machine uses DNS to find out what the IP address is of the web site you are trying to reach etc.
Packets travel through your eth0 interface over the LAN to the eth0 interface of the firewall and out the eth1 interface of the firewall to the ISP and from the ISP in some way to the web server.
Now that you know how the different components interact, you can take steps to determine the source of the malfunction.
The graphic below should not be treated as the way to solve all problems. The purpose of the graphic is to give an example of step-by-step troubleshooting given a certain situation which in this case is fictitious :)
The cause of the problem has been determined and can be S(olved).
Can we reach other machines on the internet ? Try another URL or try pinging another machine on the internet. Be careful with ping though, your firewall could be blocking ICMP echo-requests and replies.
Is the machine we are trying to reach, the target, down ? Try reaching the machine via another network, contact a friend and let him try to reach the machine, call the person responsible for the machine etc.
Can we reach the firewall? Try pinging the firewall, login to it etc.
Is there a HOP down ? Use traceroute to find out what the hops are between you and the target host. The route from my machine to LPI's web-server for instance can be determined by issuing the command traceroute -I www.lpi.org:
# traceroute -I www.lpi.org traceroute to www.lpi.org (209.167.177.93), 30 hops max, 38 byte packets 1 fertuut.snowgroningen (192.168.2.1) 0.555 ms 0.480 ms 0.387 ms 2 wc-1.r-195-85-156.essentkabel.com (195.85.156.1) 30.910 ms 26.352 ms 19.406 ms 3 HgvL-WebConHgv.castel.nl (195.85.153.145) 19.296 ms 28.656 ms 29.204 ms 4 S-AMS-IxHgv.castel.nl (195.85.155.2) 172.813 ms 199.017 ms 95.894 ms 5 f04-08.ams-icr-03.carrier1.net (212.4.194.13) 118.879 ms 84.262 ms 130.855 ms 6 g02-00.amd-bbr-01.carrier1.net (212.4.211.197) 30.790 ms 45.073 ms 28.631 ms 7 p08-00.lon-bbr-02.carrier1.net (212.4.193.165) 178.978 ms 211.696 ms 301.321 ms 8 p13-02.nyc-bbr-01.carrier1.net (212.4.200.89) 189.606 ms 413.708 ms 194.794 ms 9 g01-00.nyc-pni-02.carrier1.net (212.4.193.198) 134.624 ms 182.647 ms 411.876 ms 10 500.POS2-1.GW14.NYC4.ALTER.NET (157.130.94.249) 199.503 ms 139.083 ms 158.804 ms 11 578.ATM3-0.XR2.NYC4.ALTER.NET (152.63.26.242) 122.309 ms 191.783 ms 297.066 ms 12 188.at-1-0-0.XR2.NYC8.ALTER.NET (152.63.18.90) 212.805 ms 193.841 ms 94.278 ms 13 0.so-2-2-0.XL2.NYC8.ALTER.NET (152.63.19.33) 131.535 ms 131.768 ms 152.717 ms 14 0.so-2-0-0.TL2.NYC8.ALTER.NET (152.63.0.185) 198.645 ms 136.199 ms 274.059 ms 15 0.so-3-0-0.TL2.TOR2.ALTER.NET (152.63.2.86) 232.886 ms 188.511 ms 166.256 ms 16 POS1-0.XR2.TOR2.ALTER.NET (152.63.2.78) 153.015 ms 157.076 ms 150.759 ms 17 POS7-0.GW4.TOR2.ALTER.NET (152.63.131.141) 143.956 ms 146.313 ms 141.405 ms 18 akainn-gw.customer.alter.net (209.167.167.118) 384.687 ms 310.406 ms 302.744 ms 19 new.lpi.org (209.167.177.93) 348.981 ms 356.486 ms 328.069 ms
Can other machines in the network reach the firewall? Use ping, or login to the firewall from that machine or try viewing a web page on the internet from that machine.
Does the firewall block the traffic to that particular machine? Maybe someone doesn't want you to look at that particular site and has instructed the firewall to block traffic to and/or from that site.
Inspect the firewall. The problem seems to be on the firewall. Test the interfaces on the firewall, inspect the firewalling rules, check the cabling etc.
Is our eth0 interface up? This can be tested by issuing the command ifconfig eth0.
Are your route definitions as they should be? Think of things like default gateway. The route table can be viewed by issuing the command route -n.
Is there a physical reason for the problem? Check if the the problem is in the cabling. This could be a defective cable or a badly shielded one. Putting power supply cabling and data cabling through the same tube without metal shielding between the two of them can cause unpredictable, hard to reproduce, errors in the data transmission.