Documenting a Network?
Philip writes "Three years ago I was appointed as a network manager to a barely functioning MS-based network. Since then I've managed to get it up and running — even thriving — but have been guilty of being too busy with the doing of it to document the changes and systems that were put in place. Now as I look back, I'm worried that I am the only one who will ever know how this network works. If I get hit by a bus or throw in the towel for any reason, I'd be leaving behind a network that requires some significant expertise to run. Ultimately, this won't be a good reference for me if they are trying to work out technical details for years to come. It looks like I'm going to have to document the network with all sorts of details that outside consultants could understand too (no, I don't want to be the outside consultant), especially since it's likely that my replacement will have less technical expertise (read 'cheaper'). Are there any good templates out there for documenting networks? Is anyone who has done it before willing to share some experiences? What did you wish your predecessor had written down about a network that you inherited?"
Just go ahead and quit already (Score:0, Informative)
It's really not as complex of a network as you may think. Go ahead and quit - the sun will still rise and the email will still be delivered....
What about? (Score:5, Informative)
schematics (Score:5, Informative)
Basic network documentation:
I've found that starting out with the very basic physical layout and working your way up in complexity is greatly beneficial.
I.e., start out documenting network cable runs, including cable type. Follow that with the switch layout, then routers and VLAN setups, then the servers that provide basic network functionality (e.g. DHCP). If this is a Windows network, that would likely mean detailing the domain controller setups. From there, systematically document the systems in order of importance to the business.
Also, visual diagrams are extremely helpful.
Start with disaster scenarios (Score:5, Informative)
For me, Visio diagrams are great and everything, passwords too, but really the most valuable thing you can do is document single points of failure, outdated software/hardware, license keys and their expiration dates, cert expiration dates, any personal support contacts you have, and all vendor relationship details. Do you use change control? If you do, go back and comment your changes; if not, do the best you can at explaining why things are the way they are. Get some open source software that is good at indexing data and create a searchable knowledge base from the information above. Don't concentrate at first on docs that can be found on the web, because any admin worth their salt will know where to look for how-tos, etc. Focus on the whys, the wheres and disaster recovery.
My two cents...
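One way to keep the license and cert expiration dates mentioned above from getting lost is a small script next to the docs. This is only a rough sketch: the records, names, and dates are made up for illustration, and the 60-day window is an arbitrary assumption.

```python
from datetime import date, timedelta

# Hypothetical records; every name and date here is invented for illustration.
EXPIRING_ITEMS = [
    {"item": "mail.example.com TLS cert", "kind": "cert", "expires": date(2030, 3, 1)},
    {"item": "Backup software license", "kind": "license", "expires": date(2030, 1, 15)},
    {"item": "Firewall support contract", "kind": "contract", "expires": date(2031, 6, 30)},
]


def expiring_soon(items, today, window_days=60):
    """Return items already expired or expiring within window_days, soonest first."""
    cutoff = today + timedelta(days=window_days)
    due = [entry for entry in items if entry["expires"] <= cutoff]
    return sorted(due, key=lambda entry: entry["expires"])
```

Calling `expiring_soon(EXPIRING_ITEMS, today=date.today())` from a scheduled job and mailing the result would be one way to use it; how you store the real records (spreadsheet, wiki page, database) is up to you.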
Here is what I would get (Score:5, Informative)
1. Visio overview drawing of the network, with complex areas drawn out in their own detailed Visio diagrams (even a scanned sketch or Paint drawing is better than nothing)
2. A spreadsheet with circuit ID's mapped to router and interfaces.
3. Document the trunk interfaces as well as the LAGs (Link Aggregation Groups, port channels, whatever you want to call them)
4. TACACS passwords / domain logins in a secure location (or radius or diameter or whatever you use)
5. Data center capacity as a function of rack space, cooling capacity, and electrical load.
6. Write brief knowledge articles describing any problem areas and explaining a history of anything you think would be hard to figure out easily. No need to go hog wild, just re-brand the RCA documentation you have. You do have Root Cause Analysis right?
7. Network protocol hierarchy map. Where are your major redistribution points, what is your routing strategy etc.
8. If you have a voice network, document all your DIDs, PRI trunks, gatekeepers, dial plan, and any translations you use on H.323 gateways, or how MGCP or SIP is configured. If you have a complex call center you should probably pay to have it professionally documented down to the minute detail.
9. SSID's and BSSID's for any wireless you may have as well as passwords, 802.1x authentication methods, along with linking documentation.
10. Make the documentation part of your CMR process (Change Management Review) and incorporate it into the time allotted for a change.
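The spreadsheet from item 2 can be as plain as a CSV file, which also plays nicely with version control. A minimal sketch follows; the column names, circuit IDs, carriers, and device names are all invented for illustration and not taken from any real network.

```python
import csv
import io

# Hypothetical sample data; in practice this would live in a .csv file.
CIRCUITS_CSV = """circuit_id,carrier,router,interface,purpose
CKT-0001,ExampleTel,edge-rtr-1,Serial0/0,Primary WAN
CKT-0002,ExampleTel,edge-rtr-2,Serial0/0,Backup WAN
CKT-0003,OtherNet,edge-rtr-1,Gi0/1,Internet
"""


def load_circuits(text):
    """Parse the CSV text into a dict keyed by circuit ID."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["circuit_id"]: row for row in reader}


def circuits_on_router(circuits, router):
    """List the circuit IDs that terminate on the given router, sorted."""
    return sorted(cid for cid, row in circuits.items() if row["router"] == router)
```

With this layout, `circuits_on_router(load_circuits(CIRCUITS_CSV), "edge-rtr-1")` answers the question "which circuits do I lose if this router dies?", which is usually what you want when the carrier is on the phone.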
I know these are just rough ideas, and you should get many more from all the smarter people on here than myself. Whatever advice you get, I would say you need to have the documentation updatable via Subversion or some other document control system, and have some kind of review process for it, even if it means getting together over pizza with some of the other groups, asking them about their environment, and getting pointers and possibly help on documenting it all. Documentation is a full time effort and IMHO there is no such thing as too much documentation. You would be surprised how good documentation can aid you in problem resolution down the road or aid vendor support in helping you resolve a major outage. The three basic principles of network care are document document document. :)
Cheers,
Anonymous Coward.
TiddlyWiki (Score:5, Informative)
On the side, I manage a small network, and I've also wondered the same sort of thing: if someone else needed to find their way around, where would they start.
A wiki makes for a really nice way to document things, not least because you can include all sorts of cross references. For example, a list of servers, with links to the services they provide, and a list of services, with links to the servers. But wikis normally run on servers, which leaves your successor with a chicken-and-egg problem.
A bit of random surfing turned up TiddlyWiki [tiddlywiki.com], which is a Wiki in a single HTML file. A really elegant bit of engineering, and very handy for self-contained documentation. Since the entire Wiki is just a single file, it's easy to protect. I wound up with two: one with "public" information describing the general architecture and one with private information (including passwords). The private one you can put on a USB-stick in a safe, hand to your boss, or whatever seems appropriate...
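If you keep the server list as data, the reverse list (services, with the servers that provide them) can be generated rather than maintained by hand. A small sketch, assuming a made-up inventory; the host and service names are hypothetical:

```python
from collections import defaultdict

# Hypothetical inventory: server name -> services it provides.
SERVER_SERVICES = {
    "dc01": ["Active Directory", "DNS", "DHCP"],
    "dc02": ["Active Directory", "DNS"],
    "mail01": ["SMTP"],
}


def services_to_servers(server_services):
    """Invert the server -> services mapping into service -> servers."""
    index = defaultdict(list)
    # Iterate servers in sorted order so each service's server list is stable.
    for server, services in sorted(server_services.items()):
        for service in services:
            index[service].append(server)
    return dict(index)
```

The output of `services_to_servers(SERVER_SERVICES)` could then be pasted or rendered into the wiki page for services, so the two lists never drift apart.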
Re:Start with disaster scenarios (Score:4, Informative)
Get some open source software that is good at indexing data and create a searchable knowledge base from the information above.
A Wiki?
False Info (Score:0, Informative)
The fact is, you're doing this because "Ultimately, this won't be a good reference for me..."
As you leave, say "The network started bad, and will always be bad. I will sue you if you give a bad reference."
Problem solved.
Why do people make things hard for themselves? (Score:5, Informative)
Why not use an automated tool?
www.open-audit.org
Re:What about? (Score:3, Informative)
In other words, the tool you recommend is Etherape?
Any other tools you would use to grab a snapshot?
@the submitter, what about a tool like spiceworks?
I've been doing this for a while (Score:5, Informative)
My job requires me to do exactly what you're looking to do but for multiple companies/networks. Then, as soon as I'm done, I usually pack up and go or get hired in and fix the network.
Since I'm writing the Network Overview for managers AND potential future network managers I tend to write mine in the following format:
1) Synopsis of what the network does for the company, what general technologies they use (Windows AD vs *nix OD, thin clients vs Windows boxes, Cisco vs Brand X), and what the LOB software is.
2) Points of contact for the ISP and other providers (anti-spam, anti-virus, hardware, etc). Passwords for various accounts and services.
3) Logical network overview map (Visio), containing firewalls, routers, switches, other devices, open/forwarded ports, IPs, what the servers do, what VLANs are in place, and quick explanations for why (such as why a VLAN vs a separate subnet).
4) Physical map of devices if the complexity of the network calls for it.
5) Software notes, what apps are critical for the business and which systems they rely on.
Then, for my specific job I have to do the following:
6) Licensing issues.
7) Network weaknesses/points of failure.
8) Other rec's.
Re:What about? (Score:3, Informative)
Etherape is gorgeous. I wish I had time to understand what it all means, but still I love firing it up and staring at the network traffic.
Another tool that I've enjoyed much more personal success with is cheops-ng. For documenting a network, cheops-ng and a decent icon collection provide a pretty snazzy view of what NMAP can see. (NMAP being another tool I haven't had time to fully grok, yet)
http://cheops-ng.sourceforge.net/screenshots.php [sourceforge.net]
Anonymous Coward (Score:1, Informative)
http://www.spiceworks.com/
Good free tool.
The MACK(TM) Truck Rule (Score:3, Informative)
Ah, you're not following the MACK(TM) Truck Rule.
The MACK Truck Rule (MTR for short) is a measuring stick which we use to determine if a solution is good for us. It's an objective measurement of the level of expertise required to do something. Basically, the MTR has you ask yourself (or your team) the following question:
If the person(s) responsible for a task was suddenly hit by a MACK(TM) truck, How much time would it take for somebody else, untrained, to complete that task if needed?
If that amount of time is unreasonable*, it doesn't follow the MTR. Notice the caveat for unreasonable; this is the subjective part. What's unreasonable for one may be reasonable for another. This needs to be decided for yourselves.
Documentation always helps difficult tasks pass the MTR. So can good support. I try to leave a readme in the place where the installer is for a difficult program. I'm now beginning to use FreeMind to map out networks and servers. I have a good ticket system for all our repairs. Hopefully these things will make things easier the day I want to take a vacation.
--Pathway
Wiki Wiki Wiki (Score:5, Informative)
I have a nicely formatted template page with all those categories set out. I also maintain a page of IP address assignments and an inventory of hardware specs of all the machines in the office (which is helpful in the cases of "We need to reproduce a bug that only happens on ____ processor with ____ video card" and of "We're getting new machines. Who is in most dire need of an upgrade?").
I write down everything in these, and find myself referring to them very often. My predecessor gave me a Word document with all his notes in it, which has been very useful, and I used that as a starting point for my pages. The wiki has saved me a ton of time, kept me organized, and serves as a great reference for me and for the inevitable next admin.
The only caveat is if the wiki (or the server it's on) goes down. This has happened once, and my instructions for fixing the wiki were... on the wiki, so extra troubleshooting for me. Thus, I find it good practice to maintain a hard copy of the wiki pages, especially the page that tells how to fix the wiki.
I'm running this on Redmine, which has proven to be bleedingly simple to use and administer, and much easier than trac, which we used before. It's especially nice having it on the intranet, as I'll just have a browser open to the wiki as I work on systems and refer to and update it as appropriate. It's very handy to document exactly how I performed a strange or experimental installation of some software that I'll want to replicate later without making myself crazy, and I'll take the extra few seconds to retype the commands I just used into the wiki from anywhere in the building, though I probably wouldn't do the same into a Word doc.
It's not so much the mundane day-to-day that I find that important to document. It's the weird fixes, the trouble spots, the command line parameters, the installation procedures, the changes that shouldn't have fixed it but did, and the horrific chain reaction situations that make one piece of software crash because a seemingly unrelated piece of software has the wrong version of the 64-bit library. Things that take 4 hours to figure out and 3 seconds to implement... those are the ones to document, and those are the ones that I'd be kicking myself 20 months later for neglecting to write down. In an afternoon, any schmuck could walk into the building and figure out which network cable goes where. Documenting the strange bits (and the frustrations), though, can get a malfunctioning mail server back up and running in 3 minutes instead of 3 hours (which, of course, is secondary to good administration keeping the server from going down in the first place).
Re:Lots of flowcharts! (Score:1, Informative)
Dude, I can brag too. I have managed networks for hundreds of customers, and you know what? A 150 page document is worse than fricken useless. We had idiots who tried to get us to use that kind of documentation. Fortunately, they were laughed out of the office. And you get people 'fired for their incompetency'??? There was a poster up above who had a much more useful checklist with REALISTIC guidelines, who deserved to be modded up.
Like:
Change management, with documentation requirements
Network diagrams, both logical and physical. You may not agree with Cisco, but their layered networking model does make documenting networks easier.
Circuit ids labeled on diagrams (so you can find them when you're troubleshooting)
Port numbers for important ports like trunks and circuit termination points
Include the routing protocols, if used
VLAN or VSAN ids allowed on trunk ports, if used
Rack space diagrams, so you know WHERE your devices are, and if you have space for more.
Have a centralized database for passwords
Keep centralized records for after action or root cause analysis reviews and documentation
Keep an inventory
network devices
critical spares
circuit ids
Know
how much power you need
how much cooling you need
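The power and cooling numbers are related by simple arithmetic, so they can be worked out from the inventory. A rough sketch: the conversion factors are the standard ones (1 W is about 3.412 BTU/hr, 1 ton of cooling is 12,000 BTU/hr), but the function name and the example load are assumptions for illustration.

```python
BTU_PER_HOUR_PER_WATT = 3.412   # 1 W of electrical load becomes about 3.412 BTU/hr of heat
BTU_PER_HOUR_PER_TON = 12000    # 1 ton of cooling removes 12,000 BTU/hr


def cooling_needed(load_watts):
    """Return (BTU/hr, tons of cooling) needed to remove the heat from load_watts."""
    btu_per_hour = load_watts * BTU_PER_HOUR_PER_WATT
    tons = btu_per_hour / BTU_PER_HOUR_PER_TON
    return btu_per_hour, tons
```

For example, `cooling_needed(10000)` gives the heat load for a hypothetical 10 kW room. This treats all electrical load as heat and ignores people, lighting, and headroom, so take it as a floor rather than a design figure.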
Now, in all honesty what you provide may be nice for an IT department who has no documentation whatsoever, but for a network admin who's looking to build something useful for himself and for future lackeys? Get real and quit bragging.
Re:I know... (Score:4, Informative)
How exactly do you get a local admin account on a domain controller? I didn't think there was any such thing.
You're the next guy! (Score:2, Informative)
99.99% of all known possible successors will just hotfix problems as they arise and blame everything on the predecessor. So just write up the things you need and tend to forget in a way you can use...
Yes... I use dust-off (Score:5, Informative)
the cans of compressed air in every office supply store? inverted they throw out a very cold liquid that does exactly what you describe.
Re:What about? (Score:3, Informative)
has anyone ever used The Dude?
http://www.mikrotik.com/thedude.php [mikrotik.com]
i just got a work-study job with the campus admin at the community college i attend. he's been there almost a decade and has no network monitoring system, so he has no idea when something goes down until he gets a complaint or can't get something to work himself. i thought an interesting project during my time with him might be to see if he'd let me help implement a network monitoring system, and i've seen a couple of people use The Dude before and it seems pretty capable.
Re:I know... (Score:3, Informative)
do remember that the Directory Restore password is different than the domain admin password.
from directory restore you can change the domain admin.
Re:I know... (Score:4, Informative)
If the predecessor does write the passwords down, he deserves to be fired.
You can't always take it for granted. I've been on the cleanup-end more than once. Sometimes it's "engineered job security"/"they don't dare fire me", sometimes it's "I don't have time for that", sometimes it's just plain forgetting about a piece of hardware you set up the first week you started work there, and sometimes it's a legacy password issue. ("nobody's been able to login to that box since Phil left in '05") The rare treat is finding a mystery box that nobody knows what it does nor has any idea how to login to it. Too many managers don't understand the danger and consider it a waste of time or a bad risk to try to fix problems like that.
Sometimes it's an uphill battle when taking over, too. You want to document the settings on all the routers, but two of them are "legacy" password issues, and the router only supports hard reset (clears password, AND all settings) so you can't get the settings from it once you reset it, and you need the settings from it in case it gets reset. That's never fun, but you will find yourself in that catch-22 occasionally, and it's hard to blame someone for not fixing it because it's utterly unpleasant to deal with. (hint: get another router and program it how you think the mystery box is set up. swap. test. immediately swap the mystery back in. adjust the settings on the new one and test. Swap back out. repeat until you get it right, and don't reset the mystery immediately, keep it onhand for at least a month in case some uncommon thing requires a setting you haven't yet discovered)
Occasionally you can get lucky - contact the hardware vendor and see if the box has an undocumented soft reset. ("open it up and short together the two pads left of D-15, and you can login for 5 minutes with no password")
Re:I know... (Score:3, Informative)
Yes, it's easy to open, but you'd know whether someone tried to tamper with it.
Try spraying the envelope with refrigerant. The paper becomes translucent when wetted and you can sometimes read what's inside, and then it dries without a trace (unlike wetting it with water, which swells up the paper fibers leaving the telltale signs of tampering.)
Learned this one from a history of the U.S. Black Chamber [wikipedia.org].
Does anyone not use "security" envelopes anymore? (The ones that are printed with a dense line pattern on the inside?) I didn't know they sold plain envelopes anymore, except for greeting cards.
Re:I know... (Score:2, Informative)
Re:I know... (Score:4, Informative)
Reboot the DC into "Directory Services Restore Mode" (this is on Server 2000 and above) and the local Security Accounts Manager is used again, not AD.
This is actually a way of resetting the admin password on a domain, if you ever need to. Boot from the Linux password reset CD (http://home.eunet.no/pnordahl/ntpasswd/bootdisk.html), blank the Administrator password (which resets the _local_ admin password), then reboot into DS Restore Mode. Log in and then copy cmd.exe over on top of logon.scr and reboot into "normal mode" (with AD functional).
Wait until the "screensaver" pops open and you have a command prompt that's running with SYSTEM privileges. From there you can run mmc.exe (or whatever else you like).
Oh and of course, this is exactly why you don't allow physical access to servers....
In reference to the PDC statement, I'm not aware of any way to use local accounts on NT 4 or below, but I think he meant just DC instead of "PDC."
Re:Windows != SPAM (Score:3, Informative)
SAFARI on OSX is it's easier, not Mac OSX.
OK fanboi, did you even read the link I referred to? Here's an excerpt:
Those aren't my words, they're the guy's who pwned the Mac in a matter of seconds at CanSecWest.
DoDAF For the Masochists (Score:3, Informative)
Re:I know... (Score:2, Informative)
Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.
C:\>whoami
DOMAIN\administrator
C:\>time
01:05 PM
C:\>at 13:06 /interactive cmd.exe
Added a new job with job ID = 1
Wait for a minute, and a new command window will pop up.
Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.
C:\WINDOWS\system32>whoami
nt authority\system
C:\WINDOWS\system32>
This will work on XP and Server 2003 for sure, and doesn't work on Vista. It probably also doesn't work on Server 2008. Depending on how effective your domain administrator is, you might even be able to do this from a regular user account. Be careful who you let run the AT command.
Have fun!
Re:I know... (Score:3, Informative)
Sounds like that's the root of our disagreement - when I think of a small business, I think of one that doesn't have five IT people, much less senior IT people.
Re:I know... (Score:2, Informative)
Documenting your network is not about teaching people how to configure X vendor devices, it's about telling them the identity of X vendor, the identity of the devices, how they are in general configured.
I.e. what the statically assigned IPs go to, which servers have which roles, which server is DHCP, which server is DNS, etc., what subnets exist, who's on them, and what each device actually does on the network, e.g. why isn't it in some closet or in the trash can instead of plugged in and running?
And then there are things like design and planning... if you want to allocate a new subnet, where should it (as a matter of planning) start?
What's the special procedure to do X ? e.g. in the case of adding a subnet for a new department, the documentation might indicate things like which DHCP server should be used, how DHCP should be configured, should a certain existing server be used (with dhcp relay), or should a new server be racked up, in general...?
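The "where should a new subnet start?" question can also be answered mechanically once the existing allocations are written down. A minimal sketch using Python's standard `ipaddress` module; the function name and the address ranges in the usage note are hypothetical, and it assumes all allocations are carved from a single supernet.

```python
import ipaddress


def next_free_subnet(supernet, allocated, new_prefix):
    """Return the first subnet of size /new_prefix inside supernet that
    does not overlap any already-allocated network, or None if none is left."""
    pool = ipaddress.ip_network(supernet)
    used = [ipaddress.ip_network(net) for net in allocated]
    for candidate in pool.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(existing) for existing in used):
            return candidate
    return None
```

For instance, `next_free_subnet("10.20.0.0/16", ["10.20.0.0/24", "10.20.1.0/24", "10.20.4.0/22"], 24)` would look for the first unused /24 for a new department. Site-specific rules (reserved ranges, per-building blocks) would still need to be written down in prose.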
Documentation also generally includes diagrams, things like visual maps that serve as a guide to fully understand how things are setup and working, AND to understand in a manner that permits an understanding of what changes might need to be made...
+ ability to make new changes smoothly.