Networking IT Technology

Stupid Data Center Tricks 305

jcatcw writes "A university network is brought down when two network cables are plugged into the wrong hub. An employee is injured after an ill-timed entry into a data center. Overheated systems are shut down by a thermostat setting changed from Fahrenheit to Celsius. And, of course, Big Red Buttons. These are just a few of the data center disasters caused by human folly."
This discussion has been archived. No new comments can be posted.
  • bad article is bad (Score:5, Insightful)

    by X0563511 ( 793323 ) on Sunday August 15, 2010 @09:33AM (#33256488) Homepage Journal

    The summary reads like a digg post, and has two different links that, in actuality, link to the exact same thing.

    This needs some fixin'.

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      I seem to remember that in the early days of Telehouse London an engineer switched off power to the
      entire building. The only routes out of the UK that remained were the two that had their own
      back-up power (one of them a 256k satellite connection).

      • Good judgement comes from experience. And most experience comes as a result of bad judgement.

        Just about anyone who has been in the line of fire as sysadmin for long enough will recall some ill-conceived notion that caused untold trouble. Since my earliest experience with commercial computers was in a batch-processing environment, my initial mishaps rarely inconvenienced anybody other than myself. But I still recall an incident much later (early '90s) when I inadvertently managed to delete the ":per" direc
        • by Helen O'Boyle ( 324127 ) on Sunday August 15, 2010 @08:18PM (#33259636) Journal
          Good post title, BrokenHalo. I'll chime in with my two.

          1987, my first full-time job. I was a small ISV's UNIX guru. I wanted to remove everything under /usr/someone. I cd'd to /usr/someone and typed "rm -r *", then realized, hey, I know that won't get everything, better add some more, and the command became "rm -r * .*". Then I realized, oh no, this'll get .. too, so I'd better change it to "rm -r * .?*". It took about 12 microseconds after I hit enter to realize that ".?*" still included "..". Yes, disastrous results ensued, even though I was able to ^C to avoid most of the damage, and I had the backup tape (back in the day, we used reels) in the tape drive just as users (other devs) began to notice that /usr/lib wasn't there. Yep, I have my own memories of red-facedly telling my boss, "Oops, I did this, I'm in the process of fixing it now. Give me half an hour." In the future, "rm -r /usr/someone" did the trick nicely.

          Early 1990s, I was consulting in the data center of a company with 8 locations around the world. It held the company's central servers, which were accessed by about 700 users. Being a consultant, they didn't have a good place to put me, so I ended up at a desk in the computer room. Behind me was a large counter-high UPS that the previous occupant had used as something of a credenza, and I carried on the tradition. That is, until the day I put my cape on there, and the cape slid down and, through one of those Rube Goldberg miracles, caught the UPS master shutoff handle, pulled it down, and I heard about 30 servers (thank goodness there weren't more) powering down instantaneously. Amazingly, I lived, thanks to the ops manager pointing out to the powers that be that it was a freak accident and that others had been setting similar stuff in the same place for years. The cape, however, was not allowed back in the data center. Fortunately, I've had better luck and/or been more careful over the past 20 years.
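
          For anyone who hasn't been bitten by that glob: ".?*" means a literal dot, then any single character, then anything else, and ".." qualifies (the "?" eats the second dot and the "*" matches nothing). A harmless way to see it, assuming an older bash/Bourne-style shell and made-up dotfile names (recent bash versions skip "." and ".." when globbing, and modern GNU rm refuses to remove ".." anyway):

          $ cd /usr/someone
          $ echo .?*
          .. .profile .ssh
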
    • by macwhizkid ( 864124 ) on Sunday August 15, 2010 @10:33AM (#33256716)

      Article also needs fixin' in the lessons learned from the incidents described. Look, I'm sorry, but if your hospital network was inadvertently taken down by a "rogue wireless access point", the lesson to be learned isn't that "human errors account for more problems than technical errors" -- it's that your network design is fundamentally flawed.

      Or the woman who backed up the office database, reinstalled SQL server, and backed up the new (empty) server on the same tape. Yeah, a new tape would have solved that problem. Or, you know, not being a mindless automaton. Reminds me of a quote one of my high school teachers was fond of: "Life is hard. But life is really hard if you're stupid."

      • by amorsen ( 7485 )

        Or the woman who backed up the office database, reinstalled SQL server, and backed up the new (empty) server on the same tape. Yeah, a new tape would have solved that problem. Or, you know, not being a mindless automaton.

        It is not obvious to someone replacing the backup tape whether the backup is appended to the previous backup or replaces the previous one entirely. The former was not all that uncommon back when backup tapes had decent sizes. These days where you need 4 tapes to backup a single drive no one appends.

        Of course there are tons of other things wrong with a one-tape backup schedule, but again she couldn't necessarily be expected to know about them.

        • Re: (Score:2, Insightful)

          by macwhizkid ( 864124 )

          It is not obvious to someone replacing the backup tape whether the backup is appended to the previous backup or replaces the previous one entirely. The former was not all that uncommon back when backup tapes had decent sizes. These days where you need 4 tapes to backup a single drive no one appends.

          Yeah, it's not clear from TFA whether she thought there was enough space, or was just clueless. Regardless, though, when you have mission-critical data on a single drive you shut it down, put it in a fire safe until you're ready to restore, whatever. But you don't just casually keep using it. And who backs up a test database install anyway?

          It's just interesting that the first story in the article was a technical problem (poor network design/admin) being blamed on user error (unauthorized wireless AP/network ca

        • Re: (Score:2, Insightful)

          by fishbowl ( 7759 )

          >These days where you need 4 tapes to backup a single drive no one appends.

          These days with LTO-4, my biggest problem is having enough time to guarantee a daily backup.

      • Re: (Score:3, Insightful)

        by mlts ( 1038732 ) *

        This is one reason that D2D2T setups are a good thing. If the tape gets overwritten, most likely the copy sitting on the HDD is still useful for recovering.

        One thing I highly recommend businesses get is a backup server. It can be as simple as a 1U Linux box connected to a Drobo that does an rsync. It can be a passive box that does Samba/CIFS serving, one account for each machine and each machine's individual backup program dump to it. Or the machine can have an active role and run a backup utility like
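
        Whatever the utility, a minimal sketch of the rsync flavour mentioned above, assuming a Linux box with the Drobo mounted at /mnt/drobo; the paths are made up, and pulling from a remote machine would just add an ssh-style source like host:/srv/data/ instead:

        # /etc/cron.d/backup -- nightly mirror at 02:00; -a keeps permissions/owners, --delete mirrors removals
        0 2 * * * root rsync -a --delete /srv/data/ /mnt/drobo/backups/fileserver1/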

  • by Florian Weimer ( 88405 ) <fw@deneb.enyo.de> on Sunday August 15, 2010 @09:37AM (#33256508) Homepage

    Can this really happen easily? I thought for really ugly things to happen, you need to have switches (without working STP, that is).

    • by Lehk228 ( 705449 )
      a hub can also be a switch. I have worked with people who referred to both switches and repeaters as hubs
      • by ianalis ( 833346 ) on Sunday August 15, 2010 @10:33AM (#33256718) Homepage

        According to CCNA Sem 1, a hub is a multiport repeater that operates at layer 1. A switch is a multiport bridge that operates at layer 2. I thought these definitions were universally accepted and used, until I used non-Cisco devices. I now have to refer to L2 and L3 switches even though CCNA taught me that these are switches and routers, respectively.

        • Re: (Score:3, Interesting)

          by X0563511 ( 793323 )

          It's so irritating when you ask for a hub, and someone hands you a switch. Stores do the same thing. It's hard enough to find hubs, let alone find them when the categorization lumps them together.

          No, I said hub. I don't want switching. I want bits coming in one port to come back out of all the others.

          You can do that with a switch, but getting a switch that can do that is a bit more pricey than a real hub...

        • Re: (Score:3, Informative)

          by bsDaemon ( 87307 )

          There is such a thing as a Layer 3 switch. They have routing functionality built-in, mostly to reduce latency for inter-vlan routing across a single switch. Cisco makes devices called Layer 3 switches, which are different from routers.

        • by Geoff-with-a-G ( 762688 ) on Sunday August 15, 2010 @07:15PM (#33259310)

          I'm CCNP, taking my CCIE lab next month, I'll give this a shot.

          Yes, the "cow goes moo" level definitions you get are "hub = L1, switch = L2, router = L3" but the reality is more complex.
          A hub is essentially a multi-port repeater. It just takes data in on one port and spews it out all the others.
          A switch is a device that uses hardware (not CPU/software) to consult a simple lookup table which tells it which port(s) to forward the data to, and does so very fast (if not always wire-speed). Think of the GPU/graphics card in your PC: hardware built to do one specific thing very fast.
          A router is a device that understands network hierarchy/topology (in the case of IP, this is mainly about subnetting, but there are plenty of other routed protocols) and can traverse that hierarchy/topology to determine the next hop towards a destination.

          Now, because of the protocol addressing in Ethernet and IP, these lend themselves easily to hub/switch/router = L1/L2/L3, but they're not really defined that way.

          These days, most Cisco switches (3560, 3750, 6500, etc) run IOS, the software which can do routing, and which uses CEF. CEF in a nutshell takes the routing table (which would best be represented as a tree) and compiles it into a "FIB", which is essentially a flat lookup-table version of that same (layer 3, IP) table. It also caches a copy of the L2 header that the router needs to forward an L3 packet. The hardware (ASICs) in the switches hold this FIB, and thus allow them to "switch" IP/L3 packets at fast rates and without CPU intervention, thus making them still "switches", even if they run a routing protocol and build a routing table.

          Meanwhile, when Cisco refers to a "router" in marketing terms, they're talking about a device with a (relatively) powerful CPU, which can not only perform actual routing, but also usually more CPU-intensive inter-network tasks like Netflow and NBAR.
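
          If you want to poke at that split yourself, a minimal sketch on an IOS-based L3 switch with CEF enabled (standard show commands, generic prompt, nothing gets changed):

          switch# show ip route           (the routing table the control plane builds)
          switch# show ip cef             (the FIB: the flattened lookup table pushed down to the ASICs)
          switch# show adjacency detail   (the cached L2 rewrite info used when forwarding)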

    • by Pentium100 ( 1240090 ) on Sunday August 15, 2010 @09:51AM (#33256566)

      This should work quite OK with hubs. A hub, after all, sends the packet to every port except the one where it came from. So two hubs in a loop should just forward the same packet back and forth all the time.

    • by omglolbah ( 731566 ) on Sunday August 15, 2010 @09:53AM (#33256584)

      Oh yes, it works quite well for sabotaging a network.

      It used to be a constant issue at LAN parties where "pranksters" would do it before going to sleep... Usually we never found them but when we did we flogged them with cat5 cables stripped of insulation :p

    • I saw this happen at my high school once -- someone thought it would be funny to connect one port of an old switch to another port on that same switch. The entire network was flooded for a day while the IT staff tried to figure out where the switch was.

      That was years ago though, I would have thought that by now, these issues had been resolved.
      • by jimicus ( 737525 )

        It has in theory. Spanning tree should take care of it.

        Though I have seen interop issues which prevent any traffic from going between two different vendors' STP-enabled switches.

    • Human error rate is enormously variable [hawaii.edu], but for infrequently-occurring tasks (those you only do occasionally, not every day), a value of between 1% and 2% is a useful approximation.

      I am fortunate in working in an organisation with perhaps the best and most competent ops manager I have ever worked with, but even with well-written procedures and well-trained ops staff, errors still occur — but very rarely.

    • by MoogMan ( 442253 )

      Reading TFA, it was almost certainly because STP wasn't set up correctly. For instance, if the switchport in question had bpduguard enabled then it would have become disabled as soon as the erroneous hub was added, resulting in a localised issue not a network-wide problem.

      It's an issue that many Network Engineers learn the hard way exactly once and fix quickly by reviewing their STP configuration and in many cases, introduce QoS for sanity.

      "We didn't do an official lessons learned [exercise] after this, it was just more of a 'don't do that again,'" says Bowers

      Well, apart from that guy.
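
      For reference, the knob in question looks roughly like this in Cisco IOS syntax (interface name made up; other vendors have equivalents under different names):

      switch(config)# spanning-tree portfast bpduguard default    (err-disable any PortFast edge port that hears a BPDU)
      switch(config)# interface FastEthernet0/12                   (or enable it on a single access port:)
      switch(config-if)# spanning-tree portfast
      switch(config-if)# spanning-tree bpduguard enable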

    • Re: (Score:3, Informative)

      by Shimbo ( 100005 )

      Can this really happen easily? I thought for really ugly things to happen, you need to have switches (without working STP, that is).

      Spanning tree can not deal with the situation where there is a loop on a single port, which you can do easily by attaching a consumer grade switch. There are various workarounds (such as BPDU protection) but they aren't standard, and require manual configuration. Once your network gets big enough, you probably can't afford not to use them, though.

  • by Anonymous Coward on Sunday August 15, 2010 @09:39AM (#33256514)

    Where I work a couple years ago one of the non-technical people decided to plug a router into itself. Ended up bringing down the whole network for ~25 people in a company which depended on the Internet (Internet marketing company).

    Unfortunately one of the tech guys figured it out literally as everyone was standing by the elevator waiting for it to take us home. We were that close to freedom :(

    • Plug all the ethernet-like T1 cables into a switch
    • Change the administrator password and forget what you changed it to
    • Hang everything off a single power strip, no UPS
    • Buy expensive remote management cards but don't bother to configure them
    • by v1 ( 525388 ) on Sunday August 15, 2010 @10:03AM (#33256610) Homepage Journal

      - run thinnet lines along the floor under people's desks, for them to occasionally get kicked and aggravate loose crimps, taking entire banks of computers (in a different wing of the building) off the LAN with maddening irregularity

      - plug a critical switch into one of the UPS's "surge only" outlets

      - install expensive new baytech RPMs on the servers at all remote locations, and forget to configure several of the servers to "power on after power failure".

      - on the one local server you cannot remote manage, plug its inaccessible monitor into a wall outlet

      honorable mention:

      - junk the last service machine you have laying around that has a scsi card in it while you still have a few servers using scsi drives

  • Not using Cisco ACLs (Score:4, Interesting)

    by Nimey ( 114278 ) on Sunday August 15, 2010 @09:48AM (#33256556) Homepage Journal

    Our entire network was brought down a few years ago when a student plugged a consumer router into his dorm room's port. Said router provided DHCP, and having two conflicting DHCP servers on the network terminally confused everything that didn't use static IPs.

    Took our networking guys hours to trace that one down.

    • by omglolbah ( 731566 ) on Sunday August 15, 2010 @09:52AM (#33256574)

      Amusingly anyone who ever worked as tech crew at a lan party knows that this is the first thing you look for... :p

    • Re: (Score:3, Interesting)

      by GuldKalle ( 1065310 )

      I had that error too, on a city-wide network. The solution? Get an IP from the offending router, go to its web interface, use the default password to get in, and disable DHCP.

    • by jimicus ( 737525 ) on Sunday August 15, 2010 @10:16AM (#33256656)

      Hours?

      You get something on the network which has an IP from the offending DHCP server, use ARP to establish what that DHCP server's MAC address is, then look up the switch's own tables to figure out which port that MAC is plugged into, switch that port off and wait for the equipment owner to start complaining. Takes about 3-5 minutes to do by hand, and some switches can do it automatically.
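
      A rough transcript of that, with made-up addresses and port numbers (the first three commands run on any client that picked up a bad lease, the rest on a Cisco-style switch):

      C:\> ipconfig /all                                        (note the bogus "DHCP Server", say 192.168.1.1)
      C:\> ping 192.168.1.1                                     (make sure it lands in the ARP cache)
      C:\> arp -a 192.168.1.1                                   (gives its MAC, say 00-13-10-ab-cd-ef)
      switch# show mac address-table address 0013.10ab.cdef    (shows which port that MAC lives on)
      switch(config)# interface FastEthernet0/17
      switch(config-if)# shutdown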

      • Re: (Score:2, Informative)

        by eric2hill ( 33085 )

        Cisco switches have a wonderful feature called dhcp snooping.

        Enable it globally with "ip dhcp snooping", then put "ip dhcp snooping trust" on the port that supplies DHCP to the network. This ensures that only the trusted port can hand out DHCP addresses, and as a bonus, "show ip dhcp snooping binding" shows you which MAC has which IP.
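
        A slightly fuller sketch of that setup, assuming Cisco IOS syntax with made-up VLAN and interface numbers (snooping also has to be enabled for the specific access VLANs):

        switch(config)# ip dhcp snooping
        switch(config)# ip dhcp snooping vlan 10
        switch(config)# interface GigabitEthernet0/1        (the uplink toward the real DHCP server)
        switch(config-if)# ip dhcp snooping trust
        switch# show ip dhcp snooping binding               (lists which MAC has which IP on which port)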

        • by fluffy99 ( 870997 ) on Sunday August 15, 2010 @01:49PM (#33257616)

          Cisco switches have a wonderful feature called dhcp snooping.

          Not supported on many of the lower-end Cisco edge switches. I believe it also interferes with DHCP relaying.

          Another great tool is "ip verify source vlan dhcp-snooping", which can be used to block traffic from IPs/MACs that did not obtain their IP from the DHCP server. This nicely prevents users from statically assigning addresses and/or spoofing their MAC address.

      • by Nimey ( 114278 )

        *shrug* Most likely they'd never considered a "hostile" DHCP server on the network (lots of other things could have killed the network, so they thought), and had never seen what that looks like.

        OTOH we can't pay very well, so we can't get top-notch talent.

        • by jimicus ( 737525 )

          *shrug* Most likely they'd never considered a "hostile" DHCP server on the network (lots of other things could have killed the network, so they thought), and had never seen what that looks like.

          OTOH we can't pay very well, so we can't get top-notch talent.

          My employer develops router firmware. Our engineers are experts at finding odd ways to kill the network ;)

      • That just tells you what it's plugged in to. Doesn't necessarily tell you *where* it is, it just narrows it down. and if you can't disable that switch port remotely....hoo boy...and since it's in a dorm you have the risk of multiple patches in a single room or worse, someone smart enough to say "hey, this doesn't work in my room, lemme try my friend's room down the hall..."

        Goes back to the old line "I've lost a server. Literally lost it. It's up, it responds to ping, i just cant *find* it."

    • Re: (Score:3, Funny)

      I have made this error before :)

      What surprised me was that the Linksys router assigned IP numbers up through the uplink connection. I thought that was impossible; guess not.

  • Quad Graphics 2000 (Score:5, Interesting)

    by Anonymous Coward on Sunday August 15, 2010 @09:51AM (#33256570)

    In the summer of 2000 I worked at Quad/Graphics (printer, at least at that time, of Time, Newsweek, Playboy, and several other big-name publications). I was on a team of interns inventorying the company's computer equipment -- scanning bar coded equipment, and giving bar codes to those odds and ends that managed to slip through the cracks in the previous years. (It's amazing what grew legs and walked from one plant to another 40 miles away without being noticed.)

    One of my co-workers got curious about the unlabeled big red button in the server room. Because he lied about hitting it, the servers were down for a day and a half while a team tried to find out what wiring or environmental monitor fault caused the shutdown. That little stunt cost my co-worker his job and cost the company several million dollars in productivity. It slowed or stopped work at three plants in Wisconsin, one in New York, and one in Georgia.

    The real pisser was the guilty party lying about it, thereby starting the wild goose chase. If he had been honest, or even claimed it was an accident, the servers would have all been up within the hour, and at most plants little or no productivity would have been lost.

    The reality: a 20-year-old's shame cost a company millions.

    • Re: (Score:3, Insightful)

      by Anonymous Coward

      Why the fuck was the button unlabeled? That's the REAL MISTAKE.

    • by FictionPimp ( 712802 ) on Sunday August 15, 2010 @10:16AM (#33256660) Homepage

      Well, where I work some maintenance genius decided that the location of the red button (near the entrance door) was too risky. They said people coming in the door could hit it while trying to turn on the lights.

      Their solution? They moved it to behind the racks. So every time I bend down to move or check something I have to be conscious not to turn off the power to the entire room with my ass.

    • by drsmithy ( 35869 ) <drsmithy&gmail,com> on Sunday August 15, 2010 @06:05PM (#33259024)

      One of my co-workers got curious about the unlabeled big red button in the server room. Because he lied about hitting it [...]

      At a previous job we had one of these (albeit with a "Do not push this, ever" label above it) that did nothing more than set off a siren and snap a photo of the offender with a hidden camera. Much amusement was had by all when some new employee's curiosity inevitably got the better of them.

  • by Anonymous Coward
    The Etherkiller [fiftythree.org]
  • Sure, technology causes its share of headaches, but human error accounts for roughly 70% of all data-center problems.

    And 70% of all statistics are made up on the spot.

  • Video (Score:5, Funny)

    by AnonymousClown ( 1788472 ) on Sunday August 15, 2010 @10:11AM (#33256644)
    Here's a video of a tech worker explaining why these things happen. [youtube.com]

    It's very disturbing and you'll see why these things happen.

  • Way back in the day at the B.U. computer center, the machine room had an extensive Halon fire system with nozzles under the raised flooring and on the ceiling. Pretty big room that housed an IBM mainframe, about a half dozen tape drives, maybe 50 refrigerator-sized disk drives, racks and racks of magnetic tape, a laser printer the size of a small car, networking hardware, etc. etc. One day, the maintenance people were walking through and their two-way radios set off the secondary fire alarm. At that poin

  • by Kupfernigk ( 1190345 ) on Sunday August 15, 2010 @10:50AM (#33256782)
    This was a server room at an (unnamed) UK PLC. The air conditioning had remote management, and the remote management notified the maintenance people that attention was needed. So someone was sent out, on a Friday afternoon.

    When he arrived, most of the staff had gone home and the skeleton IT staff didn't want to hang around. So, they sent him away on the basis that his work wasn't "scheduled".

    Everybody came back on Monday to find totally fried servers.

    • by dirk ( 87083 ) <dirk@one.net> on Sunday August 15, 2010 @02:51PM (#33257918) Homepage

      I have a better AC story. We had a second AC unit installed in the server room, as the first was cranking 24/7 and just barely keeping up, with the thought that the two of them in tandem could handle the load. A few days after it was installed, we noticed the room was hot when we got in in the morning. Not enough to cause alarms, but hotter than it should be. As the day went on, it dropped, so we chalked it up to a one-time fluke. This happened a time or two more throughout the week, but it always dropped during the day.

      Finally the weekend came, and it got hot enough to cause an alarm. We got in and the AC units kicked on without us actually doing anything, and the room started to cool down. We called our AC guys and they checked both systems and couldn't find anything wrong with either of them. Well, the same thing happened again that night. Finally, someone stayed late, trying to see if they could spot what was going on. Everything was fine throughout the evening, so they finally decided to leave. Luckily, they noticed as they walked out the door and flipped off the lights that the AC units both turned off. He went back in to verify, and when he turned the lights back on, the AC units both started again. Turned the lights off, and they both shut off again.

      The genius (lowest bid) company that we hired to install the new AC unit had wired both units into the wall switch for the lights! So when we were there checking, we had the lights on and everything worked perfectly. We went home for the day and turned off the lights, and the AC units with them. Needless to say, that company isn't even allowed inside our building anymore!

      • Re: (Score:3, Funny)

        by Linker3000 ( 626634 )
        More AC fun - all in the same room - as refurbed into a computer room in the 1980s by the in-house maintenance team:

        1) They re-lined the walls, but also boxed in the radiators without turning them off - we had numerous AC engineers turning up and scratching their heads while they re-did their thermal load calculations until we realised our walls were warm to the touch.

        2) They put the AC stat on a pillar by the windows so in the summer, the heat radiation falling on the stat from outside made the AC run
  • cascade failures (Score:4, Interesting)

    by Velox_SwiftFox ( 57902 ) on Sunday August 15, 2010 @11:02AM (#33256822)

    How can this leave out the standard cascade failure scenario?

    Trying to achieve redundancy, someone gets what they think is worst-case-30A of servers with multiple power supplies, plugs one power supply on each into one PDU rated 30A, one power supply into the other.

    They may or may not know that the derated capacity of the circuit is only 24A; the data center is unlikely to warn them, as they only appear to be using 15A per circuit at most.

    Anyway, something happens to one of the PDUs and the power is lost from it. Perhaps power factor corrections (remember the derating?) and cron jobs running at midnight on all the servers that raise the load high simultaneously. Maybe just the failure of one of the PDUs that was feared, causing the attempt at "redundancy".

    In any case, all of the load is then put on the remaining circuit, and it always fails. The whole rack loses power.
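
    To put rough numbers on it: each circuit is nominally 30 A but only good for 80% continuous, i.e. 0.8 x 30 A = 24 A. With the load split across both PDUs, each one reads about 15 A and looks comfortably healthy. Lose either PDU and the survivor suddenly carries the full ~30 A, well above its 24 A continuous rating, so its breaker trips and the "redundant" rack goes dark anyway.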

    • Re: (Score:3, Interesting)

      by omglolbah ( 731566 )

      Yep, it is one of the specific steps when we define requirements for server racks. Sadly not all the customers pay attention and then yell for us to come fix the mess when they find out years later :p

      This is especially fun if the trip to the "datacenter" involves a helicopter ride to the oil rig where it is located :p

  • So I'm working in this company's datacenter on their networking equipment. But it's installed in such a crappy way that there's a floor tile pulled right next to the rack and the cables are run down into that hole. I'm working around on the equipment and step down into the hole by accident; at that point I notice that it's suddenly a lot quieter where I'm standing. I look down and realize I'd just stepped on the power button of a power strip that most of the networking equipment was plugged into. Oh Sh!t.

  • by ei4anb ( 625481 ) on Sunday August 15, 2010 @11:28AM (#33256938)
    Those data centers in the article sound huge, some may even have up to ten servers!
    • Well, those are the ones that probably have fewer people, with less experience, servicing them.... you can try to manage the first couple of servers with some "flexibility"; when you have hundreds of them, everything must be done "by the book" or things definitely go wrong.

      When I got to my current job, a couple of servers (our first rack servers) were installed, and nobody was "in charge" of them. Being a guy with initiative, I did the best that I could with them even though I only had experience in progr

  • Mainframe days story (Score:5, Interesting)

    by assemblerex ( 1275164 ) on Sunday August 15, 2010 @11:45AM (#33256994)
    The old tape machines (six foot tall) used to put out a tremendous amount of heat. Space is at a premium, so in the mainframe room the drives were normally put edge to edge,
    with one pushing air in and the other pulling air out. The machines had two 10-12" fans per unit, so stacking two or three units was fine. One site had so many machines side to
    side (over 7), the air coming out the last machine regularly set things on FIRE. It was not uncommon for the machine to ignite lint going through the stack, with it coming out the
    end as a small explosion like dust in a grain silo explosion. A fire extinguisher was kept on hand, and the wall eventually got a stainless steel panel because it was so common.
    • One site had so many machines side to side (over 7), the air coming out the last machine regularly set things on FIRE. It was not uncommon for the machine to ignite lint going through the stack, with it coming out the end as a small explosion like dust in a grain silo explosion. A fire extinguisher was kept on hand, and the wall eventually got a stainless steel panel because it was so common.

      I call BS.

      Thermodynamics 101: If the air coming out of the last unit is hot enough to ignite things, then what is the minimum temperature of the stuff inside?

      I can maybe believe that there was some sort of electrical fault inside that was infrequently arcing (maybe when a dust bunny passed through the fans?) and that might have caused the apparent problem. But there's no way to have functional electronics that are hot enough to ignite organic matter.

      • I don't know how old these tape machines were, but I can assure you that back in the day we had power systems that used vacuum tubes, and the tube space needed to be air cooled. The air temperature could reach several hundred Celsius if the fans stopped. Shortly after this would come the plop of inrushing air as the envelope of a KT88 collapsed at the hottest point. It would not be good design practice to series the units like this, but again back in the day thermal management wasn't even a black art. The l
  • by AnAdventurer ( 1548515 ) on Sunday August 15, 2010 @11:52AM (#33257020)
    When I was IT manager for a big retail mfg, we had a cross-country move from the SF bay area to TN (closer to shipping hubs and lower tax rates). I was hired for the new plant, and I was there setting up everything (I did not know the company knew next to nothing about technology), and the last thing shipped before the company shut down for the move was the data server, sent via 2-day FedEx. The CFO packed it up and shipped it out; as the driver pulled away from the bay, the server fell off the bumper and onto the cement. They picked it up (it looked undamaged in its box). When I opened it there was a shower of parts. A hard drive had detached from the case but not the cable, and had swung around inside like a flail. The CFO had NOT INSURED the shipment or taken anything apart. That and much more to save $50 here and there.
  • Data center power (Score:4, Interesting)

    by PPH ( 736903 ) on Sunday August 15, 2010 @12:22PM (#33257156)

    Back when I worked for Boeing, we had an "interesting" condition in our major Seattle area data center (the one built right on top of a major earthquake fault line). It seems that the contractors who had built the power system had cut a few corners and used a couple of incorrect bolts on lugs in some switchgear. The result of this was that, over time, poor connections could lead to high temperatures and electrical fires. So, plans were made to do maintenance work on the panels.

    Initially, it was believed that the system, a dually redundant utility feed with diesel gen sets, UPS supplies and redundant circuits feeding each rack could be shut down in sections. So the repairs could be done on one part at a time, keeping critical systems running on the alternate circuits. No such luck. It seems that bolts were not the only thing contractors skimped upon. We had half of a dual power system. We had to shut down the entire server center (and the company) over an extended weekend*.

    *Antics ensued here as well. The IT folks took months putting together a shut down/power up plan which considered numerous dependencies between systems. Everything had a scheduled time and everyone was supposed to check in with coordinators before touching anything. But on the shutdown day, the DNS folks came in early (there was a football game on TV they didn't want to miss) and pulled the plug on their stuff, effectively bringing everything else to a screeching halt.

    • Re: (Score:3, Interesting)

      by thegarbz ( 1787294 )
      Basic rules of redundancy. A UPS isn't!

      We had a similar situation to yours except we actually had a dual power system. The circuit breakers on the output however had very dodgy lugs on their cables which caused the circuit breakers to heat up, A LOT. This moved them very close to their rated trip current. When we eventually came in to do maintenance on one of the UPSes we turned it off as per procedure, naturally the entire load moved to the other. About 30 seconds later we hear a click come from a dis
  • ... that an idiot with his/her hand on a switch, a breaker or a power cord is more dangerous than even the worst computer bug.

    (Judging from the houses that I see on my way to work each morning, some people shouldn't even be allowed to buy PAINT without supervision. And we provide them with computers and access to the Internet nowadays!)

    (If that doesn't terrify you, you have nerves of steel.)

  • by eparker05 ( 1738842 ) on Sunday August 15, 2010 @12:52PM (#33257324)

    My mother, who is a database admin for a county office (and has been for a long time), was getting a tour of a brand new mainframe server in the basement of her department's building back in the early 80's. At some point during the tour a large red button was pointed out that controlled the water-free fire suppression system. When pressed it activated a countdown safety timer that could be deactivated when the button was pulled back out.

    Always wanting to try things for herself, she went to the red button at the end of the tour and pressed it. No timer was activated, instead a noticeable shutting down sound was heard as the buzzing of the mainframe died down. She accidentally hit the manual power-off button for the mainframe which was situated very close to the fire suppression button and happened to look similar.

    All the IT staff of that building got to go home early that day because the mainframe took several hours to reboot and it was already lunch. She was very embarrassed and I have heard that story many times.

  • by martyb ( 196687 ) on Sunday August 15, 2010 @01:03PM (#33257376)

    Ah, the memories! Here are some of the stories I've heard and/or witnessed over the years.

    1. Orientation: As a co-op student at DEC in 1980, I was told this (possibly apocryphal) story. On seemingly random occasions, a fixed-head disk drive would crash at the main plant in Maynard, Massachusetts. Not all of the drives, just a couple. Apparently the problem was isolated when someone was midway between the computer room and the loading dock. They heard the bump of a truck backing hard into the loading dock followed very shortly by a curse from the computer room! It apparently caused enough of a jolt to cause platters to tilt up and hit the heads... but only on the drives which were oriented north-south; those oriented east-west were not affected. So came the directive that all drives, henceforth, needed to be oriented north-south.
    2. Hot Stuff: Seems that a mini-computer developed a nasty tendency to crash in the early afternoon. But only on some days. Diagnostics were run. Job schedules were checked and evaluated. All the software and hardware checked out A-OK. This went on for quite a while until someone noticed that there was a big window to the outside and that in the early afternoon the sun's light would fall upon the computer. This additional heat load was enough to put components out of expected operational norms and caused a crash.
    3. Cool!: A friend of mine was a field engineer for DEC back in the day when minicomputers had core memory. He was called into a site where their system had some intermittent crashes. He ran diagnostics. All seemed to be within spec. He replaced memory boards. Still crashed. Replaced mother boards. Reloaded the OS from fresh tapes. Still crashed. He finally noticed that one of the fans on the rack was not an official DEC fan. Though it WAS within spec for airflow and power draw, it was NOT within spec for magnetic shielding... it would sporadically cause bit flips in the (magnetic) core memory. Swapping out the fan solved the problem.
    4. This sucked: Another place had a problem with a computer that would sometimes crash in the early evening after everyone went home for the day. Well, not everyone. The cleaning staff apparently noticed a convenient power strip on a rack and plugged their vacuum cleaner into it. The resulting voltage sag took down the server!
    5. Buttons: Every couple years, IBM would hold an open house where anyone in the community could come in and get a tour of the facility (Kingston, NY). This was back in 1984, IIRC. PCs were just starting to make an impact at this time... big iron was king. We're talking about a huge raised-floor area with multiple mainframes, storage, tape drives... MANY millions of dollars per system. A few hundred users on a system was quite an accomplishment back then and these boxes could handle a thousand users. We were also in the midst of a huge test effort of the next release of VM/SP. I had come in that Sunday afternoon to get several tests done (death marches are no fun). All of a sudden the mainframe I was on crashed. Hard. I'd grown accustomed to this as we were at a point where we were "eating our own dog food"; the production system was running the latest build of the OS. But, an hour later and it was STILL down. Apparently, a tour guide had led a group to one of the operator consoles and a child could not resist pressing buttons. Back in those days, booting a mainframe meant "re-IPL" Initial Program Load. Unless the computer was REALLY messed up and wouldn't boot. Only then would someone re-IML the system. Initial Microcode Load. Guess which button the kid pressed? It left the system in such a wonky state that it had to be reloaded from tape. All the development work of that weekend was lost and had to be recreated and rebuilt. (It was a weekend and backups were only done on weekday nights.) It took us a week to get things back to normal.
    6. Drivers: A friend of mine at IBM told me of an
    • Magic/More Magic (Score:3, Informative)

      by Dadoo ( 899435 )

      I can't believe no one's posted Guy Steele's Magic/More Magic story, yet:

              http://everything2.com/user/Accipiter/writeups/Magic [everything2.com]

  • Washer in the UPS (Score:5, Interesting)

    by Bob9113 ( 14996 ) on Sunday August 15, 2010 @01:43PM (#33257594) Homepage

    My favorite was at a big office building. An electrician was upgrading the fluorescent fixtures in the server room. He dropped a washer into one of the UPSs, where it promptly completed a circuit that was never meant to be. The batteries unloaded and fried the step-down transformer out at the street. The building had a diesel backup generator, which kicked in -- and sucked the fuel tank dry later that day. For the next week there were fuel trucks pulling up a few times a day. Construction of a larger fuel tank began about a week later.

  • by 1984 ( 56406 ) on Sunday August 15, 2010 @01:45PM (#33257604)

    I had one a few years back which highlighted issues with both our attention to the network behavior, and the ISP's procedures. One day the network engineer came over and asked if I knew why all the traffic on our upstream seemed to be going over the 'B' link, where it would typically head over the 'A' link to the same provider. The equipment was symmetrical and there was no performance impact, it was just odd because A was the preferred link. We looked back over the throughput graphs and saw that the change had occurred abruptly several days ago. We then inspected the A link and found it down. Our equipment seemed fine, though, so we got in touch with the outfit that was both colo provider and ISP.

    After the usual confusion it was finally determined that one of the ISP's staff had "noticed a cable not quite seated" while working on the data center floor. He had apparently followed a "standard procedure" to remove and clean the cable before plugging it back in. It was a fiber cable and he managed to plug it back in wrong (transposed connectors on a fiber cable). Not only was the notion of cleaning the cable end bizarre -- what, wipe it on his t-shirt? -- and never fully explained, but there was no followup check to find out what that cable was for and whether it still worked. It didn't, for nearly a week. That highlighted that we were missing checks on the individual links to the ISP and needed those in addition to checks for upstream connectivity. We fixed those promptly.

    Best part was that our CTO had, in a former misguided life, been a lawyer and had been largely responsible for drafting the hosting contract. As such, the sliding scale of penalties for outages went up to one month free for multi-day incidents. The special kicker was that the credit applied to "the facility in which the outage occurred", rather than just to the directly affected items. Less power charges (which weren't included in the penalty), the ISP ended up crediting us over $70K for that mistake. I have no idea if they train their DC staff better these days about well-meaning interference with random bits of equipment.

    • Re: (Score:3, Informative)

      by Jayfar ( 630313 )

      After the usual confusion it was finally determined that one of the ISP's staff had "noticed a cable not quite seated" while working on the data center floor. He had apparently followed a "standard procedure" to remove and clean the cable before plugging it back in. It was a fiber cable and he managed to plug it back in wrong (transposed connectors on a fiber cable). Not only was the notion of cleaning the cable end bizarre -- what, wipe it on his t-shirt? -- and never fully explained, but there was no followup check to find out what that cable was for and whether it still worked. It didn't, for nearly a week.

      Actually there's nothing odd about cleaning a fiber connection at all and it is a very exacting process (see link below). Apparently exacting in this case just didn't include re-inserting the ends in the right holes.

      Inspection and Cleaning Procedures for Fiber-Optic Connections
      http://www.cisco.com/en/US/tech/tk482/tk876/technologies_white_paper09186a0080254eba.shtml [cisco.com]

      • Re: (Score:3, Informative)

        by 1984 ( 56406 )

        That's what I was getting at -- it's not as if it's a simple case of blowing on the end to clear out some fluff. Detailed procedures, including not least unplugging the other end of said cable to make sure it's unlit, which would include finding said other end. And likely going to get the various items required for the cleaning procedure. Which would add up at least to a conversation or two, and perhaps one with us the customer discussing the topic. I'm not disagreeing with cleaning of fiber cables sometimes

  • Fun with PIX (Score:3, Insightful)

    by mkiwi ( 585287 ) on Sunday August 15, 2010 @01:52PM (#33257628)

    I had fun with a company a while back. They have about 300 employees and do ~$90 million/year, so this is a small corporation.

    Anyway, the company was trying to get a VPN tunnel established to their China office, and they were having a hell of a time at it. The employees on the China side had no IT experience so everything was done remotely.

    It just so happens that one of the Chinese employees was recruited to make a change to the PIX firewall on the China side in order to get everything working. To our astonishment, it worked, and we had a secure VPN tunnel established.

    The problem was that accounts in the US started to get locked out, alphabetically, every 30 minutes. Our Active Directory was getting tons of password-cracking attempts from inside our internal network. I was using LDAP to develop an application at the time, so naturally I was suspected of causing all these lockouts.

    Fast-forward a week. We looked at the configuration of the Chinese firewall and it allowed all access from any IP address on the Chinese side. In other words, crackers were trying to get into our systems through our VPN tunnel in China. In effect, our corporate LAN had been directly connected to the Internet. Once we figured that out, I was free to go back to work and the network lived to see another day, but that incident caused major trouble for all our employees.

    Moral of the story: Don't trust a Chinese firewall.

  • by gagol ( 583737 ) on Sunday August 15, 2010 @02:14PM (#33257732)
    I was employed at a 50-employee publicity company. They have a couple of offices across the country and need to share a filesystem through WAFS. The main repository for the WAFS was running off a USB drive, connected to the server by a wire that was too short. I pointed out the problem to my IT boss (no IT background whatsoever) multiple times without success, tried to raise the issue with the owner of the company, without success, and one day the worst happened. The USB controller of the drive fried and we lost the last day of work. The Windows server system went AWOL. It took an external consultant 3½ days to rebuild the main server, which was running AD, WAFS, Exchange and our enterprise database. It cost us an account worth $12 MILLION.

    The big boss then hired consultants and paid them over a thousand bucks to be told the exact same thing I had pointed out 3 months earlier when I audited the IT infrastructure. Two months later she came to me and asked how much it would cost to have a bullet-proof infrastructure. I told her to invest around 80K in a virtualisation solution with scripts to move VMs around when workload changes, and to go with consolidated storage with live backups and replication. It was too expensive. Another three months passed, she hired some consultants, gave them another thousand dollars to be told basically the same thing I'd told her 3 months earlier... That is when I quit.
  • by Bruha ( 412869 ) on Sunday August 15, 2010 @08:49PM (#33259802) Homepage Journal

    I don't care where you work: if you're on site doing training, you're probably also getting sucked back into the work cycle. I see it all the time at work, so I have always preferred offsite training, with the cell phones turned off. It also helps if you have to use your laptop in the lab, because 99% of the time it means you cannot VPN into work, so email isn't a concern either.

    I think other data center operators would agree we're all understaffed, and I work on a network with hundreds of millions of customers using it on a 24/7 cycle. The other danger nobody speaks of is that some companies are too passive when it comes to testing redundancy: half the time, while there's redundancy in the system to keep a DMZ up and running, there's no spare DMZ capacity to handle a true outage such as a fiber ring failure that isolates the data center, or some other disaster. Companies need to design their redundancy so you can unplug the entire data center and your customers never know it, because if you don't, you will rue the day a true outage impacts the entire data center and you hear about it on the news later. Not a good thing.

"Protozoa are small, and bacteria are small, but viruses are smaller than the both put together."

Working...