
Why You Shouldn't Reboot Unix Servers

GMGruman writes "It's a persistent myth: reboot your Unix box when something goes wrong or to clean it out. Paul Venezia explains why you should almost never reboot a Unix server, unlike, say, a Windows server."
  • Uh.. no (Score:5, Informative)

    by Anrego ( 830717 ) * on Monday February 21, 2011 @01:48PM (#35269474)

    I for one believe in frequent-ish reboots.

    I agree it shouldn't be relied upon as a troubleshooting step (you need to know what broke, why, and why it won't happen again). That said, if you go years without rebooting a machine, there is a good chance that when you finally do (to replace hardware, for instance) it won't come back up without issue. Verifying that the system still boots correctly is, imo, a good idea.

    Also, all that fancy high availability failover stuff... it's good to verify that it's still working as well.

    The "my servers been up 3 years" e-pene days are gone folks.

  • Re:Persistent myth? (Score:5, Informative)

    by SCHecklerX ( 229973 ) <greg@gksnetworks.com> on Monday February 21, 2011 @01:52PM (#35269518) Homepage

    Windoze admins who are now in charge of Linux boxen. I'm now cleaning up after a bunch of them at my new job. *sigh*

    - root logins everywhere
    - passwords stored in the clear in LDAP (WTF??)
    - HTTPS required instead of HTTP to devices, yet telnet access still enabled
    - sudo set up ... to allow everyone to do everything
    - iptables rulesets that allow all outbound traffic from all systems, allow ICMP everywhere, etc. (a tighter egress sketch follows below)
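    As a rough sketch of tightening that last point, assuming Linux iptables (the addresses are hypothetical TEST-NET values; real rules depend on what each host actually needs):

        # Default-deny outbound instead of allow-all.
        iptables -P OUTPUT DROP
        iptables -A OUTPUT -o lo -j ACCEPT
        iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
        # Then permit only known destinations, e.g. the site resolver and a mirror:
        iptables -A OUTPUT -p udp -d 192.0.2.53 --dport 53 -j ACCEPT
        iptables -A OUTPUT -p tcp -d 192.0.2.10 --dport 80 -j ACCEPT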

  • by Anonymous Coward on Monday February 21, 2011 @01:56PM (#35269572)

    Often, system upgrades (e.g. security fixes) include new versions of libraries. It's impossible for the package manager to know which processes are using those libraries, so it can't automatically restart everything. Consider custom processes: the package manager wouldn't even know they exist.

    Therefore you have to do it manually, but then you hit the same problem: it's damn hard to know which processes are using the libraries that were upgraded, and really, really hard on a big server running hundreds or thousands of processes. Often it's easier just to reboot so you know everything is running the current version of every library. If you don't, you can't be sure the security fixes are actually in effect, since processes keep the old versions of the libraries mapped in RAM.
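    As a rough illustration (Linux-specific; lsof's column layout varies between versions, so treat the parsing as a starting point, and run it as root to see every process):

        # Find processes still mapping deleted (i.e. replaced) shared libraries;
        # Linux lsof marks such mappings with FD type "DEL".
        lsof -nP 2>/dev/null | grep ' DEL ' | grep '\.so' | awk '{print $1, $2}' | sort -u

        # Or scan /proc directly, without lsof:
        grep -l '(deleted)' /proc/[0-9]*/maps 2>/dev/null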

  • Re:HP-UX says... (Score:5, Informative)

    by sribe ( 304414 ) on Monday February 21, 2011 @02:12PM (#35269768)

    Seriously. I don't know what HP is doing, but NFS hangs and stuck processes that you can't kill -9 your way out of are just wrong.

    Kind of a well-known, if very old, problem. From Use of NFS Considered Harmful [time-travellers.org]:

    k. Unkillable Processes

    When an NFS server is unavailable, the client will typically not return an error to the process attempting to use it. Rather, the client will retry the operation; at some point it will give up and return an error to the process.
    In Unix there are two kinds of devices, slow and fast. The semantics of I/O operations vary depending on the type of device. For example, a read on a fast device will always fill a buffer, whereas a read on a slow device will return any data ready, even if the buffer is not filled. Disks (even floppy disks or CD-ROMs) are considered fast devices.

    The Unix kernel typically does not allow fast I/O operations to be interrupted. The idea is to avoid the overhead of putting a process into a suspended state until data is available, because the data is always either available or not. For disk reads, this is not a problem, because a delay of even hundreds of milliseconds waiting for I/O to be interrupted is not often harmful to system operation.

    NFS mounts, since they are intended to mimic disks, are also considered fast devices. However, in the event of a server failure, an NFS disk can take minutes to return success or failure to the application. A program using data on an NFS mount can thus remain in an uninterruptible state until the final timeout occurs.

    Workaround: Don't panic when a process will not terminate from repeated kill -9 commands. If ps reports the process is in state D, there is a good chance that it is waiting on an NFS mount. Wait 10 minutes, and if the process has still not terminated, then panic.
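    A quick way to spot that state, assuming GNU ps (the field widths are arbitrary):

        # List processes in uninterruptible sleep (state D); the wchan column
        # hints at the kernel wait channel, which for NFS hangs often names an
        # nfs or rpc routine.
        ps -eo pid,stat,wchan:25,comm | awk '$2 ~ /^D/'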

  • Re:Uptime (Score:4, Informative)

    by 19thNervousBreakdown ( 768619 ) <davec-slashdot@@@lepertheory...net> on Monday February 21, 2011 @02:35PM (#35270108) Homepage

    They're made of considerably smaller platters, so there's much less gyroscopic force (or whatever the fuck it's called). They spin down within minutes of being idle on most laptops, and every laptop these days comes with an accelerometer-based parking utility that stops the drive, no matter what it's doing, if there's too much force. They're almost certainly configured to be over-conservative from the factory; generally it's difficult to even carefully pick a laptop up without it parking the drive.

  • Re:Persistent myth? (Score:4, Informative)

    by Waffle Iron ( 339739 ) on Monday February 21, 2011 @03:11PM (#35270566)

    And yes, either one works, but '\\' is not necessary and it's a POS pattern that too many people follow because they don't or can't read the docs.

    Here's a snippet from Microsoft's own current MSDN example on the PathMatchSpec() API call:

        ...
        void main(void)
        {
            // String path name 1.
            char buffer_1[ ] = "C:\\Test\\File.txt";
            char *lpStr1;
            lpStr1 = buffer_1;
        ...

    Gee, I wonder where these people get their path separator ideas? Maybe it's because they *did* read the docs.
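    To make the dispute concrete, a minimal C sketch (Windows-only; the path is hypothetical). "\\" in C source is just the escape for a single backslash, and the Windows CRT also accepts '/' as a separator, so both calls below name the same file:

        #include <stdio.h>

        int main(void)
        {
            /* Runtime path is C:\Temp\example.txt in both cases. */
            FILE *a = fopen("C:\\Temp\\example.txt", "r");  /* escaped backslashes */
            FILE *b = fopen("C:/Temp/example.txt", "r");    /* forward slashes */
            printf("backslash open: %s\n", a ? "ok" : "failed");
            printf("slash open:     %s\n", b ? "ok" : "failed");
            if (a) fclose(a);
            if (b) fclose(b);
            return 0;
        }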

  • Re:Persistent myth? (Score:2, Informative)

    by John Hasler ( 414242 ) on Monday February 21, 2011 @04:40PM (#35271558) Homepage

    ...only Apple has had theirs certified by the Open Group, which makes it not just Unix but Unix(tm).

    No. That makes it Unix(tm) but not Unix. With a hacked Mach kernel, a modified BSD userland, and a totally custom GUI, it is considerably less like Unix than Linux is. BSD, on the other hand, is a direct descendant of Seventh Edition Unix. The fact that the Open Group was willing to sell Apple a trademark license shows just how worthless that trademark is.

  • Re:Persistent myth? (Score:3, Informative)

    by TheHedonismBot ( 1856060 ) on Monday February 21, 2011 @04:53PM (#35271722)

    Maybe. I see what you are saying, but as a counter-example, I sometimes run tcpdump from within my home directory when troubleshooting problems. tcpdump has to run as superuser, and I have a lot more faith in giving myself and other admins permission to run "sudo tcpdump" than running tcpdump setuid 0. Again, maybe I'm just missing something, but I really don't have a huge problem with tcpdump (or other admin tools) writing UID 0 data to an admin user's home directory.

    You don't have to be root to use tcpdump. On ubuntu, do this:

    sudo aptitude install libcap2-bin
    sudo setcap cap_net_raw,cap_net_admin=eip `which tcpdump`

    If you run getcap `which tcpdump` and it shows:

        /usr/sbin/tcpdump = cap_net_admin,cap_net_raw+eip

    then you're good to go. Now try running tcpdump as a regular user.

  • by The Moof ( 859402 ) on Monday February 21, 2011 @05:47PM (#35272238)

    Why should I bother disabling it?

    Generally, good administrators disable services that aren't wanted or needed on their systems. Who's to say a vulnerability in that service won't be discovered down the road (*coughSolaris [cert.org]cough*) that would leave you exposed?
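    A sketch of that habit on a 2011-era Linux box (the service name is hypothetical; tooling varies by distribution):

        # See what is actually listening before deciding what to cut.
        netstat -tlnp

        # Then stop and disable anything unneeded (Red Hat-style chkconfig shown):
        service somethingd stop
        chkconfig somethingd off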
