Mark Russinovich On Vista Network Slowdown
koro666 writes "In his latest blog post, Mark Russinovich analyzes the network slowdown experienced by some users when playing multimedia content. 'Tests of MMCSS during Vista development showed that... heavy network traffic can cause enough long-running DPCs to prevent playback threads from keeping up with their media streaming requirements, resulting in glitching. MMCSS' glitch-resistant mechanisms were therefore extended to include throttling of network activity. It does so by issuing a command to the NDIS device driver... [to] pass along, at most 10 packets per millisecond (10,000 packets per second)... [T]he networking team is actively working with the MMCSS team on a fix that allows for not so dramatically penalizing network traffic, while still delivering a glitch-resistant experience.'"
Failed engineering (Score:4, Insightful)
Re:Well that's just not true (Score:2, Insightful)
But don't let logic or common sense get in your way.
Re:Okay... (Score:2, Insightful)
Re:Failed engineering (Score:5, Insightful)
Dumb dumb dumb (Score:5, Insightful)
"Because the standard Ethernet frame size is about 1500 bytes, a limit of 10,000 packets per second equals a maximum throughput of roughly 15MB/s. 100Mb networks can handle at most 12MB/s, so if your system is on a 100Mb network, you typically won't see any slowdown. However, if you have a 1Gb network infrastructure and both the sending system and your Vista receiving system have 1Gb network adapters, you'll see throughput drop to roughly 15%."
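For anyone who wants to check the arithmetic in that quote, here is a minimal sketch. It assumes only the figures quoted (1500-byte frames, 10 packets per millisecond) plus nominal line rates with 1 Mb taken as 1,000,000 bits; the function names are made up for illustration. With nominal rates the gigabit case works out to about 12% of line rate, in the same ballpark as the quoted "roughly 15%".

```python
# Figures taken from the quoted passage: 1500-byte frames and a cap of
# 10 packets per millisecond. Link speeds are nominal line rates (assumption).
FRAME_BYTES = 1500
CAP_PACKETS_PER_SEC = 10 * 1000


def cap_bytes_per_sec(frame_bytes=FRAME_BYTES, pps=CAP_PACKETS_PER_SEC):
    """Maximum throughput the packet cap allows, in bytes per second."""
    return frame_bytes * pps


def link_bytes_per_sec(megabits):
    """Nominal link rate in bytes per second (1 Mb = 1,000,000 bits)."""
    return megabits * 1_000_000 / 8


def fraction_of_link(megabits):
    """Share of the nominal link rate still usable under the cap (at most 1.0)."""
    return min(1.0, cap_bytes_per_sec() / link_bytes_per_sec(megabits))


for mbit in (100, 1000):
    print(f"{mbit} Mb link: cap allows {fraction_of_link(mbit):.0%} of line rate")
```

Protocol overhead and real-world efficiency are ignored here, so treat the percentages as upper bounds rather than measurements.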
That is one of the dumbest things I have heard in a while. Let's see:
What an over-engineered non-solution to what should have been a non-problem in the first place. Microsoft is supposed to employ some of the smartest engineers in the world: can none of them optimise their code?
Wow... (Score:5, Insightful)
It is these sorts of things and things like the teams and teams debating the "Shutdown Menu" in Vista that are really showing Microsoft needs to really change if they are going to survive. It amazes me how a bunch of open source developers with all their own agendas do a better job than a bunch of folks all paid by the same company. Of course, then there is Apple as an example of a group that shows you can pull it off and still all look like the same organisation.
Re:Well that's just not true (Score:5, Insightful)
However, this actually does make sense. In all honesty, they probably would have worked on a better answer than cutting back on networking, but with the time crunch on releasing it, they probably cut corners here and there (and by probably, I mean definitely and by here and there, I mean everywhere). They probably viewed this as an acceptable cut for the time being because a majority of users use very little of their networking bandwidth. If it's just a PC connected to the internet, they'd most likely never notice. The only time this would be an issue is for heavy network usage, which would normally only occur on work-related machines because let's face it, aside from geeks and techies, not many people have systems set up that max out their network bandwidth, so, if they were work-related machines, well, they probably wouldn't be playing that much music to begin with.
I'm not a MS shill, though I don't assume everything they do has evil intentions. We have to admit that they are great code writers, just not the best. Just because they do shady things here and there (mostly in business practices however) doesn't mean everything they do is evil. This was a problem they ran into and they made a workaround that would only affect a relatively small amount of their users. They were probably hoping no one would notice it at all until they either A) had a fix or B) just let it go because maybe no one would notice it.
Remember, this wouldn't really slow down your internet unless you have an *extremely* high bandwidth and even then, bottlenecks on the information before reaching you would probably still mask the problem. This is only an issue on systems that have heavy network usage on some sort of intranet or other type of local area network, because these would account for the majority of networks that could even use a decent amount of your possible networking bandwidth.
Re:Okay... (Score:5, Insightful)
Windows XP -- can play an MP3 file and video file at the same time with no reduction in network speed.
Vista -- same computer, same hardware, -- major reduction in network speed.
In other words, Microsoft tried to "fix" something that wasn't broken.
Re:Failed engineering (Score:5, Insightful)
Fix for this can be downloaded from here. (Score:1, Insightful)
Re:Dumb dumb dumb (Score:1, Insightful)
It's incorrect. Did no one even bother to calculate the drop-off? Was there not one single engineer amongst them who ever said "Hey, you know, Gigabit is pretty popular these days."?
It should be unnecessary. Why does standard media playback and networking require so much power that there is not enough time to schedule both of them correctly?
It is wrong. Why is media playback more important than network performance? If the network is heavily loaded, well gee, maybe there's a reason for that?
Tag: defectivebydesign (Score:0, Insightful)
New definition of a Kludge here, I think (Score:4, Insightful)
As goes without saying, arbitrarily throttling one particular task, at some arbitrary level, is the wrong thing.
Perhaps this could go in Wikipedia under "Kludge"?
Re:Failed engineering (Score:5, Insightful)
Vista's worst engineering decision is to make a system optimized for restrictions and money-farming, not for user experience. The WGA breakdown is the best example. The legitimate users who paid a ridiculous sum to use Vista's 'ultimate' features (you know, the ones which are free in Linux and at least standard in MacOSX) had their systems crippled, and the pirates who bypassed WGA were not even affected. The whole feature does exactly the opposite of what it was supposed to do. That's failed engineering, any way you look at it.
Re:Wow... (Score:3, Insightful)
Re:Okay... (Score:2, Insightful)
Hang on a minute... (Score:5, Insightful)
Re:Dumb dumb dumb (Score:5, Insightful)
And this seems like a strange conclusion to jump to...especially coming from Mark.
maybe I am just confused, but the NDIS driver handles sending and receiving of pkts, so is the pkt rate limited to 10,000 pps coming and going? (he mentions packets received by network adapter drivers, but I am still curious). if it is limited to 10,000 pps in either direction...then the theoretical limit comes down by quite a bit.
Even at that, he is assuming full sized packets, which is a bit of a stretch, there is a good chance that not all of them will be the full 1500 bytes, factor in broadcast traffic, and other crud which may be running...and you start seeing a noticeable drop even on a 100mbit connection.
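To put a rough number on the packet-size point: a minimal sketch, assuming the 10,000 pps cap from the article applies to every packet regardless of size and using nominal line rates. The function names are hypothetical. Under those assumptions the cap starts to bite on a 100mbit link as soon as the average packet drops below 1250 bytes.

```python
CAP_PPS = 10_000  # packets per second, the cap quoted in the article


def capped_throughput(avg_packet_bytes, link_megabits):
    """Bytes per second achievable: the lower of line rate and the packet cap."""
    link = link_megabits * 1_000_000 / 8
    return min(link, CAP_PPS * avg_packet_bytes)


def break_even_packet_bytes(link_megabits):
    """Average packet size below which the cap, not the link, is the limit."""
    return link_megabits * 1_000_000 / 8 / CAP_PPS


print(break_even_packet_bytes(100))   # average size where a 100mbit link gets capped
print(capped_throughput(500, 100))    # throughput with 500-byte average packets
```

This ignores headers and inter-frame gaps, so the real break-even point would sit a little differently, but the direction of the argument holds.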
Re:Dumb dumb dumb (Score:5, Insightful)
Actually, this is 2007, with stupidly fast processing, memory levels, and network throughput. There's no reason whatsoever that either effect should be showing up when both activities are happening at the same time.
And it's not "slight network performance penalties". It's ridiculously harsh network performance penalties.
Re:Failed engineering (Score:2, Insightful)
Cause no one needs more than 100mb, YET. I don't care that my network is slowing down, it won't slow down enough to hamper my internet connection.
And then again... (Score:5, Insightful)
Microsoft has a long history of hardcoding stuff without thinking of power users. Remember the 10-limit for open TCP connections per program? They did this because viruses and malware open many TCP connections. "Hey, what about P2P?" "What's P2P?".
Re:Failed engineering (Score:5, Insightful)
Furthermore, in the workplace, many people listen to music and access large files on network shares. Clearly, Vista is *broken* for these uses. Not a good indication of Vista being business ready.
Frankly, I don't know why Windows is considered the best business OS. You're much better off with a unixy OS in any environment where gaming isn't important.
Re:Failed engineering (Score:5, Insightful)
One thing I don't get is how he managed 41.61% CPU utilization [technet.com] while transferring a file. Did he have the ethernet equivalent of a winmodem?
Re:Failed engineering (Score:4, Insightful)
it's a far better user experience than windows XP. if they did put some DRM related stuff in there, I haven't noticed, nor will 99.99% of its userbase.
Jesus, have *you* used vista? The intended user experience could be orgasmic, but I'll be damned if I can get the thing stable given the state of drivers for my vista approved hardware.
In a year it may be better than XP ( and at best, marginally so ), but right now it's hit and miss.
How short-sighted? (Score:3, Insightful)
Re:Engineering for profit vs. for improvement (Score:2, Insightful)
Did you read the article? It was obviously tweaked to improve the "user experience"; the painful difference between OSS and this being that Microsoft arbitrarily decided for all of Vista's users what "user experience" they would like to experience (i.e., skipless media playback as opposed to maximum network performance). There were bugs in Microsoft's solution, but there are also bugs in OSS.
OSS projects, however, are (usually) much less dictatorial in deciding what the user wants; they can't be, actually, because if he doesn't like what they give him, he can just fork-and-run.
Re:Dumb dumb dumb (Score:3, Insightful)
Re:And then again... (Score:5, Insightful)
Better yet, allow "throttling as needed if multimedia buffers run low". That would allow unimpaired network performance in systems with enough CPU power.
But then again, that would have required early planning to include the necessary feedback in audio and graphics drivers. I speculate that the problem was discovered late in the development of Vista, and since nobody wanted to be responsible for another delay of Vista's release, some quick hack was applied.
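The "throttle only when buffers run low" idea could look something like the sketch below. This is purely illustrative and not how MMCSS or NDIS work: the function name, the watermarks, and the ceiling value are all assumptions, and only the floor of 10 packets per millisecond comes from the article.

```python
def packet_budget(buffer_fill, low=0.25, high=0.75, floor=10, ceiling=100):
    """Packets allowed per millisecond, given playback buffer fill (0.0 to 1.0).

    Hypothetical policy: no limit while the buffer is comfortably full, the
    article's hard cap when it is nearly empty, and a linear ramp in between.
    Returns None to mean "do not throttle".
    """
    if not 0.0 <= buffer_fill <= 1.0:
        raise ValueError("buffer_fill must be between 0.0 and 1.0")
    if buffer_fill >= high:
        return None  # plenty of buffered media: leave the network alone
    if buffer_fill <= low:
        return floor  # playback about to glitch: apply the hard cap
    span = (buffer_fill - low) / (high - low)
    return int(floor + span * (ceiling - floor))


for fill in (0.9, 0.5, 0.1):
    print(fill, packet_budget(fill))
```

The point of the design is that a machine with CPU to spare never sees the cap at all, which is exactly what the fixed limit fails to deliver.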
Re:Dumb dumb dumb (Score:4, Insightful)
They know why.. it's the kernel-mode encryption required to send audio to the card. There's two engineering failures here:
1. thread locking is accomplished by raising the interrupt level to DPC (KeAcquireSpinLock)
2. Requiring several steps/levels of encryption to interact with the audio card.
The real issue is a combination of utilizing DPC interrupts for basic thread locking (which thrashes the scheduler during long halts) and encryption (which requires long halts).
The real fallacy, IMHO, is that MS thinks that because it's in kernel mode that it's immune or safer from attacks - so they created lots of "security features" in the kernel. In many ways this makes attacks much simpler - as you can simply move your code into kernel mode which has fewer limits than user mode!
Corporate Lingo (Score:2, Insightful)
P2P? (Score:2, Insightful)
From the article... (Score:2, Insightful)
Besides activity by other threads, media playback can also be affected by network activity. When a network packet arrives at system, it triggers a CPU interrupt, which causes the device driver for the device at which the packet arrived to execute an Interrupt Service Routine (ISR). Other device interrupts are blocked while ISRs run, so ISRs typically do some device book-keeping and then perform the more lengthy transfer of data to or from their device in a Deferred Procedure Call (DPC) that runs with device interrupts enabled. While DPCs execute with interrupts enabled, they take precedence over all thread execution, regardless of priority, on the processor on which they run, and can therefore impede media playback threads.
They're saying that every packet received causes an interrupt request, which causes the CPU to get loaded at high transfer speeds.
Apparently they haven't heard of interrupt moderation [google.com] or polling [google.com], technologies that are used by network cards to offload the CPU.
Even my Marvell semi-hardware (I think) Gigabit on-board network card used about 14% CPU (Barton 1833MHz) when transferring files at about 45Mbps.
I don't know, everything seems really stupid, and I'm not sure it's just a "bug", or their description is just a part of what really happens behind the scene.
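A back-of-the-envelope sketch of why interrupt moderation matters, assuming evenly spaced packets and a NIC that raises at most one interrupt per coalescing interval. The 100 microsecond interval and the function names are illustrative assumptions, not taken from any particular card or driver.

```python
def interrupts_per_sec(packets_per_sec, coalesce_us):
    """Interrupt rate for a NIC that batches packets for coalesce_us microseconds.

    coalesce_us == 0 models one interrupt per packet (no moderation).
    """
    if coalesce_us == 0:
        return packets_per_sec
    return min(packets_per_sec, 1_000_000 // coalesce_us)


# Roughly gigabit line rate with 1500-byte frames: 125,000,000 / 1500 packets/s.
GIGABIT_PPS = 125_000_000 // 1500

no_moderation = interrupts_per_sec(GIGABIT_PPS, 0)
moderated = interrupts_per_sec(GIGABIT_PPS, 100)
print(no_moderation, moderated, GIGABIT_PPS / moderated)
```

With those assumed numbers the card delivers several packets per interrupt instead of one, so the interrupt load drops without dropping a single packet.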
Re:Russinovich (Score:3, Insightful)
Re:Hard coded numbers (10k packets/sec)? (Score:3, Insightful)
On Linux, with the CFS and/or SD schedulers, if your nice levels are set correctly, sound (MP3) will play just fine with your processor(s) pegged at 100%. Heck, forget about sound; you can run multiple Quake 4s with high-speed LAN transfers in the background, and everything works just fine (network transfers slow down slightly, Quake 4's FPS scales down linearly with the number of sessions running, but there are no "hitches" or "glitches", and everything runs smoothly).
A common Microsoft approach to problems with Windows is to create a new daemon (oh, excuse me, Service) that "regulates" the offending behavior. This is not the correct way to fix these problems; rather, there are underlying issues that need to be resolved.
You say:
The MMCSS is for improving multimedia performance on EXTREMELY heavily-loaded processors. I use XP, and my PC is occasionally heavily loaded with a dozen threads, and in those cases I occasionally experience glitches. Thus, I have to manually adjust thread priorities, but it's annoying anyway.
I say it's not about manually adjusting thread priorities, or creating a Service that will automatically (dynamically or not) do that for you. Rather, you should have a kernel that better manages multitasking in processor-starved scenarios. There's no reason that a particular program running at a particular nice level shouldn't demand a minimum CPU percentage, which for stuff like playing MP3s cannot possibly be much.
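To make the nice-level argument concrete, here is a rough sketch of how a weighted fair scheduler such as CFS turns nice levels into CPU shares when every task wants the CPU at once. It assumes the commonly described approximation that nice 0 has weight 1024 and each nice step changes the weight by a factor of about 1.25; the kernel's real table is hand-tuned, and the task names are made up.

```python
def cfs_weight(nice):
    """Approximate scheduler weight for a nice level (nice 0 -> 1024)."""
    return 1024 / (1.25 ** nice)


def cpu_shares(nice_levels):
    """Fraction of one fully loaded CPU each task gets, proportional to weight."""
    weights = {name: cfs_weight(n) for name, n in nice_levels.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical workload: an MP3 player at nice -5 against a dozen CPU hogs.
tasks = {"mp3_player": -5}
tasks.update({f"hog_{i}": 0 for i in range(12)})

shares = cpu_shares(tasks)
print(f"mp3_player share: {shares['mp3_player']:.1%}")
print(f"each hog share:   {shares['hog_0']:.1%}")
```

The player's share is a guaranteed minimum under contention, not a reservation: if it needs less, the leftover goes to everything else.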
Re:And then again... (Score:4, Insightful)
Most coders don't want to add a registry setting. Most users don't want to touch it.
There's obviously just something wrong with their big picture view if they can't get this shit straight. It's probably because the network and multimedia teams are separate and don't know what the other's doing.
Re:Failed engineering (Score:2, Insightful)
Yea, it's totally reasonable to think that it takes more faith to exist in the real world than it does to believe in ghosts and boogie-men.
Re:Failed engineering (Score:3, Insightful)
>you begin to appreciate what Microsoft has accomplished with windows
I've always assumed there's more to it than just "Windows sucks", but I've never had the time to learn about how Windows and Linux work more in-depth so I can meaningfully compare them (nor will I anytime soon).
Care to give an example or two of things Windows gets right?
Mod parent redundant ;-) (Score:3, Insightful)
So, basically the GP poster was right: 1% goes to WGA.
Re:Failed engineering (Score:3, Insightful)
I can just understand Microsoft not being aware of that scenario - except you can guarantee that EVERY SINGLE MICROSOFT EMPLOYEE does just that. So nobody thought to test that scenario - that was just dumb of Microsoft.
The real question is why the engineers involved didn't understand the size of the impact on performance. I mean, if you made the mod in order to avoid network performance screwing up media playback, then why didn't they explicitly test the degree of impact on the networking performance AS WELL AS the media playback? They were explicitly degrading network performance in favor of media playback. Why didn't they SEE the performance hit?
So one has to conclude that this is correct: they simply didn't test it. They just tested the media playback - if in fact they tested that at all.
I'm reminded of the post at a Microsoft employee's blog last year where a member of the Vista testing team explicitly said that setting up tests was a nightmare that took most of a week - and then when Vista failed the tests horribly, management would STILL sign off on the components as having passed.
And this is the obvious result of that process.