Slashdot
Microsoft To Banish Memcpy()
Posted by
kdawson
on Fri May 15, 2009 10:26 AM
from the good-riddance dept.
kyriacos notes that Microsoft will be adding memcpy() to its list of function calls banned under its secure development lifecycle. This reader asks, "I was wondering how advanced C/C++ programmers view this move. Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
No - there are plenty of safer alternatives (Score:5, Insightful)
Lame story (Trying for flamebait here?)
Re:No - there are plenty of safer alternatives (Score:4, Insightful)
safe versions - if you prefer to blindly program away, not worrying about where your objects end up in memory. But - what is "safe"? Is there any replacement for properly testing all I/O from all possible sources?
Parent
Re:No - there are plenty of safer alternatives (Score:5, Informative)
That's physically impossible, even given infinite time. Read up on the halting problem.
However, programming a framework in which we may rule out certain things, for example a process jumping over and altering the OS, is perfectly possible. It just has to be verified through reasoning, rather than testing. The unit testing methodology is really the problem here. You cannot unit test everything.
Don't get me wrong, testing is a good start, but it's no proof of security, and a proof of security, while very hard, is possible. Kudos to Microsoft.
And to expand on the GP for those that didn't RTFA, they replaced memcpy with a version that forces you to state the size of the destination buffer, which is a constant-time operation, and a much needed one. So this only forces C coders to make their code a little more clear.
And when you're being intentionally unclear to the computer in addition to the reader, your code has no place in a secure production setting.
Parent
Re:No - there are plenty of safer alternatives (Score:5, Funny)
Parent
Re:No - there are plenty of safer alternatives (Score:5, Informative)
Just like removing printf, scanf, and most other copy/string functions. There are safe versions of memcpy that work just fine and are just as easy to use...
There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).
strcpy() is unsafe because you don't know how many bytes you are going to be copying. strncpy() is completely safe as long as you aren't brain dead and set the 'n' to the size of the destination buffer (as opposed to strlen(src) which would be brain dead) and then slap an '\0' into the last index of the dest. sprintf, same deal, just use snprintf and tell it the max bytes it can print.
So what's unsafe about memcpy()? You explicitly specify the number of bytes to copy. If that number of bytes is greater than the known size of the destination buffer, then you've got a problem that simply adding a second 'size of dest' parameter to the copy won't fix, because you already screwed the pooch on figuring that out, now didn't you?
Yes memcpy() doesn't work if src and dest overlap. When that's happening, you typically know about it (you've got some clever in-situ array modification going on) and can use memmove(). memmove(), on the other hand, is equally unsafe if you can't properly specify the number of bytes to copy.
Bottom line: There's no such thing as a "safe" copy in C when we're assuming the programmer can't figure out the destination buffer size.
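To illustrate the overlap case mentioned above, here is a minimal sketch. The helper name shift_left is made up for this example. It drops elements from the front of an array in place, so source and destination overlap and memmove() is the right call. Note that the byte count is still entirely the caller's responsibility.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop the first `n` ints from `arr` (which holds
 * `len` ints) by sliding the remaining elements down. The two regions
 * overlap, so this uses memmove() rather than memcpy().
 * Returns the new element count. */
static size_t shift_left(int *arr, size_t len, size_t n)
{
    if (n >= len)
        return 0;                       /* nothing left to keep */
    memmove(arr, arr + n, (len - n) * sizeof arr[0]);
    return len - n;
}
```

Calling shift_left(a, 5, 2) on {1, 2, 3, 4, 5} leaves 3, 4, 5 in the first three slots.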
Parent
Re:No - there are plenty of safer alternatives (Score:4, Informative)
There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).
%n writes to the stack. It's disabled by default in VS2005 onwards. More at http://weblogs.asp.net/george_v_reilly/archive/2007/02/06/printf-n.aspx and http://julianor.tripod.com/bc/formatstring-1.2.pdf
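A small sketch of the "fixed format string" rule discussed in this subthread, using snprintf() so the result can be inspected. The helper name render_user_text is an assumption for illustration only. Because the format is a literal "%s", any conversion specifiers inside the untrusted text are copied as plain characters and never interpreted.

```c
#include <stdio.h>

/* Hypothetical helper: render untrusted text into `out`.
 * The format string is a fixed literal, so specifiers inside
 * `user_input` (such as %n or %s) are treated as ordinary text.
 * The risky form would be snprintf(out, out_size, user_input). */
static int render_user_text(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}
```

Passing the string 100%n%s through this helper yields the same seven characters back, unchanged.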
Parent
Re:No - there are plenty of safer alternatives (Score:4, Informative)
So why is strncpy in the banned [microsoft.com] function list?
I think this is just Microsoft trying to embrace and extend. There's no better way to do that than making most existing C and C++ code invalid. The quickest alternative, of course, is to write it in C# or some other embraced language.
Hypocritically, Microsoft did NOT add memset to the banned list despite it having almost exactly the same problems as memcpy. Why? Almost every MSDN example begins with "memset(somestruct,0,sizeof(somestruct))" and invalidating every MSDN example would probably look bad.
As you pointed out, the size of the destination buffer makes no sense when dealing with pure pointers. Often memcpy is used to move memory around inside larger buffers, which completely invalidates memcpy_s as a safe replacement. memcpy is also often used to copy smaller buffers into larger ones, and accidentally copying the uninitialized (or carefully crafted by some exploit) data that comes after the source object can be just as dangerous. The correct replacement, memcpy_overkill(void *source_object, size_t source_size, size_t source_offset, void *dest_object, size_t dest_size, size_t dest_offset, size_t count) is what they're REALLY looking for, but this is impractical primarily because of the heavy use of context-less pointers (to objects within arrays, or within some other structure; the void * in memcpy's prototype hints at further possibilities) in C and C++.
Parent
Re:No - there are plenty of safer alternatives (Score:4, Interesting)
So why is strncpy in the banned function list?
Because strncpy() is as bad as strcpy(). The problem lies in the fact that if the source string is as long as or longer than the destination length, then strncpy simply stops the copy without writing a terminating NUL. The next str* function used on the string is likely to crash.
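For anyone who wants to see the termination problem handled explicitly, here is a minimal sketch. The helper name copy_string_terminated is hypothetical. It relies on documented strncpy() behavior: short sources are padded with NULs, long sources fill the buffer with no terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `src` into `dst` (capacity `dst_size`),
 * always leaving a terminated string.
 * Returns 1 if the source was truncated, 0 otherwise. */
static int copy_string_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 1;                       /* no room even for the terminator */
    strncpy(dst, src, dst_size);
    if (dst[dst_size - 1] != '\0') {    /* buffer filled, no NUL written */
        dst[dst_size - 1] = '\0';
        return 1;
    }
    return 0;
}
```

With a 4-byte buffer, copying "abcdef" leaves "abc" and reports truncation, while copying "ab" fits and reports none.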
Parent
Re:No - there are plenty of safer alternatives (Score:5, Interesting)
If you're a competent programmer then nothing is unsafe, but obviously there are a lot of stupid programmers out there who make fundamental mistakes fucking with memory when they don't understand what they're doing. What Microsoft is trying to do here is to eliminate a low hanging fruit of software security that has led to hundreds if not thousands of buffer overflow conditions and associated vulnerabilities/exploits.
The trouble is, it doesn't. Banning functions like strcpy made sense, because they were nearly always unsafe to use. On the other hand, if you're memcpying too much data for the destination, there's probably something more fundamentally wrong with your code. This, at best, conceals the bug by truncating the copy - leading to unpredictable issues later in execution instead.
Parent
Re:No - there are plenty of safer alternatives (Score:5, Insightful)
What Microsoft is trying to do here is to eliminate a low hanging fruit of software security that has led to hundreds if not thousands of buffer overflow conditions and associated vulnerabilities/exploits.
They might be trying, but they are failing, because the mistake that leads to the error in the first place (miscalculating destination buffer size) has the same effect (buffer overrun) whether you use memcpy() or memcpy_s().
Parent
Re:No - there are plenty of safer alternatives (Score:5, Informative)
Are you high? It already takes a size argument. If this were about strcpy(3), then you'd have a point, but I do not think memcpy(3) means what you think it means.
I'm not saying you can't get yourself into trouble with inappropriate use of memcpy(3), but buffer overruns aren't the go-to threat every time.
Parent
Re:No - there are plenty of safer alternatives (Score:5, Insightful)
It's a psychological thing. Having a separate parameter for the size of the destination buffer forces the programmer to think about what that size is. Too often programmers call memcpy passing the size of the data that needs copying and forget to check that the destination is big enough. And that's why we see so many buffer overflows.
If you never make this mistake continue to use memcpy. I don't care and neither does Microsoft.
Parent
Re:No - there are plenty of safer alternatives (Score:5, Insightful)
It still will not help.
If they are a sloppy enough programmer not to look at what is going on, and to ensure the size of the destination, they will be sloppy enough to use the same dratted variable in both spots, drool all over the keyboard and move on to the next sloppy bit of code.
Parent
Re:No - there are plenty of safer alternatives (Score:4, Insightful)
Parent
Re:No - there are plenty of safer alternatives (Score:4, Funny)
I'm not saying you can't get yourself into trouble with inappropriate use of memcpy(3), but buffer overruns aren't the go-to threat every time.
Didn't we already defeat the goto threat?
More to the point, if the developer doesn't know what memcpy does and how to use it correctly ... I mean ...
You might as well write the 3 lines of code behind memcpy yourself.
Parent
The goto threat == Raptors (Score:4, Funny)
Parent
Re:No - there are plenty of safer alternatives (Score:5, Insightful)
Parent
Re:No - there are plenty of safer alternatives (Score:4, Insightful)
Technically one size argument is enough, but in a large enough software project the code that allocates the destination buffer is maintained separately from the code that copies into it. Any failure in communication (e.g. building against an outdated library) will lead to someone's linker writing a binary with code that will overrun a buffer.
With an explicit destination size parameter, the buffer copy code is no longer as sensitive to changes at the allocation site. A breakdown in communication will lead to a binary that produces a controlled runtime error instead of a buffer overrun.
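A rough sketch of the "controlled runtime error" idea, written as a portable stand-in. The name checked_copy and its error codes are assumptions for illustration. This is not Microsoft's memcpy_s implementation, which has additional documented behavior on failure.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a destination-size-checked copy.
 * Returns 0 on success. On invalid arguments, or when `count`
 * exceeds `dst_size`, it returns an errno-style code and copies
 * nothing. */
static int checked_copy(void *dst, size_t dst_size, const void *src, size_t count)
{
    if (dst == NULL || src == NULL)
        return EINVAL;
    if (count > dst_size)
        return ERANGE;                  /* would overrun the destination */
    memcpy(dst, src, count);
    return 0;
}
```

The check only helps if dst_size comes from the allocation site. Passing the copy length for both size arguments makes it a no-op.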
Parent
Re:No - there are plenty of safer alternatives (Score:5, Insightful)
Whilst you are correct, if Microsoft is going to essentially replace the standard C library with one that has an incompatible API, why not just call it a new library and have done with it?
Or, better yet, if security really was the goal, develop a C-like language that was secure by design?
By simply making things awkward for people to write portable code, all they do is ensure that there are multiple code bases for projects (which increases the opportunity for error) or ensures that people won't write portably. Which is a more likely goal, given who we are talking about.
Parent
Re:No - there are plenty of safer alternatives (Score:4, Interesting)
>Or, better yet, if security really was the goal, develop a C-like language that was secure by design?
Or, better yet, if security really was the goal, use Ada.
There, fixed that for you :o)
Parent
Re:No - there are plenty of safer alternatives (Score:4, Insightful)
I understand the problem you are describing, but I fail to see how this solution addresses it. If there is already a disconnect between the programmer doing the copying and the programmer doing the allocating, then making the programmer doing the copying repeat himself is not going to fix the problem.
The only problem this function solves is buffer overflows caused by a programmer calculating a number of bytes to copy at runtime (e.g. by reading it from a Content-Length header) and failing to check the calculated value against what he believes is the actual size of the buffer. If the value that he believes to be the size of the buffer is wrong, changing from memcpy to memcpy_s will not catch the mistake. In other words, changing from memcpy to memcpy_s will only protect against sloppy programmers, and if they don't understand what the function is supposed to be protecting them from (which is likely) they'll probably just use the same value for copy_size and dst_size anyway (or switch to memmove), which will completely defeat the purpose of blacklisting memcpy in the first place.
Not to mention, if you're doing any pointer arithmetic and writing to an offset some number of bytes past *buffer, then passing the size of *buffer doesn't really help, unless the function is smart enough to know that (I don't see how it could be unless we pass that as a parameter as well), or the user is smart enough to calculate the remaining size of *buffer. If the user is one of the sloppy programmers that this function is meant to protect against in the first place, I think that is highly unlikely, don't you?
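The "remaining size" calculation described above can be sketched like this. The helper copy_at_offset is hypothetical and only shows the arithmetic the caller would otherwise have to get right by hand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `count` bytes into `buf` starting at
 * `offset`. `buf_size` is the size of the whole buffer.
 * Returns 0 on success and -1 if the write would not fit.
 * Comparing against `buf_size - offset` avoids the overflow that
 * `offset + count` could cause. */
static int copy_at_offset(void *buf, size_t buf_size, size_t offset,
                          const void *src, size_t count)
{
    if (offset > buf_size || count > buf_size - offset)
        return -1;
    memcpy((unsigned char *)buf + offset, src, count);
    return 0;
}
```

With an 8-byte buffer, a 4-byte write at offset 4 succeeds and the same write at offset 5 is rejected.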
Parent
malloc() and free() (Score:5, Funny)
the worst offender is main() (Score:5, Funny)
Most any security problem can be traced back to this function.
Parent
Re:the worst offender is main() (Score:5, Funny)
you mean WinMain()
Parent
Python is done (Score:4, Funny)
Figures, Microsoft had to go kill off Python and do it all in the name of security. No more accessing MEMory in C structures from our .PY files, damn it, this really pisses me off.
Re:Python is done (Score:5, Informative)
No it's not. This is only banned under Microsoft's Security Development Lifecycle, which means you only care about this if you're following that set of development guidelines. It's still in the language. And you can always use memcpy_s:
Developers who want to be SDL compliant will instead have to replace memcpy() functions with memcpy_s, a newer command that takes an additional parameter delineating the size of the destination buffer.
Parent
No mention of memmove... (Score:5, Informative)
Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
You can replace memcpy entirely with memmove (the latter is slightly slower and handles overlaps), and nothing in the article suggests that memmove is banned.
But, no, it shouldn't hurt creativity--they're introducing a memcpy_s, which is the same aside from taking a size parameter for the destination. That's something that is generally easy to track in new code (obviously this secure development lifecycle is not backwards compatible).
Re:No mention of memmove... (Score:4, Informative)
Okay, I'm obviously missing something here. How is having an extra parameter for the destination size any safer? I always thought the third parameter to memcpy was the amount of data to copy, and since obviously it should never be set to anything larger than the size of the destination, how will having the destination size explicitly passed in help any?
That's the error that this is trying to fix. I'm skeptical as to how much this will help; if you're that lazy, you can just set the destination size parameter to the same value as the amount to copy.
But it might be easier to enforce at a code-review level in the organization: destination size always has to be a size tracked based on memory allocation.
Parent
Re:No mention of memmove... (Score:4, Informative)
Now developers will write
memcpy_s(dst, sizeof(dst), src, sizeof(dst));
I get the feeling that this is mainly for Microsoft internally developed code which conforms to their security guidelines. As such, it's probably mainly intended to help in code reviews. Still pretty dubious.
Now the coders that have been using something like
MIN(sizeof(dst), bytes_to_copy)
for the last parameter for years will have to change their code.
That fails in the common case of dst being a real pointer (whether it's indexing into a static array or dynamically allocated memory or whatever).
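A quick sketch of why sizeof(dst) fails once dst is a real pointer. Both function names are made up for this example. The same expression gives the array size where the array is declared and the pointer size anywhere the array has been passed along.

```c
#include <stddef.h>

/* Inside this function `dst` is a plain pointer, so sizeof reports
 * the size of the pointer, not of the buffer it points into. */
static size_t size_seen_by_callee(char *dst)
{
    return sizeof(dst);
}

/* At the declaration site the array type is still visible, so
 * sizeof reports the full 64 bytes. */
static size_t size_seen_at_declaration(void)
{
    char dst[64];
    return sizeof(dst);
}
```

So memcpy_s(dst, sizeof(dst), ...) inside a function taking char *dst would check against the pointer size, which is rarely what was meant.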
Parent
First they take my gets.. (Score:5, Funny)
Re:First they take my gets.. (Score:5, Funny)
First they came for gets, And I didn't speak up because I didn't use gets
Then they came for scanf, And I didn't speak up because I didn't use scanf
Then they came for strcpy, And I didn't speak up because I didn't use strcpy
And then... they came for memcpy... And by that time there was no one left to speak up.
Parent
What an idiotic idea. (Score:5, Informative)
Silly and useless (Score:5, Insightful)
This is nothing like sprintf. In sprintf there is no way to know how much data will be created ahead of time, so limit on buffer size is useful to make sure there is no buffer overrun.
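For comparison, a minimal sketch of how the buffer limit works with C99 snprintf(). The helper name format_greeting and the greeting text are assumptions for illustration. The return value is the length the full output would have had, which is how the caller detects truncation.

```c
#include <stdio.h>

/* Hypothetical helper: format a greeting into `out`.
 * Returns 1 if the whole text fit, 0 if it was truncated or
 * formatting failed. snprintf() reports the length the complete
 * output would have needed, excluding the terminator. */
static int format_greeting(char *out, size_t out_size, const char *name)
{
    int needed = snprintf(out, out_size, "hello, %s", name);
    return needed >= 0 && (size_t)needed < out_size;
}
```

With an 8-byte buffer the text for "world" is cut short but still terminated. With a 32-byte buffer it fits.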
With memcpy it is *precisely* known how much data will be copied. It is right there, 3rd parameter. If a developer can't do "if (sizetocopy <= sizeofdstbuffer)", it is just as unlikely that he will be able to properly state that additional parameter that specifies the destination buffer size.
Of course if Microsoft is so concerned with security, why the heck did it take them years to add snprintf()? All this is is another attempt to make crossplatform development that much harder (much like all those "obsolete" POSIX functions that will barf warnings unless you use a cryptic define).
That said, if this silliness ever becomes a rule, I have an easy solution:
#define memcpy(dst, src, size) memcpy_s((dst), (size), (src), (size))
Problemo solved, now let's go actually write some real code.
When will MS learn? (Score:5, Insightful)
Yes, you read that right. Microsoft is deprecating parts of an ISO Standard all by themselves. Not that this should surprise anyone. I would have absolutely no objection to them proposing to WG14 to deprecate those functions; heck, I'd encourage it! But besides going out and deciding to 'deprecate' parts of the standards, the replacement functions actually violate those same standards.
And the warnings are irritating. You can't write a nice cross-platform library without either spewing tons of warnings or having to put in a bunch of #defines to shut the compiler up. And if you do that, your users get irritated if they depend on these warnings because you just turned them off (and of course, if you don't, they'll complain that your library is unsafe).
Screw Microsoft.
Re:When will MS learn? (Score:5, Insightful)
In case anyone is curious, this is the type of thing that coppro is talking about:
c:\Program Files\Microsoft Visual Studio 8\VC\include\io.h(318) : see declaration of 'close'
Message: 'The POSIX name for this item is deprecated. Instead, use the ISO C++ conformant name: _close. See online help for details.'
Now, as far as I know, no ISO body has deprecated functions like close(2), open(2), read(2), and write(2). And I've always heard that methods that start with an underscore are internal compiler functions and shouldn't be called directly. I don't know why the MS compiler writers think they can do this, but it is really annoying to get hundreds of warnings like this when compiling. In addition, it hides legitimate warnings that could indicate real problems.
As to the article in question, I can't think of any good reason why memcpy(3C) would be considered unsafe, since it specifies the amount of memory to copy. Sure, you could use it to copy outside the bounds of dst, but that's just calling it incorrectly. It's not like sprintf(3C) where you could easily accidentally write outside the bounds of the string.
Parent
Re:When will MS learn? (Score:4, Insightful)
That's correct, because ISO C++ never included those functions in the first place. POSIX != ISO C. (Not that MSVC is on any kind of reasonable schedule for keeping up with ISO standards, but that's a whole different issue...)
Basically MS is deprecating their own terrible implementation of some POSIX compatibility. This is actually required for ISO C compliance: the compiler is not supposed to define a bunch of extraneous functions in the global namespace, because they might conflict with your names. Once those functions are removed entirely (and I believe you can #define them away right now) you can implement your own compatibility functions for software you're porting to Windows.
Now, this is all entirely separate from the SDL warnings GP is complaining about, which show up when you use standard ISO C functions like strcpy, sprintf, and apparently now memcpy. Which, honestly, I wish weren't quite so irritatingly implemented, although I'm torn because using those functions really is terrible.
It's not really that worth getting up in arms about, though, because JESUS CHRIST there's a compiler flag to disable the warnings, just put it in your makefile and quit bitching already!
Parent
Stop protecting me from me! (Score:5, Interesting)
As a competent developer, I get extremely annoyed by this sort of shit.
Removing/banning memcpy doesn't change a damn thing cause the first thing I do with things that have to compile in VisualStudio now is add the following defines which turn this shit off:
_CRT_SECURE_NO_WARNINGS
_CRT_NONSTDC_NO_DEPRECATE
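For reference, a sketch of how those two defines are typically placed. This assumes they are set in source rather than in the project settings or on the compiler command line. They have to appear before the first CRT header is included to have any effect.

```c
/* Must come before the first CRT header is included
 * (alternatively, pass them as /D options to the compiler). */
#define _CRT_SECURE_NO_WARNINGS     /* silence the "unsafe function" warnings */
#define _CRT_NONSTDC_NO_DEPRECATE   /* silence the POSIX-name deprecation warnings */

#include <stdio.h>
#include <string.h>
```

On compilers other than Visual C++ the macros are simply unused, so the same source still builds.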
If they remove that option I'll simply add memcpy to my standard MS compatibility library that deals with all the other bullshit MS decides to do.
You can't fix stupid. Stop trying. People fuck up VB and C# apps just as much as they fuck up C and C++ apps. So they don't do it with a buffer overflow, they do it by sheer stupidity. You'll be more secure by taking away languages that allow non-programmers to pretend to be programmers than making it harder on those of us that are just going to work around what you do anyway.
You're not going to fix broken shitty apps with exploits by removing functions. The functions aren't the problem, they do exactly as they are told (or at least they are supposed to :). You need to fix the programmers who can't clarify what they want done.
http://www.xkcd.com/568/ [xkcd.com]
Second pane:
You'll never find a programming language that frees you from the burden of clarifying your ideas.
easy fix (Score:5, Insightful)
Just write a one-liner that replaces all calls to memcpy with a call to memcpy_s, duplicating the size parameter.
I'm only half-joking. This is exactly how people will (mis)use memcpy_s. If you want safe memory access, you need to ban the entire C language. For those cases where you need C, you'll just have to make sure your programmers know what they're doing.
The whole thing is just a bunch of nonsense (Score:5, Insightful)
Firstly, the specification of the C and C++ standard libraries is governed by the corresponding standards committees. Microsoft has absolutely no authority to "banish" anything from either C or C++. They can deprecate it in their .NET code, C# etc., but it has absolutely no relevance to the C and C++ languages. So, why the author of the original question would direct it to "advanced C and C++" programmers is beyond me. In general, C and C++ programmers will never know about this "interesting" development.
Secondly, the truly unsafe and useless functions in the C standard library are the functions like "gets", which offer absolutely no protection against buffer overflow, regardless of how careful the developer is. Functions like 'memcpy', on the other hand, offer sufficient protection to a qualified developer. There's absolutely no sentiment against these functions in the C/C++ community and there is absolutely no possibility of these functions getting deprecated as long as the C language exists.
Lock in from Microsoft (Score:5, Interesting)
There have been several suggestions to replace memcpy with memcpy_s as the safer alternative. That's fine, I guess, if memcpy_s is part of the ANSI/ISO standard for C, which as far as I know, it is not; just like all the *_s functions.
Microsoft says your code is safer when using the *_s, but it will no longer be portable, it'll be Microsoft-only. They put in a warning in the compiler from VS2005 onwards about using "unsafe" functions, and that you should use *_s, which is a pain because you have to disable it at the project level; there doesn't seem to be anywhere that I've found that can just turn it off permanently. Even using the STL that comes with VS2008 will generate these warnings, even if you never do any explicit memory stuff yourself.
Microsoft did the same thing with the _* functions; a lot of them are just wrappers around their ANSI-compliant versions (_sprintf -> sprintf), but are also not portable; I worked with a guy who wrote/tested all his C code in VS6 then gave it to me to port to Unix and VMS, and the compilers would choke on not having these particular functions.
Microsoft is trying to get lock-in at the language level instead of providing a good set of Win32 API-based functions that make using memcpy() unnecessary.
They should go one better... (Score:5, Insightful)
...and pop up a message box asking the user to confirm they want to copy the memory, and if they press OK then they should have to enter a captcha.
Seriously though, how is it supposed to make your code safer if you pass the size you think your destination buffer is? With memcpy, that size is implicitly greater or equal to the copy size and it's the caller's responsibility to make sure this is the case. Putting bounds checking into the copy function is ridiculous if you're responsible for passing the bounds yourself, and it goes against basic good design. I'm surprised they aren't passing the source buffer size too, just to be extra safe. Also, what happened to the __restrict keyword? It's strangely absent from the memcpy_s function declaration.
=Smidge=
Parent
Re:They should go one better... (Score:5, Informative)
The problem is memcpy returns a void *. If this is dynamically cast, it needs to be checked at runtime and may even be set to a value the programmer never intended (say unsigned 16 bit values instead of unsigned 8 bit characters). It may be an issue with updating the code - say the code was originally written for 8 bit ASCII and got updated to, say UTF-16 (16 bit). A dynamically cast void* doesn't care what the size is, it just shoves the values in the buffer. This may work fine in basic testing even, because you never overflow the buffer with 1-2 characters, and maybe even gets past a QA team, but once you go past 1/2, you've got a buffer overrun.
As I understand it, __restrict wouldn't work in a C++ program using dynamic_cast because it doesn't know the size at compile time (sorry, I'm not sure what is done in C as I haven't kept up with the language, so I have to use a C++ example). My guess is memcpy_s does runtime bounds checking (it isn't specified on the memcpy_s page, maybe the security ref - too busy to read it though).
Parent
Re:Should have been done 30 years ago. (Score:5, Insightful)
Why? I can see some justification on the strXXX functions where you don't know how many bytes are going to be copied unless you call strlen first, but in memcpy you pass how many bytes to copy in as a parameter. So this is to protect programmers who can't do math?
Parent
Re:Should have been done 30 years ago. (Score:4, Interesting)
If you haven't tried Ada yet, I highly suggest looking into it. It keeps track of data sizes, types, etc... for the programmer but it will also let you get close to the hardware like C does. It's often used for safety critical software such as that used in aviation.
Unfortunately I can't recommend using Ada to develop windows apps. It's technically possible but you end up importing C library functions to do it. And if you're going to do that, you might as well just use a native development environment that is better suited to the task.
Parent
Re:Isn't security the programmer's responsibility? (Score:5, Informative)
You didn't read.
MSFT is banning it from their development process, not the language, use it as much as you like.
Parent
Re:A half-measure, at best... (Score:4, Funny)
So, Ben... or is it Peter? Do you always copy your comments verbatim from the linked article, or only when you agree with them?
Parent
How to easily ... (Score:4, Insightful)
How to easily make your code compliant with the new safety requirements:
#define memcpy(dest,src,len) memcpy_s(dest,len,src,len)
Parent
Re:How to easily ... (Score:5, Informative)
Parent
Re:Shooting themselves in the foot. (Score:4, Interesting)
Now, all that's going to happen is that programmers are going to write their own memcpy-like routines using a quicky for-loop or something. It'll be just as bug prone, and harder to detect via automated source code analysis.
Not to mention it'll be slower. memcpy is one of the most optimized functions in any C library. It's frequently handled as a compiler intrinsic, that can do stuff like unroll short copies, generate optimal machine code, etc.
Parent
Re:Shooting themselves in the foot. (Score:4, Interesting)
Actually, memcpy in and of itself is slow. Hand writing your own asm version of memcpy using extended cpu functions is a lot faster as memcpy itself is usually kept basic enough to work on any cpu, including the older cpu's without MMX, SSE, etc.
glibc contains specific implementations for sparc32, powerpc32, powerpc64, i386, i586, cris, i860, rs6000, and m68k. I don't know where you got your idea.
Parent