This document last updated on 3/29/2005.
This 'FAQ' (it is not really about questions people ask, more about what people should be doing; perhaps it should be 'Frequent Sources of Insecurity') is intended to be broader than simply how to avoid buffer overflows. C (and C++, because it is backward compatible) has particular vulnerabilities that many other 'higher level' languages lack, such as buffer overflows, but all programs are capable of being insecure (really, all are likely to be insecure unless determined efforts are expended to make them secure) no matter what language is used to write them.

The first thing to understand is that security is a process, not a product. Secure programming begins with a secure design, which relies on a clear understanding of the goals and purposes of the program. Since the vast majority of us programmers would rather code than design, design often gets short shrift, resulting in the plethora of security holes waiting for a cracker to locate. However, design need not be onerous, and adding security in the design phase can often be quite simple and straightforward.

The most basic element of any secure program is paranoia. Trust no one! All input data is suspect and must be sanitized. All interactions with the 'outside' world must pass through default-exclude filters. This includes all file access, command-line arguments, interactive input, etc. A cracker only has to find one weakness to totally corrupt ('0wn' in crackerspeak) the application. Use defense in depth: if one layer is compromised, there are other layers ready to pick up the slack.
This FAQ (FSI) is targeted toward the client/server paradigm, by which I mean any application where portions of the execution and/or intelligence reside on different physical computers. Web-enabled enterprise applications are the specific target, but what follows applies directly to client/server applications on an intranet just as well (securing your application from internal attacks is just as critical, and it is poor design to assume that an internal network is any more secure than an external one). If you are on a multi-user computer (like almost all of us, except for those few DOS holdouts) you can even use the same techniques to provide some protection from malicious code running on your computer (worms, viruses, Trojans, etc.). If your program runs at a higher permission level, these techniques will go a long way toward keeping someone from elevating their privileges by hacking through your application. You should NOT, however, think that your program is 'more secure' simply because it lacks any networking capability. If your program is doing something important enough for you to devote hours of labor to it, it is almost certainly important enough to incorporate the elements discussed herein to protect itself. There is a non-security benefit you can realize as well: your programs may have fewer, and easier to resolve, bugs, as there are quite a few instances where failure to check error conditions can lead to hours spent analyzing perfectly functioning code.
Because a lot of people will remain convinced that writing a secure program is only about the code, I will place the code specific parts first. If you want to have a secure program this entire list is the minimum you need to accomplish this task. Most of these subjects might take dozens of pages to cover the surface and entire books to cover in-depth; please do not think that this is all encompassing.
Special thanks to Scorpions4ever for review comments and particular thanks to infamous41md (both from the DevShed) for his suggestions on C coding issues.
Many of the common security pitfalls in C have been addressed with C++ classes (particularly the use of the string class instead of fixed-length character buffers), but it is worth discussing them because there are general issues to keep in mind when you are creating or extending classes.
Some general tips (courtesy of infamous41md):
- Never, ever trust the user or environment, or even libraries you are using. Validate everything.
- If you want to write secure programs, learn what makes a program insecure.
- Check your own code with a code-auditing tool, such as RATS. Most errors are simple mistakes that would have been caught if the programmer had checked for them.
- Always plan for something going wrong by checking for and handling errors.
- Read the documentation on any library functions used, and know exactly what state they leave data in.
- Understand the details of the language and the machine. Do you know what happens when you compare a signed number to an unsigned number? What happens when a signed 16-bit number is promoted to an unsigned 32-bit number? Is it ssize_t that is signed, or size_t?
Specific C coding issues:
- Never, NEVER use gets(); it provides no way to bound the input, so it is NEVER safe and should be removed from the standard library!
- When using the *printf() family, always supply a format string. If you do not supply a format string, user data can be interpreted as the format string (RTFM if you don't understand) and all sorts of bizarre and likely insecure behavior can result. It is poor practice to attempt to sanitize the input before using it as the format string, as you have to guard against every possible way of abusing the system while the cracker need only find a single thing you have overlooked.
- If you use scanf() to read strings, ALWAYS specify a field width ("%25s"; note that the precision syntax "%.25s" belongs to printf(), not scanf()). The scanf() functions all return a value, and that value is useful data: it tells you how many conversions succeeded. ALWAYS check the return value to see that you got what you expected (and abort cleanly if you didn't).
- It is better to use fgets() and sscanf() when dealing with input (user or file) because you have better control over buffer sizes (meaning you can size them all the same so there is no chance of overflow).
- strncpy() and strncat() should be used instead of strcpy() and strcat(). Better still, use strlcpy() and strlcat(). See http://www.courtesan.com/todd/papers/strlcpy.html for why.
- Use snprintf() instead of sprintf(), where possible.
- Always leave room for the terminating NUL, and it is good practice to zero your buffers (using memset() or an equivalent) before you use them.
- Never assume that malloc(), calloc() or realloc() succeeds. Always check whether your memory was allocated correctly. When using realloc(), assign the result to a temporary pointer so that if it fails you can still free the original block rather than leaking it.
- It is very poor practice to rely on the OS to clean up after you; you should always release all resources yourself. C++ classes make this much easier than in C, but of course you have to write your classes smart enough to handle it.
- Using sizeof() can be problematic and potentially lead to a vulnerability, so be careful! sizeof(buffer), where buffer is a statically allocated object, returns the number of bytes used by the object (note that for arrays this is NOT the number of elements; divide by sizeof(element) to get the element count). But if you later change buffer to dynamically allocated memory, so that buffer becomes a pointer, sizeof() suddenly returns the size of the pointer (four bytes on 32-bit machines, eight on 64-bit), which can lead to all sorts of subtle bugs and potentially to security holes.
- Be aware of signed vs. unsigned behavior. A negative value converted to an unsigned type becomes a very large positive value (far larger than the maximum of the original signed type)! If no bounds checking is done on the value before it is used, bizarre and potentially insecure behavior can result.
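The format-string point above can be sketched in a few lines. format_user_data() is an illustrative helper of my own, not a standard function:

```cpp
#include <cstdio>
#include <cstring>

// Format user-supplied text into 'out' without ever letting the data act as
// a format string. If 'user' contains "%x" or "%n" it is printed verbatim,
// not interpreted.
int format_user_data(char *out, size_t outsz, const char *user) {
    // snprintf(out, outsz, user);          // WRONG: "%x%x%n" would execute
    return snprintf(out, outsz, "%s", user); // RIGHT: data is copied verbatim
}
```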
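Likewise for the scanf() family: the width limit and return-value check from the bullets above look like this (parse_user() is my own illustrative helper; the line would typically come from fgets()):

```cpp
#include <cstdio>

// Parse "name age" from one line of input (e.g. read with fgets()). "%25s"
// stops after 25 characters, leaving room for the NUL in a 26-byte buffer,
// and sscanf()'s return value (the number of successful conversions) is
// checked instead of ignored.
bool parse_user(const char *line, char name[26], int *age) {
    return sscanf(line, "%25s %d", name, age) == 2;
}
```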
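The strncpy() advice hides a subtle trap: when the source fills the destination, strncpy() does not NUL-terminate. Since strlcpy() is a BSD extension that may not be available everywhere, here is a sketch of the usual workaround (safe_copy() is my own name for it):

```cpp
#include <cstring>

// Copy 'src' into 'dst' (of size 'dstsz'), truncating if needed but always
// leaving 'dst' NUL-terminated, in the spirit of strlcpy().
void safe_copy(char *dst, size_t dstsz, const char *src) {
    if (dstsz == 0) return;
    strncpy(dst, src, dstsz - 1);  // copies at most dstsz-1 bytes
    dst[dstsz - 1] = '\0';         // force termination even on truncation
}
```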
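The realloc() point (use a temporary pointer so a failure does not leak the original block) can be sketched like this; grow_buffer() is an illustrative helper:

```cpp
#include <cstdlib>

// Grow a heap buffer. On failure realloc() returns NULL and leaves the old
// block intact, so assigning the result directly to 'buf' would leak it.
char *grow_buffer(char *buf, size_t new_size) {
    char *tmp = (char *)realloc(buf, new_size);
    if (tmp == NULL) {
        free(buf);   // realloc failed: release the old block ourselves
        return NULL; // caller must check for failure
    }
    return tmp;
}
```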
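The sizeof() pitfall is easiest to see side by side; these helpers exist only for demonstration:

```cpp
#include <cstdlib>

// sizeof on an array yields the whole object in bytes...
size_t stack_buffer_bytes(void) {
    char buf[256];
    (void)buf;
    return sizeof(buf);            // 256: the entire array
}

// ...but on a pointer it yields only the pointer's size, even though the
// allocation behind it is 256 bytes.
size_t heap_buffer_bytes(void) {
    char *buf = (char *)malloc(256);
    size_t n = sizeof(buf);        // 4 or 8: just the pointer!
    free(buf);
    return n;
}

// Element count requires dividing by the element size.
size_t element_count(void) {
    int values[10];
    (void)values;
    return sizeof(values) / sizeof(values[0]);  // 10
}
```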
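And the signed vs. unsigned point: a bounds check done on a signed value does not protect a call that takes size_t. copy_naive() and copy_checked() are illustrative; the dangerous memcpy() is commented out so the example is harmless:

```cpp
#include <cstring>

// The classic bug: 'len' is checked as a signed int, but memcpy() takes
// size_t, so a negative length wraps to a huge value at the call site.
bool copy_naive(char *dst, const char *src, int len) {
    if (len > 512) return false;   // -1 sails past this check...
    // memcpy(dst, src, len);      // ...and becomes SIZE_MAX right here
    (void)dst; (void)src;
    return true;
}

// Reject negative lengths explicitly, before any conversion happens.
bool copy_checked(char *dst, const char *src, int len) {
    if (len < 0 || len > 512) return false;
    memcpy(dst, src, (size_t)len);
    return true;
}
```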
RTFM: Even the best documentation is useless if the user refuses to take the time to read and understand it. Most documentation ('man' pages for you *nix people) will discuss any problematic elements of a function/API call. As long as you have access to the Internet you have no excuse for ignorance. MSDN is a free resource and has plenty about ANSI (and even POSIX) programming. Google is your friend, and typing 'man <function name>' into it will get you exactly what you need perhaps 99% of the time.
Some C++ comments:
- Don't make the C mistakes above! Many of us are C programmers who have migrated to C++ and therefore tend to use a lot of C constructs when coding. That, per se, is not bad; there are certain efficiencies to be gained in C over a subset of C++ constructs (with a concurrent increase in the potential for bugs). What often happens, though, is that less experienced programmers take these C constructs and use them as if they behaved like their C++ counterparts (for instance, throwing exceptions on errors), failing to handle return values properly, etc. In general it is a bad idea to mix C and C++ constructs (e.g., using both stdio and iostream), though I highly recommend compiling plain C code with a C++ compiler: C++ catches more simple coding problems, and I have yet to see a C++ warning/error that didn't lead to cleaner C code (or even reveal a serious bug).
- If you are in C++ land, unless there is a clear need for character array buffers, use strings. There is a note on performance below, but I will repeat the salient point here: most modern CPUs actually sit in a wait state some 70% of the time even when utilization is reported as 100%, so it is extremely unlikely that in real-world situations you will see a measurable performance impact from using strings instead of character buffers.
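To illustrate: with std::string the buffer grows as needed, so there is no fixed length to overflow and no terminator to forget (greet() is a toy example):

```cpp
#include <string>

// No strcat(), no size bookkeeping, no overflow: the string manages its own
// storage, however long 'name' turns out to be.
std::string greet(const std::string &name) {
    return "Hello, " + name;
}
```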
Something critical to keep in mind: what may be a perfectly secure program when OS resources are plentiful may become hugely insecure when the machine is being brought to its knees. Your program should never make assumptions about resources being available, ever. Since it is not possible to predict every possible failure condition, you should devise a strategy that handles only expected behavior and cleanly exits the instant something unexpected happens. C++'s exceptions make this much easier and cleaner, though it is a bad habit to let the default exception handler catch errors (your program will terminate, possibly leaving files, sockets, etc. in an indeterminate state).
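A minimal sketch of catching at the top level instead of letting the default handler terminate the program (run() stands in for a real program body, and the exit codes are arbitrary):

```cpp
#include <stdexcept>

// Catch everything at the outermost level, release resources, and exit with
// a deliberate status rather than letting std::terminate fire.
int run(bool fail) {
    try {
        if (fail) throw std::runtime_error("unexpected input");
        return 0;                 // normal completion
    } catch (const std::exception &e) {
        // log e.what(), close files/sockets cleanly here
        (void)e;
        return 1;                 // clean, deliberate failure exit
    } catch (...) {
        return 2;                 // even unknown exceptions exit cleanly
    }
}
```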
If you are writing libraries, it is important to know that you can't hide anything from the program that is using your library. Since the library exists within the memory of the program, the program can get a pointer into the library's memory and troll around reading anything and making any changes. Only by going through the hassle of making your library a separate process can you avoid this. The same thing is true for 'private' class members. The compiler will prohibit access to private members of a class (variables/methods), but there is absolutely no run-time constraint, and a user can get a pointer into your class and do with it what they will. This is not a security problem (it is not even a 'problem'); it is just that most people are not aware of this 'feature' and make incorrect (and sometimes fatal) assumptions. As a real-world example: Microsoft used to have the security credential that is created when a user logs on (to an NT-based computer; this is meaningless for other Windows OSes) actually stored in user space. As a consequence, the user could get a pointer into the credential and, with the correct changes, make themselves local admin. If they were really clever (and did the needed research) they could even elevate their privileges out onto the domain. The solution was to move the credential into the OS's memory, but that means a context switch every time the credential is accessed (a worthwhile tradeoff in most cases).
Embedded strings in binaries can be read simply by opening the executable in Notepad or an equivalent. As a consequence, if you have your program log onto your remote database with administrator privileges via a DSN-less connection (a really bad idea), that connection string, containing the password, is in the clear. Try it and see what I am talking about. The first thing to learn is the use of minimal privileges; in this example you should create an application-specific account with just barely enough rights to get its job done. The second thing to keep in mind is that while obscuring your connection string will not stop (or even slow down) a competent cracker, it will stop unsophisticated users from trolling your binaries. I wrote a little application that can be useful in obscuring such data. The goal is to raise the bar and reduce the exposure; it is not possible to make it impossible for a cracker to peer up your skirts.
The performance cost of security need not be significant (or even measurable) if handled at the appropriate location in the program execution. In addition, with the use of inline function calls, repetitive and tedious error checking can be encapsulated without impacting the readability of the code or the performance of the executable. There is a measurable impact in the performance if one looks at it from the point of view of number of instructions, but most modern computers are so fast that the CPU is actually in a wait state some 70% of the time even when it registers as 100% utilized, so the 'waste' of a few clock cycles to verify that your function call succeeded may never even be measurable in real-world conditions.
Paranoia is good when it comes to the design of applications, particularly when they are on a network. When you receive data from any source external to your program (the OS, a socket, a pipe, the command line, the user, etc.) you must ensure that it meets the expected range of values. Since you can't predict what kind of input your crazy users will supply (let alone that of a determined cracker), you should always operate from the 'default-deny' point of view. If you get data that is out of bounds, it is best to simply discard the entire batch, return a generic error message to the user, and put a specific error in some sort of log. This point of default-deny bears repeating: it is impossible in most real-world situations to predict every possible form of invalid input, but it is usually trivial to parameterize valid input. If you only allow valid input and exclude everything else, then you may have an occasional problem with input that turns out to be valid but was not considered so in the original specification (in which case you update the specification!). This is particularly critical when using fixed-length character buffers for input data, but simple ints, as well as floats, etc., can also easily cause range errors. There are also major concerns with the use of signed vs. unsigned integer variables, not to mention the chance of overflow causing 'wrapping'. If you know your program should expect a given value to be in the range of 10-100, exclude anything less than 10 or greater than 100; it is just that simple. It can even be worthwhile to put restrictions on expected but extraordinary input just to raise a flag so that it can be investigated (perhaps it never happens in the real world, so why write code for it?). Note that Perl and PHP tutorials also put heavy emphasis on checking user input: anything that comes to you over an insecure network (such as the Internet) MUST be validated.
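A default-deny check is usually only a few lines. Here is a sketch under assumed specifications (a quantity must lie in 10-100; a username may contain only lowercase letters and digits, up to 16 characters):

```cpp
#include <cctype>
#include <string>

// Allow only what the specification permits; everything else is rejected.
bool valid_quantity(long v) {
    return v >= 10 && v <= 100;
}

bool valid_username(const std::string &name) {
    if (name.empty() || name.size() > 16) return false;
    for (unsigned char c : name)
        if (!std::islower(c) && !std::isdigit(c)) return false;
    return true;  // nothing outside the whitelist got through
}
```

Note that the checks enumerate what is allowed, never what is forbidden; there is no blacklist of 'dangerous' characters to keep up to date.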
It should be mentioned that logging unexpected behavior is a great way to learn about strange behavior in your program (why do I get so many errors in this module?), problems with the user interface (users often get errors in a particular place), and hack attacks (why are so many strange characters being used in the password field?). Having said that, if no one will read the logs and act upon what they show, then the disk space and CPU cycles are better used on other things. With a properly designed and coded application it is possible to turn logging on and off without having to restart an application (even selectively logging) at the price of a few wasted cycles checking flags all the time, but it could be very valuable when the application behaves unexpectedly and you want to know what is going on. Too much information is just as useless as too little; be intelligent about what you collect.
This concept is based on the premise that you will miss something somewhere. We are human; mistakes are our forte. If you plan on making mistakes (instead of blindly assuming you will be perfect; a tough concept to deal with, I know) you can layer your protection so that a mistake in one area can be trapped by another. In addition there is the simple fact that any third-party (and OS) API has bugs in it, and while it is possible (though, realistically, unlikely) that your code is perfect, just a few days of monitoring Bugtraq will tell you how many errors pop up in mainstream software. Design your programs using the paranoid principle and treat each piece of functionality as its own logical program so you can cover for mistakes. Classes in C++ make this very easy, but it is not that difficult in C with the use of modules.
The client is simply a way to make life easier for the user; every single bit and byte returned from the client should be sanitized at the server and treated as if it carried the most contagious disease, the most deadly poison. If you take this to the logical extreme, each layer in your application will look upon each higher layer with distrust and will examine each piece of input data for flaws. It sounds tedious (and can be, if not properly implemented), but it makes debugging orders of magnitude easier, can often make adding enhancements trivial, sometimes allows for parallel processing, and can allow for cleaner scaling when you get the success you have been dreaming about. Almost any program of significance will need several logical tiers. You will need the user interface (it may be as simple as an ini file, but you can't trust that!), which can be logically isolated by creating a data structure to pass to an analytical engine that does the work, which in turn can be logically isolated from the data storage mechanism (which might be a plain old flat file). The 'N-tier' concept of spreading your computation across many distributed machines can be utilized even in the construction of rather simple, few-thousand-line programs. You can argue that it is overkill, but those sorts of arguments are exactly what lead to poor programming practices (not just security-related!) as well as seemingly endless hours debugging strange, sometimes random errors. Besides, if you use good OOD practices when you create your layers, you get the benefit of code reuse and can look like a real hero when you provide a bulletproof module in next to no time.
Something to note here: while I am focusing on the server side (or area with higher privileges), you should also consider that the client could be spoofed just as easily. One should never presume anything at any location in the information stream. Clients should exercise just as much caution and paranoia regarding data they have received from the server as the server does the client. Even with encryption it is possible to execute a man-in-the-middle attack and no matter how fantastically awesome your encryption is, it is completely in the clear as the hacker reads it.
It is considered good practice to return error messages, because they let the user know something went wrong and also assist you in debugging. However, the user does not always need to know the exact details (they can't help the user, but they can help a cracker), and if that information is not also recorded at the server, then unless the user happens to email the message to someone in support, it is lost anyway. Instead, write the error message to a log somewhere (I like using a database for a log) with some sort of unique key, and return the key to the user. If they do choose to email support, by including the key they are providing all the details needed. Plus you can run queries on the data, look for classes of errors, and proactively fix them, looking like a real awesome dood to your boss.
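A sketch of the idea: the detailed error stays server-side under a unique key, and the user sees only the key. The in-memory map stands in for a real log table, and all names here are my own:

```cpp
#include <map>
#include <string>

std::map<long, std::string> error_log;  // stands in for a database table
long next_error_key = 1000;

// Record the full detail server-side and hand back only an opaque key.
long log_error(const std::string &detail) {
    long key = next_error_key++;
    error_log[key] = detail;
    return key;
}

// What the user actually sees: no module names, no table names.
std::string user_message(long key) {
    return "An internal error occurred (ref " + std::to_string(key) + ").";
}
```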
Just a note of clarification… What I mean by meaningless error messages is 'meaningless to an attacker', not meaningless to a real user. If the user failed to authenticate because their username or password is bad, just tell them they have a bad username or password (do NOT specify which!), don't tell them that "An exception was thrown at line 427 of module 'checkPass' after checking database 'weKeepOurStuffHere', table 'passWords'". Those sorts of messages are wonderful when you are debugging an application, but they also work wonders for the hacker trying to compromise your system. Remember: the hacker need only find a single vulnerability to exploit to completely 0wn your system.
If you use a database with its own security (as opposed to something like Access, which relies on the OS for security; a really bad idea if anyone can reach the machine), do NOT use the database administrator account for access!!! This is an incredibly common mistake that lazy developers (which is just about all of us) make during development and testing, and it often slides right out into production. If a cracker is able to compromise the application, they now have access to the database with the same rights as the application. If you used the db admin account to access the database, they have full rights to the database, meaning they can create accounts, add/drop tables, etc. And on typical default installations the database server's rights may be the same as admin or root on the machine (years ago I worked with an installation of MS SQL Server with replication turned on, and it required (hopefully that has been changed by now) DOMAIN admin rights!), meaning there is essentially nothing to stop the cracker from 0wning the machine.
The application should have the minimum rights to the database. I suggest the use of stored procedures, if your database supports that. The application should have no direct access to any underlying tables and should only be allowed to execute the chosen stored procedures (note that there are many very dangerous stored procedures that the application should not have access to, do not give blanket rights to the application!). This has the very valuable design and development benefit of isolating the application from any changes to the underlying schema of the database. Most good SQL engines such as SQL Server and Oracle allow you to specify permissions to a very fine-grained level. A note on SQL injection: nothing that the user supplies should ever be executed in the database without undergoing extensive vetting (with the default-deny rule as mentioned above).
Configuration management is seen much more often in larger projects where multiple developers are involved, but it can be valuable even in a single programmer environment. The idea is that different people with different permissions and access to different resources are needed to put a change into production. For instance, in a large shop you might have clear boundaries between the developers, the testers and the systems people. The developers fix a bug/add an enhancement, pass it to the testers (who do regression testing if they are worth their salt), who then pass the vetted code onto the systems people. Since these groups have very different responsibilities, have access to different machines (possibly even on different networks), it is very clear that many steps are needed for any change, no matter how trivial, to get to the end users. While these added layers of handling can slow things down (that is a management issue, by the way, it need not have any significant impact), they do provide a stumbling block for any would-be crackers (internal or external). If the testers and systems people only have read-only access to the code repository, they can't make any changes. If the developers lack any access to the production machines they can't put changes out willy-nilly. This can be emulated in smaller environments by the use of different logins with different permissions.
It is IMPOSSIBLE to protect anything from a cracker if they have full control of the environment in which an application executes. Anyone who tells you different is ignorant or lying to you. Even with smart cards it is extremely difficult to make them tamper-proof; about the best that has been done is to make them tamper-evident. Since we are dealing with general software running on commodity hardware and operating systems, we don't even have the luxury of being able to guarantee we have exclusive access to our own registers, let alone RAM or disk. Your application can be running in a debugger; the OS API can be emulated; the OS can be running on emulated hardware; there are many layers where someone with enough resources and full control of the application can peer into each register, even changing register values, as the application runs. There are tricks that can be used to make this more difficult and time consuming, but generally they result in slower, more difficult to debug code, and (the worst offense for a production coding environment) these techniques almost universally cause the maintenance cost of the application to skyrocket. Except for the few notes below regarding passwords and a few other critical elements like credit card numbers, I do not advocate the use of any of these techniques in the development of network code.
Firewalls, DMZ, proxies, SSL… These are all great buzzwords, but what do they mean? They are all about layers of defense. No single layer is sufficient; you should have the maximum you can afford (in time, dollars, people, etc.). Some can be quite cheap, particularly if you are willing to restrict access to your hardware (say, by turning off all ports but 80 on your web server). Intrusion detection/prevention systems are great, IF someone with the right knowledge monitors them. Having a car alarm is useless if you don't run out to the car and check it every time it goes off! Remember: security is a process, not a product.
The magic word! Add a little encryption and suddenly you are secure from crackers, right? Dead wrong! Securing data via encryption involves far more than complex, bulging-forehead math; it is about protocols, handshakes, key generation, etc. (of course, if the math sucks, the rest is a waste of time). The implementation is critical to success, and even the very best security professionals make mistakes (sometimes real doozies!); doing it yourself is just asking for trouble. Security through obscurity is considered the LAST line of defense (no sense in giving the cracker any information about your system, even if it is a public web server in a DMZ). Do not think that just because Bob created the encryption algorithm during a haze of caffeine-induced Zen programming it provides you any protection. Good encryption is very hard to get right; it is best to rely on publicly available libraries (and be sure to sign up for their notification lists so you can find out when patches become available).
Keys are critical to encryption. If you don't secure your keys then you have no encryption at all! Keys also need to be backed up, as well as made available to others in the event the beer truck takes you out, so there are even more points of access than for the backups themselves. I won't detail the various ways you can screw up the protection (many weighty tomes have been penned to provide that information); just keep this very much in the front of your mind and do some research before you put anything into production.
Never store security-critical data in the clear in the storage medium. Passwords should always be stored after passing through a one-way hash (to validate a user, run the supplied password through the same hash, then compare the results; if they are the same, the password is almost certainly correct). That will not protect the user from a dictionary attack if a cracker gets access to the database, but it removes the trivial use of the password. Users tend to reuse passwords (I won't add to the discussion about whether the use of passwords is a good thing, only acknowledge that it is likely to be critical to securing access for years to come), which means that if someone gets a user's password for one account there is an excellent probability that the same password will access many of their other accounts. You can prevent the comparison of hashed passwords across systems (to tell whether a user reused one) by supplying an instance-specific 'salt', meaning that the same password generates different hashes on different machines/servers. Most modern OSes can allocate memory that is guaranteed not to be swapped to disk, and the memory allocated for storing the clear-text password should be this sort of memory if available (keep in mind, though, that the entire OS can be running on emulated hardware, and a debugger gives the user access to everything). Another alternative (not as secure) is to use stack-based variables in the handling of the password; since the stack is readily overwritten, the likelihood that the password will exist in a readable format for very long is slim. In any case, as soon as you are done with the password, overwrite the memory location where it was stored.
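A sketch of salted hashing and prompt overwriting. fnv1a() is a stand-in so the example is self-contained; a real system must use a cryptographic hash (e.g. SHA-256 from a vetted library), never FNV:

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Non-cryptographic 64-bit FNV-1a, used here ONLY as a placeholder hash.
uint64_t fnv1a(const std::string &s) {
    uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : s) { h ^= c; h *= 1099511628211ULL; }
    return h;
}

// Store hash(salt + password); the clear-text password is never persisted.
uint64_t stored_hash(const std::string &salt, const std::string &password) {
    return fnv1a(salt + password);
}

// To validate, hash the candidate the same way, then overwrite the
// clear-text copy the moment we are done with it.
bool check_password(const std::string &salt, std::string &candidate,
                    uint64_t expected) {
    bool ok = fnv1a(salt + candidate) == expected;
    if (!candidate.empty())
        memset(&candidate[0], 0, candidate.size());
    return ok;
}
```

Note that the concatenation itself creates a temporary copy of the password; a production implementation would feed salt and password into the hash incrementally to avoid stray copies.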
Sensitive user data such as credit card numbers should also be encrypted, though obviously not with a one-way hash. Sensitive data should always be encrypted as it resides in the storage medium and only decrypted for the minimum amount of time (in a secure section of memory; see the discussion on password memory), then promptly overwritten when its use is over. Another possible way to 'secure' yourself is to shift responsibility for storing the credit card details to some well-known third-party service, so that you do away with the liability. All your application needs to store is some key value that points to the actual card data held by the third-party service; it then has no real need to store the actual card details. This protection is all about managing your keys. Sloppy key management leads to no encryption at all, so be sure you don't flub this part!
As mentioned briefly in protecting user credentials, the user's password (or whatever) should never be stored anywhere in the clear. I like to use a hash of the user's name, password, and an application-specific salt (random initially, but fixed for the life of the application) for actual authentication. That way two users can never have the same hash even if they have the same password, and if they (contrary to 'good security practices') use the same password with different applications, the hash will still be different. Once the user has been authenticated they should get some sort of expiring token that represents them. This is commonly seen in web pages, but it is just as useful in any client/server communication (even on an intranet). It is poor security to trust any connection (even an encrypted one) for continuing communication just because the user initially authenticated, as the user could have had their session hijacked (possible, though very difficult, even when using encryption). In addition, the session key should expire in a short, fixed time period. For best security, generate a new session key with each communication. The session keys must NOT be predictable; they should come from a large pool of randomness and must not depend solely on what the client supplies (meaning they must depend on some randomness on the server side). Never use 'time', 'clock' or even the instruction counter to increment your key; even if run through a hash, doing so greatly reduces the total key space and can make your keys very predictable.
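A sketch of server-side session-key generation. std::random_device is used as the entropy source; on most platforms it wraps the OS CSPRNG, but verify that on your target (or use the platform's crypto facility, such as /dev/urandom, directly). The point is what is absent: no time(), no counter, nothing client-supplied:

```cpp
#include <cstdint>
#include <random>
#include <string>

// Produce a 256-bit session key as 64 hex characters, drawn entirely from
// server-side randomness; nothing the client supplies is involved.
std::string new_session_key() {
    std::random_device rd;
    static const char hex[] = "0123456789abcdef";
    std::string key;
    for (int i = 0; i < 8; ++i) {       // 8 draws x 32 bits = 256 bits
        uint32_t word = rd();
        for (int b = 0; b < 8; ++b) {
            key += hex[word & 0xF];
            word >>= 4;
        }
    }
    return key;
}
```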
You do make backups, right? If you don't, then by default you have an insecure system, because part of security is availability. If you deny access to something then you are in control of it; if you lack backups and you have a crash, then you have no system! It is quite obvious, but many people fail to treat the subject seriously. OK, now that you have backups, where exactly are they? If they are sitting inside the server then we are back to having no backups. The tapes (or whatever) need to be off-site so that if the building burns down you still have protection. A media-rated fire safe can be used if off-site storage is impractical, but off-site is best. Of course, off-site gives crackers two locations to attack your system. If you don't treat your backups like the diamond-studded bauble that is your live system, you are making it trivial for a cracker to take you to the cleaners. Backups must be kept secure during their travels to and fro, and of course they must be very well secured wherever they are stored. When you are done with your backups they should be physically destroyed, preferably by incineration. 'Deleting' a file simply marks the space as available; the data resides intact until it is overwritten. Even overwriting data does not completely erase the previous data from the reach of a determined cracker. Degaussing a tape/hard drive has to be done carefully to be sure that all the media has been completely wiped; it is not something for the inexperienced. You could encrypt the data on the backup (an excellent idea, btw), but then you must be sure to secure your keys, adding yet another layer to manage.
The real world rears its ugly head! You can't operate in a vacuum; anything you do that interacts with the universe is impacted by your surroundings. So you wrote the most secure program possible (inventing some techniques you hope to patent!) running on a box that has had its OS personally tuned for maximum security (you are so good you find things before the crackers do), yet you leave your box sitting in an unlocked room just waiting for someone to pick it up and walk off with it. Physical security is just as critical as data and application security. Drop the ball here and all your other efforts are wasted if Joe Random Cracker knows where you live. Another critical thing to keep in mind (notice how everything is critical? Your attacker only has to find a single way of penetrating your defenses and you have none) is that the people who have access to your machines (and backups!) should be trusted. This doesn't mean that you simply trust them because they have access; this means you do background checks on everyone who has unrestricted access to the network, server room, backup tapes, database, test machines, etc. Oh, let's not forget the janitor, a massive blind spot in most organizations. Since few people relish the thought of emptying trashcans, it is remarkable how little scrutiny is given to those who volunteer. Besides the obvious security risk of 'dumpster diving' (whereby sensitive documents are harvested from your trash), the janitor generally has completely unrestricted access to your entire infrastructure. Just because most janitors lack an intensive Infosec education (the ones that do are the ones cracking your system) does not mean that they should be ignored; they could be carrying out simple instructions from the black-hat who does have the appropriate training. Un-trusted people should never have unescorted access to any part of your system at any time.
Since that may be impractical if you have an extensive network, all the routers, switches, etc. should be in locked rooms with alarms. If you also encrypt all communication on the network then you reduce (but far from eliminate!) the chance someone can simply plug a machine in and start harvesting sensitive information. A couple of words on wireless: Don't Use It! If you have to have wireless communication (for a warehouse, for instance) then it should be on an isolated network with a complete firewall protecting the rest of your system. Ever hear of 'war dialing'? Now there is 'war driving', where people drive around looking for wireless access points. Since all these systems are designed to be user friendly, they will announce themselves to anyone who asks and provide instructions on how to access the system. You can (and should!) reduce the chance a random person will use your network by enabling WEP and telling your system not to respond to queries, but the WEP protocol is well and truly broken and so provides nothing more than a rather low bar for a determined cracker. Once someone is on your network they can listen to all your traffic (which is why encryption can help deter crackers). Once someone gets an account on your system it is generally accepted that they can elevate their privileges quite easily. The moral of this story: don't let anyone un-trusted on your system!
Just a little note for those of you who do not yet have sweat beading from your foreheads… It is commonly accepted amongst security professionals that at least half of the break-ins are done by trusted insiders. Just another excellent reason to encrypt ALL network traffic. And keep those server doors locked! And really, do a background check; at least make them work to fool you!
Let's talk a moment about disaster. This means that something unpredictable (or ignored) jumps up and bites you on your sensitive parts and now you have a mess. If you have no plan to deal with disaster, then a clever cracker can create one for you and then rob you blind while you are running in circles waving your hands. Obviously backups are important, but just as critical is knowing what to do when you desperately need those backups. Pre-thinking these scenarios goes a long way toward being ready when disaster strikes; actually writing the plan down and publishing it throughout your organization is even better; practicing the scenarios (say, by turning off the power to the server room during prime time) is best. There is nothing like practice to prepare you.
Just like an Olympic race, you can never predict the outcome of a race condition 100% of the time. The nature of preemptive multi-tasking operating systems (*nix, Windows, MacOS, etc.) means that your application can be 'swapped out' of the CPU at any instruction. A lot can happen while your process is in limbo and there is absolutely no way to predict when your process will have access to the CPU again (if ever). In addition, every time you make a request to the OS there is a 'context switch', meaning your process has been swapped out of the CPU so that the OS can handle some task (accessing a file, communicating with a socket, allocating memory, etc.), and there is a decent chance that something can happen before your process gets back to the CPU. As a consequence, there are a lot of situations where your program can initiate an operation yet another process can jump in and change conditions before it completes. I won't detail a lot of situations where this can happen (Google on "race conditions" and you will probably find more than enough to satisfy your curiosity), but it is something to keep in mind. The greater the potential lag in acquiring the resource, the greater the chance of a race condition. BTW, you see this same effect in multi-threaded/multi-process applications, so it is very much worth learning about so you can be aware of when it might be happening. There is generally some help available from the OS to avoid race conditions, but it is almost always OS specific (sometimes even version specific) so I won't detail any here.
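The swapped-out-at-any-instruction problem is easiest to see with threads, where the same preemption happens inside one process. A minimal sketch (Python; names are illustrative) of a read-modify-write race, together with the lock that closes it:

```python
import threading

counter = 0
lock = threading.Lock()

def bump_unsafe(n):
    # Read-modify-write is NOT atomic: the scheduler can preempt this
    # thread between the read and the write, so another thread's
    # update can be silently clobbered.
    global counter
    for _ in range(n):
        tmp = counter
        counter = tmp + 1

def bump_locked(n):
    # Holding a lock across the whole read-modify-write closes the
    # race: no other thread can interleave between read and write.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=bump_locked, args=(100_000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 200000; had the threads run bump_unsafe
# instead, it could end up lower, and by an unpredictable amount.
```

The lock here plays the same role as the OS-specific mechanisms mentioned above: it makes a multi-step operation effectively atomic.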
An example may help clarify the issue. Let's say you have a high privilege application (something running as root/local admin providing a service to a non-root/admin user) that needs to read a file that is storing some important data about the user. Being a good security-conscious application it checks to see that the file is there, has the right permissions, isn't in use, etc. However, there is an unknown and unquantifiable time between when the application checks these things and when it makes a request to the OS to open the file. Without using OS specific mechanisms to avoid this situation it is impossible to guarantee that the file the application intended to open is the actual file that it is accessing. During periods of light load on the server, it is very reasonable to expect the process would be nearly atomic. However, it is typically trivial to throw a huge load on a server (calculate PI to a billion decimal places, for instance) at just the instant the application is doing its file thing and then try to slip a link into place in the window between the check and the open. A patient cracker can exploit even a low probability event.
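One widely used defense against this check-then-open race is to not do a separate check at all: open the file first (refusing to follow a symbolic link in the final path component) and then inspect the descriptor you actually hold. A minimal Python sketch, assuming a POSIX system (O_NOFOLLOW is not available everywhere):

```python
import os
import stat

def open_untrusted(path):
    # Open first: O_NOFOLLOW makes open() fail outright if the final
    # path component is a symbolic link, so an attacker cannot slip a
    # link in between a separate existence check and the open.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    # Then inspect the descriptor we hold (fstat), not the path name
    # (stat) -- the path could point somewhere else by now, but the
    # descriptor cannot be swapped out from under us.
    info = os.fstat(fd)
    if not stat.S_ISREG(info.st_mode):
        os.close(fd)
        raise PermissionError(path + " is not a regular file")
    return fd
```

The same check-the-descriptor pattern applies in C with `open(2)` plus `fstat(2)`; the point is that every decision is made about the object you already have open, leaving no window for a swap.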
Be aware of your environment (the application's, not the guy sneaking up behind you (though you might take a quick look over your shoulder just to be sure)). It is quite simple to alter an application's environment before it is executed, and if you blindly trust this information it can be simple for an attacker to exploit your vulnerabilities.
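A default-exclude filter for the environment can be sketched like this (Python; the allow-list and PATH value are illustrative assumptions, not a definitive set): keep only the variables the program actually needs and pin the dangerous ones to known values, rather than inheriting whatever the caller set.

```python
import os

# Illustrative allow-list: everything not named here is dropped.
KEEP = {"HOME", "LANG", "TZ"}
SAFE_PATH = "/usr/bin:/bin"   # assumed system directories

def sanitized_environment():
    # Default-exclude: start from nothing, copy only allowed values.
    env = {k: v for k, v in os.environ.items() if k in KEEP}
    env["PATH"] = SAFE_PATH    # never trust the caller's PATH
    env["IFS"] = " \t\n"       # pin the shell's field separator
    return env
```

A typical use would be passing the result as the `env=` argument to `subprocess.run()` (or building an `envp` for `execve()` in C) so child processes never see attacker-controlled variables such as `LD_PRELOAD`.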
A final note: Keep It Simple Stupid. Complexity is the bane of all projects. As the complexity increases linearly the chance of failure increases geometrically. Never build more into an application than is needed, always remove dead code (commenting it out is not acceptable, rely on CVS or equivalent to retain old versions) and strive to reduce the complexity of any given algorithm each time you touch it (refactoring).
What I hope you take away from this document is that there are a lot of places that need attention when attempting to secure an application, most of which have nothing to do with buffer overflows. While this may seem overwhelming to a 'simple coder', knowing about the potential vulnerabilities at design time can allow some simple and trivial requirements to be put in place to avoid the common pitfalls. With a little attention to detail it is possible to remove all the insecure low-hanging fruit from your application and make a cracker earn her pay. It is a practical impossibility to be completely secure; the best you can do is manage the risk. In order to manage the risk properly you must be aware of the potential vulnerabilities; a document like this can give you good ideas on what needs attention. I would like to reiterate one more time… The list above is only a starting point in securing an application. Experienced security professionals make mistakes even with extensive training backing up their practical, on-the-job knowledge; please do not feel that following this minimum guide is all that is needed to protect yourself.
Keith (mitakeet) Oxenrider
September 26, 2004