Winsock Programmer’s FAQ
Articles: The Lame List
I have reproduced The Lame List here because it is so valuable. This text is cut-and-pasted directly from Appendix C of version 2.2.2 of the Windows Sockets 2 Application Programming Interface. The list originally started out as a list of complaints by Winsock stack vendors about wrongheaded applications created back when Winsock was new and not as well understood. Despite that, these items are still valuable because newbie Winsockers still make the same wrongheaded mistakes. Avoiding the items on this list will take you a long way along the road toward Winsock guruhood.
This version of the List is slightly different from the original: I have changed some punctuation, minor bits of phrasing, etc. And, of course, I have added all the pretty HTML formatting.
Keith Moore of Microsoft gets the credit for starting this, but other folks have begun contributing as well. Bob Quinn, from sockets.com, is the kind soul who provided the elaborations on why these things are lame and what to do instead. This is a snapshot of the list as we went to print (plus a few extras thrown in at the last minute).
Brought to you by The Windows Sockets Vendor Community
connect() on a non-blocking socket, getting
WSAEWOULDBLOCK, then immediately calling
WSAEWOULDBLOCK before the connection has been established. Lame.
recv(). Lame assumption.
WSAEWOULDBLOCK error value, but must not depend on occurrence of the error.
select() with three empty fd_sets and a valid TIMEOUT structure as a sleazy delay function. Inexcusably lame.
select() function is intended as a network function, not a general purpose timer.
connect() on a non-blocking socket to determine when the connection has been established. Dog lame.
connect() when a non-blocking connection is pending, so the error value returned may vary.
select() function (but see item 23).
recv() is even more lame than polling on
Reason: The only invalid socket handle value is defined by the winsock.h file as INVALID_SOCKET. Any other value the SOCKET type can handle is fair game, and an application must handle it. In any case, socket handles are supposed to be opaque, so applications shouldn’t depend on specific values for any reason.
Alternative: Expect a socket handle of any value, including 0. And don’t expect socket handle values to change with each successive call to
handles may be reused by the Winsock implementation.
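Here is a minimal sketch of the only validity check this item allows. The helper name `socket_is_valid()` is mine, not part of Winsock, and the non-Windows branch is just a stand-in so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
/* Stand-ins for non-Windows builds (an assumption made for this sketch). */
typedef int SOCKET;
#define INVALID_SOCKET (-1)
#endif

/*
 * Hypothetical helper: a handle is usable unless it is exactly
 * INVALID_SOCKET. Zero is accepted like any other value.
 */
int socket_is_valid(SOCKET s)
{
    return s != INVALID_SOCKET;
}
```

The point is what the function does not do: it never tests for `s > 0` or `s != 0`, and it never assumes a fresh handle differs from one it saw earlier.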
select() and a zero timeout in Win16’s non-preemptive environment. Nauseatingly lame.
Reason: With any non-zero timeout, select() will call the current blocking hook function, so an application anticipating an event will yield to other processes executing in a 16-bit Windows environment. However, with a zero timeout an application will not yield to other processes, and may not even allow network operations to occur (so it will loop forever).
WSAAsyncSelect() with a zero Event mask just to make the socket non-blocking. Lame. Lame. Lame. Lame. Lame.
WSAAsyncSelect() is designed to allow an application to register for asynchronous notification of network events. The Winsock 1.1 specification didn’t specify an error for a zero event mask, but an implementation may interpret it as an invalid input argument (so it may fail with WSAEINVAL), or silently ignore the request.
ioctlsocket(FIONBIO). That’s what it’s for.
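A small sketch of the ioctlsocket(FIONBIO) call, wrapped in a helper I have named `set_nonblocking()` (a hypothetical name). The non-Windows branch uses `ioctl()` purely so the snippet builds outside Windows; it is not part of Winsock.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/ioctl.h>
#include <sys/socket.h>
typedef int SOCKET;
#endif

/*
 * Hypothetical helper: switches a socket to non-blocking mode
 * (enable != 0) or back to blocking mode (enable == 0).
 * Returns 0 on success, -1 on failure.
 */
int set_nonblocking(SOCKET s, int enable)
{
#ifdef _WIN32
    u_long mode = enable ? 1UL : 0UL;
    return ioctlsocket(s, FIONBIO, &mode) == 0 ? 0 : -1;
#else
    int mode = enable ? 1 : 0;
    return ioctl(s, FIONBIO, &mode) == 0 ? 0 : -1;
#endif
}
```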
SO_OOBINLINE, nor read OOB data. Violently
Reason: It is not uncommon for Telnet servers to generate urgent data, such as when a Telnet client sends a Telnet BREAK command or Interrupt Process command. The server then employs a "Synch" mechanism which consists of a TCP Urgent notification coupled with the Telnet DATA MARK command. If the telnet client doesn’t read the urgent data, then it won’t get any more normal data. Not ever, ever, ever, ever.
Alternative: Every telnet client should be able to read and/or detect OOB data. They should either enable inline OOB data by calling
setsockopt(SO_OOBINLINE), or use
except_fds to detect OOB data arrival, and call
MSG_OOB in response.
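The first of the two options above, enabling inline OOB data, might look like the sketch below. `enable_oob_inline()` is a name I made up; the non-Windows headers are there only so the snippet compiles elsewhere. The except_fds approach is not shown.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/socket.h>
typedef int SOCKET;
#endif

/*
 * Hypothetical helper: asks the stack to deliver urgent (OOB) data
 * inline with normal data, so a plain recv() will see it.
 * Returns 0 on success, -1 on failure.
 */
int enable_oob_inline(SOCKET s)
{
    int on = 1; /* the option value is a boolean stored in an int */
    return setsockopt(s, SOL_SOCKET, SO_OOBINLINE,
                      (const char *)&on, sizeof(on)) == 0 ? 0 : -1;
}
```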
Reason and Alternative: See item 4.
Reason: Winsock applications that don’t close sockets, and call WSACleanup(), may not allow a Winsock implementation to reclaim resources used by the application. Resource leakage can eventually result in resource starvation by all other Winsock applications (i.e. network system failure).
Alternative: While a blocking API is in progress in a 16-bit Winsock 1.1 application, the proper way to abort is to:
WSAEINTR error, but applications must
also be prepared for success, due to the race condition involved
shutdown() with the how equal to 1
recv() until it returns 0 or fails with any error
This procedure is not relevant to 32-bit Winsock 2 applications, since they really block, so calling
from the same thread is impossible. (Therefore, this call is
deprecated under Winsock 2.) However, the shutdown procedure above
is still useful.
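The shutdown procedure, minus the 16-bit cancellation steps, can be sketched as follows. `graceful_close()` is a hypothetical helper name, and closing the handle at the end is my assumption about what the application wants to do next. The non-Windows branch exists only so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;
#define closesocket close
#endif

/*
 * Hypothetical helper: stop sending, drain what the peer still has
 * in flight, then release the handle. Returns 0 on success, -1 if
 * shutdown() or closesocket() failed.
 */
int graceful_close(SOCKET s)
{
    char discard[512];
    int n;

    /* how == 1: no more sends (SD_SEND in Winsock 2 headers). */
    if (shutdown(s, 1) != 0) {
        closesocket(s);
        return -1;
    }

    /* Read until recv() returns 0 (peer closed) or fails with any error. */
    do {
        n = (int)recv(s, discard, (int)sizeof(discard), 0);
    } while (n > 0);

    return closesocket(s) == 0 ? 0 : -1;
}
```

On a blocking socket the drain loop waits until the peer closes its side, so a real application may want a timeout around it.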
Reason: TCP can’t do Out of Band (OOB) data reliably. If that isn’t enough, there are incompatible differences in the implementation at the protocol level (in the urgent pointer offset). Berkeley (BSD) Unix implements RFC 793 literally, and many others implement the corrected RFC 1122 version. (Some versions also allow multiple OOB data bytes by using the start of the MAC frame as the starting point for the offset.) If two TCP hosts have different OOB versions, they cannot send OOB data to each other.
Alternative: Ideally, you can use a separate socket for urgent data, although in reality OOB data is sometimes inescapable. Some protocols require it (see item 7), in which case you need to minimize your dependence, or beef up your technical support staff to handle user calls.
strlen() on a hostent structure’s IP address, then truncating it to four bytes, thereby overwriting part of malloc()’s heap header. In all my years of observing lameness, I have seldom seen something this lame.
Reason: This doesn’t really need a reason, does it?
Alternative: Clearly, the only alternative is a brain transplant.
recv(MSG_PEEK) to determine when
a complete message has arrived. Thrashing in a sea of
Reason: A stream socket (TCP) does not preserve message boundaries (see item 20). An application that uses ioctlsocket(FIONREAD) to wait for a complete message to arrive, may never succeed. One reason might be the internal service provider’s buffering; if the bytes in a "message" straddle a system buffer boundary, the Winsock may never report the bytes that exist in other buffers.
Alternative: Don’t use peek reads. Always read data into your application buffers, and examine the data there.
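To make "examine the data there" concrete, here is a sketch that looks for a complete message in an application buffer that recv() has been appending to. The wire format (a 2-byte big-endian length followed by the payload) and the name `extract_message()` are assumptions of mine for illustration; the List does not prescribe any framing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical framing: [len_hi][len_lo][payload bytes...].
 *
 * Tries to pull one complete message off the front of buf, which
 * currently holds *buf_len bytes. On success the payload is copied
 * to out, the message is removed from buf, *buf_len is reduced, and
 * the payload length is returned.
 * Returns -1 if the message is not complete yet (keep receiving),
 * or -2 if out_cap is too small for the payload.
 */
long extract_message(unsigned char *buf, size_t *buf_len,
                     unsigned char *out, size_t out_cap)
{
    size_t need;

    if (*buf_len < 2)
        return -1;                      /* length prefix incomplete */

    need = ((size_t)buf[0] << 8) | buf[1];
    if (*buf_len < 2 + need)
        return -1;                      /* payload incomplete */
    if (need > out_cap)
        return -2;

    memcpy(out, buf + 2, need);
    /* Slide any bytes of the next message to the front of buf. */
    memmove(buf, buf + 2 + need, *buf_len - 2 - need);
    *buf_len -= 2 + need;
    return (long)need;
}
```

The caller keeps calling recv() into the free space after `buf + *buf_len` and retries the extraction after each read; no peeking required.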
Reason: Winsock implementations often check buffers for readability or writability before using them to avoid Protection Faults. When a buffer length is longer than the actual buffer length, this check will fail, so the function call will fail with
Alternative: Always pass a legitimate buffer length.
WSACleanup(). Pushing the
Reason: This is not illegal, as long as each
has a matching call to
WSACleanup(), but it is more work
Alternative: In a DLL, custom control or class library, it is possible to register the calling client based on a unique task handle or process ID. This allows automatic registration without duplication. Automatic de-registration can occur when a process closes its last socket. This is even easier if you use the process notification mechanisms available in the 32-bit environment.
Reason: Error values are your friends! When a function fails, the error value returned by WSAGetLastError() or included in an asynchronous message can tell you why it failed. Based on the function that failed, and the socket state, you can often infer what happened, why, and what to do about it.
Alternative: Check for error values, and write your applications to anticipate them, and handle them gracefully when appropriate. When a fatal error occurs, always display an error message that shows:
recv(MSG_PEEK) in response to an
FD_READ async notification message. Profoundly
Reason: It’s redundant. It’s redundant.
Alternative: Make a plain recv() call in response to an FD_READ message. Even if it fails with WSAEWOULDBLOCK, that error is easy to ignore, and you are guaranteed to get another FD_READ message later since there is data pending.
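A sketch of what the body of an FD_READ handler might do, assuming a non-blocking socket. The function name `on_fd_read()` and the two `READ_*` return codes are hypothetical; the window-message plumbing around it is left out, and the `errno` branch is only there so the snippet compiles outside Windows.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <errno.h>
#include <sys/socket.h>
typedef int SOCKET;
#endif

#define READ_FAILED      (-1)  /* a real error: go look at the error value */
#define READ_WOULD_BLOCK (-2)  /* nothing to read right now: just ignore it */

/*
 * Hypothetical handler body: one plain recv(), no peeking.
 * Returns the byte count (> 0), 0 if the peer closed the connection,
 * or one of the READ_* codes above.
 */
int on_fd_read(SOCKET s, char *buf, int cap)
{
    int n = (int)recv(s, buf, cap, 0);
    if (n >= 0)
        return n;
#ifdef _WIN32
    if (WSAGetLastError() == WSAEWOULDBLOCK)
        return READ_WOULD_BLOCK;
#else
    if (errno == EWOULDBLOCK || errno == EAGAIN)
        return READ_WOULD_BLOCK;
#endif
    return READ_FAILED;
}
```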
FALSE. Floundering in an endless desert of
Reason: One of the primary purposes of the blocking hook function was to provide a mechanism for an application with a pending blocking operation to yield. By returning FALSE from the blocking hook function, you defeat this purpose and your application will prevent multitasking in the non-preemptive 16-bit Windows environment. This may also prevent some Winsock implementations from completing the pending network operation.
Alternative: Typically this hack is done to try to prevent reentrant messages. There are better ways to do this, like subclassing the active window, although, admittedly, reentrant messages are not an easy problem to avoid.
Note that this is not an issue for Winsock 2 applications, since blocking hooks are now a thing of the past! (Good riddance.)
Reason: By definition, client applications actively initiate a network communication, unlike server applications which passively wait for communication. A server must bind() to a specific port which is known to clients that need to use the service; however, a client need not bind() its socket to a specific port in order to communicate with a server.
Not only is it unnecessary for all but a very few application protocols, it is dangerous for a client to
bind() to a specific
port number. There is a danger in conflicting with another socket
that is already using the port number, which would cause the call
bind() to fail with
Alternative: Simply let the Winsock implementation assign the local port number implicitly when you call connect() (on stream or datagram sockets), or sendto() (on datagram sockets).
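A sketch of the implicit assignment in action: connect() with no bind(), then getsockname() to see which local port the stack picked. `connect_without_bind()` is a hypothetical helper; it assumes a datagram socket or a blocking stream socket, and an IPv4 address in dotted notation. The non-Windows headers are only there so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>   /* for socklen_t */
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
typedef int SOCKET;
#endif
#include <string.h>

/*
 * Hypothetical helper: connects s to ip:port without binding first,
 * then reports the local port chosen by the implementation (host
 * byte order). Returns 0 on success, -1 on failure.
 */
int connect_without_bind(SOCKET s, const char *ip, unsigned short port,
                         unsigned short *local_port)
{
    struct sockaddr_in remote, local;
    socklen_t len = (socklen_t)sizeof(local);

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(port);
    remote.sin_addr.s_addr = inet_addr(ip);
    if (remote.sin_addr.s_addr == INADDR_NONE)
        return -1;                       /* not a dotted IPv4 address */

    if (connect(s, (struct sockaddr *)&remote,
                (socklen_t)sizeof(remote)) != 0)
        return -1;

    if (getsockname(s, (struct sockaddr *)&local, &len) != 0)
        return -1;

    *local_port = ntohs(local.sin_port);
    return 0;
}
```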
Reason: The Nagle algorithm reduces trivial network traffic. In a nutshell, the algorithm says don’t send a TCP segment until either:
A "Nagle challenged application" is one that cannot wait until either of these conditions occurs, but has such time-critical data that it must send continuously. This results in wasteful network traffic.
Alternative: Don’t write applications that depend on the immediate data echo from the remote TCP host.
Reason: Stream sockets (TCP) are called stream sockets, because they provide data streams (duh). As such, the largest message size an application can ever depend on is one byte in length. No more, no less. This means that with any call to
the Winsock implementation may transfer any number of bytes less
than the buffer length specified.
Alternative: Whether you use a blocking or non-blocking socket, on success you should always compare the return from recv() with the value you expected. If it is less than you expected, you need to adjust the buffer length, and pointer, for another function call (which may occur asynchronously, if you are using asynchronous operation mode).
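For the blocking case, the adjust-and-call-again advice can be sketched as a loop. `recv_exact()` is a hypothetical helper name; it assumes a blocking stream socket, so it does not deal with WSAEWOULDBLOCK. The non-Windows branch exists only so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/socket.h>
typedef int SOCKET;
#endif

/*
 * Hypothetical helper: keeps calling recv() until `want` bytes have
 * arrived. Returns the number of bytes read, which is less than
 * `want` only if the peer closed the connection early, or -1 if
 * recv() failed.
 */
int recv_exact(SOCKET s, char *buf, int want)
{
    int got = 0;

    while (got < want) {
        /* Advance the pointer and shrink the length by what we have. */
        int n = (int)recv(s, buf + got, want - got, 0);
        if (n == 0)
            return got;     /* connection closed by the peer */
        if (n < 0)
            return -1;      /* failure: check the error value */
        got += n;
    }
    return got;
}
```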
WSACleanup() from their
WEP. Inconceivably lame.
WEP() is lame, ergo depending on it is lame. Seriously, 16-bit Windows did not guarantee that WEP() would always be called, and the Windows subsystem was often in such a hairy state that doing anything in WEP() was dangerous.
Alternative: Stay away from
in a pool of lameness.
Reason: Couple one-byte sends with Nagle disabled, and you have at best a 40:1 overhead-to-data ratio. Can you say wasted bandwidth? I thought you could.
As for one-byte receives, think of the effort and inefficiency involved with trying to drink a Guinness Stout through a hypodermic needle. That’s about how your application would feel "drinking" data one byte at a time.
Alternative: Consider Postel’s RFC 793 words to live by: "Be conservative in what you do, be liberal in what you accept from others." In other words, send modest amounts, and receive as much as possible.
select(). Self abusively lame.
Reason: Consider the steps involved in using
select(). You need
to use the macros to clear the 3
fd_sets, then set the
fd_sets for each socket, then set the timer,
select() returns with the number of sockets that
have done something, you need to go through all the
and all the sockets using the macros to find the event that occurred,
and even then the (lack of) resolution is such you need to infer
the event from the current socket state.
Alternative: Use asynchronous operation mode (e.g.
gethostbyname() before calling
inet_addr(). Words fail to express such all-consuming
Reason: Some users prefer to use network addresses rather than hostnames at times. The Winsock 1.1 specification does not say what gethostbyname() should do with an IP address in standard ASCII dotted IP notation. As a result, it may succeed and do an (unnecessary) reverse-lookup, or it may fail.
Alternative: With any destination input by a user (which may be a hostname or dotted IP address), you should call inet_addr() first to check for an IP address, and if that fails call gethostbyname() to try to resolve it.
Furthermore, in some applications, you may want to explicitly check the input string for the broadcast address "255.255.255.255," since the return value from
inet_addr() for this address is
the same as
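Put together, the lookup order might look like this sketch. `resolve_ipv4()` is a hypothetical helper name. It relies on the fact that inet_addr() signals failure by returning INADDR_NONE, which is why the broadcast string gets its own check, and it assumes WSAStartup() has already been called on Windows. The non-Windows headers are only there so the snippet compiles elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#endif
#include <string.h>

/*
 * Hypothetical helper: turns user input (dotted IP or hostname) into
 * an IPv4 address in network byte order.
 * Returns 0 on success, -1 if the input could not be resolved.
 */
int resolve_ipv4(const char *host, struct in_addr *out)
{
    struct hostent *he;

    /* Broadcast parses to the same value inet_addr() uses for failure. */
    if (strcmp(host, "255.255.255.255") == 0) {
        out->s_addr = htonl(INADDR_BROADCAST);
        return 0;
    }

    /* Dotted IP address? Then no name lookup is needed. */
    out->s_addr = inet_addr(host);
    if (out->s_addr != INADDR_NONE)
        return 0;

    /* Not an address, so treat it as a hostname. */
    he = gethostbyname(host);
    if (he == NULL || he->h_addrtype != AF_INET ||
        he->h_length != (int)sizeof(out->s_addr))
        return -1;

    /* h_addr_list entries are raw address bytes, not strings. */
    memcpy(&out->s_addr, he->h_addr_list[0], sizeof(out->s_addr));
    return 0;
}
```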
Reason: Besides yielding to other applications (see item 17), blocking hook functions were originally designed to allow concurrent processing within a task while there was a blocking operation pending. In Win32, there’s threading.
Alternative: Use threads.
ioctlsocket(FIONREAD) on a stream socket
until a complete "message" arrives. Exceeds the bounds of earthly
Reason and Alternative: See item 12.
Reason: Various networks all have their limitations on maximum transmission unit (MTU). As a result, fragmentation will occur, and this increases the likelihood of a corrupted datagram (more pieces to lose or corrupt). Also, the TCP/IP service providers at the receiving end may not be capable of re-assembling a large, fragmented datagram.
Alternative: Check for the maximum datagram size with the SO_MAX_MSG_SIZE socket option, and don’t send anything larger. Better yet, be even more conservative. A max of 8K is a good rule-of-thumb.
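A sketch of that rule: query the provider's limit, then cap it at 8K. Both function names and the `CONSERVATIVE_DATAGRAM_MAX` constant are hypothetical, and falling back to 8K when the query fails is my assumption. The SO_MAX_MSG_SIZE query is Winsock 2 only, so it is compiled only on Windows.

```c
#ifdef _WIN32
#include <winsock2.h>
#endif

/* The 8K rule of thumb from the text. */
#define CONSERVATIVE_DATAGRAM_MAX 8192u

/* Hypothetical helper: never exceed the provider limit or 8K. */
unsigned int clamp_datagram_size(unsigned int provider_max)
{
    return provider_max < CONSERVATIVE_DATAGRAM_MAX
               ? provider_max
               : CONSERVATIVE_DATAGRAM_MAX;
}

#ifdef _WIN32
/*
 * Hypothetical helper: asks the service provider for its maximum
 * message size on this datagram socket and applies the cap above.
 */
unsigned int max_datagram_for(SOCKET s)
{
    unsigned int provider_max = 0;
    int len = (int)sizeof(provider_max);

    if (getsockopt(s, SOL_SOCKET, SO_MAX_MSG_SIZE,
                   (char *)&provider_max, &len) != 0)
        return CONSERVATIVE_DATAGRAM_MAX;   /* assumed fallback */

    return clamp_datagram_size(provider_max);
}
#endif
```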
Reason: UDP has no reliability mechanisms (that’s why we have TCP).
Alternative: Use TCP and keep track of your own message boundaries.
Reason: If you can’t figure out the reason, it’s time to hang up your keyboard.
Alternative: Have a fallback position that uses only base capabilities for when the extension functions are not present.
Reason: UDP is unreliable. TCP/IP stacks don’t have to tell you when they throw your datagrams away (a sender or receiver may do this when they don’t have buffer space available, and a receiver will do it if they cannot reassemble a large fragmented datagram).
Alternative: Expect to lose datagrams, and deal. Implement reliability in your application protocol, if you need it (or use TCP, if your application allows it).
Copyright owned by the authors of the Lame List items, including, but not necessarily limited to, the people mentioned in the introductory matter at the beginning of this article.
Updated Sun Oct 25 2009 01:54 MDT