Winsock Programmer’s FAQ
Articles: The History of Winsock
by Warren Young
This article was started in late 2009 when I refocused the FAQ on Windows NT 4.0 SP4 and newer. The FAQ had a lot of material for older systems in it, and rather than just axe it all, I decided to use it as a seed for a new article. The idea was to present all that information in historical context, to explain why Winsock was different in those days. It also explains a lot about why Winsock is the way it is today.
To properly explain such things, I found myself needing to give a lot of back-story that isn’t strictly about Winsock, but which nevertheless informed the design of the early Winsock implementations. Taken together, this is the story of the rise of Winsock and TCP/IP against the spate of proprietary networking schemes prevalant at the time.
With that history to ground us, the second part presents the technical differences and limitations in those old stacks relative to the modern ones covered in the FAQ proper. All of the limitations and bugs discussed here are in products released over a decade ago, and thus of little interest to working programmers, except as history.
Indeed, it is the very age of this FAQ that justifies the inclusion of a history article. History works like a lens: you can use it to create a picture that condenses a lot of activity into a thing small enough to speak its message clearly. The thing is, like a photographic lens, the lens of history also has a minimum focal distance: when you try to focus it on events that are still too close, you end up with a blurry confusion that won’t enlighten anyone. For technology, that minimum focusing distance is about a decade.
If you’re not interested in the history or lived through it like I did and aren’t interested in someone else’s recounting of it, feel free to skip ahead to that second part.
In the beginning, there was chaos.
Throughout the 1980s and into the early 1990s, there was little of the broad agreement on networking standards that we have today. Microsoft had LAN Manager, but it wasn’t even close to dominant. It competed with a zoo of other proprietary systems from Novell, IBM, 3Com, and other companies long since dead or unrecognizably morphed. Generally speaking, back in the DOS and early Windows eras, you bought your desktop operating system from Microsoft and your network stack from some third party, usually selected based on whose servers you were using 1.
Early networking hardware was not only proprietary and nonstandard, it was usually quite expensive to boot. I remember it being a big deal when network interface cards first started dropping below $1,000.
Because of this, the early PCs had no built-in support for networking. A programmer wanting to build a program that talked to the network either had to write his own network card drivers and protocol stack, or buy these programs from someone who had already written them.
Microsoft was trying to solve these sorts of problems in Windows, which abstracted away hardware details with its driver mechanism. They did a really good job with some classes of hardware. A Windows programmer has little need to worry about the particular chips in the computer, or the peripherals attached to it. A DOS programmer wanting to target a broad user base had to implement half a dozen different mouse protocols, for instance. Video cards were an even worse mess, and printers worse still. Windows presented a unified interface to all this hardware and more.
But until 1993, the broad hardware support in Windows did not extend to network cards. Why?
Microsoft had poor incentives to solve the networking problem in Windows. Their primary incentive through the early 1990s was to continue flogging their own horse in this race, LAN Manager. It was a DOS product, but then, its biggest competitors didn’t run on Windows, either: NetWare was built on top of Novell’s own proprietary OS, IBM’s LAN Server product ran atop OS/2, Banyan VINES atop a custom version of Unix... The LAN Manager technology wasn’t added to Windows until 1993, in Windows for Workgroups 3.1 and Windows NT 3.1.
Thus was the dominance of proprietary networking schemes prolonged.
Understand, Microsoft was trying to move people off of DOS and off of competing file and print services, but in typical Microsoft fashion, their first solution to the problem was weak and late. In part, Microsoft was competing against themselves. The DOS LAN Manager stack ran just fine underneath regular Windows 3.1, which had a year’s sales lead. Moving away from this required replacing both your client and server operating systems with more expensive versions, with little net gain in the end. It would take years for regular PC replacement cycles to diffuse these Windows implementations of the LAN Manager technology into the PC world to an extent that network managers felt they could start switching to it.
Meanwhile, Microsoft was also fighting against the inertia possessed by all their other competitors for dominance of the LAN. Moving off NetWare or VINES or whatever was no easy jump.
One by one, these competitors fell. By 1993, there was only one left worth talking about, Novell’s NetWare.
The reader may not be old enough to have lived through this battle between Microsoft and Novell over control of the LAN. Simply recall the same noise, confusion and hyperbole of any other major technology battle you have lived through: Netscape vs. Internet Explorer, iPhone vs. the world cell phone system, Google vs. everybody... The battle for control of the LAN was every bit as significant at the time. There were magazine covers about it. Really!
As in so many other such battles, the bone of contention was the tempting possibility of vendor lock-in through proprietary technologies. Whoever controlled the winning technology would control the LAN. (So they thought at the time, anyway.)
Above, I’ve been pretty hard on Microsoft, painting them as only interested in control through proprietary technology. Someone more charitable might instead observe that with all the confusion in the networking market of the time, it wasn’t as clear what it might mean to have a vendor-neutral network programming API as it was for, say, sound cards.
I would say it’s not charity I lack, but credulousness, because the problem had been solved a decade earlier in 4.2BSD. This was the first appearance of “sockets,” one of many BSD innovations to fix shortcomings in the original Unix. 4.2BSD also saw the first implementation of TCP/IP and is why sockets were created to begin with.
These BSD innovations were well known to many third-party Windows software developers. This created a market for TCP/IP stacks for Windows, which was small but thriving all this time Microsoft was focused elsewhere.
The third-party stack vendors had already created their own proprietary technology mess, however, each vendor providing a different programming API. A group of these vendors got together with Microsoft and hammered out a backwards-compatible extension of the sockets API, which they called WinSock 2, short for Windows Sockets. Version 1.0 of the spec came out in the summer of 1992. Half a year later, version 1.1 introduced several minor improvements and clarifications.
Winsock 1.1 was a truly landmark API, important for many years after, and arguably still important today. With it, programmers finally had an alternative to proprietary network APIs on Windows. You still had to buy a third-party stack, but at least you weren’t tied to that one vendor’s stack.
Although the creation of Winsock and the Windowsization of LAN Manager were both happening at the same time, they were not part of a single unified Microsoft strategy. The driver behind these contemporaneous events was the falling price of LAN equipment. Microsoft frequently has multiple initiatives going at any one time that compete with each other, simply because they are a huge company with a hugely diverse customer base and thus diverse draws for their attention. Microsoft was seeing pressure from those interested in vendor-neutral networking, and responded with Winsock. Meantime, their main focus remained on the battle with Novell. Microsoft didn’t release a product including Winsock for years past the initial spec, and even then it only appeared in their business-focused editions of Windows. The early years of Winsock were largely a way for Microsoft to help developers using third-party network products interoperate with each other. If you wanted to use Microsoft’s own technology, though, you still had to use their proprietary APIs.
The high cost of replacing infrastructure and the high reliance of networking on infrastructure mean adoption of new networking technology takes longer than just about anything else in the PC industry. Companies offering clearly better technology often go out of business before they can sell enough product to pay off their R&D costs. Sure, network speeds increase right in line with Moore’s Law, but replacing anything down at the infrastructure level, including APIs, takes years.
In this time when Microsoft and Novell were duking it out over proprietary LAN technologies and the third-party Winsock stack market was prospering, TCP/IP was this weird thing that was mostly used by minicomputers and expensive workstations running Unix. It was the technological glue that held together this equally strange thing called the Internet that few businesses and almost no individual users had used yet, if they’d heard of it at all. The rise of TCP/IP and the Internet was so peripheral to the main PC industry that Microsoft almost completely ignored it until the second half of the 1990s.
Microsoft’s first OS release including a Winsock-based stack was Windows NT 3.1, which came out in mid-1993, a year after the first Winsock spec. This first version of Windows NT wasn’t received well at all. Even in the best possible scenario, NT could never have commanded enough market share to worry Novell and the third-party Winsock stack vendors, being too resource-hungry to run well on consumer-class hardware and too expensive to run on anything but big servers besides. As Microsoft themselves have proven repeatedly, he who owns the consumer market will always outcompete anyone who shuns it.
Microsoft didn’t make a second grab at this market until 1994, when they released a free Winsock-based TCP/IP stack as an add-on to Windows for Workgroups, which was only ever called by its codename, Wolverine 3. It didn’t work with regular Windows 3.1 at all, and it was something you had to go and download from Microsoft, so again it didn’t exactly catapult Microsoft into the lead. The old LAN Manager technology was gaining ground against the likes of Novell and IBM by this time thanks to the rise of Windows NT and the death of DOS, but Winsock and TCP/IP were still a specialty thing, something most people relied on a third party to provide.
And so it was that those third party Winsock vendors, all vanished or transmogrified beyond recognition today, prospered for several years after Winsock was created. Programs written and tested on one vendor’s Winsock stack generally worked fine when run on a different computer using a different stack. There was the occasional interop problem, of course, but in general the vision of a vendor- and protocol-neutral network programming API worked out exactly as intended.
Doom finally came to this market three years after Winsock was created, with the release of Windows 95. This was the first consumer OS from Microsoft with networking included. Not only did it include Microsoft’s own proprietary NetBIOS based networking scheme, but also TCP/IP, the language of the Internet, which was taking off with the commercialization of the Internet at the same time. The inertia produced by the highly successful Windows 3.1 still kept the third-party stack vendor market alive for years afterward, but by the time of the last Winsock specification two years after the release of Windows 95, the new shape of the world was clear.
In 1996, Microsoft drove another big nail into the third party Winsock stack market’s coffin lid, releasing Windows NT 4. It included Winsock 2, substantially the same API as in current versions of Windows. Later tweaks to the spec, taking it up to version 2.2.2, came in the form of service packs to NT. When Microsoft released the last Winsock spec in 1997, it wasn’t because the stacks in Windows stopped changing, but because there was no longer a need for a separate spec, there being no third-party market left to speak of. It took many years more to kill off the zombie armies of Token Ring and IPX machines — again, infrastructure changes take time — but by 1997 it was fair to say that the Winsock specification was “whatever Windows stacks do today.”
If another nail was needed, 1997 also saw a version of Internet Explorer 4 for 16-bit Windows that included a full Winsock-based network stack. You can still download it here, if you’re interested.
It’s fixed in computing industry lore now that “Microsoft was late to the Internet.” Bill Gates as much as admitted this in the scramble to update his ill-timed book The Road Ahead. (The linked article has a good summary of the events I refer to, [citations needed] notwithstanding.)
The Internet took off in 1994. Many of the most significant Internet organizations were founded that year: Amazon.com, Yahoo!, Netscape, the W3C... This was the year it became clear to everyone that there was money to be made here. Doubtless even Microsoft saw this, internally, but Microsoft is a big ship to turn.
Microsoft’s Internet access story in 1994 was all but nonexistent. There was no DUN feature in Windows; you had to buy third-party software to get that. Internet Explorer didn’t exist yet; it would take until late summer of 1995 for that to appear, as part of Windows 95, and it didn’t become a solid browser until a year later, with Internet Explorer 3. You couldn’t even plug a stock Windows 3.1 box into a LAN and get access to the Internet through a LAN gateway to the Internet; you had to buy a separate stack or upgrade to Windows for Workgroups so you could use Microsoft’s experimental Wolverine stack.
A great many of us on the Internet in those early days therefore didn’t use Winsock or TCP/IP. The hoi polloi used proprietary systems like AOL, and the geeks used Unix boxes or terminal software to get dial-up access to one. This was a year of transition, the bend in the hockey stick curve of Internet adoption. So, all through this year, the Microsoft vs. Novell story still mattered, but it was all about to be obliterated.
It’s foolish to nail down a single cause to an event, but it seems to be a thing historians must do, so here’s where I draw my arbitrary line in the sand: Novell lost the battle for the LAN not to Microsoft, but to the Internet. Both companies had products that defined the way people got access to networked resources, but like anything that depends on central control, it was doomed to fail, unable to scale.
Both companies’ proprietary protocols were destroyed by a third, TCP/IP. It was controlled by no one, and thus was no threat they could identify. Novell’s unrouteable IPX/SPX is inconsequential in today’s world where virtually every computer is part of a WAN, the Internet if nothing else. The original protocol underlying Microsoft’s LAN Manager technology, NetBEUI, was similarly unrouteable and thus also unable to move into the WAN world, though they did manage to patch over the problem by moving to NetBIOS over TCP/IP. 4
Although the Microsoft file and print sharing technologies live on in a way that the Novell ones do not, they’re hopelessly commoditized, available in every NAS box and all their competitors’ operating systems, such that no one makes a serious buck off control of these any more. When you go to share a file, it almost certainly goes over TCP/IP, and the chance that the high-level transfer protocol is SMB and not some Internet protocol — FTP, SMTP, HTTP, BitTorrent, chat — are increasingly small. To the extent that we still print documents any more in the age of widespread Internet publishing and cloud word processors, it, too, almost always uses TCP/IP and frequently uses Internet protocols like IPP.
Old technologies rarely go away entirely: many still listen to AM radio and write with fountain pens. Yet, it’s clear that Microsoft lost the battle for the LAN through control of file and print services. The battle moved away from the LAN to the WANnest WAN of them all, the Internet, where overall control is virtually impossible.
And yet, did they really lose? When Windows 95 came out with TCP/IP and Winsock built in, it immediately killed off the third-party stack market. They saved themselves again against another rising competitor, Netscape, a few years later. Today, the new challenger is cloud computing: the buzz is all about Google, Amazon S3, Linux, and social networking. Microsoft continues to join the party late, but they still hold commanding shares in many network related areas. This drags us back on-point, finally: Winsock is still relevant.
These early third-party Winsock stacks collectively supported all the popular network protocols of the time, not just TCP/IP, the most popular use of Winsock today.
The way this worked is that each vendor wrote a DLL — simply called winsock.dll on 16-bit Windows — which knew how to translate standard Winsock calls into the form needed by the associated network stack. The DLL was just one of the things you got when you installed the stack 5. The stack vendor also gave you an import library and a winsock.h header to compile against 6. Once built, you could install your program on any computer providing a Winsock DLL and expect it to at least run. Naturally, it would only work if the computer’s stack supported the right protocols. You couldn’t expect to write a Winsock program speaking IPX, built and tested against Novell’s Winsock, to run on, say, NetManage’s TCP/IP Winsock stack. But, your program would run, giving error messages indicating the problem, not crash or fail to even start.
Another hurdle of the time is that there were subtle differences in Windows NT 3.x’s 32-bit version of Winsock relative to the more popular third-party implementations for 16-bit Windows. Because of these differences, the 32-bit DLL was called wsock32.dll, which you can still find on the most recent release of Windows, providing the good old Winsock 1.1 API. You can still write new programs against this API, and they’ll work fine if all you need is Winsock 1.1 support. Windows NT also included a winsock.dll for compatibility with 16-bit programs, of course. I still see a winsock.dll on an XP system I have here.
Conversely, there was an add-on from Microsoft you could download for 16-bit Windows called Win32s, a 32-bit emulation layer, which included a wsock32.dll. I used it for a while back when Windows 3.1 was still common. As long as you stuck to a rather small subset of the Win32 API, you could write, build and test your program on Windows NT 3.51 — so much less stressful to use than Windows 95! — and deploy the unmodified binaries on Windows 3.1. Ah, memories.
Although Windows 95, 98, and ME support a large portion of the full Win32 API defined for Windows NT and its derivatives — Windows 2000, Windows XP, Windows Server 200x, Windows Vista, and Windows 7 — these operating systems are closer in technical capability to Windows 3.1 than to NT. They share its reliance on DOS, and in other ways are not fully top-to-bottom integrated 32-bit operating systems like Windows NT and its descendants. They also have a lot of arbitrary limitations lower than the corresponding limit in NT.
In this section, I will discuss those limitations that affect Winsock programs. You can readily find information elsewhere about broader platform issues.
Frequently below it’s helpful to group several releases of Windows based on some common capability. “DOS-based” obviously refers to the old 16-bit versions of Windows, but also to Windows 95 and its direct descendants. Where we wish to talk only about the 32-bit DOS-based versions of Windows — Windows 95, 98 and ME — we’ll refer to them as “Chicago kernel” operating systems, after the code name for Windows 95. Finally, in a few places we cover some of the earlier NT derivatives, either specifically or as a dividing point in history.
The most serious disadvantage is that the Chicago kernels don’t support overlapped I/O or I/O Completion Ports. These mechanisms provide lower overhead I/O in NT derivatives, as compared to more traditional forms of I/O.
In the first versions of NT, overlapped I/O only worked with some forms of I/O, like disk storage. Windows NT 4 extended the Winsock API to allow use of overlapped I/O with networking, too.
Windows 98 came out not long after NT 4, so it also had Winsock 2, but since there is no underlying kernel support for overlapped I/O, it’s emulated in user space so there is no speed advantage over traditional I/O strategies. Windows ME shares this limitation.
In Windows 2000, Microsoft enhanced the Win32 API
so it would accept Winsock error codes, giving decent canned error messages. In earlier
NT derivatives and all DOS-based versions of Windows, programmers had to
build error code to message mapping mechanisms themselves, such as with
a STRINGTABLE in the resource file.
Winsock on Chicago systems has a bug where
select() can fail to block on a nonblocking socket. It will
signal one of the sockets, which will cause your program to call
send() or similar. That function will return
WSAEWOULDBLOCK, which can be quite a surprise. So,
a program using
select() on Chicago systems has to be able to
deal with this error at any time.
On modern Windows, if you have a network interface, the default network configuration also adds a loopback interface so that two programs can talk to each other within the machine. This means you can test network programs without a second machine to bounce packets off of. Even if you don’t have a network interface in the machine, you can add a dummy loopback device easily.
Getting to the same point was often surprisingly difficult on DOS-based Windows, since the more primitive stacks you tended to see on them were often so very vertically integrated that they made unwarranted assumptions about the way the network should work, making uncommon usages like loopback testing difficult.
An example: One of the easiest ways to get a loopback interface
for testing on Chicago systems was to install Dial Up Networking
(DUN) and point it at an unused serial port. The main problem you
then encountered is that DUN would blindly decide to dial the modem
any time you do a name lookup call like
when it should have been obvious to the stack that this couldn’t
succeed. The Chigago DUN system was so stupid about this that on a
system with both a modem and a LAN card, and DNS configured on the
LAN interface, it would still try to dial the modem for a DNS lookup
that the local DNS server knew how to answer. The best workaround was
simply to turn off DUN’s automatic dial feature, and to be sure
only to use IP addresses in testing, not domain names.
One of the features of the TCP protocol is the ability to periodically send an empty packet to the other side. The remote program won’t see anything, but it tells the remote peer’s stack that we’re still here. These packets are called “keepalives” since they’re used to oppose code in some network stacks that drops a network link if nothing happens on it for some time. This is common with modem PPP links, for instance. A side effect of keepalives is that it can be used to detect when the link has gone down with network media that doesn’t have a way to signal the remote peer’s disconnection, like Ethernet.
Unfortunately, in all Microsoft stacks for DOS-based versions of Windows and in Windows NT through version 4, you couldn’t set this timeout period on a per-process basis. You could set a registry entry to change it for the entire system, but the varying ways keepalives are used mean that earlier versions of this FAQ simply recommended against using them. With Windows 2000 or better finally being something most programs can count on, the FAQ has flipped a 180.
Microsoft added generic SSL/TLS support to Windows NT, but on the DOS-based OSes, you had to fall back to one of a few less palatable options. If all you needed was HTTP, you could piggyback on the Internet Explorer engine, which limited your program’s range of capability to that of IE itself. (You probably wouldn’t want to write a custom Web spider this way, for instance.) If you needed some other protocol or needed full control of the HTTP conversation, you had to either write your own SSL/TLS support, or get it from some third-party library.
WSAIoctl() function in Winsock 2 adds a lot of
useful functionality, but it wasn’t completely implemented in
Winsock 2 for non-modern versions
Take the SIO_GET_INTERFACE_LIST sub-function, for example. It completely fails on Windows 95, even after installing the Winsock 2 add-ons. On Windows 98, it returns only partial information. And on Windows NT 4.0 with SP3 and earlier, there are known bugs. (KB181520 and KB170642) In all cases, the fix is to upgrade to a modern version of Windows.
In DOS-based versions of Windows, these files were tossed in the garbage stew that is C:\WINDOWS. In NT and its descendants, they were moved to the Unix-inspired %WINSYSDIR%\drivers\etc directory.
Win16 message queues are fixed-length and fairly short, so it is at
least possible to lose
WSAAsyncSelect() notifications in 16-bit
programs. If Winsock fails to send you a notification because the
message queue is full, it is supposed to keep trying, but empirical
evidence suggests that this does not always happen. It’s hard to
be specific here due to the nature of “16-bit Winsock,”
since this is back when we had all the confusion that competition
brings. We’re talking about stacks from a dozen different
vendors, each with many versions, spanning many years.
The DOS-based versions of Windows are completely unsuitable for use as servers, for a number of reasons:
They share the 5-slot backlog limit of the workstation-class Windows NT derivatives.
The performance of their stacks are objectively inferior to those in the NT derivatives. Simple tests to show this are timing the connection accept time and throughput of a single connection. It gets worse as the number of concurrent connections goes up.
Their kernels are much less stable.
Their kernels lack overlapped I/O support. (It’s emulated out in user space.)
I/O completion ports are completely missing.
The networking subsystem doesn’t handle multiple network cards very well.
The DOS-based versions of Windows with Winsock 2 support only allow you to send ICMP and IGMP over raw sockets. This lets programs send “ping” packets in a standard way, and little else.
Some of the third-party Winsock 1.1 stacks also supported raw sockets, but since the Winsock 1.1 spec didn’t standardize raw sockets, everyone did it differently.
On Windows 95 derivatives, a process can only have 100 connections, a limit enforced by the kernel. You can increase this limit by editing the registry key HKLM\System\CurrentControlSet\Services\VxD\MSTCP\MaxConnections. On Windows 95, the key is a DWORD; on Windows 98/ME, it’s a STRING. I’ve seen some reports of instability when this value is increased to more than a few times its default value.
Modern Windows systems let a process share socket and other file handles with other processes. This mechanism is broken in Chicago kernel OSes. See Microsoft knowledge base articles KB150523 and KB156319 for details.
There’s a bug in Chicago-kernel versions of Windows where
select() doesn’t behave as you might expect. The details
are in Microsoft knowledge base article KB177346. Though it
really is a bug, this is one of those areas where defensive programming
is a better answer than upgrading the OS.
Remember, Microsoft didn’t have a server operating system at the time. You can’t count Xenix, because Microsoft didn’t sell it directly, and Microsoft Xenix didn’t have networking in it anyway. SCO added TCP/IP support to their variant of Xenix shortly before Microsoft gave control of Xenix over to SCO. ↑
Only pedants still use the original camel-case spelling. ↑
Even today, with fast WANs and improved protocols, you often still find Windows file and print services blocked at the LAN gateway, or at least greatly restricted. ↑
This lead to a common newbie mistake in the old days of Winsock, where someone would try to copy a winsock.dll file from one machine to another, or include it with their program’s installer. Their program would fail to even start because the stack-specific DLL couldn’t find the underlying stack it needed to function. These days it’s part of the operating system, so we don’t have to worry about distribution issues like that any more. ↑
The popular C and C++ development tools didn’t include Winsock libraries and headers for years after Winsock was created, and of course there was more time needed before those newer tools saw widespread adoption. So, in the early years of Winsock development, each developer had their own lash-up to build Winsock programs. ↑
This space intentionally left blank. :)
<< Winsock for Non-Windows Systems
||Book and Software Reviews >>|
This article is copyright © 2009-2014 by Warren Young, all rights reserved.
|Updated Mon Sep 22 2014 20:57 MDT||Go to my home page|