I'm running Windows 10 Professional with a Realtek PCIe GBE Family Controller integrated on my MSI 170A-Pro mainboard. The network normally works fine and without interruptions on both Linux and Windows, so the hardware seems to be okay.
However, I experience a loss of connectivity about once a day using Windows 10. The symptoms are a bit weird though:
- I cannot connect to any website in either Chrome or Internet Explorer (Chrome says ERR_CONNECTION_FAILED)
- except that Google usually works (probably because Chrome keeps a connection to it open)
- my Google Talk connection continues to work (it seems only to affect new connections)
- nslookup works fine for any domain
- I can ping the sites I want to browse
- I have a valid IPv4 and IPv6 address
- I can ping the default gateway on IPv4 and IPv6
- Windows Network Diagnostics cannot find any problems
- Windows says I'm successfully connected to the Internet
- other devices on the network continue to have no problems (it's not router related)
The only fix I have found is to either reset the network through the Control Panel option and reboot, or run netsh winsock reset in an admin console and reboot. Rebooting alone does not solve the problem.
So far I have
- disabled power management for the network card
- upgraded to the most recent driver from Realtek
I'm completely at a loss as to what exactly is wrong, because the network obviously works. Only a certain part of it seems to be failing.
If anyone has any idea how to debug this further, I'm all ears!
Please note this is a wired network.
ipconfig output follows (it looks exactly the same when the connection works):
Windows IP Configuration

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix . : w00t
   IPv6 Address. . . . . . . . . . . : 2a02:2450:1024:442:808:aa56:5c13:9413
   Temporary IPv6 Address. . . . . . : 2a02:2450:1024:442:a5e3:4f74:fb29:5d13
   Link-local IPv6 Address . . . . . : fe80::808:aa56:5c13:9413%3
   IPv4 Address. . . . . . . . . . . : 192.168.1.165
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : fe80::1e87:2cff:fe6a:b6b0%3
                                       192.168.1.1

Tunnel adapter isatap.w00t:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix . : w00t

Tunnel adapter Teredo Tunneling Pseudo-Interface:

   Connection-specific DNS Suffix . :
   IPv6 Address. . . . . . . . . . . : 2001:0:9d38:6ab8:2cd7:1e65:3f57:fe5a
   Link-local IPv6 Address . . . . . : fe80::2cd7:1e65:3f57:fe5a%12
   Default Gateway . . . . . . . . . :
Some additional information:
- when rebooting to Linux, the network connection works flawlessly there; after rebooting back to Windows the problem is still there
- completely powering off the machine does not solve the problem
- HTTP sites on my local network are not reachable either
- the problem is DNS independent, sites are not reachable via IP address either
- SMB connections to a Windows share do not work either
To me it looks like the TCP stack of the operating system somehow "gets stuck". Ping (ICMP) and DNS (UDP) work, HTTP and SMB (TCP) don't.
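One way to confirm that new TCP connections fail regardless of the application is a small probe script. The following is only a minimal sketch, assuming Python 3 is installed on the affected machine; the function name probe_tcp is made up for illustration and the targets are whatever hosts fail in the browser.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt one outgoing TCP connection and describe the outcome."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # On Windows the Winsock code is in exc.winerror; 10055 is WSAENOBUFS,
        # which tools such as PuTTY report as "No buffer space available".
        code = getattr(exc, "winerror", None) or exc.errno
        return f"failed: {code} {exc.strerror}"
```

Calling probe_tcp("192.168.1.1", 80) while the problem is present, and again after a reset, shows whether the failure happens at connection setup and which error code the operating system returns.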
This led me to try one more thing: using PuTTY to SSH (TCP) to another machine fails with the error: Network error: No buffer space available
That error pointed me to https://serverfault.com/questions/131935/network-error-no-buffer-space-available, which in turn led me to check the Event Viewer, which shows Error 4227:
TCP/IP failed to establish an outgoing connection because the selected local endpoint was recently used to connect to the same remote endpoint. This error typically occurs when outgoing connections are opened and closed at a high rate, causing all available local ports to be used and forcing TCP/IP to reuse a local port for an outgoing connection. To minimize the risk of data corruption, the TCP/IP standard requires a minimum time period to elapse between successive connections from a given local endpoint to a given remote endpoint.
When disabling and re-enabling the device (which the knowledge base entry suggests), the error simply recurs in the log.
It seems like some program is exhausting the available outgoing TCP ports. So the questions become:
- How can I figure out which program is the culprit?
- Why wouldn't a reboot solve this problem?
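Whether the dynamic port range is really used up can be checked by comparing the configured range with the local ports currently in use. The sketch below is an assumption-laden illustration: it expects the English output of netsh int ipv4 show dynamicport tcp and of netstat -ano pasted in as text, the function names are hypothetical, and it ignores that IPv6 has its own (by default identical) range setting.

```python
import re


def parse_dynamic_range(netsh_text):
    """Return (start_port, number_of_ports) from the netsh dynamicport output."""
    start = re.search(r"Start Port\s*:\s*(\d+)", netsh_text)
    count = re.search(r"Number of Ports\s*:\s*(\d+)", netsh_text)
    if not (start and count):
        raise ValueError("unexpected netsh output (localized Windows?)")
    return int(start.group(1)), int(count.group(1))


def used_dynamic_ports(netstat_text, start, count):
    """Count distinct local TCP ports that fall inside the dynamic range."""
    ports = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows look like: TCP <local> <remote> <state> <pid>
        if len(fields) >= 4 and fields[0] == "TCP":
            # rsplit handles both 1.2.3.4:port and [v6addr]:port.
            port = int(fields[1].rsplit(":", 1)[1])
            if start <= port < start + count:
                ports.add(port)
    return len(ports)
```

If the count returned by used_dynamic_ports comes close to the number of ports in the range, that would be consistent with the exhaustion described in the event text.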
Answer
Event ID 4227 seems related to too many outstanding TCP connections using up the available local ports.
This is not a full answer, but the following first steps are too long for a comment:
- Run sfc /scannow
- In Device Manager, uninstall the network adapter and reboot
- Use TCPView to see outgoing connections when this happens
- Increase the maximum number of outgoing connections by setting TcpNumConnections, and see also the other parameters described in this article
- Disable IPv6
- Restart Chrome
- Start Windows in Safe Mode with Networking; if the problem stops happening, then some installed application is to blame
- Use Chrome in incognito mode to temporarily disable extensions
- Try Firefox
- Do you have many tabs open? Or when this is happening do you always have one particular website open?
The results of the above may help with localizing the problem.
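As a scriptable alternative to TCPView, the output of netstat -ano can be grouped by process ID to see which process holds the most TCP connections. This is a minimal sketch under the assumption that Python 3 is available; connections_per_pid is a hypothetical helper name, and the text can come from a saved netstat dump or from subprocess.

```python
from collections import Counter


def connections_per_pid(netstat_text):
    """Count TCP rows per owning PID in `netstat -ano` output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows have five columns: proto, local, remote, state, PID.
        # UDP rows have no state column and are skipped here.
        if len(fields) == 5 and fields[0] == "TCP" and fields[4].isdigit():
            counts[int(fields[4])] += 1
    return counts
```

For example, connections_per_pid(text).most_common(5) lists the five busiest PIDs, which can then be matched to process names in Task Manager or with tasklist.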