Handling Network Connection Failures

One of the fallacies of distributed computing is that the network is 100% reliable. Sooner or later you will run into a network connectivity issue, and your code must handle it for your software to be robust. To simulate a failure, turn off the WiFi or disconnect the Ethernet cable. With no network, the host name cannot be resolved and the request fails with this error:

SocketError: Failed to open TCP connection to www.google.com:80 (getaddrinfo: nodename nor servname provided, or not known)

We can handle this case in our code by rescuing SocketError. If the host name resolves (assuming you are connecting to the right server and the correct port) but nothing is accepting connections on that port, you will get:

Errno::ECONNREFUSED

which usually means the server is down. Finally, a timeout error can occur when the server is reachable but too slow to respond, for example because it is overloaded with too many requests. Here is a simple test case:

require 'net/http'

begin
  http = Net::HTTP.new('www.google.com', 80)
  http.open_timeout = 3 # seconds to wait for the connection to open
  http.read_timeout = 3 # seconds to wait for each read from the server
  http.get('/')
  puts 'Success'
rescue SocketError
  puts 'Network connectivity issue'
  # Network is not 100% reliable
rescue Errno::ECONNREFUSED => e
  puts 'The server is down.'
  puts e.message
  # Retry a few times and fail with connection refused error message
rescue Timeout::Error => e
  puts 'Timeout error occurred.'
  puts e.message
  # Retry a few times and fail with timeout error message
end

These are infrastructure issues. It's a good idea to keep infrastructure exceptions separate from business-logic exceptions, because that makes it easier to isolate problems when troubleshooting. You can retry a few times before exiting the program gracefully. To experiment, you can also run a Rails app locally and change the host to localhost and the port to 3000 in the example above.
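
The retry idea can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the constant INFRASTRUCTURE_ERRORS, the helper with_retries, and its max_attempts and delay parameters are hypothetical names introduced here. Grouping the three exception classes in one list is one possible way to keep infrastructure failures apart from business-logic exceptions.

```ruby
require 'socket'
require 'timeout'
require 'net/http'

# Hypothetical grouping (not from the original example): the three
# infrastructure exceptions rescued in the test case above.
INFRASTRUCTURE_ERRORS = [SocketError, Errno::ECONNREFUSED, Timeout::Error].freeze

# Hypothetical helper: runs the block and retries it when an infrastructure
# error is raised. After max_attempts failed attempts the last error is
# re-raised so the caller can exit gracefully. Any other exception
# (for example a business-logic error) propagates immediately.
def with_retries(max_attempts: 3, delay: 1)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue *INFRASTRUCTURE_ERRORS => e
    warn "Attempt #{attempts} failed: #{e.class}"
    raise if attempts >= max_attempts

    sleep(delay) # fixed pause between attempts
    retry
  end
end
```

With this in place, the request from the test case could be wrapped as with_retries { http.get('/') }, with a final rescue of the same error classes around the call to print a message and exit. A fixed delay keeps the sketch short; a real application might prefer a growing delay between attempts.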

