504 Gateway Timeout

You're viewing Apigee Edge documentation.

Symptom

The client application receives an HTTP status code of 504 with the message Gateway Timeout as a response to API calls.

The HTTP status code 504 Gateway Timeout indicates that the client did not receive a timely response from the Edge Gateway or backend server during the execution of an API call.

Error messages

Client application gets the following response code:

HTTP/1.1 504 Gateway Timeout

In some cases, the following error message may also be observed:

{
   "fault": {
      "faultstring": "Gateway Timeout",
      "detail": {
         "errorcode": "messaging.adaptors.http.flow.GatewayTimeout"
      }
   }
}

What causes gateway timeouts?

The typical path for an API request through the Edge platform is Client -> Router -> Message Processor -> Backend Server, as shown in the figure below:

The client application, Routers, and Message Processors within the Edge platform are set up with suitable timeout values. Based on these values, the Edge platform expects a response to be sent within a certain period of time for every API request. If the response is not received within the specified period, then a 504 Gateway Timeout error is returned.

The following table provides more details about when timeouts may occur in Edge:

Timeout occurrence Details
Timeout occurs on Message Processor
  • The backend server does not respond to the Message Processor within the specified timeout period on the Message Processor.
  • The Message Processor times out and sends the response status 504 Gateway Timeout to the Router.
Timeout occurs on Router
  • The Message Processor does not respond to the Router within the specified timeout period on the Router.
  • The Router times out and sends the response status 504 Gateway Timeout to the client application.
Timeout occurs on client application
  • The Router does not respond to the client application within the specified timeout period on the client application.
  • The client application times out and sends the response status 504 Gateway Timeout to the end user.

Possible causes

In Edge, the typical causes of the 504 Gateway Timeout error are:

Cause | Details | Steps given for
Slow backend server | The backend server that is processing the API request is too slow due to high load or poor performance. | Public and Private Cloud users
Slow API request processing by Edge | Edge takes a long time to process the API request due to high load or poor performance. |

Slow backend server

If the backend server is very slow or takes a long time to process the API request, then you will get a 504 Gateway Timeout error. As explained in the section above, the timeout can occur under one of the following scenarios:

  1. Message Processor times out before backend server responds.
  2. Router times out before Message Processor/backend server responds.
  3. Client application times out before Router/Message Processor/backend server responds.

The following sections describe how to diagnose and resolve the issue under each of these scenarios.

Scenario #1 - Message Processor times out before backend server responds

Diagnosis

You can use the following procedures to diagnose whether the 504 Gateway Timeout error has occurred because of a slow backend server.

Procedure #1 Using Trace

If the issue is still active (504 errors are still happening), then follow the steps below:

  1. Trace the affected API in the Edge UI. Either wait for the error to occur or, if you have the API call, make some API calls to reproduce the 504 Gateway Timeout error.
  2. Once the error has occurred, examine the specific request that shows the response code 504.
  3. Check the elapsed time at each phase and make a note of the phase where the most time is spent.
  4. If you observe the error with the longest elapsed time immediately after one of the following phases, then it indicates that the backend server is slow or is taking a long time to process the request:
    • Request sent to target server
    • ServiceCallout policy
    • ServiceCallout policy

The following is a sample Trace showing that the backend server did not respond even after 55 seconds, resulting in a 504 Gateway Timeout error:

In the above trace, the Message Processor times out after 55002 ms because the backend server does not respond.

Procedure #2 Using Message Processor logs

  1. Check the Message Processor's log (/opt/apigee/var/log/edge-message-processor/logs/system.log).
  2. If you find Gateway Timeout and onTimeoutRead errors for the specific API proxy request at the specific time, then it indicates that the Message Processor has timed out.

    Sample Message Processor log showing Gateway Timeout Error

    2015-09-29 20:16:54,340 org:myorg env:staging api:profiles rev:13 NIOThread@1 ERROR ADAPTORS.HTTP.FLOW - AbstractResponseListener.onException() : AbstractResponseListener.onError(HTTPResponse@4d898cf1, Gateway Timeout)
    2015-09-29 20:16:57,361 org:myorg env:staging api:profileNewsletters rev:8 NIOThread@0 ERROR HTTP.CLIENT - HTTPClient$Context$3.onTimeout() : SSLClientChannel[C:XX.XX.XX.XX:443 Remote host:192.168.38.54:38302]@120171 useCount=2 bytesRead=0 bytesWritten=824 age=55458ms lastIO=55000ms .onTimeoutRead

    In the above Message Processor log, you can see that the backend server denoted by the IP address XX.XX.XX.XX did not respond even after 55 seconds (lastIO=55000ms). As a result, the Message Processor timed out and sent a 504 Gateway Timeout error.
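The log check in step 2 can be scripted. The following is a minimal sketch, assuming the log path given above and that the two markers (Gateway Timeout and onTimeoutRead) appear on the same line as the api: field, as in the sample. The function name find_mp_timeouts is hypothetical and not part of any Apigee tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list Message Processor log lines that indicate a timeout.
# Usage: find_mp_timeouts [LOG_FILE] [API_PROXY_NAME]
find_mp_timeouts() {
  local log_file="${1:-/opt/apigee/var/log/edge-message-processor/logs/system.log}"
  local api="${2:-}"
  if [ -n "$api" ]; then
    # Keep only entries for one API proxy; the trailing space avoids
    # matching proxies whose names merely start with the same text.
    grep -E 'Gateway Timeout|onTimeoutRead' "$log_file" | grep -F "api:${api} "
  else
    grep -E 'Gateway Timeout|onTimeoutRead' "$log_file"
  fi
}
```

Calling find_mp_timeouts with no arguments scans the default log for all proxies. Passing a log file and a proxy name, for example find_mp_timeouts system.log profiles, narrows the output to that proxy.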

    Check this: How is timeout controlled on Message Processor?

    • Message Processors are usually set with a default timeout value of 55 seconds via the property HTTPTransport.io.timeout.millis. This timeout value applies to all the API Proxies that belong to an organization served by this Message Processor.
      • If the backend server does not respond within 55 seconds, then the Message Processor times out and sends a 504 Gateway Timeout error to the client.
    • The timeout value specified on the Message Processor can be overridden by the property io.timeout.millis specified within the API Proxy. This timeout value applies only to the specific API Proxy in which the property is specified. For example, if io.timeout.millis is set to 10 seconds within the API Proxy, then a timeout value of 10 seconds is used for this specific API Proxy.
      • If the backend server does not respond within 10 seconds for the specific API Proxy, then the Message Processor times out and sends a 504 Gateway Timeout error to the client.
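The override rule described above can be expressed as a short sketch. It only models the rule as stated here (a proxy-level io.timeout.millis, when present, replaces the Message Processor value); the function name effective_timeout_ms is hypothetical, and it does not read any real Edge configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the timeout (in ms) that applies to one API proxy.
# $1 = Message Processor value of HTTPTransport.io.timeout.millis (default 55000)
# $2 = io.timeout.millis set within the API proxy, empty if not set
effective_timeout_ms() {
  local mp_ms="${1:-55000}"
  local proxy_ms="${2:-}"
  if [ -n "$proxy_ms" ]; then
    echo "$proxy_ms"   # the proxy-level property overrides the Message Processor value
  else
    echo "$mp_ms"      # no override: the Message Processor value applies
  fi
}

effective_timeout_ms 55000 10000   # prints 10000
effective_timeout_ms 55000         # prints 55000
```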

Resolution

  1. Check why the backend server is taking more than 55 seconds and see if it can be fixed/optimized to respond faster.
  2. If it is not possible to fix/optimize the backend server, or it is known that the backend server takes longer than the configured timeout, then increase the timeout value on the Router and Message Processor to a suitable value.

Scenario #2 - Router times out before Message Processor/backend server responds

You might get 504 Gateway Timeout errors if the Router times out before the Message Processor/backend server responds. This can happen under one of the following circumstances:

  • The timeout value set on the Router is shorter than the timeout value set on the Message Processor. For example, let's say the timeout on the Router is 50 seconds, while the timeout on the Message Processor is 55 seconds.
    Timeout on Router | Timeout on Message Processor
    50 seconds | 55 seconds
  • The timeout value on the Message Processor is overridden with a higher timeout value using the io.timeout.millis property set within the target endpoint configuration of the API Proxy:

    For example, if the following timeout values are set:

    Timeout on Router | Timeout on Message Processor | Timeout within API Proxy
    57 seconds | 55 seconds | 120 seconds

    and io.timeout.millis is set to 120 seconds in the API Proxy as follows:

    <HTTPTargetConnection>
      <Properties>
        <Property name="io.timeout.millis">120000</Property>
      </Properties>
      <URL>http://www.apigee.com</URL>
    </HTTPTargetConnection>

    Then the Message Processor will not time out after 55 seconds, even though its timeout value (55 seconds) is less than the timeout value on the Router (57 seconds). This is because the timeout value of 55 seconds on the Message Processor is overridden by the value of 120 seconds that is set within the API Proxy. So the timeout value of the Message Processor for this specific API Proxy will be 120 seconds.

    Since the Router has a lower timeout value (57 seconds) compared to the 120 seconds set within the API Proxy, the Router will time out if the backend server does not respond within 57 seconds.
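To reason about which component gives up first for a given set of values, the comparison can be sketched as below. This is an illustration only: the function name first_to_time_out is hypothetical, all values are in seconds, ties are resolved in favor of the component closer to the client (an assumption), and the client timeout of 300 in the examples is an arbitrary value chosen so that the client does not interfere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which timeout on the request path expires first.
# $1 = client timeout, $2 = Router timeout, $3 = Message Processor timeout,
# $4 = optional io.timeout.millis override from the API proxy (given in seconds here).
first_to_time_out() {
  local client="$1"
  local router="$2"
  local mp="$3"
  local proxy="${4:-}"
  # A proxy-level override replaces the Message Processor value for that proxy.
  local effective_mp="${proxy:-$mp}"
  if [ "$client" -le "$router" ] && [ "$client" -le "$effective_mp" ]; then
    echo "client"
  elif [ "$router" -le "$effective_mp" ]; then
    echo "router"
  else
    echo "message-processor"
  fi
}

first_to_time_out 300 50 55       # prints router (Router shorter than Message Processor)
first_to_time_out 300 57 55 120   # prints router (proxy override raises 55 to 120)
first_to_time_out 300 57 55       # prints message-processor
```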

Diagnosis

  1. Check the Nginx access log (/opt/apigee/var/log/edge-router/nginx/ORG~ENV.PORT#_access_log).
  2. If the Router times out before the Message Processor, then you will see a status of 504 in the Nginx access logs for the specific API request, and the message id from the Message Processor will be set as -. This is because the Router did not get any response from the Message Processor within the timeout period set on the Router.

    Sample Nginx Log Entry showing 504 due to Router timing out

  3. In the above example, notice that the status on Nginx is 504, the message id from the Message Processor is -, and the total time elapsed is 57.001 seconds. This is because the Router timed out after 57.001 seconds without getting any response from the Message Processor.
  4. In this case, you will see Broken Pipe exceptions in the Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log).
    2017-06-09 00:00:25,886 org:myorg env:test api:myapi-v1 rev:23 messageid:rrt-mp01-18869-23151-1  NIOThread@1 INFO  HTTP.SERVICE - ExceptionHandler.handleException() : Exception java.io.IOException: Broken pipe occurred while writing to channel ClientOutputChannel(ClientChannel[A:XX.XX.XX.XX:8998 Remote host:YY.YY.YY.YY:51400]@23751 useCount=1 bytesRead=0 bytesWritten=486 age=330465ms  lastIO=0ms )
    2017-06-09 00:00:25,887  org:myorg env:test api:myapi-v1 rev:23 messageid:rrt-mp01-18869-23151-1  NIOThread@1 INFO  HTTP.SERVICE - ExceptionHandler.handleException() : Exception trace:
    java.io.IOException: Broken pipe
            at com.apigee.nio.channels.ClientOutputChannel.writePending(ClientOutputChannel.java:51) ~[nio-1.0.0.jar:na]
            at com.apigee.nio.channels.OutputChannel.onWrite(OutputChannel.java:116) ~[nio-1.0.0.jar:na]
            at com.apigee.nio.channels.OutputChannel.write(OutputChannel.java:81) ~[nio-1.0.0.jar:na]
            … <snipped>

This error is displayed because once the Router times out, it closes the connection with the Message Processor. When the Message Processor completes its processing, it attempts to write the response to the Router. Since the connection to the Router is already closed, you get the Broken Pipe exception on the Message Processor.

This exception is expected under the circumstances explained above. So the actual cause of the 504 Gateway Timeout error is still the backend server taking too long to respond, and you need to address that issue.

Resolution

  1. If it's a custom backend server, then:
    1. Check why the backend server is taking a long time to respond and see if it can be fixed/optimized to respond faster.
    2. If it is not possible to fix/optimize the backend server, or it is known that the backend server takes a long time, then increase the timeout value on the Router and Message Processor.

      Idea: Set the timeout value on the different components in the following order:

      Timeout on Client > Timeout on Router > Timeout on Message Processor > Timeout within API Proxy

  2. If it's a NodeJS backend server, then:
    1. Check if the NodeJS code makes calls to any other backend servers and whether those calls are taking a long time to return a response. Check why those backend servers are taking a long time and fix the problem as appropriate.
    2. Check if the Message Processors are experiencing high CPU or memory usage:
      1. If any Message Processor is experiencing high CPU usage, then generate three thread dumps, one every 30 seconds, using the following command:
        JAVA_HOME/bin/jstack -l PID > FILENAME
      2. If any Message Processor is experiencing high memory usage, then generate a heap dump using the following command:
        sudo -u apigee JAVA_HOME/bin/jmap -dump:live,format=b,file=FILENAME PID
      3. Restart the Message Processor using the command below. This should bring down the CPU and memory usage:
        /opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
      4. Monitor the API calls to confirm whether the problem still exists.
      5. Contact Apigee Support and provide the thread dumps, heap dump, and Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log) to help investigate the cause of the high CPU/memory usage.
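Taking three thread dumps 30 seconds apart is easy to get wrong by hand, so it can be wrapped in a small loop. The following is a sketch under stated assumptions: the function name take_thread_dumps, the JSTACK override variable, and the output file naming are all hypothetical, and JAVA_HOME is assumed to point at the JDK used by the Message Processor. You may also need to run it as the user that owns the process.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take COUNT thread dumps of process PID, INTERVAL seconds apart.
# Usage: take_thread_dumps PID [COUNT] [INTERVAL_SECONDS] [OUTPUT_DIR]
take_thread_dumps() {
  local pid="$1"
  local count="${2:-3}"
  local interval="${3:-30}"
  local out_dir="${4:-.}"
  # JSTACK may be set to override the jstack binary; defaults to the one under JAVA_HOME.
  local jstack="${JSTACK:-$JAVA_HOME/bin/jstack}"
  local i
  for i in $(seq 1 "$count"); do
    "$jstack" -l "$pid" > "${out_dir}/threaddump_${pid}_${i}.txt"
    # Wait between dumps, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, take_thread_dumps PID writes three files named threaddump_PID_1.txt to threaddump_PID_3.txt in the current directory, which can then be attached to the support request.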

Check This: How is timeout controlled for NodeJS backend servers on Message Processor

  • The NodeJS backend server runs within the JVM process of the Message Processor. The timeout value for NodeJS backend servers is controlled via the property http.request.timeout.seconds in the nodejs.properties file. This property is set to 0 by default, that is, the timeout is disabled by default for all the API Proxies that belong to an organization served by this Message Processor. So even if a NodeJS backend server takes a long time, the Message Processor will not time out.
  • However, if the NodeJS backend server takes a long time and the time taken by the API request is > 57 seconds, then the Router will time out and send a 504 Gateway Timeout error to the client.

Scenario #3 - Client application times out before Router/Message Processor/backend server responds

You might get 504 Gateway Timeout errors if the client application times out before the backend server responds. This situation can happen if:

  1. The timeout value set on the client application is lower than the timeout value set on the Router and Message Processor:

    For example, if the following timeout values are set:

    Timeout on Client | Timeout on Router | Timeout on Message Processor
    50 seconds | 57 seconds | 55 seconds

    In this case, the total time available to get a response for an API request through Edge is <= 50 seconds. This includes the time taken to make the API request, the request being processed by Edge (Router, Message Processor), the request being sent to the backend server (if applicable), the backend processing the request and sending the response, and Edge processing the response and finally sending it back to the client.

    If the Router does not respond to the client within 50 seconds, then the client will time out and close the connection with the Router. The client will get the response code 504.

    This will cause Nginx to set a status code of 499, indicating that the client closed the connection.

Diagnosis

  1. If the client application times out before it gets a response from the Router, then it will close the connection with the Router. In this situation, you will see a status code of 499 in the Nginx access logs for the specific API request.

    Sample Nginx Log Entry showing status code 499

  2. In the above example, note that the status on Nginx is 499 and the total time elapsed is 50.001 seconds. This indicates that the client timed out after 50.001 seconds.
  3. In this case, you will see Broken Pipe exceptions in the Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log).
    2017-06-09 00:00:25,886 org:myorg env:test api:myapi-v1 rev:23 messageid:rrt-1-11193-11467656-1  NIOThread@1 INFO  HTTP.SERVICE - ExceptionHandler.handleException() : Exception java.io.IOException: Broken pipe occurred while writing to channel ClientOutputChannel(ClientChannel[A:XX.XX.XX.XX:8998 Remote host:YY.YY.YY.YY:51400]@23751 useCount=1 bytesRead=0 bytesWritten=486 age=330465ms  lastIO=0ms )
    2017-06-09 00:00:25,887  org:myorg env:test api:myapi-v1 rev:23 messageid:rrt-1-11193-11467656-1  NIOThread@1 INFO  HTTP.SERVICE - ExceptionHandler.handleException() : Exception trace:
    java.io.IOException: Broken pipe
            at com.apigee.nio.channels.ClientOutputChannel.writePending(ClientOutputChannel.java:51) ~[nio-1.0.0.jar:na]
            at com.apigee.nio.channels.OutputChannel.onWrite(OutputChannel.java:116) ~[nio-1.0.0.jar:na]
            at com.apigee.nio.channels.OutputChannel.write(OutputChannel.java:81) ~[nio-1.0.0.jar:na]
            … <snipped>
  4. After the Router times out, it closes the connection with the Message Processor. When the Message Processor completes its processing, it attempts to write the response to the Router. Since the connection to the Router is already closed, you get the Broken Pipe exception on the Message Processor.
  5. This exception is expected under the circumstances explained above. So the actual cause of the 504 Gateway Timeout error is still that the backend server takes a long time to respond, and you need to address that issue.

Resolution

  1. If it's your custom backend server, then:
    1. Check the backend server to determine why it is taking more than 57 seconds and see if it can be fixed/optimized to respond faster.
    2. If it is not possible to fix/optimize the backend server, or if you know that the backend server will take a long time, then increase the timeout value on the Router and Message Processor.

      Idea: Set the timeout value on the different components in the following order:

      Timeout on Client > Timeout on Router > Timeout on Message Processor > Timeout within API Proxy

  2. If it's a NodeJS backend, then:
    1. Check if the NodeJS code makes calls to any other backend servers and whether those calls are taking a long time to return. Check why those backend servers are taking a long time.
    2. Check if the Message Processors are experiencing high CPU or memory usage:
      1. If a Message Processor is experiencing high CPU usage, then generate three thread dumps, one every 30 seconds, using the following command:
        JAVA_HOME/bin/jstack -l PID > FILENAME
      2. If a Message Processor is experiencing high memory usage, then generate a heap dump using the following command:
        sudo -u apigee JAVA_HOME/bin/jmap -dump:live,format=b,file=FILENAME PID
      3. Restart the Message Processor using the command below. This should bring down the CPU and memory usage:
        /opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
      4. Monitor the API calls to confirm whether the problem still exists.
      5. Contact Apigee Support and provide the thread dumps, heap dump, and Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log) to help them investigate the cause of the high CPU/memory usage.

Increase the timeout value on Router and Message Processor

Choose the timeout values to be set on the Router and Message Processor carefully depending on your requirements. Don't set arbitrarily large timeout values. If you need help, contact Apigee Support.

Router

  1. Create the /opt/apigee/customer/application/router.properties file on the Router machine, if it does not already exist.
  2. Add the following line to this file:
    conf_load_balancing_load.balancing.driver.proxy.read.timeout=TIME_IN_SECONDS

    For example, if you want to set a timeout value of 120 seconds, then set it as follows:

    conf_load_balancing_load.balancing.driver.proxy.read.timeout=120
  3. Ensure this file is owned by apigee:
    chown apigee:apigee /opt/apigee/customer/application/router.properties
  4. Restart the Router:
    /opt/apigee/apigee-service/bin/apigee-service edge-router restart
  5. If you have more than one Router, repeat the above steps on all the Routers.

Message Processor

  1. Create the /opt/apigee/customer/application/message-processor.properties file on the Message Processor machine, if it does not already exist.
  2. Add the following line to this file:
    conf_http_HTTPTransport.io.timeout.millis=TIME_IN_MILLISECONDS

    For example, if you want to set a timeout value of 120 seconds, then set it as follows:

    conf_http_HTTPTransport.io.timeout.millis=120000
  3. Ensure this file is owned by apigee:
    chown apigee:apigee /opt/apigee/customer/application/message-processor.properties
  4. Restart the Message Processor:
    /opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
  5. If you have more than one Message Processor, repeat the above steps on all the Message Processors.
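Note that the Router property is expressed in seconds while the Message Processor property is expressed in milliseconds, which is an easy place to slip. The sketch below prints both property lines from values given in seconds. The function name timeout_properties is hypothetical, and it only prints the lines; writing them to the properties files, fixing ownership, and restarting the components are left to the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Router and Message Processor property lines.
# $1 = Router timeout in seconds, $2 = Message Processor timeout in seconds
timeout_properties() {
  local router_seconds="$1"
  local mp_seconds="$2"
  # The Router property takes seconds as-is.
  echo "conf_load_balancing_load.balancing.driver.proxy.read.timeout=${router_seconds}"
  # The Message Processor property takes milliseconds.
  echo "conf_http_HTTPTransport.io.timeout.millis=$((mp_seconds * 1000))"
}

# 120 and 115 are arbitrary illustrative values, not recommendations.
timeout_properties 120 115
```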

Idea: Set the timeout value on the different components in the following order:

Timeout on Client > Timeout on Router > Timeout on Message Processor > Timeout within API Proxy
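Before applying new values, the recommended ordering can be checked with a short sketch like the one below. The function name check_timeout_order is hypothetical, all four values are assumed to be in seconds, and the numbers in the example (60 for the client and 50 for the API Proxy) are arbitrary illustrative values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify Client > Router > Message Processor > API Proxy.
# All arguments are timeouts in seconds. Returns non-zero if the order is violated.
check_timeout_order() {
  local client="$1"
  local router="$2"
  local mp="$3"
  local proxy="$4"
  if [ "$client" -gt "$router" ] && [ "$router" -gt "$mp" ] && [ "$mp" -gt "$proxy" ]; then
    echo "ok"
  else
    echo "out of order"
    return 1
  fi
}

check_timeout_order 60 57 55 50   # prints ok
```

With the values from Scenario #3 (client 50, Router 57, Message Processor 55), the check reports "out of order", because the client timeout is not the largest.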

Slow API request processing by Edge

If Edge is very slow and/or taking a long time to process the API request, then you will get a 504 Gateway Timeout error.

Diagnosis

  1. Trace the affected API in the Edge UI.
  2. Either wait for the error to occur or, if you have the API call, make some API calls to reproduce the 504 Gateway Timeout error.
  3. Note, in this case, you may see a successful response in the Trace.
    1. The Router/client times out because the Message Processor does not respond within the specified timeout period on the Router/client (whichever has the lowest timeout period). However, the Message Processor continues to process the request and may complete successfully.
    2. In addition, the HTTPTransport.io.timeout.millis value set on the Message Processor triggers only if the Message Processor communicates with an HTTP/HTTPS backend server. In other words, this timeout will not be triggered when any policy (other than the ServiceCallout policy) within the API Proxy is taking a long time.
  4. After the error has occurred, examine the specific request that has the longest elapsed time.
  5. Check the elapsed time at each phase and make a note of the phase where the most time is spent.
  6. If you observe the longest elapsed time in any of the policies other than the ServiceCallout policy, then that indicates that Edge is taking a long time to process the request.
  7. Here's a sample UI trace showing very high elapsed time on a JavaScript policy:

  8. In the above example, you can see that the JavaScript policy takes an abnormally long time of ~ 245 seconds.

Resolution

  1. Check whether the policy that took a long time to respond contains any custom code that might require a long time to process. If there is any such code, then see if you can fix/optimize the identified code.
  2. If there is no custom code that might cause high processing time, then check if the Message Processors are experiencing high CPU or memory usage:
    1. If any Message Processor is experiencing high CPU usage, then generate three thread dumps, one every 30 seconds, using the following command:
      JAVA_HOME/bin/jstack -l PID > FILENAME
    2. If any Message Processor is experiencing high memory usage, then generate a heap dump using the following command:
      sudo -u apigee JAVA_HOME/bin/jmap -dump:live,format=b,file=FILENAME PID
    3. Restart the Message Processor using the command below. This should bring down the CPU and memory usage.
      /opt/apigee/apigee-service/bin/apigee-service edge-message-processor restart
    4. Monitor the API calls and confirm whether the problem still exists.
    5. Contact Apigee Support and provide the thread dumps, heap dump, and Message Processor logs (/opt/apigee/var/log/edge-message-processor/logs/system.log) to help them investigate the cause of the high CPU/memory usage.

Diagnose problems using API Monitoring

API Monitoring enables you to isolate problem areas quickly to diagnose error, performance, and latency issues and their source, such as developer apps, API proxies, backend targets, or the API platform.

Step through a sample scenario that demonstrates how to troubleshoot 5xx issues with your APIs using API Monitoring. For example, you may want to set up an alert to be notified when the number of 504 status codes exceeds a particular threshold.