org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " .

Sakai JIRA | Doron Shalvi | 2 years ago
  1. 0

    There is a concurrency bug in the API-A-LITE "get" APIs in Fedora 3.8, which potentially results in a server exception and failure to properly respond to the following methods: getDatastreamDissemination getDissemination getObjectProfile getObjectHistory The bug is in class org.fcrepo.server.access.FedoraAccessServlet.java, in which the instance variable requestURI is used in the processing logic. As this is an instance variable it is not thread-safe, resulting in unexpected values for a given thread and exceptions during processing. Here is the code snippet: -------------------------------- public class FedoraAccessServlet ... /** Portion of initial request URL from protocol up to query string */ private String requestURI = null; ... public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { ... requestURI = request.getQueryString() != null ? request.getRequestURL().toString() + "?" + request.getQueryString() : request.getRequestURL().toString(); logger.info("Got request: {}", requestURI); ... String[] URIArray = requestURI.substring(0, request.getRequestURL().length()).split("/"); -------------------------------- It appears that this instance variable has been around since at least Fedora 3.3, but it had been used only for logging. So previous errors may have been limited to logging an incorrect requestURI. It looks like Fedora 3.8 has a great deal of code cleanup and optimizations - thanks much for the hard work and the release! However, I guess that this bug was introduced during some of the cleanup. In our system, we process many requests simultaneously through this servlet in order to provide many images at once for search results or page thumbnails. In our system the error manifests in several ways, about 0.5% of the time: 1. DatastreamNotFoundException, because requestURI.substring is called with the shorter length of a different request, resulting in truncated datastream names: org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-101539482-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] 2. StringIndexOutOfBoundsException, because requestURI.substring is called with the longer length of a different request: java.lang.StringIndexOutOfBoundsException: String index out of range: 62 at java.lang.String.substring(String.java:1907) at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:208) 3. Incorrect logging - threads log that they are processing a different URI than they actually are. Here, three different threads report on the processing of page 11 (should be processed by one thread in our system): INFO 2015-01-16 10:11:51.960 [http-bio-8080-exec-10] (FedoraAccessServlet) Got request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB ERROR 2015-01-16 10:11:52.081 [http-bio-8080-exec-8] (FedoraAccessServlet) Datastream not found for request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB (actionLabel=Access) org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] ERROR 2015-01-16 10:11:52.095 [http-bio-8080-exec-16] (FedoraAccessServlet) Datastream not found for request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB (actionLabel=Access) org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] We have patched this class in our system by making this variable a local variable: // Portion of initial request URL from protocol up to query string // Moved to a local variable to be thread-safe String requestURI = request.getQueryString() != null ? request.getRequestURL().toString() + "?" + request.getQueryString() : request.getRequestURL().toString(); This proposed fix is attached. We have tested the patch in our Fedora 3.8 system and it seems to be OK. Converting this variable to local also seems to be OK given the intent of this variable and the rest of the code in this class, but it would be helpful to have confirmation given the effort that has gone into optimizing this class and code. I understand also that this API has been deprecated since Fedora 3.4-RC1, so an alternative solution would be to just use the REST API. However, this code was adjusted for Fedora 3.8, so perhaps it is still common for many organizations to use this API. We do still use it in parts of our system. Thanks again for all of the work in code cleanup, optimizations and other enhancements in this large release!

    Sakai JIRA | 2 years ago | Doron Shalvi
    org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " .
  2. 0

    There is a concurrency bug in the API-A-LITE "get" APIs in Fedora 3.8, which potentially results in a server exception and failure to properly respond to the following methods: getDatastreamDissemination getDissemination getObjectProfile getObjectHistory The bug is in class org.fcrepo.server.access.FedoraAccessServlet.java, in which the instance variable requestURI is used in the processing logic. As this is an instance variable it is not thread-safe, resulting in unexpected values for a given thread and exceptions during processing. Here is the code snippet: -------------------------------- public class FedoraAccessServlet ... /** Portion of initial request URL from protocol up to query string */ private String requestURI = null; ... public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { ... requestURI = request.getQueryString() != null ? request.getRequestURL().toString() + "?" + request.getQueryString() : request.getRequestURL().toString(); logger.info("Got request: {}", requestURI); ... String[] URIArray = requestURI.substring(0, request.getRequestURL().length()).split("/"); -------------------------------- It appears that this instance variable has been around since at least Fedora 3.3, but it had been used only for logging. So previous errors may have been limited to logging an incorrect requestURI. It looks like Fedora 3.8 has a great deal of code cleanup and optimizations - thanks much for the hard work and the release! However, I guess that this bug was introduced during some of the cleanup. In our system, we process many requests simultaneously through this servlet in order to provide many images at once for search results or page thumbnails. In our system the error manifests in several ways, about 0.5% of the time: 1. DatastreamNotFoundException, because requestURI.substring is called with the shorter length of a different request, resulting in truncated datastream names: org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-101539482-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] 2. StringIndexOutOfBoundsException, because requestURI.substring is called with the longer length of a different request: java.lang.StringIndexOutOfBoundsException: String index out of range: 62 at java.lang.String.substring(String.java:1907) at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:208) 3. Incorrect logging - threads log that they are processing a different URI than they actually are. Here, three different threads report on the processing of page 11 (should be processed by one thread in our system): INFO 2015-01-16 10:11:51.960 [http-bio-8080-exec-10] (FedoraAccessServlet) Got request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB ERROR 2015-01-16 10:11:52.081 [http-bio-8080-exec-8] (FedoraAccessServlet) Datastream not found for request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB (actionLabel=Access) org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] ERROR 2015-01-16 10:11:52.095 [http-bio-8080-exec-16] (FedoraAccessServlet) Datastream not found for request: http://localhost:8080/fedora/get/nlm:nlmuid-63510860R-pgimg-11/THUMB (actionLabel=Access) org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " . at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594) ~[fcrepo-server-3.8.0.jar:na] at org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447) ~[fcrepo-server-3.8.0.jar:na] We have patched this class in our system by making this variable a local variable: // Portion of initial request URL from protocol up to query string // Moved to a local variable to be thread-safe String requestURI = request.getQueryString() != null ? request.getRequestURL().toString() + "?" + request.getQueryString() : request.getRequestURL().toString(); This proposed fix is attached. We have tested the patch in our Fedora 3.8 system and it seems to be OK. Converting this variable to local also seems to be OK given the intent of this variable and the rest of the code in this class, but it would be helpful to have confirmation given the effort that has gone into optimizing this class and code. I understand also that this API has been deprecated since Fedora 3.4-RC1, so an alternative solution would be to just use the REST API. However, this code was adjusted for Fedora 3.8, so perhaps it is still common for many organizations to use this API. We do still use it in parts of our system. Thanks again for all of the work in code cleanup, optimizations and other enhancements in this large release!

    Sakai JIRA | 2 years ago | Doron Shalvi
    org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " .

    Root Cause Analysis

    1. org.fcrepo.server.errors.DatastreamNotFoundException

      [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "nlm:nlmuid-63510860R-pgimg-11" with datastream ID of "THUM " OR there are no datastreams that match the specified date/time value of "null " .

      at org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination()
    2. org.fcrepo.server
      FedoraAccessServlet.doGet
      1. org.fcrepo.server.access.DefaultAccess.getDatastreamDissemination(DefaultAccess.java:1124)[fcrepo-server-3.8.0.jar:na]
      2. org.fcrepo.server.access.FedoraAccessServlet.getDatastreamDissemination(FedoraAccessServlet.java:594)[fcrepo-server-3.8.0.jar:na]
      3. org.fcrepo.server.access.FedoraAccessServlet.doGet(FedoraAccessServlet.java:447)[fcrepo-server-3.8.0.jar:na]
      3 frames