CDAP / CDAP-11908

Table.scan() should have a way to set the client cache size

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.2.1, 4.1.2
    • Component/s: Datasets

      Description

      Currently, the scanner's client cache size is hardcoded in HBaseTable.java, on the assumption that the scanner is used in a MapReduce job:

          // todo: should be configurable
          // NOTE: by default we assume scanner is used in mapreduce job, hence no cache blocks
          hScan.setCacheBlocks(false);
          hScan.setCaching(1000);
      

      In some cases, however, this needs finer tuning. For example, when scanning an indexed table, we scan the index and then perform a get() for each row found in it. With a client cache of 1000 rows, the client fetches 1000 index rows in one call and does not contact the server-side scanner again until the 1001st call to next(). If the gets are slow, 1000 of them can take longer than the server-side HBase scanner timeout, and that 1001st call to next() fails. In this case, a much smaller client cache, say 100 or 200 rows, would be preferable.
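      To make the arithmetic concrete, here is a rough sizing sketch. The class and method names are hypothetical, and the 60-second scanner timeout, the 150 ms average get() latency, and the 0.5 safety factor are assumptions chosen for illustration, not measurements from CDAP or HBase.

```java
/** Back-of-the-envelope sizing for a scanner's client cache (illustrative only). */
final class ScannerCacheSizing {

  private ScannerCacheSizing() { }

  /**
   * Returns the largest caching value for which processing one cached batch is
   * expected to finish within the given fraction of the scanner timeout.
   */
  static int maxSafeCaching(long scannerTimeoutMs, long perRowMs, double safetyFactor) {
    if (scannerTimeoutMs <= 0 || perRowMs <= 0) {
      throw new IllegalArgumentException("timeout and per-row time must be positive");
    }
    if (safetyFactor <= 0.0 || safetyFactor > 1.0) {
      throw new IllegalArgumentException("safetyFactor must be in (0, 1]");
    }
    long budgetMs = (long) (scannerTimeoutMs * safetyFactor);
    long rows = budgetMs / perRowMs;
    // Never return less than 1: a scanner has to fetch at least one row per call.
    return (int) Math.max(1L, Math.min(rows, (long) Integer.MAX_VALUE));
  }

  /** True if working through a full cached batch is expected to outlast the timeout. */
  static boolean batchOutlastsTimeout(int caching, long perRowMs, long scannerTimeoutMs) {
    return (long) caching * perRowMs > scannerTimeoutMs;
  }

  public static void main(String[] args) {
    long timeoutMs = 60_000L; // assumed server-side scanner timeout
    long perGetMs = 150L;     // assumed average get() latency per index row

    // 1000 rows * 150 ms = 150 s, which exceeds the assumed 60 s timeout.
    System.out.println(batchOutlastsTimeout(1000, perGetMs, timeoutMs)); // true
    // Half of the timeout budget (30 s) divided by 150 ms per row.
    System.out.println(maxSafeCaching(timeoutMs, perGetMs, 0.5));        // 200
  }
}
```

      Under these assumed numbers, the result lands in the 100 to 200 range suggested above.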

      The Table API should have a way to set this per scanner instance.
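      One possible shape for such a per-scanner setting is sketched below. ScanOptions and all of its methods are hypothetical names invented for this illustration and are not part of the CDAP Table API; the defaults simply mirror the hardcoded values quoted above.

```java
/** Hypothetical per-scanner settings holder; NOT part of the CDAP Table API. */
final class ScanOptions {

  /** Defaults mirror the values currently hardcoded in HBaseTable.java. */
  static final int DEFAULT_CACHING = 1000;
  static final boolean DEFAULT_CACHE_BLOCKS = false;

  private final int caching;
  private final boolean cacheBlocks;

  private ScanOptions(int caching, boolean cacheBlocks) {
    this.caching = caching;
    this.cacheBlocks = cacheBlocks;
  }

  /** Options equivalent to today's hardcoded behavior. */
  static ScanOptions defaults() {
    return new ScanOptions(DEFAULT_CACHING, DEFAULT_CACHE_BLOCKS);
  }

  /** Returns a copy with a different client cache size (rows fetched per call). */
  ScanOptions withCaching(int rows) {
    if (rows <= 0) {
      throw new IllegalArgumentException("caching must be positive, got " + rows);
    }
    return new ScanOptions(rows, cacheBlocks);
  }

  /** Returns a copy with server-side block caching switched on or off. */
  ScanOptions withCacheBlocks(boolean enabled) {
    return new ScanOptions(caching, enabled);
  }

  int getCaching() {
    return caching;
  }

  boolean isCacheBlocks() {
    return cacheBlocks;
  }
}
```

      With something like this, the index-scan case could pass ScanOptions.defaults().withCaching(100), and HBaseTable could call hScan.setCaching(options.getCaching()) and hScan.setCacheBlocks(options.isCacheBlocks()) instead of using the constants. Making the options immutable keeps one scanner's tuning from leaking into another's.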


              People

              • Assignee: Bhooshan Mogal (bhooshan)
              • Reporter: Andreas Neumann (andreas)
