I have the explore container configured to run with 1024MB, so 774MB is configured for the heap.
I ran three "select *" queries against a PartitionedFileSet dataset every fifteen minutes over a ~3 day period. The explore service now responds with OutOfMemoryError, but it doesn’t crash and restart.
I took a heap dump and found 922 instances of DistributedFileSystem, 429 instances of JobConf, and 498 instances of HiveConf. Each conf object has a key ‘explore.hconfiguration’ with a very large value (~500KB), so leaking these objects adds up quickly: 1000 leaked instances would account for ~500MB of heap.
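As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes every leaked conf object retains its own ~500KB copy of the ‘explore.hconfiguration’ value (the heap dump counts above don't confirm that by themselves), and the class and method names are made up for illustration.

```java
public class LeakEstimate {

    /**
     * Approximate retained size in MiB for a number of leaked objects,
     * assuming (hypothetically) that each one holds its own copy of a
     * value of the given size in KB.
     */
    static double leakedMiB(int instances, int kbPerInstance) {
        return (double) instances * kbPerInstance / 1024.0;
    }

    public static void main(String[] args) {
        // Conf objects seen in the heap dump: 429 JobConf + 498 HiveConf.
        int confs = 429 + 498;

        // The round-number case from the description: 1000 leaked instances.
        System.out.printf("1000 instances: ~%.0f MiB%n", leakedMiB(1000, 500));

        // The counts actually observed, compared to the configured heap.
        System.out.printf("%d conf objects: ~%.0f MiB of a 774 MiB heap%n",
                confs, leakedMiB(confs, 500));
    }
}
```

Under that assumption, the 927 conf objects in the dump would already account for well over half of the 774MB heap, which is consistent with the OutOfMemoryError.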
This can also be reproduced by creating streams. Each stream created adds more entries to FileSystem's static Cache field.
Performing a "select *" query against a PartitionedFileSet increases the size of the Cache field by 2, and downloading or previewing the results increases it by an additional 1.