For example, when a partitioned dataset adds a partition, it performs three steps in order:
- writes the file
- adds the partition to its meta table
- makes a call to Hive to register the partition there
If the transaction fails, only the second step (the meta table update) is rolled back; the files remain in the file system and the partition remains registered in Hive.
This needs to be fixed, but it is not clear how to do so correctly, because Hive does not support two-phase commit or a similar mechanism.
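
To make the failure mode concrete, here is a minimal sketch that simulates the three steps with in-memory stand-ins. Every name in it (FakeFileSystem, FakeMetaTable, FakeHive, add_partition, the compensate flag) is hypothetical and is not part of the dataset or Hive APIs. The compensate option shows one possible direction, best-effort undo actions run on rollback; it is an assumption for illustration, not an agreed fix.

```python
class FakeFileSystem:
    """Stand-in for the file system: no transactional rollback."""
    def __init__(self):
        self.files = set()


class FakeHive:
    """Stand-in for Hive: registration is immediate and not transactional."""
    def __init__(self):
        self.partitions = set()


class FakeMetaTable:
    """Stand-in for the transactional meta table."""
    def __init__(self):
        self.committed = set()
        self.pending = set()

    def commit(self):
        self.committed |= self.pending
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def add_partition(fs, meta, hive, key, path, fail=False, compensate=False):
    """Run the three steps; optionally undo the non-transactional ones on failure."""
    undo = []  # compensating actions for steps the transaction cannot roll back
    try:
        fs.files.add(path)                              # step 1: write the file
        undo.append(lambda: fs.files.discard(path))
        meta.pending.add(key)                           # step 2: transactional
        hive.partitions.add(key)                        # step 3: register in Hive
        undo.append(lambda: hive.partitions.discard(key))
        if fail:
            raise RuntimeError("simulated transaction failure")
        meta.commit()
        return True
    except RuntimeError:
        meta.rollback()                                 # only step 2 is undone here
        if compensate:
            for action in reversed(undo):               # best effort, newest first
                action()
        return False


# Without compensation, the file and the Hive partition are left behind.
fs, meta, hive = FakeFileSystem(), FakeMetaTable(), FakeHive()
add_partition(fs, meta, hive, "2015-01-01", "/data/2015-01-01/part-0", fail=True)
print(sorted(fs.files), sorted(hive.partitions), sorted(meta.committed))

# With compensation, all three stores end up empty again.
fs2, meta2, hive2 = FakeFileSystem(), FakeMetaTable(), FakeHive()
add_partition(fs2, meta2, hive2, "2015-01-01", "/data/2015-01-01/part-0",
              fail=True, compensate=True)
print(sorted(fs2.files), sorted(hive2.partitions), sorted(meta2.committed))
```

Note that compensation is not atomic: an undo action can itself fail (for example, if Hive is unreachable during rollback), so this narrows the window for inconsistency without closing it.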