When an AbstractSpark program uses a PartitionedFileSet in its beforeSubmit method, obtained via the @UseDataset annotation (as opposed to context.getDataset), the dataset's internal `tx` field is null, because startTx was never called on that dataset instance.
If you call sparkClientContext.getDataset("abc") instead, it returns a PartitionedFileSet whose tx field is non-null (startTx is called as expected), so the issue occurs only with the @UseDataset annotation.
I have attached a diff that causes WikipediaPipelineAppTest to fail.
A workaround is to use the getDataset method:
// this pfs will have a non-null tx, because startTx has been called on it
PartitionedFileSet pfs = context.getDataset(name);
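
To illustrate the difference between the two code paths described above, here is a minimal, self-contained toy model. It does not use the CDAP API: ToyDataset, ToyContext, injectWithoutTx and TxInjectionSketch are hypothetical names made up for this sketch, and the injection path only mimics the reported symptom (startTx never called), not the actual CDAP internals.

```java
// Toy stand-in for a transactional dataset such as PartitionedFileSet.
// Hypothetical class; NOT part of the CDAP API.
class ToyDataset {
    private Object tx; // stays null until startTx is called

    void startTx(Object transaction) {
        this.tx = transaction;
    }

    boolean hasTx() {
        return tx != null;
    }
}

// Toy stand-in for the program context. Hypothetical class; NOT part of the CDAP API.
class ToyContext {
    private final Object currentTx = new Object();

    // Models context.getDataset(name): the dataset is enrolled in the
    // current transaction before it is handed to the caller.
    ToyDataset getDataset(String name) {
        ToyDataset ds = new ToyDataset();
        ds.startTx(currentTx);
        return ds;
    }

    // Models the reported @UseDataset behaviour (an assumption based on the
    // symptom): the instance is created, but startTx is never called on it.
    ToyDataset injectWithoutTx(String name) {
        return new ToyDataset();
    }
}

class TxInjectionSketch {
    public static void main(String[] args) {
        ToyContext context = new ToyContext();

        ToyDataset injected = context.injectWithoutTx("abc");
        ToyDataset fetched = context.getDataset("abc");

        System.out.println("injected has tx: " + injected.hasTx()); // false
        System.out.println("fetched has tx: " + fetched.hasTx());   // true
    }
}
```

In this model, the fix would be for the injection path to call startTx on the instance in the same way getDataset does; whether that matches the right fix in CDAP itself is not confirmed here.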