If a user has existing Hive UDFs (packaged as jars), it would be most convenient to reuse them in CDAP. Options:
- deploy as an artifact?
- register as additional resources in an app or pipeline
- have CDAP inject these UDFs into a SparkSQL context (otherwise the Spark program can do that itself in one line)
- have CDAP register these UDFs with Hive. It is not clear what that would mean: registration for a dataset, or for a namespace/database?
A more detailed design is needed.
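For reference, the "one line" mentioned in the SparkSQL option is Spark's `CREATE FUNCTION ... USING JAR` statement, which Hive also accepts. A sketch, with a hypothetical UDF class and jar path:

```sql
-- Register an existing Hive UDF jar in a SparkSQL session
-- (class name and jar path below are hypothetical examples):
CREATE TEMPORARY FUNCTION to_upper
  AS 'com.example.udf.ToUpper'
  USING JAR 'hdfs:///udfs/my-udfs.jar';

-- The UDF is then usable in any query in that session:
SELECT to_upper(name) FROM customers;
```

Note that dropping `TEMPORARY` makes the function permanent, and in Hive a permanent function is registered in the current database; that per-database scoping is directly relevant to the namespace/database question in the last option above.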