Affects Version/s: None
Fix Version/s: 5.1.0
Component/s: App Fabric
Release Notes: Improved runtime monitoring (fetching program states, metadata, and logs) of remotely launched programs from the CDAP Master by using SSH dynamic port forwarding instead of a direct HTTPS connection for communication.
In CDAP 5.0, a program running in a remote cluster starts an HTTPS server on port 443, and the RuntimeMonitor running in the CDAP Master connects to it as a client to fetch program states, metadata, and logs. The main drawback of this approach is that it requires a firewall rule allowing ingress traffic on port 443. Such a rule usually does not exist by default, so the user must take the extra manual step of creating it directly through the cloud provider. Alternatively, CDAP could set up the firewall rule automatically as part of the provisioning process, but that requires the service account CDAP uses to have the corresponding permission, which the user may not be willing to grant. Automatically setting up the firewall rule also makes failure recovery and cleanup more challenging. Moreover, depending on the existing setup, it may unintentionally expose the user's network in the cloud.
Another drawback of the existing approach is that on a persistent remote Hadoop cluster, no more than one pipeline can execute at a time, because the runtime monitor server is hardcoded to bind to port 443.
The simplest solution is to use SSH local port forwarding. However, this approach requires three local ports per program run (one for SSH, one for the local server socket, and one for the local socket connecting to that server), since each run executes in a different cluster. It also requires more threads in the CDAP Master to handle the IO (even with Netty, each worker of each local server requires a dedicated thread).
A better approach is to use dynamic port forwarding. Typical dynamic port forwarding (like what ssh -D does) forwards all traffic through one remote host, which then connects to the destination. Here, instead, we start only one SOCKS proxy server locally and, based on the hostname in each request, dynamically select the SSH session for that host and forward the connection as a localhost connection (from the perspective of the remote SSH server). This approach requires only two ports per program run (one for SSH and one for the local socket connecting to the local SOCKS proxy server). The modified approach works in CDAP's case because we are only ever interested in connecting to a local server running on the same remote host that we SSH into.
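To make the hostname-based dispatch concrete, the sketch below extracts the destination hostname from a SOCKS5 CONNECT request (per RFC 1928, address type 0x03 = domain name). The class and method names are hypothetical, not part of CDAP; this only illustrates the information the proxy has available when choosing an SSH session.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: pulls the destination hostname out of a SOCKS5
// CONNECT request so the proxy can pick the matching SSH session.
public class Socks5Connect {

  // SOCKS5 request layout (RFC 1928):
  // VER(1) CMD(1) RSV(1) ATYP(1) DST.ADDR(variable) DST.PORT(2)
  public static String extractDomain(byte[] request) {
    if (request.length < 5 || request[0] != 0x05) {
      throw new IllegalArgumentException("Not a SOCKS5 request");
    }
    if (request[1] != 0x01) { // CMD 0x01 = CONNECT
      throw new IllegalArgumentException("Not a CONNECT command");
    }
    if (request[3] != 0x03) { // ATYP 0x03 = domain name
      throw new IllegalArgumentException("Expected a domain-name address");
    }
    int len = request[4] & 0xFF; // DST.ADDR starts with a 1-byte name length
    return new String(request, 5, len, StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) {
    // Build a CONNECT request for "worker-0" on port 443 (0x01BB)
    byte[] name = "worker-0".getBytes(StandardCharsets.US_ASCII);
    byte[] req = new byte[4 + 1 + name.length + 2];
    req[0] = 0x05; req[1] = 0x01; req[2] = 0x00; req[3] = 0x03;
    req[4] = (byte) name.length;
    System.arraycopy(name, 0, req, 5, name.length);
    req[5 + name.length] = 0x01;
    req[6 + name.length] = (byte) 0xBB;
    System.out.println(extractDomain(req)); // prints "worker-0"
  }
}
```

In the actual implementation this decoding would be done by Netty's SOCKS5 codec rather than by hand; the point is that the proxy sees the remote hostname before any bytes are relayed.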
The Netty library already has SOCKS5 proxy codec handlers, which can be used for the proxy server implementation. The proxy server will be bound to localhost only, on a random port. The following describes what happens after starting a remote program execution:
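A minimal sketch of the "localhost only, random port" binding, using only the JDK (the real server would be a Netty bootstrap, but the addressing is the same):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Sketch: bind the local SOCKS proxy listener to the loopback address only,
// letting the OS pick an ephemeral port.
public class LoopbackBind {

  public static ServerSocket bindLoopback() throws IOException {
    // Port 0 asks the OS for a free ephemeral port; binding to the loopback
    // address keeps the proxy unreachable from outside the host.
    return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
  }

  public static void main(String[] args) throws IOException {
    try (ServerSocket server = bindLoopback()) {
      System.out.println("Proxy bound to " + server.getLocalSocketAddress());
    }
  }
}
```

Because the port is chosen at bind time, each CDAP Master instance can run its own proxy without the fixed-port conflict described above.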
- Create an SSH session
- Use the SSH session to upload files and launch the remote process
- Register the <Remote Host, SSH session> pair with the local proxy server
- Set up the RuntimeMonitorClient to proxy all HTTPS calls through the local SOCKS proxy
- On every HTTPS call made by the RuntimeMonitorClient, the proxy server looks up the SSH session based on the remote hostname in the request
- If no such SSH session exists, deny the call
- The proxy server opens a TCP/IP forwarding channel (direct-tcpip) using the SSH session
- Set up the proxy server's Netty ChannelPipeline to relay network data through the TCP/IP forwarding channel
- When the program completes, the RuntimeMonitor unregisters the SSH session from the local proxy server and closes the SSH session
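The register/lookup/deny/unregister lifecycle above can be sketched as follows. SshSession here is a hypothetical stand-in for whatever SSH library is used, and the class names are illustrative only; the socksProxy helper shows how a JDK HTTP client would be pointed at the local SOCKS proxy.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proxy server's session registry.
public class ProxyRegistry {

  // Stand-in for a real SSH session object.
  public static final class SshSession {
    final String host;
    public SshSession(String host) { this.host = host; }
    void close() { /* tear down the SSH connection */ }
  }

  private final Map<String, SshSession> sessions = new ConcurrentHashMap<>();

  // Register the <Remote Host, SSH session> pair with the proxy server.
  public void register(String remoteHost, SshSession session) {
    sessions.put(remoteHost, session);
  }

  // On each proxied call, look up the session by the request hostname;
  // deny the call when no session is registered for that host.
  public SshSession lookup(String remoteHost) {
    SshSession session = sessions.get(remoteHost);
    if (session == null) {
      throw new IllegalStateException("Denied: no SSH session for " + remoteHost);
    }
    return session;
  }

  // When the program completes, unregister and close the session.
  public void unregister(String remoteHost) {
    SshSession session = sessions.remove(remoteHost);
    if (session != null) {
      session.close();
    }
  }

  // The RuntimeMonitorClient routes its HTTPS calls through the local
  // SOCKS proxy, e.g. via java.net.Proxy when using HttpURLConnection.
  public static Proxy socksProxy(int localPort) {
    return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("127.0.0.1", localPort));
  }
}
```

Denying lookups for unregistered hosts means the proxy can never be used to reach arbitrary destinations, only the clusters that currently have a live program run.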