Workflow Topology

Primary Locations

  • Only one user can perform a single task on "live" data
  • Only when the data resides in HDFS is it persisted, sharable and highly available

Main Advantages

  • Multiple concurrent users
  • Multi-job pipelines
  • Using the same working set