tladeras’s Twitter Archive: № 5,785

  1. #rstats twitter: Does anyone have a good figure that shows how Spark data is distributed in a cluster and how distributing the data enables fast querying? Realizing there's a real disconnect between dev-level diagrams and beginner diagrams.
    1. …in reply to @tladeras
      If you are a DB engineer, the Spark diagrams make sense, but explaining to beginners why Spark was chosen needs a different level of abstraction.
      1. …in reply to @tladeras
        (And apologies if I'm asking the wrong question, Spark people.)
        1. …in reply to @tladeras
          Realizing that I need to learn more about RDDs (Resilient Distributed Datasets).
          1. …in reply to @tladeras
            (Ok, just goes to show that I need to brush up on the terminology before I ask these questions.)