• 2 Posts
  • 13 Comments
Joined 1 year ago
Cake day: June 27th, 2023






  • I haven’t used them in Spark directly but here’s how they are used for computing sparse joins in a similar data processing framework:

    Let’s say you want to join some data “tables” A and B. When B has many more unique keys than are present in A, computing “A inner join B” would require lots of shuffling of B, including those extra keys.

    Knowing this, you can add a step before the join to compute a bloom filter of the keys in A, then apply the filter to B. Now the join from A to B-filtered only considers relevant keys from B, hopefully now with much less total computation than the original join.
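
For illustration, the pre-filter step described above can be sketched in plain Python. This is a minimal sketch, not how Spark or any particular framework implements it: the `BloomFilter` class, the `bloom_prefiltered_join` helper, and the bit and hash counts are all hypothetical choices made for the example, and the "tables" are assumed to be lists of `(key, value)` pairs.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_prefiltered_join(table_a, table_b):
    """Inner-join two lists of (key, value) pairs, pre-filtering B by A's keys."""
    bloom = BloomFilter()
    for key, _ in table_a:
        bloom.add(key)

    # Cheap pass over B: drop rows whose key is definitely not in A.
    b_filtered = [(k, v) for k, v in table_b if bloom.might_contain(k)]

    # Exact join on the reduced B; this also discards Bloom false positives.
    a_index = {}
    for k, v in table_a:
        a_index.setdefault(k, []).append(v)
    joined = [
        (k, a_val, b_val)
        for k, b_val in b_filtered
        for a_val in a_index.get(k, [])
    ]
    return joined, b_filtered
```

In a distributed setting the filter would be built once from A and broadcast to the workers holding B, so only the surviving rows of B need to be shuffled. The exact join afterwards is still required because a Bloom filter can let through keys that are not in A.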









  • the author is more interested in how humanity as a whole would react to his fictional scenario than he is with writing characters with depth

    This was my impression as well, and I think it works only because the fictional scenarios are extremely creative and come with sometimes gratuitous science-fiction details from the author’s imagination. And even though most characters seemed unrealistic as people, I still liked them as characters and found them memorable.

    I also read (listened to) Voyagers by Ben Bova recently and while the fictional scenario was interesting, the character development leaned heavily on the relationship between the hero scientist and the promiscuous young scientist, a writing style which I found more boring.