Timeline for answer to Stratified random sampling with BigQuery? by Gordon Linoff

Current License: CC BY-SA 4.0

3 events
when what by license comment
Aug 27, 2020 at 0:23 comment added Gordon Linoff @Josh . . . What I mean is that an nth sample will work if you want to stratify by a numeric column. For instance, row_number() over (order by income) would also work with the modulo approach.
Aug 26, 2020 at 20:26 comment added Josh QQ - Why do you say that "The first has a particularly nice feature that it can also work with numeric dimensions as well."? seqnum is a number in both cases. The only diff is that in one case you are (trying to) take a fixed percentage of samples per category, whereas in the 2nd one you are taking (at most) a fixed (and equal) number of samples per category, right?
Oct 20, 2018 at 12:44 history answered Gordon Linoff CC BY-SA 4.0
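
The Aug 27, 2020 comment above describes taking every nth row after numbering rows by a numeric column. The following is only a minimal Python sketch of that idea, not the answer's actual query: the function name `nth_sample`, the sample rows, the step `n = 3`, and the choice to keep the first row of each group of `n` are all assumptions made for illustration.

```python
# Sketch: "every nth row" sample over a numeric column.
# Analogous to numbering rows with row_number() over (order by income)
# and then filtering on a modulo of that number.
# All data and names below are hypothetical.

def nth_sample(rows, key, n):
    """Order rows by `key`, number them from 1, and keep rows 1, n+1, 2n+1, ..."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    ordered = sorted(rows, key=key)
    # seqnum plays the role of row_number(); (seqnum - 1) % n == 0 is the modulo filter.
    return [row for seqnum, row in enumerate(ordered, start=1) if (seqnum - 1) % n == 0]

# Hypothetical rows with a numeric `income` column.
incomes = [52, 18, 97, 33, 71, 45, 88, 26, 64, 39]
rows = [{"id": i, "income": value} for i, value in enumerate(incomes)]

sample = nth_sample(rows, key=lambda r: r["income"], n=3)
sampled_incomes = [r["income"] for r in sample]  # [18, 39, 64, 97]
```

Because the rows are ordered by income before the modulo filter is applied, the kept rows are spread across the whole income range rather than clustered at one end, which is the sense in which the sample is stratified by a numeric dimension.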