Oken: Well, of course, there are other fields in the table that I’d also select, i.e. the ones I need to ****yze, otherwise the whole exercise wouldn’t make sense ;
Swed: Jmpp, and you need to process ALL those rows? if yes then making it in batches makes sense – just pick the right granularity – how many courses do you have? will it be more than 1000 records for a course?
Gift: I need to process *all* entries for each userid, courseid combo, yes
Meisenburg: And it’s a pretty low number of courses, let me get
Soechting: I expect the script to execute for a *long* while, that is OK
Swed: Jmpp, and what kind of processing? something which is not achievable with “simple” group by aggregation? counts, sums etc?
Kealohanui: What I need to guarantee is that it doesn’t crash midway due to memory exhaustion
Azzarito: Swed: no, not simple by any means
Benzer: Rows have a “data” field which is a JSON payload
Prophit: For each entry in each userid, courseid combo, I need to extract the payload and compare it against the payload of those other entries in the combo
Swed: Ah : json in a db is a bad idea unless your db understands it. but you already found out the various problems probably
Feltmann: And then select a “winner” according to a non-trivial algorithm, and discard all other rows in the userid, courseid combo
Brennecke: Swed: yeah, it’s far from ideal, pretty old inherited db which everyone in the company acknowledges to be *extremely* poorly designed
Olmos: But, well, it’s the core of the business that we’re working on redesigning, yes, but until then. :
Kao: So, again, the thing is that I need all entries in a userid, courseid combo, ****yze and pick a winner, then iterate to the next userid, courseid combo and repeat
Swed: Jmpp, you can run “select count1, countdistinct courseid, countdistinct userid, countdistinct courseid,userid” to see what numbers you are working with
Gathje: So I thought the courseid granularity would be appropriate. but, then, I’m a bit confused on writing the query that would give me the entries for a single userid given a courseid
Spallina: I can always select disctinct userid and store that, then select distinct courseid and store that, and finally start nested loops to run both arrays as appropriate, issuing each time a simple query with a WHERE and an AND
Inclan: But I’m sure there’s a much clever way
Swed: If you need to fetch for each specific pair instead for all pairs together, then your “granularity level” is not on courseid but on courseid,userid – so the simplest solution is to run “select distinct userid from table where courseid=X” before the inner query to get the users you need to iterate
Swed: Sorry for not being understandable sometimes. my english is far from 100%
Insognia: I need to p**** the data in *each* entry for *each* userid, courseid.
Hamada: Swed: heh, nevermind, appreciate your help very much
Swed: Jmpp, did you try the “count distinct” query I suggested? depending on the resulting numbers it will either be better to get all courses and all users and then nested loops if both counts are relatively small or get the courses, loop over them and for each course get relevant users and then loop over those thats when there are lot more users than courses and it is probable that for each course only a small number of them has any record
Besson: Hey! Hoping someone can help me figure out why I’m getting different results from a query when setting the value to a variable versus the value being returned from a stored proc.
Keirstead: I figure it’s probably because it’s not using the ORDER BY when using SELECT id INTO variable
Calamity: Swed: sorry, hand’t read you
Lanni: Yes, I tried the query, here are the numbers:
Negroni: 31 courses; 150,334 users
Anstine: And the last column is 169,266, but I’m not too sure what it says
Adesso: Hi folks. i’m not terribly experienced with mysql, but i think i’ve got a simple problem that someone with more experience could see the answer to right away.