Hi folks. i’m not terribly.

 
Oken: Well, of course, there are other fields in the table that I’d also select, i.e. the ones I need to ****yze, otherwise the whole exercise wouldn’t make sense ;

Swed: Jmpp, and you need to process ALL those rows? if yes then making it in batches makes sense – just pick the right granularity – how many courses do you have? will it be more than 1000 records for a course?

Gift: I need to process *all* entries for each userid, courseid combo, yes

Meisenburg: And it’s a pretty low number of courses, let me get

Soechting: I expect the script to execute for a *long* while, that is OK

Swed: Jmpp, and what kind of processing? something which is not achievable with “simple” group by aggregation? counts, sums etc?

Kealohanui: What I need to guarantee is that it doesn’t crash midway due to memory exhaustion

Azzarito: Swed: no, not simple by any means

Benzer: Rows have a “data” field which is a JSON payload

Prophit: For each entry in each userid, courseid combo, I need to extract the payload and compare it against the payload of those other entries in the combo

Swed: Ah : json in a db is a bad idea unless your db understands it. but you already found out the various problems probably

Feltmann: And then select a “winner” according to a non-trivial algorithm, and discard all other rows in the userid, courseid combo

Brennecke: Swed: yeah, it’s far from ideal, pretty old inherited db which everyone in the company acknowledges to be *extremely* poorly designed

Olmos: But, well, it’s the core of the business that we’re working on redesigning, yes, but until then. :

Kao: So, again, the thing is that I need all entries in a userid, courseid combo, ****yze and pick a winner, then iterate to the next userid, courseid combo and repeat

Swed: Jmpp, you can run “select count1, countdistinct courseid, countdistinct userid, countdistinct courseid,userid” to see what numbers you are working with

Gathje: So I thought the courseid granularity would be appropriate. but, then, I’m a bit confused on writing the query that would give me the entries for a single userid given a courseid

Spallina: I can always select disctinct userid and store that, then select distinct courseid and store that, and finally start nested loops to run both arrays as appropriate, issuing each time a simple query with a WHERE and an AND

Inclan: But I’m sure there’s a much clever way

Swed: If you need to fetch for each specific pair instead for all pairs together, then your “granularity level” is not on courseid but on courseid,userid – so the simplest solution is to run “select distinct userid from table where courseid=X” before the inner query to get the users you need to iterate

Swed: Sorry for not being understandable sometimes. my english is far from 100%

Insognia: I need to p**** the data in *each* entry for *each* userid, courseid.

Hamada: Swed: heh, nevermind, appreciate your help very much

Swed: Jmpp, did you try the “count distinct” query I suggested? depending on the resulting numbers it will either be better to get all courses and all users and then nested loops if both counts are relatively small or get the courses, loop over them and for each course get relevant users and then loop over those thats when there are lot more users than courses and it is probable that for each course only a small number of them has any record

Besson: Hey! Hoping someone can help me figure out why I’m getting different results from a query when setting the value to a variable versus the value being returned from a stored proc.

Keirstead: I figure it’s probably because it’s not using the ORDER BY when using SELECT id INTO variable

Calamity: Swed: sorry, hand’t read you

Lanni: Yes, I tried the query, here are the numbers:

Negroni: 31 courses; 150,334 users

Anstine: And the last column is 169,266, but I’m not too sure what it says

Adesso: Hi folks. i’m not terribly experienced with mysql, but i think i’ve got a simple problem that someone with more experience could see the answer to right away.