I have a MYSQL table called aa_ngl_defs of the following form:
id | Word | POS | Definition |
---|---|---|---|
1 | the | article | used to point to something already mentioned |
2 | be | verb | used to show the identity of a thing |
3 | and | conj | used to join words or groups of words |
4 | of | prep | belonging to or connected with something |
5 | to | prep | used to indicate place or direction |
6 | a | article | one particular thing or one of a class of things |
7 | in | prep | used to indicate being inside of or surrounded by something else |
8 | have | verb | to own, possess, or hold something |
9 | it | pronoun | a thing that has been previously mentioned |
10 | you | pronoun | used to refer to the person the speaker is addressing |
11 | he | pronoun | used to refer to a male person that is the subject |
12 | for | prep | indicating the purpose or need of something |
13 | they | pronoun | used to refer to two or more people, animals, or things |
14 | not | adverb | used to make an expression negative |
15 | that | adj | used to identify a specific person, thing, fact or idea |
16 | we | pronoun | used to refer to the speaker and another person as the subject |
17 | on | prep | touching and being supported or physically in contact by something |
The table has unique contiguous integers in the id column.
I want to create a new table by adding 3 columns of definitions taken randomly from other rows within this table. This would allow me to create quizzes with 3 distractors. I need reasonable performance speed but not necessarily high speed, as the table has 3,500 rows but I only need to generate a new quiz once-a-week.
So that it looks like this:
id Word POS Correct_Definition Distractor1 Distractor2 Distractor3
The online quiz generators Quizlet, Kahoot and Blooket are able to take a simple list like this and produce quizzes with 3 distractors. That’s what I’m trying to emulate.
The following works OK for adding 1 extra distractor definition:
SELECT t1.id, t1.Word, t1.Definition, t2.Definition
FROM aa_ngl_defs AS t1
LEFT JOIN aa_ngl_defs AS t2
ON t1.id <> t2.id -- Ensure we don't join the row with itself
ORDER BY RAND()
LIMIT 3;
But I don’t know how to add two more distractor definitions, and also my query took 17s which I suspect will multiply exponentially if I try to add more columns.
I tried to incorporate the following to improve performance
SELECT *
FROM aa_ngl_defs AS r1 JOIN
(SELECT (RAND() * (SELECT MAX(id) FROM aa_ngl_defs)) AS id
)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1;
… and this works very fast to select one row at random, but I am completely lost as to how incorporate this to add the three distractors. I think I need nested select statements or perhaps variables.
Obviously, since each row will have 3 distractors there will be some repeats of distractors across rows, but I can’t have repeated distractors within any single row. Randomness only needs to be reasonably random, not strictly so.
2
Answers
See if this works for you. You can move the random rows logic into a lateral join and then apply a row_number, followed by conditional aggregation to pivot into your required columns:
See this demo Fiddle
This dbfiddle sometimes picks a duplicate distractor
The reason for picking duplicates is known, the reason of not returning 17 rows is not known (to me ).