This should help:
Select *
from table1 inner join
table2
on table1.in = table2.in
where table1.T = A and rand() < 1000.0/20000.0
order by rand()
limit 500
This will limit the result set to about 1000 random rows before extracting a random sample of 500. The purpose of getting more rows than expected is just to be sure that you get a large enough sample size.
Here is an alternative strategy, building off the "create your own indexes" approach.
Create a temporary table using the following query:
create temporary table results as
(Select *, @rn := @rn + 1 as rn
from table1 inner join
table2
on table1.in = table2.in cross join
(select @rn := 0) const
where table1.T = A
);
You now have a row number column. And, you can return the number of rows with:
select @rn;
Then you can generate the ids in your application.
I would be inclined to keep the processing in the database, using these two queries:
create temporary table results as
(Select *, @rn := @rn + 1 as rn, rand() as therand
from table1 inner join
table2
on table1.in = table2.in cross join
(select @rn := 0) const
where table1.T = A
);
select *
from results
where therand < 1000/@rn
order by therand
limit 500;
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…