I'm trying to fill a file of enormous size (>1 GB) with random data.
I've written a simple "thread-safe random" that generates strings (the approach was suggested at https://devblogs.microsoft.com/pfxteam/getting-random-numbers-in-a-thread-safe-way/), and reworking it to produce random strings is trivial.
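For reference, here is a sketch of what my generator looks like, following the pattern from the linked article: one Random per thread, each seeded from a single global Random under a lock. The string-producing Next, its alphabet, and the default length are my own trivial rework and are illustrative only.

```csharp
using System;
using System.Threading;

public static class ThreadSafeRandom
{
    // Global instance used only to produce seeds, always under a lock.
    private static readonly Random Global = new Random();

    // Each thread lazily gets its own Random, seeded from Global.
    private static readonly ThreadLocal<Random> Local = new ThreadLocal<Random>(() =>
    {
        int seed;
        lock (Global) seed = Global.Next();
        return new Random(seed);
    });

    // Returns a random alphanumeric string (my rework of the int version).
    public static string Next(int length = 32)
    {
        const string Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        Random rnd = Local.Value;
        var buffer = new char[length];
        for (int i = 0; i < length; i++)
            buffer[i] = Chars[rnd.Next(Chars.Length)];
        return new string(buffer);
    }
}
```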
I'm trying to write the strings to a file with this code:
String rp;
Parallel.For(1, numlines - 1, i =>
{
    rp = ThreadSafeRandom.Next();
    outputFile.WriteLineAsync(rp.ToString()).Wait();
});
When the line count is small, the file is generated perfectly.
When I enter a bigger number of lines (say 30,000), the following happens:
some strings are corrupted (Notepad++ shows them prepended with lots of NUL characters);
at some point I get an InvalidOperationException ("Thread is used by previous thread operation").
I tried making it Parallel.For(1, numlines - 1, async i =>
with await outputFile.WriteLineAsync(rp.ToString());
and I also tried
lock (outputFile)
{
    outputFile.WriteLineAsync(rp.ToString());
}
I can always use the single-threaded approach with a simple for loop and WriteLine(), but as I've said, I want to generate big files. I assume that even a simple loop producing >10,000 records can take some time, and a big file will have 1e6 or even 1e9 records, which is more than 20 GB, so I cannot think of an optimal approach.
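The single-threaded baseline I mean is just this; the output path and line count are placeholders for my actual values, and random.Next() stands in for my random-string call:

```csharp
using System;
using System.IO;

// Plain sequential loop: one writer, no sharing, no races.
var random = new Random();
int numlines = 30000; // placeholder line count
using (var outputFile = new StreamWriter("random.txt"))
{
    for (int i = 1; i < numlines - 1; i++)
        outputFile.WriteLine(random.Next());
}
```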
Can someone suggest how to optimize this?