Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I couldn't find any relevant topics so I'm posting this one:

How can I parallelize operations/calculations on a huge array? The arrays I use are 10000000x10 in size, which is small enough to process serially, but running the same code inside parfor causes an out-of-memory error.

The code goes:

function aggregatedRes = umbrellaFct(preparedInputsAsaCellArray)

% Description: function used to parallelize calculation 
% preparedInputsAsaCellArray - cell array with size of 1x10, for example first
% cell {1,1} would be: {array,corr,df}
% array - an array 1e7 by 10, with data from different regions to be aggregated 
% corr - correlation matrix 
% df - degrees of freedom as an integer value 

% create a function handle from child function 
fcnHndl = @childFct;

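% Preallocate one output cell per input so parfor can slice the results
n = numel(preparedInputsAsaCellArray);
output = cell(1, n);

% For each available cell - calculate and aggregate
parfor j = 1:n
    args = preparedInputsAsaCellArray{j};
    output{j} = fcnHndl(args{:});
end

% Extract results (each output{i} is a column vector)
aggregatedRes = zeros(numel(output{1}), n);
for i = 1:n
    aggregatedRes(:,i) = output{i};
end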
end

And the child function used in the umbrella function:

function aggregated = childFct(array, corr, df)
% Description: 
% array - an array 1e7 by 10, with data from different regions to be aggregated 
% corr - correlation matrix 
% df - degrees of freedom as an integer value 

% get num of cases for multivariate nums
cases = size(array, 1);

% preallocate space 
corrMatrix = zeros(cases, size(corr, 1));
u = corrMatrix; 
sorted = corrMatrix;
s = zeros(size(array));

% calc multivariate nums
u = mvtrnd(corr, df, cases);

clear corr cases

% calc Student's t cumulative dist 
u = tcdf(u, df);

clear df

% double sort 
[~, sorted] = sort(u);
clear u 
[~, corrMatrix] = sort(sorted);
clear sorted 

for jj = 1:size(array, 2)
    s(:,jj) = array(corrMatrix(:,jj),jj);
end

clear array corrMatrix jj

aggregated = sum(s,2);
end
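
The "double sort" step in childFct may be easier to follow on a tiny example. The sketch below uses a made-up four-element column (the values and the names order, ranks, col, and reordered are illustrative and not part of the original code) to show that sorting the index vector returned by the first sort yields the rank of each element, which is then used to reorder a data column.

```matlab
% Toy illustration of the double sort: the second sort returns ranks.
u = [0.7; 0.1; 0.9; 0.4];        % hypothetical column of uniform values

[~, order] = sort(u);            % order(k) = index of the k-th smallest value
[~, ranks] = sort(order);        % ranks(i) = rank of u(i) within the column

disp(ranks.');                   % prints 3 1 4 2

% Same indexing pattern as s(:,jj) = array(corrMatrix(:,jj), jj)
col = [10; 20; 30; 40];          % hypothetical data column
reordered = col(ranks);
disp(reordered.');               % prints 30 10 40 20
```

In childFct the same pattern is applied to each of the 10 columns at once, because sort on a matrix operates column by column.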

I have already tried using distributed memory, but that ultimately failed. I would appreciate any help or hints!

Edit: The logic behind the functions is to calculate and aggregate data from different regions. In total there are ten arrays, each of size 1e7x10. My idea was to use parfor to calculate and aggregate them simultaneously, to save time. It works fine for smaller arrays (like 1e6x10) but runs out of memory for 1e7x10 (when the pool has more than 2 workers). I suspect the way I used and implemented parfor could be wrong and inefficient.
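
To put rough numbers on the sizes mentioned above, here is a back-of-the-envelope estimate. It assumes double precision (8 bytes per element). The values of liveArraysPerWorker and numWorkers are hypothetical placeholders, not measurements, and the estimate ignores MATLAB's own overhead and any temporaries created inside mvtrnd, tcdf, or sort.

```matlab
% Rough memory estimate for the arrays described in the question.
rows = 1e7;
cols = 10;
bytesPerElement = 8;                         % assumption: double precision

gbPerArray = rows * cols * bytesPerElement / 1e9;   % 0.8 GB per 1e7x10 array

numInputs = 10;                              % ten input arrays (from the question)
clientGB = numInputs * gbPerArray;           % about 8 GB held on the client

liveArraysPerWorker = 4;                     % hypothetical: array, u/sorted, corrMatrix, s
numWorkers = 4;                              % hypothetical pool size
workersGB = gbPerArray * liveArraysPerWorker * numWorkers;

fprintf('Per array: %.1f GB\n', gbPerArray);
fprintf('All inputs on the client: %.1f GB\n', clientGB);
fprintf('Workers combined (assumed): %.1f GB\n', workersGB);
```

Under these assumptions, the worker-side total grows linearly with the number of workers, which would be consistent with the code running with 2 workers and failing with more.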



1 Answer

Waiting for an expert to answer.
