This is a web scraping project in C#, and `responses` holds the HTTPS responses we get back. The project runs very slowly, so I was thinking of changing the foreach loop below to Parallel.ForEach.

var index = 0;

foreach (var response in responses) 
{
    if (index == 0) 
    {
        ParseHtmlVatan(response);
    }

    if (index == 1) 
    {
        ParseHtmlTrendyol(response);
    }

    if (index == 2) 
    {
        ParseHtmln11(response);
    }

    index++;
}

2 Answers


  1. You can accomplish this easily using Parallel.For (it’s a little less code than using Parallel.ForEach given your current code).

    Parallel.For(0, responses.Count, index =>
    {
        var response = responses[index];
        if (index == 0)
        {
            ParseHtmlVatan(response);
        }
        if (index == 1)
        {
            ParseHtmlTrendyol(response);
        }
        if (index == 2)
        {
            ParseHtmln11(response);
        }
    });
    

    Note that hard-coding the parsing method based on the index is pretty fragile. You might want to store some information in each response instance indicating which parser to use.
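
    One way to do that could look like the sketch below. It is only an illustration under assumptions: `ScrapeJob` and `ScrapeRunner` are hypothetical names that are not in your code, the response is assumed to be an HTML string, and the parsers are assumed to return a string (your real `ParseHtml*` methods may take and return something else).

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // Hypothetical type: one scraped page together with the parser that handles it.
    public sealed class ScrapeJob
    {
        public ScrapeJob(string html, Func<string, string> parse)
        {
            Html = html;
            Parse = parse;
        }

        public string Html { get; }
        public Func<string, string> Parse { get; }
    }

    public static class ScrapeRunner
    {
        // Runs every job's own parser in parallel; results keep the input order.
        public static string[] ParseAll(IReadOnlyList<ScrapeJob> jobs)
        {
            var results = new string[jobs.Count];
            // Each iteration writes only its own slot, so no locking is needed here.
            Parallel.For(0, jobs.Count, i => results[i] = jobs[i].Parse(jobs[i].Html));
            return results;
        }
    }
    ```

    With this shape the loop no longer cares about position: adding a fourth site means adding one more job rather than another `if`. The parsers themselves still have to be safe to run at the same time, so they should not write to shared state without synchronization.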

  2. Parallel.ForEach has an overload that passes the element's index (as a `long`) as the third argument to the lambda.

    Parallel.ForEach(responses, (response, _, index) =>
    {
        if (index == 0)
        {
            ParseHtmlVatan(response);
        }
        if (index == 1)
        {
            ParseHtmlTrendyol(response);
        }
        if (index == 2)
        {
            ParseHtmln11(response);
        }
    });
    

    Since it looks like you’ve got a hard-coded set of responses to start with, you could use Parallel.Invoke:

    Parallel.Invoke(
        () => ParseHtmlVatan(responses[0]),
        () => ParseHtmlTrendyol(responses[1]),
        () => ParseHtmln11(responses[2])
    );
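
    If the set of sites grows, the indexed Parallel.ForEach overload mentioned at the start of this answer can also be combined with a list of parser delegates instead of an `if` chain. This is a rough sketch, not your actual code: `IndexedForEachSketch` is a hypothetical name, and the responses and parser results are assumed to be strings.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public static class IndexedForEachSketch
    {
        // Applies parsers[i] to responses[i] in parallel and returns the results in input order.
        public static string[] ParseAll(IList<string> responses, IList<Func<string, string>> parsers)
        {
            if (responses.Count != parsers.Count)
                throw new ArgumentException("Need exactly one parser per response.");

            var results = new string[responses.Count];
            Parallel.ForEach(responses, (response, state, index) =>
            {
                // index arrives as a long; IList<T> indexers take an int, hence the cast.
                results[index] = parsers[(int)index](response);
            });
            return results;
        }
    }
    ```

    The second lambda parameter is the `ParallelLoopState`, which is unused here. Each iteration writes to a different array slot, so the results need no lock.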
    