Day 8/100 100 days of Code

Info Hunter


AKA Chris is a software developer from Athens, Greece. He started programming in BASIC when he was very young. He lost interest in programming during his school years, but after an unsuccessful career in audio he decided to focus on what he really loves, which is technology.

He loves working with older languages like C and wants to start programming electronics and microcontrollers because he wants to get into embedded systems programming.

I created a new function to include the code that I wanted to run in another thread.

void MainFrame::StartScraping(int amount, int counter, 
                              std::vector<std::string> keywords,
                              std::vector<std::string> getUrls)
{
    std::vector<std::string> scraperKeywords;
    for (int j = 0; j < amount; j++)
    {
        scraperKeywords.push_back(keywords[j]);
    }

    Scraper scraper;
    scraper.SetupScraper(scraperKeywords, getUrls[counter]);
    AnalyzePages pageAnalyzer;

    // Get info from website
    cpr::Response r = scraper.request_info(scraper.baseURL);

//    std::cout << r.text << std::endl;

    // Parse it
    std::vector<std::string> urls = scraper.ParseContent(r.text,
                                                         (char *) "href",
                                                         (char *) "/");

    // Iterate through them
    for (const std::string &item: urls) {
        std::cout << item << std::endl;
        pageAnalyzer.analyzeEntry(item, scraperKeywords, scraper);
    }
}

Then I added the following code to my program to run the function in a new thread:

        std::thread t(StartScraping,amount, counter, getSettingsKeywords, getUrls);

        if (t.joinable())
        {
            t.detach();
        }
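
For comparison, here is a minimal sketch of starting a member function on a new thread and waiting for it with join() instead of detaching it. The Worker class, its Run, Start, Wait and Processed members, and the keyword handling are all hypothetical stand-ins rather than code from the project. It also assumes the worker is a non-static member function, which is why a pointer to the member and this are passed to std::thread.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for MainFrame; not the project's real class.
class Worker {
public:
    ~Worker() { Wait(); }  // never destroy a std::thread that is still joinable

    // Hypothetical stand-in for StartScraping: keeps the first `amount` keywords.
    void Run(int amount, std::vector<std::string> keywords)
    {
        for (int j = 0; j < amount && j < static_cast<int>(keywords.size()); j++) {
            processed_.push_back(keywords[j]);
        }
    }

    // std::thread copies its arguments, so the new thread owns its own vector.
    void Start(int amount, const std::vector<std::string> &keywords)
    {
        thread_ = std::thread(&Worker::Run, this, amount, keywords);
    }

    // Block until the worker has finished.
    void Wait()
    {
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    // Only safe to read after Wait() has returned.
    const std::vector<std::string> &Processed() const { return processed_; }

private:
    std::thread thread_;
    std::vector<std::string> processed_;
};
```

Joining keeps the thread's lifetime tied to the object that started it. A detached thread keeps running on its own, so anything it touches has to stay alive until it finishes.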

But once the function ran in its own thread, the following line crashed with a bad access error:

 lxb_char_t html[content.length() + 1];

After experimenting with it, I discovered the problem was the array size, content.length() + 1. Replacing it with a fixed number as the length of the lxb_char_t array makes the crash go away, but the value that should go there varies with the page being parsed.

I had a lot of trouble finding a solution. In the end I found out that allocating the buffer dynamically with new, instead of declaring it as a local array, was the answer. The following code fixed the problem.

 lxb_char_t *html = new lxb_char_t[content.size() + 1];

After making sure that this part was ok, I found out that a similar issue appeared somewhere else. Time to go there and fix! Now that I know the solution, it shouldn't take long.

The question remains though. What caused this issue? I was unable to find a good result googling around that might explain it. If anyone knows anything I would love to know.
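
A plausible explanation, stated here as an assumption rather than a confirmed diagnosis: lxb_char_t html[content.length() + 1] is a variable-length array, which C++ compilers accept only as an extension and which lives on the stack. Threads other than the main one often get a much smaller default stack (the exact size depends on the platform), so a page that fits on the main thread's stack can overflow the new thread's stack. A buffer created with new lives on the heap, which would explain why that change fixed the crash. The sketch below does the same thing with std::vector, which frees the memory automatically, whereas the new[] version needs a matching delete[]. MakeHtmlBuffer and BufferSizeOnWorkerThread are hypothetical helpers, and lxb_char_t is assumed to be an alias for unsigned char.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Assumption: stand-in for the parser library's character type.
using lxb_char_t = unsigned char;

// Hypothetical helper: copy the page into a heap-backed, null-terminated buffer.
std::vector<lxb_char_t> MakeHtmlBuffer(const std::string &content)
{
    std::vector<lxb_char_t> html(content.begin(), content.end());
    html.push_back('\0');
    return html;
}

// Hypothetical check: build the buffer inside a worker thread and report its size.
// Only the small vector object sits on the thread's stack; the bytes are on the heap.
std::size_t BufferSizeOnWorkerThread(const std::string &content)
{
    std::size_t size = 0;
    std::thread t([&content, &size] {
        std::vector<lxb_char_t> html = MakeHtmlBuffer(content);
        size = html.size();
    });
    t.join();  // the thread is done before `size` is read
    return size;
}
```

With this approach html.data() gives the lxb_char_t pointer to hand to the parser, and the buffer is released when the vector goes out of scope.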

100 Days of Code

Part 8 of 50

100 days of code is a good initiative to go into hard mode and spend more time in programming. These 100 days will be focused on completing projects and research.
