Find Subdomain Visit Count

A website domain "write.geeksforgeeks.org" consists of various subdomains. At the top level, we have "org", at the next level, we have "geeksforgeeks.org" and at the lowest level, "write.geeksforgeeks.org". When we visit a domain like "write.geeksforgeeks.org", we will also visit the parent domains "geeksforgeeks.org" and "org".

A count-paired domain is a domain that has one of the two formats "cnt d1.d2.d3" or "cnt d1.d2" where cnt is the number of visits to the domain and d1.d2.d3 is the domain itself. For example, "1000 write.geeksforgeeks.org" is a count-paired domain that indicates that write.geeksforgeeks.org was visited 1000 times.

Given an array of count-paired domains cpdomains[], return an array of the count-paired domains of each subdomain in the input. Return the answer in any order.

Examples:

Input: cpdomains = ["1000 geeksforgeeks.org"]
Output: ["1000 geeksforgeeks.org", "1000 org"]
Explanation: We only have one website domain, "geeksforgeeks.org". Visiting it also visits its parent domain "org", so both are visited 1000 times.

Input: cpdomains = ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
Output: ["901 mail.com","50 yahoo.com","900 google.mail.com","5 wiki.org","5 org","1 intel.mail.com","951 com"]
Explanation: We will visit "google.mail.com" 900 times, "yahoo.com" 50 times, "intel.mail.com" once and "wiki.org" 5 times. For the subdomains, we will visit "mail.com" 900 + 1 = 901 times, "com" 900 + 50 + 1 = 951 times, and "org" 5 times.

Approach: To solve the problem, follow the idea below:

The problem can be solved using hashing. Use a hashmap to store the visit count of every domain and subdomain. Iterate over the count-paired domains; for each one, read the count as the substring before the first space and the domain as the substring after it. Then add the count to the map entry for the domain and for each of its parent domains.

Step-by-step algorithm:

Use a hashmap to count the frequency of all the domains and subdomains.
For each count-paired domain,
- Find the first space in the count-paired domain.
- Extract the count cnt as all the characters in count-paired domain before the first space.
- Extract the domain as all the characters after the first space and increment its frequency by cnt.
- For every '.' in the domain, take the suffix after it as a subdomain and increment its frequency by cnt.
After traversing all the count-paired domains, construct the resultant array of "count domain" strings from the hashmap and return it.
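The two extraction steps above can be sketched in isolation. This is a minimal illustration, assuming every entry is well formed ("cnt domain" with exactly one space); the helper names splitEntry and domainChain are hypothetical and are not part of the implementation below.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split "cnt domain" at the first space.
// Assumes the entry is well formed, i.e. it contains a space.
std::pair<int, std::string> splitEntry(const std::string& entry)
{
    std::size_t firstSpace = entry.find(' ');
    assert(firstSpace != std::string::npos);
    int cnt = std::stoi(entry.substr(0, firstSpace));
    return { cnt, entry.substr(firstSpace + 1) };
}

// Hypothetical helper: the domain itself, followed by the suffix
// after each '.', from the most specific to the top level.
std::vector<std::string> domainChain(const std::string& domain)
{
    std::vector<std::string> chain{ domain };
    for (std::size_t i = 0; i < domain.size(); ++i) {
        if (domain[i] == '.')
            chain.push_back(domain.substr(i + 1));
    }
    return chain;
}
```

For the entry "900 google.mail.com", splitEntry yields the count 900 and the domain "google.mail.com", and domainChain on that domain yields "google.mail.com", "mail.com" and "com". Each of these three keys then has 900 added to its frequency.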

Below is the implementation of the algorithm:

C++

#include <bits/stdc++.h>
using namespace std;

// function to count the frequency of domains and subdomains
vector<string> subdomainVisits(vector<string>& cpdomains)
{
    // vector to store the result
    vector<string> res;

    // map to store the frequency of domains and subdomains
    map<string, int> freq;

    for (const string& str : cpdomains) {
        // Extract the count from the string str
        size_t firstSpace = str.find(' ');
        int cnt = stoi(str.substr(0, firstSpace));

        // Extract the domain from the string str
        string domain = str.substr(firstSpace + 1);

        freq[domain] += cnt;
        for (size_t i = 0; i < domain.length(); i++) {
            // Extract each subdomain: the suffix after a '.'
            if (domain[i] == '.') {
                string subdomain = domain.substr(i + 1);
                freq[subdomain] += cnt;
            }
        }
    }
    // Construct the resultant array of count-paired domains
    for (const auto& ele : freq) {
        string str
            = to_string(ele.second) + " " + ele.first;
        res.push_back(str);
    }
    return res;
}

int main()
{
    // Sample Input
    vector<string> cpdomains
        = { "900 google.mail.com", "50 yahoo.com",
            "1 intel.mail.com", "5 wiki.org" };

    // Function call
    vector<string> res = subdomainVisits(cpdomains);
    for (string str : res)
        cout << str << "\n";

    return 0;
}

Output

951 com
900 google.mail.com
1 intel.mail.com
901 mail.com
5 org
5 wiki.org
50 yahoo.com

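Time Complexity: O(N * logN), where N is the number of count-paired domains in the input. Each entry adds at most three keys (the domain and its parent domains) to the ordered map, and each map update takes O(logN) string comparisons.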
Auxiliary Space: O(N)
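
The implementation above stores the counts in std::map, which is an ordered tree and is the source of the logN factor. The approach itself only asks for a hashmap, so a variant with std::unordered_map is possible. The following is a minimal sketch of that variant, assuming well-formed input; the function name subdomainVisitsHashed is hypothetical. Updates take O(1) time on average, and the order of the returned strings is unspecified, which the problem allows.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hash-based variant of subdomainVisits.
// Assumes every entry has the form "cnt domain".
std::vector<std::string>
subdomainVisitsHashed(const std::vector<std::string>& cpdomains)
{
    std::unordered_map<std::string, int> freq;

    for (const std::string& entry : cpdomains) {
        std::size_t firstSpace = entry.find(' ');
        assert(firstSpace != std::string::npos);
        int cnt = std::stoi(entry.substr(0, firstSpace));
        std::string domain = entry.substr(firstSpace + 1);

        // Count the full domain, then every suffix after a '.'
        freq[domain] += cnt;
        for (std::size_t i = 0; i < domain.size(); ++i) {
            if (domain[i] == '.')
                freq[domain.substr(i + 1)] += cnt;
        }
    }

    // Build the "count domain" strings; iteration order is unspecified
    std::vector<std::string> res;
    res.reserve(freq.size());
    for (const auto& kv : freq)
        res.push_back(std::to_string(kv.second) + " " + kv.first);
    return res;
}
```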
