Given a sorted Alien dictionary, find order of characters

Problem

In an alien language, surprisingly, they also use English lowercase letters, but possibly in a different order. The order of the alphabet is some permutation of lowercase letters.
Given a sequence of words written in the alien language, find the order of the alphabet.

GeeksforGeeks Problem Link

Example 1:

Input: words = ["hello","leetcode"], 
order = "hlabcdefgijkmnopqrstuvwxyz"
Explanation: As 'h' comes before 'l' in this language, then the order is sorted.

Example 2:

Input: words = ["baa", "abcd", "abca", "cab", "cad"], 
order = "bdacefghijklmnopqrstuvwxyz"
Explanation: As 'a' comes after 'b' and 'd' in this language, hence the order is sorted.
Approach using DAG (Directed-Acyclic Graph)

Create a graph using given dictionary of alphabets with no cycle ie Directed Acyclic Graph (DAG). And then a topological sort of the created graph, would give the order of characters.

A Topological sort of a Directed-Acyclic graph G is a linear ordering of all its vertices such that if G contains an edge E (u, v), then u appears before v in the ordering.

Pseudocode
1. Create a graph (DAG) with alphabets of the dictionary as the vertices of the graph.
2. Add an edge from alphabet 'a' to alphabet 'b', if 'a' appears before 'b' in the given dictionary.
3. Call DFS(G) for each vertex v (26, since English alphabets are 26 in count).
4. As each vertex is visited and finished (recursion finishes), insert the vertex onto front of a stack.
5. Return the stack of vertices i.e. topological sorting of vertices or the required order of characters.

Code Implementation

//
//  main.cpp
//  Alien Dictionary
//
//  Created by Himanshu on 13/09/21.
//
#include <iostream>
#include <vector>
#include <stack>
using namespace std;
const int N = 26;

// Method to perform topological sorting
void topologicalSort (int x, vector<int> graph[], 
int vis[], stack<int> &st) {
    vis[x] = 1;

    for (int i=0; i<graph[x].size(); i++) {
        int j = graph[x][i];
        if (vis[j] == 0) {
            topologicalSort(j, graph, vis, st);
        }
    }

    st.push(x);
}

void createGraph (vector<int> graph[N], vector<string> words) {
    
    for (int i = 0; i < words.size()-1; i++) {
       string word1 = words[i], word2 = words[i+1];

       for (int j = 0; j < word1.size() && j < word2.size(); j++) {
           // Add an edge for every mismatched character from word1[j]
           // to word2[j]

           if (word1[j] != word2[j]) {
               graph[word1[j]-'a'].push_back(word2[j]-'a');
               break;
           }

       }
    }
}

int main() {
    stack<int> st;
    vector<string> words = {"baa", "abcd", "abca", "cab", "cad"};

    vector<int> graph[N];

    // Creating a graph (DAG) from words
    createGraph(graph, words);
    
    int vis[N];
    for (int i=0; i<N; i++) {
        vis[i] = 0;
    }
    
    // Passing stack st as pass by reference
    // to store the topological or actual ordering
    // of alphabets
    for (int i=N-1; i>=0; i--) {
        if (vis[i] == 0) {
            topologicalSort(i, graph, vis, st);
        }
    }

    // Printing actual order as found by topological sort 
    cout<<"Order of characters:"<<endl;
    while(!st.empty()) {
        char ch = st.top() + 'a';
        cout<<ch;
        st.pop();
    }

    cout<<endl;
    return 0;
}

Output

Order of characters:
bdacefghijklmnopqrstuvwxyz

Here’s a working example: Alien dictionary solution

Time complexity of above solution is O(n + alphabet), where n is the number of words in the dictionary and alphabet is the number of unique characters in dictionary.

Practice problem
LeetCode Problem Link

Leave a Reply

Your email address will not be published. Required fields are marked *