Given a sorted Alien dictionary, find order of characters

In an alien language, surprisingly, they also use English lowercase letters, but possibly in a different order. The order of the alphabet is some permutation of lowercase letters.

Given a sequence of words written in the alien language, find the order of the alphabet.

Problem statement:

GeeksforGeeks Problem Link

Example 1:

Input: words = ["hello","leetcode"], 
order = "hlabcdefgijkmnopqrstuvwxyz"
Explanation: As 'h' comes before 'l' in this language, then the order is sorted.

Example 2:

Input: words = ["baa", "abcd", "abca", "cab", "cad"], 
order = "bdacefghijklmnopqrstuvwxyz"
Explanation: As 'a' comes after 'b' and 'd' in this language, hence the order is sorted.

Approach using DAG

Create a graph using given dictionary of alphabets with no cycle ie Directed Acyclic Graph (DAG). And then a topological sort of the created graph, would give the order of characters.

A Topological sort of a Directed-Acyclic graph G is a linear ordering of all its vertices such that if G contains an edge E (u,v), then u appears before v in the ordering.

Pseudocode

1. Create a graph (DAG) with alphabets of the dictionary as the vertices of the graph.
2. Add an edge from alphabet 'a' to alphabet 'b', if 'a' appears before 'b' in the given dictionary.
3. Call DFS(G) for each vertex v(26 - since English alphabets are 26). 
4. As each vertex is visited and finished (recursion finishes), insert the vertex onto front of a stack.
5. Return the stack of vertices i.e. topological sorting of vertices or the required order of characters.

Code Implementation

//
//  main.cpp
//  Alien Dictionary
//
//  Created by Himanshu on 13/09/21.
//
#include <iostream>
#include <vector>
#include <stack>
using namespace std;
const int N = 26;
// Method to perform topological sorting
void topologicalSort (int x, vector<int> graph[], 
int vis[], stack<int> &st) {
    vis[x] = 1;
    for (int i=0; i<graph[x].size(); i++) {
        int j = graph[x][i];
        if (vis[j] == 0) {
            topologicalSort(j, graph, vis, st);
        }
    }
    st.push(x);
}
void createGraph (vector<int> graph[N], vector<string> words) {
    
    for (int i = 0; i < words.size()-1; i++) {
       string word1 = words[i], word2 = words[i+1];
       for (int j = 0; j < word1.size() && j < word2.size(); j++) {
           // Add an edge for every mismatched character from word1[j]
           // to word2[j]
           if (word1[j] != word2[j]) {
               graph[word1[j]-'a'].push_back(word2[j]-'a');
               break;
           }
       }
    }
}
int main() {
    stack<int> st;
    vector<string> words = {"baa", "abcd", 
"abca", "cab", "cad"};
    vector<int> graph[N];
    // Creating a graph (DAG) from words
    createGraph(graph, words);
    
    int vis[N];
    for (int i=0; i<N; i++) {
        vis[i] = 0;
    }
    
    // Passing stack st as pass by reference
    // to store the topological or actual ordering
    // of alphabets
    for (int i=N-1; i>=0; i--) {
        if (vis[i] == 0) {
            topologicalSort(i, graph, vis, st);
        }
    }
    // Printing actual order as found by topological
    // sort 
    cout<<"Order of characters:"<<endl;
    while(!st.empty()) {
        char ch = st.top() + 'a';
        cout<<ch;
        st.pop();
    }
    cout<<endl;
    return 0;
}

Here’s a working example:

https://ideone.com/7oyrRn

Time complexity of above solution is O(n + alphabet), where n is the number of words in the dictionary and alphabet is the number of unique characters in dictionary.

Practice problem

LeetCode Problem Link

Leave a Reply

Your email address will not be published.