What is a Trie?
Trie Data Structure is an efficient information reTrieval data structure. When using Trie Data Structure, the search complexities of code can be brought to optimal limits (key length). It is a tree of nodes that supports Search, Insertion, and Deletion operations. The Search operation returns the value of the key string, the Insert operation inserts a new string (the key) and value into the trie, and the Delete operation deletes the key-value pair from the trie. The major setback working with Trie Data Structure is on the storage requirements.
Today's Agenda
Algorithm
In
the Trie Data Structure, each node consists of multiple sub-nodes(branches). Each branch is used to represent a character of keys(words). We mark the last node of every key as the end of the word node, depicted by is_leaf. For example, the structure to represent all words of the English alphabets is :
When inserting a key in
the Trie Data Structure, each character of the input key is inserted as an individual Trie node. The children are a defaultdict() which stores all the next level trie nodes. When the input key is a new one or an extension of the existing key, we add non-existing nodes of the key and mark the end of the word as the last node; whereas in case the input key is a prefix then, we mark that last node as the end of the word.
For searching a key we compare the characters and move along the key nodes until we encounter the end of the string or no key in the Trie Data Structure.
Lexicographic sorting of a set of keys can be accomplished by building a trie from them, with the children of each node sorted lexicographically, and traversing it in pre-order, printing only the leaves' values. This algorithm is a form of radix sort.
The above image shows all the possible words formed by traversing through the Trie Tree. The traversal starts from the root and ends when we reach the last node of the trie for each branch. NULL is the root node and 'a', 'c', 'd', 'd', 'g' are the leaf nodes(in left to right format).
Applications of Trie Data Structure
- Trie Data Structure carries out the insert, search, and delete operations in O(L) time complexity, where "L" is the length of the key.
- It is faster than a Binary Search Tree (BST).
- Trie works faster than the Hashing function also as in Trie Data Structure there needs to be no hash function and no collision handling.
- The Prefix-Searching is a major advantage of the Trie Data Structure implementation.
- In order to print all the words in a lexicographically sorted manner, Trie Data Structure comes to the rescue as it can be done easily without any hashing required.
Let's Get Started
Input:
Enter Paragraph: In computer science, a data structure is a data organization, management, and storage for information that enables efficient access and modification. more precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.
Enter Prefix: s
Output:
['science', 'storage', 'structure']
Enter Prefix: m
Output:
['management', 'modification', 'more']
Enter Prefix: ‘’ or ‘ ’
Output:
[]
Note:
defaultdict is a container like dictionaries present in the module collections. It is a sub-class of the dict class that returns a dictionary-like object. We use defaultdict because it never raises a KeyError, instead provides a default value for the key that does not exists.
Comments
Post a Comment