LeetCode 30. Substring with Concatenation of All Words

September 01, 2022

A super naive solution is to build all permutations of wordswords and then look for those substrings in ss. That would be O(slen(words)2)O(s*len(words)^2).

It's interesting and probably useful that the length of every wordword is the same. That makes it possible to, at a given index in ss, quickly check the following substring of length wordword against all possible wordswords as a hashmap lookup.

This is beginning to look like a DP / backtracking solution.

Note: it's possible to have concatenated substringsconcatenated\ substrings that overlap.

Solution:

For each index in ss, see if it is the beginning of a concatenated substringconcatenated\ substring. Do this with a hashmap of wordword:<number needed>\lt number\ needed \gt. At each step, we can find a concatenated substringconcatenated\ substring in O(len(words))O(len(words)) time. If we exhaust the hashmap of needed words, we've found a concatenated substringconcatenated\ substring. If we run into a substring that we don't recognize, we can short circuit and move to the next index in ss.

The runtime of this algorithm is O(len(s)len(words))O(len(s) * len(words)) and the space complexity is O(len(words)))O(len(words))).

import unittest
from collections import Counter
from typing import List

class Test(unittest.TestCase):

    def setUp(self):
        self.s = Solution()

    def test_leetcode_examples(self):
        self.assertEqual(self.s.findSubstring("barfoothefoobarman", ["foo","bar"]), [0,9])
        self.assertEqual(self.s.findSubstring("wordgoodgoodgoodbestword", ["word","good","best","word"]), [])
        self.assertEqual(self.s.findSubstring("barfoofoobarthefoobarman", ["foo","bar", "the"]), [6,9,12])
        self.assertEqual(self.s.findSubstring("wordgoodgoodgoodbestword", ["word","good","best","good"]), [8])

class Solution:

    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        res = []
        if not words:
            return res
        wlen = len(words[0])
        for i in range(len(s)):
            if len(s) - i < len(words)*wlen:
                return res
            # Check if concatenated substring
            wcount = Counter(words)
            for j in range(i, len(s)-wlen+1, wlen):
                w = s[j:j+wlen]
                if w not in wcount:
                    break
                wcount[w] -= 1
                if wcount[w] == 0:
                    del wcount[w]
            if not wcount:
                res.append(i)
        return res

if __name__=="__main__":
    unittest.main()