Implement strstr() to Find a Substring in a String

Write C code to implement the strstr (Search for a substring) function. Do not use any system library such as strlen.


The strstr function locates a specified substring within a source string. Strstr returns a pointer to the first occurrence of the substring in the source. If the substring is not found, strstr returns a null pointer.

As expected, this question is very popular among technical interviews. This question tests your ability in manipulating strings using pointers, as well as your coding ability.

Of course, you can demonstrate to your interviewer that this problem can be solved using known efficient algorithms such as Rabin-Karp algorithm, KMP algorithm, and the Boyer-Moore algorithm. Since these algorithms are usually studied in advanced algorithms class, for an interview it is sufficient to solve it using the most direct method — The brute force method.

For non-C programmers who are not familiar with C-style strings, your first reaction is to obtain the length of the string. But remember, use of system library is not allowed. If you are not familiar with pointer manipulation, it’s time to brush up your pointer skills.

The brute force method is straightforward to implement. We scan the substring (the target) with the string from its first position and start matching all subsequent letters one by one. If one of the letter doesn’t match, we start over again with the next position in the string. Assume that N = length of string, M = length of substring, then the complexity is O(N*M).

Think you’ve got it all in your head? Try writing the code down on a piece of paper. Remember to test for special cases.

Did you handle the special case correctly? That is, when the target substring is an empty string. What should you return in this case? The correct implementation of strstr should return the full string in this case.

The above solution is usually good enough for an interview. But it turns out we are able to improve the code a little further. Note that the above code iterates at most N times in the outer loop.

The improvement is based on one observation: We only need to iterate at most N-M+1 times in the outer loop. Why? Because after looping more than N-M+1 times, the string has less than M characters to be matched with the substring. In this case, we know no more substring will be found in the string, and therefore saving us from additional comparisons (which might be substantial especially when M is large compared to N.)

How do we loop for only N-M+1 times? We are able to calculate the size of the string and substring by iterating a total of N+M times. In fact, finding just the length of the substring, M is sufficient. We use an additional pointer, p1Adv and advance it M-1 times ahead of the string pointer. Therefore, when p1Adv points to ‘\0′, we know that it has iterated N-M+1 times.

VN:F [1.9.22_1171]
Rating: 4.5/5 (44 votes cast)
Implement strstr() to Find a Substring in a String, 4.5 out of 5 based on 44 ratings

38 thoughts on “Implement strstr() to Find a Substring in a String

  1. 1337c0d3r

    No, because this line:
    p1 = p1Begin + 1;

    is advancing p1 by one step forward in each iteration.

    Eventually p1 *p1 will be equal to '\0', which will exit the loop.

    VA:F [1.9.22_1171]
    +1
  2. johny

    Hi, 1337:

    Good optimization on the algorithm.

    After seeing your first solution, I was also thinking that there is room for improvement, based on the same thought about the difference of the two strings. But it is easier thought of than done, you know, esp. when it involves considerable pointer’s manipulation.

    By the way, I have a confusion here. Why shall the function return the source string’s beginning location when the target string is of 0-length?

    VA:F [1.9.22_1171]
    0
  3. weizhi

    O(n) solution where n is the size of str.

    char* StrStr(const char *str, const char *target) {
    if (*target == NULL)
    return (char*)str;
    char *p1 = (char*) str;
    char *p2 = (char*) target;
    char *p1Begin = p1;
    while (*p1) {
    if (*p1 == *p2) {
    if (p2 == (char*)target) {
    p1Begin = p1;
    }
    p2++;
    } else {
    p2 = (char*) target;
    }
    if (!*p2)
    return p1Begin;
    p1++;
    }
    return NULL;
    }

    VA:F [1.9.22_1171]
    -1
  4. codingC

    You dont need to check whether p1 reaches the end of the string. Your solution assumes that the string is longer than the substring, and you use p1Adv to specify the bound. So checking *p1 is redundant.

    VA:F [1.9.22_1171]
    +1
  5. Pingback: Implement strstr() to Find a Substring in a String | 口├人人│의 Blog

  6. moonus

    I think the improved code has a bug. Like floor #7 says, if size of “str” is less than size of “target”, the code will not work. It only works under the assumption that “str” has a larger size than “target”.

    VA:F [1.9.22_1171]
    +1
  7. Hai

    I have a solution of O(n), very simple code, tried works:

    const char *mystrstr(const char* str, const char* substr)
    {
    if(!substr || !str) return NULL;

    const char *p_str = str, *s_str = substr, *matchHead = NULL;

    while(*p_str != ” && *s_str!= ”)
    {
    if(*p_str == *s_str)
    {
    if(s_str == substr) matchHead = p_str;
    p_str++;
    s_str++;
    }
    else
    {
    p_str++;
    s_str = substr;
    matchHead = NULL;
    }
    }

    if(*p_str == ” && *s_str !=”)
    return NULL;
    else
    return matchHead;
    }

    VA:F [1.9.22_1171]
    -4
    1. samkit jain

      You are right. My code is similar to yours.

      char * my_strstr(char *src , char *target){
      if(src == NULL && target == NULL)
      return NULL;
      else{
      char *temp_target = target;
      int found = 0;
      int firstMatchedChar = 1;
      char *substrptr = NULL;

      while(*src !=”){
      if(*src == *temp_target){
      if(firstMatchedChar){
      substrptr = src;
      printf(“1st Match is at : %c\n”,*substrptr);
      firstMatchedChar = 0;
      }
      temp_target++;
      if(*temp_target == ”){
      found = 1;
      printf(“Breaking . String found..\n”);
      break;
      }
      }
      else{
      printf(“No match at %c . Resetting pointers\n”, *src);
      temp_target = target;
      firstMatchedChar = 1;
      substrptr = NULL;
      }
      src++;
      }
      if(found){
      printf(“Match found at %s\n”,substrptr);
      return substrptr;
      }
      else{
      printf(“No match found\n”);
      return NULL;
      }
      }
      }

      VA:F [1.9.22_1171]
      0
  8. sampsun

    char * strstr (const char * str, const char* substr){
    const char * p1= str;
    const char * p2= substr;
    const char * pos = str;
    if (!*p1 && !*p2) return true;
    while (p1&&p2){
    if (*p1==*p2) p1++; p2++;
    else {
    pos++;p1=pos;p2=substr;
    }
    }
    if (p1 || !*p1&&!*p2) return true;
    return false;
    }

    VA:F [1.9.22_1171]
    +2
  9. Hemesh Thakkar

    This is the most optimized code

    char* strstr(char* haystack, char* needle) {
    for (;; ++haystack) {
    char* h = haystack;
    for (char* n = needle;; ++n, ++h) {
    if (!*n) return haystack;
    if (*h != *n) break;
    }
    if (!*h) return NULL;
    }
    }

    VA:F [1.9.22_1171]
    0
  10. tolik

    I might be misunderstood something, but what happened in the second version when

    is longer than

    ?

    VA:F [1.9.22_1171]
    0
  11. Profile photo of Sha HuaSha Hua

    A small improvement. When you are comparing each character of the two strings via:
    while (*p1 && *p2 && *p1 == *p2) {
    p1++;
    p2++;
    }
    you can record the next character that equals *target. Then if this comparison doesn’t give a match, you can start directly from the next location instead of doing
    p1 = p1Begin + 1

    VN:F [1.9.22_1171]
    0
  12. Grace

    There might be 2 small issues for the improved version.

    1. str and target are the same string.
    2. target is longer than str.

    the fixes will be as below:
    while (*p2++)
    {
    if (!*p1Adv) return NULL;
    p1Adv++;
    }
    p1Adv–;

    VA:F [1.9.22_1171]
    +2
  13. rzv

    What about this O(n)

    VA:F [1.9.22_1171]
    0
  14. sb

    can nebody plz help..i have a log sheet of say 2000 lines.i have to compare a substring with each line.So,first i have to take each line as a string.Then compare the substring with each string.Suppose i got 50 such strings which contain the substring.Now i need only those strings which have the substring at location 5.

    VA:F [1.9.22_1171]
    0
  15. mohit gupta

    i have write the below code in C language please tell me is this code correct or not or what modification required please send me your opinion on gmohit5877@gmail.com

    #include
    #include
    void main()
    {
    int i=0,j;
    char *st,*sub;
    clrscr();
    printf(“Enter the main String \n”);
    scanf(“%s”,st);
    printf(“\n enter the substring : \n”);
    scanf(“%s”,sub);

    while(st[i]!=”)
    {
    j=0;
    while(sub[j]!=”)
    {
    if(st[i+j]==sub[j])
    j++;
    else
    break;
    }
    if(sub[j]==”)
    break;
    i++;
    }
    if(sub[j]==”)
    printf(“the substring is at position %d “,i);
    getch();
    }

    thank you

    VA:F [1.9.22_1171]
    0
  16. NIK

    VA:F [1.9.22_1171]
    +1
  17. Amith KUmar

    simplest way to write a strstr();
    char *mystrstr(char *main,char *sub)
    {
    char *p,*q;/*to store the base address*/
    p =main;
    q = sub;
    while (*main){
    while (*sub){
    if (*main == *sub){
    main++; sub++;
    }
    else
    break;
    }
    if (*sub == ”)
    return p;

    main++;
    p = main;/*reassigning the current value*/
    sub = q;/*setting the string again to first position*/
    }
    }

    VA:F [1.9.22_1171]
    0
    1. dalewmoore

      I suspect that Amith KUmar forgot a return NULL near the bottom.

      I suspect that Amith Kumar forgot that if the needle is the empty string
      then the main string or haystack should be returned.

      Here is a slightly more correct impementation to Amith’s ‘simplest way’.

      Please forgive me for swapping the use of the parameter
      variables and local variables from the way that Amith originally
      had them.

      With much code movement and optimization, you can
      end up with a very similar and smaller algorithm I have posted below.

      VA:F [1.9.22_1171]
      0
  18. Pingback: [Leetcode]Implement strStr() | Wei's blog

  19. Dale Moore

    VA:F [1.9.22_1171]
    0
    1. Dale Moore

      I suppose I could do it without the ‘else’

      VA:F [1.9.22_1171]
      0
    2. Dale Moore

      Recursion for Recursion’s sake

      VA:F [1.9.22_1171]
      0
      1. Dale Moore

        Darn. Got it wrong. This time for sure.
        char * strstr(const char * s1, const char *s2)
        {
        if (!*s2) return (char *)s1;
        if (!*s1) return NULL;
        if ((*s1 == *s2) &&
        (strstr(s1+1, s2+1) == *s1 + 1)) return (char *)s1;
        return strstr(s1+1, s2);
        }

        VA:F [1.9.22_1171]
        0
  20. Pingback: LeetCode | Technical interview solutions

  21. XIaohui Liu

    Simpler N – M + 1.

    VA:F [1.9.22_1171]
    0
  22. sfbay

    VA:F [1.9.22_1171]
    0
  23. Noob

    This is my implementation . When I execute it, it shows Time Limit Exceeded. Please help me with it.

    VA:F [1.9.22_1171]
    0

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use the <code> tag to embed your code.