Implementing a String split Method That Can Handle Multiple Spaces in a Sentence

In this article, we will incrementally build a method that splits a sentence into words. Ruby's built-in split method, when called without arguments, treats a run of spaces as a single delimiter, so the extra spaces are discarded. We want our implementation to retain those spaces as part of the words. We will work through a series of drills to eventually solve the given problem.
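
As a quick illustration of the behavior described above, the sketch below compares split with no arguments against split on a single-space regular expression. The sample string and variable names are chosen only for this example. Neither call keeps the extra spaces attached to the words.

```ruby
sentence = 'apple   sauce' # three spaces between the words

# With no argument, split treats a run of whitespace as one delimiter.
default_split = sentence.split

# Splitting on a single space yields an empty string for each extra space,
# but the spaces themselves are still gone.
regex_split = sentence.split(/ /)

p default_split # => ["apple", "sauce"]
p regex_split   # => ["apple", "", "", "sauce"]
```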

Drill 1

Given an input string: apple sauce, split this into two words:

s = 'apple sauce'
puts s.split

This prints:

apple
sauce

Intent: How does Ruby split work?

Drill 2

Print all characters in the sentence.

sentence = 'apple sauce'
characters = sentence.chars
characters.each { |c| p c }

Intent: How to create a sequence of characters from a string?
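
To make the answer concrete, here is a small sketch showing what chars returns. The sample string is an assumption for illustration. Note that the space is an element of the array like any other character, which is what the later drills rely on.

```ruby
sentence = 'ab c'

# chars returns an array of one-character strings, spaces included.
characters = sentence.chars

p characters        # => ["a", "b", " ", "c"]
p characters.length # => 4
```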

Drill 3

Stop printing if a space character is encountered.

characters.each do |c|
  if c == ' '
    break
  else
    p c
  end
end

The string in the conditional is a single space between the quotes. Intent: How to identify words?

Drill 4

Save the word and print after processing it.

result = ''
characters.each do |c|
  if c == ' '
    break
  else
    result << c
  end
end
p result

Intent: How to grab all the words in a sentence?

Drill 5

Save all the words in a sentence and print them.

sentence = 'implement regular split'
characters = sentence.chars
result = []
word = ''
characters.each do |c|
  if c == ' '
    result << word
    word = ''
  else
    word << c
  end
end

p result

This misses the last word, split, because a word is only saved when a space follows it. Intent: Accumulate all the words and print them.
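
The sketch below wraps the same loop in a method so the bug is easy to see. The method name split_missing_last_word is hypothetical and used only for this illustration. A trailing space happens to hide the problem, because it triggers the append for the last word.

```ruby
# Hypothetical wrapper around the Drill 5 loop, for illustration only.
def split_missing_last_word(sentence)
  result = []
  word = String.new
  sentence.chars.each do |c|
    if c == ' '
      # A word is saved only when a space is seen.
      result << word
      word = String.new
    else
      word << c
    end
  end
  result
end

p split_missing_last_word('implement regular split')  # => ["implement", "regular"]
p split_missing_last_word('implement regular split ') # => ["implement", "regular", "split"]
```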

Drill 6

Print each character in the sentence with its corresponding index.

characters.each_with_index do |c, i|
  p c
  p i
end

Intent: Is the index the first or the second block variable?
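
One way to check the order is to collect the pairs that each_with_index yields. In this small sketch (the sample string is an assumption), each pair holds the element first and the index second.

```ruby
# Called without a block, each_with_index returns an enumerator;
# to_a collects the [element, index] pairs it would yield.
pairs = 'abc'.chars.each_with_index.to_a

p pairs # => [["a", 0], ["b", 1], ["c", 2]]
```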

Drill 7

Let's now fix the bug by using the index to check for the last word.

sentence = 'implement simple split'
size = sentence.length - 1
characters = sentence.chars
result = []
word = ''
characters.each_with_index do |c, i|
  if c == ' '
    result << word
    word = ''
  else
    word << c
    # No space follows the last character, so save the last word here.
    result << word if i == size
  end
end

p result

Intent: Handle the last word differently.
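
As an aside, the index is not the only way to handle the last word. A possible alternative, which is not the approach this article builds on, is to save whatever is left in the word once the loop has finished. The method name split_with_final_flush is hypothetical.

```ruby
# Alternative sketch: flush the pending word after the loop instead of
# comparing the index against the last position.
def split_with_final_flush(sentence)
  result = []
  word = String.new
  sentence.each_char do |c|
    if c == ' '
      result << word
      word = String.new
    else
      word << c
    end
  end
  # Anything still in word is the last word of the sentence.
  result << word unless word.empty?
  result
end

p split_with_final_flush('implement simple split') # => ["implement", "simple", "split"]
```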

Drill 8

To handle multiple spaces in front of words, we need to change the conditional:


if c == ' ' and (characters[i - 1] != ' ')

This conditional checks whether the character preceding a space is a non-space character. If it is, the space is assumed to be a word boundary, and any further spaces become part of the next word. This will not work if the sentence has spaces after the last word. We will fix that now.
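
To see the conditional in context, here is a minimal sketch that combines it with the index check for the last word. The method name split_keeping_leading_spaces and the i > 0 guard are assumptions added for this illustration. The guard is there because characters[i - 1] is characters[-1], the last character, when i is 0. The second call shows the trailing-space problem: the leftover space ends up as a separate element.

```ruby
# Hypothetical method showing the word-boundary conditional in a full loop.
def split_keeping_leading_spaces(sentence)
  characters = sentence.chars
  last_index = characters.length - 1
  result = []
  word = String.new
  characters.each_with_index do |c, i|
    # A space right after a non-space character ends the current word.
    if c == ' ' && i > 0 && characters[i - 1] != ' '
      result << word
      word = String.new
    else
      # Any other character, including extra spaces, joins the current word.
      word << c
      result << word if i == last_index
    end
  end
  result
end

p split_keeping_leading_spaces('apple   sauce') # => ["apple", "  sauce"]
p split_keeping_leading_spaces('apple sauce  ') # => ["apple", "sauce", " "]
```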

Drill 9

Handle spaces after the last word of the sentence.


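# Returns true when every character in input is a space.
def remaining_characters_are_spaces?(input)
  input.chars.each do |c|
    if c != ' '
      return false
    end
  end
  true
end

sentence = '  split    with   multiple    spaces     '

characters = sentence.chars
size = characters.length - 1
result = []
word = ''
characters.each_with_index do |c, i|
  padding = ''
  # i > 0 keeps characters[i - 1] from wrapping around to the last character.
  if c == ' ' and i > 0 and (characters[i - 1] != ' ')
    # If only spaces remain, attach them to the last word as padding.
    if remaining_characters_are_spaces?(sentence[i, size])
      padding = sentence[i, size]
    end
    result << (word + padding)
    word = ''
  else
    word << c
    # Save the last word when the sentence does not end with a space.
    result << word if i == size and c != ' '
  end
end

p result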

I did not use any tests to drive the solution. I also have not refactored the solution. The focus is only on programming constructs and how you can decompose a given problem into smaller tasks and map them to code.

