Removing vowels from string in Ruby

Minor changes would make disemvowel work correctly. This is what was fixed, and why:

Disemvowel

The Bugs

  1. split was changed to string.split(""). split with no arguments splits on whitespace, while split("") splits into individual characters. With this change, string_array becomes an array of the characters in the string. This can be done more succinctly with string.chars, which is the preferred method.

  2. vowels was changed to a string. %w[] creates an array of words, so with %w[aeiou], vowels was actually an array containing the single string "aeiou". This meant that neither String#include? nor Array#include? would work in the comparison to each character. Changing it to a plain string meant that vowels.include? could match against a character.

  3. vowels.include? had no parens and was explicitly comparing to true. Because of how Ruby parses the call, the result of the expression string_array[i] == true was passed to vowels.include?, which wasn't what was intended.

A couple of style tips that can help with this:

  • comparisons to true should be implicit (e.g. don't use == true)
  • use parens when calling functions or methods.
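
The first three fixes can be checked quickly in irb. The snippet below is only a sketch; the sample word and the variable names are made up for illustration and are not part of the original code.

```ruby
# split with no arguments splits on whitespace; split("") and chars split per character.
word = "foobar"
by_whitespace = word.split       # ["foobar"]
by_character  = word.split("")   # ["f", "o", "o", "b", "a", "r"]
same_as_chars = word.chars       # same result as split("")

# %w[aeiou] is an array holding one word, not five letters.
vowel_array  = %w[aeiou]         # ["aeiou"]
vowel_string = "aeiou"

array_match  = vowel_array.include?("a")    # false: "a" is not equal to "aeiou"
string_match = vowel_string.include?("a")   # true: substring match

# Without parens, `vowels.include? char == true` is parsed as
# vowels.include?(char == true), so the array is asked whether it contains false.
precedence_result = vowel_array.include?("a" == true)   # false
```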

  4. sub was changed to gsub. sub makes only one replacement in a string, so when called on "f  b r", only the first run of spaces is removed, leaving the string "fb r". gsub does global substitution, which is exactly what you want in this case.
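
A short illustration of the difference, assuming the intermediate string that "foobar" turns into once each vowel has been replaced by a space:

```ruby
spaced = "f  b r"                      # "foobar" with each vowel replaced by a space

first_only = spaced.sub(/\s+/, "")     # "fb r": only the first run of spaces is removed
all_runs   = spaced.gsub(/\s+/, "")    # "fbr": every run of spaces is removed
```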

First working version

The working disemvowel function looks like this:

def disemvowel(string)
  string_array = string.split("")
  vowels = "aeiou"
  i = 0
  while i < string.length
    if vowels.include?(string[i])
      string_array[i] = " "
    end
    i += 1
  end

  new_string = string_array.join
  new_string = new_string.gsub(/\s+/, "")
  return new_string
end

and produces this output with your tests:

fbr
rby

Cleaning up

  1. Support mixed-case vowels.

    def disemvowel_1_1(string)
      string_array = string.split("")
      vowels = "aeiouAEIOU"
      i = 0
      while i < string.length
        if vowels.include?(string[i])
          string_array[i] = " "
        end
        i += 1
      end

      new_string = string_array.join
      new_string = new_string.gsub(/\s+/, "")
      return new_string
    end

  2. Use string_array consistently instead of intermingling it with string. string is used in several places where string_array is more appropriate, and those uses should be replaced.

    def disemvowel_1_2(string)
      string_array = string.split("")
      vowels = "aeiouAEIOU"
      i = 0
      while i < string_array.length
        if vowels.include?(string_array[i])
          string_array[i] = " "
        end
        i += 1
      end

      new_string = string_array.join
      new_string = new_string.gsub(/\s+/, "")
      return new_string
    end

  3. Don't use a variable for "aeiou". This is a constant expression, and should be written either as a string literal or as a constant. In this case, a literal string is used, as there's no enclosing scope to keep a constant out of the global namespace (in case this code gets inserted into another context).

    def disemvowel_1_3(string)
      string_array = string.split("")
      i = 0
      while i < string_array.length
        if "aeiouAEIOU".include?(string_array[i])
          string_array[i] = " "
        end
        i += 1
      end

      new_string = string_array.join
      new_string = new_string.gsub(/\s+/, "")
      return new_string
    end

  4. Replace the vowel character with nil instead of " " to eliminate the gsub replacement.

    def disemvowel_1_4(string)
      string_array = string.split("")
      i = 0
      while i < string_array.length
        if "aeiouAEIOU".include?(string_array[i])
          string_array[i] = nil
        end
        i += 1
      end

      new_string = string_array.join
      return new_string
    end

  5. Convert the while loop to Array#each_with_index to process the array elements

    def disemvowel_1_5(string)
      string_array = string.split("")
      string_array.each_with_index do |char, i|
        if "aeiouAEIOU".include?(char)
          string_array[i] = nil
        end
      end

      new_string = string_array.join
      return new_string
    end

  6. Replace the use of split("") with String#chars to get the array of characters to process.

    def disemvowel_1_6(string)
      string_array = string.chars
      string_array.each_with_index do |char, i|
        if "aeiouAEIOU".include?(char)
          string_array[i] = nil
        end
      end

      new_string = string_array.join
      return new_string
    end

  7. Reduce the number of temporary variables by chaining results. This can minimize the number of individual variables that Ruby has to keep track of and reduce the variable lookup that occurs each time a variable name is referenced.

    def disemvowel_1_7(string)
      string_array = string.chars
      string_array.each_with_index do |char, i|
        if "aeiouAEIOU".include?(char)
          string_array[i] = nil
        end
      end

      return string_array.join
    end

  8. Remove the explicit return to use Ruby's expression-based return values.

    def disemvowel_1_8(string)
      string_array = string.chars
      string_array.each_with_index do |char, i|
        if "aeiouAEIOU".include?(char)
          string_array[i] = nil
        end
      end.join
    end

  9. Use Array#map to process characters, rather than Array#each_with_index.

    def disemvowel_1_9(string)
      string.chars.map { |char| "aeiouAEIOU".include?(char) ? nil : char }.join
    end
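
The versions from step 4 onward depend on Array#join rendering nil elements as empty strings. A minimal sketch of that behavior, using a hand-written array that stands in for "foobar" after its vowels have been replaced by nil:

```ruby
chars_with_gaps = ["f", nil, nil, "b", nil, "r"]

# join converts nil to an empty string, so the gaps simply disappear.
joined = chars_with_gaps.join             # "fbr"

# An equivalent formulation that drops the nils explicitly before joining.
compacted = chars_with_gaps.compact.join  # "fbr"
```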

Disemvowel 2

The Bugs

  1. Replace delete with delete_if. The Array#delete method only deletes elements equal to its argument, so you would have to loop over the vowels to make it work correctly in this case. Array#delete_if, however, lets you delete on a condition, and here that condition is vowels.include?(element).
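
To make the difference concrete, here is a small sketch; the variable names are illustrative and the vowel string is assumed to be lowercase only:

```ruby
letters = "foobar".chars

# Array#delete removes the elements equal to its single argument.
only_o_removed = letters.dup
only_o_removed.delete("o")          # array is now ["f", "b", "a", "r"]

# Passing the whole vowel string matches nothing, since no element equals "aeiou".
nothing_removed = letters.dup
nothing_removed.delete("aeiou")     # array is unchanged

# Array#delete_if removes every element for which the block is truthy.
vowels_removed = letters.dup
vowels_removed.delete_if { |element| "aeiou".include?(element) }   # ["f", "b", "r"]
```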

First working version

def disemvowel_2(string)
  string_array = string.split("")
  string_array.delete_if { |element| "aeiou".include?(element) }
  string_array.join("")
end

Cleaning up

  1. Support mixed-case vowels.

    def disemvowel_2_1(string)
      string_array = string.split("")
      string_array.delete_if { |element| "aeiouAEIOU".include?(element) }
      string_array.join("")
    end

  2. Replace the use of split("") with String#chars to get the array of characters to process.

    def disemvowel_2_2(string)
      string_array = string.chars
      string_array.delete_if { |element| "aeiouAEIOU".include?(element) }
      string_array.join("")
    end

  3. Change join("") to just join. The join method already joins with an empty separator by default, so the extra param is redundant.

    def disemvowel_2_3(string)
      string_array = string.chars
      string_array.delete_if { |element| "aeiouAEIOU".include?(element) }
      string_array.join
    end

  4. Reduce the number of temporary variables by chaining results. This can minimize the number of individual variables that Ruby has to keep track of and reduce the variable lookup that occurs each time a variable name is referenced.

    def disemvowel_2_4(string)
      string.chars.delete_if { |element| "aeiouAEIOU".include?(element) }.join
    end

Disemvowel 4

String has a delete method that will remove all matching characters. Given the vowels, this is a straightforward implementation:

def disemvowel_4(string)
  string.delete("aeiouAEIOU")
end
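
String#delete treats its argument as a set of characters rather than a substring, and it returns a new string. A brief sketch with made-up inputs, showing plain sets as well as the range and negation forms the method accepts:

```ruby
plain = "Ruby is fun".delete("aeiouAEIOU")    # "Rby s fn"

# Ranges and a leading ^ (negation) are part of the character-set syntax.
digits_removed = "r2d2".delete("0-9")         # "rd"
only_vowels    = "foobar".delete("^aeiou")    # "ooa": everything except vowels is deleted
```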

Testing

I created a unit-test like program to do programmatic self-testing, rather than just displaying the disemvoweled strings to the console. This will test each version of the function and report whether it passes or fails the test:

data = [
  ["foobar", "fbr"],
  ["ruby", "rby"],
  ["aeiou", ""],
  ["AeIoU", ""],
]

data.each do |test|
  puts "disemvowel_1   #{disemvowel_1(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_1 #{disemvowel_1_1(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_2 #{disemvowel_1_2(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_3 #{disemvowel_1_3(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_4 #{disemvowel_1_4(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_5 #{disemvowel_1_5(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_6 #{disemvowel_1_6(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_7 #{disemvowel_1_7(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_8 #{disemvowel_1_8(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_1_9 #{disemvowel_1_9(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_2   #{disemvowel_2(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_2_1 #{disemvowel_2_1(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_2_2 #{disemvowel_2_2(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_2_3 #{disemvowel_2_3(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_2_4 #{disemvowel_2_4(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_3   #{disemvowel_3(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
  puts "disemvowel_4   #{disemvowel_4(test[0]) == test[1] ? 'Pass' : 'Fail'}: #{test[0]}"
end
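
The same kind of check can also be driven from a list of method names, which avoids repeating one line per version. The following is only a sketch: run_checks is a hypothetical helper, and only two stand-in implementations are defined so that the snippet runs on its own.

```ruby
# Two stand-in implementations so the sketch is self-contained.
def disemvowel_3(string)
  string.gsub(/[aeiou]/i, "")
end

def disemvowel_4(string)
  string.delete("aeiouAEIOU")
end

cases = [["foobar", "fbr"], ["ruby", "rby"], ["aeiou", ""], ["AeIoU", ""]]
implementations = %i[disemvowel_3 disemvowel_4]   # extend with the other versions

# Hypothetical helper: returns one [method_name, input, passed] row per combination.
def run_checks(names, cases)
  names.product(cases).map do |name, (input, expected)|
    [name, input, send(name, input) == expected]
  end
end

run_checks(implementations, cases).each do |name, input, passed|
  puts "#{name.to_s.ljust(14)} #{passed ? 'Pass' : 'Fail'}: #{input}"
end
```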

This will produce the following output:

>$ ruby disemvowel.rb
disemvowel_1   Pass: foobar
disemvowel_1_1 Pass: foobar
disemvowel_1_2 Pass: foobar
disemvowel_1_3 Pass: foobar
disemvowel_1_4 Pass: foobar
disemvowel_1_5 Pass: foobar
disemvowel_1_6 Pass: foobar
disemvowel_1_7 Pass: foobar
disemvowel_1_8 Pass: foobar
disemvowel_1_9 Pass: foobar
disemvowel_2   Pass: foobar
disemvowel_2_1 Pass: foobar
disemvowel_2_2 Pass: foobar
disemvowel_2_3 Pass: foobar
disemvowel_2_4 Pass: foobar
disemvowel_3   Pass: foobar
disemvowel_4   Pass: foobar
disemvowel_1   Pass: ruby
disemvowel_1_1 Pass: ruby
disemvowel_1_2 Pass: ruby
disemvowel_1_3 Pass: ruby
disemvowel_1_4 Pass: ruby
disemvowel_1_5 Pass: ruby
disemvowel_1_6 Pass: ruby
disemvowel_1_7 Pass: ruby
disemvowel_1_8 Pass: ruby
disemvowel_1_9 Pass: ruby
disemvowel_2   Pass: ruby
disemvowel_2_1 Pass: ruby
disemvowel_2_2 Pass: ruby
disemvowel_2_3 Pass: ruby
disemvowel_2_4 Pass: ruby
disemvowel_3   Pass: ruby
disemvowel_4   Pass: ruby
disemvowel_1   Pass: aeiou
disemvowel_1_1 Pass: aeiou
disemvowel_1_2 Pass: aeiou
disemvowel_1_3 Pass: aeiou
disemvowel_1_4 Pass: aeiou
disemvowel_1_5 Pass: aeiou
disemvowel_1_6 Pass: aeiou
disemvowel_1_7 Pass: aeiou
disemvowel_1_8 Pass: aeiou
disemvowel_1_9 Pass: aeiou
disemvowel_2   Pass: aeiou
disemvowel_2_1 Pass: aeiou
disemvowel_2_2 Pass: aeiou
disemvowel_2_3 Pass: aeiou
disemvowel_2_4 Pass: aeiou
disemvowel_3   Pass: aeiou
disemvowel_4   Pass: aeiou
disemvowel_1   Fail: AeIoU
disemvowel_1_1 Pass: AeIoU
disemvowel_1_2 Pass: AeIoU
disemvowel_1_3 Pass: AeIoU
disemvowel_1_4 Pass: AeIoU
disemvowel_1_5 Pass: AeIoU
disemvowel_1_6 Pass: AeIoU
disemvowel_1_7 Pass: AeIoU
disemvowel_1_8 Pass: AeIoU
disemvowel_1_9 Pass: AeIoU
disemvowel_2   Pass: AeIoU
disemvowel_2_1 Pass: AeIoU
disemvowel_2_2 Pass: AeIoU
disemvowel_2_3 Pass: AeIoU
disemvowel_2_4 Pass: AeIoU
disemvowel_3   Pass: AeIoU
disemvowel_4   Pass: AeIoU

Benchmarking

I wrote a benchmark program to test the performance of each implementation. Here's the benchmark program:

require 'benchmark'

Times = 5_000
chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*(),./<>?;:[]{}\|-=_+`~'.chars
array = Times.times.map { |n| "#{chars.sample(n)}" }

puts "============================================================="
puts RUBY_DESCRIPTION

Benchmark.bm(15) do |x|
  disemvowel_1_report =   x.report("disemvowel_1:")   { array.each {|s| disemvowel_1(s) } }
  disemvowel_1_1_report = x.report("disemvowel_1_1:") { array.each {|s| disemvowel_1_1(s) } }
  disemvowel_1_2_report = x.report("disemvowel_1_2:") { array.each {|s| disemvowel_1_2(s) } }
  disemvowel_1_3_report = x.report("disemvowel_1_3:") { array.each {|s| disemvowel_1_3(s) } }
  disemvowel_1_4_report = x.report("disemvowel_1_4:") { array.each {|s| disemvowel_1_4(s) } }
  disemvowel_1_5_report = x.report("disemvowel_1_5:") { array.each {|s| disemvowel_1_5(s) } }
  disemvowel_1_6_report = x.report("disemvowel_1_6:") { array.each {|s| disemvowel_1_6(s) } }
  disemvowel_1_7_report = x.report("disemvowel_1_7:") { array.each {|s| disemvowel_1_7(s) } }
  disemvowel_1_8_report = x.report("disemvowel_1_8:") { array.each {|s| disemvowel_1_8(s) } }
  disemvowel_1_9_report = x.report("disemvowel_1_9:") { array.each {|s| disemvowel_1_9(s) } }
  disemvowel_2_report   = x.report("disemvowel_2:")   { array.each {|s| disemvowel_2(s) } }
  disemvowel_2_1_report = x.report("disemvowel_2_1:") { array.each {|s| disemvowel_2_1(s) } }
  disemvowel_2_2_report = x.report("disemvowel_2_2:") { array.each {|s| disemvowel_2_2(s) } }
  disemvowel_2_3_report = x.report("disemvowel_2_3:") { array.each {|s| disemvowel_2_3(s) } }
  disemvowel_2_4_report = x.report("disemvowel_2_4:") { array.each {|s| disemvowel_2_4(s) } }
  disemvowel_3_report   = x.report("disemvowel_3:")   { array.each {|s| disemvowel_3(s) } }
  disemvowel_4_report   = x.report("disemvowel_4:")   { array.each {|s| disemvowel_4(s) } }
end
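
With this many near-identical report lines, it is easy for a label and the method it measures to drift apart. One possible way to guard against that, sketched here with two stand-in implementations and a small made-up input set (not the inputs used for the timings in this answer), is to generate the reports from a list of method names:

```ruby
require "benchmark"

# Stand-in implementations so the sketch runs on its own.
def disemvowel_3(string)
  string.gsub(/[aeiou]/i, "")
end

def disemvowel_4(string)
  string.delete("aeiouAEIOU")
end

# Illustrative inputs of varying length.
inputs = Array.new(1_000) { |n| "foobar" * (n % 10 + 1) }
names = %i[disemvowel_3 disemvowel_4]

reports = Benchmark.bm(15) do |x|
  names.each do |name|
    # The label and the method under test come from the same symbol.
    x.report("#{name}:") { inputs.each { |s| send(name, s) } }
  end
end
```

Timings from a run like this depend on the machine and Ruby version, so they are not directly comparable with other runs.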

And this is the output from the benchmarks:

=============================================================
ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin14]
                      user     system      total        real
disemvowel_1:     2.630000   0.010000   2.640000 (  3.487851)
disemvowel_1_1:   2.300000   0.010000   2.310000 (  2.536056)
disemvowel_1_2:   2.360000   0.010000   2.370000 (  2.651750)
disemvowel_1_3:   2.290000   0.010000   2.300000 (  2.449730)
disemvowel_1_4:   2.320000   0.020000   2.340000 (  2.599105)
disemvowel_1_5:   2.360000   0.010000   2.370000 (  2.473005)
disemvowel_1_6:   2.340000   0.010000   2.350000 (  2.813744)
disemvowel_1_7:   2.380000   0.030000   2.410000 (  3.663057)
disemvowel_1_8:   2.330000   0.010000   2.340000 (  2.525702)
disemvowel_1_9:   2.290000   0.010000   2.300000 (  2.494189)
disemvowel_2:     2.490000   0.000000   2.490000 (  2.591459)
disemvowel_2_1:   2.310000   0.010000   2.320000 (  2.503748)
disemvowel_2_2:   2.340000   0.010000   2.350000 (  2.608350)
disemvowel_2_3:   2.320000   0.010000   2.330000 (  2.820086)
disemvowel_2_4:   2.330000   0.010000   2.340000 (  2.735653)
disemvowel_3:     0.070000   0.000000   0.070000 (  0.070498)
disemvowel_4:     0.020000   0.000000   0.020000 (  0.018580)

Conclusion

The String#delete method outperforms all of the hand-rolled solutions (everything except String#gsub) by more than 100X, and it's 2.5 times faster than String#gsub. It's very easy to use and outperforms everything else; this is easily the best solution.

Your first solution is bureaucratic and has some errors, in both code and style.

  • You break your string into an array of separate characters, but do it wrongly with string_array = string.split. Use string_array = string.split(""), string_array = string.chars (the best option), or string_array = string.each_char.to_a instead. If you do "asdfg".split, the result will be ["asdfg"], not ["a","s","d","f","g"], as you seem to expect.
  • Then you don't use this (supposed) array, but keep using the original string. If you intended to do this, why would you try to split the original string?
  • Finally, you move back to working with the array, changing it according to what happened in the original string. As you can see, you keep working with more objects than needed. This violates the KISS principle, and the code not running properly is a consequence.

Your second solution, although much simpler than the first one, has the problem engineersmnky pointed out: Array#delete does NOT take five arguments.

Finally, your third solution, although working fine, could be written in a much simpler way:

def disemvowel_3(string)
  string.gsub(/[aeiou]/i, "")
end
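
The i flag on the regular expression is what makes this version handle uppercase vowels. A quick comparison on a made-up input:

```ruby
lower_only = "AeIoU Ruby".gsub(/[aeiou]/, "")    # "AIU Rby": uppercase vowels survive
any_case   = "AeIoU Ruby".gsub(/[aeiou]/i, "")   # " Rby": all vowels removed
```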

As I keep telling people here, you don't need an explicit return at the end of a Ruby method. By default, it returns the value of the last expression evaluated, whatever it is.
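
For instance, these two methods (the names are hypothetical, chosen only for this illustration) behave identically:

```ruby
def with_explicit_return(string)
  return string.delete("aeiou")
end

def with_implicit_return(string)
  string.delete("aeiou")   # the value of the last expression is the return value
end
```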

Another possible solution, if you allow me to suggest, would be using Array#reject in the following way:

def disemvowel(str)
  vowels = %w[a e i o u]
  str.each_char.to_a.reject{ |item| vowels.include?(item) }.join
end
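
As written, the vowel list is lowercase, so uppercase vowels pass through untouched. If mixed-case input matters, one option is to downcase each character before the lookup. The method name below is hypothetical and this is only a sketch of that variation:

```ruby
def disemvowel_reject(str)
  vowels = %w[a e i o u]
  # Downcasing each character before the lookup lets "A" match "a".
  str.each_char.reject { |item| vowels.include?(item.downcase) }.join
end
```

Calling reject directly on the enumerator returned by each_char also makes the intermediate to_a unnecessary.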

Here is a clean and easy-to-understand method.

First, the vowels are defined in a variable.

String#chars turns your string into an array of characters.

#select iterates over that array, looking at each item. If the item, converted to its lowercase version, is not (!) included in the vowels, #select adds it to the array of items that met the criteria.

#join without any parameters reassembles the remaining characters, vowels excluded.

This works with uppercase and lowercase letters, as well as punctuation and spaces.

You get back your input minus the vowels.

def disemvowl(string)
    vowels = "aeiou"
    string.chars.select { |char| !vowels.include?(char.downcase) }.join
end


# p disemvowl("string") # => "strng"

# p disemvowl("s trin  g!?") # => "s trn  g!?"
