synsa
11/11/2017 - 5:46 AM

I cannot believe this is valid Ruby syntax. P.S. - I love Ruby

I cannot believe this is valid Ruby syntax. P.S. - I love Ruby

So here I'm using an indented heredoc containing a SQL query to look for ticket IDs across 2 tables — all from one model class, btw, so somewhere in there I'm initializing a bunch of unneeded Ticket objects. I'm looking for the ticket IDs using fairly dumb, and (yes, I know) unsanitized, LIKE queries, because I didn't have the foresight to put tickets and their notes (comments) in a single table, or to set that table to use MyISAM storage (so I could use MySQL full-text), and — as this is a small internal tool I originally wrote years ago for a small ISV — it seems ridiculous to push my lack of foresight onto the customer by forcing them to add something like Solr or Sphinx to the mix.

I mention all of that to say this: yes I know there exist better, more elegant ways to do this work. This code example isn't about semantic or technological purity, it's about making a MySQL query that now takes over 45 seconds to return results return those results faster, as quickly and cheaply as possible.

As it happens, Ruby and Rails provide some really wonderful features for getting this work done:

I feel like a lot of Rails/ActiveRecord developers feel like you should never, ever have to call find_by_sql — like it's cruft, or worse, "PHP-like" to have to sully Ruby with hard-coded SQL queries. But, as noted in my introductory rant above, sometimes the SQL layer is the best place to do some work and find_by_sql can be a handy middle step between the regular hash conditions syntax and dealing directly with the adapter using connection.execute. Here I'm able to fire off a (relatively) short SQL query that, unlike the Ruby code it replaced, makes its intent clearly known in one line. That query gets me an array of Ticket objects I can easily process into the array of integers I really want.

But it's the heredoc syntax I'm really in love with, particularly two things:

  1. Ruby has indented heredocs: by including a dash (-) in the opening sigil, I'm telling the interpreter to look at whatever whitespace precedes the first line of the heredoc, assume every line is similarly indented, and strip out the indentation. My queries are much more readable as a result. I love that.

  2. I also love how Ruby lets me use that opening sigil as a stand-in for the string as a whole, and append some methods to it on that same line, even if the sigil is wrapped in parentheses because it's an argument to another method.

Both of these are idiosyncrasies, that — if you think consistency is a virtue — make Ruby a chain-smoking whore of a language. And, as I pointed out, this particular code has at least two performance/security flaws that make me very glad the stakes are so very low.

But having traded off those details, this was extremely easy to write and has cut this app's keyword searches from 45-60 seconds to less than 1 second.

class Ticket < ActiveRecord::Base

  SEARCH_COLUMNS = {
    :tickets => [:token, :subject, :body, :customer_name, :customer_email],
    :notes   => [:body, :author_name, :author_email]
  }

  def self.conditions_for_keyword_search(table, keywords)
    # Because we may use this string elsewhere, we don't want to mutate the only copy
    query_words = keywords.dup
    query_words.gsub!(/\*/,"%") # Treat '*' as a SQL wildcard
    query_words.gsub!(/['";]/,"") # Bare minimum sanitization    

    # SECURITY RISK: May contain unsanitized SQL. A public/production app would need better 
    # sanitization, but for that matter an app like could probably use a real search engine 
    SEARCH_COLUMNS[table].map { |colname| %((#{table}.#{colname} LIKE '%#{keywords}%')) }.join(' OR ')
  end

  def self.search_by(params={})
    params.assert_valid_keys(:query)

    # Get ticket IDs where ticket columns match, against both the tickets and notes tables
    # I do it this way because this is a crufty legacy app I wrote before I knew better.
    ids_matching_keywords = find_by_sql(<<-ENDQUERY).map(&:id)
    SELECT DISTINCT id FROM tickets WHERE #{conditions_for_keyword_search(:tickets, params[:query])}
    UNION
    SELECT DISTINCT ticket_id FROM notes WHERE #{conditions_for_keyword_search(:notes, params[:query])}
    ENDQUERY

    scoped(:conditions => {:id => ids_matching_keywords}).all
  end
end