When playing with new features in databases, it’s often helpful to be able to have a meaningful amount of data, to get closer to at least approximating a real workload. My use case revolved around needing to generate random phrases, as well as a random nearby coordinate.

I’m a huge fan of Elixir, and so it was my language of choice for this. I didn’t want to deal with the overhead of adding any dependencies, so communicating directly with Postgres through Elixir was not an option. I decided I would have the script generate a bunch of INSERT statements that I could then pass to Postgres directly.

It ended up looking like this:

# words.exs

# random words generated from https://www.randomlists.com/random-words
words = ["general", "sweet", "boot", "enormous", "thumb", "rinsing", "soda", "bells", "hurry", "class", "wrap", "face", "creature", "taller", "angle", "pin", "purple", "driving", "fun", "elderly", "guitar", "magical", "pie", "acoustics", "pretend", "electricity", "cheese", "turkey", "tea"]

central_latitude = 43.0384303
central_longitude = -87.9079503

# file to create
file_path = "./sql.sql"

stream = Stream.repeatedly(fn ->
  # get coordinate within .033 degrees in each direction
  latitude = central_latitude + :random.uniform / 30.0
  longitude = central_longitude + :random.uniform / 30.0

  # Grab 5 random words from the list, and join them with a space to create one string
  words = Enum.take_random(words, 5) |> Enum.join(" ")

  # Emit desired string
  "INSERT INTO test (words, words_vector, geo) VALUES ('#{words}', to_tsvector('#{words}'), ST_GeomFromText('POINT(#{longitude} #{latitude})', 4326));"
end)

# Lazily take as many rows as we'd like to generate, and then join them with a line.
sql = Stream.take(stream, 250000)
      |> Enum.join("\n")
File.write!(file_path, sql, [:write])

# run with:
# >elixir words.exs

Once it’s run, I have a file with 250,000 insert statements that can be run through Postgres with:

psql -d database_name --quiet -f sql.sql

It wasn’t the most efficient way to solve the issue, but it didn’t need to be :)

Feel free to contact me on Twitter at @mitchellhenke with any questions or clarifications. Thanks for reading!