How To Create An Automatic Sitemap For Your Rails App On Heroku (RailsShorts)

 - Webdesign Mechelen

Here's a short article on how to configure your Rails app to generate a sitemap for your domain automatically. The setup below regenerates a sitemap.xml.gz file once a week (or more frequently).

First, add these gems to your Gemfile.

# Gemfile
gem "fog-aws"
gem "sitemap_generator"

The sitemap_generator gem (https://github.com/kjvarga/sitemap_generator) turns the paths you list into an XML sitemap, and fog-aws lets it upload the result to S3. Since both gems are in the Gemfile, install them with:

bundle install

Then create a sitemap.rb file inside your /config directory (not inside initializers!). In this file you define the generator's configuration and the actual paths you want it to include. Since we want the sitemap to be generated in production and Heroku doesn't keep files written to its filesystem, you will need an AWS account and a bucket to upload the file to (https://devcenter.heroku.com/articles/s3).

# config/sitemap.rb
SitemapGenerator::Sitemap.default_host = "http://www.mydomain.com" # Your Domain Name
SitemapGenerator::Sitemap.public_path = 'tmp/sitemap'
# Where you want your sitemap.xml.gz file to be uploaded.
SitemapGenerator::Sitemap.adapter = SitemapGenerator::S3Adapter.new(
  aws_access_key_id: ENV["S3_ACCESS_KEY"],
  aws_secret_access_key: ENV["S3_SECRET_KEY"],
  fog_provider: 'AWS',
  fog_directory: ENV["S3_BUCKET_NAME"],
  fog_region: ENV["S3_REGION"]
)

# The full URL of your bucket
SitemapGenerator::Sitemap.sitemaps_host = "https://#{ENV['S3_BUCKET_NAME']}.s3.amazonaws.com"
# The paths that need to be included in the sitemap.
SitemapGenerator::Sitemap.create do
  Article.find_each do |article|
    add article_path(article.slug_en, locale: :en)
    add article_path(article, locale: :nl) if article.slug_nl != ""
  end

  Project.find_each do |project|
    add project_path(project, locale: :en)
    add project_path(project, locale: :nl)
  end

  Page.find_each do |page|
    add page_path(page, locale: :en)
    add page_path(page, locale: :nl)
  end

  add "en/single-page"
  add "nl/single-page"
  add "nl/starters-website"
  add "en/starters-website"
  add "nl/website-op-maat"
  add "en/website-op-maat"
  add "nl/webapplicatie"
  add "en/webapplicatie"
  add "nl/website-analyse"
  add "en/website-analyse"
end
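
The configuration above reads every S3 setting from the environment, so a missing variable tends to show up only when the upload fails. As a minimal sketch, a small pre-flight check could list what is missing. The helper name missing_s3_settings and the constant are hypothetical and not part of sitemap_generator; the variable names are the ones used in config/sitemap.rb.

```ruby
# Names of the environment variables that config/sitemap.rb reads.
REQUIRED_S3_SETTINGS = %w[S3_ACCESS_KEY S3_SECRET_KEY S3_BUCKET_NAME S3_REGION].freeze

# Hypothetical helper (not part of sitemap_generator): returns the names of
# the required settings that are unset or blank in the given environment.
# Accepts ENV or any Hash-like object, which keeps it easy to try out.
def missing_s3_settings(env = ENV)
  REQUIRED_S3_SETTINGS.select { |name| env[name].to_s.strip.empty? }
end
```

You could, for example, print missing_s3_settings.join(", ") before running the rake task to see at a glance which config vars still need to be set on Heroku.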

Now when you run rake sitemap:refresh you should see output similar to the example below. While you're testing, you can run rake sitemap:refresh:no_ping instead so that search engines aren't notified.

$ rake sitemap:refresh
In '/home/simon/Desktop/Code/personal/truetech-v4/public/':
+ sitemap.xml.gz                                          61 links /  974 Bytes
Sitemap stats: 61 links / 1 sitemaps / 0m00s

Now visit your AWS bucket and see if the uploaded file is there! 
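
If you want to go one step further than eyeballing the bucket, you can download the file and count its entries. The sketch below assumes a plain sitemap (not a sitemap index) and uses only Ruby's standard library; the helper name count_sitemap_links is hypothetical.

```ruby
require "stringio"
require "zlib"

# Hypothetical helper: given the raw bytes of a sitemap.xml.gz file,
# decompresses them and counts the <loc> entries (one per listed URL).
def count_sitemap_links(gzipped_bytes)
  xml = Zlib::GzipReader.new(StringIO.new(gzipped_bytes)).read
  xml.scan("<loc>").size
end
```

For a downloaded copy, count_sitemap_links(File.binread("sitemap.xml.gz")) should be close to the link count that the rake task printed.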

All right, that's 75% of the way. Now we have to update robots.txt and our routes so that Google can be told about the sitemap through the webmaster console and can also retrieve the file on future crawls.

# public/robots.txt
Sitemap: http://www.mydomain.com/sitemap.xml.gz

# config/routes.rb
get '/sitemap.xml.gz', to: redirect("https://s3-eu-west-1.amazonaws.com/XXXXX-YOUR-BUCKET-XXXXX/sitemap.xml.gz")

Now all that's left to do is to run rake sitemap:refresh as an automatic task on Heroku. For this I've used the free Heroku Scheduler add-on (https://elements.heroku.com/addons/scheduler). It lets you run a particular rake task (like sitemap:refresh) on a daily basis.

The scheduler can run tasks every hour or every day. If you find this too frequent, create a separate rake task that checks the current date, day of the week, or whatever interval you want to use before doing the job. For example:

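# lib/tasks/sitemap.rake (Rails loads .rake files from lib/tasks)
# Schedule this task daily; it only refreshes the sitemap on Tuesdays.
task :generate_sitemap do
  if Time.now.tuesday?
    Rake::Task["sitemap:refresh"].invoke
  end
end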
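
If you want something other than "every Tuesday", the date check can be pulled out into a small method that is easy to try in a console. This is only a sketch: the name sitemap_refresh_due? and its weekdays parameter are made up for illustration, and Time.now follows whatever time zone the dyno runs in.

```ruby
# Hypothetical helper: true when the given time falls on one of the allowed
# weekdays. Weekdays use Time#wday numbering (0 = Sunday ... 6 = Saturday),
# so the default of [2] means "Tuesdays only".
def sitemap_refresh_due?(now = Time.now, weekdays: [2])
  weekdays.include?(now.wday)
end
```

Inside the rake task, the condition would then read sitemap_refresh_due?(weekdays: [2, 5]) to refresh on Tuesdays and Fridays.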

Now you're ready to check everything with Google Webmaster Tools! And your sitemap is one thing less to worry about :)



