Upgrade Mongoid - Hash arguments for group

After upgrading mongoid, you will see a warning on every call to the group method:

Collection#group no longer take a list of paramters. This usage is deprecated.

This is because the mongo gem changed the definition of the group method.

Before

key = ["ad_id"]
conditions = { 'ad_id' => { '$in' => ad_ids } }
initial = { "impressions" => 0.0, "clicks" => 0.0 }
reduce = "a reduce javascript function"

AdStat.collection.group(key, conditions, initial, reduce).each do |e|
  ......
end

After

key = ["ad_id"]
conditions = { 'ad_id' => { '$in' => ad_ids } }
initial = { "impressions" => 0.0, "clicks" => 0.0 }
reduce = "a reduce javascript function"

AdStat.collection.group(:key => key, :conditions => conditions, :initial => initial, :reduce => reduce).each do |e|
  ......
end

This style passes the arguments as a hash, which makes the group call more readable.
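
To illustrate the pattern without a database, here is a minimal sketch of a method that takes a single hash and still tolerates the old positional form with a warning. ToyCollection and its in-memory grouping are hypothetical stand-ins, not the mongo gem's implementation. Two simplifications are assumptions of this sketch: the conditions map a field to a plain list of allowed values rather than a '$in' query, and reduce is a Ruby lambda rather than a javascript string.

```ruby
# Toy stand-in for a collection; NOT the mongo gem's implementation.
class ToyCollection
  def initialize(docs)
    @docs = docs
  end

  # Accepts one options hash. The old positional form still works but warns.
  def group(opts, conditions = {}, initial = {}, reduce = nil)
    unless opts.is_a?(Hash)
      warn "group no longer takes a list of parameters; pass a hash instead."
      opts = { :key => opts, :conditions => conditions, :initial => initial, :reduce => reduce }
    end

    keys    = opts.fetch(:key)
    initial = opts.fetch(:initial, {})
    reduce  = opts.fetch(:reduce)          # a Ruby lambda in this sketch
    wanted  = opts.fetch(:conditions, {})  # field => list of allowed values

    results = {}
    @docs.each do |doc|
      next unless wanted.all? { |field, allowed| allowed.include?(doc[field]) }
      group_key = keys.map { |k| doc[k] }
      # merge returns a new hash, so the initial values are never mutated
      results[group_key] ||= Hash[keys.zip(group_key)].merge(initial)
      reduce.call(doc, results[group_key])
    end
    results.values
  end
end

docs = [
  { "ad_id" => 1, "clicks" => 2 },
  { "ad_id" => 1, "clicks" => 3 },
  { "ad_id" => 2, "clicks" => 5 }
]
sum_clicks = lambda { |doc, acc| acc["clicks"] += doc["clicks"] }

stats = ToyCollection.new(docs).group(
  :key        => ["ad_id"],
  :conditions => { "ad_id" => [1, 2] },
  :initial    => { "clicks" => 0 },
  :reduce     => sum_clicks
)
```

With named keys, a reader does not have to remember which position holds the conditions and which holds the initial values.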

Posted in  mongoid ruby


Upgrade Mongoid - Default Type for Field

If you watched the railscast episode about mongoid, you saw ryanb drop the default String type from the field declarations, so that

class Article
  field :name, :type => String
  field :content, :type => String
end

can be written as

class Article
  field :name
  field :content
end

But this is no longer valid as of mongoid 2.0.0.rc.1. The default type of a field changed from String to Object, which means we should explicitly set the type for each field again.

Posted in  mongoid ruby


Upgrade Mongoid - Write Tests First

Mongoid is one of the popular Object Document Mappers between Ruby and MongoDB, and it is still evolving. We began using mongoid 2.0.0.beta.20 several weeks ago. The author of mongoid, @durran, said he wanted to release 2.0.0 last week (as you know, 2.0.0 is still not released, but he really did a lot of awesome work), so we tried version 2.0.0.rc.6 to prepare for upgrading to the final 2.0.0.

I'm working on upgrading mongoid from 2.0.0.beta.20 to 2.0.0.rc.6 these days, and I plan to write several posts to share my experience.

In this first post, I just want you to keep one thing in mind: don't do any upgrading before you have written tests for your models. There are many api changes between mongoid 2.0.0.beta.20 and 2.0.0.rc.6, and I can't imagine upgrading without tests. Our project has almost 30 models and 100 view pages, so I can't check the models and views one by one. Luckily, we have built many rspec tests for the models, plus cucumber tests.

As expected, many tests fail after the upgrade. Once I have fixed all the failures, the upgrade is complete.

I have to say I like this kind of upgrade job. I read the mongoid source code, checked the git logs, sometimes wondered why they made a particular change, and always learned a lot from reading the source. :-)

Posted in  mongoid ruby


Migrate Custom Blog to Jekyll and Disqus

I wrote my own blog system in rails about 3 years ago. It's good, but not cool enough, and I wanted some changes to make my blog better. After googling, I found jekyll, a simple, blog-aware, static site generator. That means no database and far fewer resources, which sounds great.

Build a Blog with Jekyll

Then I began to build the new blog system with jekyll two weeks ago. It's really easy to install and use; check the documentation here. As a developer, of course I installed pygments for code highlighting. But the default jekyll has several limitations.

  1. no category section on the sidebar.
  2. no archive section on the sidebar.
  3. no category page, which lists the posts in that category.
  4. no monthly archive page, which lists posts by month.
  5. no comments, yep, it generates a static website.
  6. can't display liquid code in a post. (Use a literal tag to display liquid code.)

Like rails, jekyll supports plugins and extensions, so we can extend it as we want. Originally I planned to host my blog on github, but I found that github doesn't support any plugins or extensions; it only supports the default official jekyll. That's bad news, since I have to host the blog on my own server with jekyll extensions, but it's not a big problem.

The best jekyll extension I found is jekyll_ext, which provides a really flexible way to extend jekyll. Its author also shares his own jekyll extensions built on jekyll_ext. I forked those extensions to fix the generation of the archive pages and to add the archive section to the sidebar.

OK, let me show how to fix the limitations above with my forked extensions.

1. category section on sidebar.

<ul>
  {% for category in site.categories %}
  <li><a href="/categories/{{category | first}}">{{category | first}} ({{category | last | size }})</a></li>
  {% endfor %}
</ul>

2. archive section on sidebar.

<ul>
  {% for monthly_archive in site.monthly_archives reversed %}
  <li>
    <a href="{{ site.baseurl }}/{{ monthly_archive.url }}">{{ monthly_archive.name }}</a> ({{ monthly_archive.posts | size }} posts)
  </li>
  {% endfor %}
</ul>

3. category page, add a layout category_index.html

---
layout: default
---

<h1 class="page-title">
  Category Archives:
  <a href="/categories/{{page.category}}">{{page.category}}</a>
</h1>
<ol class="archive">
{% for post in site.categories[page.category] %}
  <li>
    <div class="excerpt">
      <strong class="entry-title">
        <a href="{{ post.url }}" title="{{ post.title }}" rel="bookmark">{{ post.title }}</a>
      </strong>
      <span class="date small">
        <abbr class="published" title="{{ post.date }}">{{ post.date | date_to_string }}</abbr>
      </span>
      <p class="alt-font">
        Posted in&nbsp;
        {% for category in post.categories %}
        <a href="/categories/{{ category }}" title="{{ category }}" rel="category tag">{{ category }}</a>
        {% endfor %}
      </p>
      <p class="comments-link">
        <a href='{{post.url}}#disqus_thread'>Comments</a>
      </p>
    </div>
  </li>
{% endfor %}
</ol>

4. monthly archive page, add a layout archive_monthly.html

---
layout: default
---

<h1>{{ page.month | to_month }} {{ page.year }}</h1>
<ol class="archive">
  {% for d in (1..31) reversed %}
    {% if site.collated_posts[page.year][page.month][d] %}
      {% for post in site.collated_posts[page.year][page.month][d] reversed %}
      <li>
        <div class="excerpt">
          <strong class="entry-title">
            <a href="{{ post.url }}" title="{{ post.title }}" rel="bookmark">{{ post.title }}</a>
          </strong>
          <span class="date small">
            <abbr class="published" title="{{ post.date }}">{{ post.date | date_to_string }}</abbr>
          </span>
          <p class="alt-font">
            Posted in&nbsp;
            {% for category in post.categories %}
            <a href="/categories/{{ category }}" title="{{ category }}" rel="category tag">{{ category }}</a>
            {% endfor %}
          </p>
          <p class="comments-link">
            <a href='{{post.url}}#disqus_thread'>Comments</a>
          </p>
        </div>
      </li>
      {% endfor %}
    {% endif %}
  {% endfor %}
</ol>

5. comments. Hmmm... it's impossible for jekyll itself to provide comments, but I guess you know disqus, a web service that provides an online comment system. After you create a forum on disqus, you get two javascript snippets: one for posting and displaying comments, the other for displaying the comment count of each post. The following is the javascript to post and display comments.

<div id="disqus_thread"></div>
<script type="text/javascript">
  var disqus_shortname = 'richard-huang';

  var disqus_url = "http://www.huangzhimin.com/2011/01/20/migrate-custom-blog-to-jekyll-and-disqus/";

  (function() {
      var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
      dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
      (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
  })();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="http://disqus.com" class="dsq-brlink">blog comments powered by <span class="logo-disqus">Disqus</span></a>

And the javascript to display the comment count.

<script type="text/javascript">
  var disqus_shortname = 'richard-huang';

  (function () {
    var s = document.createElement('script'); s.async = true;
    s.type = 'text/javascript';
    s.src = 'http://' + disqus_shortname + '.disqus.com/count.js';
    (document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
  }());
</script>

6. can't display liquid code. I found this limitation while writing this post: it's impossible to display raw liquid code, because liquid always tries to execute every liquid tag. I had to write a custom raw tag to solve this issue.

module Jekyll
  class Raw < Liquid::Block

    def parse(tokens)
      @nodelist ||= []
      @nodelist.clear

      while token = tokens.shift
        case token
        when IsTag
          if token =~ FullToken
            if block_delimiter == $1
              end_tag
              return
            end
            @nodelist << token
          else
            raise SyntaxError, "Tag '#{token}' was not properly terminated with regexp: #{TagEnd.inspect} "
          end
        else
          @nodelist << token
        end
      end

      # Make sure that it's ok to end parsing in the current block.
      # Effectively this method will throw an exception unless the current block is
      # of type Document
      assert_missing_delimitation!
    end
  end
end

Liquid::Template.register_tag('raw', Jekyll::Raw)

So you can use the raw tag to escape any liquid code you want.

You can get the source code of my blog system on github.

Migrate Legacy Data

OK, the new blog system is complete, but what about the old blog posts and comments? I want to migrate them to the new system.

I'm a developer, so it's not too difficult for me to migrate the old data.

Migrate old posts

As in most blog systems, my old posts were saved as html. After working on several projects on github, I came to love markdown, so I decided to convert all the old html posts to markdown. There is a project named reverse-markdown that does this job. I forked it to handle code highlighting (before, I used syntaxhighlighter; now it is {% highlight language %}...{% endhighlight %}); here is the script.

Then I began to migrate the old posts.

require 'rtranslate'

Post.all.each do |post|
  dir = "temp"
  translated_title = Translate.t(post.title, 'CHINESE_SIMPLIFIED', 'ENGLISH')
  filename = post.created_at.strftime("%Y-%m-%d") + "-" + translated_title.parameterize
  File.open("#{dir}/#{filename}.markdown", "w+") do |file|
    file.puts <<-EOF
---
layout: post
title: #{post.title.gsub("&#65281;", "!").gsub("&#65292;", ",")}
categories:
- #{Translate.t(post.category.name, 'CHINESE_SIMPLIFIED', 'ENGLISH')}
---
#{ReverseMarkdown.new.parse_string(post.body)}
    EOF
  end
end

I ran the code above in the rails console. It translates the post title and category name from Chinese to English, converts the post body from html to markdown, and saves each post under the temp directory.

After running the code, there were a lot of posts generated under the temp directory. I just copied them to the _posts directory of the new blog system, and the post migration was complete. Cool!
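
The file naming and front matter can be tried outside rails with a small sketch like the one below. The helper names slugify, jekyll_filename and jekyll_document are hypothetical, and slugify is only a rough plain-Ruby approximation of what parameterize does for simple English titles; it assumes the title has already been translated.

```ruby
require 'date'

# Hypothetical helper: rough approximation of Rails' String#parameterize
# for plain English titles.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

# Builds the file name jekyll expects under _posts: YYYY-MM-DD-slug.markdown
def jekyll_filename(created_at, english_title)
  "#{created_at.strftime('%Y-%m-%d')}-#{slugify(english_title)}.markdown"
end

# Builds the YAML front matter followed by the post body.
def jekyll_document(title, category, body)
  [
    "---",
    "layout: post",
    "title: #{title}",
    "categories:",
    "- #{category}",
    "---",
    body
  ].join("\n") + "\n"
end

name = jekyll_filename(Date.new(2011, 1, 20), "Migrate Custom Blog to Jekyll and Disqus")
text = jekyll_document("Migrate Custom Blog to Jekyll and Disqus", "jekyll", "Hello, Jekyll.")
```

The date prefix in the file name is what jekyll turns into the year/month/day part of the post url.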

Migrate old comments

Migrating comments is a bit more difficult; it took me a few days of playing with the disqus api. Luckily, disqus provides an api console, which I really like.

The following code is what I used to migrate comments to disqus.

require 'rubygems'
require 'rest_client'
require 'json'
require 'open-uri'

disqus_url = 'http://disqus.com/api/3.0'

secret_key = 'your secret key'
current_blog_base_url = 'http://www.huangzhimin.com'

resource = RestClient::Resource.new disqus_url

forum_id = 'richard-huang'

Comment.all.each do |comment|
  translated_title = Translate.t(comment.post.title, 'CHINESE_SIMPLIFIED', 'ENGLISH')

  filename = comment.post.created_at.strftime("%Y/%m/%d") + "/" + translated_title.parameterize
  post_url = "#{current_blog_base_url}/#{filename}/"
  title = "Richard Huang - #{comment.post.title}"

  begin
    open(post_url)

    thread_id = nil
    JSON.parse(resource['/threads/list.json?api_secret='+secret_key+'&forum='+forum_id].get)["response"].each do |thread|
      thread_id = thread["id"] if thread["link"] == post_url
    end

    unless thread_id
      request_body = {:forum => forum_id, :title => title, :url => post_url}
      thread = JSON.parse(resource['/threads/create.json?api_secret='+secret_key].post(request_body))["response"]
      thread_id = thread["id"]
    end

    request_body = {:thread => thread_id, :message => comment.body.strip, :author_name => comment.author, :date => comment.created_at.to_i}
    request_body.merge!(:author_email => comment.mail.blank? ? "anonymous@gmail.com" : comment.mail)
    request_body.merge!(:author_url => comment.website) if comment.website.present?
    if JSON.parse(resource['/posts/create.json?api_secret='+secret_key].post(request_body))["code"] == 0
      puts "Success: #{comment.author} on #{comment.post.title}"
    else
      puts "FAIL: #{comment.author} on #{comment.post.title}"
    end
  rescue
    puts "Rescue: #{post_url}"
  end
end

The code above is also run in the rails console. It works as follows.

  1. check whether the new post url exists.
  2. if so, read or create a thread; one thread on disqus corresponds to one post url in the blog system.
  3. then create a post on disqus; one post on disqus corresponds to one comment in the blog system.

There is one problem: in disqus, the email of a comment author can't be empty, but in my old blog system it could be, so I had to use "anonymous@gmail.com" instead. This is the only limitation I hit when migrating old comments.
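
The email fallback can be isolated into a small, testable piece. The sketch below is a plain-Ruby approximation that runs without rails: OldComment is a hypothetical stand-in for the old Comment model, blank? is a rough substitute for the rails method of the same name, and the thread id is a made-up value.

```ruby
# Hypothetical stand-in for the old blog's Comment model.
OldComment = Struct.new(:body, :author, :mail, :website, :created_at)

FALLBACK_EMAIL = "anonymous@gmail.com"

# True for nil, empty, or whitespace-only strings
# (a rough substitute for Rails' blank?).
def blank?(value)
  value.nil? || value.strip.empty?
end

# Builds the parameters for one disqus post from one old comment.
def disqus_post_params(comment, thread_id)
  params = {
    :thread       => thread_id,
    :message      => comment.body.strip,
    :author_name  => comment.author,
    :date         => comment.created_at.to_i,
    :author_email => blank?(comment.mail) ? FALLBACK_EMAIL : comment.mail
  }
  # The author url is optional, so it is only sent when present.
  params[:author_url] = comment.website unless blank?(comment.website)
  params
end

anonymous = OldComment.new(" Nice post! ", "guest", "", nil, Time.at(1295481600))
params = disqus_post_params(anonymous, "12345") # "12345" is a made-up thread id
```

Keeping the fallback in one place makes it easy to spot the migrated comments later, since they all share the same placeholder address.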

Everything works well. I love my new blog system.

Posted in  jekyll disqus


Construct Nested Hash in Ruby

I just received a post request on rails-bestpractices.com from hlxwell, who recommended "Nested hash simple initialization."

Change From

cache_data = {}
cache_data['a'] ||= {}
cache_data['a']['b'] ||= {}
cache_data['a']['b']['c'] ||= {}
cache_data['a']['b']['c']['d'] ||= {}
cache_data['a']['b']['c']['d'] = something...

To

cache_data = Hash.new { |h1,k1| h1[k1] = Hash.new { |h2,k2| h2[k2] = Hash.new { |h3,k3| h3[k3] = Hash.new { |h4,k4| h4[k4] = {} } } } }
cache_data['a']['b']['c']['d'] = something...

Frankly speaking, I don't agree with him.

  1. I don't think he needs a hash nested that deeply; he may want to reconsider the design of his data structure.

  2. If he really needs such a nested hash, he should use this more graceful way instead:

leet = lambda {|hash, key| hash[key] = Hash.new(&leet)}
cache_data = Hash.new(&leet)
cache_data['a']['b']['c']['d'] = something..
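
A variation on the same idea, shown here only as a sketch, reuses the hash's own default_proc so that no separate lambda variable is needed. It also demonstrates one caveat of any auto-creating hash: merely reading a missing key creates it. The variable names are illustrative.

```ruby
# Each missing key gets a new hash that shares the same default block,
# so nesting works to any depth.
nested = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

nested['a']['b']['c']['d'] = 42

# Caveat: merely reading a missing key also creates it.
nested['typo']['x']
created_by_read = nested.key?('typo')

# Hash#key? does not trigger the default block, so it is a safe way to probe.
untouched = nested.key?('other')
```

Because of that caveat, a typo in a key silently adds an empty branch instead of raising an error or returning nil.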

Posted in  ruby

