+[[!meta title="Mirroring MediaWiki with Git-Mediawiki and gitolite"]]
+[[!meta author="rohieb"]]
+[[!meta license="CC-BY-SA 3.0"]]
+
+From Murphy’s Law we can deduce that Internet failures always come when you
+least expect them. In my case, the [Stratum 0 wiki][s0wiki] was offline for a
+few minutes (only, thankfully!) when I really urgently(1!11) needed to look
+something up there. If only I had an offline clone of the wiki…
+
+[s0wiki]: https://stratum0.org/wiki/ "Stratum 0 wiki"
+
+
+Enter: Git-Mediawiki
+--------------------
+
+I had previously discovered [Git-Mediawiki][], which lets you mirror some
+or all pages of a MediaWiki instance to a local Git repository. It achieves
+this by implementing the `mediawiki::` remote helper, which lets you configure
+the URL of the remote MediaWiki instance as a Git remote, and loads the raw
+revisions from the MediaWiki API every time you do a `git fetch`:
+
+    $ git clone mediawiki::https://stratum0.org/mediawiki/
+    Cloning into 'mediawiki'...
+    Searching revisions...
+    No previous mediawiki revision found, fetching from beginning.
+    Fetching & writing export data by pages...
+    Listing pages on remote wiki...
+    6 pages found.
+    page 1/78: Vorstand
+    Found 2 revision(s).
+    page 2/78: Atomuhr
+    Found 15 revision(s).
+    page 3/78: Corporate Identity
+    Found 6 revision(s).
+    page 4/78: Presse
+    Found 2 revision(s).
+    [...]
+    1/804: Revision #738 of Presse
+    2/804: Revision #3036 of Atomuhr
+    3/804: Revision #3053 of Atomuhr
+    4/804: Revision #3054 of Atomuhr
+    [...]
+    Checking connectivity... done.
+
+Needless to say, this can take a very long time if you try to import a whole
+wiki (say, Wikipedia (NO, DON’T ACTUALLY DO THIS! (or at least don’t tell them I
+told you how))), but you [can also][gmw-partialimport] import only single pages
+or pages from certain categories with the `-c remote.origin.pages=<page list>`
+and `-c remote.origin.categories=<category list>` options to `git clone`.
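+
+As a rough sketch, a partial clone might look like one of the following. The
+page names are taken from the log above; the space-separated list format is my
+reading of the linked manual, and the category name and target directories are
+purely hypothetical, so double-check them before relying on this:
+
+    $ git clone -c remote.origin.pages='Vorstand Atomuhr' \
+          mediawiki::https://stratum0.org/mediawiki/ s0-pages
+    $ git clone -c remote.origin.categories='SomeCategory' \
+          mediawiki::https://stratum0.org/mediawiki/ s0-category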
+
+After the clone has finished, you can view the raw MediaWiki source files of
+the pages you imported from your computer. You can even edit them and push the
+changes back to the wiki if you [configure your wiki user account][gmw-auth] in
+your Git config!
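+
+A minimal sketch of that configuration, assuming the `mwLogin` and `mwPassword`
+keys described in the linked manual (`WikiUser` and the password are
+placeholders, and the password ends up in plain text in `.git/config`):
+
+    $ git config remote.origin.mwLogin 'WikiUser'
+    $ git config remote.origin.mwPassword 'not-my-real-password'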
+
+[Git-Mediawiki]: https://github.com/moy/Git-Mediawiki "Git-Mediawiki on GitHub"
+[gmw-partialimport]: https://github.com/moy/Git-Mediawiki/wiki/User-manual#partial-import-of-a-wiki "Git-Mediawiki: Partial imports"
+[gmw-auth]: https://github.com/moy/Git-Mediawiki/wiki/User-manual#authentication "Git-Mediawiki: Authentication"
+
+Since I had already played around with Git-Mediawiki, I had a local mirror of
+the Stratum 0 wiki on my laptop. Unfortunately, I had not pulled for a few
+weeks, and the information I needed was only added to the wiki some days ago. So
+for the future, it would be nice to have an automatically synchronising mirror…
+And not only one on my personal laptop, but also for other interested users,
+at least read-only.
+
+
+Mirroring repositories with gitolite
+------------------------------------
+
+The best solution for me was a mirror repository on my own server, which was
+already running [gitolite][], the popular Git hosting solution. I would simply
+add a read-only repository in gitolite and let a cron job take care of
+automatic synchronisation.
+
+[gitolite]: http://gitolite.com/gitolite/index.html "gitolite main page"
+
+Creating the new repository was easy: you simply add a line to your
+`gitolite.conf`, and when you push the changes, gitolite creates the repository
+for you. But I also wanted to configure the MediaWiki remote directly in my
+repository setup, for which I needed to specify the corresponding `remote`
+options for the Git configuration. [The appropriate setting to allow
+this][gitolite-config-keys] is in `.gitolite.rc` (gitolite’s main configuration
+file, which resides in the gitolite base directory, `/home/git/` in my case):
+you simply add the Git config options you want to set from `gitolite.conf`
+to the `$GL_GITCONFIG_KEYS` variable. Mine now looks like this:
+
+    $GL_GITCONFIG_KEYS = "remote\.* gitweb\.owner gitweb\.description";
+
+[gitolite-config-keys]: http://gitolite.com/gitolite/g2/rc.html#rcsecurity "configuring gitolite's advanced features -- the .gitolite.rc file: variables with a security impact"
+
+Now I could easily add the corresponding options to my repository setup:
+
+    repo stratum0-wiki
+        config gitweb.description = "Read-only Git mirror of the Stratum 0 wiki"
+        config remote.origin.url = "mediawiki::https://stratum0.org/mediawiki"
+        config remote.origin.fetch = "+refs/heads/*:refs/remotes/origin/*"
+        config remote.origin.fetchstrategy = "by_rev"
+        RW+ = rohieb
+        R   = @all daemon gitweb
+
+Note that I let Git-Mediawiki work with the `by_rev` fetch strategy, which
+queries the MediaWiki API for all recent revisions rather than first looking for
+changed pages and then fetching the revisions accordingly. This is more
+efficient since I want to import every revision anyway. I also found out
+the hard way (i.e. through print debugging) that adding the
+`remote.origin.fetch` option is critical for Git-Mediawiki to work correctly.
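+
+For comparison, the same three settings could presumably be applied by hand in
+a plain repository that is not managed by gitolite (a sketch; the key names
+and values are taken from the `gitolite.conf` snippet above):
+
+    $ git config remote.origin.url 'mediawiki::https://stratum0.org/mediawiki'
+    $ git config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'
+    $ git config remote.origin.fetchstrategy 'by_rev'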
+
+Then I used `crontab -e` to create a simple cron job for the `git` user (which
+owns all the gitolite repositories) that updates the mirror every 30
+minutes:
+
+    # m  h dom mon dow command
+    */30 * *   *   *   /home/git/update-stratum0-mediawiki-mirror
+
+The script which does all the work resides in
+`/home/git/update-stratum0-mediawiki-mirror`:
+
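+[[!format sh <<EOF
+#!/bin/sh
+# Update the bare gitolite mirror from its MediaWiki remote.
+if [ "$(whoami)" != "git" ]; then
+    echo "fatal: run as user 'git'." >&2
+    exit 1
+fi
+
+# Abort if the repository is missing, so git never runs in the wrong directory.
+cd /home/git/git/stratum0-wiki.git/ || exit 1
+
+# Show only problems, so the job stays quiet when everything went well.
+git fetch 2>&1 | grep -iE 'fatal|error|warn'
+# Point master at the fetched history; a bare repo has no working tree to merge in.
+git update-ref refs/heads/master refs/mediawiki/origin/master
+EOF]]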
+
+Note that we cannot simply `git-merge` the master branch here, because the
+gitolite repository is a bare repo and `git-merge` needs a working tree.
+Therefore, we only fetch new revisions from our MediaWiki remote (which fetches
+to `refs/mediawiki/origin/master`), and update the master branch manually. Since
+the mirror is read-only and there are no real merges to be done, this is
+sufficient here.
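+
+A quick way to check the result by hand, run inside the bare repository, is to
+compare the two refs; after a successful run of the script, both output lines
+should show the same commit ID:
+
+    $ cd /home/git/git/stratum0-wiki.git/
+    $ git rev-parse refs/heads/master refs/mediawiki/origin/master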
+
+So far, we have a fully working mirror. But since the Stratum 0 wiki has grown
+to more than 7000 revisions to date, the initial fetch would take a while. To
+reduce the load on the MediaWiki API, I figured that I could reuse my existing
+repository on my laptop.
+
+
+Re-using a previous Git-Mediawiki repo
+--------------------------------------
+
+So before activating the cron job, I pushed my existing repository to the mirror:
+
+    ~/stratum0-wiki$ git push rohieb.name master
+    ~/stratum0-wiki$ git push rohieb.name refs/mediawiki/origin/master
+
+A test run of the mirror script, however, was not happy with that and wanted to
+fetch ALL THE revisions anyway. It took me another while to find out that, for
+efficiency reasons, Git-Mediawiki stores the MediaWiki revision ID of each
+commit in [Git notes][git-notes] under `refs/notes/origin/mediawiki`. For
+example:
+
+    $ git log --notes=refs/notes/origin/mediawiki
+    commit 7e486fa8a463ebdd177e92689e45f756c05d232f
+    Author: Daniel Bohrer <Daniel Bohrer@stratum0.org/mediawiki>
+    Date:   Sat Mar 15 14:42:09 2014 +0000
+
+        /* Talks am Freitag, 14. März 2014, 19:00 */ format, youtube-links
+
+    Notes (origin/mediawiki):
+        mediawiki_revision: 7444
+
+    [...]
+
+So after I also pushed `refs/notes/origin/mediawiki` to the mirror repo,
+everything was fine and the cron job only fetched a small number of new
+revisions.
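+
+Following the pattern of the two earlier pushes, that third push would
+presumably look like this (a reconstruction, not a verbatim command):
+
+    ~/stratum0-wiki$ git push rohieb.name refs/notes/origin/mediawiki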
+
+[git-notes]: http://git-scm.com/docs/git-notes "git-notes(1) Manual Page"
+
+
+Conclusion
+----------
+
+To conclude this post, I now have a working MediaWiki mirror for the Stratum 0
+wiki, which uses a cron job and Git-Mediawiki to fetch new revisions every 30
+minutes, integrated with gitolite for hosting. If you also want to keep track of
+changes in the wiki and have an offline mirror for reference, feel free to pull
+from [git://git.rohieb.name/stratum0-wiki.git][git-stratum0-wiki].
+
+[git-stratum0-wiki]: http://git.rohieb.name/stratum0-wiki.git "gitweb: stratum0-wiki.git summary"
+
+[[!tag Git gitify gitolite git-notes MediaWiki mirror]]