Creating a repository
=====================

:Author: Edward Z. Yang <ezyang@mit.edu>

.. highlight:: sh

Adding Wizard support for a web application requires some glue code
and a specially prepared repository.  Creating a new repository for an
application in Wizard involves placing :term:`pristine` versions of the source
code (from the upstream tarballs) and appropriately patched scripts versions
into a Git repository, as well as writing a :mod:`wizard.app` module for the
application that implements application specific logic, such as how to install,
upgrade or backup the installation.

Here is a tutorial for creating such a repository, using an old version of
Wordpress as an example.  We will implement only the functions necessary for
installing an application--upgrades and backups are not described here.
We assume that you have a working setup of Wizard; consult the
:doc:`setup documentation <setup>` for more details.

From this point on, we will assume you are doing development from an AFS directory
named ``$WIZARD``; note that application repositories live in ``$WIZARD/srv``.

.. supplement:: Conversions

    One of Wizard's goals is to replace the previous autoinstaller
    infrastructure.  These boxes will explain extra steps that you must perform
    in order to carry out a conversion of old-style autoinstalls to a Wizard
    autoinstall.  In brief, pre-wizard autoinstalls live in
    :file:`/mit/scripts/deploy` and consist of a tarball from upstream,
    possibly a scripts patch, and possibly some post-install munging (such as
    the creation of a :file:`php.ini` file and appropriate symlinks).
    Performing a conversion means that we will recreate these changes in our
    Wizard autoinstall, and you will start your repository with the *earliest*
    version of the application extant on our servers.

Pristine
--------

This is a tutorial centered around creating a `Wordpress <http://wordpress.org/>`_
repository.  It assumes you have an upstream; if you do not,
you can skip most of these steps: just ensure you have a Git repository
which contains a :term:`pristine` and :term:`master` branch,
as well as tags for all of the releases in the form ``appname-1.2.3``
and ``appname-1.2.3-scripts``.
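
As a quick illustration of the tag naming convention, here is a minimal
sketch that checks whether a tag fits either of the two forms.  The helper
``parse_tag`` and its regular expression are hypothetical and not part of
Wizard; they assume the application name contains no dashes and the version
is a dotted sequence of numbers.

.. code-block:: python

    import re

    # Hypothetical helper (not part of Wizard): recognizes the two tag
    # forms described above, "appname-1.2.3" and "appname-1.2.3-scripts".
    TAG_RE = re.compile(
        r"^(?P<app>[A-Za-z][A-Za-z0-9_]*)"
        r"-(?P<version>\d+(?:\.\d+)*)"
        r"(?P<scripts>-scripts)?$"
    )

    def parse_tag(tag):
        """Return (app, version, is_scripts), or None if the tag does not fit."""
        match = TAG_RE.match(tag)
        if match is None:
            return None
        return (match.group("app"),
                match.group("version"),
                match.group("scripts") is not None)

    print(parse_tag("wordpress-2.0.2"))          # ('wordpress', '2.0.2', False)
    print(parse_tag("wordpress-2.0.2-scripts"))  # ('wordpress', '2.0.2', True)
    print(parse_tag("v2.0.2"))                   # None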

For the sake of demonstration, we shall assume that this repository
hasn't been created yet, so we need to create it::

    cd "$WIZARD/srv"
    mkdir wordpress
    cd wordpress
    git init

We also have to create a module for the application, so we
create :file:`$WIZARD/wizard/app/wordpress.py` and fill it in with a bare bones template:

.. code-block:: python

    import os
    import re
    import logging
    import distutils

    from wizard import app, install, resolve, sql, util
    from wizard.app import php

    class Application(app.Application):
        pass

Finally, we have to tell Wizard about this new module.  If you are
creating this new module for Scripts, the easiest way to tell Wizard
about the application is to add it to the :mod:`wizard_scripts`
`setuptools plugin <http://aroberge.blogspot.com/2008/12/plugins-part-6-setuptools-based.html>`_.
Even if you don't know anything about setuptools, it's pretty easy
to add your application: edit the file  :file:`plugins/scripts/setup.py`
and add your application to the ``wizard.app`` entry point by looking
for the following chunk of code and adding a new line::

    'wizard.app': ['mediawiki = wizard.app.mediawiki:Application',
                   'phpBB = wizard.app.phpBB:Application',
                   'wordpress = wizard.app.wordpress:Application', # NEW LINE!
                  ],

This tells Wizard that there is a new application named ``wordpress``,
with a module named ``wizard.app.wordpress`` and a class named
``Application`` in that module, which Wizard should use.
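
To make the structure of that entry point string concrete, here is a small
sketch that splits it into its three parts.  The function
``parse_entry_point`` is purely illustrative; setuptools does the real
parsing, and Wizard does not contain this helper.

.. code-block:: python

    def parse_entry_point(spec):
        """Split 'name = module:attribute' into (name, module, attribute)."""
        name, _, target = spec.partition("=")
        module, _, attribute = target.partition(":")
        return name.strip(), module.strip(), attribute.strip()

    spec = "wordpress = wizard.app.wordpress:Application"
    name, module, attribute = parse_entry_point(spec)
    print("application name: %s" % name)       # wordpress
    print("module:           %s" % module)     # wizard.app.wordpress
    print("class:            %s" % attribute)  # Application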

You need to refresh plugin information by running the :file:`refresh.sh`
script or by running :file:`python setup.py egg_info` in the
:file:`plugins/scripts` directory.

.. note::

    If you do not want to place your application in the Scripts plugin,
    you will need to create a :file:`setup.py` file from scratch in your
    own plugin.  A reasonable template file is::

        import setuptools

        setuptools.setup(
            name = 'wizard-myapp',
            version = '0.1.dev',
            author = 'Me',
            author_email = 'my-email@mit.edu',
            description = ('My Awesome Application'),
            license = 'My Awesome License',
            url = 'http://www.example.com/',
            packages = setuptools.find_packages(),
            entry_points = {
                'wizard.app': ['wordpress = wizard.app.wordpress:Application',
                              ],
            }
        )

    Don't forget to run :file:`python setup.py egg_info` and add your module
    to your :envvar:`PYTHONPATH` (otherwise, Wizard won't know that
    your plugin exists!)

Now we are ready to put some code in our repository: the first thing we will
add is the :term:`pristine` branch, which contains verbatim the code from upstream.

.. supplement:: Conversions

    If we were starting a new autoinstaller, we'd pop off and use the latest version,
    but since we're dealing with legacy we want to start our repository history
    with the *oldest* version still extant on our servers.  To find this out run::

        wizard summary version APP

    You'll need to be in the ``scripts-team`` list to have access rights to the
    folder we store this information in: :file:`/mit/scripts/sec-tools/store/versions`.

For the purposes of demonstration, we'll use Wordpress 2.0.2; in reality you
should use the latest version.  Try running the following commands::

    cd "$WIZARD/srv/wordpress"
    wizard prepare-pristine wordpress-2.0.2

You should get an error complaining about :meth:`wizard.app.Application.download`
not being implemented yet. Let's fix that:

.. code-block:: python

    class Application(app.Application):
        # ...
        def download(self, version):
            return "http://wordpress.org/wordpress-%s.tar.gz" % version

We determined this by finding `Wordpress's Release Archive <http://wordpress.org/download/release-archive/>`_
and inferring the naming scheme by inspecting various links.  You should now
be able to run the prepare-pristine command successfully: when it is
done, you'll now have a bunch of files in your repository, and they
will be ready to be committed.  Inspect the files and commit (note that the
format of the commit message is a plain Appname Version.Number)::

    git status
    git commit -asm "Wordpress 2.0.2"
    git tag wordpress-2.0.2

.. note::

    Sometimes, ``http://wordpress.org/wordpress-2.0.2.tar.gz`` won't
    actually exist anymore (it didn't exist when we did it).  In this case,
    you'll probably be able to find the original tarball in
    :file:`/mit/scripts/deploy/wordpress-2.0.2`, and you can feed it
    manually to prepare pristine with
    ``wizard prepare-pristine /mit/scripts/deploy/wordpress-2.0.2/wordpress-2.0.2.tar.gz``

Some last house-keeping bits:  now that you have a commit in a repository, you
can also create a pristine branch::

    git branch pristine

Scriptsify
----------

In a perfect world, the pristine version would be equivalent to the scriptsified
version that would actually get deployed.  However, we have historically needed
to apply patches and add extra configuration files to get applications to
work correctly.  Due to the way Git's merge algorithm works, the closer we are
able to reconstruct a version of the application that was actually used, the
better off we will be when we try to subsequently upgrade those applications.

First things first: verify that we are on the master branch::

    git checkout master

.. supplement:: Conversions

    Check for pre-existing patches in the old application directory,
    :file:`/mit/scripts/deploy/wordpress-2.0.2` in the case of Wordpress,
    and apply them::

        patch -p0 < /mit/scripts/deploy/wordpress-2.0.2/wordpress.patch

If you are running a PHP application, you'll need to set up
a :file:`php.ini` and symlinks to it in all subdirectories.
As of November 2009, all PHP applications load the same :file:`php.ini` file;
so just grab one from another of the PHP projects.  We'll rob our own
crib in this case::

    cp /mit/scripts/deploy/php.ini/wordpress php.ini
    athrun scripts fix-php-ini
    git add .

Now commit, but don't get too attached to your commit; we're going
to be heavily modifying it soon::

    git commit -asm "Wordpress 2.0.2-scripts"

Installation
------------

We now need to make it possible for a user to install the application.
The :meth:`~wizard.app.Application.install` method should take the
application from a just cloned working copy into a fully functioning web
application with configuration and a working database, etc.  Most web
applications have a number of web scripts for generating a configuration
file, so creating the install script tends to involve:


    1. Deleting any placeholder files that were in the repository (there
       aren't any now, but there will be soon.)

    2. Determining what input values you will need from the user, such
       as a title for the new application or database credentials; more
       on this shortly.

    3. Determining what POST values need to be sent to what URLs or to
       what shell scripts (these are the install scripts the application
       may have supplied to you.)

.. supplement:: Conversions

    Since you're converting a repository, this job is even simpler: you
    just need to port the Perl script that was originally used into
    Python.

There's an in-depth explanation of named input values in
:mod:`wizard.install`.  The short version is that your application
contains a class-wide :data:`~wizard.app.Application.install_schema`
attribute that encodes this information.  Instantiate it with
:class:`wizard.install.ArgSchema` (passing in arguments to get
some pre-canned input values), and then add application specific
arguments by passing instances of :class:`wizard.install.Arg`
to the method :meth:`~wizard.install.ArgSchema.add`.  Usually you should
be able to get away with pre-canned attributes.  You can access
these arguments inside :meth:`~wizard.app.Application.install` via
the ``options`` value.

In particular, ``options.dsn`` is a :class:`sqlalchemy.engine.url.URL`
with attributes such as :attr:`~sqlalchemy.engine.url.URL.username`,
:attr:`~sqlalchemy.engine.url.URL.password`, :attr:`~sqlalchemy.engine.url.URL.host` and
:attr:`~sqlalchemy.engine.url.URL.database`, whose values you can pass along in the POST data.
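
The following sketch shows the general idea of turning those values into a
POST dictionary.  To stay self-contained it uses a stand-in object
(``FakeDsn``) that only mimics the four attributes named above, and the form
field names (``dbhost``, ``dbname`` and so on) are invented for illustration;
the real names depend on the application's own installer pages.

.. code-block:: python

    from collections import namedtuple

    # Stand-in for options.dsn: only the four attributes mentioned above.
    FakeDsn = namedtuple("FakeDsn", ["username", "password", "host", "database"])

    def build_post(dsn, title):
        """Build a POST dictionary; the field names here are hypothetical."""
        return {
            "dbhost": dsn.host,
            "dbname": dsn.database,
            "dbuser": dsn.username,
            "dbpass": dsn.password,
            "title": title,
        }

    dsn = FakeDsn(username="alice", password="s3cret",
                  host="sql.mit.edu", database="alice+blog")
    post = build_post(dsn, "My Blog")
    print(post["dbhost"])  # sql.mit.edu
    print(post["dbname"])  # alice+blog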

Some tips and tricks for writing :meth:`wizard.app.Application.install`:

    * Some configuration file generators will get unhappy if the
      target directory is not chmod'ed to be writable; dropping
      in a ``os.chmod(dir, 0777)`` and then undoing the chmod
      when you're done is a decent workaround.

    * :func:`wizard.install.fetch` is the standard workhorse for making
      requests to applications.  It accepts three parameters; the first
      is ``options`` (which was the third argument to ``install`` itself),
      the second is the page to query, relative to the installation's
      web root, and ``post`` is a dictionary of keys to values to POST.

    * You should log any web page output using :func:`logging.debug`.

    * If you need to manually manipulate the database afterwards, you
      can use :func:`wizard.sql.connect` (passing it ``options.dsn``)
      to get a `SQLAlchemy metadata object
      <http://www.sqlalchemy.org/docs/05/sqlexpression.html>`_, which can
      consequently be queried.  For convenience, we've bound metadata
      to the connection, so you can perform implicit execution.
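
The chmod workaround from the first tip can be wrapped up so that the
original mode is always restored, even if the installer request fails.  This
is a minimal sketch; ``temporarily_writable`` is a hypothetical helper, not
something Wizard provides, and the demonstration uses a temporary directory
in place of a real deployment.

.. code-block:: python

    import contextlib
    import os
    import stat
    import tempfile

    @contextlib.contextmanager
    def temporarily_writable(path, mode=0o777):
        """Chmod path to mode for the duration of the block, then restore it."""
        original = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode)
        try:
            yield
        finally:
            # Runs even if the body raises, so the old mode always comes back.
            os.chmod(path, original)

    demo_dir = tempfile.mkdtemp()
    os.chmod(demo_dir, 0o755)
    with temporarily_writable(demo_dir):
        # This is where the configuration file generator would be invoked.
        inside = stat.S_IMODE(os.stat(demo_dir).st_mode)
    after = stat.S_IMODE(os.stat(demo_dir).st_mode)
    print("%o %o" % (inside, after))
    os.rmdir(demo_dir)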

To test if your installation function works, it's probably convenient to
create a test script in :file:`tests`; :file:`tests/wordpress-install-test.sh`
in the case of Wordpress.  It will look something like::

    #!/bin/bash -e
    cd `dirname $0`

    DEFAULT_HEAD=1
    TESTNAME="wordpress_install"
    source ./setup

    wizard install "wordpress-$VERSION-scripts" "$TESTDIR" --non-interactive -- --title="My Blog"

.. note::

    As you develop more test-scripts, you may find that you are
    frequently copy pasting install commands.  In this case, it may be
    useful to create a 'wordpress-install' helper shell fragment and
    source it whenever you need a vanilla installation.

``DEFAULT_HEAD=1`` indicates that this script can perform a reasonable
operation without any version specified (since we haven't tagged any of our
commits yet, we can't use the specific version functionality; not that we'd want
to, though).  ``TESTNAME`` is simply the name of the file with the trailing
``-test`` stripped and dashes converted to underscores.  Run the script with
verbose debugging information by using::

    env WIZARD_DEBUG=1 ./wordpress-install-test.sh
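
The ``TESTNAME`` rule described above is mechanical, so it can be sketched
as a small function.  ``testname_for`` is a hypothetical helper written for
this tutorial; the test harness itself expects you to set ``TESTNAME`` by
hand.

.. code-block:: python

    import os

    def testname_for(script_path):
        """Derive TESTNAME from a test script's file name."""
        stem, _ = os.path.splitext(os.path.basename(script_path))  # drop ".sh"
        if stem.endswith("-test"):
            stem = stem[:-len("-test")]                            # drop "-test"
        return stem.replace("-", "_")                              # dashes to underscores

    print(testname_for("tests/wordpress-install-test.sh"))  # wordpress_install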

The test scripts will try to conserve databases by running ``wizard remove`` on the
old directory, but this requires :meth:`~wizard.app.Application.remove` to be implemented.
Most of the time (namely, for single database setups), this simple template will suffice:

.. code-block:: python

    class Application(app.Application):
        # ...
        def remove(self, deployment):
            app.remove_database(deployment)

Versioning config
-----------------

A design decision that was made early on during Wizard's development was that
the scriptsified versions would contain generic copies of the configuration
files.  You're going to generate this generic copy now and, in doing so,
overwrite your previous scripts commit.  Because some installers
exhibit different behavior depending on server configuration, you should run
the installation on a Scripts server.  You can do this manually or use
the test script you created::

    env WIZARD_NO_COMMIT=1 ./wordpress-install-test.sh

:envvar:`WIZARD_NO_COMMIT` (command line equivalent to ``--no-commit``)
prevents the installer from generating a Git commit after the install, and will
make it easier for us to propagate the change back to the parent repository.

Change into the generated directory and look at the changes the installer made::

    git status

There are probably now a few unversioned files lounging around; these are most likely
the configuration files that the installer generated.

You will now need to implement the following data attributes and methods in your
:class:`~wizard.app.Application` class: :attr:`~wizard.app.Application.extractors`,
:attr:`~wizard.app.Application.substitutions`, :attr:`~wizard.app.Application.parametrized_files`,
:meth:`~wizard.app.Application.checkConfig` and :meth:`~wizard.app.Application.detectVersion`.
These are all closely related to the configuration files that the installer generated.

:meth:`~wizard.app.Application.checkConfig` is the most straightforward method to
write: you will usually only need to test for the existence of the configuration file.
Note that this function will always be called with the current working directory being
the deployment, so you can simplify your code accordingly:

.. code-block:: python

    class Application(app.Application):
        # ...
        def checkConfig(self, deployment):
            return os.path.isfile("wp-config.php")

:meth:`~wizard.app.Application.detectVersion` should detect the version of the application
by regexing it out of a source file.  We first have to figure out where the version number
is stored: a quick grep tells us that it's in :file:`wp-includes/version.php`:

.. code-block:: php

    <?php

    // This just holds the version number, in a separate file so we can bump it without cluttering the SVN

    $wp_version = '2.0.4';
    $wp_db_version = 3440;

    ?>

We could now grab the :mod:`re` module and start constructing a regex to grab ``2.0.4``, but it
turns out this isn't necessary: :meth:`wizard.app.php.re_var` does this for us already!

With this function in hand, writing a version detection function is pretty straightforward:
we have a helper function that takes a file and a regex, and matches out the version number
for us.

.. code-block:: python

    class Application(app.Application):
        # ...
        def detectVersion(self, deployment):
            return self.detectVersionFromFile("wp-includes/version.php", php.re_var("wp_version"))
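
To get a feel for what happens under the hood, here is a self-contained
sketch of regexing the version out of the PHP source shown earlier.  The
function ``re_var_sketch`` is a hypothetical stand-in written for this
example; it is not the actual implementation of ``php.re_var``, and the real
helper may build its pattern differently.

.. code-block:: python

    import re

    def re_var_sketch(name):
        """Hypothetical stand-in: match a PHP assignment like $name = value;"""
        return re.compile(r"^(\$" + re.escape(name) + r"\s*=\s*)(.*)(;)", re.M)

    source = (
        "<?php\n"
        "$wp_version = '2.0.4';\n"
        "$wp_db_version = 3440;\n"
        "?>\n"
    )

    match = re_var_sketch("wp_version").search(source)
    # The second subpattern holds the value, quotes and all; strip the quotes.
    version = match.group(2).strip("'\"")
    print(version)  # 2.0.4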

:attr:`~wizard.app.Application.parametrized_files` is a simple list of files that the
program's installer wrote or touched during the installation process.

.. code-block:: python

    class Application(app.Application):
        # ...
        parametrized_files = ['wp-config.php']

This is actually a lie: we also need to include changes to :file:`php.ini` that
we made:

.. code-block:: python

    class Application(app.Application):
        # ...
        parametrized_files = ['wp-config.php'] + php.parametrized_files

.. _seed:

And finally, we have :attr:`~wizard.app.Application.extractors` and
:attr:`~wizard.app.Application.substitutions`.  At the bare metal, these
are simply dictionaries of variable names to functions: when you call the
function, it performs either an extraction or a substitution.  However, we can
use higher-level constructs to generate these functions for us.

The magic sauce is a data structure we'll refer to as ``seed``.  Its form is a
dictionary of variable names to a tuple ``(filename, regular expression)``.
The regular expression has a slightly special form (which we mentioned
earlier): it contains three (not two or four) subpatterns; the second
subpattern matches (quotes and all) the value that the regular expression is
actually looking for, and the first and third subpatterns match everything to
the left and right, respectively.

.. note::

    The flanking subpatterns make it easier to use this regular expression
    to perform a substitution: we are then allowed to use ``\1FOOBAR\3`` as
    the replace value.
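
As a sketch of how the three subpatterns are used, the example below performs
one extraction and one substitution with Python's :mod:`re` module.  The
sample configuration text and the database values in it are made up for
illustration.

.. code-block:: python

    import re

    # Three subpatterns: left context, the value itself, right context.
    pattern = re.compile(r"""^(define\('DB_USER', )(.*)(\))""", re.M)

    config = (
        "<?php\n"
        "define('DB_NAME', 'alice+blog');\n"
        "define('DB_USER', 'alice');\n"
    )

    # Extraction: the second subpattern is the value, quotes and all.
    value = pattern.search(config).group(2)
    print(value)  # 'alice'

    # Substitution: keep the flanking text, replace only the value.
    generic = pattern.sub(r"\1WIZARD_DBUSER\3", config)
    print(generic)

After the substitution, the ``DB_USER`` line reads
``define('DB_USER', WIZARD_DBUSER);`` while the ``DB_NAME`` line is untouched.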

If we manually coded ``seed`` out, it might look like:

.. code-block:: python

    seed = {
        'WIZARD_DBSERVER': ('wp-config.php', re.compile(r'''^(define\('DB_HOST', )(.*)(\))''', re.M)),
        'WIZARD_DBNAME': ('wp-config.php', re.compile(r'''^(define\('DB_NAME', )(.*)(\))''', re.M)),
    }

There's a lot of duplication, though.  For one thing, the regular expressions are almost
identical, save for a single substitution within the string.  We have a function
:meth:`wizard.app.php.re_define` that does this for us:

.. code-block:: python

    seed = {
        'WIZARD_DBSERVER': ('wp-config.php', php.re_define('DB_HOST')),
        'WIZARD_DBNAME': ('wp-config.php', php.re_define('DB_NAME')),
    }

.. note::

    If you find yourself needing to define a custom regular expression generation function,
    be sure to use :func:`wizard.app.expand_re`, which will escape an incoming variable
    to be safe for inclusion in a regular expression, and also let you pass a list,
    and have correct behavior.  Check out :mod:`wizard.app.php` for some examples.

    Additionally, if you are implementing a function for another language, or a general pattern of
    variables, consider placing it in an appropriate language module instead.

We can shorten this even further: in most cases, all of the configuration values live in
one file, so let's make ourselves a function that generates the whole tuple:

.. code-block:: python

    def make_filename_regex_define(var):
        return 'wp-config.php', php.re_define(var)

Then we can use :func:`wizard.util.dictmap` to apply this:

.. code-block:: python

    seed = util.dictmap(make_filename_regex_define, {
        'WIZARD_DBSERVER': 'DB_HOST',
        'WIZARD_DBNAME': 'DB_NAME',
        'WIZARD_DBUSER': 'DB_USER',
        'WIZARD_DBPASSWORD': 'DB_PASSWORD',
    })

Short and sweet.  From there, setting up :attr:`~wizard.app.Application.extractors` and
:attr:`~wizard.app.Application.substitutions` is easy:

.. code-block:: python

    class Application(app.Application):
        # ...
        extractors = app.make_extractors(seed)
        extractors.update(php.extractors)
        substitutions = app.make_substitutions(seed)
        substitutions.update(php.substitutions)

Note how we combine our own dictionaries with the dictionaries of :mod:`wizard.app.php`, much like
we did for :attr:`~wizard.app.Application.parametrized_files`.
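
For readers who want to see what the ``seed`` construction amounts to in plain
Python, here is a self-contained sketch.  Both ``dictmap_sketch`` and
``re_define_sketch`` are hypothetical stand-ins: we assume that
``util.dictmap`` applies a function to every value of a dictionary, and that
``php.re_define`` produces a regular expression shaped like the hand-written
ones; the real implementations may differ in detail.

.. code-block:: python

    import re

    def dictmap_sketch(function, mapping):
        """Assumed behaviour of util.dictmap: apply function to every value."""
        return dict((key, function(value)) for key, value in mapping.items())

    def re_define_sketch(name):
        """Hypothetical stand-in for php.re_define, with three subpatterns."""
        return re.compile(r"^(define\('" + re.escape(name) + r"', )(.*)(\))", re.M)

    def make_filename_regex_define(var):
        return "wp-config.php", re_define_sketch(var)

    seed = dictmap_sketch(make_filename_regex_define, {
        "WIZARD_DBSERVER": "DB_HOST",
        "WIZARD_DBNAME": "DB_NAME",
    })

    # Each seed entry is a (filename, regular expression) tuple.
    filename, regex = seed["WIZARD_DBNAME"]
    value = regex.search("define('DB_NAME', 'alice+blog');").group(2)
    print("%s %s" % (filename, value))  # wp-config.php 'alice+blog'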

With all of these pieces in place, run the following command::

    wizard prepare-config

If everything is working, when you open up the configuration files, any user-specific
variables should have been replaced by ``WIZARD_FOOBAR`` variables.  If not, check
your regular expressions, and then try running the command again.

When you are satisfied with your changes, add your files, amend your previous
commit with these changes and force them back into the public repository::

    git status
    git add wp-config.php
    git commit --amend -a
    git push --force

You should test again if your install script works; it probably doesn't,
since you now have a configuration file hanging around.  Use
:func:`wizard.util.soft_unlink` to remove the file at the very beginning
of the install process.
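
The idea behind removing the placeholder file can be sketched as follows.
We assume here that a "soft" unlink means deleting the file if it is there
and silently doing nothing if it is not; ``soft_unlink_sketch`` is our own
stand-in, so check :func:`wizard.util.soft_unlink` itself for the exact
behavior.

.. code-block:: python

    import errno
    import os
    import tempfile

    def soft_unlink_sketch(path):
        """Remove path, ignoring the error if it does not exist (assumed behaviour)."""
        try:
            os.unlink(path)
        except OSError as error:
            if error.errno != errno.ENOENT:
                raise  # anything other than "no such file" is a real problem

    workdir = tempfile.mkdtemp()
    placeholder = os.path.join(workdir, "wp-config.php")
    open(placeholder, "w").close()

    soft_unlink_sketch(placeholder)  # removes the generic configuration file
    soft_unlink_sketch(placeholder)  # a second call is a harmless no-op
    still_there = os.path.exists(placeholder)
    print(still_there)  # False
    os.rmdir(workdir)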

Ending ceremonies
-----------------

Congratulations!  You have just implemented the installation code for a new application.
If you have other copies of the application checked out, you can pull the forced
change by doing::

    git reset --hard HEAD~
    git pull

One last thing to do: after you are sure that your commit is correct, tag the new
commit as ``appname-x.y.z-scripts``, or in this specific case::

    git tag wordpress-2.0.2-scripts
    git push --tags

Summary
-------

Here is a short version for quick reference:

#. Create the new repository and new module,
#. Implement :meth:`~wizard.app.Application.download`,
#. Register the application at the ``wizard_scripts`` plugin,
#. *For Conversions:* Find the oldest extant version with ``wizard summary version $APP``,
#. Run ``wizard prepare-pristine $VERSION``,
#. Commit with ``-m "$APPLICATION $VERSION"`` and tag ``$APP-$VERSION``,
#. Create ``pristine`` branch, but stay on ``master`` branch,
#. *For Conversions:* Check for pre-existing patches, and apply them,
#. Run ``wizard prepare-new``,
#. *For PHP:* Copy in :file:`php.ini` file and run ``athrun scripts fix-php-ini``,
#. Commit with ``-m "$APPLICATION $VERSION-scripts"``, but *don't* tag,
#. Implement :data:`~wizard.app.Application.install_schema` and :meth:`~wizard.app.Application.install`,
#. Create :file:`tests/$APP-install-test.sh`,
#. On a scripts server, run ``wizard install $APP --no-commit`` and check changes with ``git status``,
#. Implement :attr:`~wizard.app.Application.extractors`,
   :attr:`~wizard.app.Application.substitutions`, :attr:`~wizard.app.Application.parametrized_files`,
   :meth:`~wizard.app.Application.checkConfig` and :meth:`~wizard.app.Application.detectVersion`,
#. Run ``wizard prepare-config``,
#. Amend commit and push back, and finally
#. Tag ``$APP-$VERSION-scripts``

Further reading
---------------

You've implemented a scriptsified version for only a single version; most applications
have multiple versions--you will have to do this process again.  Fortunately, the most
time consuming parts (implementing logic for :class:`wizard.app.Application`) are already
done, so the process of :doc:`creating upgrades <upgrade>` is much simpler.

There is still functionality yet undone: namely the methods for actually performing an
upgrade are not yet implemented.  You can find instructions for this on the
:doc:`creating upgrades <upgrade>` page under "Implementation".

