Shepherd is a utility for applying code changes across many repositories.
- Powerful: You can write migration scripts using your favorite Unix commands, tools like jscodeshift, or scripts in your preferred programming language.
- Easy: With just a few commands, you can check out dozens of repositories, apply changes, commit those changes, and open pull requests with detailed messages.
- Flexible: Ships with support for Git/GitHub, but can easily be extended to work with other version control products like Bitbucket, GitLab, or SVN.
For more high-level context, this blog post covers the basics.
Install the Shepherd CLI:
npm install -g @nerdwallet/shepherd
Shepherd will now be available as the shepherd command in your shell:
Usage: shepherd [options] [command] ...
Take a look at the tutorial for a detailed walkthrough of what Shepherd does and how it works, or read on for a briefer, higher-level overview.
Motivation for using Shepherd
Moving away from monorepos and monolithic applications has generally been a good thing for developers because it allows them to move quickly and independently of each other. However, it's easy to run into problems, especially if your code relies on shared libraries. Specifically, making a change to shared code and then rolling that change out to all consumers of the code becomes difficult:
- The person updating that library must communicate the change to consumers of the library
- The consumer must understand the change and how they have to update their own code
- The consumer must make the necessary changes in their own code
- The consumer must test, merge, and deploy those changes
Shepherd aims to help shift responsibility for the first three steps to the person actually making the change to the library. Since they have the best understanding of their change, they can write a code migration to automate that change and then use Shepherd to automate the process of applying that change to all relevant repos. Then the owners of the affected repos (who have the best understanding of their own code) can review and merge the changes. This process is especially efficient for teams who rely on continuous integration: automated tests can help repository owners have confidence that the code changes are working as expected.
A migration is declaratively specified with a shepherd.yml file called a spec. Here's an example of a migration spec that renames .eslintrc to .eslintrc.json in all NerdWallet repositories that have been modified in 2018:
id: 2018.07.16-eslintrc-json
title: Rename all .eslintrc files to .eslintrc.json
adapter:
  type: github
  search_type: code
  search_query: org:NerdWallet path:/ filename:.eslintrc
hooks:
  should_migrate:
    - ls .eslintrc # Check that this file actually exists in the repo
    - git log -1 --format=%cd | grep 2018 --silent # Only migrate things that have seen commits in 2018
  post_checkout: npm install
  apply: mv .eslintrc .eslintrc.json
  pr_message: echo 'Hey! This PR renames `.eslintrc` to `.eslintrc.json`'
Let's go through this line-by-line:
- id specifies a unique identifier for this migration. It will be used as a branch name for this migration, and will be used internally by Shepherd to track state about the migration.
- title specifies a human-readable title for the migration that will be used as the commit message.
- adapter specifies what version control adapter should be used for performing operations on repos, as well as extra options for that adapter. Currently Shepherd only has a GitHub adapter, but you could create a Bitbucket or GitLab adapter if you don't use GitHub. Note that search_query is specific to the GitHub adapter: it uses GitHub's code search qualifiers to identify repositories that are candidates for a migration. If a repository contains a file matching the search, it will be considered a candidate for this migration. As an alternative to search_query, the GitHub adapter can be configured with org: YOURGITHUBORGANIZATION. When using org, every repo in the organization that is visible will be considered as a candidate for this migration.
- search_type (optional) specifies the search type, either 'code' or 'repositories'. If 'repositories' is specified, Shepherd performs a GitHub repository search. Defaults to code search if not specified.
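As a rough illustration of the org alternative described above, the following sketch writes a hypothetical spec to a temporary directory. The id, title, and hook commands are placeholders invented for this example, and the exact key layout under adapter is an assumption based on the description above rather than a verified schema.

```shell
# Sketch only: write a hypothetical shepherd.yml that selects candidate repos
# by organization rather than by a code search. All values are placeholders.
spec_dir="$(mktemp -d)"   # stand-in for your migration directory

cat > "$spec_dir/shepherd.yml" <<'EOF'
id: example-org-wide-migration
title: Example migration across an organization
adapter:
  type: github
  org: YOURGITHUBORGANIZATION
hooks:
  apply: echo "replace with your migration command"
  pr_message: echo "Replace with your pull request message"
EOF

# Show the adapter section that was written.
grep -A 2 '^adapter:' "$spec_dir/shepherd.yml"
```

Because org considers every visible repo in the organization, a should_migrate hook is likely worth adding to narrow the candidates before anything is changed.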
The options under hooks specify the meat of a migration. They tell Shepherd how to determine if a repo should be migrated, how to actually perform the migration, how to generate a pull request message for each repository, and more. Each hook consists of one or more standard executables that Shepherd will execute in sequence.
- should_migrate is a sequence of commands to execute to determine if a repo actually requires a migration. If any of them exit with a non-zero value, that signifies to Shepherd that the repo should not be migrated. For instance, the second step in the above should_migrate hook would fail if the repo was last modified in 2017, since grep would exit with a non-zero value.
- post_checkout is a sequence of commands to be executed once a repo has been checked out and passed any should_migrate checks. This is a convenient place to do anything that will only need to be done once per repo, such as installing any dependencies.
- apply is a sequence of commands that will actually execute the migration. This example is very simple: we're just using mv to rename a file. However, this hook could contain arbitrarily many, potentially complex commands, depending on the requirements of your particular migration.
- pr_message is a sequence of commands that will be used to generate a pull request message for a repository. In the simplest case, this can just be a static message, but you could also programmatically generate a message that calls out particular things that might need human attention. Anything written to stdout will be used for the message. If multiple commands are specified, the output from each one will be concatenated together.
should_migrate and post_checkout are optional; apply and pr_message are required.
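The exit-code behavior of should_migrate can be approximated in plain shell. This is only a sketch of the behavior described above, not Shepherd's implementation: run_hook is a hypothetical helper, and the sample repository is a throwaway directory created for the example.

```shell
# Sketch only: run each command of a hook in order, inside the repo directory,
# and stop at the first one that exits non-zero.
# run_hook is a hypothetical helper, not part of Shepherd.
run_hook() {
  repo_dir="$1"
  shift
  for cmd in "$@"; do
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$repo_dir" && sh -c "$cmd" ) >/dev/null 2>&1 || return 1
  done
  return 0
}

# Throwaway directory standing in for a checked-out repository.
repo="$(mktemp -d)"
touch "$repo/.eslintrc"

if run_hook "$repo" 'ls .eslintrc' 'test -f .eslintrc'; then
  echo "migrate"
else
  echo "skip"
fi
```

With the .eslintrc file present, both checks succeed and the sketch prints migrate. If any command in the list fails, the remaining commands are not run and the repo is treated as one to skip.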
Each of these commands will be executed with the working directory set to the target repository. Shepherd exposes some context to each command via specific environment variables. Some additional environment variables are exposed when using the github adapter.
- SHEPHERD_REPO_DIR is the absolute path to the repository being operated on. This will be the working directory when commands are executed.
- SHEPHERD_DATA_DIR is the absolute path to a special directory that can be used to persist state between steps. This would be useful if, for instance, a jscodeshift codemod in your apply hook generates a list of files that need human attention and you want to use that list in your pr_message hook.
- SHEPHERD_BASE_BRANCH is the name of the branch Shepherd will set up a pull request against. This will often, but not always, be master. Only available for apply and later steps.
- SHEPHERD_MIGRATION_DIR is the absolute path to the directory containing your migration's shepherd.yml file. This is useful if you want to include a script with your migration spec and need to reference that script in a hook. For instance, if I have a script pr.sh that will generate a PR message, my pr_message hook might look something like this:
  pr_message: $SHEPHERD_MIGRATION_DIR/pr.sh
- github adapters) is the current revision of the repository being operated on.
- github adapter) is the owner of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be NerdWalletOSS.
- github adapter) is the name of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be shepherd.
Commands follow standard Unix conventions: an exit code of 0 indicates a command succeeded, a non-zero exit code indicates failure.
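To make the SHEPHERD_DATA_DIR use case above concrete, here is a sketch in which one step records files that need attention and a later step turns that list into a pull request message. Shepherd normally sets these variables itself; they are set by hand here, pointing at temporary directories, purely so the sketch runs on its own. The file name needs-attention.txt and both functions are hypothetical.

```shell
# Sketch only: Shepherd would normally provide these variables. They are set
# by hand here so the example can run without Shepherd.
SHEPHERD_REPO_DIR="$(mktemp -d)"
SHEPHERD_DATA_DIR="$(mktemp -d)"
export SHEPHERD_REPO_DIR SHEPHERD_DATA_DIR

# Hypothetical apply step: rename the file and record it for reviewers.
apply_step() {
  ( cd "$SHEPHERD_REPO_DIR" && mv .eslintrc .eslintrc.json ) || return 1
  echo ".eslintrc.json" >> "$SHEPHERD_DATA_DIR/needs-attention.txt"
}

# Hypothetical pr_message step: everything printed to stdout is the message.
pr_message_step() {
  echo "This PR renames .eslintrc to .eslintrc.json."
  if [ -s "$SHEPHERD_DATA_DIR/needs-attention.txt" ]; then
    echo "Files that may need a human look:"
    sed 's/^/- /' "$SHEPHERD_DATA_DIR/needs-attention.txt"
  fi
}

touch "$SHEPHERD_REPO_DIR/.eslintrc"   # fake repository content
apply_step
pr_message_step
```

Keeping the list in the data directory rather than in the repository means the bookkeeping file never ends up in the commit.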
Shepherd is run as follows:
shepherd <command> <migration> [options]
<migration> is the path to your migration directory containing a shepherd.yml file.
There are a number of commands that must be run to execute a migration:
- checkout: Determines which repositories are candidates for migration and clones or updates the repositories on your machine. Clones are "shallow", containing no git history. Uses should_migrate to decide if a repository should be kept after it's checked out.
- apply: Performs the migration using the apply hook discussed above.
- commit: Makes a commit with any changes that were made during the apply step, including adding newly-created files. The migration's title will be prepended with [shepherd] and used as the commit message.
- push: Pushes all commits to their respective repositories.
- pr-preview: Prints the pull request message that would be used for each repository without actually creating a PR; uses the pr_message hook.
- pr: Creates a PR for each repo with the message generated from the pr_message hook.
- version: Prints the Shepherd version.
checkout will use the adapter to figure out which repositories to check out, and the remaining commands will operate on all checked-out repos. To check out only a specific repo, or to operate on only a subset of the checked-out repos, you can use the --repos flag, which specifies a comma-separated list of repos:
shepherd checkout path/to/migration --repos facebook/react,google/protobuf
Run shepherd --help to see all available commands and descriptions for each one.
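Putting the commands together, a full run might follow the order in which they are listed above. The sketch below only prints the command lines rather than invoking Shepherd, so it can be read as a checklist; path/to/migration is a placeholder for your own migration directory.

```shell
# Sketch only: print the Shepherd commands in the order described above
# instead of running them. Replace the placeholder path with your own.
migration="path/to/migration"

plan="$(
  for step in checkout apply commit push pr-preview pr; do
    echo "shepherd $step $migration"
  done
)"

echo "$plan"
```

When running the steps for real, it is worth inspecting the result of each one (for example the pr-preview output) before moving on to the next.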
Run npm install to install dependencies, and then npm install -g to make the shepherd executable available on your PATH. Run npm run build:watch in a separate terminal. This will incrementally compile the source code as you edit it.
Shepherd currently has minimal test coverage, but we're aiming to improve that with each new PR. Tests are written with Jest and should be named *.test.ts and placed alongside the file under test. To run the test suite, run npm run test.
We use TSLint to ensure a consistent coding style and to help prevent certain classes of problems. Run npm run lint to run the linter, and npm run fix-lint to automatically fix applicable problems.