Scanning a file system path for all version control remote repository URLs

Just thought I’d share a PHP command-line script that scans a file system path, recursively finds every .git and .svn directory, and collects a unique, sorted list of the remote repository URLs. It uses “svn info” or “git remote” to look up each repository URL.

<?php
// define a path to scan (use an absolute path: the script changes directory while scanning)
$scan_path = '/var/www/vhosts';

// keep track of current working dir ($_SERVER['PWD'] is not always set, getcwd() is)
$original_cwd = getcwd();

// get a list of all .git and .svn directories (the path is escaped for the shell)
$files = trim((string) shell_exec('find ' . escapeshellarg($scan_path) . ' -type d \( -name .git -o -name .svn \)'));

// explode on "\n" (an empty result means nothing was found)
$files = ($files === '') ? array() : explode("\n", $files);

// loop through the directories found
$repo_list = array();
foreach ($files as $file) {

  // .git or .svn ?
  $repo_type = substr($file, -4);
  $repo_path = substr($file, 0, -4);

  switch ($repo_type) {
    case '.svn':

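      // get svn repo root (path is escaped; entries where svn reports nothing are skipped)
      $repo_root = trim((string) shell_exec('svn info ' . escapeshellarg($repo_path) . " 2>/dev/null | sed -n 's/^Repository Root: //p'"));
      if ($repo_root !== '' && !in_array($repo_root, $repo_list)) {
        $repo_list[] = $repo_root;
      }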
      break;

    case '.git':

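      // change dir; skip this entry if the directory is not accessible
      if (!chdir($repo_path)) {
        break;
      }

      // get git remote path (first fetch URL; empty if no remote is configured)
      $repo_root = trim((string) `git remote -v | grep -i fetch | awk '{print \$2}' | head -1`);
      if ($repo_root !== '' && !in_array($repo_root, $repo_list)) {
        $repo_list[] = $repo_root;
      }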
      break;
  }

}

// sort
sort($repo_list);

// go back to cwd
chdir($original_cwd);

// output repo list into file
file_put_contents('repo_list.txt', implode("\n", $repo_list) . "\n");
?>

I saved this PHP in a file called “scan-for-version-control.php” and ran it by typing:

$ php scan-for-version-control.php

It created a file called repo_list.txt in the current working directory, containing entries like:

git@myuser.someversioncontrolhost.com:myuser/somerepo1.git
git@myuser.someversioncontrolhost.com:myuser/somerepo2.git
https://myuser.svn.someversioncontrolhost.com/somerepo1
https://myuser.svn.someversioncontrolhost.com/somerepo2
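
If shelling out to find is not an option, the directory search could probably be done in PHP itself. Here is a minimal sketch using the SPL directory iterators. The function name find_vcs_dirs() is my own invention for illustration and is not part of the script above. It only matches real directories named .git or .svn, so a .git file (as used by submodules) is ignored, same as with find -type d.

```php
<?php
// Hypothetical helper (not part of the original script): returns every
// directory named .git or .svn below $scan_path, without calling find.
function find_vcs_dirs($scan_path) {
  $dirs = array();

  $iterator = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($scan_path, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::SELF_FIRST,      // visit directories, not just files
    RecursiveIteratorIterator::CATCH_GET_CHILD  // skip directories that cannot be read
  );

  foreach ($iterator as $item) {
    if ($item->isDir() && in_array($item->getFilename(), array('.git', '.svn'), true)) {
      $dirs[] = $item->getPathname();
    }
  }

  sort($dirs);
  return $dirs;
}
```

The result could replace the find/explode step, e.g. $files = find_vcs_dirs($scan_path); the rest of the loop would stay the same. Note that the iterator also walks into the .git and .svn directories themselves, so on big trees it may well be slower than find.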

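The grep and sed part of the svn lookup could also be moved into PHP. This is only a sketch: parse_svn_repo_root() is a made-up name, and the sample text is a hand-written imitation of “svn info” output rather than real output. It also assumes the English “Repository Root:” label, which svn may translate under other locales.

```php
<?php
// Hypothetical helper: extract the "Repository Root" value from the text
// printed by `svn info`. Returns null when the line is missing or empty.
function parse_svn_repo_root($svn_info_output) {
  if (preg_match('/^Repository Root:(.*)$/m', $svn_info_output, $matches)) {
    $root = trim($matches[1]);
    return ($root === '') ? null : $root;
  }
  return null;
}

// Hand-written sample resembling `svn info` output (an assumption, not real output)
$sample = "Path: .\n"
        . "URL: https://myuser.svn.someversioncontrolhost.com/somerepo1/trunk\n"
        . "Repository Root: https://myuser.svn.someversioncontrolhost.com/somerepo1\n";

echo parse_svn_repo_root($sample) . PHP_EOL;
// prints: https://myuser.svn.someversioncontrolhost.com/somerepo1
```

In the script this would be fed with the raw output of svn info for each path, and a null result would simply be skipped.
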