Extracting people from an image using Python, Luminoth, and ImageMagick

In this post I’ll share a tiny Python script that extracts people from images. It uses Luminoth, a deep learning toolkit for computer vision built on TensorFlow, to detect the objects in an image, and ImageMagick to crop each detected person into its own file.

Project setup

# set up a new pipenv project
$ pipenv --python 2.7
$ pipenv shell

# install dependencies
$ pipenv install tensorflow
$ pipenv install luminoth

# lumi checkpoint setup:
$ lumi checkpoint refresh

$ lumi checkpoint list
================================================================================
|           id |                  name |       alias | source |         status |
================================================================================
| e1c2565b51e9 |   Faster R-CNN w/COCO |    accurate | remote | NOT_DOWNLOADED |
| aad6912e94d9 |      SSD w/Pascal VOC |        fast | remote | NOT_DOWNLOADED |
================================================================================

$ lumi checkpoint download e1c2565b51e9

# [optional] start web ui
$ lumi server web
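
Before running the extraction script it can help to confirm that both command-line tools it shells out to are reachable. The following is a minimal sketch, not part of the original setup: the helper names find_on_path and missing_tools are hypothetical, and it assumes a Unix-like system where lumi and convert are plain executables on PATH.

```python
import os


def find_on_path(name):
    """Return the full path of an executable called `name`, or None."""
    for directory in os.environ.get('PATH', '').split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if find_on_path(name) is None]


if __name__ == '__main__':
    # 'lumi' comes from Luminoth, 'convert' from ImageMagick.
    missing = missing_tools(['lumi', 'convert'])
    if missing:
        print('missing tools: {}'.format(', '.join(missing)))
    else:
        print('all tools found')
```

The sketch only walks PATH and uses nothing beyond the standard library, so it should run unchanged inside the Python 2.7 environment created above.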

Extraction script, file: extract.py

#!/usr/bin/env python

import sys
import subprocess
import json
import os

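# the image to process is the only command-line argument
if len(sys.argv) != 2:
    sys.exit('usage: extract.py IMAGE')
file_arg = sys.argv[1]

file_name, file_extension = os.path.splitext(file_arg)

# run the detector; the script expects the JSON result on the last line of output
output = subprocess.check_output(['lumi', 'predict', file_arg])

output_lines = output.splitlines()
last_line = output_lines[-1]
parsed_json = json.loads(last_line)

matches = 0
for match in parsed_json['objects']:
    if match['label'] == 'person':
        matches += 1
        print(match)
        # bbox holds the corner coordinates [x1, y1, x2, y2] in pixels
        x1 = int(match['bbox'][0])
        y1 = int(match['bbox'][1])
        x2 = int(match['bbox'][2])
        y2 = int(match['bbox'][3])
        x_size = x2 - x1
        y_size = y2 - y1
        # ImageMagick crop geometry is WIDTHxHEIGHT+X_OFFSET+Y_OFFSET
        geometry = '{}x{}+{}+{}'.format(x_size, y_size, x1, y1)
        subprocess.call(['convert', file_arg, '-crop', geometry,
                         'person_{}.jpg'.format(matches)])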
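
The one step in the script that is easy to get wrong is turning a bounding box into ImageMagick's crop geometry. Here it is pulled out into a standalone function so it can be checked without running the detector. This is only a sketch: crop_geometry is a hypothetical name, and it assumes, as the script does, that bbox is ordered [x1, y1, x2, y2].

```python
def crop_geometry(bbox):
    """Convert [x1, y1, x2, y2] into a WIDTHxHEIGHT+X+Y geometry string."""
    x1, y1, x2, y2 = (int(value) for value in bbox)
    return '{}x{}+{}+{}'.format(x2 - x1, y2 - y1, x1, y1)


# First bounding box from the example output further down.
print(crop_geometry([331.0, 395.0, 793.0, 1877.0]))  # 462x1482+331+395
```

The width and height are the differences between the two corners, and the offsets are the top-left corner itself.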

Example usage:

# make script executable
$ chmod +x extract.py

# extract people from a local image
$ ./extract.py picture.jpg

# output:
{u'label': u'person', u'prob': 0.9997, u'bbox': [331.0, 395.0, 793.0, 1877.0]}
{u'label': u'person', u'prob': 0.9995, u'bbox': [728.0, 408.0, 1090.0, 1895.0]}
{u'label': u'person', u'prob': 0.9515, u'bbox': [325.0, 404.0, 618.0, 1304.0]}
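
Each detection carries a prob field, so low-confidence matches could be dropped before cropping. The sketch below reuses the three detections printed above. The function name confident_people and the 0.99 cut-off are assumptions chosen for illustration, not part of the original script.

```python
# The three detections from the example output.
detections = [
    {'label': 'person', 'prob': 0.9997, 'bbox': [331.0, 395.0, 793.0, 1877.0]},
    {'label': 'person', 'prob': 0.9995, 'bbox': [728.0, 408.0, 1090.0, 1895.0]},
    {'label': 'person', 'prob': 0.9515, 'bbox': [325.0, 404.0, 618.0, 1304.0]},
]


def confident_people(objects, min_prob=0.99):
    """Keep only 'person' detections with prob >= min_prob (hypothetical threshold)."""
    return [obj for obj in objects
            if obj['label'] == 'person' and obj['prob'] >= min_prob]


kept = confident_people(detections)
print(len(kept))  # 2
```

With this threshold the third detection, which largely overlaps the first, is skipped. A different cut-off may suit other images better.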

My test image:

[image: image_extract]

Output images:

person_1.jpg
person_2.jpg
person_3.jpg
