Custom model training - Use DeepStack to annotate dataset?

@john your documentation and YouTube video about training custom models is very helpful. I was able to train a custom model with 100 images and it’s running successfully in DeepStack now. :slight_smile:

However, I’ve now gathered an additional 500 images and would like to train a more accurate model. I hope to make the annotation process easier by utilizing DeepStack to draw the rectangles on the new images based on my original custom model, plus automatically save the XML file, rather than doing it manually with labelImg.

Is that possible? If so, would you be willing to add documentation that describes how to do it?

Thanks in advance!


This process is called ‘model-assisted annotation’, and DeepStack is not really focused on annotation services. You could check out which annotation platforms support this feature; I believe Roboflow does, for example.


@robmarkcole thanks for the info :slight_smile: I’ll check out your suggestions.


@robmarkcole thank you for the suggestion.

@aesterling if your custom model (trained on 100 images) provides reasonable accuracy, you can run a Python script that captures the detections for the 500 new images and generates the annotation (in YOLO format) for each image. Then you can visualize the annotations using labelImg or any other annotation tool to confirm they are okay, and adjust the ones that are bad/incorrect.

I believe this can drastically reduce the time compared to 100% manual annotation.
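
Something like this minimal sketch could do that pass. The folder, port, model name, and class list are placeholders you would adjust; it assumes a DeepStack custom model is being served at /v1/vision/custom/<model>, as shown elsewhere in this thread:

#minimal sketch: query a DeepStack custom model for each image and write YOLO-format labels
import os
import requests
from PIL import Image

image_dir = "new_images"                                 #placeholder folder holding the 500 new images
url = "http://localhost:80/v1/vision/custom/my_model"    #placeholder port and custom model name
classes = ["plate"]                                      #placeholder; must match your classes.txt order

for name in os.listdir(image_dir):
    if not name.lower().endswith((".jpg", ".png")):
        continue
    path = os.path.join(image_dir, name)
    w, h = Image.open(path).size
    with open(path, "rb") as f:
        preds = requests.post(url, files={"image": f.read()}).json()["predictions"]
    lines = []
    for p in preds:
        cls = classes.index(p["label"])
        xc = (p["x_min"] + p["x_max"]) / 2 / w           #YOLO: center/size as fractions of image size
        yc = (p["y_min"] + p["y_max"]) / 2 / h
        bw = (p["x_max"] - p["x_min"]) / w
        bh = (p["y_max"] - p["y_min"]) / h
        lines.append("%d %.6f %.6f %.6f %.6f" % (cls, xc, yc, bw, bh))
    with open(os.path.splitext(path)[0] + ".txt", "w") as f:
        f.write("\n".join(lines))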


I managed to write the YOLO automated-labeling annotations into a Python script that identifies the characters from plate images. It may take you some time to get it running / configured with your file system, but it should help anyone who wants to do something similar (either automated labeling, or running detection with a custom-made OCR model on a large dataset of cropped images).
I used YOLOv5m with ~200 images to train. Results are good, but not perfect.

The latest code has a lot more filtering capabilities, and it is working very well for me.

*** edited to update code slightly - 20210701 ***
*** edited to update code massively - 20210712. way way better now! ***
*** edited to update code slightly - 20210713 ***

#Custom_LPR_OCR_detection_v20210713_wAutoYoloLabeling.py
#created by u/shallowstack for the purpose of either: autolabeling for a YOLO DeepStack custom model for OCR,
# or logging the detected characters in images within a file directory, with a number of filtering capabilities.

# before running this code, first run the call below in PowerShell (not CMD) to start the DeepStack custom model:
#PS C:\Windows\system32> deepstack --MODELSTORE-DETECTION "C:\AI_models\lpr_ocr\" --PORT 97
#change the port and dir to wherever your custom (character recognition) AI model is...

#major changes to v20210712 -
# deals with overlapping detected characters, as I think these are common with custom OCR models.
# if there are too many detections, it now filters down to the expected number of characters.
# now skips any leftover yolo text files in the directory without error.  note: deletes a txt file only when it needs to write a new one of the same name.
# better debugging info and code comments for usability

import requests
import os
from PIL import Image

#CHANGE THESE GLOBAL VARIABLES for customized control
overlapping_threshold = 4           #how close (in pixels) characters can be before the inferior one is ignored
min_conf_thres = 0.6                #if the worst confidence is less than this, the result won't be written to output_logfile
min_len_thres = 3                   #if no more than this number of characters is detected on the plate, the result won't be written to output_logfile
plate_len_threshold = 6             #if more characters than this number are detected, the program will cut the weakest-confidence characters until the value is met

#CHANGE THESE DIRECTORIES TO MATCH YOUR FILESYSTEM
input_directory = 'J:\\BlueIris\\Deepstack_LPR_dataset\\results_2021'                          #where large dataset of cropped plates live
output_logfile = 'C:\\Users\\BlueIrisServer\\Desktop\\logfile.txt'            #where python will save the log file

#change this file pointer to wherever your YOLO class file is.  it will sync up the character labels with the class numbering scheme
classYoloFilename = 'J:\\BlueIris\\Deepstack_LPR_dataset\\OCR_testing\\QLD Model\\train\\classes.txt'

#reads 'class' txt file, and puts into a list to be searched every time a label is given, with the index value of the list returned so it can be written to the YOLO file.
classLabels = tuple(open(classYoloFilename).read().split('\n')) 
print(classLabels)        

#get # of x pixels for yolo file conversion from pixel # to %
def get_xnum_pixels(filepath):
    width, height = Image.open(filepath).size
    return width
#get # of y pixels for yolo file conversion from pixel # to %
def get_ynum_pixels(filepath):
    width, height = Image.open(filepath).size
    return height

#initialize working variables
resp_label = []
resp_conf = []
resp_pos = []
resp_data2write = []
YoloLabelFilepath = ""
data2write = ""

i = 0

#goes through entire input_directory and processes each file to look for alphanumeric characters
for filename in os.listdir(input_directory):
    #store full filepath of current file under inspection
    filepath = os.path.join(input_directory, filename)
    
    #skip the file if it is a txt file (e.g. a leftover YOLO label from a previous run)
    if filename.lower().endswith('.txt') :
        print(filename + " is a txt file, SKIPPED OCR DETECTION ON THIS FILE.")
        #os.remove(filepath)   #uncomment this if you want it to delete the text file from the directory  (untested)
        continue
    else :
        print(filename + " is not a txt file, sending to OCR detection AI...")

    #get # of x pixels for yolo file conversion from pixel # to %
    xSize = get_xnum_pixels(filepath)
    #get # of y pixels for yolo file conversion from pixel # to %
    ySize = get_ynum_pixels(filepath)

    #path of the text file that will store the label annotations for this image
    YoloLabelFilepath = filepath.rsplit('.',1)[0] + '.txt'
    with open(filepath,"rb") as f :
        image_data = f.read()

    #clear variables from last image data
    resp_label.clear()
    resp_conf.clear()
    resp_pos.clear()
    resp_data2write.clear()

    #posts to deepstack custom server and logs result
    response = requests.post("http://localhost:97/v1/vision/custom/yolo5m_best_20210623",files={"image":image_data}).json()     #change port 97 to whatever your deepstack custom server is on, and custom model name
    #print(response)

    #go through all detections in image and store to temporary lists for comparisons
    for detection in response["predictions"]:
        resp_label.append(detection["label"]) 
        resp_conf.append(round(detection["confidence"],2))
        resp_pos.append(detection["x_min"])
      
        #print annotation results for assisted labeling - for future incorporation into model
        #this will print a YOLO type txt file for each image processed, according to classes.txt file
        
        #some maths to get xy pixel values into YOLO format:  x center, y center, box width, box height (all as a fraction of total image size)
        xCenter = (detection["x_min"] + detection["x_max"]) / 2.0
        yCenter = (detection["y_min"] + detection["y_max"]) / 2.0
        xWidth = detection["x_max"] - detection["x_min"]
        yWidth = detection["y_max"] - detection["y_min"]

        xCenter = format(round(xCenter / xSize, 6), '.6f')
        yCenter = format(round(yCenter / ySize, 6), '.6f')
        xWidth = format(round(xWidth / xSize, 6), '.6f')
        yWidth = format(round(yWidth / ySize, 6), '.6f')

        #check the class list to see what index value the label is that was returned from detection
        ClassValue = classLabels.index(detection["label"])
        #format is :   class (not label) xcenter% ycenter% xwidth% ywidth%
        data2write = str(ClassValue) + " " + str(xCenter) + " " + str(yCenter) + " " + str(xWidth) + " " + str(yWidth) + " "

        # resp_xCenter.append(xCenter)
        # resp_yCenter.append(yCenter)
        # resp_xWidth.append(xWidth)
        # resp_yWidth.append(yWidth)
        # resp_ClassValue.append(ClassValue)
        resp_data2write.append(data2write)

    #sort all stored lists (label, confidence, YoloData) according to the x_min position list  (so it reads left to right like we see it)
    #take separate copies of the position list so each zip-sort below has its own key list
    B1 = list(resp_pos)
    B2 = list(resp_pos)
    B3 = list(resp_pos)
    A = resp_label
    C = resp_conf
    D = resp_data2write

    #if NO detections were made, skip to the next image with debug info
    if len(C) == 0 :
        print("No Char was found in image " + filename)
        continue

    #sort resp_label list by position
    B1, A = (list(t) for t in zip(*sorted(zip(B1, A))))
    #sort resp_conf list by position
    B2, C = (list(t) for t in zip(*sorted(zip(B2, C))))
    #sort resp_data2write list by position
    B3, D = (list(t) for t in zip(*sorted(zip(B3, D))))

    #debug info only
    print(str(A) + ", " + str(B1) + ", " + str(C) + ", " + str(D))

   
    #check if any CHARs are overlapping according to their x_min value, and delete the lower-confidence one of each pair
    #(while loops with explicit index handling, so deletions don't skip entries or run past the end of the lists)
    k = 0
    while k < len(B1) :
        i = 0
        deleted_k = False
        while i < len(B1) :
            if i != k and abs(B1[i] - B1[k]) < overlapping_threshold :
                #the lower-confidence detection of the overlapping pair is removed
                victim = k if C[i] > C[k] else i
                #debug info only
                print("deleted an overlapping CHAR: '" + str(A[victim]) + "', " + str(B1[victim]) + ", " + str(C[victim]) + ", with yolo info: " + str(D[victim]))
                #remove detection [victim] from all lists
                del B1[victim]
                del B2[victim]
                del B3[victim]
                del A[victim]
                del C[victim]
                del D[victim]
                if victim == k :
                    deleted_k = True
                    break
                if victim < k :
                    k = k - 1
                continue    #re-check index i against the shortened lists
            i = i + 1
        if not deleted_k :
            k = k + 1
    
    #if the results still contain more CHARs than expected, begin deleting the lowest-confidence items
    check_len = len(C)
    while check_len > plate_len_threshold :
        #find the lowest-confidence detection
        i = C.index(min(C))
        #debug info only
        print("more detections than expected (plate_len_threshold = " + str(plate_len_threshold) + "), so the following (low conf) CHAR was deleted : '" + str(A[i]) + "', " + str(B1[i]) + ", " + str(C[i]) + ", with yolo info: " + str(D[i]))
        #delete index i
        del B1[i]
        del B2[i]
        del B3[i]
        del A[i]
        del C[i]
        del D[i]
        check_len = len(C)


    #delete any existing yolo text file by that name
    if os.path.exists(YoloLabelFilepath) :
        os.remove(YoloLabelFilepath)
        print(YoloLabelFilepath + " file removed, so a new yolo txt file can be written.")
    #write to yolo text file
    with open(YoloLabelFilepath,"a") as text_file :
        for line in D :
            text_file.write(line + '\n')
    
    #debug
    print("I wrote all lines to Yolo .txt : " + str(D))
    
    #print results to the log file (if they look OK)              change output_logfile to wherever you like, or add info
    if len(C) > min_len_thres and min(C) > min_conf_thres :
        with open(output_logfile,"a") as text_file :
            text_file.write(filename + "\n")
            text_file.write(str(A) + "\n")
            text_file.write(str(B1) + "\n")
            text_file.write(str(C) + "\n")
            text_file.write(str(D) + "\n")
    
    #if best found character really sucks, notify
    if max(C) < 0.1: 
        print("No Char was found in image " + filename)
    else:
        print("COMPLETED : " + filename + " *** " + str(A) + " *** " + str(B1) + " *** " + str(C) + " *** " + str(D))
  

@shallowstack that’s awesome. Looking forward to testing it!

let me know how you go!

I created another script to look through the resulting dataset one by one and display each result to the user, then, based on human QC checking (keyboard input), move it to a good/bad folder.
I'm a bit stuck on making it better though, as it requires a little more programming. But I plan to, in the future:
-identify overlapping characters
-allow the user to remove / edit single characters (and have the yolo file update automatically; see the sketch after this list)
-close the displayed images, and bring the terminal back into the foreground
-make the font nicer to read
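
A possible starting point for the 'remove / edit single characters' item: a hypothetical helper (not part of the posted script) that drops one detection line from a YOLO label file by its 0-based index:

#hypothetical helper: drop one detection (one line) from a YOLO label file
def drop_yolo_line(txt_path, index):
    with open(txt_path) as f:
        lines = [ln for ln in f.read().split('\n') if ln.strip()]
    del lines[index]
    with open(txt_path, 'w') as f:
        f.write('\n'.join(lines) + '\n')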

#QC_AutoYoloResults_v20210713.py
#created by u/shallowstack for the purpose of: checking the YOLO autolabeling performed using a DeepStack custom model for OCR.
#the program displays the results and allows for human checking, before adding any (good) results to a directory (to incorporate with the original model training dataset)

# before running this code, adjust the directory variables in the code to match your filesystem.
# There should also be a collection of cropped images (license plates) and yolo files (.txt) with the same filenames inside the input_directory to process.

import os
import shutil
#import pywinauto
#import win32gui

# Importing Image class from PIL module 
from PIL import Image, ImageDraw, ImageFont, ImageShow

#from pywinauto import Application

#get # of x pixels for yolo file conversion from pixel # to %
def get_xnum_pixels(filepath):
    width, height = Image.open(filepath).size
    return width
#get # of y pixels for yolo file conversion from pixel # to %
def get_ynum_pixels(filepath):
    width, height = Image.open(filepath).size
    return height

input_directory = 'C:\\Users\\BlueIrisServer\\Desktop\\AutoTrain'                          #where large dataset of cropped plates live
output_directory_GOOD = 'C:\\Users\\BlueIrisServer\\Desktop\\AutoTrain\\GOOD'               #where we will save the GOOD yolo results to for future training
output_directory_BAD = 'C:\\Users\\BlueIrisServer\\Desktop\\AutoTrain\\BAD'                 #where we will save the BAD yolo results to dispose of or adjust
output_logfile = 'C:\\Users\\BlueIrisServer\\Desktop\\logfile.txt'            #where python will save the log file
anno_output_directory_GOOD = 'C:\\Users\\BlueIrisServer\\Desktop\\AutoTrain\\Annotations_GOOD'               #where we will save the GOOD annotated images for review
anno_output_directory_BAD = 'C:\\Users\\BlueIrisServer\\Desktop\\AutoTrain\\Annotations_BAD'                 #where we will save the BAD annotated images for review

#change this file pointer to wherever your YOLO class file is.  it will sync up the character labels with the class numbering scheme
classYoloFilename = 'C:\\Users\\BlueIrisServer\\Desktop\\classes.txt'

#reads class file, and puts into a list to be searched every time a label is given, with the index value of the list returned so it can be written to the YOLO file.
classLabels = tuple(open(classYoloFilename).read().split('\n')) 
print(classLabels)        

#initialize working variables
pos = []
ALL_CHAR = []
YoloLabelFilepath = ""
text_line_list = []

i = 0

#goes through entire directory and processes each file
for filename in os.listdir(input_directory):
    #pid_terminal = os.getpid()

    if filename.endswith(".jpg"):
        filepath = os.path.join(input_directory, filename)
    
        #get # of x pixels for yolo file conversion from pixel # to %
        xSize = get_xnum_pixels(filepath)
        #get # of y pixels for yolo file conversion from pixel # to %
        ySize = get_ynum_pixels(filepath)

        #new text file to store label annotations
        YoloLabelFilepath = filepath.rsplit('.',1)[0] + '.txt'
        AnnotationFilepath = filepath.rsplit('.',1)[0] + "_annotation.jpg"

        #create object
        im = Image.open(filepath)
        
        i = 0

        #make sure a Yolo file exists, otherwise skip the image file
        if not os.path.isfile(YoloLabelFilepath) :
            continue
        #format of yolo file lines should already be :   class (not label) xcenter% ycenter% xwidth% ywidth%
        #read from file        
        text_file = tuple(open(YoloLabelFilepath).read().split('\n')) 
        
        #for every line in the yolo file, inspect and annotate onto image
        for line in text_file :
            #cleanup any previous lines
            text_line_list.clear()
            #inspect and store yolo file lines
            text_line = str.split(line)
            print(text_line)
            text_line_list = list(map(float, text_line))

            #go through each line, and convert to character (from class integer)
            if text_line != [] :
                #store character information
                CHAR = classLabels[int(text_line[0])]
                ALL_CHAR.append(CHAR)
                #store x position for further checking
                pos.append(text_line_list[1])
                #draw the box annotation onto the image (convert YOLO center/size fractions back to pixel corners)
                im1 = ImageDraw.Draw(im)
                x0 = (text_line_list[1] - text_line_list[3]/2.0) * xSize
                y0 = (text_line_list[2] - text_line_list[4]/2.0) * ySize
                x1 = (text_line_list[1] + text_line_list[3]/2.0) * xSize
                y1 = (text_line_list[2] + text_line_list[4]/2.0) * ySize
                im1.rectangle([(x0, y0), (x1, y1)], fill = None, outline = "red")
                #draw the text annotation onto the image, just below the box
                im1.text((x0, y1 + 2.0), CHAR, fill = (34,139,34))
            else:
                pass
        #show image annotated with characters    
        im.resize((xSize*8,ySize*8), Image.ANTIALIAS).show()
        #im3 = im.resize((xSize*4,ySize*4))
        #im3 = im.open(im,"r",None)
        #im.thumbnail((xSize*4,ySize*4))
        
        #save and close
        #im3 = ImageShow.show(im, filename)
        
        im.save(AnnotationFilepath, quality=95)         #don't use 100
        #im3.close()
        im.close()
        #show main terminal for user input
        #app = Application().connect(process=pid_terminal)
        #app.top_window().set_focus()
        
        #print result for debugging
        print(pos)
        print(text_file)
        
        #ask for human QC
        resultQC = input("Check if result is good - 1, or bad - 0...   (or any other key to exit)")
        #if bad
        if resultQC == "0" :
            #move to the BAD folders
            print("it's bad, so moved file to BAD directory.")
            os.rename(YoloLabelFilepath,output_directory_BAD + "\\" + filename.rsplit('.',1)[0] + ".txt")
            os.rename(filepath,output_directory_BAD + "\\" + filename)
            os.rename(AnnotationFilepath,anno_output_directory_BAD + "\\" + filename.rsplit('.',1)[0] + "_annotation.jpg")
        #if good
        elif resultQC == "1" :
            #move to the GOOD folders
            print("it's good, so moved file to GOOD directory.")
            os.rename(YoloLabelFilepath,output_directory_GOOD + "\\" + filename.rsplit('.',1)[0] + ".txt")
            os.rename(filepath,output_directory_GOOD + "\\" + filename)
            os.rename(AnnotationFilepath,anno_output_directory_GOOD + "\\" + filename.rsplit('.',1)[0] + "_annotation.jpg")
        else :
            #any other key exits the loop, per the prompt
            print("Exiting on user request.")
            break
        
        #close the shown plate image manually

        print(ALL_CHAR)

        #clean up variables
        pos.clear()  
        ALL_CHAR.clear()    

        i = i + 1
        #break  #used to debug for only 1 cycle
    else:
        pass
    
    # keep these txt manipulation calls below for my reference... for future editing of single characters in the yolo file
    # if X != [] and len(X)>3:
    #     if min(confidence) > 0.6 :
    #         text_file = open(output_logfile,"a")
    #         text_file.write(filename)
    #         text_file.write("\n")
    #         text_file.write(str(X))
    #         text_file.write("\n")
    #         text_file.write(str(confidence))
    #         text_file.write("\n")
    #         text_file.close()
    #     else:
    #         pass
    # else :
    #     pass
    

    #cleanup variables for next image processed
    pos.clear()

Hi aesterling, I updated the OCR / YOLO code in my initial post, so v20210713 should be a lot better now. Oh man… I set out to add a few little things, and ended up re-writing the program multiple times this week to get it all to work just right. Learned a lot! :slight_smile:

The most important part of retraining to improve the custom model's accuracy will be correcting the incorrect results (either by manual labeling, or by editing the yolo files), as the 'good' results were already detected fine by the existing model and probably won't improve it much.


Wow, what a great tool to process the dataset, shallowstack!

Any chance you could help me do a version that is just normal object detection and not LPR? I'm guessing it's a lot simpler. I would like to find and save objects, with some confidence limit, and then use the second script to manually sort the results. I'm struggling to understand how to omit all the LPR-specific features without breaking the code.

Sure, that sounds like it shouldn't be too difficult. I'll have a look later, but from memory it should be pretty much the same, only instead of using the characters as the classes, your classes.txt will be replaced by your object labels. Replacing this file with your desired labels will get you most of the way there, I think. Note that the order matters (use your labeling order) and the txt file should look like the below, with one item per line:
dog
person
car
(etc)

and of course change the directory variables to wherever works for your computer's filesystem.

You could also comment out the sections on overlapping characters, and the section where it limits the number of detections to 6 (or maybe just increase the plate_len_threshold value to 100, or however many objects you want). See the sketch below.
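
For example, something like this near the top of the script should effectively disable the LPR-specific pruning (the exact values are illustrative, not tested):

#suggested tweaks for generic object detection instead of LPR (illustrative values)
plate_len_threshold = 100        #effectively disables the "too many detections" pruning
min_len_thres = 0                #log results even when only a single object is detected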

Happy to help you get it going; let me know wherever you get stuck, or if you need me to write anything.
Cheers!

I sorta got it working. I made minimal changes to the code, just added my classes and changed min_len_thres to 0 and plate_len_threshold to 20 (I have at most 2 objects in one picture so far).

It seems to work fine, but the YOLO output adds a space at the end of each line. I'm not sure if it's a problem for DeepStack (haven't tried it yet), but when I try to double-check and fix errors using labelImg, it does not read the lines properly when they have the trailing space.

On line 106 I found:

data2write = str(ClassValue) + " " + str(xCenter) + " " + str(yCenter) + " " + str(xWidth) + " " + str(yWidth) + " "

I tried removing the last + " " (after str(yWidth)) but the space was still present in the output.


The Python printout shows it:
COMPLETED : pic1.jpg *** ['myseconclass'] *** [361] *** [0.83] *** ['1 0.104818 0.811574 0.021615 0.062963 ']

The same thing in the logfile:

pic1.jpg
['myseconclass']
[361]
[0.83]
['1 0.104818 0.811574 0.021615 0.062963 ']

Any ideas on where it is added and how to get rid of it?

EDIT:
Well, the space is now gone. I'm sure it was line 106, but I was editing a copy, and not the code I actually ran. Rookie mistake!
The script now works as expected, and it helps A LOT to prepare new pictures for training. Now it's about dialing in the minimum confidence value.
Once again, thank you for this piece of code.
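
For what it's worth, building the line with str.join instead of chained string concatenation avoids trailing whitespace by construction. A sketch using the script's own variable names:

#build the YOLO line with join so no trailing space can sneak in
data2write = " ".join(str(v) for v in (ClassValue, xCenter, yCenter, xWidth, yWidth))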

I tried the second script to sort the output, but it was super slow to create and display the annotated images. I'm gonna stick to opening the folder in labelImg, where I can edit and manually add objects if needed.


Awesome!
I just remembered that I was running these scripts on pre-cropped images (there should be another cropping script out there), which is maybe why they weren't too slow for QC-ing the outputs with the PIL method I used. But it was still a bit quirky, so labelImg will probably be much better! Is your plan to correct bad ones on the spot as you go through the files in labelImg?

Thanks for finding this! It seems I can't edit the original post anymore (time limit?), so I will leave it as is to avoid clutter.

I'm also curious: are you improving the standard DeepStack object detection model with your own results, or was this a custom object detection model of yours to start with?

It's a custom model. It started out with me wanting DeepStack to identify cats on my outdoor security cameras, as the standard model failed miserably (it probably does a decent job on stock photos, not on footage from security cameras with both day and night imagery).

I did a simple model with only one object, "owncat", that I use with Blue Iris and AI-tool. It worked quite well; I got some nice results, and even recorded a fox at night (close enough). With this model it went from no hits to working quite well. Sometimes there are some false positives though. To mitigate that, I am now trying to identify different cats as different objects. I'm not sure if it will work to minimize confusion, but it seems logical to me that the AI does a better job if an object, like the neighbor's fat white cat, is always white, instead of the generic "owncat" object that changes in color and size depending on what cat happens to be in the picture. We live quite rurally; so far I have captured only five different cats (two of which are ours).

I got the idea to make the new model that identifies the specific cats when I saw this thread; I had thought the training set would have to be quite substantial to get good results, and doing all the work manually felt like too big an undertaking.

I manually re-tagged my training set with per-cat objects yesterday and ran the training overnight, getting about 200 epochs out of it. The quick model seems to do a decent job. I estimate about an 80% success rate with your script on new images, running with min_conf_thres at 0. The remaining 20% needed some manual work, like tagging a missed cat, removing a false-positive tag, changing which cat-individual got tagged, or resizing the object box. Doing the last 20% of touch-ups manually with labelImg as I check the results works fine for me. I'll probably make some basic automation, like a PowerShell script, so I can just dump new images in a folder and hit run, and it fires up DeepStack with the custom model, then starts your script to process the files, and lastly copies the used classes.txt into the image folder so everything is set up to start labelImg and do the manual touch-up.
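
A rough Python sketch of that pipeline (the paths, model folder, and wait time are placeholders; the deepstack flags are the ones shown earlier in the thread):

#hypothetical one-shot pipeline: start DeepStack, run the auto-labeling script, stage classes.txt for labelImg
import shutil
import subprocess
import time

image_dir = r"C:\AutoTrain\incoming"                 #placeholder drop folder for new images
server = subprocess.Popen(["deepstack", "--MODELSTORE-DETECTION", r"C:\AI_models\cats", "--PORT", "97"])
time.sleep(15)                                       #crude wait for the server to come up
subprocess.run(["python", "Custom_LPR_OCR_detection_v20210713_wAutoYoloLabeling.py"], check=True)
shutil.copy(r"C:\AutoTrain\classes.txt", image_dir)  #so labelImg picks up the class list
server.terminate()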

It's quite easy for me to get a decent number of images of the cats; I just play back an already-tagged cat video and save a bunch of screenshots. It could probably even be automated in Blue Iris using the timelapse feature, saving stills with a custom trigger for when cats are detected.

The whole project is just stupid, as the goal of watching my own CCTV cat videos is not that interesting; it's the learning and playing with the setup that gives me satisfaction.


Lol, yep, I totally get it :slight_smile: